cvs2svn-2.4.0/HACKING

                                                                -*-text-*-

                 ===========================
                 Hacker's Guide To cvs2svn
                 ===========================

This project tends to use the same social and technical guidelines
(where applicable) as Subversion itself.  You can view them online at
http://subversion.apache.org/docs/community-guide/conventions.html.

* The source code is accessible from two places:

  * The primary repository for cvs2svn is held in Subversion under
    the following URL:

        http://cvs2svn.tigris.org/svn/cvs2svn

    The repository is readable by anybody using username="guest",
    password="" (i.e., just press return).

  * Michael Haggerty maintains a git mirror of the trunk of cvs2svn
    (and perhaps some other branches) at repo.or.cz.  The summary
    page is at http://repo.or.cz/w/cvs2svn.git and the (read-only)
    mirror is at git://repo.or.cz/cvs2svn.git

* Read the files under doc/, especially:

  * doc/design-notes.txt gives a high-level description of the
    algorithm used by cvs2svn to make sense of the CVS history.

  * doc/symbol-notes.txt describes how CVS symbols are handled.

  * doc/making-releases.txt describes the procedure for making a new
    release of cvs2svn.

* Read the files under www/, especially:

  * www/features.html describes abstractly many of the CVS
    peculiarities that cvs2svn attempts to deal with.

  Please note that changes committed to the trunk version of www/ are
  automatically deployed to the cvs2svn project website.

* Read the class and method docstrings.

* Adhere to the code formatting conventions of the rest of the
  project (e.g., limit line length to 79 characters).

* We no longer require the exhaustive commit messages required by the
  Subversion project.  But please include commit messages that:

  * Describe the *reason* for the change.

  * Attribute changes to their original author using lines like

        Patch by: Joe Schmo

* Please put a new test in run-tests.py when you fix a bug.

* Use 2 spaces between sentences in comments and docstrings.  (This
  helps sentence-motion commands in some editors.)

Happy hacking!

cvs2svn-2.4.0/cvs2hg-example.options

# (Be in -*- mode: python; coding: utf-8 -*- mode.)
#
# ====================================================================
# Copyright (c) 2006-2009 CollabNet.  All rights reserved.
#
# This software is licensed as described in the file COPYING, which
# you should have received as part of this distribution.  The terms
# are also available at http://subversion.tigris.org/license-1.html.
# If newer versions of this license are posted there, you may use a
# newer version instead, at your option.
#
# This software consists of voluntary contributions made by many
# individuals.  For exact contribution history, see the revision
# history and logs, available at http://cvs2svn.tigris.org/.
# ====================================================================

# #####################
# ## PLEASE READ ME! ##
# #####################
#
# This is a template for an options file that can be used to configure
# cvs2svn to convert to Mercurial rather than to Subversion.  See
# www/cvs2git.html and www/cvs2svn.html for general information, and
# see the comments in this file for information about what options are
# available and how they can be set.
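#
# As a rough orientation only -- not a recipe -- the handful of
# settings that almost every conversion ends up customizing are
# sketched here.  The values shown are hypothetical placeholders; the
# real statements appear later in this file, each with a full
# explanation:
#
#     # Where the CVS repository (*not* a working copy) lives:
#     run_options.set_project(r'/path/to/cvsrepo/module', ...)
#
#     # How to decode author names, log messages, and filenames:
#     ctx.cvs_author_decoder = CVSTextDecoder(['utf8', 'ascii'])
#
#     # How to map CVS login names to "name <email>" identities:
#     author_transforms = {'jrandom': ('J. Random', 'jrandom@example.com')}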
# # "cvs2hg" is shorthand for "cvs2git in the mode where it is # outputting to Mercurial instead of git". But the program that needs # to be run is still called "cvs2git". Run it with the --options # option, passing it this file as argument: # # cvs2git --options=cvs2hg-example.options # # Mercurial can (experimentally at this time) read git-fast-import # format via its "hg fastimport" extension, with exceptions: # # 1. "hg fastimport" does not support blobs, so the contents of the # revisions are output inline rather than in a separate blobs file. # This increases the size of the output, because file contents that # appear identically on multiple branches have to be output # multiple times. # # Many options do not have defaults, so it is easier to copy this file # and modify what you need rather than creating a new options file # from scratch. This file is in Python syntax, but you don't need to # know Python to modify it. But if you *do* know Python, then you # will be happy to know that you can use arbitary Python constructs to # do fancy configuration tricks. # # But please be aware of the following: # # * In many places, leading whitespace is significant in Python (it is # used instead of curly braces to group statements together). # Therefore, if you don't know what you are doing, it is best to # leave the whitespace as it is. # # * In normal strings, Python treats a backslash ("\") as an escape # character. Therefore, if you want to specify a string that # contains a backslash, you need either to escape the backslash with # another backslash ("\\"), or use a "raw string", as in one if the # following equivalent examples: # # cvs_executable = 'c:\\windows\\system32\\cvs.exe' # cvs_executable = r'c:\windows\system32\cvs.exe' # # See http://docs.python.org/tutorial/introduction.html#strings for # more information. 
# # Two identifiers will have been defined before this file is executed, # and can be used freely within this file: # # ctx -- a Ctx object (see cvs2svn_lib/context.py), which holds # many configuration options # # run_options -- an instance of the GitRunOptions class (see # cvs2svn_lib/git_run_options.py), which holds some variables # governing how cvs2git is run # Import some modules that are used in setting the options: import os from cvs2svn_lib import config from cvs2svn_lib import changeset_database from cvs2svn_lib.common import CVSTextDecoder from cvs2svn_lib.log import logger from cvs2svn_lib.git_output_option import GitRevisionInlineWriter from cvs2svn_lib.git_output_option import GitOutputOption from cvs2svn_lib.dvcs_common import KeywordHandlingPropertySetter from cvs2svn_lib.revision_manager import NullRevisionCollector from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader from cvs2svn_lib.symbol_strategy import AllBranchRule from cvs2svn_lib.symbol_strategy import AllTagRule from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule from cvs2svn_lib.symbol_strategy import ExcludeVendorBranchRule from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform from cvs2svn_lib.symbol_transform import RegexpSymbolTransform from cvs2svn_lib.symbol_transform import IgnoreSymbolTransform from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform from cvs2svn_lib.property_setters import AutoPropsPropertySetter from cvs2svn_lib.property_setters import ConditionalPropertySetter from cvs2svn_lib.property_setters import cvs_file_is_binary from cvs2svn_lib.property_setters import CVSBinaryFileDefaultMimeTypeSetter from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter from cvs2svn_lib.property_setters import DefaultEOLStyleSetter from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter from cvs2svn_lib.property_setters import ExecutablePropertySetter from cvs2svn_lib.property_setters import KeywordsPropertySetter from cvs2svn_lib.property_setters import MimeMapper from cvs2svn_lib.property_setters import SVNBinaryFileKeywordsPropertySetter # To choose the level of logging output, uncomment one of the # following lines: #logger.log_level = logger.WARN #logger.log_level = logger.QUIET logger.log_level = logger.NORMAL #logger.log_level = logger.VERBOSE #logger.log_level = logger.DEBUG # The directory to use for temporary files: ctx.tmpdir = r'cvs2svn-tmp' # cvs2hg does not need to keep track of what revisions will be # excluded, so leave this option unchanged: ctx.revision_collector = NullRevisionCollector() # cvs2hg's revision reader is set via the GitOutputOption constructor, # so leave this option set to None. 
ctx.revision_reader = None # Change the following line to True if the conversion should only # include the trunk of the repository (i.e., all branches and tags # should be omitted from the conversion): ctx.trunk_only = False # How to convert CVS author names, log messages, and filenames to # Unicode. The first argument to CVSTextDecoder is a list of encoders # that are tried in order in 'strict' mode until one of them succeeds. # If none of those succeeds, then fallback_encoder (if it is # specified) is used in lossy 'replace' mode. Setting a fallback # encoder ensures that the encoder always succeeds, but it can cause # information loss. ctx.cvs_author_decoder = CVSTextDecoder( [ #'utf8', #'latin1', 'ascii', ], #fallback_encoding='ascii' ) ctx.cvs_log_decoder = CVSTextDecoder( [ #'utf8', #'latin1', 'ascii', ], #fallback_encoding='ascii', eol_fix='\n', ) # You might want to be especially strict when converting filenames to # Unicode (e.g., maybe not specify a fallback_encoding). ctx.cvs_filename_decoder = CVSTextDecoder( [ #'utf8', #'latin1', 'ascii', ], #fallback_encoding='ascii' ) # Template for the commit message to be used for initial project # commits. ctx.initial_project_commit_message = ( 'Standard project directories initialized by cvs2svn.' ) # Template for the commit message to be used for post commits, in # which modifications to a vendor branch are copied back to trunk. # This message can use '%(revnum)d' to include the SVN revision number # of the revision that included the change to the vendor branch # (admittedly rather pointless in a cvs2hg conversion). ctx.post_commit_message = ( 'This commit was generated by cvs2svn to track changes on a CVS ' 'vendor branch.' ) # Template for the commit message to be used for commits in which # symbols are created. This message can use '%(symbol_type)s' to # include the type of the symbol ('branch' or 'tag') or # '%(symbol_name)s' to include the name of the symbol. ctx.symbol_commit_message = ( "This commit was manufactured by cvs2svn to create %(symbol_type)s " "'%(symbol_name)s'." ) # Some CVS clients for MacOS store resource fork data into CVS along # with the file contents itself by wrapping it all up in a container # format called "AppleSingle". Subversion currently does not support # MacOS resource forks. Nevertheless, sometimes the resource fork # information is not necessary and can be discarded. Set the # following option to True if you would like cvs2svn to identify files # whose contents are encoded in AppleSingle format, and discard all # but the data fork for such files before committing them to # Subversion. (Please note that AppleSingle contents are identified # by the AppleSingle magic number as the first four bytes of the file. # This check is not failproof, so only set this option if you think # you need it.) ctx.decode_apple_single = False # This option can be set to the name of a filename to which are stored # statistics and conversion decisions about the CVS symbols. ctx.symbol_info_filename = None #ctx.symbol_info_filename = 'symbol-info.txt' # cvs2svn uses "symbol strategy rules" to help decide how to handle # CVS symbols. The rules in a project's symbol_strategy_rules are # applied in order, and each rule is allowed to modify the symbol. # The result (after each of the rules has been applied) is used for # the conversion. # # 1. A CVS symbol might be used as a tag in one file and as a branch # in another file. cvs2svn has to decide whether to convert such a # symbol as a tag or as a branch. 
cvs2svn uses a series of # heuristic rules to decide how to convert a symbol. The user can # override the default rules for specific symbols or symbols # matching regular expressions. # # 2. cvs2svn is also capable of excluding symbols from the conversion # (provided no other symbols depend on them. # # 3. CVS does not record unambiguously the line of development from # which a symbol sprouted. cvs2svn uses a heuristic to choose a # symbol's "preferred parents". # # The standard branch/tag/exclude StrategyRules do not change a symbol # that has already been processed by an earlier rule, so in effect the # first matching rule is the one that is used. global_symbol_strategy_rules = [ # It is possible to specify manually exactly how symbols should be # converted and what line of development should be used as the # preferred parent. To do so, create a file containing the symbol # hints and enable the following option. # # The format of the hints file is described in the documentation # for the --symbol-hints command-line option. The file output by # the --write-symbol-info (i.e., ctx.symbol_info_filename) option # is in the same format. The simplest way to use this option is # to run the conversion through CollateSymbolsPass with # --write-symbol-info option, copy the symbol info and edit it to # create a hints file, then re-start the conversion at # CollateSymbolsPass with this option enabled. #SymbolHintsFileRule('symbol-hints.txt'), # To force all symbols matching a regular expression to be # converted as branches, add rules like the following: #ForceBranchRegexpStrategyRule(r'branch.*'), # To force all symbols matching a regular expression to be # converted as tags, add rules like the following: #ForceTagRegexpStrategyRule(r'tag.*'), # To force all symbols matching a regular expression to be # excluded from the conversion, add rules like the following: #ExcludeRegexpStrategyRule(r'unknown-.*'), # Sometimes people use "cvs import" to get their own source code # into CVS. This practice creates a vendor branch 1.1.1 and # imports the code onto the vendor branch as 1.1.1.1, then copies # the same content to the trunk as version 1.1. Normally, such # vendor branches are useless and they complicate the SVN history # unnecessarily. The following rule excludes any branches that # only existed as a vendor branch with a single import (leaving # only the 1.1 revision). If you want to retain such branches, # comment out the following line. (Please note that this rule # does not exclude vendor *tags*, as they are not so easy to # identify.) ExcludeTrivialImportBranchRule(), # To exclude all vendor branches (branches that had "cvs import"s # on them but no other kinds of commits), uncomment the following # line: #ExcludeVendorBranchRule(), # Usually you want this rule, to convert unambiguous symbols # (symbols that were only ever used as tags or only ever used as # branches in CVS) the same way they were used in CVS: UnambiguousUsageRule(), # If there was ever a commit on a symbol, then it cannot be # converted as a tag. This rule causes all such symbols to be # converted as branches. If you would like to resolve such # ambiguities manually, comment out the following line: BranchIfCommitsRule(), # Last in the list can be a catch-all rule that is used for # symbols that were not matched by any of the more specific rules # above. (Assuming that BranchIfCommitsRule() was included above, # then the symbols that are still indeterminate at this point can # sensibly be converted as branches or tags.) 
Include at most one # of these lines. If none of these catch-all rules are included, # then the presence of any ambiguous symbols (that haven't been # disambiguated above) is an error: # Convert ambiguous symbols based on whether they were used more # often as branches or as tags: HeuristicStrategyRule(), # Convert all ambiguous symbols as branches: #AllBranchRule(), # Convert all ambiguous symbols as tags: #AllTagRule(), # The last rule is here to choose the preferred parent of branches # and tags, that is, the line of development from which the symbol # sprouts. HeuristicPreferredParentRule(), ] # Specify a username to be used for commits for which CVS doesn't # record the original author (for example, the creation of a branch). # This should be a simple (unix-style) username, but it can be # translated into a hg-style name by the author_transforms map. ctx.username = 'cvs2svn' # ctx.file_property_setters and ctx.revision_property_setters contain # rules used to set the svn properties on files in the converted # archive. For each file, the rules are tried one by one. Any rule # can add or suppress one or more svn properties. Typically the rules # will not overwrite properties set by a previous rule (though they # are free to do so). ctx.file_property_setters should be used for # properties that remain the same for the life of the file; these # should implement FilePropertySetter. ctx.revision_property_setters # should be used for properties that are allowed to vary from revision # to revision; these should implement RevisionPropertySetter. # # Obviously, SVN properties per se are not interesting for a cvs2hg # conversion, but some of these properties have side-effects that do # affect the hg output. FIXME: Document this in more detail. ctx.file_property_setters.extend([ # To read auto-props rules from a file, uncomment the following line # and specify a filename. The boolean argument specifies whether # case should be ignored when matching filenames to the filename # patterns found in the auto-props file: #AutoPropsPropertySetter( # r'/home/username/.subversion/config', # ignore_case=True, # ), # To read mime types from a file and use them to set svn:mime-type # based on the filename extensions, uncomment the following line # and specify a filename (see # http://en.wikipedia.org/wiki/Mime.types for information about # mime.types files): #MimeMapper(r'/etc/mime.types', ignore_case=False), # Omit the svn:eol-style property from any files that are listed # as binary (i.e., mode '-kb') in CVS: CVSBinaryFileEOLStyleSetter(), # If the file is binary and its svn:mime-type property is not yet # set, set svn:mime-type to 'application/octet-stream'. CVSBinaryFileDefaultMimeTypeSetter(), # To try to determine the eol-style from the mime type, uncomment # the following line: #EOLStyleFromMimeTypeSetter(), # Choose one of the following lines to set the default # svn:eol-style if none of the above rules applied. The argument # is the svn:eol-style that should be applied, or None if no # svn:eol-style should be set (i.e., the file should be treated as # binary). # # The default is to treat all files as binary unless one of the # previous rules has determined otherwise, because this is the # safest approach. However, if you have been diligent about # marking binary files with -kb in CVS and/or you have used the # above rules to definitely mark binary files as binary, then you # might prefer to use 'native' as the default, as it is usually # the most convenient setting for text files. 
Other possible # options: 'CRLF', 'CR', 'LF'. DefaultEOLStyleSetter(None), #DefaultEOLStyleSetter('native'), # Prevent svn:keywords from being set on files that have # svn:eol-style unset. SVNBinaryFileKeywordsPropertySetter(), # If svn:keywords has not been set yet, set it based on the file's # CVS mode: KeywordsPropertySetter(config.SVN_KEYWORDS_VALUE), # Set the svn:executable flag on any files that are marked in CVS as # being executable: ExecutablePropertySetter(), # The following causes keywords to be untouched in binary files and # collapsed in all text to be committed: ConditionalPropertySetter( cvs_file_is_binary, KeywordHandlingPropertySetter('untouched'), ), KeywordHandlingPropertySetter('collapsed'), ]) ctx.revision_property_setters.extend([ ]) # To skip the cleanup of temporary files, uncomment the following # option: #ctx.skip_cleanup = True # In CVS, it is perfectly possible to make a single commit that # affects more than one project or more than one branch of a single # project. Subversion also allows such commits. Therefore, by # default, when cvs2svn sees what looks like a cross-project or # cross-branch CVS commit, it converts it into a # cross-project/cross-branch Subversion commit. # # However, other tools and SCMs have trouble representing # cross-project or cross-branch commits. (For example, Trac's Revtree # plugin, http://www.trac-hacks.org/wiki/RevtreePlugin is confused by # such commits.) Therefore, we provide the following two options to # allow cross-project/cross-branch commits to be suppressed. # cvs2hg only supports single-project conversions (multiple-project # conversions wouldn't really make sense for hg anyway). So this # option must be set to False: ctx.cross_project_commits = False # Mercurial itself doesn't allow commits that affect more than one # branch, so this option must be set to False: ctx.cross_branch_commits = False # cvs2hg does not yet handle translating .cvsignore files into # .hgignore content, so by default, the .cvsignore files are included # inthe conversion output. If you would like to omit the .cvsignore # files from the output, set this option to False: ctx.keep_cvsignore = True # By default, it is a fatal error for a CVS ",v" file to appear both # inside and outside of an "Attic" subdirectory (this should never # happen, but frequently occurs due to botched repository # administration). If you would like to retain both versions of such # files, change the following option to True, and the attic version of # the file will be written to a subdirectory called "Attic" in the # output repository: ctx.retain_conflicting_attic_files = False # CVS uses unix login names as author names whereas "hg fastimport" # format requires author names to be of the form "foo ". The # default is to set the author to "cvsauthor ". # author_transforms can be used to map cvsauthor names (e.g., # "jrandom") to a true name and email address (e.g., "J. Random # " for the example shown). All strings should # be either Unicode strings (i.e., with "u" as a prefix) or 8-bit # strings in the utf-8 encoding. The values can either be strings in # the form "name " or tuples (name, email). Please substitute # your own project's usernames here to use with the author_transforms # option of GitOutputOption below. author_transforms={ 'jrandom' : ('J. 
Random', 'jrandom@example.com'), 'mhagger' : 'Michael Haggerty ', 'brane' : (u'Branko Čibej', 'brane@xbc.nu'), 'ringstrom' : 'Tobias Ringström ', 'dionisos' : (u'Erik Hülsmann', 'e.huelsmann@gmx.net'), # This one will be used for commits for which CVS doesn't record # the original author, as explained above. 'cvs2svn' : 'cvs2svn ', } # This is the main option that causes cvs2svn to output to an "hg # fastimport"-format dumpfile rather than to Subversion: ctx.output_option = GitOutputOption( # The file in which to write the "hg fastimport" stream that # contains the changesets and branch/tag information: os.path.join(ctx.tmpdir, 'hg-dump.dat'), # Write the file contents inline in the "hg fastimport" stream, # rather than using a separate blobs file (which "hg fastimport" # cannot handle). revision_writer=GitRevisionInlineWriter( # cvs2hg uses either RCS's "co" command or CVS's "cvs co -p" to # extract the content of file revisions. Here you can choose # whether to use RCS (faster, but fails in some rare # circumstances) or CVS (much slower, but more reliable). #RCSRevisionReader(co_executable=r'co') CVSRevisionReader(cvs_executable=r'cvs') ), # Optional map from CVS author names to hg author names: author_transforms=author_transforms, ) # Change this option to True to turn on profiling of cvs2svn (for # debugging purposes): run_options.profiling = False # Should CVSItem -> Changeset database files be memory mapped? In # some tests, using memory mapping speeded up the overall conversion # by about 5%. But this option can cause the conversion to fail with # an out of memory error if the conversion computer runs out of # virtual address space (e.g., when running a very large conversion on # a 32-bit operating system). Therefore it is disabled by default. # Uncomment the following line to allow these database files to be # memory mapped. #changeset_database.use_mmap_for_cvs_item_to_changeset_table = True # Now set the project to be converted to hg. cvs2hg only supports # single-project conversions, so this method must only be called once: run_options.set_project( # The filesystem path to the part of the CVS repository (*not* a # CVS working copy) that should be converted. This may be a # subdirectory (i.e., a module) within a larger CVS repository. r'test-data/main-cvsrepos', # A list of symbol transformations that can be used to rename # symbols in this project. symbol_transforms=[ # Use IgnoreSymbolTransforms like the following to completely # ignore symbols matching a regular expression when parsing # the CVS repository, for example to avoid warnings about # branches with two names and to choose the preferred name. # It is *not* recommended to use this instead of # ExcludeRegexpStrategyRule; though more efficient, # IgnoreSymbolTransforms are less flexible and don't exclude # branches correctly. The argument is a Python-style regular # expression that has to match the *whole* CVS symbol name: #IgnoreSymbolTransform(r'nightly-build-tag-.*') # RegexpSymbolTransforms transform symbols textually using a # regular expression. The first argument is a Python regular # expression pattern and the second is a replacement pattern. # The pattern is matched against each symbol name. If it # matches the whole symbol name, then the symbol name is # replaced with the corresponding replacement text. The # replacement can include substitution patterns (e.g., r'\1' # or r'\g'). 
Typically you will want to use raw strings # (strings with a preceding 'r', like shown in the examples) # for the regexp and its replacement to avoid backslash # substitution within those strings. #RegexpSymbolTransform(r'release-(\d+)_(\d+)', # r'release-\1.\2'), #RegexpSymbolTransform(r'release-(\d+)_(\d+)_(\d+)', # r'release-\1.\2.\3'), # Simple 1:1 character replacements can also be done. The # following transform, which converts backslashes into forward # slashes, should usually be included: ReplaceSubstringsSymbolTransform('\\','/'), # This last rule eliminates leading, trailing, and repeated # slashes within the output symbol names: NormalizePathsSymbolTransform(), ], # See the definition of global_symbol_strategy_rules above for a # description of this option: symbol_strategy_rules=global_symbol_strategy_rules, # Exclude paths from the conversion. Should be relative to # repository path and use forward slashes: #exclude_paths=['file-to-exclude.txt,v', 'dir/to/exclude'], ) cvs2svn-2.4.0/cvs2git-example.options0000664000076500007650000006647312027257624020661 0ustar mhaggermhagger00000000000000# (Be in -*- mode: python; coding: utf-8 -*- mode.) # # ==================================================================== # Copyright (c) 2006-2010 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== # ##################### # ## PLEASE READ ME! ## # ##################### # # This is a template for an options file that can be used to configure # cvs2svn to convert to git rather than to Subversion. See # www/cvs2git.html and www/cvs2svn.html for general information, and # see the comments in this file for information about what options are # available and how they can be set. # # The program that is run to convert from CVS to git is called # cvs2git. Run it with the --options option, passing it this file # like this: # # cvs2git --options=cvs2git-example.options # # The output of cvs2git is a blob file and a dump file that can be # loaded into git using the "git fast-import" command. Please read # www/cvs2git.html for more information. # # Many options do not have defaults, so it is easier to copy this file # and modify what you need rather than creating a new options file # from scratch. This file is in Python syntax, but you don't need to # know Python to modify it. But if you *do* know Python, then you # will be happy to know that you can use arbitary Python constructs to # do fancy configuration tricks. # # But please be aware of the following: # # * In many places, leading whitespace is significant in Python (it is # used instead of curly braces to group statements together). # Therefore, if you don't know what you are doing, it is best to # leave the whitespace as it is. # # * In normal strings, Python treats a backslash ("\") as an escape # character. 
Therefore, if you want to specify a string that # contains a backslash, you need either to escape the backslash with # another backslash ("\\"), or use a "raw string", as in one if the # following equivalent examples: # # cvs_executable = 'c:\\windows\\system32\\cvs.exe' # cvs_executable = r'c:\windows\system32\cvs.exe' # # See http://docs.python.org/tutorial/introduction.html#strings for # more information. # # Two identifiers will have been defined before this file is executed, # and can be used freely within this file: # # ctx -- a Ctx object (see cvs2svn_lib/context.py), which holds # many configuration options # # run_options -- an instance of the GitRunOptions class (see # cvs2svn_lib/git_run_options.py), which holds some variables # governing how cvs2git is run # Import some modules that are used in setting the options: import os from cvs2svn_lib import config from cvs2svn_lib import changeset_database from cvs2svn_lib.common import CVSTextDecoder from cvs2svn_lib.log import logger from cvs2svn_lib.git_revision_collector import GitRevisionCollector from cvs2svn_lib.external_blob_generator import ExternalBlobGenerator from cvs2svn_lib.git_output_option import GitRevisionMarkWriter from cvs2svn_lib.git_output_option import GitOutputOption from cvs2svn_lib.dvcs_common import KeywordHandlingPropertySetter from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader from cvs2svn_lib.symbol_strategy import AllBranchRule from cvs2svn_lib.symbol_strategy import AllTagRule from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule from cvs2svn_lib.symbol_strategy import ExcludeVendorBranchRule from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform from cvs2svn_lib.symbol_transform import RegexpSymbolTransform from cvs2svn_lib.symbol_transform import IgnoreSymbolTransform from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform from cvs2svn_lib.property_setters import AutoPropsPropertySetter from cvs2svn_lib.property_setters import ConditionalPropertySetter from cvs2svn_lib.property_setters import cvs_file_is_binary from cvs2svn_lib.property_setters import CVSBinaryFileDefaultMimeTypeSetter from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter from cvs2svn_lib.property_setters import DefaultEOLStyleSetter from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter from cvs2svn_lib.property_setters import ExecutablePropertySetter from cvs2svn_lib.property_setters import KeywordsPropertySetter from cvs2svn_lib.property_setters import MimeMapper from cvs2svn_lib.property_setters import SVNBinaryFileKeywordsPropertySetter # To choose the level of logging output, uncomment one of the # following lines: #logger.log_level = logger.WARN #logger.log_level = logger.QUIET logger.log_level = logger.NORMAL #logger.log_level = logger.VERBOSE #logger.log_level = logger.DEBUG # The directory to use for temporary files: ctx.tmpdir = r'cvs2svn-tmp' # During 
FilterSymbolsPass, cvs2git records the contents of file
# revisions into a "blob" file in git-fast-import format.  The
# ctx.revision_collector option configures that process.  Choose one
# of the two versions and customize its options.

# This first alternative is much slower but is better tested and has a
# chance of working with CVSNT repositories.  It invokes CVS or RCS to
# reconstruct the contents of CVS file revisions:
ctx.revision_collector = GitRevisionCollector(
    # The file in which to write the git-fast-import stream that
    # contains the file revision contents:
    'cvs2svn-tmp/git-blob.dat',

    # The following option specifies how the revision contents of the
    # RCS files should be read.
    #
    # RCSRevisionReader uses RCS's "co" program to extract the
    # revision contents of the RCS files during CollectRevsPass.  The
    # constructor argument specifies how to invoke the "co"
    # executable.
    #
    # CVSRevisionReader uses the "cvs" program to extract the revision
    # contents out of the RCS files during OutputPass.  This option is
    # considerably slower than RCSRevisionReader because "cvs" is
    # considerably slower than "co".  However, it works in some
    # situations where RCSRevisionReader fails; see the HTML
    # documentation of the "--use-cvs" option for details.  The
    # constructor argument specifies how to invoke the "cvs"
    # executable.  It is also possible to pass a global_options
    # parameter to CVSRevisionReader to specify which options should
    # be passed to the cvs command.  By default the correct options
    # are usually chosen, but for CVSNT you might want to add
    # global_options=['-q', '-N', '-f'].
    #
    # Uncomment one of the two following lines:
    #RCSRevisionReader(co_executable=r'co'),
    CVSRevisionReader(cvs_executable=r'cvs'),
    )

# This second alternative is vastly faster than the version above.  It
# uses an external Python program to reconstruct the contents of CVS
# file revisions:
#ctx.revision_collector = ExternalBlobGenerator('cvs2svn-tmp/git-blob.dat')

# cvs2git doesn't need a revision reader because OutputPass only
# refers to blobs that were output during CollectRevsPass, so leave
# this option set to None.
ctx.revision_reader = None

# Change the following line to True if the conversion should only
# include the trunk of the repository (i.e., all branches and tags
# should be omitted from the conversion):
ctx.trunk_only = False

# How to convert CVS author names, log messages, and filenames to
# Unicode.  The first argument to CVSTextDecoder is a list of encoders
# that are tried in order in 'strict' mode until one of them succeeds.
# If none of those succeeds, then fallback_encoder (if it is
# specified) is used in lossy 'replace' mode.  Setting a fallback
# encoder ensures that the encoder always succeeds, but it can cause
# information loss.
ctx.cvs_author_decoder = CVSTextDecoder(
    [
        #'utf8',
        #'latin1',
        'ascii',
        ],
    #fallback_encoding='ascii'
    )
ctx.cvs_log_decoder = CVSTextDecoder(
    [
        #'utf8',
        #'latin1',
        'ascii',
        ],
    #fallback_encoding='ascii',
    eol_fix='\n',
    )
# You might want to be especially strict when converting filenames to
# Unicode (e.g., maybe not specify a fallback_encoding).
ctx.cvs_filename_decoder = CVSTextDecoder(
    [
        #'utf8',
        #'latin1',
        'ascii',
        ],
    #fallback_encoding='ascii'
    )

# Template for the commit message to be used for initial project
# commits.
ctx.initial_project_commit_message = (
    'Standard project directories initialized by cvs2svn.'
    )

# Template for the commit message to be used for post commits, in
# which modifications to a vendor branch are copied back to trunk.
# This message can use '%(revnum)d' to include the SVN revision number # of the revision that included the change to the vendor branch # (admittedly rather pointless in a cvs2git conversion). ctx.post_commit_message = ( 'This commit was generated by cvs2svn to track changes on a CVS ' 'vendor branch.' ) # Template for the commit message to be used for commits in which # symbols are created. This message can use '%(symbol_type)s' to # include the type of the symbol ('branch' or 'tag') or # '%(symbol_name)s' to include the name of the symbol. ctx.symbol_commit_message = ( "This commit was manufactured by cvs2svn to create %(symbol_type)s " "'%(symbol_name)s'." ) # Template for the commit message to be used for commits in which # tags are pseudo-merged back to their source branch. This message can # use '%(symbol_name)s' to include the name of the symbol. # (Not used by default unless you enable tie_tag_fixup_branches on # GitOutputOption.) ctx.tie_tag_ancestry_message = ( "This commit was manufactured by cvs2svn to tie ancestry for " "tag '%(symbol_name)s' back to the source branch." ) # Some CVS clients for MacOS store resource fork data into CVS along # with the file contents itself by wrapping it all up in a container # format called "AppleSingle". Subversion currently does not support # MacOS resource forks. Nevertheless, sometimes the resource fork # information is not necessary and can be discarded. Set the # following option to True if you would like cvs2svn to identify files # whose contents are encoded in AppleSingle format, and discard all # but the data fork for such files before committing them to # Subversion. (Please note that AppleSingle contents are identified # by the AppleSingle magic number as the first four bytes of the file. # This check is not failproof, so only set this option if you think # you need it.) ctx.decode_apple_single = False # This option can be set to the name of a filename to which are stored # statistics and conversion decisions about the CVS symbols. ctx.symbol_info_filename = None #ctx.symbol_info_filename = 'symbol-info.txt' # cvs2svn uses "symbol strategy rules" to help decide how to handle # CVS symbols. The rules in a project's symbol_strategy_rules are # applied in order, and each rule is allowed to modify the symbol. # The result (after each of the rules has been applied) is used for # the conversion. # # 1. A CVS symbol might be used as a tag in one file and as a branch # in another file. cvs2svn has to decide whether to convert such a # symbol as a tag or as a branch. cvs2svn uses a series of # heuristic rules to decide how to convert a symbol. The user can # override the default rules for specific symbols or symbols # matching regular expressions. # # 2. cvs2svn is also capable of excluding symbols from the conversion # (provided no other symbols depend on them. # # 3. CVS does not record unambiguously the line of development from # which a symbol sprouted. cvs2svn uses a heuristic to choose a # symbol's "preferred parents". # # The standard branch/tag/exclude StrategyRules do not change a symbol # that has already been processed by an earlier rule, so in effect the # first matching rule is the one that is used. global_symbol_strategy_rules = [ # It is possible to specify manually exactly how symbols should be # converted and what line of development should be used as the # preferred parent. To do so, create a file containing the symbol # hints and enable the following option. 
# # The format of the hints file is described in the documentation # for the --symbol-hints command-line option. The file output by # the --write-symbol-info (i.e., ctx.symbol_info_filename) option # is in the same format. The simplest way to use this option is # to run the conversion through CollateSymbolsPass with # --write-symbol-info option, copy the symbol info and edit it to # create a hints file, then re-start the conversion at # CollateSymbolsPass with this option enabled. #SymbolHintsFileRule('symbol-hints.txt'), # To force all symbols matching a regular expression to be # converted as branches, add rules like the following: #ForceBranchRegexpStrategyRule(r'branch.*'), # To force all symbols matching a regular expression to be # converted as tags, add rules like the following: #ForceTagRegexpStrategyRule(r'tag.*'), # To force all symbols matching a regular expression to be # excluded from the conversion, add rules like the following: #ExcludeRegexpStrategyRule(r'unknown-.*'), # Sometimes people use "cvs import" to get their own source code # into CVS. This practice creates a vendor branch 1.1.1 and # imports the code onto the vendor branch as 1.1.1.1, then copies # the same content to the trunk as version 1.1. Normally, such # vendor branches are useless and they complicate the SVN history # unnecessarily. The following rule excludes any branches that # only existed as a vendor branch with a single import (leaving # only the 1.1 revision). If you want to retain such branches, # comment out the following line. (Please note that this rule # does not exclude vendor *tags*, as they are not so easy to # identify.) ExcludeTrivialImportBranchRule(), # To exclude all vendor branches (branches that had "cvs import"s # on them but no other kinds of commits), uncomment the following # line: #ExcludeVendorBranchRule(), # Usually you want this rule, to convert unambiguous symbols # (symbols that were only ever used as tags or only ever used as # branches in CVS) the same way they were used in CVS: UnambiguousUsageRule(), # If there was ever a commit on a symbol, then it cannot be # converted as a tag. This rule causes all such symbols to be # converted as branches. If you would like to resolve such # ambiguities manually, comment out the following line: BranchIfCommitsRule(), # Last in the list can be a catch-all rule that is used for # symbols that were not matched by any of the more specific rules # above. (Assuming that BranchIfCommitsRule() was included above, # then the symbols that are still indeterminate at this point can # sensibly be converted as branches or tags.) Include at most one # of these lines. If none of these catch-all rules are included, # then the presence of any ambiguous symbols (that haven't been # disambiguated above) is an error: # Convert ambiguous symbols based on whether they were used more # often as branches or as tags: HeuristicStrategyRule(), # Convert all ambiguous symbols as branches: #AllBranchRule(), # Convert all ambiguous symbols as tags: #AllTagRule(), # The last rule is here to choose the preferred parent of branches # and tags, that is, the line of development from which the symbol # sprouts. HeuristicPreferredParentRule(), ] # Specify a username to be used for commits for which CVS doesn't # record the original author (for example, the creation of a branch). # This should be a simple (unix-style) username, but it can be # translated into a git-style name by the author_transforms map. 
ctx.username = 'cvs2svn' # ctx.file_property_setters and ctx.revision_property_setters contain # rules used to set the svn properties on files in the converted # archive. For each file, the rules are tried one by one. Any rule # can add or suppress one or more svn properties. Typically the rules # will not overwrite properties set by a previous rule (though they # are free to do so). ctx.file_property_setters should be used for # properties that remain the same for the life of the file; these # should implement FilePropertySetter. ctx.revision_property_setters # should be used for properties that are allowed to vary from revision # to revision; these should implement RevisionPropertySetter. # # Obviously, SVN properties per se are not interesting for a cvs2git # conversion, but some of these properties have side-effects that do # affect the git output. FIXME: Document this in more detail. ctx.file_property_setters.extend([ # To read auto-props rules from a file, uncomment the following line # and specify a filename. The boolean argument specifies whether # case should be ignored when matching filenames to the filename # patterns found in the auto-props file: #AutoPropsPropertySetter( # r'/home/username/.subversion/config', # ignore_case=True, # ), # To read mime types from a file and use them to set svn:mime-type # based on the filename extensions, uncomment the following line # and specify a filename (see # http://en.wikipedia.org/wiki/Mime.types for information about # mime.types files): #MimeMapper(r'/etc/mime.types', ignore_case=False), # Omit the svn:eol-style property from any files that are listed # as binary (i.e., mode '-kb') in CVS: CVSBinaryFileEOLStyleSetter(), # If the file is binary and its svn:mime-type property is not yet # set, set svn:mime-type to 'application/octet-stream'. CVSBinaryFileDefaultMimeTypeSetter(), # To try to determine the eol-style from the mime type, uncomment # the following line: #EOLStyleFromMimeTypeSetter(), # Choose one of the following lines to set the default # svn:eol-style if none of the above rules applied. The argument # is the svn:eol-style that should be applied, or None if no # svn:eol-style should be set (i.e., the file should be treated as # binary). # # The default is to treat all files as binary unless one of the # previous rules has determined otherwise, because this is the # safest approach. However, if you have been diligent about # marking binary files with -kb in CVS and/or you have used the # above rules to definitely mark binary files as binary, then you # might prefer to use 'native' as the default, as it is usually # the most convenient setting for text files. Other possible # options: 'CRLF', 'CR', 'LF'. DefaultEOLStyleSetter(None), #DefaultEOLStyleSetter('native'), # Prevent svn:keywords from being set on files that have # svn:eol-style unset. 
SVNBinaryFileKeywordsPropertySetter(), # If svn:keywords has not been set yet, set it based on the file's # CVS mode: KeywordsPropertySetter(config.SVN_KEYWORDS_VALUE), # Set the svn:executable flag on any files that are marked in CVS as # being executable: ExecutablePropertySetter(), # The following causes keywords to be untouched in binary files and # collapsed in all text to be committed: ConditionalPropertySetter( cvs_file_is_binary, KeywordHandlingPropertySetter('untouched'), ), KeywordHandlingPropertySetter('collapsed'), ]) ctx.revision_property_setters.extend([ ]) # To skip the cleanup of temporary files, uncomment the following # option: #ctx.skip_cleanup = True # In CVS, it is perfectly possible to make a single commit that # affects more than one project or more than one branch of a single # project. Subversion also allows such commits. Therefore, by # default, when cvs2svn sees what looks like a cross-project or # cross-branch CVS commit, it converts it into a # cross-project/cross-branch Subversion commit. # # However, other tools and SCMs have trouble representing # cross-project or cross-branch commits. (For example, Trac's Revtree # plugin, http://www.trac-hacks.org/wiki/RevtreePlugin is confused by # such commits.) Therefore, we provide the following two options to # allow cross-project/cross-branch commits to be suppressed. # cvs2git only supports single-project conversions (multiple-project # conversions wouldn't really make sense for git anyway). So this # option must be set to False: ctx.cross_project_commits = False # git itself doesn't allow commits that affect more than one branch, # so this option must be set to False: ctx.cross_branch_commits = False # cvs2git does not yet handle translating .cvsignore files into # .gitignore files, so by default, the .cvsignore files are included # in the conversion output. If you would like to omit the .cvsignore # files from the output, set this option to False: ctx.keep_cvsignore = True # By default, it is a fatal error for a CVS ",v" file to appear both # inside and outside of an "Attic" subdirectory (this should never # happen, but frequently occurs due to botched repository # administration). If you would like to retain both versions of such # files, change the following option to True, and the attic version of # the file will be written to a subdirectory called "Attic" in the # output repository: ctx.retain_conflicting_attic_files = False # CVS uses unix login names as author names whereas git requires # author names to be of the form "foo ". The default is to set # the git author to "cvsauthor ". author_transforms can be # used to map cvsauthor names (e.g., "jrandom") to a true name and # email address (e.g., "J. Random " for the # example shown). All strings should be either Unicode strings (i.e., # with "u" as a prefix) or 8-bit strings in the utf-8 encoding. The # values can either be strings in the form "name " or tuples # (name, email). Please substitute your own project's usernames here # to use with the author_transforms option of GitOutputOption below. author_transforms={ 'jrandom' : ('J. Random', 'jrandom@example.com'), 'mhagger' : 'Michael Haggerty ', 'brane' : (u'Branko Čibej', 'brane@xbc.nu'), 'ringstrom' : 'Tobias Ringström ', 'dionisos' : (u'Erik Hülsmann', 'e.huelsmann@gmx.net'), # This one will be used for commits for which CVS doesn't record # the original author, as explained above. 
'cvs2svn' : 'cvs2svn ', } # This is the main option that causes cvs2svn to output to a # "fastimport"-format dumpfile rather than to Subversion: ctx.output_option = GitOutputOption( # The file in which to write the git-fast-import stream that # contains the changesets and branch/tag information: os.path.join(ctx.tmpdir, 'git-dump.dat'), # The blobs will be written via the revision recorder, so in # OutputPass we only have to emit references to the blob marks: GitRevisionMarkWriter(), # Optional map from CVS author names to git author names: author_transforms=author_transforms, ) # Change this option to True to turn on profiling of cvs2svn (for # debugging purposes): run_options.profiling = False # Should CVSItem -> Changeset database files be memory mapped? In # some tests, using memory mapping speeded up the overall conversion # by about 5%. But this option can cause the conversion to fail with # an out of memory error if the conversion computer runs out of # virtual address space (e.g., when running a very large conversion on # a 32-bit operating system). Therefore it is disabled by default. # Uncomment the following line to allow these database files to be # memory mapped. #changeset_database.use_mmap_for_cvs_item_to_changeset_table = True # Now set the project to be converted to git. cvs2git only supports # single-project conversions, so this method must only be called # once: run_options.set_project( # The filesystem path to the part of the CVS repository (*not* a # CVS working copy) that should be converted. This may be a # subdirectory (i.e., a module) within a larger CVS repository. r'test-data/main-cvsrepos', # A list of symbol transformations that can be used to rename # symbols in this project. symbol_transforms=[ # Use IgnoreSymbolTransforms like the following to completely # ignore symbols matching a regular expression when parsing # the CVS repository, for example to avoid warnings about # branches with two names and to choose the preferred name. # It is *not* recommended to use this instead of # ExcludeRegexpStrategyRule; though more efficient, # IgnoreSymbolTransforms are less flexible and don't exclude # branches correctly. The argument is a Python-style regular # expression that has to match the *whole* CVS symbol name: #IgnoreSymbolTransform(r'nightly-build-tag-.*') # RegexpSymbolTransforms transform symbols textually using a # regular expression. The first argument is a Python regular # expression pattern and the second is a replacement pattern. # The pattern is matched against each symbol name. If it # matches the whole symbol name, then the symbol name is # replaced with the corresponding replacement text. The # replacement can include substitution patterns (e.g., r'\1' # or r'\g'). Typically you will want to use raw strings # (strings with a preceding 'r', like shown in the examples) # for the regexp and its replacement to avoid backslash # substitution within those strings. #RegexpSymbolTransform(r'release-(\d+)_(\d+)', # r'release-\1.\2'), #RegexpSymbolTransform(r'release-(\d+)_(\d+)_(\d+)', # r'release-\1.\2.\3'), # Simple 1:1 character replacements can also be done. 
The # following transform, which converts backslashes into forward # slashes, should usually be included: ReplaceSubstringsSymbolTransform('\\','/'), # This last rule eliminates leading, trailing, and repeated # slashes within the output symbol names: NormalizePathsSymbolTransform(), ], # See the definition of global_symbol_strategy_rules above for a # description of this option: symbol_strategy_rules=global_symbol_strategy_rules, # Exclude paths from the conversion. Should be relative to # repository path and use forward slashes: #exclude_paths=['file-to-exclude.txt,v', 'dir/to/exclude'], ) cvs2svn-2.4.0/contrib/0000775000076500007650000000000012027373500015642 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/contrib/show_db.py0000775000076500007650000001351711354150217017653 0ustar mhaggermhagger00000000000000#!/usr/bin/env python import anydbm import marshal import sys import os import getopt import cPickle as pickle from cStringIO import StringIO sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) from cvs2svn_lib import config from cvs2svn_lib.context import Ctx from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.artifact_manager import artifact_manager def usage(): cmd = sys.argv[0] sys.stderr.write('Usage: %s OPTION [DIRECTORY]\n\n' % os.path.basename(cmd)) sys.stderr.write( 'Show the contents of the temporary database files created by cvs2svn\n' 'in a structured human-readable way.\n' '\n' 'OPTION is one of:\n' ' -R SVNRepositoryMirror revisions table\n' ' -N SVNRepositoryMirror nodes table\n' ' -r rev SVNRepositoryMirror node tree for specific revision\n' ' -m MetadataDatabase\n' ' -f CVSPathDatabase\n' ' -c PersistenceManager SVNCommit table\n' ' -C PersistenceManager cvs-revs-to-svn-revnums table\n' ' -i CVSItemDatabase (normal)\n' ' -I CVSItemDatabase (filtered)\n' ' -p file Show the given file, assuming it contains a pickle.\n' '\n' 'DIRECTORY is the directory containing the temporary database files.\n' 'If omitted, the current directory is assumed.\n') sys.exit(1) def print_node_tree(db, key="0", name="", prefix=""): print "%s%s (%s)" % (prefix, name, key) if name[:1] != "/": dict = marshal.loads(db[key]) items = dict.items() items.sort() for entry in items: print_node_tree(db, entry[1], entry[0], prefix + " ") def show_int2str_db(fname): db = anydbm.open(fname, 'r') k = map(int, db.keys()) k.sort() for i in k: print "%6d: %s" % (i, db[str(i)]) def show_str2marshal_db(fname): db = anydbm.open(fname, 'r') k = db.keys() k.sort() for i in k: print "%6s: %s" % (i, marshal.loads(db[i])) def show_str2pickle_db(fname): db = anydbm.open(fname, 'r') k = db.keys() k.sort() for i in k: o = pickle.loads(db[i]) print "%6s: %r" % (i, o) print " %s" % (o,) def show_str2ppickle_db(fname): db = anydbm.open(fname, 'r') k = db.keys() k.remove('_') k.sort(key=lambda s: int(s, 16)) u1 = pickle.Unpickler(StringIO(db['_'])) u1.load() for i in k: u2 = pickle.Unpickler(StringIO(db[i])) u2.memo = u1.memo.copy() o = u2.load() print "%6s: %r" % (i, o) print " %s" % (o,) def show_cvsitemstore(): for cvs_file_items in Ctx()._cvs_items_db.iter_cvs_file_items(): items = cvs_file_items.values() items.sort(key=lambda i: i.id) for item in items: print "%6x: %r" % (item.id, item,) def show_filtered_cvs_item_store(): from cvs2svn_lib.cvs_item_database import IndexedCVSItemStore db = IndexedCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_FILTERED_STORE), artifact_manager.get_temp_file(config.CVS_ITEMS_FILTERED_INDEX_TABLE), DB_OPEN_READ) ids = list(db.iterkeys()) 
ids.sort() for id in ids: cvs_item = db[id] print "%6x: %r" % (cvs_item.id, cvs_item,) class ProjectList: """A mock project-list that can be assigned to Ctx()._projects.""" def __init__(self): self.projects = {} def __getitem__(self, i): return self.projects.setdefault(i, 'Project%d' % i) def prime_ctx(): def rf(filename): artifact_manager.register_temp_file(filename, None) from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.symbol_database import SymbolDatabase from cvs2svn_lib.cvs_path_database import CVSPathDatabase rf(config.CVS_PATHS_DB) rf(config.SYMBOL_DB) from cvs2svn_lib.cvs_item_database import OldCVSItemStore from cvs2svn_lib.metadata_database import MetadataDatabase rf(config.METADATA_DB) rf(config.CVS_ITEMS_STORE) rf(config.CVS_ITEMS_FILTERED_STORE) rf(config.CVS_ITEMS_FILTERED_INDEX_TABLE) artifact_manager.pass_started(None) Ctx()._projects = ProjectList() Ctx()._symbol_db = SymbolDatabase() Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._cvs_items_db = OldCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_STORE) ) Ctx()._metadata_db = MetadataDatabase(DB_OPEN_READ) def main(): try: opts, args = getopt.getopt(sys.argv[1:], "RNr:mlfcCiIp:") except getopt.GetoptError: usage() if len(args) > 1 or len(opts) != 1: usage() if len(args) == 1: Ctx().tmpdir = args[0] for o, a in opts: if o == "-R": show_int2str_db(config.SVN_MIRROR_REVISIONS_TABLE) elif o == "-N": show_str2marshal_db( config.SVN_MIRROR_NODES_STORE, config.SVN_MIRROR_NODES_INDEX_TABLE ) elif o == "-r": try: revnum = int(a) except ValueError: sys.stderr.write('Option -r requires a valid revision number\n') sys.exit(1) db = anydbm.open(config.SVN_MIRROR_REVISIONS_TABLE, 'r') key = db[str(revnum)] db.close() db = anydbm.open(config.SVN_MIRROR_NODES_STORE, 'r') print_node_tree(db, key, "Revision %d" % revnum) elif o == "-m": show_str2marshal_db(config.METADATA_DB) elif o == "-f": prime_ctx() cvs_files = list(Ctx()._cvs_path_db.itervalues()) cvs_files.sort() for cvs_file in cvs_files: print '%6x: %s' % (cvs_file.id, cvs_file,) elif o == "-c": prime_ctx() show_str2ppickle_db( config.SVN_COMMITS_INDEX_TABLE, config.SVN_COMMITS_STORE ) elif o == "-C": show_str2marshal_db(config.CVS_REVS_TO_SVN_REVNUMS) elif o == "-i": prime_ctx() show_cvsitemstore() elif o == "-I": prime_ctx() show_filtered_cvs_item_store() elif o == "-p": obj = pickle.load(open(a)) print repr(obj) print obj else: usage() sys.exit(2) if __name__ == '__main__': main() cvs2svn-2.4.0/contrib/cvsVsvn.pl0000775000076500007650000001212210646512654017662 0ustar mhaggermhagger00000000000000#!/usr/bin/perl -w # # (C) 2005 The Measurement Factory http://www.measurement-factory.com/ # This software is distributed under Apache License, version 2.0. # # The cvsVsvn.pl script compares CVS and Subversion projects. The # user is notified if project files maintained by CVS differ from # those maintained by Subversion: # # $ CVSROOT=/my/cvsrepo ./cvsVsvn project1 svn+ssh://host/path/projects/p1 # collecting CVS tags... # found 34 CVS tags # comparing tagged snapshots... # HEAD snapshots appear to be the same # ... # CVS and SVN repositories differ because cvs_foo and svn_tags_foo # export directories differ in cvsVsvn.tmp # # The comparison is done for every CVS tag and branch (including # HEAD), by exporting corresponding CVS and Subversion snapshots and # running 'diff' against the two resulting directories. One can edit # the script or use the environment variable DIFF_OPTIONS to alter # 'diff' behavior (e.g., ignore differences in some files). 
# Commit logs are not compared, unfortunately. This script is also # confused by files that differ due to different keyword expansion by # CVS and SVN. use strict; # cvsVsvn exports a user-specified module from CVS and Subversion # repositories and compares the two exported directories using the # 'diff' tool. The procedure is performed for all CVS tags (including # HEAD and branches). die(&usage()) unless @ARGV == 2; my ($CvsModule, $SvnModule) = @ARGV; my $TmpDir = 'cvsVsvn.tmp'; # directory to store temporary files my @Tags = &collectTags(); print(STDERR "comparing tagged snapshots...\n"); foreach my $tagPair (@Tags) { &compareTags($tagPair->{cvs}, $tagPair->{svn}); } print(STDERR "CVS and Subversion repositories appear to be the same\n"); exit(0); sub collectTags { print(STDERR "collecting CVS tags...\n"); my @tags = ( { cvs => 'HEAD', svn => 'trunk' } ); # get CVS log headers with symbolic tags my %names = (); my $inNames; my $cmd = sprintf('cvs rlog -h %s', $CvsModule); open(IF, "$cmd|") or die("cannot execute $cmd: $!, stopped"); while (<IF>) { if ($inNames) { my ($name, $version) = /\s+(\S+):\s*(\d\S*)/; if ($inNames = defined $version) { my @nums = split(/\./, $version); my $isBranch = (2*int(@nums/2) != @nums) || (@nums > 2 && $nums[$#nums-1] == 0); my $status = $isBranch ? 'branches' : 'tags'; my $oldStatus = $names{$name}; next if $oldStatus && $oldStatus eq $status; die("change in $name tag status, stopped") if $oldStatus; $names{$name} = $status; } } else { $inNames = /^symbolic names:/; } } close(IF); while (my ($name, $status) = each %names) { my $tagPair = { cvs => $name, svn => sprintf('%s/%s', $status, $name) }; push (@tags, $tagPair); } printf(STDERR "found %d CVS tags\n", scalar @tags); return @tags; } sub compareTags { my ($cvsTag, $svnTag) = @_; &prepDirs(); &cvsExport($cvsTag); &svnExport($svnTag); &diffDir($cvsTag, $svnTag); # identical directories, clean up &cleanDirs(); } sub diffDir { my ($cvsTag, $svnTag) = @_; my $cvsDir = &cvsDir($cvsTag); my $svnDir = &svnDir($svnTag); my $same = systemf('diff --brief -b -B -r "%s" "%s"', $cvsDir, $svnDir) == 0; die("CVS and SVN repositories differ because ".
"$cvsDir and $svnDir export directories differ in $TmpDir; stopped") unless $same; print(STDERR "$cvsTag snapshots appear to be the same\n"); return 0; } sub makeDir { my $dir = shift; &systemf('mkdir %s', $dir) == 0 or die("cannot create $dir: $!, stopped"); } sub prepDirs { &makeDir($TmpDir); chdir($TmpDir) or die($!); } sub cleanDirs { chdir('..') or die($!); &systemf('rm -irf %s', $TmpDir) == 0 or die("cannot delete $TmpDir: $!, stopped"); } sub cvsExport { my ($cvsTag) = @_; my $dir = &cvsDir($cvsTag); &makeDir($dir); &systemf('cvs -Q export -r %s -d %s %s', $cvsTag, $dir, $CvsModule) == 0 or die("cannot export $cvsTag of CVS module '$CvsModule', stopped"); } sub svnExport { my ($svnTag) = @_; my $dir = &svnDir($svnTag); my $cvsOk = &systemf('svn list %s/%s > /dev/null', $SvnModule, $svnTag) == 0 && &systemf('svn -q export %s/%s %s', $SvnModule, $svnTag, $dir) == 0; die("cannot export $svnTag of svn module '$SvnModule', stopped") unless $cvsOk && -d $dir; } sub tag2dir { my ($category, $tag) = @_; my $dir = sprintf('%s_%s', $category, $tag); # remove dangerous chars $dir =~ s/[^A-z0-9_\.\-]+/_/g; return $dir; } sub cvsDir { return &tag2dir('cvs', @_); } sub svnDir { return &tag2dir('svn', @_); } sub systemf { my ($fmt, @params) = @_; my $cmd = sprintf($fmt, (@params)); #print(STDERR "$cmd\n"); return system($cmd); } sub usage { return "usage: $0 \n"; } cvs2svn-2.4.0/contrib/__init__.py0000664000076500007650000000143310646512654017766 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Allow this directory to be imported as a module.""" cvs2svn-2.4.0/contrib/shrink_test_case.py0000775000076500007650000005331011710517256021557 0ustar mhaggermhagger00000000000000#! /usr/bin/python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Shrink a test case as much as possible. !!!!!!! WARNING !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !! This script irretrievably destroys the CVS repository that it is !! !! applied to! !! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
This script is meant to be used to shrink the size of a CVS repository that is to be used as a test case for cvs2svn. It tries to throw out parts of the repository while preserving the bug. CVSREPO should be the path of a copy of a CVS archive. TEST_COMMAND is a command that should run successfully (i.e., with exit code '0') if the bug is still present, and fail if the bug is absent.""" import sys import os import shutil import optparse from cStringIO import StringIO sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) from cvs2svn_lib.key_generator import KeyGenerator from cvs2svn_lib.rcsparser import Sink from cvs2svn_lib.rcsparser import parse from contrib.rcs_file_filter import WriteRCSFileSink from contrib.rcs_file_filter import FilterSink usage = 'USAGE: %prog [options] CVSREPO TEST_COMMAND' description = """\ Simplify a CVS repository while preserving the presence of a bug. ***THE CVS REPOSITORY WILL BE DESTROYED*** CVSREPO is the path to a CVS repository. TEST_COMMAND is a command that runs successfully (i.e., with exit code '0') if the bug is still present, and fails if the bug is absent. """ verbose = 1 tmpdir = 'shrink_test_case-tmp' file_key_generator = KeyGenerator(1) def get_tmp_filename(): return os.path.join(tmpdir, 'f%07d.tmp' % file_key_generator.gen_id()) class CommandFailedException(Exception): pass def command(cmd, *args): if verbose >= 2: sys.stderr.write('Running: %s %s...' % (cmd, ' '.join(args),)) retval = os.spawnlp(os.P_WAIT, cmd, cmd, *args) if retval: if verbose >= 2: sys.stderr.write('failed (%s).\n' % retval) raise CommandFailedException(' '.join([cmd] + list(args))) else: if verbose >= 2: sys.stderr.write('succeeded.\n') class Modification: """A reversible modification that can be made to the repository.""" def get_size(self): """Return the estimated size of this modification. This should be approximately the number of bytes by which the problem will be shrunk if this modification is successful. It is used to choose the order to attempt the modifications.""" raise NotImplementedError() def modify(self): """Modify the repository. Store enough information that the change can be reverted.""" raise NotImplementedError() def revert(self): """Revert this modification.""" raise NotImplementedError() def commit(self): """Make this modification permanent.""" raise NotImplementedError() def try_mod(self, test_command): if verbose >= 1: sys.stdout.write('Testing with the following modifications:\n') self.output(sys.stdout, ' ') self.modify() try: test_command() except CommandFailedException: if verbose >= 1: sys.stdout.write( 'The bug disappeared. Reverting modifications.\n' ) else: sys.stdout.write('Attempted modification unsuccessful.\n') self.revert() return False except KeyboardInterrupt: sys.stderr.write('Interrupted. Reverting last modifications.\n') self.revert() raise except Exception: sys.stderr.write( 'Unexpected exception. Reverting last modifications.\n' ) self.revert() raise else: self.commit() if verbose >= 1: sys.stdout.write('The bug remains. Keeping modifications.\n') else: sys.stdout.write( 'The bug remains after the following modifications:\n' ) self.output(sys.stdout, ' ') return True def get_submodifications(self, success): """Return a generator or iterable of submodifications. Return submodifications that should be tried after this this modification. 
SUCCESS specifies whether this modification was successful.""" return [] def output(self, f, prefix=''): raise NotImplementedError() def __repr__(self): return str(self) class EmptyModificationListException(Exception): pass class SplitModification(Modification): """Holds two modifications split out of a failing modification. Because the original modification failed, it known that mod1+mod2 can't succeed. So if mod1 succeeds, mod2 need not be attempted (though its submodifications are attempted).""" def __init__(self, mod1, mod2): # Choose mod1 to be the larger modification: if mod2.get_size() > mod1.get_size(): mod1, mod2 = mod2, mod1 self.mod1 = mod1 self.mod2 = mod2 def get_size(self): return self.mod1.get_size() def modify(self): self.mod1.modify() def revert(self): self.mod1.revert() def commit(self): self.mod1.commit() def get_submodifications(self, success): if success: for mod in self.mod2.get_submodifications(False): yield mod else: yield self.mod2 for mod in self.mod1.get_submodifications(success): yield mod def output(self, f, prefix=''): self.mod1.output(f, prefix=prefix) def __str__(self): return 'SplitModification(%s, %s)' % (self.mod1, self.mod2,) class CompoundModification(Modification): def __init__(self, modifications): if not modifications: raise EmptyModificationListException() self.modifications = modifications self.size = sum(mod.get_size() for mod in self.modifications) def get_size(self): return self.size def modify(self): for modification in self.modifications: modification.modify() def revert(self): for modification in self.modifications: modification.revert() def commit(self): for modification in self.modifications: modification.commit() def get_submodifications(self, success): if success: # All modifications were completed successfully; no need # to try subsets: pass elif len(self.modifications) == 1: # Our modification list cannot be subdivided, but maybe # the remaining modification can: for mod in self.modifications[0].get_submodifications(False): yield mod else: # Create subsets of each half of the list and put them in # a SplitModification: n = len(self.modifications) // 2 yield SplitModification( create_modification(self.modifications[:n]), create_modification(self.modifications[n:]) ) def output(self, f, prefix=''): for modification in self.modifications: modification.output(f, prefix=prefix) def __str__(self): return str(self.modifications) def create_modification(mods): """Create and return a Modification based on the iterable MODS. Raise EmptyModificationListException if mods is empty.""" mods = list(mods) if len(mods) == 1: return mods[0] else: return CompoundModification(mods) def compute_dir_size(path): # Add a little bit for the directory itself. 
size = 100L for filename in os.listdir(path): subpath = os.path.join(path, filename) if os.path.isdir(subpath): size += compute_dir_size(subpath) elif os.path.isfile(subpath): size += os.path.getsize(subpath) return size class DeleteDirectoryModification(Modification): def __init__(self, path): self.path = path self.size = compute_dir_size(self.path) def get_size(self): return self.size def modify(self): self.tempfile = get_tmp_filename() shutil.move(self.path, self.tempfile) def revert(self): shutil.move(self.tempfile, self.path) self.tempfile = None def commit(self): shutil.rmtree(self.tempfile) self.tempfile = None def get_submodifications(self, success): if success: # The whole directory could be deleted; no need to recurse: pass else: # Try deleting subdirectories: mods = [ DeleteDirectoryModification(subdir) for subdir in get_dirs(self.path) ] if mods: yield create_modification(mods) # Try deleting files: mods = [ DeleteFileModification(filename) for filename in get_files(self.path) ] if mods: yield create_modification(mods) def output(self, f, prefix=''): f.write('%sDeleted directory %r\n' % (prefix, self.path,)) def __str__(self): return 'DeleteDirectory(%r)' % self.path class DeleteFileModification(Modification): def __init__(self, path): self.path = path self.size = os.path.getsize(self.path) def get_size(self): return self.size def modify(self): self.tempfile = get_tmp_filename() shutil.move(self.path, self.tempfile) def revert(self): shutil.move(self.tempfile, self.path) self.tempfile = None def commit(self): os.remove(self.tempfile) self.tempfile = None def output(self, f, prefix=''): f.write('%sDeleted file %r\n' % (prefix, self.path,)) def __str__(self): return 'DeleteFile(%r)' % self.path def rev_tuple(revision): retval = [int(s) for s in revision.split('.') if int(s)] if retval[-2] == 0: del retval[-2] return tuple(retval) class RCSFileFilter: def get_size(self): raise NotImplementedError() def get_filter_sink(self, sink): raise NotImplementedError() def filter(self, text): fout = StringIO() sink = WriteRCSFileSink(fout) filter = self.get_filter_sink(sink) parse(StringIO(text), filter) return fout.getvalue() def get_subfilters(self): return [] def output(self, f, prefix=''): raise NotImplementedError() class DeleteTagRCSFileFilter(RCSFileFilter): class Sink(FilterSink): def __init__(self, sink, tagname): FilterSink.__init__(self, sink) self.tagname = tagname def define_tag(self, name, revision): if name != self.tagname: FilterSink.define_tag(self, name, revision) def __init__(self, tagname): self.tagname = tagname def get_size(self): return 50 def get_filter_sink(self, sink): return self.Sink(sink, self.tagname) def output(self, f, prefix=''): f.write('%sDeleted tag %r\n' % (prefix, self.tagname,)) def get_tag_set(path): class TagCollector(Sink): def __init__(self): self.tags = set() # A map { branch_tuple : name } for branches on which no # revisions have yet been seen: self.branches = {} def define_tag(self, name, revision): revtuple = rev_tuple(revision) if len(revtuple) % 2 == 0: # This is a tag (as opposed to branch) self.tags.add(name) else: self.branches[revtuple] = name def define_revision( self, revision, timestamp, author, state, branches, next ): branch = rev_tuple(revision)[:-1] try: del self.branches[branch] except KeyError: pass def get_tags(self): tags = self.tags for branch in self.branches.values(): tags.add(branch) return tags tag_collector = TagCollector() parse(open(path, 'rb'), tag_collector) return tag_collector.get_tags() class 
DeleteBranchTreeRCSFileFilter(RCSFileFilter): class Sink(FilterSink): def __init__(self, sink, branch_rev): FilterSink.__init__(self, sink) self.branch_rev = branch_rev def is_on_branch(self, revision): revtuple = rev_tuple(revision) return revtuple[:len(self.branch_rev)] == self.branch_rev def define_tag(self, name, revision): if not self.is_on_branch(revision): FilterSink.define_tag(self, name, revision) def define_revision( self, revision, timestamp, author, state, branches, next ): if not self.is_on_branch(revision): branches = [ branch for branch in branches if not self.is_on_branch(branch) ] FilterSink.define_revision( self, revision, timestamp, author, state, branches, next ) def set_revision_info(self, revision, log, text): if not self.is_on_branch(revision): FilterSink.set_revision_info(self, revision, log, text) def __init__(self, branch_rev, subbranch_tree): self.branch_rev = branch_rev self.subbranch_tree = subbranch_tree def get_size(self): return 100 def get_filter_sink(self, sink): return self.Sink(sink, self.branch_rev) def get_subfilters(self): for (branch_rev, subbranch_tree) in self.subbranch_tree: yield DeleteBranchTreeRCSFileFilter(branch_rev, subbranch_tree) def output(self, f, prefix=''): f.write( '%sDeleted branch %s\n' % (prefix, '.'.join([str(s) for s in self.branch_rev]),) ) def get_branch_tree(path): """Return the forest of branches in path. Return [(branch_revision, [sub_branch, ...]), ...], where branch_revision is a revtuple and sub_branch has the same form as the whole return value. """ class BranchCollector(Sink): def __init__(self): self.branches = {} def define_revision( self, revision, timestamp, author, state, branches, next ): parent = rev_tuple(revision)[:-1] if len(parent) == 1: parent = (1,) entry = self.branches.setdefault(parent, []) for branch in branches: entry.append(rev_tuple(branch)[:-1]) def _get_subbranches(self, parent): retval = [] try: branches = self.branches[parent] except KeyError: return [] del self.branches[parent] for branch in branches: subbranches = self._get_subbranches(branch) retval.append((branch, subbranches,)) return retval def get_branches(self): retval = self._get_subbranches((1,)) assert not self.branches return retval branch_collector = BranchCollector() parse(open(path, 'rb'), branch_collector) return branch_collector.get_branches() class RCSFileModification(Modification): """A Modification that involves changing the contents of an RCS file.""" def __init__(self, path, filters): self.path = path self.filters = filters[:] self.size = 0 for filter in self.filters: self.size += filter.get_size() def get_size(self): return self.size def modify(self): self.tempfile = get_tmp_filename() shutil.move(self.path, self.tempfile) text = open(self.tempfile, 'rb').read() for filter in self.filters: text = filter.filter(text) open(self.path, 'wb').write(text) def revert(self): shutil.move(self.tempfile, self.path) self.tempfile = None def commit(self): os.remove(self.tempfile) self.tempfile = None def get_submodifications(self, success): if success: # All filters completed successfully; no need to try # subsets: pass elif len(self.filters) == 1: # The last filter failed; see if it has any subfilters: subfilters = list(self.filters[0].get_subfilters()) if subfilters: yield RCSFileModification(self.path, subfilters) else: n = len(self.filters) // 2 yield SplitModification( RCSFileModification(self.path, self.filters[:n]), RCSFileModification(self.path, self.filters[n:]) ) def output(self, f, prefix=''): f.write('%sModified file %r\n' % 
(prefix, self.path,)) for filter in self.filters: filter.output(f, prefix=(prefix + ' ')) def __str__(self): return 'RCSFileModification(%r)' % (self.filters,) def try_modification_combinations(test_command, mods): """Try MOD and its submodifications. Return True if any modifications were successful.""" # A list of lists of modifications that should still be tried: todo = list(mods) while todo: todo.sort(key=lambda mod: mod.get_size()) mod = todo.pop() success = mod.try_mod(test_command) # Now add possible submodifications to the list of things to try: todo.extend(mod.get_submodifications(success)) def get_dirs(path): filenames = os.listdir(path) filenames.sort() for filename in filenames: subpath = os.path.join(path, filename) if os.path.isdir(subpath): yield subpath def get_files(path, recurse=False): filenames = os.listdir(path) filenames.sort() for filename in filenames: subpath = os.path.join(path, filename) if os.path.isfile(subpath): yield subpath elif recurse and os.path.isdir(subpath): for x in get_files(subpath, recurse=recurse): yield x def shrink_repository(test_command, cvsrepo): try_modification_combinations( test_command, [DeleteDirectoryModification(cvsrepo)] ) # Try deleting branches: mods = [] for path in get_files(cvsrepo, recurse=True): branch_tree = get_branch_tree(path) if branch_tree: filters = [] for (branch_revision, subbranch_tree) in branch_tree: filters.append( DeleteBranchTreeRCSFileFilter( branch_revision, subbranch_tree ) ) mods.append(RCSFileModification(path, filters)) if mods: try_modification_combinations(test_command, mods) # Try deleting tags: mods = [] for path in get_files(cvsrepo, recurse=True): tags = list(get_tag_set(path)) if tags: tags.sort() filters = [DeleteTagRCSFileFilter(tag) for tag in tags] mods.append(RCSFileModification(path, filters)) if mods: try_modification_combinations(test_command, mods) first_fail_message = """\ ERROR! The test command failed with the original repository. The test command should be designed so that it succeeds (indicating that the bug is still present) with the original repository, and fails only after the bug disappears. Please fix your test command and start again. """ class MyHelpFormatter(optparse.IndentedHelpFormatter): """A HelpFormatter for optparse that doesn't reformat the description.""" def format_description(self, description): return description def main(): parser = optparse.OptionParser( usage=usage, description=description, formatter=MyHelpFormatter(), ) parser.set_defaults(skip_initial_test=False) parser.add_option( '--skip-initial-test', action='store_true', default=False, help='skip verifying that the bug exists in the original repository', ) (options, args) = parser.parse_args() cvsrepo = args[0] def test_command(): command(*args[1:]) if not os.path.isdir(tmpdir): os.makedirs(tmpdir) if not options.skip_initial_test: sys.stdout.write('Testing with the original repository.\n') try: test_command() except CommandFailedException, e: sys.stderr.write(first_fail_message) sys.exit(1) sys.stdout.write( 'The bug is confirmed to exist in the initial repository.\n' ) try: try: shrink_repository(test_command, cvsrepo) except KeyboardInterrupt: pass finally: try: os.rmdir(tmpdir) except Exception, e: sys.stderr.write('ERROR: %s (ignored)\n' % (e,)) if __name__ == '__main__': main() cvs2svn-2.4.0/contrib/rcs_file_filter.py0000775000076500007650000001433111710517256021362 0ustar mhaggermhagger00000000000000#! /usr/bin/python # (Be in -*- python -*- mode.) 
# # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Filter an RCS file.""" import sys import os import time sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) from cvs2svn_lib.rcsparser import Sink from cvs2svn_lib.rcsparser import parse def at_quote(s): return '@' + s.replace('@', '@@') + '@' def format_date(date): date_tuple = time.gmtime(date) year = date_tuple[0] if 1900 <= year <= 1999: year = '%02d' % (year - 1900) else: year = '%04d' % year return year + time.strftime('.%m.%d.%H.%M.%S', date_tuple) class WriteRCSFileSink(Sink): """A Sink that outputs reconstructed RCS file contents.""" def __init__(self, f): """Create a Sink object that will write its output into F. F should be a file-like object.""" self.f = f self.head = None self.principal_branch = None self.accessors = [] self.symbols = [] self.lockers = [] self.locking = None self.comment = None self.expansion = None def set_head_revision(self, revision): self.head = revision def set_principal_branch(self, branch_name): self.principal_branch = branch_name def set_access(self, accessors): self.accessors = accessors def define_tag(self, name, revision): self.symbols.append((name, revision,)) def set_locker(self, revision, locker): self.lockers.append((revision, locker,)) def set_locking(self, mode): self.locking = mode def set_comment(self, comment): self.comment = comment def set_expansion(self, mode): self.expansion = mode def admin_completed(self): self.f.write('head\t%s;\n' % self.head) if self.principal_branch is not None: self.f.write('branch\t%s;\n' % self.principal_branch) self.f.write('access') for accessor in self.accessors: self.f.write('\n\t%s' % accessor) self.f.write(';\n') self.f.write('symbols') for (name, revision) in self.symbols: self.f.write('\n\t%s:%s' % (name, revision)) self.f.write(';\n') self.f.write('locks') for (revision, locker) in self.lockers: self.f.write('\n\t%s:%s' % (locker, revision)) self.f.write(';') if self.locking is not None: self.f.write(' %s;' % self.locking) self.f.write('\n') if self.comment is not None: self.f.write('comment\t%s;\n' % at_quote(self.comment)) if self.expansion is not None: self.f.write('expand\t%s;\n' % at_quote(self.expansion)) self.f.write('\n') def define_revision( self, revision, timestamp, author, state, branches, next ): self.f.write( '\n%s\ndate\t%s;\tauthor %s;\tstate %s;\n' % (revision, format_date(timestamp), author, state,) ) self.f.write('branches') for branch in branches: self.f.write('\n\t%s' % branch) self.f.write(';\n') self.f.write('next\t%s;\n' % (next or '')) def tree_completed(self): pass def set_description(self, description): self.f.write('\n\ndesc\n%s\n' % at_quote(description)) def set_revision_info(self, revision, log, text): self.f.write('\n') self.f.write('\n') self.f.write('%s\n' % revision) self.f.write('log\n%s\n' % at_quote(log)) self.f.write('text\n%s\n' % 
at_quote(text)) def parse_completed(self): pass class FilterSink(Sink): """A Sink that passes callbacks through to another sink. This is intended for use as a base class for other filter classes that modify the data before passing it through.""" def __init__(self, sink): """Create a Sink object that will write its output into SINK. SINK should be a cvs2svn_rcsparse.Sink.""" self.sink = sink def set_head_revision(self, revision): self.sink.set_head_revision(revision) def set_principal_branch(self, branch_name): self.sink.set_principal_branch(branch_name) def set_access(self, accessors): self.sink.set_access(accessors) def define_tag(self, name, revision): self.sink.define_tag(name, revision) def set_locker(self, revision, locker): self.sink.set_locker(revision, locker) def set_locking(self, mode): self.sink.set_locking(mode) def set_comment(self, comment): self.sink.set_comment(comment) def set_expansion(self, mode): self.sink.set_expansion(mode) def admin_completed(self): self.sink.admin_completed() def define_revision( self, revision, timestamp, author, state, branches, next ): self.sink.define_revision( revision, timestamp, author, state, branches, next ) def tree_completed(self): self.sink.tree_completed() def set_description(self, description): self.sink.set_description(description) def set_revision_info(self, revision, log, text): self.sink.set_revision_info(revision, log, text) def parse_completed(self): self.sink.parse_completed() if __name__ == '__main__': if sys.argv[1:]: for path in sys.argv[1:]: if os.path.isfile(path) and path.endswith(',v'): parse(open(path, 'rb'), WriteRCSFileSink(sys.stdout)) else: sys.stderr.write('%r is being ignored.\n' % path) else: parse(sys.stdin, WriteRCSFileSink(sys.stdout)) cvs2svn-2.4.0/contrib/renumber_branch.py0000775000076500007650000001334411710517256021366 0ustar mhaggermhagger00000000000000#! /usr/bin/python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (C) 2010 Cabot Communications Ltd. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Usage: renumber_branch.py OLDREVNUM NEWREVNUM PATH... WARNING: This modifies RCS files in-place. Make sure you only run it on a _copy_ of your repository. And have backups. Modify RCS files in PATH to renumber a revision and/or branch. Will also renumber any revisions on the branch and any branches from a renumbered revision. E.g. if you ask to renumber branch 1.3.5 to 1.3.99, it will also renumber revision 1.3.5.1 to 1.3.99.1, and renumber branch 1.3.5.1.7 to 1.3.99.1.7. This is usually what you want. Originally written to correct a non-standard vendor branch number, by renumbering the 1.1.2 branch to 1.1.1. This allows cvs2svn to detect that it's a vendor branch. This doesn't enforce all the rules about revision numbers. It is possible to make invalid repositories using this tool. This does try to detect if the specified revision number is already in use, and fail in that case. 
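For example, to fix the non-standard vendor branch described above by
renumbering branch 1.1.2 to 1.1.1 throughout a copy of a repository
(a hypothetical path; remember that the RCS files are modified in place):

    renumber_branch.py 1.1.2 1.1.1 /tmp/copy-of-cvsrepo/mymodule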
""" import sys import os sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) from cvs2svn_lib.rcsparser import parse from rcs_file_filter import WriteRCSFileSink from rcs_file_filter import FilterSink class RenumberingFilter(FilterSink): '''A filter that transforms all revision numbers using a function provided to the constructor.''' def __init__(self, sink, transform_revision_func): '''Constructor. SINK is the object we're wrapping. It must implement the cvs2svn_rcsparse.Sink interface. TRANSFORM_REVISION_FUNC is a function that takes a single CVS revision number, as a string, and returns the possibly-transformed revision number in the same format. ''' FilterSink.__init__(self, sink) self.transform_rev = transform_revision_func def set_head_revision(self, revision): FilterSink.set_head_revision(self, self.transform_rev(revision)) def set_principal_branch(self, branch_name): FilterSink.set_principal_branch(self, self.transform_rev(branch_name)) def define_tag(self, name, revision): FilterSink.define_tag(self, name, self.transform_rev(revision)) def define_revision( self, revision, timestamp, author, state, branches, next ): revision = self.transform_rev(revision) branches = [self.transform_rev(b) for b in branches] if next is not None: next = self.transform_rev(next) FilterSink.define_revision( self, revision, timestamp, author, state, branches, next ) def set_revision_info(self, revision, log, text): FilterSink.set_revision_info(self, self.transform_rev(revision), log, text) def get_transform_func(rev_from, rev_to, force): rev_from_z = '%s.0.%s' % tuple(rev_from.rsplit('.', 1)) rev_to_z = '%s.0.%s' % tuple(rev_to.rsplit('.', 1)) def transform_revision(revision): if revision == rev_from or revision.startswith(rev_from + '.'): revision = rev_to + revision[len(rev_from):] elif revision == rev_from_z: revision = rev_to_z elif not force and (revision == rev_to or revision == rev_to_z or revision.startswith(rev_to + '.')): raise Exception('Target branch already exists') return revision return transform_revision def process_file(filename, rev_from, rev_to, force): func = get_transform_func(rev_from, rev_to, force) tmp_filename = filename + '.tmp' infp = open(filename, 'rb') outfp = open(tmp_filename, 'wb') try: writer = WriteRCSFileSink(outfp) revfilter = RenumberingFilter(writer, func) parse(infp, revfilter) finally: outfp.close() infp.close() os.rename(tmp_filename, filename) def iter_files_in_dir(top_path): for (dirpath, dirnames, filenames) in os.walk(top_path): for name in filenames: yield os.path.join(dirpath, name) def iter_rcs_files(list_of_starting_paths, verbose=False): for base_path in list_of_starting_paths: if os.path.isfile(base_path) and base_path.endswith(',v'): yield base_path elif os.path.isdir(base_path): for file_path in iter_files_in_dir(base_path): if file_path.endswith(',v'): yield file_path elif verbose: sys.stdout.write('File %s is being ignored.\n' % file_path) elif verbose: sys.stdout.write('PATH %s is being ignored.\n' % base_path) def main(): if len(sys.argv) < 4 or '.' not in sys.argv[1] or '.' not in sys.argv[2]: sys.stderr.write('Usage: %s OLDREVNUM NEWREVNUM PATH...\n' % (sys.argv[0],)) sys.exit(1) rev_from = sys.argv[1] rev_to = sys.argv[2] force = False for path in iter_rcs_files(sys.argv[3:], verbose=True): sys.stdout.write('Processing %s...' 
% path) process_file(path, rev_from, rev_to, force) sys.stdout.write('done.\n') if __name__ == '__main__': main() cvs2svn-2.4.0/contrib/find_illegal_filenames.py0000775000076500007650000000321611500107342022647 0ustar mhaggermhagger00000000000000#! /usr/bin/python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Search a directory for files whose names contain illegal characters. Usage: find_illegal_filenames.py PATH ... PATH should be a directory. It will be traversed looking for filenames that contain characters that are not allowed in paths in an SVN archive.""" import sys import os sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) from cvs2svn_lib.common import FatalError from cvs2svn_lib.output_option import OutputOption def visit_directory(unused, dirname, files): for file in files: path = os.path.join(dirname, file) try: OutputOption().verify_filename_legal(file) except FatalError: sys.stderr.write('File %r contains illegal characters!\n' % path) if not sys.argv[1:]: sys.stderr.write('usage: %s PATH ...\n' % sys.argv[0]) sys.exit(1) for path in sys.argv[1:]: os.path.walk(path, visit_directory, None) cvs2svn-2.4.0/contrib/verify-cvs2svn.py0000775000076500007650000005125011710517257021137 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== # # The purpose of verify-cvs2svn is to verify the result of a cvs2svn # repository conversion. The following tests are performed: # # 1. Content checking of the HEAD revision of trunk, all tags and all # branches. Only the tags and branches in the Subversion # repository are checked, i.e. there are no checks to verify that # all tags and branches in the CVS repository are present. # # This program only works if you converted a subdirectory of a CVS # repository, and not the whole repository. If you really did convert # a whole repository and need to check it, you must create a CVSROOT # directory above the current root using cvs init. 
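#
# A typical invocation (with hypothetical paths) might look like:
#
#   verify-cvs2svn.py --suppress-keywords /path/to/cvsrepo/mymodule /path/to/converted/svnrepo
#
# where the first argument is the converted CVS module and the second is
# the repository produced by the conversion (use --hg or --git if the
# conversion target was Mercurial or git rather than Subversion).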
# # ==================================================================== import os import sys import optparse import subprocess import shutil import re import tarfile # CVS and Subversion command line client commands CVS_CMD = 'cvs' SVN_CMD = 'svn' HG_CMD = 'hg' GIT_CMD = 'git' def pipe(cmd): """Run cmd as a pipe. Return (output, status).""" child = subprocess.Popen(cmd, stdout=subprocess.PIPE) output = child.stdout.read() status = child.wait() return (output, status) def cmd_failed(cmd, output, status): print 'CMD FAILED:', ' '.join(cmd) print 'Output:' sys.stdout.write(output) raise RuntimeError('%s command failed!' % cmd[0]) def split_output(self, cmd): (output, status) = pipe(cmd) if status: cmd_failed(cmd, output, status) retval = output.split(os.linesep)[:-1] if retval and not retval[-1]: del retval[-1] return retval class CvsRepos: def __init__(self, path): """Open the CVS repository at PATH.""" path = os.path.abspath(path) if not os.path.isdir(path): raise RuntimeError('CVS path is not a directory') if os.path.exists(os.path.join(path, 'CVSROOT')): # The whole repository self.module = "." self.cvsroot = path else: self.cvsroot = os.path.dirname(path) self.module = os.path.basename(path) while not os.path.exists(os.path.join(self.cvsroot, 'CVSROOT')): parent = os.path.dirname(self.cvsroot) if parent == self.cvsroot: raise RuntimeError('Cannot find the CVSROOT') self.module = os.path.join(os.path.basename(self.cvsroot), self.module) self.cvsroot = parent def __str__(self): return os.path.basename(self.cvsroot) def export(self, dest_path, rev=None, keyword_opt=None): """Export revision REV to DEST_PATH where REV can be None to export the HEAD revision, or any valid CVS revision string to export that revision.""" os.mkdir(dest_path) cmd = [CVS_CMD, '-Q', '-d', ':local:' + self.cvsroot, 'export'] if rev: cmd.extend(['-r', rev]) else: cmd.extend(['-D', 'now']) if keyword_opt: cmd.append(keyword_opt) cmd.extend(['-d', dest_path, self.module]) (output, status) = pipe(cmd) if status or output: cmd_failed(cmd, output, status) class SvnRepos: name = 'svn' def __init__(self, url): """Open the Subversion repository at URL.""" # Check if the user supplied an URL or a path if url.find('://') == -1: abspath = os.path.abspath(url) url = 'file://' + (abspath[0] != '/' and '/' or '') + abspath if os.sep != '/': url = url.replace(os.sep, '/') self.url = url # Cache a list of all tags and branches list = self.list('') if 'tags' in list: self.tag_list = self.list('tags') else: self.tag_list = [] if 'branches' in list: self.branch_list = self.list('branches') else: self.branch_list = [] def __str__(self): return self.url.split('/')[-1] def export(self, path, dest_path): """Export PATH to DEST_PATH.""" url = '/'.join([self.url, path]) cmd = [SVN_CMD, 'export', '-q', url, dest_path] (output, status) = pipe(cmd) if status or output: cmd_failed(cmd, output, status) def export_trunk(self, dest_path): """Export trunk to DEST_PATH.""" self.export('trunk', dest_path) def export_tag(self, dest_path, tag): """Export the tag TAG to DEST_PATH.""" self.export('tags/' + tag, dest_path) def export_branch(self, dest_path, branch): """Export the branch BRANCH to DEST_PATH.""" self.export('branches/' + branch, dest_path) def list(self, path): """Return a list of all files and directories in PATH.""" cmd = [SVN_CMD, 'ls', self.url + '/' + path] entries = [] for line in split_output(cmd): if line: entries.append(line.rstrip('/')) return entries def tags(self): """Return a list of all tags in the repository.""" return 
self.tag_list def branches(self): """Return a list of all branches in the repository.""" return self.branch_list class HgRepos: name = 'hg' def __init__(self, path): self.path = path self.base_cmd = [HG_CMD, '-R', self.path] self._branches = None # cache result of branches() self._have_default = None # so export_trunk() doesn't blow up def __str__(self): return os.path.basename(self.path) def _export(self, dest_path, rev): cmd = self.base_cmd + ['archive', '--type', 'files', '--rev', rev, '--exclude', 're:^\.hg', dest_path] (output, status) = pipe(cmd) if status or output: cmd_failed(cmd, output, status) # If Mercurial has nothing to export, then it doesn't create # dest_path. This breaks tree_compare(), so just check that the # manifest for the chosen revision really is empty, and if so create # the empty dir. if not os.path.exists(dest_path): cmd = self.base_cmd + ['manifest', '--rev', rev] manifest = [fn for fn in split_output(cmd) if not fn.startswith('.hg')] if not manifest: os.mkdir(dest_path) def export_trunk(self, dest_path): self.branches() # ensure _have_default is set if self._have_default: self._export(dest_path, 'default') else: # same as CVS does when exporting empty trunk os.mkdir(dest_path) def export_tag(self, dest_path, tag): self._export(dest_path, tag) def export_branch(self, dest_path, branch): self._export(dest_path, branch) def tags(self): cmd = self.base_cmd + ['tags', '-q'] tags = split_output(cmd) tags.remove('tip') return tags def branches(self): if self._branches is None: cmd = self.base_cmd + ['branches', '-q'] self._branches = branches = split_output(cmd) try: branches.remove('default') self._have_default = True except ValueError: self._have_default = False return self._branches class GitRepos: name = 'git' def __init__(self, path): self.path = path self.repo_cmd = [ GIT_CMD, '--git-dir=' + os.path.join(self.path, '.git'), '--work-tree=' + self.path, ] self._branches = None # cache result of branches() self._have_master = None # so export_trunk() doesn't blow up def __str__(self): return os.path.basename(self.path) def _export(self, dest_path, rev): # clone the repository cmd = [GIT_CMD, 'archive', '--remote=' + self.path, '--format=tar', rev] git_proc = subprocess.Popen(cmd, stdout=subprocess.PIPE) if False: # Unfortunately for some git tags the below causes # git_proc.wait() to hang. The git archive process is in a # state and the verify-cvs2svn hangs for good. tar = tarfile.open(mode="r|", fileobj=git_proc.stdout) for tarinfo in tar: tar.extract(tarinfo, dest_path) tar.close() else: os.mkdir(dest_path) tar_proc = subprocess.Popen( ['tar', '-C', dest_path, '-x'], stdin=git_proc.stdout, stdout=subprocess.PIPE, ) output = tar_proc.stdout.read() status = tar_proc.wait() if output or status: raise RuntimeError( 'Git tar extraction of rev %s from repo %s to %s failed (%s)!' % (rev, self.path, dest_path, output) ) status = git_proc.wait() if status: raise RuntimeError( 'Git extract of rev %s from repo %s to %s failed!' % (rev, self.path, dest_path) ) if not os.path.exists(dest_path): raise RuntimeError( 'Git clone of %s to %s failed!' 
% (self.path, dest_path) ) def export_trunk(self, dest_path): self.branches() # ensure _have_default is set if self._have_master: self._export(dest_path, 'master') else: # same as CVS does when exporting empty trunk os.mkdir(dest_path) def export_tag(self, dest_path, tag): self._export(dest_path, tag) def export_branch(self, dest_path, branch): self._export(dest_path, branch) def tags(self): cmd = self.repo_cmd + ['tag'] tags = split_output(cmd) return tags def branches(self): if self._branches is None: cmd = self.repo_cmd + ['branch'] branches = split_output(cmd) # Remove the two chracters at the start of the branch name for i in range(len(branches)): branches[i] = branches[i][2:] self._branches = branches try: branches.remove('master') self._have_master = True except ValueError: self._have_master = False return self._branches def transform_symbol(ctx, name): """Transform the symbol NAME using the renaming rules specified with --symbol-transform. Return the transformed symbol name.""" for (pattern, replacement) in ctx.symbol_transforms: newname = pattern.sub(replacement, name) if newname != name: print " symbol '%s' transformed to '%s'" % (name, newname) name = newname return name class Failures(object): def __init__(self): self.count = 0 # number of failures seen def __str__(self): return str(self.count) def __repr__(self): return "<%s at 0x%x: %s>" % (self.__class__.__name__, id(self), self.count) def report(self, summary, details=None): self.count += 1 sys.stdout.write(' FAIL: %s\n' % summary) if details: for line in details: sys.stdout.write(' %s\n' % line) def __nonzero__(self): return self.count > 0 def file_compare(failures, base1, base2, run_diff, rel_path): """Compare the mode and contents of two files. The paths are specified as two base paths BASE1 and BASE2, and a path REL_PATH that is relative to the two base paths. Return True iff the file mode and contents are identical.""" ok = True path1 = os.path.join(base1, rel_path) path2 = os.path.join(base2, rel_path) mode1 = os.stat(path1).st_mode & 0700 # only look at owner bits mode2 = os.stat(path2).st_mode & 0700 if mode1 != mode2: failures.report('File modes differ for %s' % rel_path, details=['%s: %o' % (path1, mode1), '%s: %o' % (path2, mode2)]) ok = False file1 = open(path1, 'rb') file2 = open(path2, 'rb') while True: data1 = file1.read(8192) data2 = file2.read(8192) if data1 != data2: if run_diff: cmd = ['diff', '-u', path1, path2] (output, status) = pipe(cmd) diff = output.split(os.linesep) else: diff = None failures.report('File contents differ for %s' % rel_path, details=diff) ok = False break if len(data1) == 0: # eof break return ok def tree_compare(failures, base1, base2, run_diff, rel_path=''): """Compare the contents of two directory trees, including file contents. The paths are specified as two base paths BASE1 and BASE2, and a path REL_PATH that is relative to the two base paths. 
Return True iff the trees are identical.""" if not rel_path: path1 = base1 path2 = base2 else: path1 = os.path.join(base1, rel_path) path2 = os.path.join(base2, rel_path) if not os.path.exists(path1): failures.report('%s does not exist' % path1) return False if not os.path.exists(path2): failures.report('%s does not exist' % path2) return False if os.path.isfile(path1) and os.path.isfile(path2): return file_compare(failures, base1, base2, run_diff, rel_path) if not (os.path.isdir(path1) and os.path.isdir(path2)): failures.report('Path types differ for %r' % rel_path) return False entries1 = os.listdir(path1) entries1.sort() entries2 = os.listdir(path2) entries2.sort() ok = True missing = filter(lambda x: x not in entries2, entries1) extra = filter(lambda x: x not in entries1, entries2) if missing: failures.report('Directory /%s is missing entries: %s' % (rel_path, ', '.join(missing))) ok = False if extra: failures.report('Directory /%s has extra entries: %s' % (rel_path, ', '.join(extra))) ok = False for entry in entries1: new_rel_path = os.path.join(rel_path, entry) if not tree_compare(failures, base1, base2, run_diff, new_rel_path): ok = False return ok def verify_contents_single(failures, cvsrepos, verifyrepos, kind, label, ctx): """Verify the HEAD revision of a trunk, tag, or branch. Verify that the contents of the HEAD revision of all directories and files in the conversion repository VERIFYREPOS match the ones in the CVS repository CVSREPOS. KIND can be either 'trunk', 'tag' or 'branch'. If KIND is either 'tag' or 'branch', LABEL is used to specify the name of the tag or branch. CTX has the attributes: CTX.tmpdir: specifying the directory for all temporary files. CTX.skip_cleanup: if true, the temporary files are not deleted. CTX.run_diff: if true, run diff on differing files.""" itemname = kind + (kind != 'trunk' and '-' + label or '') cvs_export_dir = os.path.join( ctx.tmpdir, 'cvs-export-%s' % itemname) vrf_export_dir = os.path.join( ctx.tmpdir, '%s-export-%s' % (verifyrepos.name, itemname)) if label: cvslabel = transform_symbol(ctx, label) else: cvslabel = None try: cvsrepos.export(cvs_export_dir, cvslabel, ctx.keyword_opt) if kind == 'trunk': verifyrepos.export_trunk(vrf_export_dir) elif kind == 'tag': verifyrepos.export_tag(vrf_export_dir, label) else: verifyrepos.export_branch(vrf_export_dir, label) if not tree_compare( failures, cvs_export_dir, vrf_export_dir, ctx.run_diff ): return False finally: if not ctx.skip_cleanup: if os.path.exists(cvs_export_dir): shutil.rmtree(cvs_export_dir) if os.path.exists(vrf_export_dir): shutil.rmtree(vrf_export_dir) return True def verify_contents(failures, cvsrepos, verifyrepos, ctx): """Verify that the contents of the HEAD revision of all directories and files in the trunk, all tags and all branches in the conversion repository VERIFYREPOS matches the ones in the CVS repository CVSREPOS. 
CTX is passed through to verify_contents_single().""" # branches/tags that failed: locations = [] # Verify contents of trunk print 'Verifying trunk' sys.stdout.flush() if not verify_contents_single( failures, cvsrepos, verifyrepos, 'trunk', None, ctx ): locations.append('trunk') # Verify contents of all tags for tag in verifyrepos.tags(): print 'Verifying tag', tag sys.stdout.flush() if not verify_contents_single( failures, cvsrepos, verifyrepos, 'tag', tag, ctx ): locations.append('tag:' + tag) # Verify contents of all branches for branch in verifyrepos.branches(): if branch[:10] == 'unlabeled-': print 'Skipped branch', branch else: print 'Verifying branch', branch if not verify_contents_single( failures, cvsrepos, verifyrepos, 'branch', branch, ctx ): locations.append('branch:' + branch) sys.stdout.flush() assert bool(failures) == bool(locations), \ "failures = %r\nlocations = %r" % (failures, locations) # Show the results if failures: sys.stdout.write('FAIL: %s != %s: %d failure(s) in:\n' % (cvsrepos, verifyrepos, failures.count)) for location in locations: sys.stdout.write(' %s\n' % location) else: sys.stdout.write('PASS: %s == %s\n' % (cvsrepos, verifyrepos)) sys.stdout.flush() class OptionContext: pass def main(argv): parser = optparse.OptionParser( usage='%prog [options] cvs-repos verify-repos') parser.add_option('--branch', help='verify contents of the branch BRANCH only') parser.add_option('--diff', action='store_true', dest='run_diff', help='run diff on differing files') parser.add_option('--tag', help='verify contents of the tag TAG only') parser.add_option('--tmpdir', metavar='PATH', help='path to store temporary files') parser.add_option('--trunk', action='store_true', help='verify contents of trunk only') parser.add_option('--symbol-transform', action='append', metavar='P:S', help='transform symbol names from P to S like cvs2svn, ' 'except transforms SVN symbol to CVS symbol') parser.add_option('--svn', action='store_const', dest='repos_type', const='svn', help='assume verify-repos is svn [default]') parser.add_option('--hg', action='store_const', dest='repos_type', const='hg', help='assume verify-repos is hg') parser.add_option('--git', action='store_const', dest='repos_type', const='git', help='assume verify-repos is git') parser.add_option('--suppress-keywords', action='store_const', dest='keyword_opt', const='-kk', help='suppress CVS keyword expansion ' '(equivalent to --keyword-opt=-kk)') parser.add_option('--keyword-opt', metavar='OPT', help='control CVS keyword expansion by adding OPT to ' 'cvs export command line') parser.set_defaults(run_diff=False, tmpdir='', skip_cleanup=False, symbol_transforms=[], repos_type='svn') (options, args) = parser.parse_args() symbol_transforms = [] for value in options.symbol_transforms: # This is broken! [pattern, replacement] = value.split(":") try: symbol_transforms.append( RegexpSymbolTransform(pattern, replacement)) except re.error: parser.error("'%s' is not a valid regexp." % (pattern,)) def error(msg): """Print an error to sys.stderr.""" sys.stderr.write('Error: ' + str(msg) + '\n') verify_branch = options.branch verify_tag = options.tag verify_trunk = options.trunk # Consistency check for options and arguments. 
if len(args) != 2: parser.error("wrong number of arguments") cvs_path = args[0] verify_path = args[1] verify_klass = {'svn': SvnRepos, 'hg': HgRepos, 'git': GitRepos}[options.repos_type] failures = Failures() try: # Open the repositories cvsrepos = CvsRepos(cvs_path) verifyrepos = verify_klass(verify_path) # Do our thing... if verify_branch: print 'Verifying branch', verify_branch verify_contents_single( failures, cvsrepos, verifyrepos, 'branch', verify_branch, options ) elif verify_tag: print 'Verifying tag', verify_tag verify_contents_single( failures, cvsrepos, verifyrepos, 'tag', verify_tag, options ) elif verify_trunk: print 'Verifying trunk' verify_contents_single( failures, cvsrepos, verifyrepos, 'trunk', None, options ) else: # Verify trunk, tags and branches verify_contents(failures, cvsrepos, verifyrepos, options) except RuntimeError, e: error(str(e)) except KeyboardInterrupt: pass sys.exit(failures and 1 or 0) if __name__ == '__main__': main(sys.argv) cvs2svn-2.4.0/contrib/destroy_repository.py0000775000076500007650000004041611710517256022222 0ustar mhaggermhagger00000000000000#! /usr/bin/python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Usage: destroy_repository.py OPTION... PATH... Strip the text content out of RCS-format files. *** This script irretrievably destroys any RCS files that it is applied to! This script attempts to strip the file text, log messages, and author names out of RCS files, in addition to renaming RCS files and directories. (This is useful to make test cases smaller and to remove much of the proprietary information that is stored in a repository.) Note that this script does NOT obliterate other information that might also be considered proprietary, such as 'CVSROOT' directories and their contents, commit dates, etc. In fact, it's not guaranteed even to obliterate all of the file text, or to do anything else for that matter. The following OPTIONs are recognized: --all destroy all data (this is the default if no options are given) --data destroy revision data (file contents) only --metadata destroy revision metadata (author, log message, description) only --symbols destroy symbol names (branch/tag names) only --filenames destroy the filenames of RCS files --basenames destroy basenames only (keep filename extensions, such as '.txt') (--filenames overrides --basenames) --dirnames destroy directory names within given PATH. PATH itself (if a directory) is not destroyed. --cvsroot delete files within 'CVSROOT' directories, instead of leaving them untouched. The 'CVSROOT' directory itself is preserved. --no- where is one of the above options negates the meaning of that option. Each PATH that is a *,v file will be stripped. Each PATH that is a directory will be traversed and all of its *,v files stripped. Other PATHs will be ignored. 
Examples of usage: destroy_repository.py PATH destroys all data in PATH destroy_repository.py --all PATH same as above destroy_repository.py --data PATH destroys only revision data destroy_repository.py --no-data PATH destroys everything but revision data destroy_repository.py --data --metadata PATH destroys revision data and metadata only ---->8---- The *,v files must be writable by the user running the script. Typically CVS repositories are read-only, so you might have to run something like $ chmod -R ug+w my/repo/path before running this script. Most cvs2svn behavior is completely independent of the text contained in an RCS file. (The text is not even looked at until OutputPass.) The idea is to use this script when preparing test cases for problems that you experience with cvs2svn. Instead of sending us your whole CVS repository, you should: 1. Make a copy of the original repository 2. Run this script on the copy (NEVER ON THE ORIGINAL!!!) 3. Verify that the problem still exists when you use cvs2svn to convert the 'destroyed' copy 4. Send us the 'destroyed' copy along with the exact cvs2svn version that you used, the exact command line that you used to start the conversion, and the options file if you used one. Please also consider using shrink_test_case.py to localize the problem even further. """ import sys import os import shutil import re sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) from cvs2svn_lib.key_generator import KeyGenerator from cvs2svn_lib.rcsparser import parse from rcs_file_filter import WriteRCSFileSink from rcs_file_filter import FilterSink # Which components to be destroyed. Default to all. destroy = { 'data': True, 'metadata': True, 'symbols': True, 'filenames': True, 'basenames': True, 'dirnames': True, 'cvsroot': True, } tmpdir = 'destroy_repository-tmp' file_key_generator = KeyGenerator(1) def get_tmp_filename(): return os.path.join(tmpdir, 'f%07d.tmp' % file_key_generator.gen_id()) # Mapping from "real" symbol name to rewritten symbol name symbol_map = {} def rewrite_symbol(name): if name not in symbol_map: symbol_map[name] = "symbol%05d" % (len(symbol_map)) return symbol_map[name] # Mapping from "real" filename to rewritten filename filename_map = { # Empty filename should always map to empty filename. This is useful when # preserving the last component of filenames with only one component. '': '', } # Set the following to true if we should not destroy the last filename # component (aka. filename extension) keep_last_filename_component = False def rewrite_filename(pathname): if not destroy['filenames']: return pathname (dirname, filename) = os.path.split(pathname) extra = '' # Strip trailing ',v' now, and re-append it to the rewritten filename if filename.endswith(',v'): extra += ',v' filename = filename[:-2] if keep_last_filename_component: (filename, extension) = os.path.splitext(filename) if not extension: # filename has no extension. Do not rewrite this filename # at all. return pathname extra = extension + extra # Rewrite filename try: return os.path.join(dirname, filename_map[filename] + extra) except KeyError: # filename_map[filename] does not exist. Generate automatically: num = len(filename_map) while True: filename_map[filename] = "file%03d" % (num) retval = os.path.join(dirname, filename_map[filename] + extra) if not os.path.exists(retval): return retval num += 1 # List of directory names to be renamed. 
This list is filled while we walk # the directory structure, and then processed afterwards, in order to not # mess up the directory structure while it is being walked. rename_dir_list = [] def rename_dirs(): """Rename all directories occuring in rename_dir_list""" # Make sure we rename subdirs _before_ renaming their parents rename_dir_list.reverse() rename_map = {} num = 0 for d in rename_dir_list: (parent, name) = os.path.split(d) # Skip rewriting 'Attic' directories if name == "Attic": continue if name not in rename_map: while True: num += 1 rename_map[name] = "dir%03d" % (num) if not os.path.exists(os.path.join(parent, rename_map[name])): break new_d = os.path.join(parent, rename_map[name]) assert not os.path.exists(new_d) shutil.move(d, new_d) class Substituter: def __init__(self, template): self.template = template self.key_generator = KeyGenerator(1) # A map from old values to new ones. self.substitutions = {} def get_substitution(self, s): r = self.substitutions.get(s) if r == None: r = self.template % self.key_generator.gen_id() self.substitutions[s] = r return r class LogSubstituter(Substituter): # If a log messages matches any of these regular expressions, it # is passed through untouched. untouchable_log_res = [ re.compile(r'^Initial revision\n$'), re.compile(r'^file (?P.+) was initially added' r' on branch (?P.+)\.\n$'), re.compile(r'^\*\*\* empty log message \*\*\*\n$'), re.compile(r'^initial checkin$'), ] def __init__(self): Substituter.__init__(self, 'log %d') def get_substitution(self, log): keep_log = '' for untouchable_log_re in self.untouchable_log_res: m = untouchable_log_re.search(log) if m: # We have matched one of the above regexps # Keep log message keep_log = log # Check if we matched a regexp with a named subgroup groups = m.groupdict() if 'symbol' in groups and destroy['symbols']: # Need to rewrite symbol name symbol = groups['symbol'] keep_log = keep_log.replace(symbol, rewrite_symbol(symbol)) if 'filename' in groups and destroy['filenames']: # Need to rewrite filename filename = groups['filename'] keep_log = keep_log.replace( filename, rewrite_filename(filename) ) if keep_log: return keep_log if destroy['metadata']: return Substituter.get_substitution(self, log) return log class DestroyerFilterSink(FilterSink): def __init__(self, author_substituter, log_substituter, sink): FilterSink.__init__(self, sink) self.author_substituter = author_substituter self.log_substituter = log_substituter def set_head_revision(self, revision): self.head_revision = revision FilterSink.set_head_revision(self, revision) def define_tag(self, name, revision): if destroy['symbols']: name = rewrite_symbol(name) FilterSink.define_tag(self, name, revision) def define_revision( self, revision, timestamp, author, state, branches, next ): if destroy['metadata']: author = self.author_substituter.get_substitution(author) FilterSink.define_revision( self, revision, timestamp, author, state, branches, next ) def set_description(self, description): if destroy['metadata']: description = '' FilterSink.set_description(self, description) def set_revision_info(self, revision, log, text): if destroy['data']: if revision == self.head_revision: # Set the HEAD text unconditionally. (It could be # that revision HEAD-1 has an empty deltatext, in # which case the HEAD text was actually committed in # an earlier commit.) text = ( 'This text was last seen in HEAD (revision %s)\n' ) % (revision,) elif text == '': # This is a no-op revision; preserve that fact. (It # might be relied on by cvs2svn). 
pass else: # Otherwise, replace the data. if revision.count('.') == 1: # On trunk, it could be that revision N-1 has an # empty deltatext, in which case text for revision # N was actually committed in an earlier commit. text = ( 'd1 1\n' 'a1 1\n' 'This text was last seen in revision %s\n' ) % (revision,) else: # On a branch, we know that the text was changed # in revision N (even though the same text might # also be kept across later revisions N+1 etc.) text = ( 'd1 1\n' 'a1 1\n' 'This text was committed in revision %s\n' ) % (revision,) if destroy['metadata'] or destroy['symbols'] or destroy['filenames']: log = self.log_substituter.get_substitution(log) FilterSink.set_revision_info(self, revision, log, text) class FileDestroyer: def __init__(self): self.log_substituter = LogSubstituter() self.author_substituter = Substituter('author%d') def destroy_file(self, filename): tmp_filename = get_tmp_filename() f = open(tmp_filename, 'wb') new_filename = rewrite_filename(filename) parse( open(filename, 'rb'), DestroyerFilterSink( self.author_substituter, self.log_substituter, WriteRCSFileSink(f), ) ) f.close() # Replace the original file with the new one: assert filename == new_filename or not os.path.exists(new_filename) os.remove(filename) shutil.move(tmp_filename, new_filename) def visit(self, dirname, names): # Special handling of CVSROOT directories if "CVSROOT" in names: path = os.path.join(dirname, "CVSROOT") if destroy['cvsroot']: # Remove all contents within CVSROOT sys.stderr.write('Deleting %s contents...' % path) shutil.rmtree(path) os.mkdir(path) else: # Leave CVSROOT alone sys.stderr.write('Skipping %s...' % path) del names[names.index("CVSROOT")] sys.stderr.write('done.\n') for name in names: path = os.path.join(dirname, name) if os.path.isfile(path) and path.endswith(',v'): sys.stderr.write('Destroying %s...' % path) self.destroy_file(path) sys.stderr.write('done.\n') elif os.path.isdir(path): if destroy['dirnames']: rename_dir_list.append(path) # Subdirectories are traversed automatically pass else: sys.stderr.write('File %s is being ignored.\n' % path) def destroy_dir(self, path): os.path.walk(path, FileDestroyer.visit, self) def usage_abort(msg): if msg: print >>sys.stderr, "ERROR:", msg print >>sys.stderr # Use this file's docstring as a usage string, but only the first part print __doc__.split('\n---->8----', 1)[0] sys.exit(1) if __name__ == '__main__': if not os.path.isdir(tmpdir): os.makedirs(tmpdir) # Paths to be destroyed paths = [] # Command-line argument processing first_option = True for arg in sys.argv[1:]: if arg.startswith("--"): # Option processing option = arg[2:].lower() value = True if option.startswith("no-"): value = False option = option[3:] if first_option: # Use the first option on the command-line to determine the # default actions. If the first option is negated (i.e. --no-X) # the default action should be to destroy everything. # Otherwise, the default action should be to destroy nothing. # This makes both positive and negative options work # intuitively (e.g. "--data" will destroy only data, while # "--no-data" will destroy everything BUT data). for d in destroy.keys(): destroy[d] = not value first_option = False if option in destroy: destroy[option] = value elif option == "all": for d in destroy.keys(): destroy[d] = value else: usage_abort("Unknown OPTION '%s'" % arg) else: # Path argument paths.append(arg) # If --basenames if given (and not also --filenames), we shall destroy # filenames, up to, but not including the last component. 
if destroy['basenames'] and not destroy['filenames']: destroy['filenames'] = True keep_last_filename_component = True if not paths: usage_abort("No PATH given") # Destroy given PATHs file_destroyer = FileDestroyer() for path in paths: if os.path.isfile(path) and path.endswith(',v'): file_destroyer.destroy_file(path) elif os.path.isdir(path): file_destroyer.destroy_dir(path) else: sys.stderr.write('PATH %s is being ignored.\n' % path) if destroy['dirnames']: rename_dirs() cvs2svn-2.4.0/contrib/git-move-refs.py0000775000076500007650000001224412027257624020716 0ustar mhaggermhagger00000000000000#!/usr/bin/python """Remove redundant fixup commits from a cvs2svn-converted git repository. Process each head ref and/or tag in a git repository. If the associated commit is tree-wise identical with another commit, the head or tag is moved to point at the other commit (i.e., refs pointing at identical content will all point at a single fixup commit). Furthermore, if one of the parents of the fixup commit is identical to the fixup commit itself, then the head or tag is moved to the parent. The script is meant to be run against a repository converted by cvs2svn, since cvs2svn creates empty commits for some tags and head refs (branches). """ usage = 'USAGE: %prog [options]' import sys import optparse from subprocess import Popen, PIPE, call # Cache trees we have already seen, and that are suitable targets for # moved refs tree_cache = {} # tree SHA1 -> commit SHA1 # Cache parent commit -> parent tree mapping parent_cache = {} # commit SHA1 -> tree SHA1 def resolve_commit(commit): """Return the tree object associated with the given commit.""" get_tree_cmd = ["git", "rev-parse", commit + "^{tree}"] tree = Popen(get_tree_cmd, stdout = PIPE).communicate()[0].strip() return tree def move_ref(ref, from_commit, to_commit, ref_type): """Move the given head to the given commit. ref_type is either "tags" or "heads" """ if from_commit != to_commit: print "Moving ref %s from %s to %s..." 
% (ref, from_commit, to_commit), if ref_type == "tags": command = "tag" else: command = "branch" retcode = call(["git", command, "-f", ref, to_commit]) if retcode == 0: print "done" else: print "FAILED" def try_to_move_ref(ref, commit, tree, parents, ref_type): """Try to move the given ref to a separate commit (with identical tree).""" if tree in tree_cache: # We have already found a suitable commit for this tree move_ref(ref, commit, tree_cache[tree], ref_type) return # Try to move this ref to one of its commit's parents for p in parents: if p not in parent_cache: # Not in cache parent_cache[p] = resolve_commit(p) p_tree = parent_cache[p] if tree == p_tree: # We can move ref to parent p move_ref(ref, commit, p, ref_type) commit = p break # Register the resulting commit object in the tree_cache assert tree not in tree_cache # Sanity check tree_cache[tree] = commit def process_refs(ref_type): tree_cache.clear() parent_cache.clear() # Command for retrieving refs and associated metadata # See 'git for-each-ref' manual page for --format details get_ref_info_cmd = [ "git", "for-each-ref", "--format=%(refname)%00%(objecttype)%00%(subject)%00" "%(objectname)%00%(tree)%00%(parent)%00" "%(*objectname)%00%(*tree)%00%(*parent)", "refs/%s" % (ref_type,), ] get_ref_info = Popen(get_ref_info_cmd, stdout = PIPE) while True: # While get_ref_info process is still running for line in get_ref_info.stdout: line = line.strip() (ref, objtype, subject, commit, tree, parents, commit_alt, tree_alt, parents_alt) = line.split(chr(0)) if objtype == "tag": commit = commit_alt tree = tree_alt parents = parents_alt elif objtype != "commit": continue if subject.startswith("This commit was manufactured by cvs2svn") \ or not subject: # We shall try to move this ref, if possible parent_list = [] if parents: parent_list = parents.split(" ") for p in parent_list: assert len(p) == 40 ref_prefix = "refs/%s/" % (ref_type,) assert ref.startswith(ref_prefix) try_to_move_ref( ref[len(ref_prefix):], commit, tree, parent_list, ref_type ) else: # We shall not move this ref, but it is a possible target # for other refs that we _do_ want to move tree_cache.setdefault(tree, commit) if get_ref_info.poll() is not None: # Break if no longer running: break assert get_ref_info.returncode == 0 def main(args): parser = optparse.OptionParser(usage=usage, description=__doc__) parser.add_option( '--tags', '-t', action='store_true', default=False, help='process tags', ) parser.add_option( '--branches', '-b', action='store_true', default=False, help='process branches', ) (options, args) = parser.parse_args(args=args) if args: parser.error('Unexpected command-line arguments') if not (options.tags or options.branches): # By default, process tags but not branches: options.tags = True if options.tags: process_refs("tags") if options.branches: process_refs("heads") main(sys.argv[1:]) cvs2svn-2.4.0/contrib/profile-repos.py0000775000076500007650000000646011317026325021014 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # ==================================================================== # Copyright (c) 2000-2006 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. 
For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """ Report information about CVS revisions, tags, and branches in a CVS repository by examining the temporary files output by pass 1 of cvs2svn on that repository. NOTE: You have to run the conversion pass yourself! """ import sys, os, os.path from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.config import CVS_PATHS_DB from cvs2svn_lib.config import CVS_ITEMS_DB from cvs2svn_lib.config import CVS_ITEMS_ALL_DATAFILE from cvs2svn_lib.cvs_path_database import CVSPathDatabase from cvs2svn_lib.cvs_item_database import CVSItemDatabase def do_it(): cvs_path_db = CVSPathDatabase(CVS_PATHS_DB, DB_OPEN_READ) cvs_items_db = CVSItemDatabase(cvs_path_db, CVS_ITEMS_DB, DB_OPEN_READ) fp = open(CVS_ITEMS_ALL_DATAFILE, 'r') tags = { } branches = { } max_tags = 0 max_branches = 0 line_count = 0 total_tags = 0 total_branches = 0 while 1: line_count = line_count + 1 line = fp.readline() if not line: break cvs_rev_key = line.strip() cvs_rev = cvs_items_db[cvs_rev_key] # Handle tags num_tags = len(cvs_rev.tags) max_tags = (num_tags > max_tags) \ and num_tags or max_tags total_tags = total_tags + num_tags for tag in cvs_rev.tags: tags[tag] = None # Handle branches num_branches = len(cvs_rev.branches) max_branches = (num_branches > max_branches) \ and num_branches or max_branches total_branches = total_branches + num_branches for branch in cvs_rev.branches: branches[branch] = None symbols = {} symbols.update(tags) symbols.update(branches) num_symbols = len(symbols.keys()) num_tags = len(tags.keys()) num_branches = len(branches.keys()) avg_tags = total_tags * 1.0 / line_count avg_branches = total_branches * 1.0 / line_count print ' Total CVS Revisions: %d\n' \ ' Total Unique Tags: %d\n' \ ' Peak Revision Tags: %d\n' \ ' Avg. Tags/Revision: %2.1f\n' \ ' Total Unique Branches: %d\n' \ 'Peak Revision Branches: %d\n' \ 'Avg. Branches/Revision: %2.1f\n' \ ' Total Unique Symbols: %d%s\n' \ % (line_count, num_tags, max_tags, avg_tags, num_branches, max_branches, avg_branches, num_symbols, num_symbols == num_tags + num_branches and ' ' or ' (!)', ) if __name__ == "__main__": argc = len(sys.argv) if argc < 2: print 'Usage: %s /path/to/cvs2svn-temporary-directory' \ % (os.path.basename(sys.argv[0])) print __doc__ sys.exit(0) os.chdir(sys.argv[1]) do_it() cvs2svn-2.4.0/contrib/cvs2svn_memlog0000775000076500007650000000571611434364604020553 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Run cvs2svn, but logging memory usage. Memory use is logged every MemoryLogger.interval seconds. This script takes the same parameters as cvs2svn. Memory use is determined by reading from the /proc filesystem. 
This method is not very portable, but should hopefully work on a typical modern Linux.""" import sys import os # Make sure that a supported version of Python is being used. Do this # as early as possible, using only code compatible with Python 1.5.2 # and Python 3.x before the check. if not (0x02040000 <= sys.hexversion < 0x03000000): sys.stderr.write("ERROR: Python 2, version 2.4 or higher required.\n") sys.exit(1) sys.path.insert(0, os.path.dirname(os.path.dirname(sys.argv[0]))) import re import time import optparse import threading from cvs2svn_lib.common import FatalException from cvs2svn_lib.log import logger from cvs2svn_lib.main import main usage = '%prog [--interval=VALUE] [--help|-h] -- CVS2SVN-ARGS' description = """\ Run cvs2svn while logging its memory usage. ('--' is required to separate %(progname)s options from the options and arguments that will be passed through to cvs2svn.) """ rss_re = re.compile(r'^VmRSS\:\s+(?P.*)$') def get_memory_used(): filename = '/proc/%d/status' % (os.getpid(),) for l in open(filename).readlines(): l = l.strip() m = rss_re.match(l) if m: return m.group('mem') return 'Unknown' class MemoryLogger(threading.Thread): def __init__(self, interval): threading.Thread.__init__(self) self.setDaemon(True) self.start_time = time.time() self.interval = interval def run(self): i = 0 while True: delay = self.start_time + self.interval * i - time.time() if delay > 0: time.sleep(delay) logger.write('Memory used: %s' % (get_memory_used(),)) i += 1 parser = optparse.OptionParser(usage=usage, description=description) parser.set_defaults(interval=1.0) parser.add_option( '--interval', action='store', type='float', help='the time in seconds between memory logs', ) (options, args) = parser.parse_args() MemoryLogger(interval=options.interval).start() try: main(sys.argv[0], args) except FatalException, e: sys.stderr.write(str(e) + '\n') sys.exit(1) cvs2svn-2.4.0/MANIFEST.in0000664000076500007650000000073711244062540015746 0ustar mhaggermhagger00000000000000include *.py dist.sh MANIFEST.in Makefile include README BUGS COMMITTERS COPYING HACKING CHANGES include cvs2svn-example.options include cvs2git-example.options include cvs2bzr-example.options include cvs2hg-example.options recursive-include svntest * recursive-include test-data * recursive-include doc * recursive-include www * recursive-include contrib *.py *.pl cvs2svn_memlog prune www/tigris-branding prune www/xhtml1-20020801 prune www/xhtml1.catalog prune www/xhtml1.tgz cvs2svn-2.4.0/cvs2svn-example.options0000664000076500007650000007645612027257624020706 0ustar mhaggermhagger00000000000000# (Be in -*- mode: python; coding: utf-8 -*- mode.) # # ==================================================================== # Copyright (c) 2006-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== # ##################### # ## PLEASE READ ME! ## # ##################### # # This is a template for an options file that can be used to configure # cvs2svn. 
Many options do not have defaults, so it is easier to copy # this file and modify what you need rather than creating a new # options file from scratch. # # This file is in Python syntax, but you don't need to know Python to # modify it. But if you *do* know Python, then you will be happy to # know that you can use arbitary Python constructs to do fancy # configuration tricks. # # But please be aware of the following: # # * In many places, leading whitespace is significant in Python (it is # used instead of curly braces to group statements together). # Therefore, if you don't know what you are doing, it is best to # leave the whitespace as it is. # # * In normal strings, Python treats a backslash ("\") as an escape # character. Therefore, if you want to specify a string that # contains a backslash, you need either to escape the backslash with # another backslash ("\\"), or use a "raw string", as in one if the # following equivalent examples: # # cvs_executable = 'c:\\windows\\system32\\cvs.exe' # cvs_executable = r'c:\windows\system32\cvs.exe' # # See http://docs.python.org/tutorial/introduction.html#strings for # more information. # # Two identifiers will have been defined before this file is executed, # and can be used freely within this file: # # ctx -- a Ctx object (see cvs2svn_lib/context.py), which holds # many configuration options # # run_options -- an instance of the SVNRunOptions class (see # cvs2svn_lib/svn_run_options.py), which holds some variables # governing how cvs2svn is run # Import some modules that are used in setting the options: from cvs2svn_lib import config from cvs2svn_lib import changeset_database from cvs2svn_lib.common import CVSTextDecoder from cvs2svn_lib.log import logger from cvs2svn_lib.svn_output_option import DumpfileOutputOption from cvs2svn_lib.svn_output_option import ExistingRepositoryOutputOption from cvs2svn_lib.svn_output_option import NewRepositoryOutputOption from cvs2svn_lib.svn_run_options import SVNEOLFixPropertySetter from cvs2svn_lib.svn_run_options import SVNKeywordHandlingPropertySetter from cvs2svn_lib.revision_manager import NullRevisionCollector from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader from cvs2svn_lib.checkout_internal import InternalRevisionCollector from cvs2svn_lib.checkout_internal import InternalRevisionReader from cvs2svn_lib.symbol_strategy import AllBranchRule from cvs2svn_lib.symbol_strategy import AllTagRule from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule from cvs2svn_lib.symbol_strategy import ExcludeVendorBranchRule from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform from cvs2svn_lib.symbol_transform import RegexpSymbolTransform from cvs2svn_lib.symbol_transform import IgnoreSymbolTransform from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform from cvs2svn_lib.property_setters import AutoPropsPropertySetter from cvs2svn_lib.property_setters import 
CVSBinaryFileDefaultMimeTypeSetter from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter from cvs2svn_lib.property_setters import CVSRevisionNumberSetter from cvs2svn_lib.property_setters import DefaultEOLStyleSetter from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter from cvs2svn_lib.property_setters import ExecutablePropertySetter from cvs2svn_lib.property_setters import DescriptionPropertySetter from cvs2svn_lib.property_setters import KeywordsPropertySetter from cvs2svn_lib.property_setters import MimeMapper from cvs2svn_lib.property_setters import SVNBinaryFileKeywordsPropertySetter # To choose the level of logging output, uncomment one of the # following lines: #logger.log_level = logger.WARN #logger.log_level = logger.QUIET logger.log_level = logger.NORMAL #logger.log_level = logger.VERBOSE #logger.log_level = logger.DEBUG # The directory to use for temporary files: ctx.tmpdir = r'cvs2svn-tmp' # author_transforms can be used to map CVS author names (e.g., # "jrandom") to whatever names make sense for your SVN configuration # (e.g., "john.j.random"). All values should be either Unicode # strings (i.e., with "u" as a prefix) or 8-bit strings in the utf-8 # encoding. To use this feature, please substitute your own project's # usernames here and uncomment the author_transforms option when # setting ctx.output_option below author_transforms={ 'jrandom' : u'john.j.random', 'brane' : u'Branko.Čibej', 'ringstrom' : 'ringström', 'dionisos' : u'e.hülsmann', } # There are several possible options for where to put the output of a # cvs2svn conversion. Please choose one of the following and adjust # the parameters as necessary: # Use this output option if you would like cvs2svn to create a new SVN # repository and store the converted repository there. The first # argument is the path to which the repository should be written (this # repository must not already exist). The (optional) fs_type argument # allows a --fs-type option to be passed to "svnadmin create". The # (optional) bdb_txn_nosync argument can be specified to set the # --bdb-txn-nosync option on a bdb repository. The (optional) # create_options argument can be specified to set a list of verbatim # options to be passed to "svnadmin create". The (optional) # author_transforms argument allows CVS author names to be transformed # arbitrarily into SVN author names (as described above): ctx.output_option = NewRepositoryOutputOption( r'/path/to/svnrepo', #fs_type='fsfs', #bdb_txn_nosync=False, #create_options=['--pre-1.5-compatible'], #author_transforms=author_transforms, ) # Use this output option if you would like cvs2svn to store the # converted CVS repository into an SVN repository that already exists. # The first argument is the filesystem path of an existing local SVN # repository (this repository must already exist). The # author_transforms option is as described above: #ctx.output_option = ExistingRepositoryOutputOption( # r'/path/to/svnrepo', # Path to repository # #author_transforms=author_transforms, # ) # Use this type of output option if you want the output of the # conversion to be written to a SVN dumpfile instead of committing # them into an actual repository. 
The author_transforms option is as # described above: #ctx.output_option = DumpfileOutputOption( # dumpfile_path=r'/path/to/cvs2svn-dump', # Name of dumpfile to create # #author_transforms=author_transforms, # ) # Independent of the ctx.output_option selected, the following option # can be set to True to suppress cvs2svn output altogether: ctx.dry_run = False # The following set of options specifies how the revision contents of # the RCS files should be read. # # The default selection is InternalRevisionReader, which uses built-in # code that reads the RCS deltas while parsing the files in # CollectRevsPass. This method is very fast but requires lots of # temporary disk space. The disk space is required for (1) storing # all of the RCS deltas, and (2) during OutputPass, keeping a copy of # the full text of every revision that still has a descendant that # hasn't yet been committed. Since this can includes multiple # revisions of each file (i.e., on multiple branches), the required # amount of temporary space can potentially be many times the size of # a checked out copy of the whole repository. Setting compress=True # cuts the disk space requirements by about 50% at the price of # increased CPU usage. Using compression usually speeds up the # conversion due to the reduced I/O pressure, unless --tmpdir is on a # RAM disk. This method does not expand CVS's "Log" keywords. # # The second possibility is RCSRevisionReader, which uses RCS's "co" # program to extract the revision contents of the RCS files during # OutputPass. This option doesn't require any temporary space, but it # is relatively slow because (1) "co" has to be executed very many # times; and (2) "co" itself has to assemble many file deltas to # compute the contents of a particular revision. The constructor # argument specifies how to invoke the "co" executable. # # The third possibility is CVSRevisionReader, which uses the "cvs" # program to extract the revision contents out of the RCS files during # OutputPass. This option doesn't require any temporary space, but it # is the slowest of all, because "cvs" is considerably slower than # "co". However, it works in some situations where RCSRevisionReader # fails; see the HTML documentation of the "--use-cvs" option for # details. The constructor argument specifies how to invoke the "co" # executable. # # Choose one of the following three groups of lines: ctx.revision_collector = InternalRevisionCollector(compress=True) ctx.revision_reader = InternalRevisionReader(compress=True) #ctx.revision_collector = NullRevisionCollector() #ctx.revision_reader = RCSRevisionReader(co_executable=r'co') # It is also possible to pass a global_options parameter to # CVSRevisionReader to specify which options should be passed to the # cvs command. By default the correct options are usually chosen, but # for CVSNT you might want to add global_options=['-q', '-N', '-f']. #ctx.revision_collector = NullRevisionCollector() #ctx.revision_reader = CVSRevisionReader(cvs_executable=r'cvs') # Set the name (and optionally the path) to the 'svnadmin' command, # which is needed for NewRepositoryOutputOption or # ExistingRepositoryOutputOption. 
The default is the "svnadmin" # command in the user's PATH: #ctx.svnadmin_executable = r'svnadmin' # Change the following line to True if the conversion should only # include the trunk of the repository (i.e., all branches and tags # should be ignored): ctx.trunk_only = False # Normally, cvs2svn ignores directories within the CVS repository if # they do not contain valid RCS files. This produces a Subversion # repository whose behavior imitates that of CVS if CVS is typically # used with the "-P" option. However, sometimes these empty # directories are needed by a project (e.g., by the build procedure). # If so, the following option can be sent to True to cause empty # directories to be created in the SVN repository when their parent # directory is created and removed when their parent directory is # removed. (This is more likely to be useful than the behavior of CVS # when its "-P" option is not used.) ctx.include_empty_directories = False # Normally, cvs2svn deletes a directory once the last file has been # deleted from it (a la "cvs -P"). Change the following line to False # if you would like such directories to be retained in the Subversion # repository through the rest of history: ctx.prune = True # How to convert author names, log messages, and filenames to Unicode. # The first argument to CVSTextDecoder is a list of encoders that are # tried in order in 'strict' mode until one of them succeeds. If none # of those succeeds, then fallback_encoder is used in lossy 'replace' # mode (if it is specified). Setting a fallback encoder ensures that # the encoder always succeeds, but it can cause information loss. ctx.cvs_author_decoder = CVSTextDecoder( [ #'utf8', #'latin1', 'ascii', ], #fallback_encoding='ascii' ) ctx.cvs_log_decoder = CVSTextDecoder( [ #'utf8', #'latin1', 'ascii', ], #fallback_encoding='ascii', eol_fix='\n', ) # You might want to be especially strict when converting filenames to # Unicode (e.g., maybe not specify a fallback_encoding). ctx.cvs_filename_decoder = CVSTextDecoder( [ #'utf8', #'latin1', 'ascii', ], #fallback_encoding='ascii' ) # Template for the commit message to be used for initial project # commits. ctx.initial_project_commit_message = ( 'Standard project directories initialized by cvs2svn.' ) # Template for the commit message to be used for post commits, in # which modifications to a vendor branch are copied back to trunk. # This message can use '%(revnum)d' to include the revision number of # the revision that included the change to the vendor branch. ctx.post_commit_message = ( 'This commit was generated by cvs2svn to compensate for ' 'changes in r%(revnum)d, which included commits to RCS files ' 'with non-trunk default branches.' ) # Template for the commit message to be used for commits in which # symbols are created. This message can use '%(symbol_type)s' to # include the type of the symbol ('branch' or 'tag') or # '%(symbol_name)s' to include the name of the symbol. ctx.symbol_commit_message = ( "This commit was manufactured by cvs2svn to create %(symbol_type)s " "'%(symbol_name)s'." ) # Some CVS clients for MacOS store resource fork data into CVS along # with the file contents itself by wrapping it all up in a container # format called "AppleSingle". Subversion currently does not support # MacOS resource forks. Nevertheless, sometimes the resource fork # information is not necessary and can be discarded. 
Set the # following option to True if you would like cvs2svn to identify files # whose contents are encoded in AppleSingle format, and discard all # but the data fork for such files before committing them to # Subversion. (Please note that AppleSingle contents are identified # by the AppleSingle magic number as the first four bytes of the file. # This check is not failproof, so only set this option if you think # you need it.) ctx.decode_apple_single = False # This option can be set to the name of a filename to which are stored # statistics and conversion decisions about the CVS symbols. ctx.symbol_info_filename = None #ctx.symbol_info_filename = 'symbol-info.txt' # cvs2svn uses "symbol strategy rules" to help decide how to handle # CVS symbols. The rules in a project's symbol_strategy_rules are # applied in order, and each rule is allowed to modify the symbol. # The result (after each of the rules has been applied) is used for # the conversion. # # 1. A CVS symbol might be used as a tag in one file and as a branch # in another file. cvs2svn has to decide whether to convert such a # symbol as a tag or as a branch. cvs2svn uses a series of # heuristic rules to decide how to convert a symbol. The user can # override the default rules for specific symbols or symbols # matching regular expressions. # # 2. cvs2svn is also capable of excluding symbols from the conversion # (provided no other symbols depend on them. # # 3. CVS does not record unambiguously the line of development from # which a symbol sprouted. cvs2svn uses a heuristic to choose a # symbol's "preferred parents". # # The standard branch/tag/exclude StrategyRules do not change a symbol # that has already been processed by an earlier rule, so in effect the # first matching rule is the one that is used. global_symbol_strategy_rules = [ # It is possible to specify manually exactly how symbols should be # converted and what line of development should be used as the # preferred parent. To do so, create a file containing the symbol # hints and enable the following option. # # The format of the hints file is described in the documentation # for the SymbolHintsFileRule class in # cvs2svn_lib/symbol_strategy.py. The file output by the # --write-symbol-info (i.e., ctx.symbol_info_filename) option is # in the same format. The simplest way to use this option is to # run the conversion through CollateSymbolsPass with # --write-symbol-info option, copy the symbol info and edit it to # create a hints file, then re-start the conversion at # CollateSymbolsPass with this option enabled. #SymbolHintsFileRule('symbol-hints.txt'), # To force all symbols matching a regular expression to be # converted as branches, add rules like the following: #ForceBranchRegexpStrategyRule(r'branch.*'), # To force all symbols matching a regular expression to be # converted as tags, add rules like the following: #ForceTagRegexpStrategyRule(r'tag.*'), # To force all symbols matching a regular expression to be # excluded from the conversion, add rules like the following: #ExcludeRegexpStrategyRule(r'unknown-.*'), # Sometimes people use "cvs import" to get their own source code # into CVS. This practice creates a vendor branch 1.1.1 and # imports the code onto the vendor branch as 1.1.1.1, then copies # the same content to the trunk as version 1.1. Normally, such # vendor branches are useless and they complicate the SVN history # unnecessarily. 
The following rule excludes any branches that # only existed as a vendor branch with a single import (leaving # only the 1.1 revision). If you want to retain such branches, # comment out the following line. (Please note that this rule # does not exclude vendor *tags*, as they are not so easy to # identify.) ExcludeTrivialImportBranchRule(), # To exclude all vendor branches (branches that had "cvs import"s # on them but no other kinds of commits), uncomment the following # line: #ExcludeVendorBranchRule(), # Usually you want this rule, to convert unambiguous symbols # (symbols that were only ever used as tags or only ever used as # branches in CVS) the same way they were used in CVS: UnambiguousUsageRule(), # If there was ever a commit on a symbol, then it cannot be # converted as a tag. This rule causes all such symbols to be # converted as branches. If you would like to resolve such # ambiguities manually, comment out the following line: BranchIfCommitsRule(), # Last in the list can be a catch-all rule that is used for # symbols that were not matched by any of the more specific rules # above. (Assuming that BranchIfCommitsRule() was included above, # then the symbols that are still indeterminate at this point can # sensibly be converted as branches or tags.) Include at most one # of these lines. If none of these catch-all rules are included, # then the presence of any ambiguous symbols (that haven't been # disambiguated above) is an error: # Convert ambiguous symbols based on whether they were used more # often as branches or as tags: HeuristicStrategyRule(), # Convert all ambiguous symbols as branches: #AllBranchRule(), # Convert all ambiguous symbols as tags: #AllTagRule(), # The last rule is here to choose the preferred parent of branches # and tags, that is, the line of development from which the symbol # sprouts. HeuristicPreferredParentRule(), ] # Specify a username to be used for commits generated by cvs2svn. If # this option is set to None then no username will be used for such # commits: ctx.username = None #ctx.username = 'cvs2svn' # ctx.file_property_setters and ctx.revision_property_setters contain # rules used to set the svn properties on files in the converted # archive. For each file, the rules are tried one by one. Any rule # can add or suppress one or more svn properties. Typically the rules # will not overwrite properties set by a previous rule (though they # are free to do so). ctx.file_property_setters should be used for # properties that remain the same for the life of the file; these # should implement FilePropertySetter. ctx.revision_property_setters # should be used for properties that are allowed to vary from revision # to revision; these should implement RevisionPropertySetter. ctx.file_property_setters.extend([ # To read auto-props rules from a file, uncomment the following line # and specify a filename. 
The boolean argument specifies whether # case should be ignored when matching filenames to the filename # patterns found in the auto-props file: #AutoPropsPropertySetter( # r'/home/username/.subversion/config', # ignore_case=True, # ), # To read mime types from a file and use them to set svn:mime-type # based on the filename extensions, uncomment the following line # and specify a filename (see # http://en.wikipedia.org/wiki/Mime.types for information about # mime.types files): #MimeMapper(r'/etc/mime.types', ignore_case=False), # Omit the svn:eol-style property from any files that are listed # as binary (i.e., mode '-kb') in CVS: CVSBinaryFileEOLStyleSetter(), # If the file is binary and its svn:mime-type property is not yet # set, set svn:mime-type to 'application/octet-stream'. CVSBinaryFileDefaultMimeTypeSetter(), # To try to determine the eol-style from the mime type, uncomment # the following line: #EOLStyleFromMimeTypeSetter(), # Choose one of the following lines to set the default # svn:eol-style if none of the above rules applied. The argument # is the svn:eol-style that should be applied, or None if no # svn:eol-style should be set (i.e., the file should be treated as # binary). # # The default is to treat all files as binary unless one of the # previous rules has determined otherwise, because this is the # safest approach. However, if you have been diligent about # marking binary files with -kb in CVS and/or you have used the # above rules to definitely mark binary files as binary, then you # might prefer to use 'native' as the default, as it is usually # the most convenient setting for text files. Other possible # options: 'CRLF', 'CR', 'LF'. DefaultEOLStyleSetter(None), #DefaultEOLStyleSetter('native'), # Prevent svn:keywords from being set on files that have # svn:eol-style unset. SVNBinaryFileKeywordsPropertySetter(), # If svn:keywords has not been set yet, set it based on the file's # CVS mode: KeywordsPropertySetter(config.SVN_KEYWORDS_VALUE), # Set the svn:executable flag on any files that are marked in CVS as # being executable: ExecutablePropertySetter(), # Set the cvs:description property to the CVS description of any # file that has one: DescriptionPropertySetter(propname='cvs:description'), # The following is for internal use. It determines how to handle # keywords in the text being committed: SVNKeywordHandlingPropertySetter(), # The following is for internal use. It determines how to munge # EOL sequences based on how the svn:eol-style property is set. SVNEOLFixPropertySetter(), ]) ctx.revision_property_setters.extend([ # Uncomment the following line to include the original CVS revision # numbers as file properties in the SVN archive: #CVSRevisionNumberSetter(propname='cvs2svn:cvs-rev'), ]) # To skip the cleanup of temporary files, uncomment the following # option: #ctx.skip_cleanup = True # In CVS, it is perfectly possible to make a single commit that # affects more than one project or more than one branch of a single # project. Subversion also allows such commits. Therefore, by # default, when cvs2svn sees what looks like a cross-project or # cross-branch CVS commit, it converts it into a # cross-project/cross-branch Subversion commit. # # However, other tools and SCMs have trouble representing # cross-project or cross-branch commits. (For example, Trac's Revtree # plugin, http://www.trac-hacks.org/wiki/RevtreePlugin is confused by # such commits.) Therefore, we provide the following two options to # allow cross-project/cross-branch commits to be suppressed. 
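# As a purely illustrative example (not needed for a typical
# conversion): a hypothetical setup that feeds the converted history
# to a tool unable to represent mixed commits might disable both of
# the options that follow, like so:
#
#ctx.cross_project_commits = False
#ctx.cross_branch_commits = False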
# To prevent CVS commits from different projects from being merged # into single SVN commits, change this option to False: ctx.cross_project_commits = True # To prevent CVS commits on different branches from being merged into # single SVN commits, change this option to False: ctx.cross_branch_commits = True # By default, .cvsignore files are rendered in the output by setting # corresponding svn:ignore properties on the parent directory, but the # .cvsignore files themselves are not included in the conversion # output. If you would like to include the .cvsignore files in the # output, change this option to True: ctx.keep_cvsignore = False # By default, it is a fatal error for a CVS ",v" file to appear both # inside and outside of an "Attic" subdirectory (this should never # happen, but frequently occurs due to botched repository # administration). If you would like to retain both versions of such # files, change the following option to True, and the attic version of # the file will be left in an SVN subdirectory called "Attic": ctx.retain_conflicting_attic_files = False # Now use stanzas like the following to define CVS projects that # should be converted. The arguments are: # # - The filesystem path of the project within the CVS repository. # # - The path that should be used for the "trunk" directory of this # project within the SVN repository. This is an SVN path, so it # should always use forward slashes ("/"). # # - The path that should be used for the "branches" directory of this # project within the SVN repository. This is an SVN path, so it # should always use forward slashes ("/"). # # - The path that should be used for the "tags" directory of this # project within the SVN repository. This is an SVN path, so it # should always use forward slashes ("/"). # # - A list of symbol transformations that can be used to rename # symbols in this project. Each entry is a tuple (pattern, # replacement), where pattern is a Python regular expression pattern # and replacement is the text that should replace the pattern. Each # pattern is matched against each symbol name. If the pattern # matches, then it is replaced with the corresponding replacement # text. The replacement can include substitution patterns (e.g., # r'\1' or r'\g'). Typically you will want to use raw strings # (strings with a preceding 'r', like shown in the examples) for the # regexp and its replacement to avoid backslash substitution within # those strings. # Create the default project (using ctx.trunk, ctx.branches, and ctx.tags): run_options.add_project( r'test-data/main-cvsrepos', trunk_path='trunk', branches_path='branches', tags_path='tags', initial_directories=[ # The project's trunk_path, branches_path, and tags_path # directories are added to the SVN repository in the project's # first commit. If you would like additional SVN directories # to be created in the project's first commit, list them here: #'releases', ], symbol_transforms=[ # Use IgnoreSymbolTransforms like the following to completely # ignore symbols matching a regular expression when parsing # the CVS repository, for example to avoid warnings about # branches with two names and to choose the preferred name. # It is *not* recommended to use this instead of # ExcludeRegexpStrategyRule; though more efficient, # IgnoreSymbolTransforms are less flexible and don't exclude # branches correctly. 
The argument is a Python-style regular # expression that has to match the *whole* CVS symbol name: #IgnoreSymbolTransform(r'nightly-build-tag-.*') # RegexpSymbolTransforms transform symbols textually using a # regular expression. The first argument is a Python regular # expression pattern and the second is a replacement pattern. # The pattern is matched against each symbol name. If it # matches the whole symbol name, then the symbol name is # replaced with the corresponding replacement text. The # replacement can include substitution patterns (e.g., r'\1' # or r'\g'). Typically you will want to use raw strings # (strings with a preceding 'r', like shown in the examples) # for the regexp and its replacement to avoid backslash # substitution within those strings. #RegexpSymbolTransform(r'release-(\d+)_(\d+)', # r'release-\1.\2'), #RegexpSymbolTransform(r'release-(\d+)_(\d+)_(\d+)', # r'release-\1.\2.\3'), # Simple 1:1 character replacements can also be done. The # following transform, which converts backslashes into forward # slashes, should usually be included: ReplaceSubstringsSymbolTransform('\\','/'), # Eliminate leading, trailing, and repeated slashes. This # transform should always be included: NormalizePathsSymbolTransform(), ], symbol_strategy_rules=[ # Additional, project-specific symbol strategy rules can # be added here. ] + global_symbol_strategy_rules, # Exclude paths from the conversion. Should be relative to # repository path and use forward slashes: #exclude_paths=['file-to-exclude.txt,v', 'dir/to/exclude'], ) # Add a second project, to be stored to projA/trunk, projA/branches, # and projA/tags: #run_options.add_project( # r'my/cvsrepo/projA', # trunk_path='projA/trunk', # branches_path='projA/branches', # tags_path='projA/tags', # initial_directories=[ # ], # symbol_transforms=[ # #RegexpSymbolTransform(r'release-(\d+)_(\d+)', # # r'release-\1.\2'), # #RegexpSymbolTransform(r'release-(\d+)_(\d+)_(\d+)', # # r'release-\1.\2.\3'), # ReplaceSubstringsSymbolTransform('\\','/'), # NormalizePathsSymbolTransform(), # ], # symbol_strategy_rules=[ # # Additional, project-specific symbol strategy rules can # # be added here. # ] + global_symbol_strategy_rules, # ) # Change this option to True to turn on profiling of cvs2svn (for # debugging purposes): run_options.profiling = False # Should CVSItem -> Changeset database files be memory mapped? In # some tests, using memory mapping speeded up the overall conversion # by about 5%. But this option can cause the conversion to fail with # an out of memory error if the conversion computer runs out of # virtual address space (e.g., when running a very large conversion on # a 32-bit operating system). Therefore it is disabled by default. # Uncomment the following line to allow these database files to be # memory mapped. #changeset_database.use_mmap_for_cvs_item_to_changeset_table = True cvs2svn-2.4.0/dist.sh0000775000076500007650000000146010646512653015516 0ustar mhaggermhagger00000000000000#!/bin/sh set -e # Build a cvs2svn distribution. VERSION=`python cvs2svn_lib/version.py` echo "Building cvs2svn ${VERSION}" WC_REV=`svnversion -n .` DIST_BASE=cvs2svn-${VERSION} DIST_FULL=${DIST_BASE}.tar.gz if echo ${WC_REV} | grep -q -e '[^0-9]'; then echo "Packaging requires a single-revision, pristine working copy." echo "" echo "Run 'svn update' to get a working copy without mixed revisions," echo "and make sure there are no local modifications." exit 1 fi # Clean up anything that might have been left from a previous run. 
rm -rf dist MANIFEST ${DIST_FULL} make clean # Build the dist, Python's way. ./setup.py sdist mv dist/${DIST_FULL} . # Clean up after this run. rm -rf dist MANIFEST # We're outta here. echo "" echo "Done:" echo "" ls -l ${DIST_FULL} md5sum ${DIST_FULL} echo "" cvs2svn-2.4.0/README0000664000076500007650000000012310646512653015067 0ustar mhaggermhagger00000000000000For documentation, see www/cvs2svn.html or http://cvs2svn.tigris.org/cvs2svn.html. cvs2svn-2.4.0/doc/0000775000076500007650000000000012027373500014747 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/doc/properties.txt0000664000076500007650000001135211710517257017715 0ustar mhaggermhagger00000000000000This file documents how file and revision properties are used in cvs2svn. cvs2svn allows arbitrary properties to be associated with CVSFile and CVSRevision instances. These properties are combined to form the effective properties for each CVSRevision. Properties set in a CVSRevision take precedence over properties set in the corresponding CVSFile. These properties can be set very flexibly by FilePropertySetter and RevisionPropertySetter objects, which in turn can be implemented arbitrarily and set via the conversion configuration file. Several types of PropertySetters are already provided, and examples of there use are shown in the example configuration files. The properties are determined early in the conversion and are retained for the duration of the conversion. CVSFile.properties holds properties that do not change for the life of the file; for example, whether keywords should be expanded in the file contents. CVSRevision.properties holds properties that can vary from one file revision to another. The only current example of a revision property is the cvs2svn:rev-num property. Properties whose names start with underscore are reserved for the internal use of cvs2svn. The properties can be used by backends for any purpose. Currently, they are used for two purposes: 1. Passing RevisionReaders information about how a file revision's contents should be transformed before being written to the new VCS. Please note that this does not necessarily correspond to how the revision contents will look after being checked out of the new VCS; for example, Subversion requires keywords to be *unexpanded* in the dumpfile stream if Subversion is going to expand them. These properties are: _keyword_handling -- How should RCS keywords be handled? 'untouched' -- The keywords should be output literally as they are recorded in the RCS file. Please note that this results in the keywords' being expanded the way they were when the revision was checked *in* to CVS, which typically reflects how CVS expanded them when the *previous* revision was checked *out* of CVS. This mode is appropriate for binary files. 'collapsed' -- The keywords should be collapsed in the output; e.g., "$Author: jrandom $" -> "$Author$". This mode is appropriate for output of non-binary files to Subversion (because Subversion re-expands the keywords itself) and might be useful for text files for other VCSs if you would like this history to be "clean" of keywords. 'expanded' -- The keywords should be expanded in the output the same way as CVS would expand keywords when checking out the revision; e.g., "$Author$" -> "$Author: jrandom $". If this value is used, keywords are expanded regardless of whether CVS considers the file to be a text file. 
This mode might be useful for outputting text files to other VCSs if you would like the content of historical revisions to be as similar as possible to the content as it would be checked out of CVS. 'deleted' -- The keywords and their values (and some surrounding whitespace?) should be deleted entirely. NOT YET IMPLEMENTED. 'replaced' -- The keywords should be deleted entirely and replaced by their values; e.g., "$Author$" -> "jrandom", like CVS's "-kv" option. This is not a very useful feature, but is listed for completeness. NOT YET IMPLEMENTED. _eol_fix -- Should end-of-line sequences be made uniform before committing to the target VCS? If this property is set to a non-empty value, then every end-of-line character sequence ('\n', '\r\n', or '\r') is converted to the specified value (which should obviously be a valid end-of-line character sequence). If this property is not set, then the end-of-line character sequences are output literally as they are recorded in the RCS file. 2. cvs2svn: Specifying Subversion versioned properties. Any properties that do not start with an underscore are converted into Subversion versioned properties on the associated file. By this mechanism, arbitrary Subversion properties can be set. A number of PropertySetters are provided to set common Subversion properties such as svn:mime-type, svn:eol-style, svn:executable, and svn:keywords. Other properties can be set via the AutoPropsPropertySetter or by implementing custom PropertySetters. cvs2svn-2.4.0/doc/symbol-notes.txt0000664000076500007650000003561111317026325020152 0ustar mhaggermhagger00000000000000This is a description of how symbols (tags and branches) are handled by cvs2svn, determined by reading the code. Notation ======== CVSFile -- a single file within the CVS repository. This object basically only records the filename of the corresponding RCS file, and the relative filename that this file will have within the SVN repository. A single CVSFile object is used for all of the CVSItems on all lines of development related to that file. The following terms and the corresponding classes represent project-wide concepts. For example, a project will only have a single Branch named "foo" even if many files appear on that branch. Each of these objects is assigned a unique integer ID during CollectRevsPass which is preserved during the entire conversion (even if, say, a Branch is mutated into a Tag). Trunk -- the main line of development for a particular Project in CVS. The Trunk class inherits from LineOfDevelopment. Symbol -- a Branch or a Tag within a particular Project (see below). Instances of this class are also used to represent symbols early in the conversion, before it has been decided whether to convert the symbol as a Branch or as a Tag. A Symbol contains an id, a Project, and a name. Branch -- a symbol within a particular Project that will be treated as a branch in SVN. Usually corresponds to a branch tag in CVS, but might be a non-branch tag that was mutated in CollateSymbolsPass. In SVN, this will correspond to a subdirectory of the project's "branches" directory. The Branch class inherits from Symbol and from LineOfDevelopment. Tag -- a symbol within a particular Project that will be treated as a tag in SVN. Usually corresponds to a non-branch tag in CVS, but might be a branch tag that was mutated in CollateSymbolsPass. In SVN, this will correspond to a subdirectory of the project's "tags" directory. The Tags class inherits from Symbol and from LineOfDevelopment. 
ExcludedSymbol -- a CVS symbol that will be excluded from the cvs2svn output. LineOfDevelopment -- a Trunk, Branch, or Tag. The following terms and the corresponding classes represent particular CVS events in particular CVS files. For example, the CVSBranch representing the creation of Branch "foo" in one file will be distinct from the CVSBranch representing the creation of branch "foo" in another file, even if the two files are in the same Project. Each CVSItem is assigned a unique integer ID during CollectRevsPass which is preserved during the entire conversion (even if, say, a CVSBranch is mutated into a CVSTag). CVSItem -- abstract base class representing any discernible event within a single RCS file, for example the creation of revision 1.6, or the tagging of the file with tag "bar". Each CVSItem has a unique integer ID. CVSRevision -- a particular revision within a particular file (e.g., file.txt:1.6). A CVSRevision occurs on a particular LineOfDevelopment. CVSRevision inherits from CVSItem. CVSSymbol -- a CVSBranch or CVSTag (see below). CVSSymbol inherits from CVSItem. CVSBranch -- the creation of a particular Branch on a particular file. A CVSBranch has a Symbol instance telling the Symbol associated with the branch, and also records the LineOfDevelopment from which the branch was created. In the SVN repository, a CVSBranch corresponds to an "svn copy" of a file to a subdirectory of the project's "branches" directory. CVSBranch inherits from CVSSymbol. CVSTag -- the creation of a particular Tag on a particular file. A CVSTag has a Symbol instance telling the Symbol associated with the tag, and also records the LineOfDevelopment from which the tag was created. In the SVN repository, a CVSTag corresponds to an "svn copy" of a file to a subdirectory of the project's "tags" directory. CVSTag inherits from CVSSymbol. CollectRevsPass =============== Collect all information about CVS tags and branches from the CVS repository. For each project, create a Trunk object to represent the trunk line of development for that project. The Trunk object for one Project is distinct from the Trunk objects for other Projects. For each symbol name seen in each project, create a Symbol object. The Symbol object contains its id, project, and name. The very first thing that is done when a symbol is read is that the Project's symbol transform rules are given a chance to transform the symbol name (or even cause it to be discarded). The result of the transformation is used as the symbol name in the rest of the program. Because this transformation process is so low-level, it is capable of making a more fundamental kind of change than the symbol strategy rules that come later: * Symbols can be renamed. * Symbols can be fully discarded, as if they never appeared in the CVS repository. This can even be done for a malformed symbol or for a branch symbol that refers to the same branch as another branch symbol (which would otherwise be a fatal error). * Two distinct symbols in different files within the same project can be transformed to the same name, in which case they are treated as a single symbol. * Two distinct symbols within a single file can be transformed to the same name, provided they refer to the same revision number. This effectively discards one of the symbols. * Two symbols with the same name in different files can be given distinct names, in which case they are treated as completely separate symbols. 
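The ambiguity is visible from the branch numbers alone.  The
following sketch is illustrative only (it is not code from cvs2svn
itself); it lists the candidate sources of a branch, given its
branch number:

    def candidate_parents(branch_number):
        """For a CVS branch number such as '1.3.6', return the
        revision it sprouts from and the earlier sibling branches
        that could equally well have been its source."""
        parts = branch_number.split('.')
        assert len(parts) % 2 == 1 and len(parts) >= 3
        sprout_revision = '.'.join(parts[:-1])    # e.g., '1.3'
        last = int(parts[-1])
        # Sibling branches with smaller even numbers at the same
        # sprout revision were normally created earlier and are
        # therefore possible parents, in addition to the sprout
        # revision's own line of development.  (Because CVS recycles
        # branch numbers, later siblings cannot always be ruled out
        # either, as noted above.)
        siblings = ['%s.%d' % (sprout_revision, n)
                    for n in range(2, last, 2)]
        return sprout_revision, siblings

    # candidate_parents('1.3.6') == ('1.3', ['1.3.2', '1.3.4'])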
For each Symbol object, collect the following statistics: * In how many files was the symbol used as a branch and in how many was it used as a tag. * In how many files was there a commit on a branch with that name. * Which other symbols branched off of a branch with that name. * In how many files could each other line of development have served as the source of this symbol. These are called the "possible parents" of the symbol. These statistics are used in CollateSymbolsPass to determine which symbols can be excluded or converted from tags to branches or vice versa. The possible parents information is important because CVS is ambiguous about what line of development was the source of a branch. A branch numbered 1.3.6 might have been created from trunk (revision 1.3), from branch 1.3.2, or from branch 1.3.4; it is simply impossible to tell based on the information in a single RCS file. [Actually, the situation is even more confusing. If a branch tag is deleted from CVS, the branch number is recycled. So it is even possible that branch 1.3.6 was created from branch 1.3.8 or 1.3.10 or ... We address this confusion by noting the order that the branches were listed in the RCS file header. It appears that CVS lists branches in the header in reverse chronological order of creation.] For each tag seen within each file, create a CVSTag object recording its id, CVSFile, Symbol, and the id of the CVSRevision being tagged. For each branch seen within each file, create a CVSBranch object recording an id, CVSFile, Symbol, the branch number (e.g., '1.4.2'), the id of the CVSRevision from which the branch sprouts, and the id of the first CVSRevision on the branch (if any). For each revision seen within each file, create a CVSRevision object recording (among other things) and id, the line of development (trunk or branch) on which the revision appeared, a list of ids of CVSTags tagging the revision, and a list of ids of CVSBranches sprouting from the revision. This pass also adjusts the CVS dependency tree to work around some CVS quirks. (See design-notes.txt for the details.) These adjustments can result in CVSBranches being deleted, for example, if a file was added on a branch. In such a case, any CVSRevisions that were previously on the branch will be created by adding the file to the branch directory, rather than copying the file from the source directory to the branch directory. CleanMetadataPass ================= N/A CollateSymbolsPass ================== Allow the project's symbol strategy rules to affect how symbols are converted: * A symbol can be excluded from the conversion output (as long as there are no other non-excluded symbols that depend on it). In this case, the Symbol will be converted into an ExcludedSymbol instance. * A tag symbol can be converted as a branch. In this case, the Symbol will be converted into a Branch instance. * A branch symbol can be converted as a tag, provided there were never any commits on the branch. In this case, the Symbol will be converted into a Tag instance. * The SVN path where a symbol will be placed is determined. Typically, symbols are laid out in the standard trunk/branches/tags Subversion repository layout, but strategy rules can in fact place symbols arbitrarily. * The preferred parent of each symbol is determined. The preferred parent of a Symbol is chosen to be the line of development that appeared as a possible parent of this symbol in the most CVSFiles. 
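The preferred-parent choice in the last point boils down to a per-symbol vote count over the "possible parents" statistics collected in CollectRevsPass.  A rough sketch of the idea (the data shapes are hypothetical, not the real classes):

  def choose_preferred_parent(possible_parent_counts):
    """Return the preferred parent for one symbol.

    possible_parent_counts maps each candidate line of development to
    the number of files in which it could have served as this symbol's
    source."""

    best_count = max(possible_parent_counts.values())
    candidates = [
        lod
        for (lod, count) in possible_parent_counts.items()
        if count == best_count
        ]
    # Break ties deterministically; the real tie-breaking heuristic may
    # differ.
    return min(candidates)

For example, choose_preferred_parent({'trunk': 7, '1.3.2': 2, '1.3.4': 1}) picks 'trunk', matching the rule stated in the last bullet above.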
This pass creates the symbol database, SYMBOL_DB, which is accessed in later passes via the SymbolDatabase class. The SymbolDatabase contains TypedSymbol (Branch, Tag, or ExcludedSymbol) instances indicating how each symbol should be processed in the conversion. The ID used for a TypedSymbol is the same as the ID allocated to the corresponding symbol in CollectRevsPass, so references in CVSItems do not have to be updated. FilterSymbolsPass ================= Iterate through all of the CVSItems, mutating CVSTags to CVSBranches and vice versa and excluding other CVSSymbols as specified by the types of the TypedSymbols in the SymbolDatabase. Additionally, filter out any CVSRevisions that reside on excluded CVSBranches. Write a line of text to CVS_SYMBOLS_DATAFILE for each surviving CVSSymbol, containing its Symbol id and a pickled version of the CVSSymbol. (This file will be sorted in SortSymbolsPass then used in InitializeChangesetsPass to create SymbolChangesets.) Also adjust the file's dependency tree by grafting CVSSymbols onto their preferred parents. This is not always possible; if not, leave the CVSSymbol where it was. Finally, record symbol "openings" and "closings". A CVSSymbol is considered "opened" by the CVSRevision or CVSBranch from which the CVSSymbol sprouts. A CVSSymbol is considered "closed" by the CVSRevision that overwrites or deletes the CVSSymbol's opening. (Every CVSSymbol has an opening, but not all of them have closings; for example, the opening CVSRevision might still exist at HEAD.) Record in each CVSRevision and CVSBranch a list of all of the CVSSymbols that it opens. Record in each CVSRevision a list of all of the CVSSymbols that it closes (CVSBranches cannot close CVSSymbols). SortRevisionsPass ================= N/A SortSymbolsPass =============== Sort CVS_SYMBOLS_DATAFILE, creating CVS_SYMBOLS_SORTED_DATAFILE. The sort groups together symbol items that might be added to the same SymbolChangeset. InitializeChangesetsPass ======================== Read CVS_SYMBOLS_SORTED_DATAFILE, grouping CVSSymbol items with the same Symbol id into SymbolChangesets. BreakRevisionChangesetCyclesPass ================================ N/A RevisionTopologicalSortPass =========================== N/A BreakSymbolChangesetCyclesPass ============================== Read in the changeset graph consisting only of SymbolChangesets and break up symbol changesets as necessary to break any cycles that are found. BreakAllChangesetCyclesPass =========================== Read in the entire changeset graph and break up symbol changesets as necessary to break any cycles that are found. TopologicalSortPass =================== Update the conversion statistics with excluded symbols omitted. CreateRevsPass ============== Create SVNCommits and assign svn revision numbers to each one. Create a database (SVN_COMMITS_*) to map svn revision numbers to SVNCommits and another (CVS_REVS_TO_SVN_REVNUMS) to map each CVSRevision id to the number of the svn revision containing it. Also, SymbolingsLogger writes a line to SYMBOL_OPENINGS_CLOSINGS for each opening or closing for each CVSSymbol, noting in what SVN revision the opening or closing occurred. SortSymbolOpeningsClosingsPass ============================== This pass sorts SYMBOL_OPENINGS_CLOSINGS into SYMBOL_OPENINGS_CLOSINGS_SORTED. This orders the file first by symbol ID, and second by Subversion revision number, thus grouping all openings and closings for each symbolic name together. 
IndexSymbolsPass ================ Iterate through all the lines in SYMBOL_OPENINGS_CLOSINGS_SORTED, writing out a pickled map to SYMBOL_OFFSETS_DB telling at what offset in SYMBOL_OPENINGS_CLOSINGS_SORTED the lines corresponding to each Symbol begin. This will allow us to seek to the various offsets in the file and sequentially read only the openings and closings that we need. OutputPass ========== The filling of a symbol is triggered when SVNSymbolCommit.commit() calls SVNRepositoryMirror.fill_symbol(). The SVNSymbolCommit contains the list of CVSSymbols that have to be copied to a symbol directory in this revision. However, we still have to do a lot of work to figure out what SVN revision number to use as the source of these copies, and also to group file copies together into directory copies when possible. The SYMBOL_OPENINGS_CLOSINGS_SORTED file lists the opening and closing SVN revision of each revision that has to be copied to the symbol directory. We use this information to try to find SVN revision numbers that can serve as the source for as many files as possible, to avoid having to pick and choose sources from many SVN revisions. Furthermore, when a bunch of files in a directory have to be copied at the same time, it is cheaper to copy the directory as a whole. But if not *all* of the files within the directory had to be copied, then the unneeded files have to be deleted again from the copied directory. Or if some of the files have to be copied from different source SVN revision numbers, then those files have to be overwritten in the copied directory with the correct versions. Finally, it can happen that a single Symbol has to be filled multiple times (because the initial SymbolChangeset had to be broken up). In this case, the first fill can copy the source directory to the destination directory (maybe with fixups), but subsequent copies have to copy individual files to avoid overwriting content that is already present in the destination directory. To figure all of this out, we need to know all of the files that existed in every previous SVN revision, in every line of development. This is done using the SVNRepositoryMirror class, which keeps a skeleton record of the entire SVN history in a database using data structures similar to those used by SVN itself. cvs2svn-2.4.0/doc/design-notes.txt0000664000076500007650000007027611434364605020132 0ustar mhaggermhagger00000000000000 How cvs2svn Works ================= Theory and requirements ------ --- ------------ There are two main problem converting a CVS repository to SVN: - CVS does not record enough information to determine what actually happened to a repository. For example, CVS does not record: - Which file modifications were part of the same commit - The timestamp of tag and branch creations - Exactly which revision was the base of a branch (there is ambiguity between x.y, x.y.2.0, x.y.4.0, etc.) - When the default branch was changed (for example, from a vendor branch back to trunk). - The timestamps in a CVS archive are not reliable. It can easily happen that timestamps are not even monotonic, and large errors (for example due to a failing server clock battery) are not unusual. 
The absolutely crucial, sine qua non requirement of a conversion is that the dependency relationships within a file be honored, mainly: - A revision depends on its predecessor - A branch creation depends on the revision from which it branched, and commits on the branch depend on the branch creation - A tag creation depends on the revision being tagged These dependencies are reliably defined in the CVS repository, and they trump all others, so they are the scaffolding of the conversion. Moreover, it is highly desirable that the timestamps of the SVN commits be monotonically increasing. Within these constraints we also want the results of the conversion to resemble the history of the CVS repository as closely as possible. For example, the set of file changes grouped together in an SVN commit should be the same as the files changed within the corresponding CVS commit, insofar as that can be achieved in a manner that is consistent with the dependency requirements. And the SVN commit timestamps should recreate the time of the CVS commit as far as possible without violating the monotonicity requirement. The basic idea of the conversion is this: create the largest conceivable changesets, then split up changesets as necessary to break any cycles in the graph of changeset dependencies. When all cycles have been removed, then do a topological sort of the changesets (with ambiguities resolved using CVS timestamps) to determine a self-consistent changeset commit order. The quality of the conversion (not in terms of correctness, but in terms of minimizing the number of svn commits) is mostly determined by the cleverness of the heuristics used to split up cycles. And all of this has to be affordable, especially in terms of conversion time and RAM usage, for even the largest CVS repositories. Implementation -------------- A cvs2svn run consists of a number of passes. Each pass saves the data it produces to files on disk, so that a) we don't hold huge amounts of state in memory, and b) the conversion process is resumable. The intermediate files are referred to here by the symbolic constants holding their filenames in config.py. CollectRevsPass (formerly called pass1) =============== The goal of this pass is to collect from the CVS files all of the data that will be required for the conversion. If the --use-internal-co option was used, this pass also collects the file delta data; for -use-rcs or -use-cvs, the actual file contents are read again in OutputPass. To collect this data, we walk over the repository, collecting data about the RCS files into an instance of CollectData. Each RCS file is processed with rcsparse.parse(), which invokes callbacks from an instance of cvs2svn's _FileDataCollector class (which is a subclass of rcsparse.Sink). While a file is being processed, all of the data for the file (except for contents and log messages) is held in memory. When the file has been read completely, its data is converted into an instance of CVSFileItems, and this instance is manipulated a bit then pickled and stored to CVS_ITEMS_STORE. For each RCS file, the first thing the parser encounters is the administrative header, including the head revision, the principal branch, symbolic names, RCS comments, etc. The main thing that happens here is that _FileDataCollector.define_tag() is invoked on each symbolic name and its attached revision, so all the tags and branches of this file get collected. Next, the parser hits the revision summary section. 
That's the part of the RCS file that looks like this: 1.6 date 2002.06.12.04.54.12; author captnmark; state Exp; branches 1.6.2.1; next 1.5; 1.5 date 2002.05.28.18.02.11; author captnmark; state Exp; branches; next 1.4; [...] For each revision summary, _FileDataCollector.define_revision() is invoked, recording that revision's metadata in various variables of the _FileDataCollector class instance. Next, the parser encounters the *real* revision data, which has the log messages and file contents. For each revision, it invokes _FileDataCollector.set_revision_info(), which sets some more fields in _RevisionData. When the parser is done with the file, _ProjectDataCollector takes the resulting CVSFileItems object and manipulates it to handle some CVS features: - If the file had a vendor branch, make some adjustments to the file dependency graph to reflect implicit dependencies related to the vendor branch. Also delete the 1.1 revision in the usual case that it doesn't contain any useful information. - If the file was added on a branch rather than on trunk, then delete the "dead" 1.1 revision on trunk in the usual case that it doesn't contain any useful information. - If the file was added on a branch after it already existed on trunk, then recent versions of CVS add an extra "dead" revision on the branch. Remove this revision in the usual case that it doesn't contain any useful information, and sever the branch from trunk (since the branch version is independent of the trunk version). - If the conversion was started with the --trunk-only option, then 1. graft any non-trunk default branch revisions onto trunk (because they affect the history of the default branch), and 2. delete all branches and tags and all remaining branch revisions. Finally, the CVSFileItems instance is stored to a database and statistics about how symbols were used in the file are recorded. That's it -- the RCS file is done. When every CVS file is done, CollectRevsPass is complete, and: - The basic information about each project is stored to PROJECTS. - The basic information about each file and directory (filename, path, etc) is written as a pickled CVSPath instance to CVS_PATHS_DB. - Information about each symbol seen, along with statistics like how often it was used as a branch or tag, is written as a pickled symbol_statistics._Stat object to SYMBOL_STATISTICS. This includes the following information: ID -- a unique positive identifying integer NAME -- the symbol name TAG_CREATE_COUNT -- the number of times the symbol was used as a tag BRANCH_CREATE_COUNT -- the number of times the symbol was used as a branch BRANCH_COMMIT_COUNT -- the number of files in which there was a commit on a branch with this name. BRANCH_BLOCKERS -- the set of other symbols that ever sprouted from a branch with this name. (A symbol cannot be excluded from the conversion unless all of its blockers are also excluded.) POSSIBLE_PARENTS -- a count of in how many files each other branch could have served as the symbol's source. These data are used to look for inconsistencies in the use of symbols under CVS and to decide which symbols can be excluded or forced to be branches and/or tags. The POSSIBLE_PARENTS data is used to pick the "optimum" parent from which the symbol should sprout in as many files as possible. For a multiproject conversion, distinct symbol records (and IDs) are created for symbols in separate projects, even if they have the same name. This is to prevent symbols in separate projects from being filled at the same time. 
- Information about each CVS event is converted into a CVSItem instance and stored to CVS_ITEMS_STORE. There are several types of CVSItems: CVSRevision -- A specific revision of a specific CVS file. CVSBranch -- The creation of a branch tag in a specific CVS file. CVSTag -- The creation of a non-branch tag in a specific CVS file. The CVSItems are grouped into CVSFileItems instances, one per CVSFile. But a multi-file commit will still be scattered all over the place. - Selected metadata for each CVS revision, including the author and log message, is written to METADATA_INDEX_TABLE and METADATA_STORE. The purpose is twofold: first, to save space by not having to save this information multiple times, and second because CVSRevisions that have the same metadata are candidates to be combined into an SVN changeset. First, an SHA digest is created for each set of metadata. The digest is constructed so that CVSRevisions that can be combined are all mapped to the same digest. CVSRevisions that were part of a single CVS commit always have a common author and log message, therefore these fields are always included in the digest. Moreover: - if ctx.cross_project_commits is False, we avoid combining CVS revisions from separate projects by including the project.id in the digest. - if ctx.cross_branch_commits is False, we avoid combining CVS revisions from different branches by including the branch name in the digest. During the database creation phase, the database keeps track of a map digest (20-byte string) -> metadata_id (int) to allow the record for a set of metadata to be located efficiently. As data are collected, it stores a map metadata_id (int) -> (author, log_msg,) (tuple) into the database for use in future passes. CVSRevision records include the metadata_id. During this run, each CVSFile, Symbol, CVSItem, and metadata record is assigned an arbitrary unique ID that is used throughout the conversion to refer to it. CleanMetadataPass ================= Encode the cvs revision metadata as UTF-8, ensuring that all entries can be decoded using the chosen encodings. Output the results to METADATA_CLEAN_INDEX_TABLE and METADATA_CLEAN_STORE. CollateSymbolsPass ================== Use the symbol statistics collected in CollectRevsPass and any runtime options to determine which symbols should be treated as branches, which as tags, and which should be excluded from the conversion altogether. Create SYMBOL_DB, which contains a pickle of a list of TypedSymbol (Branch, Tag, or ExcludedSymbol) instances indicating how each symbol should be processed in the conversion. The IDs used for a TypedSymbol is the same as the ID allocated to the corresponding symbol in CollectRevsPass, so references in CVSItems do not have to be updated. FilterSymbolsPass ================= This pass works through the CVSFileItems instances stored in CVS_ITEMS_STORE, processing all of the items from each file as a group. (This is the last pass in which all of the CVSItems for a file are in memory at once.) It does the following things: - Exclude any symbols that CollateSymbolsPass determined should be excluded, and any revisions on such branches. Also delete references from other CVSItems to those that are being deleted. - Transform any branches to tags or vice versa, also depending on the results of CollateSymbolsPass, and fix up the references from other CVSItems. - Decide what line of development to use as the parent for each symbol in the file, and adjust the file's dependency tree accordingly. 
- For each CVSRevision, record the list of symbols that the revision opens and closes. - Write each surviving CVSRevision to CVS_REVS_DATAFILE. Each line of the file has the format METADATA_ID TIMESTAMP CVS_REVISION where TIMESTAMP is a fixed-width timestamp, and CVS_REVISION is the pickled CVSRevision in a format that does not contain any newlines. These summaries will be sorted in SortRevisionsPass then used by InitializeChangesetsPass to create preliminary RevisionChangesets. - Write the CVSSymbols to CVS_SYMBOLS_DATAFILE. Each line of the file has the format SYMBOL_ID CVS_SYMBOL where CVS_SYMBOL is the pickled CVSSymbol in a format that does not contain any newlines. This information will be sorted by SYMBOL_ID in SortSymbolsPass then used to create preliminary SymbolChangesets. - Invokes callback methods of the registered RevisionCollector. The purpose of RevisionCollectors and RevisionReaders is documented in the file revision-reader.txt. SortRevisionsPass ================= Sort CVS_REVS_DATAFILE (written by FilterSymbolsPass), creating CVS_REVS_SORTED_DATAFILE. The sort groups items that might be added to the same changeset together and, within a group, sorts revisions by timestamp. This step makes it easy for InitializeChangesetsPass to read the initial draft of RevisionChangesets straight from the file. SortSymbolsPass =============== Sort CVS_SYMBOLS_DATAFILE (written by FilterSymbolsPass), creating CVS_SYMBOLS_SORTED_DATAFILE. The sort groups together symbol items that might be added to the same changeset (though not in anything resembling chronological order). The output of this pass is used by InitializeChangesetsPass. InitializeChangesetsPass ======================== This pass creates first-draft changesets, splitting them using COMMIT_THRESHOLD and breaking up any revision changesets that have internal dependencies. The raw material for creating revision changesets is CVS_REVS_SORTED_DATAFILE, which already has CVSRevisions sorted in such a way that potential changesets are grouped together and sorted by date. The contents of this file are read line by line, and the corresponding CVSRevisions are accumulated into a changeset. Whenever the metadata_id changes, or whenever there is a time gap of more than COMMIT_THRESHOLD (currently set to 5 minutes) between CVSRevisions, then a new changeset is started. At this point a revision changeset can have internal dependencies if two commits were made to the same file with the same log message within COMMIT_THRESHOLD of each other. The next job of this pass is to split up changesets in such a way to break such internal dependencies. This is done by sorting the CVSRevisions within a changeset by timestamp, then choosing the split point that breaks the most internal dependencies. This procedure is continued recursively until there are no more dependencies internal to a single changeset. Analogously, the CVSSymbol items from CVS_SYMBOLS_SORTED_DATAFILE are grouped into symbol changesets. (Symbol changesets cannot have internal dependencies, so there is no need to break them up at this stage.) Finally, this pass writes a CVSItem database with the CVSItems written in order grouped by the preliminary changeset to which they belong. Even though the preliminary changesets still have to be split up to form final changesets, grouping the CVSItems this way improves the locality of disk accesses and thereby speeds up later passes. 
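The initial grouping rule described above -- start a new changeset whenever the metadata id changes or the gap between consecutive CVSRevisions exceeds COMMIT_THRESHOLD -- amounts to a single linear scan over the sorted summaries.  A simplified sketch (hypothetical record layout; the real pass also performs the recursive splitting described above):

  COMMIT_THRESHOLD = 5 * 60   # seconds

  def group_into_changesets(sorted_revision_summaries):
    """Yield lists of CVSRevisions forming preliminary changesets.

    sorted_revision_summaries yields (metadata_id, timestamp, cvs_rev)
    tuples in the order produced by SortRevisionsPass."""

    changeset = []
    last_metadata_id = None
    last_timestamp = None
    for (metadata_id, timestamp, cvs_rev) in sorted_revision_summaries:
      if changeset and (
          metadata_id != last_metadata_id
          or timestamp - last_timestamp > COMMIT_THRESHOLD
          ):
        yield changeset
        changeset = []
      changeset.append(cvs_rev)
      last_metadata_id = metadata_id
      last_timestamp = timestamp
    if changeset:
      yield changeset

Each group yielded this way may still contain internal dependencies (two revisions of the same file), which is why the recursive splitting described above is still needed afterwards.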
The result of this pass is three databases:

- CVS_ITEM_TO_CHANGESET, which maps CVSItem ids to the id of the changeset containing the item,

- CHANGESETS_STORE and CHANGESETS_INDEX, which contain the changeset objects themselves, indexed by changeset id, and

- CVS_ITEMS_SORTED_STORE and CVS_ITEMS_SORTED_INDEX_TABLE, which contain the pickled CVSItems ordered by changeset.

BreakRevisionChangesetCyclesPass
================================

There can still be cycles in the dependency graph of RevisionChangesets caused by:

- Interleaved commits.  Since CVS commits are not atomic, it can happen that two commits are in progress at the same time and each alters the same two files, but in different orders.  These should be small cycles involving only a few revision changesets.  To resolve these cycles, one or more of the RevisionChangesets have to be split up (eventually becoming separate svn commits).

- Cycles involving a RevisionChangeset formed by the accidental combination of unrelated items within a short period of time that have the same author and log message.  These should also be small cycles involving only a few changesets.

The job of this pass is to break up such cycles (those involving only CVSRevisions).

This pass works by building up the graph of revision changesets and their dependencies in memory, then attempting a topological sort of the changesets.  Whenever the topological sort stalls, that implies the existence of a cycle, one of which can easily be determined.  This cycle is broken through the use of heuristics that try to determine an "efficient" way of splitting one or more of the changesets that are involved.

The new RevisionChangesets are written to CVS_ITEM_TO_CHANGESET_REVBROKEN, CHANGESETS_REVBROKEN_STORE, and CHANGESETS_REVBROKEN_INDEX, along with the unmodified SymbolChangesets.  These files are in the same format as the analogous files produced by InitializeChangesetsPass.

RevisionTopologicalSortPass
===========================

Topologically sort the RevisionChangesets, thereby picking the order in which the RevisionChangesets will be committed.  (Since the previous pass eliminated any dependency cycles, this sort is guaranteed to succeed.)  Ambiguities in the topological sort are resolved using the changesets' timestamps.

Then simplify the changeset graph into a linear chain by converting each RevisionChangeset into an OrderedChangeset that stores dependency links only to its commit-order predecessor and successor.  This simplified graph enforces the commit order that resulted from the topological sort, even after the SymbolChangesets are added back into the graph later.

Store the OrderedChangesets into CHANGESETS_REVSORTED_STORE and CHANGESETS_REVSORTED_INDEX along with the unmodified SymbolChangesets.

BreakSymbolChangesetCyclesPass
==============================

It is possible for there to be cycles in the graph of SymbolChangesets caused by:

- Split creation of branches.  It is possible that branch A depends on branch B in one file, but B depends on A in another file.  These cycles can be large, but they only involve SymbolChangesets.

Break up such dependency loops.  Output the results to CVS_ITEM_TO_CHANGESET_SYMBROKEN, CHANGESETS_SYMBROKEN_STORE, and CHANGESETS_SYMBROKEN_INDEX.

BreakAllChangesetCyclesPass
===========================

The complete changeset graph (including both RevisionChangesets and BranchChangesets) can still have dependency cycles caused by:

- Split creation of branches.  The same branch tag can be added to different files at completely different times.
It is possible that the revision that was branched later depends on a RevisionChangeset that involves a file on the branch that was created earlier.  These cycles can be large, but they always involve a SymbolChangeset.  To resolve these cycles, the SymbolChangeset is split up into two changesets.

In fact, tag changesets do not have to be considered--CVSTags cannot participate in dependency cycles because no other CVSItem can depend on a CVSTag.

Since the input of this pass has been through RevisionTopologicalSortPass, all revision cycles have already been broken up and the order that the RevisionChangesets will be committed has been determined.  In this pass, the complete changeset graph is created in memory, including the linear list of OrderedChangesets from RevisionTopologicalSortPass plus all of the symbol changesets.  Because this pass doesn't break up any OrderedChangesets, it is constrained to finding places within the revision changeset sequence in which the symbol changeset commits can be inserted.

The new changesets are written to CVS_ITEM_TO_CHANGESET_ALLBROKEN, CHANGESETS_ALLBROKEN_STORE, and CHANGESETS_ALLBROKEN_INDEX, which are in the same format as the analogous files produced by InitializeChangesetsPass.

TopologicalSortPass
===================

Now that the earlier passes have broken up any dependency cycles among the changesets, it is possible to order all of the changesets in such a way that all of a changeset's dependencies are committed before the changeset itself.  This pass does so by again building up the graph of changesets in memory, then at each step picking a changeset that has no remaining dependencies and removing it from the graph.  Whenever more than one dependency-free changeset is available, symbol changesets are chosen before revision changesets.

As changesets are processed, the timestamp sequence is ensured to be monotonic by the simple expedient of adjusting retrograde timestamps to be later than their predecessor.  Timestamps that lie in the future, on the other hand, are assumed to be bogus and are adjusted backwards, also to be just later than their predecessor.

This pass writes a line to CHANGESETS_SORTED_DATAFILE for each RevisionChangeset, in the order that the changesets should be committed.  Each line contains

    CHANGESET_ID TIMESTAMP

where CHANGESET_ID is the id of the changeset in the CHANGESETS_ALLBROKEN_* databases and TIMESTAMP is the timestamp that should be assigned to it when it is committed.  Both values are written in hexadecimal.

CreateRevsPass (formerly called pass5)
==============

This pass generates SVNCommits from Changesets and records symbol openings and closings.  (One Changeset can result in multiple SVNCommits, for example if it causes symbols to be filled or copies to a vendor branch.)

This pass does the following:

1. Creates a database file to map Subversion revision numbers to SVNCommit instances (SVN_COMMITS_STORE and SVN_COMMITS_INDEX_TABLE).  Creates another database file to map CVS Revisions to their Subversion Revision numbers (CVS_REVS_TO_SVN_REVNUMS).

2. When a file is copied to a symbolic name in cvs2svn, it is copied from a specific source: either a CVSRevision, or a copy created by a previous CVSBranch of the file.  The copy has to be made from an SVN revision that is during the lifetime of the source.  The SVN revision when the source was created is called the symbol's "opening", and the SVN revision when it was deleted or overwritten is called the symbol's "closing".
In this pass, the SymbolingsLogger class writes out a line to SYMBOL_OPENINGS_CLOSINGS for each symbol opening or closing.  Note that some openings do not have closings, namely if the corresponding source is still present at the HEAD revision.

The format of each line is:

    SYMBOL_ID SVN_REVNUM TYPE CVS_SYMBOL_ID

For example:

    1c 234 O 1a7
    34 245 O 1a9
    18a 241 C 1a7
    122 201 O 1b3

Here is what the columns mean:

SYMBOL_ID -- The id of the branch or tag that has an opening in this SVN_REVNUM, in hexadecimal.

SVN_REVNUM -- The Subversion revision number in which the opening or closing occurred.  (There can be multiple openings and closings per SVN_REVNUM).

TYPE -- "O" for openings and "C" for closings.

CVS_SYMBOL_ID -- The id of the CVSSymbol instance whose opening or closing is being described, in hexadecimal.

Each CVSSymbol that tags a non-dead file has exactly one opening and either zero or one closing.  The closing, if it exists, always occurs in a later SVN revision than the opening.  See SymbolingsLogger for more details.

SortSymbolOpeningsClosingsPass (formerly called pass6)
==============================

This pass sorts SYMBOL_OPENINGS_CLOSINGS into SYMBOL_OPENINGS_CLOSINGS_SORTED.  This orders the file first by symbol ID, and second by Subversion revision number, thus grouping all openings and closings for each symbolic name together.

IndexSymbolsPass (formerly called pass7)
================

This pass iterates through all the lines in SYMBOL_OPENINGS_CLOSINGS_SORTED, writing out a pickle file (SYMBOL_OFFSETS_DB) mapping SYMBOL_ID to the file offset in SYMBOL_OPENINGS_CLOSINGS_SORTED where SYMBOL_ID is first encountered.  This will allow us to seek to the various offsets in the file and sequentially read only the openings and closings that we need.

OutputPass (formerly called pass8)
==========

This pass opens the svn-commits database and sequentially plays out all the commits to either a Subversion repository or to a dumpfile.  It also decides what sources to use to fill symbols.

In --dumpfile mode, the result of this pass is a Subversion repository dumpfile (suitable for input to 'svnadmin load').  The dumpfile is the data's last static stage: last chance to check over the data, run it through svndumpfilter, move the dumpfile to another machine, etc.

When not in --dumpfile mode, no full dumpfile is created.  Instead, miniature dumpfiles, each representing a single revision, are created, loaded into the repository, and then removed.

In both modes, the dumpfile revisions are created by walking through the SVN_COMMITS_* database.

The database in MIRROR_NODES_STORE and MIRROR_NODES_INDEX_TABLE holds a skeletal mirror of the repository structure at each SVN revision.  This mirror keeps track of which files existed on each LOD, but does not record any file contents.  cvs2svn requires this information to decide which paths to copy when filling branches and tags.

When .cvsignore files are modified, cvs2svn computes the corresponding svn:ignore properties and applies the properties to the parent directory.  The .cvsignore files themselves are not included in the output unless the --keep-cvsignore option was specified.  But in either case, the .cvsignore files are recorded within the repository mirror as if they were being written to disk, to ensure that the containing directory is not pruned if the directory in CVS still contained a .cvsignore file.

===============================
Branches and Tags Plan.
===============================

This pass is also where tag and branch creation is done.
Since subversion does tags and branches by copying from existing revisions (then maybe editing the copy, making subcopies underneath, etc), the big question for cvs2svn is how to achieve the minimum number of operations per creation. For example, if it's possible to get the right tag by just copying revision 53, then it's better to do that than, say, copying revision 51 and then sub-copying in bits of revision 52 and 53. Tags are created as soon as cvs2svn encounters the last CVS Revision that is a source for that tag. The whole tag is created in one Subversion commit. Branches are created as soon as all of their prerequisites are in place. If a branch creation had to be broken up due to dependency cycles, then non-final parts are also created as soon as their prerequisites are ready. In such a case, the SymbolChangeset specifies how much of the branch can be created in each step. How just-in-time branch creation works: In order to make the "best" set of copies/deletes when creating a branch, cvs2svn keeps track of two sets of trees while it's making commits: 1. A skeleton mirror of the subversion repository, that is, a record of which file existed on which LOD for each SVN revision. 2. A tree for each CVS symbolic name, and the svn file/directory revisions from which various parts of that tree could be copied. Each LOD is recorded as a tree using the following schema: unique keys map to marshal.dumps() representations of dictionaries, which in turn map path component names to other unique keys: root_key ==> { entryname1 : entrykey1, entryname2 : entrykey2, ... } entrykey1 ==> { entrynameX : entrykeyX, ... } entrykey2 ==> { entrynameY : entrykeyY, ... } entrykeyX ==> { etc, etc ...} entrykeyY ==> { etc, etc ...} (The leaf nodes -- files -- are represented by None.) The repository mirror allows cvs2svn to remember what paths exist in what revisions. For details on how branches and tags are created, please see the docstring the SymbolingsLogger class (and its methods). cvs2svn-2.4.0/doc/revision-reader.txt0000664000076500007650000001156011434364605020620 0ustar mhaggermhagger00000000000000This file contains a description of the RevisionCollector / RevisionReader mechanism. cvs2svn now includes hooks to make it possible to avoid having to invoke CVS or RCS zillions of times in OutputPass (which is otherwise the most expensive part of the conversion). Here is a brief description of how the hooks work. Most conversions [1] require an instance of RevisionReader, whose responsibility is to produce the text contents of CVS revisions on demand during OutputPass. The RevisionReader can read the CVS revision contents directly out of the RCS files during OutputPass. But additional hooks support the construction of different kinds of RevisionReader that record the CVS file revisions' contents during FilterSymbolsPass then output the contents during OutputPass. The interface that is used during FilterSymbolsPass to allow the collection of revision information is: RevisionCollector -- can collect information during FilterSymbolsPass to help the RevisionReader produce RCS file revision contents during OutputPass. 
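How the two halves cooperate is spelled out in the rest of this file; as a rough preview, a matched collector/reader pair can be as small as the following sketch.  The class names and the record() helper are hypothetical -- only the revision_reader_token attribute and the get_content() entry point come from the actual interfaces described below:

  import hashlib

  class StashingCollector(object):
    """FilterSymbolsPass side: remember each revision's text under a key."""

    def __init__(self):
      # In-memory for the sketch; a real collector would use disk storage.
      self.store = {}

    def record(self, cvs_item, text):
      # Stand-in for whatever the collector's callbacks do with each
      # revision's text as the RCS files are parsed.
      key = hashlib.sha1(text).hexdigest()
      self.store[key] = text
      cvs_item.revision_reader_token = key   # carried along by cvs2svn

  class StashingReader(object):
    """OutputPass side: hand the remembered text back on demand."""

    def __init__(self, store):
      self.store = store

    def get_content(self, cvs_rev):
      return self.store[cvs_rev.revision_reader_token]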
The type of RevisionCollector/RevisionReader to be used for a run of cvs2svn can be set using --use-internal-co, --use-rcs, or --use-cvs, or via the --options file with lines like: ctx.revision_collector = MyRevisionCollector() ctx.revision_reader = MyRevisionReader() The following RevisionCollectors are supplied with cvs2svn: NullRevisionCollector -- does nothing (for RevisionReaders that don't need anything to happen in FilterSymbolsPass). InternalRevisionCollector -- records the delta text and dependencies for required revisions in FilterSymbolsPass, for use with the InternalRevisionReader. GitRevisionCollector -- uses another RevisionReader to reconstruct the revisions' fulltext during FilterSymbolsPass, then writes the fulltexts to a blobfile in git-fast-import format. This file, combined with the dumpfile created in OutputPass, can be loaded into git. ExternalBlobGenerator -- uses an external Python program to reconstruct the revision fulltexts in FilterSymbolsPass and write them to a blobfile in git-fast-import format. This option is very fast because (1) it uses code similar to that used by InternalRevisionCollector/InternalRevisionReader, and (2) it processes all revisions from a file at once, thereby avoiding a lot of disk seeking. The following RevisionReaders are supplied with cvs2svn: InternalRevisionReader -- reconstitutes the revisions' contents during OutputPass from the data recorded by InternalRevisionCollector. This is by far the fastest option for cvs2svn conversions, but it requires a substantial amount of temporary disk space for the duration of the conversion. RCSRevisionReader -- uses RCS's "co" command to extract the revision text during OutputPass. This is slower than InternalRevisionReader because "co" has to be executed very many times, but is better tested and does not require any temporary disk space. RCSRevisionReader does not use a RevisionCollector. CVSRevisionReader -- uses the "cvs" command to extract the revision text during OutputPass. This is even slower than RCSRevisionReader, but it can handle some CVS file quirks that stymy RCSRevisionReader (see the cvs2svn HTML documentation). CVSRevisionReader does not use a RevisionCollector. It is possible to write your own RevisionCollector and RevisionReader if you would like to do things differently. A RevisionCollector, with callback methods that are invoked as the CVS files are parsed, can be used to collect information during FilterSymbolsPass. Its process_file() method is allowed to set an arbitrary token (for example, a content hash) in CVSItem.revision_reader_token. This token is carried along by cvs2svn for use by the RevisionReader in OutputPass. Later, when OutputPass requires the file contents, it calls RevisionReader.get_content(), which is passed a CVSRevision instance and has to return the file revision's contents. The fancy RevisionReader could use the token to retrieve the pre-stored file contents without having to call CVS or RCS at all. [1] The exception is cvs2git conversions, which need a RevisionCollector but not a RevisionReader. The reason is that "git fast-import" allows file revision contents to be written as "blobs" in arbitrary order, to be hooked together later into proper changesets. This feature is very beneficial to the performance of cvs2git, because it allows all revisions of a single file to be generated at the same time (with good disk locality) rather than having to jump around from file to file getting single revisions in changeset order. 
Unfortunately, neither "bzr fast-import" nor "hg fastimport" support separate blobs. cvs2svn-2.4.0/doc/text-transformations.txt0000664000076500007650000000750411500107342021723 0ustar mhaggermhagger00000000000000This file collects information about how the various VCSs deal with keyword expansion, EOL handling, and file permissions. CVS/RCS ======= There are several keyword expansion modes that can be set on files. These options also affect whether EOLs are all converted to/from the local convention. The following keywords are recognized: Author, CVSHeader (CVS only), Date, Header, Id, Locker, Name, RCSfile, Revision, Source, State, and Log. ("Log" is a special case because its expansion is irreversible.) Keyword names are case sensitive. There is also a provision to define "Local keywords", which use a user-defined name but expand as one of the standard keywords. These modes are selected via a "-k" command-line option, so for example the "kv" option is selected using the option "-kkv". The available modes are: * kv, kvl: Expand keywords, or expand them including locker name. Also munge line endings. * k: Collapse keywords (e.g. "$Revision: 1.2 $ -> "$Revision$"), except for the $Log$ option, which is expanded. Also munge line endings. * o: Leave keywords (including the $Log$ keyword) in the form that they were checked in. Also munge line endings. * v: Generate only keyword values instead of the full keyword string; e.g., "$Revision$" -> "5.7". (This is mostly useful for export.) Also munge line endings. * b: Leave keywords in the form that they were checked in, inhibit munging of line endings between canonical LF to the local convention, and prevent merging of file differences. Whether or not a file is executable is determined from the executable bit of the corresponding RCS file in the repository. Please note that CVSNT handles file modes differently: it supports additional modes, and it allows the file mode to differ from one file revision to another. This is the main reason that cvs2svn doesn't work reliably with CVSNT repositories. Subversion ========== * svn:executable: If this property is set, the file is marked as executable upon checkout. * svn:mime-type: Used to decide whether line-based merging is safe, whether to show diff-based deltas between revisions, etc. * svn:keywords: List the keywords that will be expanded in the file. * svn:eol-style: Specifies how to manipulate EOL characters. If this property is set, then the file can be committed to Subversion using somewhat more flexible EOL conventions. In the Subversion repository the text is normalized to a particular EOL style, probably the "svnadmin dump" style listed below. On checkout or export, the LFs will be converted to the specified EOL style. Possible values: * LF, CRLF, CR: On commit: text can contain any mixture of EOL styles. svnadmin dump: file text contains the specified EOL format. svnadmin load: should presumably be consistent with the "svnadmin dump" format. * native: On commit: text can use any EOL style, but lines must be consistent. svnadmin dump: file text contains the canonical LF format. svnadmin load: should presumably be consistent with the "svnadmin dump" format. Git === * The executable status of a file is determined by a file mode attribute in the fast-import file. * Keywords: Not obvious what to do. There is support for expanding an $Id$ keyword via the gitattributes mechanism. Others could only be supported via custom-written filters. * EOL style: Normally, git does not do any line-end conversion. 
However, there is a way to use gitattributes to mark particular files as text files, and to use the configuration settings of core.autocrlf and core.safecrlf to affect conversions between LF and CRLF formats. * The "diff" gitattribute can be used to tell how to generate diffs for a file (otherwise a heuristic is used). This attribute can be used to force a file to be treated as text/binary, or tell what "diff driver" to use. cvs2svn-2.4.0/doc/making-releases.txt0000664000076500007650000000643611233672717020602 0ustar mhaggermhagger00000000000000Making releases =============== Pre-release (repeat as appropriate): A. Backport changes if appropriate. B. Update CHANGES. C. Run the testsuite, check everything is OK. D. Trial-run ./dist.sh, check the output is sane. E. Run "www/validate.sh www/*.html" to make sure that the documentation HTML is valid. Creating the release: 1. If this is an A.B.0 release, make a branch: svn copy http://cvs2svn.tigris.org/svn/cvs2svn/trunk \ http://cvs2svn.tigris.org/svn/cvs2svn/branches/A.B.x and then increment the -dev VERSION in cvs2svn_lib/version.py on trunk. 2. Set the release number and date in CHANGES on trunk. 3. Switch to a branch working copy. 4. Merge CHANGES to the release branch. 5. Make a trial distribution and see that the unit tests run: ./dist.sh tar -xzf cvs2svn-A.B.C-dev.tar.gz cd cvs2svn-A.B.C-dev ./run-tests.py cd .. rm -rf cvs2svn-A.B.C-dev 6. Set VERSION in cvs2svn_lib/version.py and then run: svn copy . http://cvs2svn.tigris.org/svn/cvs2svn/tags/A.B.C 7. Increment the -dev VERSION in cvs2svn_lib/version.py on the A.B.x branch. 8. Switch to the tag. 9. Run: ./dist.sh 10. Create a detached signature for the tar file: gpg --detach-sign -a cvs2svn-A.B.C.tar.gz Publishing the release: 1. Upload tarball and signature to website download area (log in, go to "Downloads", then "Releases" folder, then "Add new file"). 2. Move old releases into the 'Old' folder of the download area (click on the "Edit" link next to the files, then change Status -> "Archival" and Folder -> "Old"). 3. Create a project announcement on the website. See example template below. 4. Send an announcement to announce@cvs2svn.tigris.org. (users@cvs2svn.tigris.org is subscribed to announce, so there is no need to send to both lists.) See example template below. 5. Update the topic on #cvs2svn. Publishing the release to PyPI (http://pypi.python.org): 1. Unpack the release tarball. 2. Run python setup.py register Within PyPI, cvs2svn appears here: http://pypi.python.org/pypi/cvs2svn Release announcement templates ============================== Here are suggested release announcement templates. Fill in the substitutions as appropriate, and refer to previous announcements for examples. Web: [[[ cvs2svn VERSION is now released.
The MD5 checksum is CHECKSUM
For more information see CHANGES.
Download: cvs2svn-VERSION.tar.gz. ]]] Email: [[[ Subject: cvs2svn VERSION released To: announce@cvs2svn.tigris.org, git@vger.kernel.org Reply-to: users@cvs2svn.tigris.org cvs2svn VERSION is now released. BRIEF_SUMMARY_OF_VERSION_HIGHLIGHTS For more information see: http://cvs2svn.tigris.org/source/browse/cvs2svn/tags/VERSION/CHANGES?view=markup You can get it here: http://cvs2svn.tigris.org/files/documents/1462/NNNNN/cvs2svn-VERSION.tar.gz The MD5 checksum is CHECKSUM. Please send any bug reports and comments to users@cvs2svn.tigris.org. YOUR_NAME, on behalf of the cvs2svn development team. ]]] cvs2svn-2.4.0/cvs2svn_lib/0000775000076500007650000000000012027373500016434 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/cvs2svn_lib/abstract_rcs_revision_manager.py0000664000076500007650000000660111710517256025101 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2010 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Base class for RCSRevisionReader and CVSRevisionReader.""" from cvs2svn_lib.common import canonicalize_eol from cvs2svn_lib.common import FatalError from cvs2svn_lib.process import get_command_output from cvs2svn_lib.context import Ctx from cvs2svn_lib.revision_manager import RevisionReader from cvs2svn_lib.keyword_expander import expand_keywords from cvs2svn_lib.keyword_expander import collapse_keywords from cvs2svn_lib.apple_single_filter import get_maybe_apple_single class AbstractRCSRevisionReader(RevisionReader): """A base class for RCSRevisionReader and CVSRevisionReader.""" # A map from (eol_fix, keyword_handling) to ('-k' option needed for # RCS/CVS, explicit_keyword_handling). The preference is to allow # CVS/RCS to handle keyword expansion itself whenever possible. But # it is not possible in combination with eol_fix==False, because the # only option that CVS/RCS has that leaves the EOLs alone is '-kb' # mode, which leaves the keywords untouched. Therefore, whenever # eol_fix is False, we need to use '-kb' mode and then (if # necessary) expand or collapse the keywords ourselves. _text_options = { (False, 'collapsed') : (['-kb'], 'collapsed'), (False, 'expanded') : (['-kb'], 'expanded'), (False, 'untouched') : (['-kb'], None), (True, 'collapsed') : (['-kk'], None), (True, 'expanded') : (['-kkv'], None), (True, 'untouched') : (['-ko'], None), } def get_pipe_command(self, cvs_rev, k_option): """Return the command that is needed to get the contents for CVS_REV. K_OPTION is a list containing the '-k' option that is needed, if any.""" raise NotImplementedError() def get_content(self, cvs_rev): # Is EOL fixing requested? eol_fix = cvs_rev.get_property('_eol_fix') or None # How do we want keywords to be handled? 
keyword_handling = cvs_rev.get_property('_keyword_handling') or None try: (k_option, explicit_keyword_handling) = self._text_options[ bool(eol_fix), keyword_handling ] except KeyError: raise FatalError( 'Undefined _keyword_handling property (%r) for %s' % (keyword_handling, cvs_rev,) ) data = get_command_output(self.get_pipe_command(cvs_rev, k_option)) if Ctx().decode_apple_single: # Insert a filter to decode any files that are in AppleSingle # format: data = get_maybe_apple_single(data) if explicit_keyword_handling == 'expanded': data = expand_keywords(data, cvs_rev) elif explicit_keyword_handling == 'collapsed': data = collapse_keywords(data) if eol_fix: data = canonicalize_eol(data, eol_fix) return data cvs2svn-2.4.0/cvs2svn_lib/changeset_database.py0000664000076500007650000000464611710517256022613 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes to store changesets.""" from cvs2svn_lib.changeset import Changeset from cvs2svn_lib.changeset import RevisionChangeset from cvs2svn_lib.changeset import OrderedChangeset from cvs2svn_lib.changeset import SymbolChangeset from cvs2svn_lib.changeset import BranchChangeset from cvs2svn_lib.changeset import TagChangeset from cvs2svn_lib.record_table import UnsignedIntegerPacker from cvs2svn_lib.record_table import MmapRecordTable from cvs2svn_lib.record_table import RecordTable from cvs2svn_lib.indexed_database import IndexedStore from cvs2svn_lib.serializer import PrimedPickleSerializer # Should the CVSItemToChangesetTable database files be memory mapped? # This speeds up the converstion but can cause the computer's virtual # address space to be exhausted. This option can be changed # externally, affecting any CVSItemToChangesetTables opened subsequent # to the change: use_mmap_for_cvs_item_to_changeset_table = False def CVSItemToChangesetTable(filename, mode): if use_mmap_for_cvs_item_to_changeset_table: return MmapRecordTable(filename, mode, UnsignedIntegerPacker()) else: return RecordTable(filename, mode, UnsignedIntegerPacker()) class ChangesetDatabase(IndexedStore): def __init__(self, filename, index_filename, mode): primer = ( Changeset, RevisionChangeset, OrderedChangeset, SymbolChangeset, BranchChangeset, TagChangeset, ) IndexedStore.__init__( self, filename, index_filename, mode, PrimedPickleSerializer(primer)) def store(self, changeset): self.add(changeset) def keys(self): return list(self.iterkeys()) def close(self): IndexedStore.close(self) cvs2svn-2.4.0/cvs2svn_lib/changeset.py0000664000076500007650000002014211244046724020753 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. 
# # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Manage change sets.""" from cvs2svn_lib.common import InternalError from cvs2svn_lib.context import Ctx from cvs2svn_lib.symbol import Branch from cvs2svn_lib.symbol import Tag from cvs2svn_lib.time_range import TimeRange from cvs2svn_lib.changeset_graph_node import ChangesetGraphNode class Changeset(object): """A set of cvs_items that might potentially form a single change set.""" def __init__(self, id, cvs_item_ids): self.id = id self.cvs_item_ids = list(cvs_item_ids) def iter_cvs_items(self): """Yield the CVSItems within this Changeset.""" for (id, cvs_item) in Ctx()._cvs_items_db.get_many(self.cvs_item_ids): assert cvs_item is not None yield cvs_item def get_projects_opened(self): """Return the set of projects that might be opened by this changeset.""" raise NotImplementedError() def create_graph_node(self, cvs_item_to_changeset_id): """Return a ChangesetGraphNode for this Changeset.""" raise NotImplementedError() def create_split_changeset(self, id, cvs_item_ids): """Return a Changeset with the specified contents. This method is only implemented for changesets that can be split. The type of the new changeset should be the same as that of SELF, and any other information from SELF should also be copied to the new changeset.""" raise NotImplementedError() def __getstate__(self): return (self.id, self.cvs_item_ids,) def __setstate__(self, state): (self.id, self.cvs_item_ids,) = state def __cmp__(self, other): raise NotImplementedError() def __str__(self): raise NotImplementedError() def __repr__(self): return '%s [%s]' % ( self, ', '.join(['%x' % id for id in self.cvs_item_ids]),) class RevisionChangeset(Changeset): """A Changeset consisting of CVSRevisions.""" _sort_order = 3 def create_graph_node(self, cvs_item_to_changeset_id): time_range = TimeRange() pred_ids = set() succ_ids = set() for cvs_item in self.iter_cvs_items(): time_range.add(cvs_item.timestamp) for pred_id in cvs_item.get_pred_ids(): changeset_id = cvs_item_to_changeset_id.get(pred_id) if changeset_id is not None: pred_ids.add(changeset_id) for succ_id in cvs_item.get_succ_ids(): changeset_id = cvs_item_to_changeset_id.get(succ_id) if changeset_id is not None: succ_ids.add(changeset_id) return ChangesetGraphNode(self, time_range, pred_ids, succ_ids) def create_split_changeset(self, id, cvs_item_ids): return RevisionChangeset(id, cvs_item_ids) def __cmp__(self, other): return cmp(self._sort_order, other._sort_order) \ or cmp(self.id, other.id) def __str__(self): return 'RevisionChangeset<%x>' % (self.id,) class OrderedChangeset(Changeset): """A Changeset of CVSRevisions whose preliminary order is known. The first changeset ordering involves only RevisionChangesets, and results in a full ordering of RevisionChangesets (i.e., a linear chain of dependencies with the order consistent with the dependencies). 
These OrderedChangesets form the skeleton for the full topological sort that includes SymbolChangesets as well.""" _sort_order = 2 def __init__(self, id, cvs_item_ids, ordinal, prev_id, next_id): Changeset.__init__(self, id, cvs_item_ids) # The order of this changeset among all OrderedChangesets: self.ordinal = ordinal # The changeset id of the previous OrderedChangeset, or None if # this is the first OrderedChangeset: self.prev_id = prev_id # The changeset id of the next OrderedChangeset, or None if this # is the last OrderedChangeset: self.next_id = next_id def get_projects_opened(self): retval = set() for cvs_item in self.iter_cvs_items(): retval.add(cvs_item.cvs_file.project) return retval def create_graph_node(self, cvs_item_to_changeset_id): time_range = TimeRange() pred_ids = set() succ_ids = set() if self.prev_id is not None: pred_ids.add(self.prev_id) if self.next_id is not None: succ_ids.add(self.next_id) for cvs_item in self.iter_cvs_items(): time_range.add(cvs_item.timestamp) for pred_id in cvs_item.get_symbol_pred_ids(): changeset_id = cvs_item_to_changeset_id.get(pred_id) if changeset_id is not None: pred_ids.add(changeset_id) for succ_id in cvs_item.get_symbol_succ_ids(): changeset_id = cvs_item_to_changeset_id.get(succ_id) if changeset_id is not None: succ_ids.add(changeset_id) return ChangesetGraphNode(self, time_range, pred_ids, succ_ids) def __getstate__(self): return ( Changeset.__getstate__(self), self.ordinal, self.prev_id, self.next_id,) def __setstate__(self, state): (changeset_state, self.ordinal, self.prev_id, self.next_id,) = state Changeset.__setstate__(self, changeset_state) def __cmp__(self, other): return cmp(self._sort_order, other._sort_order) \ or cmp(self.id, other.id) def __str__(self): return 'OrderedChangeset<%x(%d)>' % (self.id, self.ordinal,) class SymbolChangeset(Changeset): """A Changeset consisting of CVSSymbols.""" def __init__(self, id, symbol, cvs_item_ids): Changeset.__init__(self, id, cvs_item_ids) self.symbol = symbol def get_projects_opened(self): # A SymbolChangeset can never open a project. 
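    # (Contrast OrderedChangeset.get_projects_opened() above, which
    # collects the project of each of its CVSRevisions' CVSFiles; a
    # changeset consisting only of CVSSymbols contributes none of its own.)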
return set() def create_graph_node(self, cvs_item_to_changeset_id): pred_ids = set() succ_ids = set() for cvs_item in self.iter_cvs_items(): for pred_id in cvs_item.get_pred_ids(): changeset_id = cvs_item_to_changeset_id.get(pred_id) if changeset_id is not None: pred_ids.add(changeset_id) for succ_id in cvs_item.get_succ_ids(): changeset_id = cvs_item_to_changeset_id.get(succ_id) if changeset_id is not None: succ_ids.add(changeset_id) return ChangesetGraphNode(self, TimeRange(), pred_ids, succ_ids) def __cmp__(self, other): return cmp(self._sort_order, other._sort_order) \ or cmp(self.symbol, other.symbol) \ or cmp(self.id, other.id) def __getstate__(self): return (Changeset.__getstate__(self), self.symbol.id,) def __setstate__(self, state): (changeset_state, symbol_id) = state Changeset.__setstate__(self, changeset_state) self.symbol = Ctx()._symbol_db.get_symbol(symbol_id) class BranchChangeset(SymbolChangeset): """A Changeset consisting of CVSBranches.""" _sort_order = 1 def create_split_changeset(self, id, cvs_item_ids): return BranchChangeset(id, self.symbol, cvs_item_ids) def __str__(self): return 'BranchChangeset<%x>("%s")' % (self.id, self.symbol,) class TagChangeset(SymbolChangeset): """A Changeset consisting of CVSTags.""" _sort_order = 0 def create_split_changeset(self, id, cvs_item_ids): return TagChangeset(id, self.symbol, cvs_item_ids) def __str__(self): return 'TagChangeset<%x>("%s")' % (self.id, self.symbol,) def create_symbol_changeset(id, symbol, cvs_item_ids): """Factory function for SymbolChangesets. Return a BranchChangeset or TagChangeset, depending on the type of SYMBOL. SYMBOL must be a Branch or Tag.""" if isinstance(symbol, Branch): return BranchChangeset(id, symbol, cvs_item_ids) if isinstance(symbol, Tag): return TagChangeset(id, symbol, cvs_item_ids) else: raise InternalError('Unknown symbol type %s' % (symbol,)) cvs2svn-2.4.0/cvs2svn_lib/stats_keeper.py0000664000076500007650000001325111244045370021502 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains the StatsKeeper class. A StatsKeeper can pickle itself to a STATISTICS_FILE. 
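A typical round trip, shown as an illustrative sketch only (the filename
below is made up; the real name is derived from config.STATISTICS_FILE):

    stats = StatsKeeper()
    stats.record_cvs_file(cvs_file)        # ...accumulate statistics...
    stats.archive('statistics-1.pck')      # pickle to disk
    stats = read_stats_keeper('statistics-1.pck')
    print stats                            # formatted summary via __str__()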
This module also includes a function to read a StatsKeeper from a STATISTICS_FILE.""" import time import cPickle from cStringIO import StringIO from cvs2svn_lib.cvs_item import CVSRevision from cvs2svn_lib.cvs_item import CVSBranch from cvs2svn_lib.cvs_item import CVSTag class StatsKeeper: def __init__(self): self._svn_rev_count = None self._first_rev_date = 1L<<32 self._last_rev_date = 0 self._pass_timings = { } self._stats_reflect_exclude = False self.reset_cvs_rev_info() def log_duration_for_pass(self, duration, pass_num, pass_name): self._pass_timings[pass_num] = (pass_name, duration,) def set_stats_reflect_exclude(self, value): self._stats_reflect_exclude = value def reset_cvs_rev_info(self): self._repos_file_count = 0 self._repos_size = 0 self._cvs_revs_count = 0 self._cvs_branches_count = 0 self._cvs_tags_count = 0 # A set of tag_ids seen: self._tag_ids = set() # A set of branch_ids seen: self._branch_ids = set() def record_cvs_file(self, cvs_file): self._repos_file_count += 1 self._repos_size += cvs_file.file_size def _record_cvs_rev(self, cvs_rev): self._cvs_revs_count += 1 if cvs_rev.timestamp < self._first_rev_date: self._first_rev_date = cvs_rev.timestamp if cvs_rev.timestamp > self._last_rev_date: self._last_rev_date = cvs_rev.timestamp def _record_cvs_branch(self, cvs_branch): self._cvs_branches_count += 1 self._branch_ids.add(cvs_branch.symbol.id) def _record_cvs_tag(self, cvs_tag): self._cvs_tags_count += 1 self._tag_ids.add(cvs_tag.symbol.id) def record_cvs_item(self, cvs_item): if isinstance(cvs_item, CVSRevision): self._record_cvs_rev(cvs_item) elif isinstance(cvs_item, CVSBranch): self._record_cvs_branch(cvs_item) elif isinstance(cvs_item, CVSTag): self._record_cvs_tag(cvs_item) else: raise RuntimeError('Unknown CVSItem type') def set_svn_rev_count(self, count): self._svn_rev_count = count def svn_rev_count(self): return self._svn_rev_count def __getstate__(self): state = self.__dict__.copy() # This can get kinda large, so we don't store it: return state def archive(self, filename): f = open(filename, 'wb') cPickle.dump(self, f) f.close() def __str__(self): f = StringIO() f.write('\n') f.write('cvs2svn Statistics:\n') f.write('------------------\n') f.write('Total CVS Files: %10i\n' % (self._repos_file_count,)) f.write('Total CVS Revisions: %10i\n' % (self._cvs_revs_count,)) f.write('Total CVS Branches: %10i\n' % (self._cvs_branches_count,)) f.write('Total CVS Tags: %10i\n' % (self._cvs_tags_count,)) f.write('Total Unique Tags: %10i\n' % (len(self._tag_ids),)) f.write('Total Unique Branches: %10i\n' % (len(self._branch_ids),)) f.write('CVS Repos Size in KB: %10i\n' % ((self._repos_size / 1024),)) if self._svn_rev_count is not None: f.write('Total SVN Commits: %10i\n' % self._svn_rev_count) f.write( 'First Revision Date: %s\n' % (time.ctime(self._first_rev_date),) ) f.write( 'Last Revision Date: %s\n' % (time.ctime(self._last_rev_date),) ) f.write('------------------') if not self._stats_reflect_exclude: f.write( '\n' '(These are unaltered CVS repository stats and do not\n' ' reflect tags or branches excluded via --exclude)\n' ) return f.getvalue() @staticmethod def _get_timing_format(value): # Output times with up to 3 decimal places: decimals = max(0, 4 - len('%d' % int(value))) length = len(('%%.%df' % decimals) % value) return '%%%d.%df' % (length, decimals,) def single_pass_timing(self, pass_num): (pass_name, duration,) = self._pass_timings[pass_num] format = self._get_timing_format(duration) time_string = format % (duration,) return ( 'Time for pass%d (%s): %s 
seconds.' % (pass_num, pass_name, time_string,) ) def timings(self): passes = self._pass_timings.keys() passes.sort() f = StringIO() f.write('Timings (seconds):\n') f.write('------------------\n') total = 0.0 for pass_num in passes: (pass_name, duration,) = self._pass_timings[pass_num] total += duration format = self._get_timing_format(total) for pass_num in passes: (pass_name, duration,) = self._pass_timings[pass_num] f.write( (format + ' pass%-2d %s\n') % (duration, pass_num, pass_name,) ) f.write((format + ' total') % total) return f.getvalue() def read_stats_keeper(filename): """Factory function: Return a _StatsKeeper instance. Read the instance from FILENAME as written by StatsKeeper.archive().""" f = open(filename, 'rb') retval = cPickle.load(f) f.close() return retval cvs2svn-2.4.0/cvs2svn_lib/changeset_graph_node.py0000664000076500007650000000331111244046316023135 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """A node in the changeset dependency graph.""" class ChangesetGraphNode(object): """A node in the changeset dependency graph.""" __slots__ = ['id', 'time_range', 'pred_ids', 'succ_ids'] def __init__(self, changeset, time_range, pred_ids, succ_ids): # The id of the ChangesetGraphNode is the same as the id of the # changeset. self.id = changeset.id # The range of times of CVSItems within this Changeset. self.time_range = time_range # The set of changeset ids of changesets that are direct # predecessors of this one. self.pred_ids = pred_ids # The set of changeset ids of changesets that are direct # successors of this one. self.succ_ids = succ_ids def __repr__(self): """For convenience only. The format is subject to change at any time.""" return '%x; pred=[%s]; succ=[%s]' % ( self.id, ','.join(['%x' % id for id in self.pred_ids]), ','.join(['%x' % id for id in self.succ_ids]), ) cvs2svn-2.4.0/cvs2svn_lib/key_generator.py0000664000076500007650000000257311244046461021656 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
# ==================================================================== """This module contains the KeyGenerator class.""" class KeyGenerator: """Generate a series of unique keys.""" def __init__(self, first_id=1): """Initialize a KeyGenerator with the specified FIRST_ID. FIRST_ID should be an int or long, and the generated keys will be of the same type.""" self._key_base = first_id self._last_id = None def gen_id(self): """Generate and return a previously-unused key, as an integer.""" self._last_id = self._key_base self._key_base += 1 return self._last_id def get_last_id(self): """Return the last id that was generated, as an integer.""" return self._last_id cvs2svn-2.4.0/cvs2svn_lib/cvs_item_database.py0000664000076500007650000001477311710517256022465 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains a database that can store arbitrary CVSItems.""" import re import cPickle from cvs2svn_lib.cvs_item import CVSRevisionAdd from cvs2svn_lib.cvs_item import CVSRevisionChange from cvs2svn_lib.cvs_item import CVSRevisionDelete from cvs2svn_lib.cvs_item import CVSRevisionNoop from cvs2svn_lib.cvs_item import CVSBranch from cvs2svn_lib.cvs_item import CVSBranchNoop from cvs2svn_lib.cvs_item import CVSTag from cvs2svn_lib.cvs_item import CVSTagNoop from cvs2svn_lib.cvs_file_items import CVSFileItems from cvs2svn_lib.serializer import Serializer from cvs2svn_lib.serializer import PrimedPickleSerializer from cvs2svn_lib.indexed_database import IndexedStore cvs_item_primer = ( CVSRevisionAdd, CVSRevisionChange, CVSRevisionDelete, CVSRevisionNoop, CVSBranch, CVSBranchNoop, CVSTag, CVSTagNoop, ) class NewCVSItemStore: """A file of sequential CVSItems, grouped by CVSFile. The file consists of a sequence of pickles. The zeroth one is a Serializer as described in the serializer module. Subsequent ones are pickled lists of CVSItems, each list containing all of the CVSItems for a single file. We don't use a single pickler for all items because the memo would grow too large.""" def __init__(self, filename): """Initialize an instance, creating the file and writing the primer.""" self.f = open(filename, 'wb') self.serializer = PrimedPickleSerializer( cvs_item_primer + (CVSFileItems,) ) cPickle.dump(self.serializer, self.f, -1) def add(self, cvs_file_items): """Write CVS_FILE_ITEMS into the database.""" self.serializer.dumpf(self.f, cvs_file_items) def close(self): self.f.close() self.f = None class OldCVSItemStore: """Read a file created by NewCVSItemStore. The file must be read sequentially, one CVSFileItems instance at a time.""" def __init__(self, filename): self.f = open(filename, 'rb') # Read the memo from the first pickle: self.serializer = cPickle.load(self.f) def iter_cvs_file_items(self): """Iterate through the CVSFileItems instances, one file at a time. 
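    A write/read round trip, as an illustrative sketch only (the filename
    is made up):

        store = NewCVSItemStore('cvs-items.pck')
        store.add(cvs_file_items)   # one CVSFileItems instance per CVSFile
        store.close()

        old_store = OldCVSItemStore('cvs-items.pck')
        for cvs_file_items in old_store.iter_cvs_file_items():
          ...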
Each time yield a CVSFileItems instance for one CVSFile.""" try: while True: yield self.serializer.loadf(self.f) except EOFError: return def close(self): self.f.close() self.f = None class LinewiseSerializer(Serializer): """A serializer that writes exactly one line for each object. The actual serialization is done by a wrapped serializer; this class only escapes any newlines in the serialized data then appends a single newline.""" def __init__(self, wrapee): self.wrapee = wrapee @staticmethod def _encode_newlines(s): """Return s with newlines and backslashes encoded. The string is returned with the following character transformations: LF -> \n CR -> \r ^Z -> \z (needed for Windows) \ -> \\ """ return s.replace('\\', '\\\\') \ .replace('\n', '\\n') \ .replace('\r', '\\r') \ .replace('\x1a', '\\z') _escape_re = re.compile(r'(\\\\|\\n|\\r|\\z)') _subst = {'\\n' : '\n', '\\r' : '\r', '\\z' : '\x1a', '\\\\' : '\\'} @staticmethod def _decode_newlines(s): """Return s with newlines and backslashes decoded. This function reverses the encoding of _encode_newlines(). """ def repl(m): return LinewiseSerializer._subst[m.group(1)] return LinewiseSerializer._escape_re.sub(repl, s) def dumpf(self, f, object): f.write(self.dumps(object)) def dumps(self, object): return self._encode_newlines(self.wrapee.dumps(object)) + '\n' def loadf(self, f): return self.loads(f.readline()) def loads(self, s): return self.wrapee.loads(self._decode_newlines(s[:-1])) class NewSortableCVSRevisionDatabase(object): """A serially-accessible, sortable file for holding CVSRevisions. This class creates such files.""" def __init__(self, filename, serializer): self.f = open(filename, 'w') self.serializer = LinewiseSerializer(serializer) def add(self, cvs_rev): self.f.write( '%x %08x %s' % ( cvs_rev.metadata_id, cvs_rev.timestamp, self.serializer.dumps(cvs_rev), ) ) def close(self): self.f.close() self.f = None class OldSortableCVSRevisionDatabase(object): """A serially-accessible, sortable file for holding CVSRevisions. This class reads such files.""" def __init__(self, filename, serializer): self.filename = filename self.serializer = LinewiseSerializer(serializer) def __iter__(self): f = open(self.filename, 'r') for l in f: s = l.split(' ', 2)[-1] yield self.serializer.loads(s) f.close() def close(self): pass class NewSortableCVSSymbolDatabase(object): """A serially-accessible, sortable file for holding CVSSymbols. This class creates such files.""" def __init__(self, filename, serializer): self.f = open(filename, 'w') self.serializer = LinewiseSerializer(serializer) def add(self, cvs_symbol): self.f.write( '%x %s' % (cvs_symbol.symbol.id, self.serializer.dumps(cvs_symbol)) ) def close(self): self.f.close() self.f = None class OldSortableCVSSymbolDatabase(object): """A serially-accessible, sortable file for holding CVSSymbols. This class reads such files.""" def __init__(self, filename, serializer): self.filename = filename self.serializer = LinewiseSerializer(serializer) def __iter__(self): f = open(self.filename, 'r') for l in f: s = l.split(' ', 1)[-1] yield self.serializer.loads(s) f.close() def close(self): pass def IndexedCVSItemStore(filename, index_filename, mode): return IndexedStore( filename, index_filename, mode, PrimedPickleSerializer(cvs_item_primer) ) cvs2svn-2.4.0/cvs2svn_lib/context.py0000664000076500007650000000663511710517256020512 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. 
All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Store the context (options, etc) for a cvs2svn run.""" import os import textwrap from cvs2svn_lib import config from cvs2svn_lib.common import CVSTextDecoder class Ctx: """Session state for this run of cvs2svn. For example, run-time options are stored here. This class is a Borg (see http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66531).""" __shared_state = { } def __init__(self): self.__dict__ = self.__shared_state if self.__dict__: return # Else, initialize to defaults. self.set_defaults() def set_defaults(self): """Set all parameters to their default values.""" self.output_option = None self.dry_run = False self.revision_collector = None self.revision_reader = None self.svnadmin_executable = config.SVNADMIN_EXECUTABLE self.trunk_only = False self.include_empty_directories = False self.prune = True self.cvs_author_decoder = CVSTextDecoder(['ascii']) self.cvs_log_decoder = CVSTextDecoder(['ascii'], eol_fix='\n') self.cvs_filename_decoder = CVSTextDecoder(['ascii']) self.decode_apple_single = False self.symbol_info_filename = None self.username = None self.file_property_setters = [] self.revision_property_setters = [] self.tmpdir = 'cvs2svn-tmp' self.skip_cleanup = False self.keep_cvsignore = False self.cross_project_commits = True self.cross_branch_commits = True self.retain_conflicting_attic_files = False # textwrap.TextWrapper instance to be used for wrapping log messages: self.text_wrapper = textwrap.TextWrapper(width=76, break_long_words=False) self.initial_project_commit_message = ( 'Standard project directories initialized by cvs2svn.' ) self.post_commit_message = ( 'This commit was generated by cvs2svn to compensate for ' 'changes in r%(revnum)d, which included commits to RCS files ' 'with non-trunk default branches.' ) self.symbol_commit_message = ( "This commit was manufactured by cvs2svn to create %(symbol_type)s " "'%(symbol_name)s'." ) self.tie_tag_ancestry_message = ( "This commit was manufactured by cvs2svn to tie ancestry for " "tag '%(symbol_name)s' back to the source branch." ) def get_temp_filename(self, basename): return os.path.join(self.tmpdir, basename) def clean(self): """Dispose of items in our dictionary that are not intended to live past the end of a pass (identified by exactly one leading underscore).""" for attr in self.__dict__.keys(): if (attr.startswith('_') and not attr.startswith('__') and not attr.startswith('_Ctx__')): delattr(self, attr) cvs2svn-2.4.0/cvs2svn_lib/output_option.py0000664000076500007650000001016511710517256021747 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. 
# If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes that hold the cvs2svn output options.""" from cvs2svn_lib.common import IllegalSVNPathError class OutputOption: """Represents an output choice for a run of cvs2svn.""" # name of output format (for error messages), capitalized for use at # the start of a sentence. This class attribute must be set by # subclasses name = None def register_artifacts(self, which_pass): """Register artifacts that will be needed for this output option. WHICH_PASS is the pass that will call our callbacks, so it should be used to do the registering (e.g., call WHICH_PASS.register_temp_file() and/or WHICH_PASS.register_temp_file_needed()).""" pass def verify_filename_legal(self, filename): """Verify that FILENAME is a legal filename. FILENAME is a path component of a CVS path. Check that it won't choke the destination VCS: - Check that it is not empty. - Check that it is not equal to '.' or '..'. - Check that the filename does not include any characters that are illegal in the destination VCS. If any of these tests fail, raise an IllegalSVNPathError. This method should be overridden as needed by derived classes.""" if filename == '': raise IllegalSVNPathError("Empty filename component.") if filename in ['.', '..']: raise IllegalSVNPathError("Illegal filename component %r." % (filename,)) def check(self): """Check that the options stored in SELF are sensible. This might including the existence of a repository on disk, etc.""" raise NotImplementedError() def check_symbols(self, symbol_map): """Check that the symbols in SYMBOL_MAP are OK for this output option. SYMBOL_MAP is a map {AbstractSymbol : (Trunk|TypedSymbol)}, indicating how each symbol is planned to be converted. 
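    As an illustrative sketch only (the symbol names are made up and the
    values are shown schematically, not as real constructor calls), a plan
    might map one CVS symbol to a branch and another to a tag:

        {<AbstractSymbol 'devel'>       : <Branch 'devel'>,
         <AbstractSymbol 'RELEASE_1_0'> : <Tag 'RELEASE_1_0'>}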
Raise a FatalError if the symbol plan is not acceptable for this output option.""" raise NotImplementedError() def setup(self, svn_rev_count): """Prepare this output option.""" raise NotImplementedError() def process_initial_project_commit(self, svn_commit): """Process SVN_COMMIT, which is an SVNInitialProjectCommit.""" raise NotImplementedError() def process_primary_commit(self, svn_commit): """Process SVN_COMMIT, which is an SVNPrimaryCommit.""" raise NotImplementedError() def process_post_commit(self, svn_commit): """Process SVN_COMMIT, which is an SVNPostCommit.""" raise NotImplementedError() def process_branch_commit(self, svn_commit): """Process SVN_COMMIT, which is an SVNBranchCommit.""" raise NotImplementedError() def process_tag_commit(self, svn_commit): """Process SVN_COMMIT, which is an SVNTagCommit.""" raise NotImplementedError() def cleanup(self): """Perform any required cleanup related to this output option.""" raise NotImplementedError() class NullOutputOption(OutputOption): """An OutputOption that doesn't do anything.""" name = 'null' def check(self): pass def check_symbols(self, symbol_map): pass def setup(self, svn_rev_count): pass def process_initial_project_commit(self, svn_commit): pass def process_primary_commit(self, svn_commit): pass def process_post_commit(self, svn_commit): pass def process_branch_commit(self, svn_commit): pass def process_tag_commit(self, svn_commit): pass def cleanup(self): pass cvs2svn-2.4.0/cvs2svn_lib/keyword_expander.py0000664000076500007650000000720412027257624022373 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007-2010 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Expand RCS/CVS keywords.""" import re import time from cvs2svn_lib.context import Ctx class _KeywordExpander: """A class whose instances provide substitutions for CVS keywords. This class is used via its __call__() method, which should be called with a match object representing a match for a CVS keyword string. The method returns the replacement for the matched text. The __call__() method works by calling the method with the same name as that of the CVS keyword (converted to lower case). Instances of this class can be passed as the REPL argument to re.sub().""" date_fmt_old = "%Y/%m/%d %H:%M:%S" # CVS 1.11, rcs date_fmt_new = "%Y-%m-%d %H:%M:%S" # CVS 1.12 date_fmt = date_fmt_new @classmethod def use_old_date_format(klass): """Class method to ensure exact compatibility with CVS 1.11 output. 
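    Calling this switches the date embedded in expanded $Date$, $Header$,
    and $Id$ keywords from the CVS 1.12 format to the older slash-separated
    format; illustratively, for one and the same (made-up) timestamp:

        _KeywordExpander.use_old_date_format()
        # CVS 1.12 default:   2003-05-01 12:34:56
        # CVS 1.11 / RCS:     2003/05/01 12:34:56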
Use this if you want to verify your conversion and you're using CVS 1.11.""" klass.date_fmt = klass.date_fmt_old def __init__(self, cvs_rev): self.cvs_rev = cvs_rev def __call__(self, match): return '$%s: %s $' % ( match.group(1), getattr(self, match.group(1).lower())(), ) def author(self): return Ctx()._metadata_db[self.cvs_rev.metadata_id].original_author def date(self): return time.strftime(self.date_fmt, time.gmtime(self.cvs_rev.timestamp)) def header(self): return '%s %s %s %s Exp' % ( self.source(), self.cvs_rev.rev, self.date(), self.author(), ) def id(self): return '%s %s %s %s Exp' % ( self.rcsfile(), self.cvs_rev.rev, self.date(), self.author(), ) def locker(self): # Handle kvl like kv, as a converted repo is supposed to have no # locks. return '' def log(self): # Would need some special handling. return 'not supported by cvs2svn' def name(self): # Cannot work, as just creating a new symbol does not check out # the revision again. return 'not supported by cvs2svn' def rcsfile(self): return self.cvs_rev.cvs_file.rcs_basename + ",v" def revision(self): return self.cvs_rev.rev def source(self): project = self.cvs_rev.cvs_file.project return '%s/%s%s' % ( project.cvs_repository_root, project.cvs_module, '/'.join(self.cvs_rev.cvs_file.get_path_components(rcs=True)), ) def state(self): # We check out only live revisions. return 'Exp' _kws = 'Author|Date|Header|Id|Locker|Log|Name|RCSfile|Revision|Source|State' _kw_re = re.compile(r'\$(' + _kws + r'):[^$\n]*\$') _kwo_re = re.compile(r'\$(' + _kws + r')(:[^$\n]*)?\$') def expand_keywords(text, cvs_rev): """Return TEXT with keywords expanded for CVS_REV. E.g., '$Author$' -> '$Author: jrandom $'.""" return _kwo_re.sub(_KeywordExpander(cvs_rev), text) def collapse_keywords(text): """Return TEXT with keywords collapsed. E.g., '$Author: jrandom $' -> '$Author$'.""" return _kw_re.sub(r'$\1$', text) cvs2svn-2.4.0/cvs2svn_lib/pass_manager.py0000664000076500007650000001641711500107341021450 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains tools to manage the passes of a conversion.""" import time import gc from cvs2svn_lib import config from cvs2svn_lib.common import FatalError from cvs2svn_lib.context import Ctx from cvs2svn_lib.log import logger from cvs2svn_lib.stats_keeper import StatsKeeper from cvs2svn_lib.stats_keeper import read_stats_keeper from cvs2svn_lib.artifact_manager import artifact_manager class InvalidPassError(FatalError): def __init__(self, msg): FatalError.__init__( self, msg + '\nUse --help-passes for more information.') def check_for_garbage(): # We've turned off the garbage collector because we shouldn't # need it (we don't create circular dependencies) and because it # is therefore a waste of time. 
So here we check for any # unreachable objects and generate a debug-level warning if any # occur: gc.set_debug(gc.DEBUG_SAVEALL) gc_count = gc.collect() if gc_count: if logger.is_on(logger.DEBUG): logger.debug( 'INTERNAL: %d unreachable object(s) were garbage collected:' % (gc_count,) ) for g in gc.garbage: logger.debug(' %s' % (g,)) del gc.garbage[:] class Pass(object): """Base class for one step of the conversion.""" def __init__(self): # By default, use the pass object's class name as the pass name: self.name = self.__class__.__name__ def register_artifacts(self): """Register artifacts (created and needed) in artifact_manager.""" raise NotImplementedError() def _register_temp_file(self, basename): """Helper method; for brevity only.""" artifact_manager.register_temp_file(basename, self) def _register_temp_file_needed(self, basename): """Helper method; for brevity only.""" artifact_manager.register_temp_file_needed(basename, self) def run(self, run_options, stats_keeper): """Carry out this step of the conversion. RUN_OPTIONS is an instance of RunOptions. STATS_KEEPER is an instance of StatsKeeper.""" raise NotImplementedError() class PassManager: """Manage a list of passes that can be executed separately or all at once. Passes are numbered starting with 1.""" def __init__(self, passes): """Construct a PassManager with the specified PASSES. Internally, passes are numbered starting with 1. So PASSES[0] is considered to be pass number 1.""" self.passes = passes self.num_passes = len(self.passes) def get_pass_number(self, pass_name, default=None): """Return the number of the pass indicated by PASS_NAME. PASS_NAME should be a string containing the name or number of a pass. If a number, it should be in the range 1 <= value <= self.num_passes. Return an integer in the same range. If PASS_NAME is the empty string and DEFAULT is specified, return DEFAULT. Raise InvalidPassError if PASS_NAME cannot be converted into a valid pass number.""" if not pass_name and default is not None: assert 1 <= default <= self.num_passes return default try: # Does pass_name look like an integer? pass_number = int(pass_name) if not 1 <= pass_number <= self.num_passes: raise InvalidPassError( 'illegal value (%d) for pass number. Must be 1 through %d or\n' 'the name of a known pass.' % (pass_number,self.num_passes,)) return pass_number except ValueError: # Is pass_name the name of one of the passes? for (i, the_pass) in enumerate(self.passes): if the_pass.name == pass_name: return i + 1 raise InvalidPassError('Unknown pass name (%r).' % (pass_name,)) def run(self, run_options): """Run the specified passes, one after another. RUN_OPTIONS will be passed to the Passes' run() methods. RUN_OPTIONS.start_pass is the number of the first pass that should be run. RUN_OPTIONS.end_pass is the number of the last pass that should be run. It must be that 1 <= RUN_OPTIONS.start_pass <= RUN_OPTIONS.end_pass <= self.num_passes.""" # Convert start_pass and end_pass into the indices of the passes # to execute, using the Python index range convention (i.e., first # pass executed and first pass *after* the ones that should be # executed). 
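    # For example (illustrative only): with run_options.start_pass = 3 and
    # run_options.end_pass = 5, index_start becomes 2 and index_end becomes
    # 5, so exactly self.passes[2:5] -- i.e. passes 3, 4, and 5 -- are run.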
index_start = run_options.start_pass - 1 index_end = run_options.end_pass # Inform the artifact manager when artifacts are created and used: for (i, the_pass) in enumerate(self.passes): the_pass.register_artifacts() # Each pass creates a new version of the statistics file: artifact_manager.register_temp_file( config.STATISTICS_FILE % (i + 1,), the_pass ) if i != 0: # Each pass subsequent to the first reads the statistics file # from the preceding pass: artifact_manager.register_temp_file_needed( config.STATISTICS_FILE % (i + 1 - 1,), the_pass ) # Tell the artifact manager about passes that are being skipped this run: for the_pass in self.passes[0:index_start]: artifact_manager.pass_skipped(the_pass) start_time = time.time() for i in range(index_start, index_end): the_pass = self.passes[i] logger.quiet('----- pass %d (%s) -----' % (i + 1, the_pass.name,)) artifact_manager.pass_started(the_pass) if i == 0: stats_keeper = StatsKeeper() else: stats_keeper = read_stats_keeper( artifact_manager.get_temp_file( config.STATISTICS_FILE % (i + 1 - 1,) ) ) the_pass.run(run_options, stats_keeper) end_time = time.time() stats_keeper.log_duration_for_pass( end_time - start_time, i + 1, the_pass.name ) logger.normal(stats_keeper.single_pass_timing(i + 1)) stats_keeper.archive( artifact_manager.get_temp_file(config.STATISTICS_FILE % (i + 1,)) ) start_time = end_time Ctx().clean() # Allow the artifact manager to clean up artifacts that are no # longer needed: artifact_manager.pass_done(the_pass, Ctx().skip_cleanup) check_for_garbage() # Tell the artifact manager about passes that are being deferred: for the_pass in self.passes[index_end:]: artifact_manager.pass_deferred(the_pass) logger.quiet(stats_keeper) logger.normal(stats_keeper.timings()) # Consistency check: artifact_manager.check_clean() def help_passes(self): """Output (to sys.stdout) the indices and names of available passes.""" print 'PASSES:' for (i, the_pass) in enumerate(self.passes): print '%5d : %s' % (i + 1, the_pass.name,) cvs2svn-2.4.0/cvs2svn_lib/rcs_revision_manager.py0000664000076500007650000000331411500107341023177 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
# ==================================================================== """Access the CVS repository via RCS's 'co' command.""" from cvs2svn_lib.common import FatalError from cvs2svn_lib.process import check_command_runs from cvs2svn_lib.process import CommandFailedException from cvs2svn_lib.abstract_rcs_revision_manager import AbstractRCSRevisionReader class RCSRevisionReader(AbstractRCSRevisionReader): """A RevisionReader that reads the contents via RCS.""" def __init__(self, co_executable): self.co_executable = co_executable try: check_command_runs([self.co_executable, '-V'], self.co_executable) except CommandFailedException, e: raise FatalError('%s\n' 'Please check that co is installed and in your PATH\n' '(it is a part of the RCS software).' % (e,)) def get_pipe_command(self, cvs_rev, k_option): return [ self.co_executable, '-q', '-x,v', '-p%s' % (cvs_rev.rev,) ] + k_option + [ cvs_rev.cvs_file.rcs_path ] cvs2svn-2.4.0/cvs2svn_lib/cvs_item.py0000664000076500007650000007026211434364604020634 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes to store atomic CVS events. A CVSItem is a single event, pertaining to a single file, that can be determined to have occured based on the information in the CVS repository. The inheritance tree is as follows: CVSItem | +--CVSRevision | | | +--CVSRevisionModification (* -> 'Exp') | | | | | +--CVSRevisionAdd ('dead' -> 'Exp') | | | | | +--CVSRevisionChange ('Exp' -> 'Exp') | | | +--CVSRevisionAbsent (* -> 'dead') | | | +--CVSRevisionDelete ('Exp' -> 'dead') | | | +--CVSRevisionNoop ('dead' -> 'dead') | +--CVSSymbol | +--CVSBranch | | | +--CVSBranchNoop | +--CVSTag | +--CVSTagNoop """ from cvs2svn_lib.context import Ctx class CVSItem(object): __slots__ = [ 'id', 'cvs_file', 'revision_reader_token', ] def __init__(self, id, cvs_file, revision_reader_token): self.id = id self.cvs_file = cvs_file self.revision_reader_token = revision_reader_token def __eq__(self, other): return self.id == other.id def __cmp__(self, other): return cmp(self.id, other.id) def __hash__(self): return self.id def __getstate__(self): raise NotImplementedError() def __setstate__(self, data): raise NotImplementedError() def get_svn_path(self): """Return the SVN path associated with this CVSItem.""" raise NotImplementedError() def get_pred_ids(self): """Return the CVSItem.ids of direct predecessors of SELF. A predecessor is defined to be a CVSItem that has to have been committed before this one.""" raise NotImplementedError() def get_succ_ids(self): """Return the CVSItem.ids of direct successors of SELF. A direct successor is defined to be a CVSItem that has this one as a direct predecessor.""" raise NotImplementedError() def get_cvs_symbol_ids_opened(self): """Return an iterable over the ids of CVSSymbols that this item opens. 
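    Illustratively (see the concrete subclasses later in this module), a
    live revision opens the tags and branches rooted at it, while a dead
    revision opens nothing:

        CVSRevisionModification  ->  self.tag_ids + self.branch_ids
        CVSRevisionAbsent        ->  []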
The definition of 'open' is that the path corresponding to this CVSItem will have to be copied when filling the corresponding symbol.""" raise NotImplementedError() def get_ids_closed(self): """Return an iterable over the CVSItem.ids of CVSItems closed by this one. A CVSItem A is said to close a CVSItem B if committing A causes B to be overwritten or deleted (no longer available) in the SVN repository. This is interesting because it sets the last SVN revision number from which the contents of B can be copied (for example, to fill a symbol). See the concrete implementations of this method for the exact rules about what closes what.""" raise NotImplementedError() def check_links(self, cvs_file_items): """Check for consistency of links to other CVSItems. Other items can be looked up in CVS_FILE_ITEMS, which is an instance of CVSFileItems. Raise an AssertionError if there is a problem.""" raise NotImplementedError() def __repr__(self): return '%s(%s)' % (self.__class__.__name__, self,) class CVSRevision(CVSItem): """Information about a single CVS revision. A CVSRevision holds the information known about a single version of a single file. Members: id -- (int) unique ID for this revision. cvs_file -- (CVSFile) CVSFile affected by this revision. timestamp -- (int) date stamp for this revision. metadata_id -- (int) id of metadata instance record in metadata_db. prev_id -- (int) id of the logically previous CVSRevision, either on the same or the source branch (or None). next_id -- (int) id of the logically next CVSRevision (or None). rev -- (string) the CVS revision number, e.g., '1.3'. deltatext_exists -- (bool) true iff this revision's deltatext is not empty. lod -- (LineOfDevelopment) LOD on which this revision occurred. first_on_branch_id -- (int or None) if this revision is the first on its branch, the cvs_branch_id of that branch; else, None. ntdbr -- (bool) true iff this is a non-trunk default branch revision. ntdbr_prev_id -- (int or None) Iff this is the 1.2 revision after the end of a default branch, the id of the last rev on the default branch; else, None. ntdbr_next_id -- (int or None) Iff this is the last revision on a default branch preceding a 1.2 rev, the id of the 1.2 revision; else, None. tag_ids -- (list of int) ids of all CVSTags rooted at this CVSRevision. branch_ids -- (list of int) ids of all CVSBranches rooted at this CVSRevision. branch_commit_ids -- (list of int) ids of first CVSRevision committed on each branch rooted in this revision (for branches with commits). opened_symbols -- (None or list of (symbol_id, cvs_symbol_id) tuples) information about all CVSSymbols opened by this revision. This member is set in FilterSymbolsPass; before then, it is None. closed_symbols -- (None or list of (symbol_id, cvs_symbol_id) tuples) information about all CVSSymbols closed by this revision. This member is set in FilterSymbolsPass; before then, it is None. properties -- (dict) the file properties that vary from revision to revision. The get_properties() method combines the properties found in SELF.cvs_file.properties with those here; the latter take precedence. Keys are strings. Values are strings (indicating the property value) or None (indicating that the property should be left unset, even if it is set in SELF.cvs_file.properties). Different backends can use properties for different purposes; for cvs2svn these become SVN versioned properties. Properties whose names start with underscore are reserved for internal cvs2svn purposes. 
properties_changed -- (None or bool) Will this CVSRevision's get_properties() method return a different value than the same call of the predecessor revision? revision_reader_token -- (arbitrary) a token that can be set by RevisionCollector for the later use of RevisionReader. """ __slots__ = [ 'timestamp', 'metadata_id', 'prev_id', 'next_id', 'rev', 'deltatext_exists', 'lod', 'first_on_branch_id', 'ntdbr', 'ntdbr_prev_id', 'ntdbr_next_id', 'tag_ids', 'branch_ids', 'branch_commit_ids', 'opened_symbols', 'closed_symbols', 'properties', 'properties_changed', ] def __init__( self, id, cvs_file, timestamp, metadata_id, prev_id, next_id, rev, deltatext_exists, lod, first_on_branch_id, ntdbr, ntdbr_prev_id, ntdbr_next_id, tag_ids, branch_ids, branch_commit_ids, revision_reader_token, ): """Initialize a new CVSRevision object.""" CVSItem.__init__(self, id, cvs_file, revision_reader_token) self.timestamp = timestamp self.metadata_id = metadata_id self.prev_id = prev_id self.next_id = next_id self.rev = rev self.deltatext_exists = deltatext_exists self.lod = lod self.first_on_branch_id = first_on_branch_id self.ntdbr = ntdbr self.ntdbr_prev_id = ntdbr_prev_id self.ntdbr_next_id = ntdbr_next_id self.tag_ids = tag_ids self.branch_ids = branch_ids self.branch_commit_ids = branch_commit_ids self.opened_symbols = None self.closed_symbols = None self.properties = None self.properties_changed = None def _get_cvs_path(self): return self.cvs_file.cvs_path cvs_path = property(_get_cvs_path) def get_svn_path(self): return self.lod.get_path(self.cvs_file.cvs_path) def __getstate__(self): """Return the contents of this instance, for pickling. The presence of this method improves the space efficiency of pickling CVSRevision instances.""" return ( self.id, self.cvs_file.id, self.timestamp, self.metadata_id, self.prev_id, self.next_id, self.rev, self.deltatext_exists, self.lod.id, self.first_on_branch_id, self.ntdbr, self.ntdbr_prev_id, self.ntdbr_next_id, self.tag_ids, self.branch_ids, self.branch_commit_ids, self.opened_symbols, self.closed_symbols, self.properties, self.properties_changed, self.revision_reader_token, ) def __setstate__(self, data): (self.id, cvs_file_id, self.timestamp, self.metadata_id, self.prev_id, self.next_id, self.rev, self.deltatext_exists, lod_id, self.first_on_branch_id, self.ntdbr, self.ntdbr_prev_id, self.ntdbr_next_id, self.tag_ids, self.branch_ids, self.branch_commit_ids, self.opened_symbols, self.closed_symbols, self.properties, self.properties_changed, self.revision_reader_token) = data self.cvs_file = Ctx()._cvs_path_db.get_path(cvs_file_id) self.lod = Ctx()._symbol_db.get_symbol(lod_id) def get_properties(self): """Return all of the properties needed for this CVSRevision. Combine SELF.cvs_file.properties and SELF.properties to get the final properties needed for this CVSRevision. (The properties in SELF have precedence.) Return the properties as a map {key : value}, where keys and values are both strings. (Entries with value == None are omitted.) Different backends can use properties for different purposes; for cvs2svn these become SVN versioned properties. Properties whose names start with underscore are reserved for internal cvs2svn purposes.""" properties = self.cvs_file.properties.copy() properties.update(self.properties) for (k,v) in properties.items(): if v is None: del properties[k] return properties def get_property(self, name, default=None): """Return a particular property for this CVSRevision. 
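    Illustratively (the property names below are only examples), if

        self.cvs_file.properties == {'svn:eol-style' : 'native',
                                     'svn:keywords'  : 'Id'}
        self.properties          == {'svn:keywords'  : None}

    then get_property('svn:eol-style') returns 'native' (inherited from the
    file), while get_property('svn:keywords') returns None because the
    revision-level value takes precedence over the file-level one.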
This is logically the same as SELF.get_properties().get(name, default) but implemented more efficiently.""" if name in self.properties: return self.properties[name] else: return self.cvs_file.properties.get(name, default) def get_effective_prev_id(self): """Return the ID of the effective predecessor of this item. This is the ID of the item that determines whether the object existed before this CVSRevision.""" if self.ntdbr_prev_id is not None: return self.ntdbr_prev_id else: return self.prev_id def get_symbol_pred_ids(self): """Return the pred_ids for symbol predecessors.""" retval = set() if self.first_on_branch_id is not None: retval.add(self.first_on_branch_id) return retval def get_pred_ids(self): retval = self.get_symbol_pred_ids() if self.prev_id is not None: retval.add(self.prev_id) if self.ntdbr_prev_id is not None: retval.add(self.ntdbr_prev_id) return retval def get_symbol_succ_ids(self): """Return the succ_ids for symbol successors.""" retval = set() for id in self.branch_ids + self.tag_ids: retval.add(id) return retval def get_succ_ids(self): retval = self.get_symbol_succ_ids() if self.next_id is not None: retval.add(self.next_id) if self.ntdbr_next_id is not None: retval.add(self.ntdbr_next_id) for id in self.branch_commit_ids: retval.add(id) return retval def get_ids_closed(self): # Special handling is needed in the case of non-trunk default # branches. The following cases have to be handled: # # Case 1: Revision 1.1 not deleted; revision 1.2 exists: # # 1.1 -----------------> 1.2 # \ ^ ^ / # \ | | / # 1.1.1.1 -> 1.1.1.2 # # * 1.1.1.1 closes 1.1 (because its post-commit overwrites 1.1 # on trunk) # # * 1.1.1.2 closes 1.1.1.1 # # * 1.2 doesn't close anything (the post-commit from 1.1.1.1 # already closed 1.1, and no symbols can sprout from the # post-commit of 1.1.1.2) # # Case 2: Revision 1.1 not deleted; revision 1.2 does not exist: # # 1.1 .................. # \ ^ ^ # \ | | # 1.1.1.1 -> 1.1.1.2 # # * 1.1.1.1 closes 1.1 (because its post-commit overwrites 1.1 # on trunk) # # * 1.1.1.2 closes 1.1.1.1 # # Case 3: Revision 1.1 deleted; revision 1.2 exists: # # ............... 1.2 # ^ ^ / # | | / # 1.1.1.1 -> 1.1.1.2 # # * 1.1.1.1 doesn't close anything # # * 1.1.1.2 closes 1.1.1.1 # # * 1.2 doesn't close anything (no symbols can sprout from the # post-commit of 1.1.1.2) # # Case 4: Revision 1.1 deleted; revision 1.2 doesn't exist: # # ............... # ^ ^ # | | # 1.1.1.1 -> 1.1.1.2 # # * 1.1.1.1 doesn't close anything # # * 1.1.1.2 closes 1.1.1.1 if self.first_on_branch_id is not None: # The first CVSRevision on a branch is considered to close the # branch: yield self.first_on_branch_id if self.ntdbr: # If the 1.1 revision was not deleted, the 1.1.1.1 revision is # considered to close it: yield self.prev_id elif self.ntdbr_prev_id is not None: # This is the special case of a 1.2 revision that follows a # non-trunk default branch. Either 1.1 was deleted or the first # default branch revision closed 1.1, so we don't have to close # 1.1. Technically, we close the revision on trunk that was # copied from the last non-trunk default branch revision in a # post-commit, but for now no symbols can sprout from that # revision so we ignore that one, too. pass elif self.prev_id is not None: # Since this CVSRevision is not the first on a branch, its # prev_id is on the same LOD and this item closes that one: yield self.prev_id def _get_branch_ids_recursively(self, cvs_file_items): """Return the set of all CVSBranches that sprout from this CVSRevision. 
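    Illustratively (the branch names are made up): if branch B1 sprouts
    from this revision and branch B2 has been re-parented so that it
    sprouts from B1, the returned set contains the CVSBranch objects for
    both B1 and B2.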
After parent adjustment in FilterSymbolsPass, it is possible for branches to sprout directly from a CVSRevision, or from those branches, etc. Return all branches that sprout from this CVSRevision, directly or indirectly.""" retval = set() branch_ids_to_process = list(self.branch_ids) while branch_ids_to_process: branch = cvs_file_items[branch_ids_to_process.pop()] retval.add(branch) branch_ids_to_process.extend(branch.branch_ids) return retval def check_links(self, cvs_file_items): assert self.cvs_file == cvs_file_items.cvs_file prev = cvs_file_items.get(self.prev_id) next = cvs_file_items.get(self.next_id) first_on_branch = cvs_file_items.get(self.first_on_branch_id) ntdbr_next = cvs_file_items.get(self.ntdbr_next_id) ntdbr_prev = cvs_file_items.get(self.ntdbr_prev_id) effective_prev = cvs_file_items.get(self.get_effective_prev_id()) if prev is None: # This is the first CVSRevision on trunk or a detached branch: assert self.id in cvs_file_items.root_ids elif first_on_branch is not None: # This is the first CVSRevision on an existing branch: assert isinstance(first_on_branch, CVSBranch) assert first_on_branch.symbol == self.lod assert first_on_branch.next_id == self.id cvs_revision_source = first_on_branch.get_cvs_revision_source( cvs_file_items ) assert cvs_revision_source.id == prev.id assert self.id in prev.branch_commit_ids else: # This revision follows another revision on the same LOD: assert prev.next_id == self.id assert prev.lod == self.lod if next is not None: assert next.prev_id == self.id assert next.lod == self.lod if ntdbr_next is not None: assert self.ntdbr assert ntdbr_next.ntdbr_prev_id == self.id if ntdbr_prev is not None: assert ntdbr_prev.ntdbr_next_id == self.id for tag_id in self.tag_ids: tag = cvs_file_items[tag_id] assert isinstance(tag, CVSTag) assert tag.source_id == self.id assert tag.source_lod == self.lod for branch_id in self.branch_ids: branch = cvs_file_items[branch_id] assert isinstance(branch, CVSBranch) assert branch.source_id == self.id assert branch.source_lod == self.lod branch_commit_ids = list(self.branch_commit_ids) for branch in self._get_branch_ids_recursively(cvs_file_items): assert isinstance(branch, CVSBranch) if branch.next_id is not None: assert branch.next_id in branch_commit_ids branch_commit_ids.remove(branch.next_id) assert not branch_commit_ids assert self.__class__ == cvs_revision_type_map[( isinstance(self, CVSRevisionModification), effective_prev is not None and isinstance(effective_prev, CVSRevisionModification), )] def __str__(self): """For convenience only. The format is subject to change at any time.""" return '%s:%s<%x>' % (self.cvs_file, self.rev, self.id,) class CVSRevisionModification(CVSRevision): """Base class for CVSRevisionAdd or CVSRevisionChange.""" __slots__ = [] def get_cvs_symbol_ids_opened(self): return self.tag_ids + self.branch_ids class CVSRevisionAdd(CVSRevisionModification): """A CVSRevision that creates a file that previously didn't exist. 
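    In the notation of the module docstring this is the 'dead' -> 'Exp'
    transition; cvs_revision_type_map below selects this class when the
    revision itself is live but its effective predecessor is absent or
    dead.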
The file might have never existed on this LOD, or it might have existed previously but been deleted by a CVSRevisionDelete.""" __slots__ = [] class CVSRevisionChange(CVSRevisionModification): """A CVSRevision that modifies a file that already existed on this LOD.""" __slots__ = [] class CVSRevisionAbsent(CVSRevision): """A CVSRevision for which the file is nonexistent on this LOD.""" __slots__ = [] def get_cvs_symbol_ids_opened(self): return [] class CVSRevisionDelete(CVSRevisionAbsent): """A CVSRevision that deletes a file that existed on this LOD.""" __slots__ = [] class CVSRevisionNoop(CVSRevisionAbsent): """A CVSRevision that doesn't do anything. The revision was 'dead' and the predecessor either didn't exist or was also 'dead'. These revisions can't necessarily be thrown away because (1) they impose ordering constraints on other items; (2) they might have a nontrivial log message that we don't want to throw away.""" __slots__ = [] # A map # # {(nondead(cvs_rev), nondead(prev_cvs_rev)) : cvs_revision_subtype} # # , where nondead() means that the cvs revision exists and is not # 'dead', and CVS_REVISION_SUBTYPE is the subtype of CVSRevision that # should be used for CVS_REV. cvs_revision_type_map = { (False, False) : CVSRevisionNoop, (False, True) : CVSRevisionDelete, (True, False) : CVSRevisionAdd, (True, True) : CVSRevisionChange, } class CVSSymbol(CVSItem): """Represent a symbol on a particular CVSFile. This is the base class for CVSBranch and CVSTag. Members: id -- (int) unique ID for this item. cvs_file -- (CVSFile) CVSFile affected by this item. symbol -- (Symbol) the symbol affected by this CVSSymbol. source_lod -- (LineOfDevelopment) the LOD that is the source for this CVSSymbol. source_id -- (int) the ID of the CVSRevision or CVSBranch that is the source for this item. This initially points to a CVSRevision, but can be changed to a CVSBranch via parent adjustment in FilterSymbolsPass. revision_reader_token -- (arbitrary) a token that can be set by RevisionCollector for the later use of RevisionReader. """ __slots__ = [ 'symbol', 'source_lod', 'source_id', ] def __init__( self, id, cvs_file, symbol, source_lod, source_id, revision_reader_token, ): """Initialize a CVSSymbol object.""" CVSItem.__init__(self, id, cvs_file, revision_reader_token) self.symbol = symbol self.source_lod = source_lod self.source_id = source_id def get_cvs_revision_source(self, cvs_file_items): """Return the CVSRevision that is the ultimate source of this symbol.""" cvs_source = cvs_file_items[self.source_id] while not isinstance(cvs_source, CVSRevision): cvs_source = cvs_file_items[cvs_source.source_id] return cvs_source def get_svn_path(self): return self.symbol.get_path(self.cvs_file.cvs_path) def get_ids_closed(self): # A Symbol does not close any other CVSItems: return [] class CVSBranch(CVSSymbol): """Represent the creation of a branch in a particular CVSFile. Members: id -- (int) unique ID for this item. cvs_file -- (CVSFile) CVSFile affected by this item. symbol -- (Symbol) the symbol affected by this CVSSymbol. branch_number -- (string) the number of this branch (e.g., '1.3.4'), or None if this is a converted CVSTag. source_lod -- (LineOfDevelopment) the LOD that is the source for this CVSSymbol. source_id -- (int) id of the CVSRevision or CVSBranch from which this branch sprouts. This initially points to a CVSRevision, but can be changed to a CVSBranch via parent adjustment in FilterSymbolsPass. next_id -- (int or None) id of first CVSRevision on this branch, if any; else, None. 
tag_ids -- (list of int) ids of all CVSTags rooted at this CVSBranch (can be set due to parent adjustment in FilterSymbolsPass). branch_ids -- (list of int) ids of all CVSBranches rooted at this CVSBranch (can be set due to parent adjustment in FilterSymbolsPass). opened_symbols -- (None or list of (symbol_id, cvs_symbol_id) tuples) information about all CVSSymbols opened by this branch. This member is set in FilterSymbolsPass; before then, it is None. revision_reader_token -- (arbitrary) a token that can be set by RevisionCollector for the later use of RevisionReader. """ __slots__ = [ 'branch_number', 'next_id', 'tag_ids', 'branch_ids', 'opened_symbols', ] def __init__( self, id, cvs_file, symbol, branch_number, source_lod, source_id, next_id, revision_reader_token, ): """Initialize a CVSBranch.""" CVSSymbol.__init__( self, id, cvs_file, symbol, source_lod, source_id, revision_reader_token, ) self.branch_number = branch_number self.next_id = next_id self.tag_ids = [] self.branch_ids = [] self.opened_symbols = None def __getstate__(self): return ( self.id, self.cvs_file.id, self.symbol.id, self.branch_number, self.source_lod.id, self.source_id, self.next_id, self.tag_ids, self.branch_ids, self.opened_symbols, self.revision_reader_token, ) def __setstate__(self, data): ( self.id, cvs_file_id, symbol_id, self.branch_number, source_lod_id, self.source_id, self.next_id, self.tag_ids, self.branch_ids, self.opened_symbols, self.revision_reader_token, ) = data self.cvs_file = Ctx()._cvs_path_db.get_path(cvs_file_id) self.symbol = Ctx()._symbol_db.get_symbol(symbol_id) self.source_lod = Ctx()._symbol_db.get_symbol(source_lod_id) def get_pred_ids(self): return set([self.source_id]) def get_succ_ids(self): retval = set(self.tag_ids + self.branch_ids) if self.next_id is not None: retval.add(self.next_id) return retval def get_cvs_symbol_ids_opened(self): return self.tag_ids + self.branch_ids def check_links(self, cvs_file_items): source = cvs_file_items.get(self.source_id) next = cvs_file_items.get(self.next_id) assert self.id in source.branch_ids if isinstance(source, CVSRevision): assert self.source_lod == source.lod elif isinstance(source, CVSBranch): assert self.source_lod == source.symbol else: assert False if next is not None: assert isinstance(next, CVSRevision) assert next.lod == self.symbol assert next.first_on_branch_id == self.id for tag_id in self.tag_ids: tag = cvs_file_items[tag_id] assert isinstance(tag, CVSTag) assert tag.source_id == self.id assert tag.source_lod == self.symbol for branch_id in self.branch_ids: branch = cvs_file_items[branch_id] assert isinstance(branch, CVSBranch) assert branch.source_id == self.id assert branch.source_lod == self.symbol def __str__(self): """For convenience only. The format is subject to change at any time.""" return '%s:%s:%s<%x>' \ % (self.cvs_file, self.symbol, self.branch_number, self.id,) class CVSBranchNoop(CVSBranch): """A CVSBranch whose source is a CVSRevisionAbsent.""" __slots__ = [] def get_cvs_symbol_ids_opened(self): return [] # A map # # {nondead(source_cvs_rev) : cvs_branch_subtype} # # , where nondead() means that the cvs revision exists and is not # 'dead', and CVS_BRANCH_SUBTYPE is the subtype of CVSBranch that # should be used. cvs_branch_type_map = { False : CVSBranchNoop, True : CVSBranch, } class CVSTag(CVSSymbol): """Represent the creation of a tag on a particular CVSFile. Members: id -- (int) unique ID for this item. cvs_file -- (CVSFile) CVSFile affected by this item. 
symbol -- (Symbol) the symbol affected by this CVSSymbol. source_lod -- (LineOfDevelopment) the LOD that is the source for this CVSSymbol. source_id -- (int) the ID of the CVSRevision or CVSBranch that is being tagged. This initially points to a CVSRevision, but can be changed to a CVSBranch via parent adjustment in FilterSymbolsPass. revision_reader_token -- (arbitrary) a token that can be set by RevisionCollector for the later use of RevisionReader. """ __slots__ = [] def __init__( self, id, cvs_file, symbol, source_lod, source_id, revision_reader_token, ): """Initialize a CVSTag.""" CVSSymbol.__init__( self, id, cvs_file, symbol, source_lod, source_id, revision_reader_token, ) def __getstate__(self): return ( self.id, self.cvs_file.id, self.symbol.id, self.source_lod.id, self.source_id, self.revision_reader_token, ) def __setstate__(self, data): ( self.id, cvs_file_id, symbol_id, source_lod_id, self.source_id, self.revision_reader_token, ) = data self.cvs_file = Ctx()._cvs_path_db.get_path(cvs_file_id) self.symbol = Ctx()._symbol_db.get_symbol(symbol_id) self.source_lod = Ctx()._symbol_db.get_symbol(source_lod_id) def get_pred_ids(self): return set([self.source_id]) def get_succ_ids(self): return set() def get_cvs_symbol_ids_opened(self): return [] def check_links(self, cvs_file_items): source = cvs_file_items.get(self.source_id) assert self.id in source.tag_ids if isinstance(source, CVSRevision): assert self.source_lod == source.lod elif isinstance(source, CVSBranch): assert self.source_lod == source.symbol else: assert False def __str__(self): """For convenience only. The format is subject to change at any time.""" return '%s:%s<%x>' \ % (self.cvs_file, self.symbol, self.id,) class CVSTagNoop(CVSTag): """A CVSTag whose source is a CVSRevisionAbsent.""" __slots__ = [] # A map # # {nondead(source_cvs_rev) : cvs_tag_subtype} # # , where nondead() means that the cvs revision exists and is not # 'dead', and CVS_TAG_SUBTYPE is the subtype of CVSTag that should be # used. cvs_tag_type_map = { False : CVSTagNoop, True : CVSTag, } cvs2svn-2.4.0/cvs2svn_lib/metadata.py0000664000076500007650000000162511043077510020571 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Represent CVSRevision metadata.""" class Metadata(object): def __init__(self, id, author, log_msg): self.id = id self.author = author self.log_msg = log_msg cvs2svn-2.4.0/cvs2svn_lib/svn_commit_creator.py0000664000076500007650000001753511500107341022707 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. 
The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains the SVNCommitCreator class.""" import time from cvs2svn_lib.common import InternalError from cvs2svn_lib.log import logger from cvs2svn_lib.context import Ctx from cvs2svn_lib.cvs_item import CVSRevisionNoop from cvs2svn_lib.cvs_item import CVSBranchNoop from cvs2svn_lib.cvs_item import CVSTagNoop from cvs2svn_lib.changeset import OrderedChangeset from cvs2svn_lib.changeset import BranchChangeset from cvs2svn_lib.changeset import TagChangeset from cvs2svn_lib.svn_commit import SVNInitialProjectCommit from cvs2svn_lib.svn_commit import SVNPrimaryCommit from cvs2svn_lib.svn_commit import SVNPostCommit from cvs2svn_lib.svn_commit import SVNBranchCommit from cvs2svn_lib.svn_commit import SVNTagCommit from cvs2svn_lib.key_generator import KeyGenerator class SVNCommitCreator: """This class creates and yields SVNCommits via process_changeset().""" def __init__(self): # The revision number to assign to the next new SVNCommit. self.revnum_generator = KeyGenerator() # A set containing the Projects that have already been # initialized: self._initialized_projects = set() def _post_commit(self, cvs_revs, motivating_revnum, timestamp): """Generate any SVNCommits needed to follow CVS_REVS. That is, handle non-trunk default branches. A revision on a CVS non-trunk default branch is visible in a default CVS checkout of HEAD. So we copy such commits over to Subversion's trunk so that checking out SVN trunk gives the same output as checking out of CVS's default branch.""" cvs_revs = [ cvs_rev for cvs_rev in cvs_revs if cvs_rev.ntdbr and not isinstance(cvs_rev, CVSRevisionNoop) ] if cvs_revs: cvs_revs.sort( lambda a, b: cmp(a.cvs_file.rcs_path, b.cvs_file.rcs_path) ) # Generate an SVNCommit for all of our default branch cvs_revs. yield SVNPostCommit( motivating_revnum, cvs_revs, timestamp, self.revnum_generator.gen_id(), ) def _process_revision_changeset(self, changeset, timestamp): """Process CHANGESET, using TIMESTAMP as the commit time. Create and yield one or more SVNCommits in the process. CHANGESET must be an OrderedChangeset. TIMESTAMP is used as the timestamp for any resulting SVNCommits.""" if not changeset.cvs_item_ids: logger.warn('Changeset has no items: %r' % changeset) return logger.verbose('-' * 60) logger.verbose('CVS Revision grouping:') logger.verbose(' Time: %s' % time.ctime(timestamp)) # Generate an SVNCommit unconditionally. Even if the only change in # this group of CVSRevisions is a deletion of an already-deleted # file (that is, a CVS revision in state 'dead' whose predecessor # was also in state 'dead'), the conversion will still generate a # Subversion revision containing the log message for the second dead # revision, because we don't want to lose that information. 
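# (Illustrative aside, not part of the original source: the choice of
# CVSRevision subtype comes from cvs_revision_type_map, defined earlier in
# cvs2svn_lib/cvs_item.py and keyed on (nondead(cvs_rev), nondead(prev_cvs_rev)):
#
#     cvs_revision_type_map[(False, False)] is CVSRevisionNoop
#     cvs_revision_type_map[(False, True)]  is CVSRevisionDelete
#     cvs_revision_type_map[(True,  False)] is CVSRevisionAdd
#     cvs_revision_type_map[(True,  True)]  is CVSRevisionChange
#
# so a changeset whose only item is a CVSRevisionNoop -- a 'dead' revision
# whose predecessor was also 'dead' -- still reaches this point and still
# produces a Subversion revision, purely so that its log message survives.)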
cvs_revs = list(changeset.iter_cvs_items()) if cvs_revs: cvs_revs.sort(lambda a, b: cmp(a.cvs_file.rcs_path, b.cvs_file.rcs_path)) svn_commit = SVNPrimaryCommit( cvs_revs, timestamp, self.revnum_generator.gen_id() ) yield svn_commit for cvs_rev in cvs_revs: Ctx()._symbolings_logger.log_revision(cvs_rev, svn_commit.revnum) # Generate an SVNPostCommit if we have default branch revs. If # some of the revisions in this commit happened on a non-trunk # default branch, then those files have to be copied into trunk # manually after being changed on the branch (because the RCS # "default branch" appears as head, i.e., trunk, in practice). # Unfortunately, Subversion doesn't support copies with sources # in the current txn. All copies must be based in committed # revisions. Therefore, we generate the copies in a new # revision. for svn_post_commit in self._post_commit( cvs_revs, svn_commit.revnum, timestamp ): yield svn_post_commit def _process_tag_changeset(self, changeset, timestamp): """Process TagChangeset CHANGESET, producing a SVNTagCommit. Filter out CVSTagNoops. If no CVSTags are left, don't generate a SVNTagCommit.""" if Ctx().trunk_only: raise InternalError( 'TagChangeset encountered during a --trunk-only conversion') cvs_tag_ids = [ cvs_tag.id for cvs_tag in changeset.iter_cvs_items() if not isinstance(cvs_tag, CVSTagNoop) ] if cvs_tag_ids: yield SVNTagCommit( changeset.symbol, cvs_tag_ids, timestamp, self.revnum_generator.gen_id(), ) else: logger.debug( 'Omitting %r because it contains only CVSTagNoops' % (changeset,) ) def _process_branch_changeset(self, changeset, timestamp): """Process BranchChangeset CHANGESET, producing a SVNBranchCommit. Filter out CVSBranchNoops. If no CVSBranches are left, don't generate a SVNBranchCommit.""" if Ctx().trunk_only: raise InternalError( 'BranchChangeset encountered during a --trunk-only conversion') cvs_branches = [ cvs_branch for cvs_branch in changeset.iter_cvs_items() if not isinstance(cvs_branch, CVSBranchNoop) ] if cvs_branches: svn_commit = SVNBranchCommit( changeset.symbol, [cvs_branch.id for cvs_branch in cvs_branches], timestamp, self.revnum_generator.gen_id(), ) yield svn_commit for cvs_branch in cvs_branches: Ctx()._symbolings_logger.log_branch_revision( cvs_branch, svn_commit.revnum ) else: logger.debug( 'Omitting %r because it contains only CVSBranchNoops' % (changeset,) ) def process_changeset(self, changeset, timestamp): """Process CHANGESET, using TIMESTAMP for all of its entries. Return a generator that generates the resulting SVNCommits. 
The changesets must be fed to this function in proper dependency order.""" # First create any new projects that might be opened by the # changeset: projects_opened = \ changeset.get_projects_opened() - self._initialized_projects if projects_opened: if Ctx().cross_project_commits: yield SVNInitialProjectCommit( timestamp, projects_opened, self.revnum_generator.gen_id() ) else: for project in projects_opened: yield SVNInitialProjectCommit( timestamp, [project], self.revnum_generator.gen_id() ) self._initialized_projects.update(projects_opened) if isinstance(changeset, OrderedChangeset): for svn_commit \ in self._process_revision_changeset(changeset, timestamp): yield svn_commit elif isinstance(changeset, TagChangeset): for svn_commit in self._process_tag_changeset(changeset, timestamp): yield svn_commit elif isinstance(changeset, BranchChangeset): for svn_commit in self._process_branch_changeset(changeset, timestamp): yield svn_commit else: raise TypeError('Illegal changeset %r' % changeset) cvs2svn-2.4.0/cvs2svn_lib/symbol_statistics.py0000664000076500007650000004361611710517256022605 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module gathers and processes statistics about lines of development.""" import cPickle from cvs2svn_lib import config from cvs2svn_lib.common import error_prefix from cvs2svn_lib.common import FatalException from cvs2svn_lib.log import logger from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.symbol import Trunk from cvs2svn_lib.symbol import IncludedSymbol from cvs2svn_lib.symbol import Branch from cvs2svn_lib.symbol import Tag from cvs2svn_lib.symbol import ExcludedSymbol class SymbolPlanError(FatalException): pass class SymbolPlanException(SymbolPlanError): def __init__(self, stats, symbol, msg): self.stats = stats self.symbol = symbol SymbolPlanError.__init__( self, 'Cannot convert the following symbol to %s: %s\n %s' % (symbol, msg, self.stats,) ) class IndeterminateSymbolException(SymbolPlanException): def __init__(self, stats, symbol): SymbolPlanException.__init__(self, stats, symbol, 'Indeterminate type') class _Stats: """A summary of information about a symbol (tag or branch). Members: lod -- the LineOfDevelopment instance of the lod being described tag_create_count -- the number of files in which this lod appears as a tag branch_create_count -- the number of files in which this lod appears as a branch branch_commit_count -- the number of files in which there were commits on this lod trivial_import_count -- the number of files in which this branch was purely a non-trunk default branch containing exactly one revision. pure_ntdb_count -- the number of files in which this branch was purely a non-trunk default branch (consisting only of non-trunk default branch revisions). 
branch_blockers -- a set of Symbol instances for any symbols that sprout from a branch with this name. possible_parents -- a map {LineOfDevelopment : count} indicating in how many files each LOD could have served as the parent of self.lod.""" def __init__(self, lod): self.lod = lod self.tag_create_count = 0 self.branch_create_count = 0 self.branch_commit_count = 0 self.branch_blockers = set() self.trivial_import_count = 0 self.pure_ntdb_count = 0 self.possible_parents = { } def register_tag_creation(self): """Register the creation of this lod as a tag.""" self.tag_create_count += 1 def register_branch_creation(self): """Register the creation of this lod as a branch.""" self.branch_create_count += 1 def register_branch_commit(self): """Register that there were commit(s) on this branch in one file.""" self.branch_commit_count += 1 def register_branch_blocker(self, blocker): """Register BLOCKER as preventing this symbol from being deleted. BLOCKER is a tag or a branch that springs from a revision on this symbol.""" self.branch_blockers.add(blocker) def register_trivial_import(self): """Register that this branch is a trivial import branch in one file.""" self.trivial_import_count += 1 def register_pure_ntdb(self): """Register that this branch is a pure import branch in one file.""" self.pure_ntdb_count += 1 def register_possible_parent(self, lod): """Register that LOD was a possible parent for SELF.lod in a file.""" self.possible_parents[lod] = self.possible_parents.get(lod, 0) + 1 def register_branch_possible_parents(self, cvs_branch, cvs_file_items): """Register any possible parents of this symbol from CVS_BRANCH.""" # This routine is a bottleneck. So we define some local variables # to speed up access to frequently-needed variables. register = self.register_possible_parent parent_cvs_rev = cvs_file_items[cvs_branch.source_id] # The "obvious" parent of a branch is the branch holding the # revision where the branch is rooted: register(parent_cvs_rev.lod) # If the parent revision is a non-trunk default (vendor) branch # revision, then count trunk as a possible parent. In particular, # the symbol could be grafted to the post-commit that copies the # vendor branch changes to trunk. On the other hand, our vendor # branch handling is currently too stupid to do so. On the other # other hand, when the vendor branch is being excluded from the # conversion, then the vendor branch revision will be moved to # trunk, again making trunk a possible parent--and *this* our code # can handle. In the end, considering trunk a possible parent can # never affect the correctness of the conversion, and on balance # seems to improve the selection of symbol parents. if parent_cvs_rev.ntdbr: register(cvs_file_items.trunk) # Any other branches that are rooted at the same revision and # were committed earlier than the branch are also possible # parents: symbol = cvs_branch.symbol for branch_id in parent_cvs_rev.branch_ids: parent_symbol = cvs_file_items[branch_id].symbol # A branch cannot be its own parent, nor can a branch's # parent be a branch that was created after it. So we stop # iterating when we reached the branch whose parents we are # collecting: if parent_symbol == symbol: break register(parent_symbol) def register_tag_possible_parents(self, cvs_tag, cvs_file_items): """Register any possible parents of this symbol from CVS_TAG.""" # This routine is a bottleneck. So use local variables to speed # up access to frequently-needed objects. 
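# (Illustrative aside, not part of the original source: possible_parents is
# simply a frequency table.  After many files have been registered it might
# look like, say,
#
#     stats.possible_parents == {trunk: 40, branch_a: 7}
#
# meaning that trunk could have served as this symbol's parent in 40 files.
# A later pass that picks the symbol's preferred parent can then favour the
# most frequent candidate, conceptually something like
#
#     best = max(stats.possible_parents.iteritems(),
#                key=lambda item: item[1])[0]
#
# The names 'trunk', 'branch_a' and 'best' above are hypothetical and only
# for illustration; the real selection logic lives elsewhere in cvs2svn.)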
register = self.register_possible_parent parent_cvs_rev = cvs_file_items[cvs_tag.source_id] # The "obvious" parent of a tag is the branch holding the # revision where the branch is rooted: register(parent_cvs_rev.lod) # If the parent revision is a non-trunk default (vendor) branch # revision, then count trunk as a possible parent. See the # comment by the analogous code in # register_branch_possible_parents() for more details. if parent_cvs_rev.ntdbr: register(cvs_file_items.trunk) # Branches that are rooted at the same revision are also # possible parents: for branch_id in parent_cvs_rev.branch_ids: parent_symbol = cvs_file_items[branch_id].symbol register(parent_symbol) def is_ghost(self): """Return True iff this lod never really existed.""" return ( not isinstance(self.lod, Trunk) and self.branch_commit_count == 0 and not self.branch_blockers and not self.possible_parents ) def check_valid(self, symbol): """Check whether SYMBOL is a valid conversion of SELF.lod. It is planned to convert SELF.lod as SYMBOL. Verify that SYMBOL is a TypedSymbol and that the information that it contains is consistent with that stored in SELF.lod. (This routine does not do higher-level tests of whether the chosen conversion is actually sensible.) If there are any problems, raise a SymbolPlanException.""" if not isinstance(symbol, (Trunk, Branch, Tag, ExcludedSymbol)): raise IndeterminateSymbolException(self, symbol) if symbol.id != self.lod.id: raise SymbolPlanException(self, symbol, 'IDs must match') if symbol.project != self.lod.project: raise SymbolPlanException(self, symbol, 'Projects must match') if isinstance(symbol, IncludedSymbol) and symbol.name != self.lod.name: raise SymbolPlanException(self, symbol, 'Names must match') def check_preferred_parent_allowed(self, symbol): """Check that SYMBOL's preferred_parent_id is an allowed parent. SYMBOL is the planned conversion of SELF.lod. Verify that its preferred_parent_id is a possible parent of SELF.lod. If not, raise a SymbolPlanException describing the problem.""" if isinstance(symbol, IncludedSymbol) \ and symbol.preferred_parent_id is not None: for pp in self.possible_parents.keys(): if pp.id == symbol.preferred_parent_id: return else: raise SymbolPlanException( self, symbol, 'The selected parent is not among the symbol\'s ' 'possible parents.' ) def __str__(self): return ( '\'%s\' is ' 'a tag in %d files, ' 'a branch in %d files, ' 'a trivial import in %d files, ' 'a pure import in %d files, ' 'and has commits in %d files' % (self.lod, self.tag_create_count, self.branch_create_count, self.trivial_import_count, self.pure_ntdb_count, self.branch_commit_count) ) def __repr__(self): retval = ['%s\n possible parents:\n' % (self,)] parent_counts = self.possible_parents.items() parent_counts.sort(lambda a,b: - cmp(a[1], b[1])) for (symbol, count) in parent_counts: if isinstance(symbol, Trunk): retval.append(' trunk : %d\n' % count) else: retval.append(' \'%s\' : %d\n' % (symbol.name, count)) if self.branch_blockers: blockers = list(self.branch_blockers) blockers.sort() retval.append(' blockers:\n') for blocker in blockers: retval.append(' \'%s\'\n' % (blocker,)) return ''.join(retval) class SymbolStatisticsCollector: """Collect statistics about lines of development. Record a summary of information about each line of development in the RCS files for later storage into a database. The database is created in CollectRevsPass and it is used in CollateSymbolsPass (via the SymbolStatistics class). 
collect_data._SymbolDataCollector inserts information into instances of this class by by calling its register_*() methods. Its main purpose is to assist in the decisions about which symbols can be treated as branches and tags and which may be excluded. The data collected by this class can be written to the file config.SYMBOL_STATISTICS.""" def __init__(self): # A map { lod -> _Stats } for all lines of development: self._stats = { } def __getitem__(self, lod): """Return the _Stats record for line of development LOD. Create and register a new one if necessary.""" try: return self._stats[lod] except KeyError: stats = _Stats(lod) self._stats[lod] = stats return stats def register(self, cvs_file_items): """Register the statistics for each symbol in CVS_FILE_ITEMS.""" for lod_items in cvs_file_items.iter_lods(): if lod_items.lod is not None: branch_stats = self[lod_items.lod] branch_stats.register_branch_creation() if lod_items.cvs_revisions: branch_stats.register_branch_commit() if lod_items.is_trivial_import(): branch_stats.register_trivial_import() if lod_items.is_pure_ntdb(): branch_stats.register_pure_ntdb() for cvs_symbol in lod_items.iter_blockers(): branch_stats.register_branch_blocker(cvs_symbol.symbol) if lod_items.cvs_branch is not None: branch_stats.register_branch_possible_parents( lod_items.cvs_branch, cvs_file_items ) for cvs_tag in lod_items.cvs_tags: tag_stats = self[cvs_tag.symbol] tag_stats.register_tag_creation() tag_stats.register_tag_possible_parents(cvs_tag, cvs_file_items) def purge_ghost_symbols(self): """Purge any symbols that don't have any activity. Such ghost symbols can arise if a symbol was defined in an RCS file but pointed at a non-existent revision.""" for stats in self._stats.values(): if stats.is_ghost(): logger.warn('Deleting ghost symbol: %s' % (stats.lod,)) del self._stats[stats.lod] def close(self): """Store the stats database to the SYMBOL_STATISTICS file.""" f = open(artifact_manager.get_temp_file(config.SYMBOL_STATISTICS), 'wb') cPickle.dump(self._stats.values(), f, -1) f.close() self._stats = None class SymbolStatistics: """Read and handle line of development statistics. The statistics are read from a database created by SymbolStatisticsCollector. This class has methods to process the statistics information and help with decisions about: 1. What tags and branches should be processed/excluded 2. What tags should be forced to be branches and vice versa (this class maintains some statistics to help the user decide) 3. Are there inconsistencies? - A symbol that is sometimes a branch and sometimes a tag - A forced branch with commit(s) on it - A non-excluded branch depends on an excluded branch The data in this class is read from a pickle file.""" def __init__(self, filename): """Read the stats database from FILENAME.""" # A map { LineOfDevelopment -> _Stats } for all lines of # development: self._stats = { } # A map { LineOfDevelopment.id -> _Stats } for all lines of # development: self._stats_by_id = { } stats_list = cPickle.load(open(filename, 'rb')) for stats in stats_list: self._stats[stats.lod] = stats self._stats_by_id[stats.lod.id] = stats def __len__(self): return len(self._stats) def __getitem__(self, lod_id): return self._stats_by_id[lod_id] def get_stats(self, lod): """Return the _Stats object for LineOfDevelopment instance LOD. 
Raise KeyError if no such lod exists.""" return self._stats[lod] def __iter__(self): return self._stats.itervalues() def _check_blocked_excludes(self, symbol_map): """Check for any excluded LODs that are blocked by non-excluded symbols. If any are found, describe the problem to logger.error() and raise a FatalException.""" # A list of (lod,[blocker,...]) tuples for excludes that are # blocked by the specified non-excluded blockers: problems = [] for lod in symbol_map.itervalues(): if isinstance(lod, ExcludedSymbol): # Symbol is excluded; make sure that its blockers are also # excluded: lod_blockers = [] for blocker in self.get_stats(lod).branch_blockers: if isinstance(symbol_map.get(blocker, None), IncludedSymbol): lod_blockers.append(blocker) if lod_blockers: problems.append((lod, lod_blockers)) if problems: s = [] for (lod, lod_blockers) in problems: s.append( '%s: %s cannot be excluded because the following symbols ' 'depend on it:\n' % (error_prefix, lod,) ) for blocker in lod_blockers: s.append(' %s\n' % (blocker,)) s.append('\n') logger.error(''.join(s)) raise FatalException() def _check_invalid_tags(self, symbol_map): """Check for commits on any symbols that are to be converted as tags. SYMBOL_MAP is a map {AbstractSymbol : (Trunk|TypedSymbol)} indicating how each AbstractSymbol is to be converted. If there is a commit on a symbol, then it cannot be converted as a tag. If any tags with commits are found, output error messages describing the problems then raise a FatalException.""" logger.quiet("Checking for forced tags with commits...") invalid_tags = [ ] for symbol in symbol_map.itervalues(): if isinstance(symbol, Tag): stats = self.get_stats(symbol) if stats.branch_commit_count > 0: invalid_tags.append(symbol) if not invalid_tags: # No problems found: return s = [] s.append( '%s: The following branches cannot be forced to be tags ' 'because they have commits:\n' % (error_prefix,) ) for tag in invalid_tags: s.append(' %s\n' % (tag.name)) s.append('\n') logger.error(''.join(s)) raise FatalException() def check_consistency(self, symbol_map): """Check the plan for how to convert symbols for consistency. SYMBOL_MAP is a map {AbstractSymbol : (Trunk|TypedSymbol)} indicating how each AbstractSymbol is to be converted. If any problems are detected, describe the problem to logger.error() and raise a FatalException.""" # We want to do all of the consistency checks even if one of them # fails, so that the user gets as much feedback as possible. Set # this variable to True if any errors are found. 
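# (Illustrative aside, not part of the original source: for a tiny
# repository SYMBOL_MAP might look like
#
#     {sym_v1_0: Tag('v1_0'), sym_exp: Branch('exp'),
#      sym_old: ExcludedSymbol('old')}
#
# where the sym_* names are hypothetical AbstractSymbols.  The checks below
# would then flag, respectively: an included symbol whose planned preferred
# parent is not among its recorded possible parents; an ExcludedSymbol such
# as 'old' that still has an included symbol sprouting from it; and a
# symbol planned as a Tag even though commits were recorded on it.)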
error_found = False # Check that the planned preferred parents are OK for all # IncludedSymbols: for lod in symbol_map.itervalues(): if isinstance(lod, IncludedSymbol): stats = self.get_stats(lod) try: stats.check_preferred_parent_allowed(lod) except SymbolPlanException, e: logger.error('%s\n' % (e,)) error_found = True try: self._check_blocked_excludes(symbol_map) except FatalException: error_found = True try: self._check_invalid_tags(symbol_map) except FatalException: error_found = True if error_found: raise FatalException( 'Please fix the above errors and restart CollateSymbolsPass' ) def exclude_symbol(self, symbol): """SYMBOL has been excluded; remove it from our statistics.""" del self._stats[symbol] del self._stats_by_id[symbol.id] # Remove references to this symbol from other statistics objects: for stats in self._stats.itervalues(): stats.branch_blockers.discard(symbol) if symbol in stats.possible_parents: del stats.possible_parents[symbol] cvs2svn-2.4.0/cvs2svn_lib/cvs_path.py0000664000076500007650000003013412027257624020626 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Classes that represent files and directories within CVS repositories.""" import os from cvs2svn_lib.common import path_join from cvs2svn_lib.context import Ctx class CVSPath(object): """Represent a CVS file or directory. Members: id -- (int) unique ID for this CVSPath. At any moment, there is at most one CVSPath instance with a particular ID. (This means that object identity is the same as object equality, and objects can be used as map keys even though they don't have a __hash__() method). project -- (Project) the project containing this CVSPath. parent_directory -- (CVSDirectory or None) the CVSDirectory containing this CVSPath. rcs_basename -- (string) the base name of the filename path in the CVS repository corresponding to this CVSPath (but with ',v' removed for CVSFiles). The rcs_basename of the root directory of a project is ''. rcs_path -- (string) the filesystem path to this CVSPath in the CVS repository. This is in native format, and already normalised the way os.path.normpath() normalises paths. It starts with the repository path passed to run_options.add_project() in the options.py file. ordinal -- (int) the order that this instance should be sorted relative to other CVSPath instances. This member is set based on the ordering imposed by sort_key() by CVSPathDatabase after all CVSFiles have been processed. Comparisons of CVSPath using __cmp__() simply compare the ordinals. 
""" __slots__ = [ 'id', 'project', 'parent_directory', 'rcs_basename', 'ordinal', 'rcs_path', ] def __init__(self, id, project, parent_directory, rcs_basename): self.id = id self.project = project self.parent_directory = parent_directory self.rcs_basename = rcs_basename # The rcs_path used to be computed on demand, but it turned out to # be a hot path through the code in some cases. It's used by # SubtreeSymbolTransform and similar transforms, so it's called at # least: # # (num_files * num_symbols_per_file * num_subtree_symbol_transforms) # # times. On a large repository with several subtree symbol # transforms, that can exceed 100,000,000 calls. And # _calculate_rcs_path() is quite complex, so doing that every time # could add about 10 minutes to the cvs2svn runtime. # # So now we precalculate this and just return it. self.rcs_path = os.path.normpath(self._calculate_rcs_path()) def __getstate__(self): """This method must only be called after ordinal has been set.""" return ( self.id, self.project.id, self.parent_directory, self.rcs_basename, self.ordinal, ) def __setstate__(self, state): ( self.id, project_id, self.parent_directory, self.rcs_basename, self.ordinal, ) = state self.project = Ctx()._projects[project_id] self.rcs_path = os.path.normpath(self._calculate_rcs_path()) def get_ancestry(self): """Return a list of the CVSPaths leading from the root path to SELF. Return the CVSPaths in a list, starting with self.project.get_root_cvs_directory() and ending with self.""" ancestry = [] p = self while p is not None: ancestry.append(p) p = p.parent_directory ancestry.reverse() return ancestry def get_path_components(self, rcs=False): """Return the path components to this CVSPath. Return the components of this CVSPath's path, relative to the project's project_cvs_repos_path, as a list of strings. If rcs is True, return the components of the filesystem path to the RCS file corresponding to this CVSPath (i.e., including any 'Attic' component and trailing ',v'. If rcs is False, return the components of the logical CVS path name (i.e., including 'Attic' only if the file is to be left in an Attic directory in the SVN repository and without trailing ',v').""" raise NotImplementedError() def get_cvs_path(self): """Return the canonical path within the Project. The canonical path: - Uses forward slashes - Doesn't include ',v' for files - This doesn't include the 'Attic' segment of the path unless the file is to be left in an Attic directory in the SVN repository; i.e., if a filename exists in and out of Attic and the --retain-conflicting-attic-files option was specified. """ return path_join(*self.get_path_components(rcs=False)) cvs_path = property(get_cvs_path) def _calculate_rcs_path(self): """Return the filesystem path in the CVS repo corresponding to SELF.""" return os.path.join( self.project.project_cvs_repos_path, *self.get_path_components(rcs=True) ) def __eq__(a, b): """Compare two CVSPath instances for equality. This method is supplied to avoid using __cmp__() for comparing for equality.""" return a is b def sort_key(self): """Return the key that should be used for sorting CVSPath instances. 
This is a relatively expensive computation, so it is only used once, the the results are used to set the ordinal member.""" return ( # Sort first by project: self.project, # Then by directory components: self.get_path_components(rcs=False), ) def __cmp__(a, b): """This method must only be called after ordinal has been set.""" return cmp(a.ordinal, b.ordinal) class CVSDirectory(CVSPath): """Represent a CVS directory. Members: id -- (int or None) unique id for this file. If None, a new id is generated. project -- (Project) the project containing this file. parent_directory -- (CVSDirectory or None) the CVSDirectory containing this CVSDirectory. rcs_basename -- (string) the base name of the filename path in the CVS repository corresponding to this CVSDirectory. The rcs_basename of the root directory of a project is ''. ordinal -- (int) the order that this instance should be sorted relative to other CVSPath instances. See CVSPath.ordinal. empty_subdirectory_ids -- (list of int) a list of the ids of any direct subdirectories that are empty. (An empty directory is defined to be a directory that doesn't contain any RCS files or non-empty subdirectories. """ __slots__ = ['empty_subdirectory_ids'] def __init__(self, id, project, parent_directory, rcs_basename): """Initialize a new CVSDirectory object.""" CVSPath.__init__(self, id, project, parent_directory, rcs_basename) # This member is filled in by CollectData.close(): self.empty_subdirectory_ids = [] def get_path_components(self, rcs=False): components = [] p = self while p.parent_directory is not None: components.append(p.rcs_basename) p = p.parent_directory components.reverse() return components def __getstate__(self): return ( CVSPath.__getstate__(self), self.empty_subdirectory_ids, ) def __setstate__(self, state): ( cvs_path_state, self.empty_subdirectory_ids, ) = state CVSPath.__setstate__(self, cvs_path_state) def __str__(self): """For convenience only. The format is subject to change at any time.""" return self.cvs_path + '/' def __repr__(self): return 'CVSDirectory<%x>(%r)' % (self.id, str(self),) class CVSFile(CVSPath): """Represent a CVS file. Members: id -- (int) unique id for this file. project -- (Project) the project containing this file. parent_directory -- (CVSDirectory) the CVSDirectory containing this CVSFile. rcs_basename -- (string) the base name of the RCS file in the CVS repository corresponding to this CVSPath (but with the ',v' removed). ordinal -- (int) the order that this instance should be sorted relative to other CVSPath instances. See CVSPath.ordinal. _in_attic -- (bool) True if RCS file is in an Attic subdirectory that is not considered the parent directory. (If a file is in-and-out-of-attic and one copy is to be left in Attic after the conversion, then the Attic directory is that file's PARENT_DIRECTORY and _IN_ATTIC is False.) executable -- (bool) True iff RCS file has executable bit set. file_size -- (long) size of the RCS file in bytes. mode -- (string or None) 'kv', 'b', etc., as read from the CVS file. description -- (string or None) the file description as read from the RCS file. properties -- (dict) file properties that are preserved across this history of this file. Keys are strings; values are strings (indicating the property value) or None (indicating that the property should be left unset). These properties can be overridden by CVSRevision.properties. Different backends can use these properties for different purposes; for cvs2svn they become SVN versioned properties. 
Properties whose names start with underscore are reserved for internal cvs2svn purposes. PARENT_DIRECTORY might contain an 'Attic' component if it should be retained in the SVN repository; i.e., if the same filename exists out of Attic and the --retain-conflicting-attic-files option was specified. """ __slots__ = [ '_in_attic', 'executable', 'file_size', 'mode', 'description', 'properties', ] def __init__( self, id, project, parent_directory, rcs_basename, in_attic, executable, file_size, mode, description ): """Initialize a new CVSFile object.""" assert parent_directory is not None # This member is needed by _calculate_rcs_path(), which is # called by CVSPath.__init__(). So initialize it before calling # CVSPath.__init__(). self._in_attic = in_attic CVSPath.__init__(self, id, project, parent_directory, rcs_basename) self.executable = executable self.file_size = file_size self.mode = mode self.description = description self.properties = None def determine_file_properties(self, file_property_setters): """Determine the properties for this file from FILE_PROPERTY_SETTERS. This must only be called after SELF.mode and SELF.description have been set by CollectData.""" self.properties = {} for file_property_setter in file_property_setters: file_property_setter.set_properties(self) def get_path_components(self, rcs=False): components = self.parent_directory.get_path_components(rcs=rcs) if rcs: if self._in_attic: components.append('Attic') components.append(self.rcs_basename + ',v') else: components.append(self.rcs_basename) return components def __getstate__(self): return ( CVSPath.__getstate__(self), self._in_attic, self.executable, self.file_size, self.mode, self.description, self.properties, ) def __setstate__(self, state): ( cvs_path_state, self._in_attic, self.executable, self.file_size, self.mode, self.description, self.properties, ) = state CVSPath.__setstate__(self, cvs_path_state) def __str__(self): """For convenience only. The format is subject to change at any time.""" return self.cvs_path def __repr__(self): return 'CVSFile<%x>(%r)' % (self.id, str(self),) cvs2svn-2.4.0/cvs2svn_lib/openings_closings.py0000664000076500007650000002103711244045753022542 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes to keep track of symbol openings/closings.""" import cPickle from cvs2svn_lib import config from cvs2svn_lib.common import InternalError from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.svn_revision_range import SVNRevisionRange # Constants used in SYMBOL_OPENINGS_CLOSINGS OPENING = 'O' CLOSING = 'C' class SymbolingsLogger: """Manage the file that contains lines for symbol openings and closings. 
This data will later be used to determine valid SVNRevision ranges from which a file can be copied when creating a branch or tag in Subversion. Do this by finding 'Openings' and 'Closings' for each file copied onto a branch or tag. An 'Opening' is the beginning of the lifetime of the source (CVSRevision or CVSBranch) from which a given CVSSymbol sprouts. The 'Closing' is the SVN revision when the source is deleted or overwritten. For example, on file 'foo.c', branch BEE has branch number 1.2.2 and obviously sprouts from revision 1.2. Therefore, the SVN revision when 1.2 is committed is the opening for BEE on path 'foo.c', and the SVN revision when 1.3 is committed is the closing for BEE on path 'foo.c'. Note that there may be many revisions chronologically between 1.2 and 1.3, for example, revisions on branches of 'foo.c', perhaps even including on branch BEE itself. But 1.3 is the next revision *on the same line* as 1.2, that is why it is the closing revision for those symbolic names of which 1.2 is the opening. The reason for doing all this hullabaloo is (1) to determine what range of SVN revision numbers can be used as the source of a copy of a particular file onto a branch/tag, and (2) to minimize the number of copies and deletes per creation by choosing source SVN revision numbers that can be used for as many files as possible. For example, revisions 1.2 and 1.3 of foo.c might correspond to revisions 17 and 30 in Subversion. That means that when creating branch BEE, foo.c has to be copied from a Subversion revision number in the range 17 <= revnum < 30. Now if there were another file, 'bar.c', in the same directory, and 'bar.c's opening and closing for BEE correspond to revisions 24 and 39 in Subversion, then we can kill two birds with one stone by copying the whole directory from somewhere in the range 24 <= revnum < 30.""" def __init__(self): self.symbolings = open( artifact_manager.get_temp_file(config.SYMBOL_OPENINGS_CLOSINGS), 'w') def log_revision(self, cvs_rev, svn_revnum): """Log any openings and closings found in CVS_REV.""" for (symbol_id, cvs_symbol_id,) in cvs_rev.opened_symbols: self._log_opening(symbol_id, cvs_symbol_id, svn_revnum) for (symbol_id, cvs_symbol_id) in cvs_rev.closed_symbols: self._log_closing(symbol_id, cvs_symbol_id, svn_revnum) def log_branch_revision(self, cvs_branch, svn_revnum): """Log any openings and closings found in CVS_BRANCH.""" for (symbol_id, cvs_symbol_id,) in cvs_branch.opened_symbols: self._log_opening(symbol_id, cvs_symbol_id, svn_revnum) def _log(self, symbol_id, cvs_symbol_id, svn_revnum, type): """Log an opening or closing to self.symbolings. Write out a single line to the symbol_openings_closings file representing that SVN_REVNUM is either the opening or closing (TYPE) of CVS_SYMBOL_ID for SYMBOL_ID. TYPE should be one of the following constants: OPENING or CLOSING.""" self.symbolings.write( '%x %d %s %x\n' % (symbol_id, svn_revnum, type, cvs_symbol_id) ) def _log_opening(self, symbol_id, cvs_symbol_id, svn_revnum): """Log an opening to self.symbolings. See _log() for more information.""" self._log(symbol_id, cvs_symbol_id, svn_revnum, OPENING) def _log_closing(self, symbol_id, cvs_symbol_id, svn_revnum): """Log a closing to self.symbolings. See _log() for more information.""" self._log(symbol_id, cvs_symbol_id, svn_revnum, CLOSING) def close(self): self.symbolings.close() self.symbolings = None class SymbolingsReader: """Provides an interface to retrieve symbol openings and closings. 
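(Illustrative sketch, not from the original sources: the logger above
writes each opening or closing as a single line of the form

    '<symbol_id hex> <svn_revnum> <O|C> <cvs_symbol_id hex>'

so foo.c's opening and closing for BEE might appear as '1a 17 O 2f3' and
'1a 30 C 2f3'.  Given such ranges for every file, the usable copy-source
revisions are the intersection of the per-file ranges; with the
illustrative numbers from the example above:

    ranges = {'foo.c': (17, 30), 'bar.c': (24, 39)}  # (opening, closing)
    lower = max(lo for (lo, hi) in ranges.values())  # 24
    upper = min(hi for (lo, hi) in ranges.values())  # 30
    # any revnum with 24 <= revnum < 30 can serve as the copy source
    # for both files at once

The hex ids '1a' and '2f3' are made up for illustration.)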
This class accesses the SYMBOL_OPENINGS_CLOSINGS_SORTED file and the SYMBOL_OFFSETS_DB. Does the heavy lifting of finding and returning the correct opening and closing Subversion revision numbers for a given symbolic name and SVN revision number range.""" def __init__(self): """Opens the SYMBOL_OPENINGS_CLOSINGS_SORTED for reading, and reads the offsets database into memory.""" self.symbolings = open( artifact_manager.get_temp_file( config.SYMBOL_OPENINGS_CLOSINGS_SORTED), 'r') # The offsets_db is really small, and we need to read and write # from it a fair bit, so suck it into memory offsets_db = file( artifact_manager.get_temp_file(config.SYMBOL_OFFSETS_DB), 'rb') # A map from symbol_id to offset. The values of this map are # incremented as the openings and closings for a symbol are # consumed. self.offsets = cPickle.load(offsets_db) offsets_db.close() def close(self): self.symbolings.close() del self.symbolings del self.offsets def _generate_lines(self, symbol): """Generate the lines for SYMBOL. SYMBOL is a TypedSymbol instance. Yield the tuple (revnum, type, cvs_symbol_id) for all openings and closings for SYMBOL.""" if symbol.id in self.offsets: # Set our read offset for self.symbolings to the offset for this # symbol: self.symbolings.seek(self.offsets[symbol.id]) while True: line = self.symbolings.readline().rstrip() if not line: break (id, revnum, type, cvs_symbol_id) = line.split() id = int(id, 16) revnum = int(revnum) if id != symbol.id: break cvs_symbol_id = int(cvs_symbol_id, 16) yield (revnum, type, cvs_symbol_id) def get_range_map(self, svn_symbol_commit): """Return the ranges of all CVSSymbols in SVN_SYMBOL_COMMIT. Return a map { CVSSymbol : SVNRevisionRange }.""" # A map { cvs_symbol_id : CVSSymbol }: cvs_symbol_map = {} for cvs_symbol in svn_symbol_commit.get_cvs_items(): cvs_symbol_map[cvs_symbol.id] = cvs_symbol range_map = {} for (revnum, type, cvs_symbol_id) \ in self._generate_lines(svn_symbol_commit.symbol): cvs_symbol = cvs_symbol_map.get(cvs_symbol_id) if cvs_symbol is None: # This CVSSymbol is not part of SVN_SYMBOL_COMMIT. continue range = range_map.get(cvs_symbol) if type == OPENING: if range is not None: raise InternalError( 'Multiple openings logged for %r' % (cvs_symbol,) ) range_map[cvs_symbol] = SVNRevisionRange( cvs_symbol.source_lod, revnum ) else: if range is None: raise InternalError( 'Closing precedes opening for %r' % (cvs_symbol,) ) if range.closing_revnum is not None: raise InternalError( 'Multiple closings logged for %r' % (cvs_symbol,) ) range.add_closing(revnum) # Make sure that all CVSSymbols are accounted for, and adjust the # closings to be not later than svn_symbol_commit.revnum. for cvs_symbol in cvs_symbol_map.itervalues(): try: range = range_map[cvs_symbol] except KeyError: raise InternalError('No opening for %s' % (cvs_symbol,)) if range.opening_revnum >= svn_symbol_commit.revnum: raise InternalError( 'Opening in r%d not ready for %s in r%d' % (range.opening_revnum, cvs_symbol, svn_symbol_commit.revnum,) ) if range.closing_revnum is not None \ and range.closing_revnum > svn_symbol_commit.revnum: range.closing_revnum = None return range_map cvs2svn-2.4.0/cvs2svn_lib/version.py0000664000076500007650000000165412027373143020504 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007-2009 CollabNet. All rights reserved. 
# # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== # The version of cvs2svn: VERSION = '2.4.0' # If this file is run as a script, print the cvs2svn version number to # stdout: if __name__ == '__main__': print VERSION cvs2svn-2.4.0/cvs2svn_lib/external_blob_generator.py0000664000076500007650000001026511500107341023671 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2009-2010 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Use the generate_blobs.py script to generate git blobs. Use a separate program, generate_blobs.py, to generate a git blob file directly from the RCS files, setting blob marks that we choose. This method is very much faster then generating the blobs from within the main program for several reasons: * The revision fulltexts are generated using internal code (rather than spawning rcs or cvs once per revision). This gain is analogous to the benefit of using --use-internal-co rather than --use-cvs or --use-rcs for cvs2svn. * Intermediate revisions' fulltext can usually be held in RAM rather than being written to temporary storage, and output as they are generated (git-fast-import doesn't care about their order). * The generate_blobs.py script runs in parallel to the main cvs2git script, allowing benefits to be had from multiple CPUs. """ import sys import os import subprocess import cPickle as pickle from cvs2svn_lib.common import FatalError from cvs2svn_lib.log import logger from cvs2svn_lib.cvs_item import CVSRevisionDelete from cvs2svn_lib.revision_manager import RevisionCollector from cvs2svn_lib.key_generator import KeyGenerator class ExternalBlobGenerator(RevisionCollector): """Have generate_blobs.py output file revisions to a blob file.""" def __init__(self, blob_filename): self.blob_filename = blob_filename def start(self): self._mark_generator = KeyGenerator() logger.normal('Starting generate_blobs.py...') self._popen = subprocess.Popen( [ sys.executable, os.path.join(os.path.dirname(__file__), 'generate_blobs.py'), self.blob_filename, ], stdin=subprocess.PIPE, ) def _process_symbol(self, cvs_symbol, cvs_file_items): """Record the original source of CVS_SYMBOL. 
Determine the original revision source of CVS_SYMBOL, and store it as the symbol's revision_reader_token.""" cvs_source = cvs_symbol.get_cvs_revision_source(cvs_file_items) cvs_symbol.revision_reader_token = cvs_source.revision_reader_token def process_file(self, cvs_file_items): marks = {} for lod_items in cvs_file_items.iter_lods(): for cvs_rev in lod_items.cvs_revisions: if not isinstance(cvs_rev, CVSRevisionDelete): mark = self._mark_generator.gen_id() cvs_rev.revision_reader_token = mark marks[cvs_rev.rev] = mark # A separate pickler is used for each dump(), so that its memo # doesn't grow very large. The default ASCII protocol is used so # that this works without changes on systems that distinguish # between text and binary files. pickle.dump((cvs_file_items.cvs_file.rcs_path, marks), self._popen.stdin) self._popen.stdin.flush() # Now that all CVSRevisions' revision_reader_tokens are set, # iterate through symbols and set their tokens to those of their # original source revisions: for lod_items in cvs_file_items.iter_lods(): if lod_items.cvs_branch is not None: self._process_symbol(lod_items.cvs_branch, cvs_file_items) for cvs_tag in lod_items.cvs_tags: self._process_symbol(cvs_tag, cvs_file_items) def finish(self): self._popen.stdin.close() logger.normal('Waiting for generate_blobs.py to finish...') returncode = self._popen.wait() if returncode: raise FatalError( 'generate_blobs.py failed with return code %s.' % (returncode,) ) else: logger.normal('generate_blobs.py is done.') cvs2svn-2.4.0/cvs2svn_lib/symbol.py0000664000076500007650000001756311244046551020332 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes that represent trunk, branches, and tags. The classes in this module represent several concepts related to symbols and lines of development in the abstract; that is, not within a particular file, but across all files in a project. The classes in this module are organized into the following class hierarchy: AbstractSymbol | +--LineOfDevelopment | | | +--Trunk | | | +--IncludedSymbol (also inherits from TypedSymbol) | | | +--Branch | | | +--Tag | +--Symbol | +--TypedSymbol | +--IncludedSymbol (also inherits from LineOfDevelopment) | | | +--Branch | | | +--Tag | +--ExcludedSymbol Please note the use of multiple inheritance. All AbstractSymbols contain an id that is globally unique across all AbstractSymbols. Moreover, the id of an AbstractSymbol remains the same even if the symbol is mutated (as described below), and two AbstractSymbols are considered equal iff their ids are the same, even if the two instances have different types. Symbols in different projects always have different ids and are therefore always distinct. (Indeed, this is pretty much the defining characteristic of a project.) 
Even if, for example, two projects each have branches with the same name, the Symbols representing the branches are distinct and have distinct ids. (This is important to avoid having to rewrite databases with new symbol ids in CollateSymbolsPass.) AbstractSymbols are all initially created in CollectRevsPass as either Trunk or Symbol instances. A Symbol instance is essentially an undifferentiated Symbol. In CollateSymbolsPass, it is decided which symbols will be converted as branches, which as tags, and which excluded altogether. At the beginning of this pass, the symbols are all represented by instances of the non-specific Symbol class. During CollateSymbolsPass, each Symbol instance is replaced by an instance of Branch, Tag, or ExcludedSymbol with the same id. (Trunk instances are left unchanged.) At the end of CollateSymbolsPass, all ExcludedSymbols are discarded and processing continues with only Trunk, Branch, and Tag instances. These three classes inherit from LineOfDevelopment; therefore, in later passes the term LineOfDevelopment (abbreviated to LOD) is used to refer to such objects.""" from cvs2svn_lib.context import Ctx from cvs2svn_lib.common import path_join class AbstractSymbol: """Base class for all other classes in this file.""" def __init__(self, id, project): self.id = id self.project = project def __hash__(self): return self.id def __eq__(self, other): return self.id == other.id class LineOfDevelopment(AbstractSymbol): """Base class for Trunk, Branch, and Tag. This is basically the abstraction for what will be a root tree in the Subversion repository.""" def __init__(self, id, project): AbstractSymbol.__init__(self, id, project) self.base_path = None def get_path(self, *components): """Return the svn path for this LineOfDevelopment.""" return path_join(self.base_path, *components) class Trunk(LineOfDevelopment): """Represent the main line of development.""" def __getstate__(self): return (self.id, self.project.id, self.base_path,) def __setstate__(self, state): (self.id, project_id, self.base_path,) = state self.project = Ctx()._projects[project_id] def __cmp__(self, other): if isinstance(other, Trunk): return cmp(self.project, other.project) elif isinstance(other, Symbol): # Allow Trunk to compare less than Symbols: return -1 else: raise NotImplementedError() def __str__(self): """For convenience only. The format is subject to change at any time.""" return 'Trunk' def __repr__(self): return '%s<%x>' % (self, self.id,) class Symbol(AbstractSymbol): """Represents a symbol within one project in the CVS repository. Instance of the Symbol class itself are used to represent symbols from the CVS repository. CVS, of course, distinguishes between normal tags and branch tags, but we allow symbol types to be changed in CollateSymbolsPass. Therefore, we store all CVS symbols as Symbol instances at the beginning of the conversion. In CollateSymbolsPass, Symbols are replaced by Branches, Tags, and ExcludedSymbols (the latter being discarded at the end of that pass).""" def __init__(self, id, project, name, preferred_parent_id=None): AbstractSymbol.__init__(self, id, project) self.name = name # If this symbol has a preferred parent, this member is the id of # the LineOfDevelopment instance representing it. If the symbol # never appeared in a CVSTag or CVSBranch (for example, because # all of the branches on this LOD have been detached from the # dependency tree), then this field is set to None. This field is # set during FilterSymbolsPass. 
self.preferred_parent_id = preferred_parent_id def __getstate__(self): return (self.id, self.project.id, self.name, self.preferred_parent_id,) def __setstate__(self, state): (self.id, project_id, self.name, self.preferred_parent_id,) = state self.project = Ctx()._projects[project_id] def __cmp__(self, other): if isinstance(other, Symbol): return cmp(self.project, other.project) \ or cmp(self.name, other.name) \ or cmp(self.id, other.id) elif isinstance(other, Trunk): # Allow Symbols to compare greater than Trunk: return +1 else: raise NotImplementedError() def __str__(self): return self.name def __repr__(self): return '%s<%x>' % (self, self.id,) class TypedSymbol(Symbol): """A Symbol whose type (branch, tag, or excluded) has been decided.""" def __init__(self, symbol): Symbol.__init__( self, symbol.id, symbol.project, symbol.name, symbol.preferred_parent_id, ) class IncludedSymbol(TypedSymbol, LineOfDevelopment): """A TypedSymbol that will be included in the conversion.""" def __init__(self, symbol): TypedSymbol.__init__(self, symbol) # We can't call the LineOfDevelopment constructor, so initialize # its extra member explicitly: try: # If the old symbol had a base_path set, then use it: self.base_path = symbol.base_path except AttributeError: self.base_path = None def __getstate__(self): return (TypedSymbol.__getstate__(self), self.base_path,) def __setstate__(self, state): (super_state, self.base_path,) = state TypedSymbol.__setstate__(self, super_state) class Branch(IncludedSymbol): """An object that describes a CVS branch.""" def __str__(self): """For convenience only. The format is subject to change at any time.""" return 'Branch(%r)' % (self.name,) class Tag(IncludedSymbol): def __str__(self): """For convenience only. The format is subject to change at any time.""" return 'Tag(%r)' % (self.name,) class ExcludedSymbol(TypedSymbol): def __str__(self): """For convenience only. The format is subject to change at any time.""" return 'ExcludedSymbol(%r)' % (self.name,) cvs2svn-2.4.0/cvs2svn_lib/git_revision_collector.py0000664000076500007650000000621611434364604023570 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
# ==================================================================== """Write file contents to a stream of git-fast-import blobs.""" from cvs2svn_lib.cvs_item import CVSRevisionDelete from cvs2svn_lib.revision_manager import RevisionCollector from cvs2svn_lib.key_generator import KeyGenerator class GitRevisionCollector(RevisionCollector): """Output file revisions to git-fast-import.""" def __init__(self, blob_filename, revision_reader): self.blob_filename = blob_filename self.revision_reader = revision_reader def register_artifacts(self, which_pass): self.revision_reader.register_artifacts(which_pass) def start(self): self.revision_reader.start() self.dump_file = open(self.blob_filename, 'wb') self._mark_generator = KeyGenerator() def _process_revision(self, cvs_rev): """Write the revision fulltext to a blob if it is not dead.""" if isinstance(cvs_rev, CVSRevisionDelete): # There is no need to record a delete revision, and its token # will never be needed: return # FIXME: We have to decide what to do about keyword substitution # and eol_style here: fulltext = self.revision_reader.get_content(cvs_rev) mark = self._mark_generator.gen_id() self.dump_file.write('blob\n') self.dump_file.write('mark :%d\n' % (mark,)) self.dump_file.write('data %d\n' % (len(fulltext),)) self.dump_file.write(fulltext) self.dump_file.write('\n') cvs_rev.revision_reader_token = mark def _process_symbol(self, cvs_symbol, cvs_file_items): """Record the original source of CVS_SYMBOL. Determine the original revision source of CVS_SYMBOL, and store it as the symbol's revision_reader_token.""" cvs_source = cvs_symbol.get_cvs_revision_source(cvs_file_items) cvs_symbol.revision_reader_token = cvs_source.revision_reader_token def process_file(self, cvs_file_items): for lod_items in cvs_file_items.iter_lods(): for cvs_rev in lod_items.cvs_revisions: self._process_revision(cvs_rev) # Now that all CVSRevisions' revision_reader_tokens are set, # iterate through symbols and set their tokens to those of their # original source revisions: for lod_items in cvs_file_items.iter_lods(): if lod_items.cvs_branch is not None: self._process_symbol(lod_items.cvs_branch, cvs_file_items) for cvs_tag in lod_items.cvs_tags: self._process_symbol(cvs_tag, cvs_file_items) def finish(self): self.revision_reader.finish() self.dump_file.close() cvs2svn-2.4.0/cvs2svn_lib/run_options.py0000664000076500007650000012450211710517256021377 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
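# --------------------------------------------------------------------
# Editor's illustration -- not part of the original distribution.  The
# records emitted by GitRevisionCollector._process_revision() (see
# git_revision_collector.py above) are plain git-fast-import 'blob'
# commands.  The helper below is a stand-alone sketch of the same
# framing; its name and example arguments are hypothetical:
def _example_blob_record(mark, fulltext):
  """Return one git-fast-import blob record as a string (sketch only)."""
  return ''.join([
      'blob\n',
      'mark :%d\n' % (mark,),
      'data %d\n' % (len(fulltext),),
      fulltext,
      '\n',
      ])
# For instance, _example_blob_record(1, 'hello\n') yields a record that
# "git fast-import" (or the experimental "hg fastimport") would accept.
# --------------------------------------------------------------------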
# ==================================================================== """This module contains classes to set common cvs2xxx run options.""" import sys import re import optparse from optparse import OptionGroup import datetime import codecs import time from cvs2svn_lib.version import VERSION from cvs2svn_lib import config from cvs2svn_lib.common import error_prefix from cvs2svn_lib.common import FatalError from cvs2svn_lib.man_writer import ManWriter from cvs2svn_lib.log import logger from cvs2svn_lib.context import Ctx from cvs2svn_lib.man_writer import ManOption from cvs2svn_lib.pass_manager import InvalidPassError from cvs2svn_lib.revision_manager import NullRevisionCollector from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader from cvs2svn_lib.checkout_internal import InternalRevisionCollector from cvs2svn_lib.checkout_internal import InternalRevisionReader from cvs2svn_lib.symbol_strategy import AllBranchRule from cvs2svn_lib.symbol_strategy import AllExcludedRule from cvs2svn_lib.symbol_strategy import AllTagRule from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform from cvs2svn_lib.symbol_transform import RegexpSymbolTransform from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform from cvs2svn_lib.property_setters import AutoPropsPropertySetter from cvs2svn_lib.property_setters import CVSBinaryFileDefaultMimeTypeSetter from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter from cvs2svn_lib.property_setters import CVSRevisionNumberSetter from cvs2svn_lib.property_setters import DefaultEOLStyleSetter from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter from cvs2svn_lib.property_setters import ExecutablePropertySetter from cvs2svn_lib.property_setters import DescriptionPropertySetter from cvs2svn_lib.property_setters import KeywordsPropertySetter from cvs2svn_lib.property_setters import MimeMapper from cvs2svn_lib.property_setters import SVNBinaryFileKeywordsPropertySetter usage = """\ Usage: %prog --options OPTIONFILE %prog [OPTION...] OUTPUT-OPTION CVS-REPOS-PATH""" description="""\ Convert a CVS repository into a Subversion repository, including history. """ class IncompatibleOption(ManOption): """A ManOption that is incompatible with the --options option. 
Record that the option was used so that error checking can later be done.""" def __init__(self, *args, **kw): ManOption.__init__(self, *args, **kw) def take_action(self, action, dest, opt, value, values, parser): oio = parser.values.options_incompatible_options if opt not in oio: oio.append(opt) return ManOption.take_action( self, action, dest, opt, value, values, parser ) class ContextOption(ManOption): """A ManOption that stores its value to Ctx.""" def __init__(self, *args, **kw): if kw.get('action') not in self.STORE_ACTIONS: raise ValueError('Invalid action: %s' % (kw['action'],)) self.__compatible_with_option = kw.pop('compatible_with_option', False) self.__action = kw.pop('action') try: self.__dest = kw.pop('dest') except KeyError: opt = args[0] if not opt.startswith('--'): raise ValueError() self.__dest = opt[2:].replace('-', '_') if 'const' in kw: self.__const = kw.pop('const') kw['action'] = 'callback' kw['callback'] = self.__callback ManOption.__init__(self, *args, **kw) def __callback(self, option, opt_str, value, parser): if not self.__compatible_with_option: oio = parser.values.options_incompatible_options if opt_str not in oio: oio.append(opt_str) action = self.__action dest = self.__dest if action == "store": setattr(Ctx(), dest, value) elif action == "store_const": setattr(Ctx(), dest, self.__const) elif action == "store_true": setattr(Ctx(), dest, True) elif action == "store_false": setattr(Ctx(), dest, False) elif action == "append": getattr(Ctx(), dest).append(value) elif action == "count": setattr(Ctx(), dest, getattr(Ctx(), dest, 0) + 1) else: raise RuntimeError("unknown action %r" % self.__action) return 1 class IncompatibleOptionsException(FatalError): pass # Options that are not allowed to be used with --trunk-only: SYMBOL_OPTIONS = [ '--symbol-transform', '--symbol-hints', '--force-branch', '--force-tag', '--exclude', '--keep-trivial-imports', '--symbol-default', '--no-cross-branch-commits', ] class SymbolOptionsWithTrunkOnlyException(IncompatibleOptionsException): def __init__(self): IncompatibleOptionsException.__init__( self, 'The following symbol-related options cannot be used together\n' 'with --trunk-only:\n' ' %s' % ('\n '.join(SYMBOL_OPTIONS),) ) def not_both(opt1val, opt1name, opt2val, opt2name): """Raise an exception if both opt1val and opt2val are set.""" if opt1val and opt2val: raise IncompatibleOptionsException( "cannot pass both '%s' and '%s'." % (opt1name, opt2name,) ) class RunOptions(object): """A place to store meta-options that are used to start the conversion.""" # Components of the man page. Attributes set to None here must be set # by subclasses; others may be overridden/augmented by subclasses if # they wish. short_desc = None synopsis = None long_desc = None files = None authors = [ u"C. Michael Pilato ", u"Greg Stein ", u"Branko \u010cibej ", u"Blair Zajac ", u"Max Bowsher ", u"Brian Fitzpatrick ", u"Tobias Ringstr\u00f6m ", u"Karl Fogel ", u"Erik H\u00fclsmann ", u"David Summers ", u"Michael Haggerty ", ] see_also = None def __init__(self, progname, cmd_args, pass_manager): """Process the command-line options, storing run options to SELF. PROGNAME is the name of the program, used in the usage string. CMD_ARGS is the list of command-line arguments passed to the program. 
PASS_MANAGER is an instance of PassManager, needed to help process the -p and --help-passes options.""" self.progname = progname self.cmd_args = cmd_args self.pass_manager = pass_manager self.start_pass = 1 self.end_pass = self.pass_manager.num_passes self.profiling = False self.projects = [] # A list of one list of SymbolStrategyRules for each project: self.project_symbol_strategy_rules = [] parser = self.parser = optparse.OptionParser( usage=usage, description=self.get_description(), add_help_option=False, ) # A place to record any options used that are incompatible with # --options: parser.set_default('options_incompatible_options', []) # Populate the options parser with the options, one group at a # time: parser.add_option_group(self._get_options_file_options_group()) parser.add_option_group(self._get_output_options_group()) parser.add_option_group(self._get_conversion_options_group()) parser.add_option_group(self._get_symbol_handling_options_group()) parser.add_option_group(self._get_subversion_properties_options_group()) parser.add_option_group(self._get_extraction_options_group()) parser.add_option_group(self._get_environment_options_group()) parser.add_option_group(self._get_partial_conversion_options_group()) parser.add_option_group(self._get_information_options_group()) (self.options, self.args) = parser.parse_args(args=self.cmd_args) # Now the log level has been set; log the time when the run started: logger.verbose( time.strftime( 'Conversion start time: %Y-%m-%d %I:%M:%S %Z', time.localtime(logger.start_time) ) ) if self.options.options_file_found: # Check that no options that are incompatible with --options # were used: self.verify_option_compatibility() else: # --options was not specified. So do the main initialization # based on other command-line options: self.process_options() # Check for problems with the options: self.check_options() def get_description(self): return description def _get_options_file_options_group(self): group = OptionGroup( self.parser, 'Configuration via options file' ) self.parser.set_default('options_file_found', False) group.add_option(ManOption( '--options', type='string', action='callback', callback=self.callback_options, help=( 'read the conversion options from PATH. This ' 'method allows more flexibility than using ' 'command-line options. See documentation for info' ), man_help=( 'Read the conversion options from \\fIpath\\fR instead of from ' 'the command line. This option allows far more conversion ' 'flexibility than can be achieved using the command-line alone. ' 'See the documentation for more information. Only the following ' 'command-line options are allowed in combination with ' '\\fB--options\\fR: \\fB-h\\fR/\\fB--help\\fR, ' '\\fB--help-passes\\fR, \\fB--version\\fR, ' '\\fB-v\\fR/\\fB--verbose\\fR, \\fB-q\\fR/\\fB--quiet\\fR, ' '\\fB-p\\fR/\\fB--pass\\fR/\\fB--passes\\fR, \\fB--dry-run\\fR, ' '\\fB--profile\\fR, \\fB--trunk-only\\fR, \\fB--encoding\\fR, ' 'and \\fB--fallback-encoding\\fR. ' 'Options are processed in the order specified on the command ' 'line.' ), metavar='PATH', )) return group def _get_output_options_group(self): group = OptionGroup(self.parser, 'Output options') return group def _get_conversion_options_group(self): group = OptionGroup(self.parser, 'Conversion options') group.add_option(ContextOption( '--trunk-only', action='store_true', compatible_with_option=True, help='convert only trunk commits, not tags nor branches', man_help=( 'Convert only trunk commits, not tags nor branches.' 
), )) group.add_option(ManOption( '--encoding', type='string', action='callback', callback=self.callback_encoding, help=( 'encoding for paths and log messages in CVS repos. ' 'If option is specified multiple times, encoders ' 'are tried in order until one succeeds. See ' 'http://docs.python.org/lib/standard-encodings.html ' 'for a list of standard Python encodings.' ), man_help=( 'Use \\fIencoding\\fR as the encoding for filenames, log ' 'messages, and author names in the CVS repos. This option ' 'may be specified multiple times, in which case the encodings ' 'are tried in order until one succeeds. Default: ascii. See ' 'http://docs.python.org/lib/standard-encodings.html for a list ' 'of other standard encodings.' ), metavar='ENC', )) group.add_option(ManOption( '--fallback-encoding', type='string', action='callback', callback=self.callback_fallback_encoding, help='If all --encodings fail, use lossy encoding with ENC', man_help=( 'If none of the encodings specified with \\fB--encoding\\fR ' 'succeed in decoding an author name or log message, then fall ' 'back to using \\fIencoding\\fR in lossy \'replace\' mode. ' 'Use of this option may cause information to be lost, but at ' 'least it allows the conversion to run to completion. This ' 'option only affects the encoding of log messages and author ' 'names; there is no fallback encoding for filenames. (By ' 'using an \\fB--options\\fR file, it is possible to specify ' 'a fallback encoding for filenames.) Default: disabled.' ), metavar='ENC', )) group.add_option(ContextOption( '--retain-conflicting-attic-files', action='store_true', help=( 'if a file appears both in and out of ' 'the CVS Attic, then leave the attic version in an ' 'SVN directory called "Attic"' ), man_help=( 'If a file appears both inside and outside of the CVS attic, ' 'retain the attic version in an SVN subdirectory called ' '\'Attic\'. (Normally this situation is treated as a fatal ' 'error.)' ), )) return group def _get_symbol_handling_options_group(self): group = OptionGroup(self.parser, 'Symbol handling') self.parser.set_default('symbol_transforms', []) group.add_option(IncompatibleOption( '--symbol-transform', type='string', action='callback', callback=self.callback_symbol_transform, help=( 'transform symbol names from P to S, where P and S ' 'use Python regexp and reference syntax ' 'respectively. P must match the whole symbol name' ), man_help=( 'Transform RCS/CVS symbol names before entering them into ' 'Subversion. \\fIpattern\\fR is a Python regexp pattern that ' 'is matched against the entire symbol name; \\fIreplacement\\fR ' 'is a replacement using Python\'s regexp reference syntax. ' 'You may specify any number of these options; they will be ' 'applied in the order given on the command line.' ), metavar='P:S', )) self.parser.set_default('symbol_strategy_rules', []) group.add_option(IncompatibleOption( '--symbol-hints', type='string', action='callback', callback=self.callback_symbol_hints, help='read symbol conversion hints from PATH', man_help=( 'Read symbol conversion hints from \\fIpath\\fR. The format of ' '\\fIpath\\fR is the same as the format output by ' '\\fB--write-symbol-info\\fR, namely a text file with four ' 'whitespace-separated columns: \\fIproject-id\\fR, ' '\\fIsymbol\\fR, \\fIconversion\\fR, and ' '\\fIparent-lod-name\\fR. \\fIproject-id\\fR is the numerical ' 'ID of the project to which the symbol belongs, counting from ' '0. \\fIproject-id\\fR can be set to \'.\' if ' 'project-specificity is not needed. 
\\fIsymbol-name\\fR is the ' 'name of the symbol being specified. \\fIconversion\\fR ' 'specifies how the symbol should be converted, and can be one ' 'of the values \'branch\', \'tag\', or \'exclude\'. If ' '\\fIconversion\\fR is \'.\', then this rule does not affect ' 'how the symbol is converted. \\fIparent-lod-name\\fR is the ' 'name of the symbol from which this symbol should sprout, or ' '\'.trunk.\' if the symbol should sprout from trunk. If ' '\\fIparent-lod-name\\fR is omitted or \'.\', then this rule ' 'does not affect the preferred parent of this symbol. The file ' 'may contain blank lines or comment lines (lines whose first ' 'non-whitespace character is \'#\').' ), metavar='PATH', )) self.parser.set_default('symbol_default', 'heuristic') group.add_option(IncompatibleOption( '--symbol-default', type='choice', choices=['heuristic', 'strict', 'branch', 'tag', 'exclude'], action='store', help=( 'specify how ambiguous symbols are converted. ' 'OPT is "heuristic" (default), "strict", "branch", ' '"tag" or "exclude"' ), man_help=( 'Specify how to convert ambiguous symbols (those that appear in ' 'the CVS archive as both branches and tags). \\fIopt\\fR must ' 'be \'heuristic\' (decide how to treat each ambiguous symbol ' 'based on whether it was used more often as a branch/tag in ' 'CVS), \'strict\' (no default; every ambiguous symbol has to be ' 'resolved manually using \\fB--force-branch\\fR, ' '\\fB--force-tag\\fR, or \\fB--exclude\\fR), \'branch\' (treat ' 'every ambiguous symbol as a branch), \'tag\' (treat every ' 'ambiguous symbol as a tag), or \'exclude\' (do not convert ' 'ambiguous symbols). The default is \'heuristic\'.' ), metavar='OPT', )) group.add_option(IncompatibleOption( '--force-branch', type='string', action='callback', callback=self.callback_force_branch, help='force symbols matching REGEXP to be branches', man_help=( 'Force symbols whose names match \\fIregexp\\fR to be branches. ' '\\fIregexp\\fR must match the whole symbol name.' ), metavar='REGEXP', )) group.add_option(IncompatibleOption( '--force-tag', type='string', action='callback', callback=self.callback_force_tag, help='force symbols matching REGEXP to be tags', man_help=( 'Force symbols whose names match \\fIregexp\\fR to be tags. ' '\\fIregexp\\fR must match the whole symbol name.' ), metavar='REGEXP', )) group.add_option(IncompatibleOption( '--exclude', type='string', action='callback', callback=self.callback_exclude, help='exclude branches and tags matching REGEXP', man_help=( 'Exclude branches and tags whose names match \\fIregexp\\fR ' 'from the conversion. \\fIregexp\\fR must match the whole ' 'symbol name.' ), metavar='REGEXP', )) self.parser.set_default('keep_trivial_imports', False) group.add_option(IncompatibleOption( '--keep-trivial-imports', action='store_true', help=( 'do not exclude branches that were only used for ' 'a single import (usually these are unneeded)' ), man_help=( 'Do not exclude branches that were only used for a single ' 'import. (By default such branches are excluded because they ' 'are usually created by the inappropriate use of \\fBcvs ' 'import\\fR.)' ), )) return group def _get_subversion_properties_options_group(self): group = OptionGroup(self.parser, 'Subversion properties') group.add_option(ContextOption( '--username', type='string', action='store', help='username for cvs2svn-synthesized commits', man_help=( 'Set the default username to \\fIname\\fR when cvs2svn needs ' 'to generate a commit for which CVS does not record the ' 'original username. 
This happens when a branch or tag is ' 'created. The default is to use no author at all for such ' 'commits.' ), metavar='NAME', )) self.parser.set_default('auto_props_files', []) group.add_option(IncompatibleOption( '--auto-props', type='string', action='append', dest='auto_props_files', help=( 'set file properties from the auto-props section ' 'of a file in svn config format' ), man_help=( 'Specify a file in the format of Subversion\'s config file, ' 'whose [auto-props] section can be used to set arbitrary ' 'properties on files in the Subversion repository based on ' 'their filenames. (The [auto-props] section header must be ' 'present; other sections of the config file, including the ' 'enable-auto-props setting, are ignored.) Filenames are matched ' 'to the filename patterns case-insensitively.' ), metavar='FILE', )) self.parser.set_default('mime_types_files', []) group.add_option(IncompatibleOption( '--mime-types', type='string', action='append', dest='mime_types_files', help=( 'specify an apache-style mime.types file for setting ' 'svn:mime-type' ), man_help=( 'Specify an apache-style mime.types \\fIfile\\fR for setting ' 'svn:mime-type.' ), metavar='FILE', )) self.parser.set_default('eol_from_mime_type', False) group.add_option(IncompatibleOption( '--eol-from-mime-type', action='store_true', help='set svn:eol-style from mime type if known', man_help=( 'For files that don\'t have the kb expansion mode but have a ' 'known mime type, set the eol-style based on the mime type. ' 'For such files, set svn:eol-style to "native" if the mime type ' 'begins with "text/", and leave it unset (i.e., no EOL ' 'translation) otherwise. Files with unknown mime types are ' 'not affected by this option. This option has no effect ' 'unless the \\fB--mime-types\\fR option is also specified.' ), )) self.parser.set_default('default_eol', 'binary') group.add_option(IncompatibleOption( '--default-eol', type='choice', choices=['binary', 'native', 'CRLF', 'LF', 'CR'], action='store', help=( 'default svn:eol-style for non-binary files with ' 'undetermined mime types. STYLE is "binary" ' '(default), "native", "CRLF", "LF", or "CR"' ), man_help=( 'Set svn:eol-style to \\fIstyle\\fR for files that don\'t have ' 'the CVS \'kb\' expansion mode and whose end-of-line ' 'translation mode hasn\'t been determined by one of the other ' 'options. \\fIstyle\\fR must be \'binary\' (default), ' '\'native\', \'CRLF\', \'LF\', or \'CR\'.' ), metavar='STYLE', )) self.parser.set_default('keywords_off', False) group.add_option(IncompatibleOption( '--keywords-off', action='store_true', help=( 'don\'t set svn:keywords on any files (by default, ' 'cvs2svn sets svn:keywords on non-binary files to "%s")' % (config.SVN_KEYWORDS_VALUE,) ), man_help=( 'By default, cvs2svn sets svn:keywords on CVS files to "author ' 'id date" if the mode of the RCS file in question is either kv, ' 'kvl or unset. If you use the --keywords-off switch, cvs2svn ' 'will not set svn:keywords for any file. While this will not ' 'touch the keywords in the contents of your files, Subversion ' 'will not expand them.' ), )) group.add_option(ContextOption( '--keep-cvsignore', action='store_true', help=( 'keep .cvsignore files (in addition to creating ' 'the analogous svn:ignore properties)' ), man_help=( 'Include \\fI.cvsignore\\fR files in the output. 
(Normally ' 'they are unneeded because cvs2svn sets the corresponding ' '\\fIsvn:ignore\\fR properties.)' ), )) group.add_option(IncompatibleOption( '--cvs-revnums', action='callback', callback=self.callback_cvs_revnums, help='record CVS revision numbers as file properties', man_help=( 'Record CVS revision numbers as file properties in the ' 'Subversion repository. (Note that unless it is removed ' 'explicitly, the last CVS revision number will remain ' 'associated with the file even after the file is changed ' 'within Subversion.)' ), )) # Deprecated options: group.add_option(IncompatibleOption( '--no-default-eol', action='store_const', dest='default_eol', const=None, help=optparse.SUPPRESS_HELP, man_help=optparse.SUPPRESS_HELP, )) self.parser.set_default('auto_props_ignore_case', True) # True is the default now, so this option has no effect: group.add_option(IncompatibleOption( '--auto-props-ignore-case', action='store_true', help=optparse.SUPPRESS_HELP, man_help=optparse.SUPPRESS_HELP, )) return group def _get_extraction_options_group(self): group = OptionGroup(self.parser, 'Extraction options') return group def _add_use_internal_co_option(self, group): self.parser.set_default('use_internal_co', False) group.add_option(IncompatibleOption( '--use-internal-co', action='store_true', help=( 'use internal code to extract revision contents ' '(fastest but disk space intensive) (default)' ), man_help=( 'Use internal code to extract revision contents. This ' 'is up to 50% faster than using \\fB--use-rcs\\fR, but needs ' 'a lot of disk space: roughly the size of your CVS repository ' 'plus the peak size of a complete checkout of the repository ' 'with all branches that existed and still had commits pending ' 'at a given time. This option is the default.' ), )) def _add_use_cvs_option(self, group): self.parser.set_default('use_cvs', False) group.add_option(IncompatibleOption( '--use-cvs', action='store_true', help=( 'use CVS to extract revision contents (slower than ' '--use-internal-co or --use-rcs)' ), man_help=( 'Use CVS to extract revision contents. This option is slower ' 'than \\fB--use-internal-co\\fR or \\fB--use-rcs\\fR.' ), )) def _add_use_rcs_option(self, group): self.parser.set_default('use_rcs', False) group.add_option(IncompatibleOption( '--use-rcs', action='store_true', help=( 'use RCS to extract revision contents (faster than ' '--use-cvs but fails in some cases)' ), man_help=( 'Use RCS \'co\' to extract revision contents. This option is ' 'faster than \\fB--use-cvs\\fR but fails in some cases.' ), )) def _get_environment_options_group(self): group = OptionGroup(self.parser, 'Environment options') group.add_option(ContextOption( '--tmpdir', type='string', action='store', help=( 'directory to use for temporary data files ' '(default "cvs2svn-tmp")' ), man_help=( 'Set the \\fIpath\\fR to use for temporary data. Default ' 'is a directory called \\fIcvs2svn-tmp\\fR under the current ' 'directory.' ), metavar='PATH', )) self.parser.set_default('co_executable', config.CO_EXECUTABLE) group.add_option(IncompatibleOption( '--co', type='string', action='store', dest='co_executable', help='path to the "co" program (required if --use-rcs)', man_help=( 'Path to the \\fIco\\fR program. 
(\\fIco\\fR is needed if the ' '\\fB--use-rcs\\fR option is used.)' ), metavar='PATH', )) self.parser.set_default('cvs_executable', config.CVS_EXECUTABLE) group.add_option(IncompatibleOption( '--cvs', type='string', action='store', dest='cvs_executable', help='path to the "cvs" program (required if --use-cvs)', man_help=( 'Path to the \\fIcvs\\fR program. (\\fIcvs\\fR is needed if the ' '\\fB--use-cvs\\fR option is used.)' ), metavar='PATH', )) return group def _get_partial_conversion_options_group(self): group = OptionGroup(self.parser, 'Partial conversions') group.add_option(ManOption( '--pass', type='string', action='callback', callback=self.callback_passes, help='execute only specified PASS of conversion', man_help=( 'Execute only pass \\fIpass\\fR of the conversion. ' '\\fIpass\\fR can be specified by name or by number (see ' '\\fB--help-passes\\fR).' ), metavar='PASS', )) group.add_option(ManOption( '--passes', '-p', type='string', action='callback', callback=self.callback_passes, help=( 'execute passes START through END, inclusive (PASS, ' 'START, and END can be pass names or numbers)' ), man_help=( 'Execute passes \\fIstart\\fR through \\fIend\\fR of the ' 'conversion (inclusive). \\fIstart\\fR and \\fIend\\fR can be ' 'specified by name or by number (see \\fB--help-passes\\fR). ' 'If \\fIstart\\fR or \\fIend\\fR is missing, it defaults to ' 'the first or last pass, respectively. For this to work the ' 'earlier passes must have been completed before on the ' 'same CVS repository, and the generated data files must be ' 'in the temporary directory (see \\fB--tmpdir\\fR).' ), metavar='[START]:[END]', )) return group def _get_information_options_group(self): group = OptionGroup(self.parser, 'Information options') group.add_option(ManOption( '--version', action='callback', callback=self.callback_version, help='print the version number', man_help='Print the version number.', )) group.add_option(ManOption( '--help', '-h', action="help", help='print this usage message and exit with success', man_help='Print the usage message and exit with success.', )) group.add_option(ManOption( '--help-passes', action='callback', callback=self.callback_help_passes, help='list the available passes and their numbers', man_help=( 'Print the numbers and names of the conversion passes and ' 'exit with success.' ), )) group.add_option(ManOption( '--man', action='callback', callback=self.callback_manpage, help='write the manpage for this program to standard output', man_help=( 'Output the unix-style manpage for this program to standard ' 'output.' ), )) group.add_option(ManOption( '--verbose', '-v', action='callback', callback=self.callback_verbose, help='verbose (may be specified twice for debug output)', man_help=( 'Print more information while running. This option may be ' 'specified twice to output voluminous debugging information.' ), )) group.add_option(ManOption( '--quiet', '-q', action='callback', callback=self.callback_quiet, help='quiet (may be specified twice for very quiet)', man_help=( 'Print less information while running. This option may be ' 'specified twice to suppress all non-error output.' ), )) group.add_option(ContextOption( '--write-symbol-info', type='string', action='store', dest='symbol_info_filename', help='write information and statistics about CVS symbols to PATH.', man_help=( 'Write to \\fIpath\\fR symbol statistics and information about ' 'how symbols were converted during CollateSymbolsPass.' 
), metavar='PATH', )) group.add_option(ContextOption( '--skip-cleanup', action='store_true', help='prevent the deletion of intermediate files', man_help='Prevent the deletion of temporary files.', )) prof = 'cProfile' try: import cProfile except ImportError: prof = 'hotshot' group.add_option(ManOption( '--profile', action='callback', callback=self.callback_profile, help='profile with \'' + prof + '\' (into file cvs2svn.' + prof + ')', man_help=( 'Profile with \'' + prof + '\' (into file \\fIcvs2svn.' + prof + '\\fR).' ), )) return group def callback_options(self, option, opt_str, value, parser): parser.values.options_file_found = True self.process_options_file(value) def callback_encoding(self, option, opt_str, value, parser): ctx = Ctx() try: ctx.cvs_author_decoder.add_encoding(value) ctx.cvs_log_decoder.add_encoding(value) ctx.cvs_filename_decoder.add_encoding(value) except LookupError, e: raise FatalError(str(e)) def callback_fallback_encoding(self, option, opt_str, value, parser): ctx = Ctx() try: ctx.cvs_author_decoder.set_fallback_encoding(value) ctx.cvs_log_decoder.set_fallback_encoding(value) # Don't use fallback_encoding for filenames. except LookupError, e: raise FatalError(str(e)) def callback_help_passes(self, option, opt_str, value, parser): self.pass_manager.help_passes() sys.exit(0) def callback_manpage(self, option, opt_str, value, parser): f = codecs.getwriter('utf_8')(sys.stdout) writer = ManWriter(parser, section='1', date=datetime.date.today(), source='Version %s' % (VERSION,), manual='User Commands', short_desc=self.short_desc, synopsis=self.synopsis, long_desc=self.long_desc, files=self.files, authors=self.authors, see_also=self.see_also) writer.write_manpage(f) sys.exit(0) def callback_version(self, option, opt_str, value, parser): sys.stdout.write( '%s version %s\n' % (self.progname, VERSION) ) sys.exit(0) def callback_verbose(self, option, opt_str, value, parser): logger.increase_verbosity() def callback_quiet(self, option, opt_str, value, parser): logger.decrease_verbosity() def callback_passes(self, option, opt_str, value, parser): if value.find(':') >= 0: start_pass, end_pass = value.split(':') self.start_pass = self.pass_manager.get_pass_number(start_pass, 1) self.end_pass = self.pass_manager.get_pass_number( end_pass, self.pass_manager.num_passes ) else: self.end_pass = \ self.start_pass = \ self.pass_manager.get_pass_number(value) def callback_profile(self, option, opt_str, value, parser): self.profiling = True def callback_symbol_hints(self, option, opt_str, value, parser): parser.values.symbol_strategy_rules.append(SymbolHintsFileRule(value)) def callback_force_branch(self, option, opt_str, value, parser): parser.values.symbol_strategy_rules.append( ForceBranchRegexpStrategyRule(value) ) def callback_force_tag(self, option, opt_str, value, parser): parser.values.symbol_strategy_rules.append( ForceTagRegexpStrategyRule(value) ) def callback_exclude(self, option, opt_str, value, parser): parser.values.symbol_strategy_rules.append( ExcludeRegexpStrategyRule(value) ) def callback_cvs_revnums(self, option, opt_str, value, parser): Ctx().revision_property_setters.append(CVSRevisionNumberSetter()) def callback_symbol_transform(self, option, opt_str, value, parser): [pattern, replacement] = value.split(":") try: parser.values.symbol_transforms.append( RegexpSymbolTransform(pattern, replacement) ) except re.error: raise FatalError("'%s' is not a valid regexp." 
% (pattern,)) # Common to SVNRunOptions, HgRunOptions (GitRunOptions and # BzrRunOptions do not support --use-internal-co, so cannot use this). def process_all_extraction_options(self): ctx = Ctx() options = self.options not_both(options.use_rcs, '--use-rcs', options.use_cvs, '--use-cvs') not_both(options.use_rcs, '--use-rcs', options.use_internal_co, '--use-internal-co') not_both(options.use_cvs, '--use-cvs', options.use_internal_co, '--use-internal-co') if options.use_rcs: ctx.revision_collector = NullRevisionCollector() ctx.revision_reader = RCSRevisionReader(options.co_executable) elif options.use_cvs: ctx.revision_collector = NullRevisionCollector() ctx.revision_reader = CVSRevisionReader(options.cvs_executable) else: # --use-internal-co is the default: ctx.revision_collector = InternalRevisionCollector(compress=True) ctx.revision_reader = InternalRevisionReader(compress=True) def process_symbol_strategy_options(self): """Process symbol strategy-related options.""" ctx = Ctx() options = self.options # Add the standard symbol name cleanup rules: self.options.symbol_transforms.extend([ ReplaceSubstringsSymbolTransform('\\','/'), # Remove leading, trailing, and repeated slashes: NormalizePathsSymbolTransform(), ]) if ctx.trunk_only: if options.symbol_strategy_rules or options.keep_trivial_imports: raise SymbolOptionsWithTrunkOnlyException() else: if not options.keep_trivial_imports: options.symbol_strategy_rules.append(ExcludeTrivialImportBranchRule()) options.symbol_strategy_rules.append(UnambiguousUsageRule()) if options.symbol_default == 'strict': pass elif options.symbol_default == 'branch': options.symbol_strategy_rules.append(AllBranchRule()) elif options.symbol_default == 'tag': options.symbol_strategy_rules.append(AllTagRule()) elif options.symbol_default == 'heuristic': options.symbol_strategy_rules.append(BranchIfCommitsRule()) options.symbol_strategy_rules.append(HeuristicStrategyRule()) elif options.symbol_default == 'exclude': options.symbol_strategy_rules.append(AllExcludedRule()) else: assert False # Now add a rule whose job it is to pick the preferred parents of # branches and tags: options.symbol_strategy_rules.append(HeuristicPreferredParentRule()) def process_property_setter_options(self): """Process the options that set SVN properties.""" ctx = Ctx() options = self.options for value in options.auto_props_files: ctx.file_property_setters.append( AutoPropsPropertySetter(value, options.auto_props_ignore_case) ) for value in options.mime_types_files: ctx.file_property_setters.append(MimeMapper(value)) ctx.file_property_setters.append(CVSBinaryFileEOLStyleSetter()) ctx.file_property_setters.append(CVSBinaryFileDefaultMimeTypeSetter()) if options.eol_from_mime_type: ctx.file_property_setters.append(EOLStyleFromMimeTypeSetter()) ctx.file_property_setters.append( DefaultEOLStyleSetter(options.default_eol) ) ctx.file_property_setters.append(SVNBinaryFileKeywordsPropertySetter()) if not options.keywords_off: ctx.file_property_setters.append( KeywordsPropertySetter(config.SVN_KEYWORDS_VALUE) ) ctx.file_property_setters.append(ExecutablePropertySetter()) ctx.file_property_setters.append(DescriptionPropertySetter()) def process_options(self): """Do the main configuration based on command-line options. This method is only called if the --options option was not specified.""" raise NotImplementedError() def check_options(self): """Check that the run options are OK. 
This should only be called after all options have been processed.""" # Convenience var, so we don't have to keep instantiating this Borg. ctx = Ctx() if not self.start_pass <= self.end_pass: raise InvalidPassError( 'Ending pass must not come before starting pass.') if not ctx.dry_run and ctx.output_option is None: raise FatalError('No output option specified.') if ctx.output_option is not None: ctx.output_option.check() if not self.projects: raise FatalError('No project specified.') def verify_option_compatibility(self): """Verify that no options incompatible with --options were used. The --options option was specified. Verify that no incompatible options or arguments were specified.""" if self.options.options_incompatible_options or self.args: if self.options.options_incompatible_options: oio = self.options.options_incompatible_options logger.error( '%s: The following options cannot be used in combination with ' 'the --options\n' 'option:\n' ' %s\n' % (error_prefix, '\n '.join(oio)) ) if self.args: logger.error( '%s: No cvs-repos-path arguments are allowed with the --options ' 'option.\n' % (error_prefix,) ) sys.exit(1) def process_options_file(self, options_filename): """Read options from the file named OPTIONS_FILENAME. Store the run options to SELF.""" g = { 'ctx' : Ctx(), 'run_options' : self, } execfile(options_filename, g) def usage(self): self.parser.print_help() cvs2svn-2.4.0/cvs2svn_lib/__init__.py0000664000076500007650000000144311244047343020552 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This package contains modules that support cvs2svn.""" cvs2svn-2.4.0/cvs2svn_lib/svn_output_option.py0000664000076500007650000007141311500107341022623 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
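# --------------------------------------------------------------------
# Editor's note -- not part of the original distribution.  As
# RunOptions.process_options_file() (run_options.py above) shows, an
# --options file is ordinary Python executed via execfile() with the
# names 'ctx' (the Ctx() singleton) and 'run_options' (the RunOptions
# instance) pre-bound.  The sketch below is only an illustration: the
# paths are hypothetical, and the add_project() call mirrors the
# project's shipped example options files rather than anything defined
# in this excerpt.
_EXAMPLE_OPTIONS_FILE = """\
# Minimal cvs2svn --options file (illustrative sketch only):
from cvs2svn_lib.svn_output_option import NewRepositoryOutputOption

ctx.output_option = NewRepositoryOutputOption(r'/path/to/new-svn-repos')
run_options.add_project(
    r'/path/to/cvs-repos/my-project',
    trunk_path='trunk', branches_path='branches', tags_path='tags',
    )
"""
# --------------------------------------------------------------------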
# ==================================================================== """Classes for outputting the converted repository to SVN.""" import os import re from cvs2svn_lib import config from cvs2svn_lib.common import InternalError from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import FatalException from cvs2svn_lib.common import error_prefix from cvs2svn_lib.common import format_date from cvs2svn_lib.common import IllegalSVNPathError from cvs2svn_lib.common import PathsNotDisjointException from cvs2svn_lib.common import verify_paths_disjoint from cvs2svn_lib.log import logger from cvs2svn_lib.context import Ctx from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.process import CommandFailedException from cvs2svn_lib.process import check_command_runs from cvs2svn_lib.process import call_command from cvs2svn_lib.cvs_path import CVSDirectory from cvs2svn_lib.symbol import Trunk from cvs2svn_lib.symbol import LineOfDevelopment from cvs2svn_lib.cvs_item import CVSRevisionAdd from cvs2svn_lib.cvs_item import CVSRevisionChange from cvs2svn_lib.cvs_item import CVSRevisionDelete from cvs2svn_lib.cvs_item import CVSRevisionNoop from cvs2svn_lib.repository_mirror import RepositoryMirror from cvs2svn_lib.repository_mirror import PathExistsError from cvs2svn_lib.openings_closings import SymbolingsReader from cvs2svn_lib.fill_source import get_source_set from cvs2svn_lib.svn_dump import DumpstreamDelegate from cvs2svn_lib.svn_dump import LoaderPipe from cvs2svn_lib.output_option import OutputOption class SVNOutputOption(OutputOption): """An OutputOption appropriate for output to Subversion.""" name = 'Subversion' class ParentMissingError(Exception): """The parent of a path is missing. Exception raised if an attempt is made to add a path to the repository mirror but the parent's path doesn't exist in the youngest revision of the repository.""" pass class ExpectedDirectoryError(Exception): """A file was found where a directory was expected.""" pass def __init__(self, author_transforms=None): self._mirror = RepositoryMirror() def to_utf8(s): if isinstance(s, unicode): return s.encode('utf8') else: return s self.author_transforms = {} if author_transforms is not None: for (cvsauthor, name) in author_transforms.iteritems(): cvsauthor = to_utf8(cvsauthor) name = to_utf8(name) self.author_transforms[cvsauthor] = name def register_artifacts(self, which_pass): # These artifacts are needed for SymbolingsReader: artifact_manager.register_temp_file_needed( config.SYMBOL_OPENINGS_CLOSINGS_SORTED, which_pass ) artifact_manager.register_temp_file_needed( config.SYMBOL_OFFSETS_DB, which_pass ) self._mirror.register_artifacts(which_pass) Ctx().revision_reader.register_artifacts(which_pass) # Characters not allowed in Subversion filenames: illegal_filename_characters_re = re.compile('[\\\x00-\\\x1f\\\x7f]') def verify_filename_legal(self, filename): OutputOption.verify_filename_legal(self, filename) m = SVNOutputOption.illegal_filename_characters_re.search(filename) if m: raise IllegalSVNPathError( '%s does not allow character %r in filename %r.' 
% (self.name, m.group(), filename,) ) def check_symbols(self, symbol_map): """Check that the paths of all included LODs are set and disjoint.""" error_found = False # Check that all included LODs have their base paths set, and # collect the paths into a list: paths = [] for lod in symbol_map.itervalues(): if isinstance(lod, LineOfDevelopment): if lod.base_path is None: logger.error('%s: No path was set for %r\n' % (error_prefix, lod,)) error_found = True else: paths.append(lod.base_path) # Check that the SVN paths of all LODS are disjoint: try: verify_paths_disjoint(*paths) except PathsNotDisjointException, e: logger.error(str(e)) error_found = True if error_found: raise FatalException( 'Please fix the above errors and restart CollateSymbolsPass' ) def setup(self, svn_rev_count): self._symbolings_reader = SymbolingsReader() self._mirror.open() self._delegates = [] Ctx().revision_reader.start() self.svn_rev_count = svn_rev_count def _get_author(self, svn_commit): author = svn_commit.get_author() name = self.author_transforms.get(author, author) return name def _get_revprops(self, svn_commit): """Return the Subversion revprops for this SVNCommit.""" return { 'svn:author' : self._get_author(svn_commit), 'svn:log' : svn_commit.get_log_msg(), 'svn:date' : format_date(svn_commit.date), } def start_commit(self, revnum, revprops): """Start a new commit.""" logger.verbose("=" * 60) logger.normal( "Starting Subversion r%d / %d" % (revnum, self.svn_rev_count) ) self._mirror.start_commit(revnum) self._invoke_delegates('start_commit', revnum, revprops) def end_commit(self): """Called at the end of each commit. This method copies the newly created nodes to the on-disk nodes db.""" self._mirror.end_commit() self._invoke_delegates('end_commit') def delete_lod(self, lod): """Delete the main path for LOD from the tree. The path must currently exist. Silently refuse to delete trunk paths.""" if isinstance(lod, Trunk): # Never delete a Trunk path. return logger.verbose(" Deleting %s" % (lod.get_path(),)) self._mirror.get_current_lod_directory(lod).delete() self._invoke_delegates('delete_lod', lod) def delete_path(self, cvs_path, lod, should_prune=False): """Delete CVS_PATH from LOD.""" if cvs_path.parent_directory is None: self.delete_lod(lod) return logger.verbose(" Deleting %s" % (lod.get_path(cvs_path.cvs_path),)) parent_node = self._mirror.get_current_path( cvs_path.parent_directory, lod ) del parent_node[cvs_path] self._invoke_delegates('delete_path', lod, cvs_path) if should_prune: while parent_node is not None and len(parent_node) == 0: # A drawback of this code is that we issue a delete for each # path and not just a single delete for the topmost directory # pruned. node = parent_node cvs_path = node.cvs_path if cvs_path.parent_directory is None: parent_node = None self.delete_lod(lod) else: parent_node = node.parent_mirror_dir node.delete() logger.verbose(" Deleting %s" % (lod.get_path(cvs_path.cvs_path),)) self._invoke_delegates('delete_path', lod, cvs_path) def initialize_project(self, project): """Create the basic structure for PROJECT.""" logger.verbose(" Initializing project %s" % (project,)) self._invoke_delegates('initialize_project', project) # Don't invoke delegates. 
self._mirror.add_lod(project.get_trunk()) if Ctx().include_empty_directories: self._make_empty_subdirectories( project.get_root_cvs_directory(), project.get_trunk() ) def change_path(self, cvs_rev): """Register a change in self._youngest for the CVS_REV's svn_path.""" logger.verbose(" Changing %s" % (cvs_rev.get_svn_path(),)) # We do not have to update the nodes because our mirror is only # concerned with the presence or absence of paths, and a file # content change does not cause any path changes. self._invoke_delegates('change_path', cvs_rev) def _make_empty_subdirectories(self, cvs_directory, lod): """Make any empty subdirectories of CVS_DIRECTORY in LOD.""" for empty_subdirectory_id in cvs_directory.empty_subdirectory_ids: empty_subdirectory = Ctx()._cvs_path_db.get_path(empty_subdirectory_id) logger.verbose( " New Directory %s" % (lod.get_path(empty_subdirectory.cvs_path),) ) # There is no need to record the empty subdirectories in the # mirror, since they live and die with their parent directories. self._invoke_delegates('mkdir', lod, empty_subdirectory) self._make_empty_subdirectories(empty_subdirectory, lod) def _mkdir_p(self, cvs_directory, lod): """Make sure that CVS_DIRECTORY exists in LOD. If not, create it, calling delegates. Return the node for CVS_DIRECTORY.""" ancestry = cvs_directory.get_ancestry() try: node = self._mirror.get_current_lod_directory(lod) except KeyError: logger.verbose(" Initializing %s" % (lod,)) node = self._mirror.add_lod(lod) self._invoke_delegates('initialize_lod', lod) if ancestry and Ctx().include_empty_directories: self._make_empty_subdirectories(ancestry[0], lod) for sub_path in ancestry[1:]: try: node = node[sub_path] except KeyError: logger.verbose( " New Directory %s" % (lod.get_path(sub_path.cvs_path),) ) node = node.mkdir(sub_path) self._invoke_delegates('mkdir', lod, sub_path) if Ctx().include_empty_directories: self._make_empty_subdirectories(sub_path, lod) if node is None: raise self.ExpectedDirectoryError( 'File found at \'%s\' where directory was expected.' % (sub_path,) ) return node def add_path(self, cvs_rev): """Add the CVS_REV's svn_path to the repository mirror. Create any missing intermediate paths.""" cvs_file = cvs_rev.cvs_file parent_path = cvs_file.parent_directory lod = cvs_rev.lod parent_node = self._mkdir_p(parent_path, lod) logger.verbose(" Adding %s" % (cvs_rev.get_svn_path(),)) parent_node.add_file(cvs_file) self._invoke_delegates('add_path', cvs_rev) def _show_copy(self, src_path, dest_path, src_revnum): """Print a line stating that we are 'copying' revision SRC_REVNUM of SRC_PATH to DEST_PATH.""" logger.verbose( " Copying revision %d of %s\n" " to %s\n" % (src_revnum, src_path, dest_path,) ) def copy_lod(self, src_lod, dest_lod, src_revnum): """Copy all of SRC_LOD at SRC_REVNUM to DST_LOD. In the youngest revision of the repository, the destination LOD *must not* already exist. Return the new node at DEST_LOD. Note that this node is not necessarily writable, though its parent node necessarily is.""" self._show_copy(src_lod.get_path(), dest_lod.get_path(), src_revnum) node = self._mirror.copy_lod(src_lod, dest_lod, src_revnum) self._invoke_delegates('copy_lod', src_lod, dest_lod, src_revnum) return node def copy_path( self, cvs_path, src_lod, dest_lod, src_revnum, create_parent=False ): """Copy CVS_PATH from SRC_LOD at SRC_REVNUM to DST_LOD. In the youngest revision of the repository, the destination's parent *must* exist unless CREATE_PARENT is specified. But the destination itself *must not* exist. 
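For illustration only (an editor's sketch; 'output_option', 'a_file', 'trunk', and 'a_branch' are hypothetical objects of the obvious types):

    node = output_option.copy_path(a_file, trunk, a_branch, 17, create_parent=True)

copies the file as it stood in Subversion revision 17 on trunk onto the branch, creating any missing parent directories first.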
Return the new node at (CVS_PATH, DEST_LOD), as a CurrentMirrorDirectory.""" if cvs_path.parent_directory is None: return self.copy_lod(src_lod, dest_lod, src_revnum) # Get the node of our source, or None if it is a file: src_node = self._mirror.get_old_path(cvs_path, src_lod, src_revnum) # Get the parent path of the destination: if create_parent: dest_parent_node = self._mkdir_p(cvs_path.parent_directory, dest_lod) else: try: dest_parent_node = self._mirror.get_current_path( cvs_path.parent_directory, dest_lod ) except KeyError: raise self.ParentMissingError( 'Attempt to add path \'%s\' to repository mirror, ' 'but its parent directory doesn\'t exist in the mirror.' % (dest_lod.get_path(cvs_path.cvs_path),) ) if cvs_path in dest_parent_node: raise PathExistsError( 'Attempt to add path \'%s\' to repository mirror ' 'when it already exists in the mirror.' % (dest_lod.get_path(cvs_path.cvs_path),) ) self._show_copy( src_lod.get_path(cvs_path.cvs_path), dest_lod.get_path(cvs_path.cvs_path), src_revnum, ) dest_parent_node[cvs_path] = src_node self._invoke_delegates( 'copy_path', cvs_path, src_lod, dest_lod, src_revnum ) return dest_parent_node[cvs_path] def fill_symbol(self, svn_symbol_commit, fill_source): """Perform all copies for the CVSSymbols in SVN_SYMBOL_COMMIT. The symbolic name is guaranteed to exist in the Subversion repository by the end of this call, even if there are no paths under it.""" symbol = svn_symbol_commit.symbol try: dest_node = self._mirror.get_current_lod_directory(symbol) except KeyError: self._fill_directory(symbol, None, fill_source, None) else: self._fill_directory(symbol, dest_node, fill_source, None) def _fill_directory(self, symbol, dest_node, fill_source, parent_source): """Fill the tag or branch SYMBOL at the path indicated by FILL_SOURCE. Use items from FILL_SOURCE, and recurse into the child items. Fill SYMBOL starting at the path FILL_SOURCE.cvs_path. DEST_NODE is the node of this destination path, or None if the destination does not yet exist. All directories above this path have already been filled. FILL_SOURCE is a FillSource instance describing the items within a subtree of the repository that still need to be copied to the destination. PARENT_SOURCE is the SVNRevisionRange that was used to copy the parent directory, if it was copied in this commit. We prefer to copy from the same source as was used for the parent, since it typically requires less touching-up. If PARENT_SOURCE is None, then the parent directory was not copied in this commit, so no revision is preferable to any other.""" copy_source = fill_source.compute_best_source(parent_source) # Figure out if we shall copy to this destination and delete any # destination path that is in the way. 
if dest_node is None: # The destination does not exist at all, so it definitely has to # be copied: dest_node = self.copy_path( fill_source.cvs_path, copy_source.source_lod, symbol, copy_source.opening_revnum ) elif (parent_source is not None) and ( copy_source.source_lod != parent_source.source_lod or copy_source.opening_revnum != parent_source.opening_revnum ): # The parent path was copied from a different source than we # need to use, so we have to delete the version that was copied # with the parent then re-copy from the correct source: self.delete_path(fill_source.cvs_path, symbol) dest_node = self.copy_path( fill_source.cvs_path, copy_source.source_lod, symbol, copy_source.opening_revnum ) else: copy_source = parent_source # The map {CVSPath : FillSource} of entries within this directory # that need filling: src_entries = fill_source.get_subsource_map() if copy_source is not None: self._prune_extra_entries( fill_source.cvs_path, symbol, dest_node, src_entries ) return self._cleanup_filled_directory( symbol, dest_node, src_entries, copy_source ) def _cleanup_filled_directory( self, symbol, dest_node, src_entries, copy_source ): """The directory at DEST_NODE has been filled and pruned; recurse. Recurse into the SRC_ENTRIES, in alphabetical order. If DEST_NODE was copied in this revision, COPY_SOURCE should indicate where it was copied from; otherwise, COPY_SOURCE should be None.""" cvs_paths = src_entries.keys() cvs_paths.sort() for cvs_path in cvs_paths: if isinstance(cvs_path, CVSDirectory): # Path is a CVSDirectory: try: dest_subnode = dest_node[cvs_path] except KeyError: # Path doesn't exist yet; it has to be created: dest_node = self._fill_directory( symbol, None, src_entries[cvs_path], None ).parent_mirror_dir else: # Path already exists, but might have to be cleaned up: dest_node = self._fill_directory( symbol, dest_subnode, src_entries[cvs_path], copy_source ).parent_mirror_dir else: # Path is a CVSFile: self._fill_file( symbol, cvs_path in dest_node, src_entries[cvs_path], copy_source ) # Reread dest_node since the call to _fill_file() might have # made it writable: dest_node = self._mirror.get_current_path( dest_node.cvs_path, dest_node.lod ) return dest_node def _fill_file(self, symbol, dest_existed, fill_source, parent_source): """Fill the tag or branch SYMBOL at the path indicated by FILL_SOURCE. Use items from FILL_SOURCE. Fill SYMBOL at path FILL_SOURCE.cvs_path. DEST_NODE is the node of this destination path, or None if the destination does not yet exist. All directories above this path have already been filled as needed. FILL_SOURCE is a FillSource instance describing the item that needs to be copied to the destination. PARENT_SOURCE is the source from which the parent directory was copied, or None if the parent directory was not copied during this commit. We prefer to copy from PARENT_SOURCE, since it typically requires less touching-up. If PARENT_SOURCE is None, then the parent directory was not copied in this commit, so no revision is preferable to any other.""" copy_source = fill_source.compute_best_source(parent_source) # Figure out if we shall copy to this destination and delete any # destination path that is in the way. 
if not dest_existed: # The destination does not exist at all, so it definitely has to # be copied: self.copy_path( fill_source.cvs_path, copy_source.source_lod, symbol, copy_source.opening_revnum ) elif (parent_source is not None) and ( copy_source.source_lod != parent_source.source_lod or copy_source.opening_revnum != parent_source.opening_revnum ): # The parent path was copied from a different source than we # need to use, so we have to delete the version that was copied # with the parent and then re-copy from the correct source: self.delete_path(fill_source.cvs_path, symbol) self.copy_path( fill_source.cvs_path, copy_source.source_lod, symbol, copy_source.opening_revnum ) def _prune_extra_entries( self, dest_cvs_path, symbol, dest_node, src_entries ): """Delete any entries in DEST_NODE that are not in SRC_ENTRIES.""" delete_list = [ cvs_path for cvs_path in dest_node if cvs_path not in src_entries ] # Sort the delete list so that the output is in a consistent # order: delete_list.sort() for cvs_path in delete_list: logger.verbose(" Deleting %s" % (symbol.get_path(cvs_path.cvs_path),)) del dest_node[cvs_path] self._invoke_delegates('delete_path', symbol, cvs_path) def add_delegate(self, delegate): """Adds DELEGATE to self._delegates. For every delegate you add, whenever a repository action method is performed, delegate's corresponding repository action method is called. Multiple delegates will be called in the order that they are added. See SVNRepositoryDelegate for more information.""" self._delegates.append(delegate) def _invoke_delegates(self, method, *args): """Invoke a method on each delegate. Iterate through each of our delegates, in the order that they were added, and call the delegate's method named METHOD with the arguments in ARGS.""" for delegate in self._delegates: getattr(delegate, method)(*args) def process_initial_project_commit(self, svn_commit): self.start_commit(svn_commit.revnum, self._get_revprops(svn_commit)) for project in svn_commit.projects: self.initialize_project(project) self.end_commit() def process_primary_commit(self, svn_commit): self.start_commit(svn_commit.revnum, self._get_revprops(svn_commit)) # This actually commits CVSRevisions if len(svn_commit.cvs_revs) > 1: plural = "s" else: plural = "" logger.verbose("Committing %d CVSRevision%s" % (len(svn_commit.cvs_revs), plural)) for cvs_rev in svn_commit.cvs_revs: if isinstance(cvs_rev, CVSRevisionNoop): pass elif isinstance(cvs_rev, CVSRevisionDelete): self.delete_path(cvs_rev.cvs_file, cvs_rev.lod, Ctx().prune) elif isinstance(cvs_rev, CVSRevisionAdd): self.add_path(cvs_rev) elif isinstance(cvs_rev, CVSRevisionChange): self.change_path(cvs_rev) self.end_commit() def process_post_commit(self, svn_commit): self.start_commit(svn_commit.revnum, self._get_revprops(svn_commit)) logger.verbose( 'Synchronizing default branch motivated by %d' % (svn_commit.motivating_revnum,) ) for cvs_rev in svn_commit.cvs_revs: trunk = cvs_rev.cvs_file.project.get_trunk() if isinstance(cvs_rev, CVSRevisionAdd): # Copy from branch to trunk: self.copy_path( cvs_rev.cvs_file, cvs_rev.lod, trunk, svn_commit.motivating_revnum, True ) elif isinstance(cvs_rev, CVSRevisionChange): # Delete old version of the path on trunk... 
self.delete_path(cvs_rev.cvs_file, trunk) # ...and copy the new version over from branch: self.copy_path( cvs_rev.cvs_file, cvs_rev.lod, trunk, svn_commit.motivating_revnum, True ) elif isinstance(cvs_rev, CVSRevisionDelete): # Delete trunk path: self.delete_path(cvs_rev.cvs_file, trunk) elif isinstance(cvs_rev, CVSRevisionNoop): # Do nothing pass else: raise InternalError('Unexpected CVSRevision type: %s' % (cvs_rev,)) self.end_commit() def process_branch_commit(self, svn_commit): self.start_commit(svn_commit.revnum, self._get_revprops(svn_commit)) logger.verbose('Filling branch:', svn_commit.symbol.name) # Get the set of sources for the symbolic name: source_set = get_source_set( svn_commit.symbol, self._symbolings_reader.get_range_map(svn_commit), ) self.fill_symbol(svn_commit, source_set) self.end_commit() def process_tag_commit(self, svn_commit): self.start_commit(svn_commit.revnum, self._get_revprops(svn_commit)) logger.verbose('Filling tag:', svn_commit.symbol.name) # Get the set of sources for the symbolic name: source_set = get_source_set( svn_commit.symbol, self._symbolings_reader.get_range_map(svn_commit), ) self.fill_symbol(svn_commit, source_set) self.end_commit() def cleanup(self): self._invoke_delegates('finish') logger.verbose("Finished creating Subversion repository.") logger.quiet("Done.") self._mirror.close() self._mirror = None Ctx().revision_reader.finish() self._symbolings_reader.close() del self._symbolings_reader class DumpfileOutputOption(SVNOutputOption): """Output the result of the conversion into a dumpfile.""" def __init__(self, dumpfile_path, author_transforms=None): SVNOutputOption.__init__(self, author_transforms) self.dumpfile_path = dumpfile_path def check(self): pass def setup(self, svn_rev_count): logger.quiet("Starting Subversion Dumpfile.") SVNOutputOption.setup(self, svn_rev_count) if not Ctx().dry_run: self.add_delegate( DumpstreamDelegate( Ctx().revision_reader, open(self.dumpfile_path, 'wb') ) ) class RepositoryOutputOption(SVNOutputOption): """Output the result of the conversion into an SVN repository.""" def __init__(self, target, author_transforms=None): SVNOutputOption.__init__(self, author_transforms) self.target = target def check(self): if not Ctx().dry_run: # Verify that svnadmin can be executed. The 'help' subcommand # should be harmless. try: check_command_runs([Ctx().svnadmin_executable, 'help'], 'svnadmin') except CommandFailedException, e: raise FatalError( '%s\n' 'svnadmin could not be executed. Please ensure that it is\n' 'installed and/or use the --svnadmin option.' % (e,)) def setup(self, svn_rev_count): logger.quiet("Starting Subversion Repository.") SVNOutputOption.setup(self, svn_rev_count) if not Ctx().dry_run: self.add_delegate( DumpstreamDelegate(Ctx().revision_reader, LoaderPipe(self.target)) ) class NewRepositoryOutputOption(RepositoryOutputOption): """Output the result of the conversion into a new SVN repository.""" def __init__( self, target, fs_type=None, bdb_txn_nosync=None, author_transforms=None, create_options=[], ): RepositoryOutputOption.__init__(self, target, author_transforms) self.bdb_txn_nosync = bdb_txn_nosync # Determine the options to be passed to "svnadmin create": if not fs_type: # User didn't say what kind repository (bdb, fsfs, etc). We # still pass --bdb-txn-nosync. It's a no-op if the default # repository type doesn't support it, but we definitely want it # if BDB is the default. self.create_options = ['--bdb-txn-nosync'] elif fs_type == 'bdb': # User explicitly specified bdb. 
# # Since this is a BDB repository, pass --bdb-txn-nosync, because # it gives us a 4-5x speed boost (if cvs2svn is creating the # repository, cvs2svn should be the only program accessing the # svn repository until cvs2svn is done). But we'll turn no-sync # off in self.finish(), unless instructed otherwise. self.create_options = ['--fs-type=bdb', '--bdb-txn-nosync'] else: # User specified something other than bdb. self.create_options = ['--fs-type=%s' % fs_type] # Now append the user's explicitly-set create options: self.create_options += create_options def check(self): RepositoryOutputOption.check(self) if not Ctx().dry_run and os.path.exists(self.target): raise FatalError("the svn-repos-path '%s' exists.\n" "Remove it, or pass '--existing-svnrepos'." % self.target) def setup(self, svn_rev_count): logger.normal("Creating new repository '%s'" % (self.target)) if Ctx().dry_run: # Do not actually create repository: pass else: call_command([ Ctx().svnadmin_executable, 'create', ] + self.create_options + [ self.target ]) RepositoryOutputOption.setup(self, svn_rev_count) def cleanup(self): RepositoryOutputOption.cleanup(self) # If this is a BDB repository, and we created the repository, and # --bdb-no-sync wasn't passed, then comment out the DB_TXN_NOSYNC # line in the DB_CONFIG file, because txn syncing should be on by # default in BDB repositories. # # We determine if this is a BDB repository by looking for the # DB_CONFIG file, which doesn't exist in FSFS, rather than by # checking self.fs_type. That way this code will Do The Right # Thing in all circumstances. db_config = os.path.join(self.target, "db/DB_CONFIG") if Ctx().dry_run: # Do not change repository: pass elif not self.bdb_txn_nosync and os.path.exists(db_config): no_sync = 'set_flags DB_TXN_NOSYNC\n' contents = open(db_config, 'r').readlines() index = contents.index(no_sync) contents[index] = '# ' + no_sync open(db_config, 'w').writelines(contents) class ExistingRepositoryOutputOption(RepositoryOutputOption): """Output the result of the conversion into an existing SVN repository.""" def __init__(self, target, author_transforms=None): RepositoryOutputOption.__init__(self, target, author_transforms) def check(self): RepositoryOutputOption.check(self) if not os.path.isdir(self.target): raise FatalError("the svn-repos-path '%s' is not an " "existing directory." % self.target) cvs2svn-2.4.0/cvs2svn_lib/cvs_path_database.py0000664000076500007650000000534011500107341022434 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
# ==================================================================== """This module contains database facilities used by cvs2svn.""" import cPickle from cvs2svn_lib import config from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.common import DB_OPEN_NEW from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.cvs_path import CVSPath class CVSPathDatabase: """A database to store CVSPath objects and retrieve them by their id. All RCS files within every CVS project repository are recorded here as CVSFile instances, and all directories within every CVS project repository (including empty directories) are recorded here as CVSDirectory instances.""" def __init__(self, mode): """Initialize an instance, opening database in MODE (where MODE is either DB_OPEN_NEW or DB_OPEN_READ).""" self.mode = mode # A map { id : CVSPath } self._cvs_paths = {} if self.mode == DB_OPEN_NEW: pass elif self.mode == DB_OPEN_READ: f = open(artifact_manager.get_temp_file(config.CVS_PATHS_DB), 'rb') cvs_paths = cPickle.load(f) for cvs_path in cvs_paths: self._cvs_paths[cvs_path.id] = cvs_path else: raise RuntimeError('Invalid mode %r' % self.mode) def set_cvs_path_ordinals(self): cvs_files = sorted(self.itervalues(), key=CVSPath.sort_key) for (i, cvs_file) in enumerate(cvs_files): cvs_file.ordinal = i def log_path(self, cvs_path): """Add CVS_PATH, a CVSPath instance, to the database.""" if self.mode == DB_OPEN_READ: raise RuntimeError('Cannot write items in mode %r' % self.mode) self._cvs_paths[cvs_path.id] = cvs_path def itervalues(self): return self._cvs_paths.itervalues() def get_path(self, id): """Return the CVSPath with the specified ID.""" return self._cvs_paths[id] def close(self): if self.mode == DB_OPEN_NEW: self.set_cvs_path_ordinals() f = open(artifact_manager.get_temp_file(config.CVS_PATHS_DB), 'wb') cPickle.dump(self._cvs_paths.values(), f, -1) f.close() self._cvs_paths = None cvs2svn-2.4.0/cvs2svn_lib/git_output_option.py0000664000076500007650000004455111710517256022620 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Classes for outputting the converted repository to git. 
For information about the format allowed by git-fast-import, see: http://www.kernel.org/pub/software/scm/git/docs/git-fast-import.html """ import bisect import time from cvs2svn_lib.common import InternalError from cvs2svn_lib.log import logger from cvs2svn_lib.context import Ctx from cvs2svn_lib.symbol import Trunk from cvs2svn_lib.symbol import Branch from cvs2svn_lib.symbol import Tag from cvs2svn_lib.cvs_item import CVSSymbol from cvs2svn_lib.dvcs_common import DVCSOutputOption from cvs2svn_lib.dvcs_common import MirrorUpdater from cvs2svn_lib.key_generator import KeyGenerator class GitRevisionWriter(MirrorUpdater): def start(self, mirror, f): super(GitRevisionWriter, self).start(mirror) self.f = f def _modify_file(self, cvs_item, post_commit): raise NotImplementedError() def add_file(self, cvs_rev, post_commit): super(GitRevisionWriter, self).add_file(cvs_rev, post_commit) self._modify_file(cvs_rev, post_commit) def modify_file(self, cvs_rev, post_commit): super(GitRevisionWriter, self).modify_file(cvs_rev, post_commit) self._modify_file(cvs_rev, post_commit) def delete_file(self, cvs_rev, post_commit): super(GitRevisionWriter, self).delete_file(cvs_rev, post_commit) self.f.write('D %s\n' % (cvs_rev.cvs_file.cvs_path,)) def branch_file(self, cvs_symbol): super(GitRevisionWriter, self).branch_file(cvs_symbol) self._modify_file(cvs_symbol, post_commit=False) def finish(self): super(GitRevisionWriter, self).finish() del self.f class GitRevisionMarkWriter(GitRevisionWriter): def _modify_file(self, cvs_item, post_commit): if cvs_item.cvs_file.executable: mode = '100755' else: mode = '100644' self.f.write( 'M %s :%d %s\n' % (mode, cvs_item.revision_reader_token, cvs_item.cvs_file.cvs_path,) ) class GitRevisionInlineWriter(GitRevisionWriter): def __init__(self, revision_reader): self.revision_reader = revision_reader def register_artifacts(self, which_pass): GitRevisionWriter.register_artifacts(self, which_pass) self.revision_reader.register_artifacts(which_pass) def start(self, mirror, f): GitRevisionWriter.start(self, mirror, f) self.revision_reader.start() def _modify_file(self, cvs_item, post_commit): if cvs_item.cvs_file.executable: mode = '100755' else: mode = '100644' self.f.write( 'M %s inline %s\n' % (mode, cvs_item.cvs_file.cvs_path,) ) if isinstance(cvs_item, CVSSymbol): cvs_rev = cvs_item.get_cvs_revision_source(Ctx()._cvs_items_db) else: cvs_rev = cvs_item # FIXME: We have to decide what to do about keyword substitution # and eol_style here: fulltext = self.revision_reader.get_content(cvs_rev) self.f.write('data %d\n' % (len(fulltext),)) self.f.write(fulltext) self.f.write('\n') def finish(self): GitRevisionWriter.finish(self) self.revision_reader.finish() class GitOutputOption(DVCSOutputOption): """An OutputOption that outputs to a git-fast-import formatted file. Members: dump_filename -- (string) the name of the file to which the git-fast-import commands for defining revisions will be written. author_transforms -- a map from CVS author names to git full name and email address. See DVCSOutputOption.normalize_author_transforms() for information about the form of this parameter. """ name = "Git" # The first mark number used for git-fast-import commit marks. This # value needs to be large to avoid conflicts with blob marks. _first_commit_mark = 1000000000 def __init__( self, dump_filename, revision_writer, author_transforms=None, tie_tag_fixup_branches=False, ): """Constructor. 
DUMP_FILENAME is the name of the file to which the git-fast-import commands for defining revisions should be written. (Please note that depending on the style of revision writer, the actual file contents might not be written to this file.) REVISION_WRITER is a GitRevisionWriter that is used to output either the content of revisions or a mark that was previously used to label a blob. AUTHOR_TRANSFORMS is a map {cvsauthor : (fullname, email)} from CVS author names to git full name and email address. All of the contents should either be Unicode strings or 8-bit strings encoded as UTF-8. TIE_TAG_FIXUP_BRANCHES means whether after finishing with a tag fixup branch, it should be psuedo-merged (ancestry linked but no content changes) back into its source branch, to dispose of the open head. """ DVCSOutputOption.__init__(self) self.dump_filename = dump_filename self.revision_writer = revision_writer self.author_transforms = self.normalize_author_transforms( author_transforms ) self.tie_tag_fixup_branches = tie_tag_fixup_branches self._mark_generator = KeyGenerator(GitOutputOption._first_commit_mark) def register_artifacts(self, which_pass): DVCSOutputOption.register_artifacts(self, which_pass) self.revision_writer.register_artifacts(which_pass) def check_symbols(self, symbol_map): # FIXME: What constraints does git impose on symbols? pass def setup(self, svn_rev_count): DVCSOutputOption.setup(self, svn_rev_count) self.f = open(self.dump_filename, 'wb') # The youngest revnum that has been committed so far: self._youngest = 0 # A map {lod : [(revnum, mark)]} giving each of the revision # numbers in which there was a commit to lod, and the mark active # at the end of the revnum. self._marks = {} self.revision_writer.start(self._mirror, self.f) def _create_commit_mark(self, lod, revnum): mark = self._mark_generator.gen_id() self._set_lod_mark(lod, revnum, mark) return mark def _set_lod_mark(self, lod, revnum, mark): """Record MARK as the status of LOD for REVNUM. If there is already an entry for REVNUM, overwrite it. If not, append a new entry to the self._marks list for LOD.""" assert revnum >= self._youngest entry = (revnum, mark) try: modifications = self._marks[lod] except KeyError: # This LOD hasn't appeared before; create a new list and add the # entry: self._marks[lod] = [entry] else: # A record exists, so it necessarily has at least one element: if modifications[-1][0] == revnum: modifications[-1] = entry else: modifications.append(entry) self._youngest = revnum def _get_author(self, svn_commit): """Return the author to be used for SVN_COMMIT. 
Return the author as a UTF-8 string in the form needed by git fast-import; that is, 'name '.""" cvs_author = svn_commit.get_author() return self._map_author(cvs_author) def _map_author(self, cvs_author): return self.author_transforms.get(cvs_author, "%s <>" % (cvs_author,)) @staticmethod def _get_log_msg(svn_commit): return svn_commit.get_log_msg() def process_initial_project_commit(self, svn_commit): self._mirror.start_commit(svn_commit.revnum) self._mirror.end_commit() def process_primary_commit(self, svn_commit): author = self._get_author(svn_commit) log_msg = self._get_log_msg(svn_commit) lods = set() for cvs_rev in svn_commit.get_cvs_items(): lods.add(cvs_rev.lod) if len(lods) != 1: raise InternalError('Commit affects %d LODs' % (len(lods),)) lod = lods.pop() self._mirror.start_commit(svn_commit.revnum) if isinstance(lod, Trunk): # FIXME: is this correct?: self.f.write('commit refs/heads/master\n') else: self.f.write('commit refs/heads/%s\n' % (lod.name,)) self.f.write( 'mark :%d\n' % (self._create_commit_mark(lod, svn_commit.revnum),) ) self.f.write( 'committer %s %d +0000\n' % (author, svn_commit.date,) ) self.f.write('data %d\n' % (len(log_msg),)) self.f.write('%s\n' % (log_msg,)) for cvs_rev in svn_commit.get_cvs_items(): self.revision_writer.process_revision(cvs_rev, post_commit=False) self.f.write('\n') self._mirror.end_commit() def process_post_commit(self, svn_commit): author = self._get_author(svn_commit) log_msg = self._get_log_msg(svn_commit) source_lods = set() for cvs_rev in svn_commit.cvs_revs: source_lods.add(cvs_rev.lod) if len(source_lods) != 1: raise InternalError('Commit is from %d LODs' % (len(source_lods),)) source_lod = source_lods.pop() self._mirror.start_commit(svn_commit.revnum) # FIXME: is this correct?: self.f.write('commit refs/heads/master\n') self.f.write( 'mark :%d\n' % (self._create_commit_mark(None, svn_commit.revnum),) ) self.f.write( 'committer %s %d +0000\n' % (author, svn_commit.date,) ) self.f.write('data %d\n' % (len(log_msg),)) self.f.write('%s\n' % (log_msg,)) self.f.write( 'merge :%d\n' % (self._get_source_mark(source_lod, svn_commit.revnum),) ) for cvs_rev in svn_commit.cvs_revs: self.revision_writer.process_revision(cvs_rev, post_commit=True) self.f.write('\n') self._mirror.end_commit() def _get_source_mark(self, source_lod, revnum): """Return the mark active on SOURCE_LOD at the end of REVNUM.""" modifications = self._marks[source_lod] i = bisect.bisect_left(modifications, (revnum + 1,)) - 1 (revnum, mark) = modifications[i] return mark def describe_lod_to_user(self, lod): """This needs to make sense to users of the fastimported result.""" if isinstance(lod, Trunk): return 'master' else: return lod.name def _describe_commit(self, svn_commit, lod): author = self._map_author(svn_commit.get_author()) if author.endswith(" <>"): author = author[:-3] date = time.strftime( "%Y-%m-%d %H:%M:%S UTC", time.gmtime(svn_commit.date) ) log_msg = svn_commit.get_log_msg() if log_msg.find('\n') != -1: log_msg = log_msg[:log_msg.index('\n')] return "%s %s %s '%s'" % ( self.describe_lod_to_user(lod), date, author, log_msg,) def _process_symbol_commit(self, svn_commit, git_branch, source_groups): author = self._get_author(svn_commit) log_msg = self._get_log_msg(svn_commit) # There are two distinct cases we need to care for here: # 1. initial creation of a LOD # 2. 
fixup of an existing LOD to include more files, because the LOD in # CVS was created piecemeal over time, with intervening commits # We look at _marks here, but self._mirror._get_lod_history(lod).exists() # might be technically more correct (though _get_lod_history is currently # underscore-private) is_initial_lod_creation = svn_commit.symbol not in self._marks # Create the mark, only after the check above mark = self._create_commit_mark(svn_commit.symbol, svn_commit.revnum) if is_initial_lod_creation: # Get the primary parent p_source_revnum, p_source_lod, p_cvs_symbols = source_groups[0] try: p_source_node = self._mirror.get_old_lod_directory( p_source_lod, p_source_revnum ) except KeyError: raise InternalError('Source %r does not exist' % (p_source_lod,)) cvs_files_to_delete = set(self._get_all_files(p_source_node)) for (source_revnum, source_lod, cvs_symbols,) in source_groups: for cvs_symbol in cvs_symbols: cvs_files_to_delete.discard(cvs_symbol.cvs_file) # Write a trailer to the log message which describes the cherrypicks that # make up this symbol creation. log_msg += "\n" if is_initial_lod_creation: log_msg += "\nSprout from %s" % ( self._describe_commit( Ctx()._persistence_manager.get_svn_commit(p_source_revnum), p_source_lod ), ) for (source_revnum, source_lod, cvs_symbols,) \ in source_groups[(is_initial_lod_creation and 1 or 0):]: log_msg += "\nCherrypick from %s:" % ( self._describe_commit( Ctx()._persistence_manager.get_svn_commit(source_revnum), source_lod ), ) for cvs_path in sorted( cvs_symbol.cvs_file.cvs_path for cvs_symbol in cvs_symbols ): log_msg += "\n %s" % (cvs_path,) if is_initial_lod_creation: if cvs_files_to_delete: log_msg += "\nDelete:" for cvs_path in sorted( cvs_file.cvs_path for cvs_file in cvs_files_to_delete ): log_msg += "\n %s" % (cvs_path,) self.f.write('commit %s\n' % (git_branch,)) self.f.write('mark :%d\n' % (mark,)) self.f.write('committer %s %d +0000\n' % (author, svn_commit.date,)) self.f.write('data %d\n' % (len(log_msg),)) self.f.write('%s\n' % (log_msg,)) # Only record actual DVCS ancestry for the primary sprout parent, # all the rest are effectively cherrypicks. 
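    # For orientation, a sketch (with hypothetical branch name, marks,
    # and sizes; not taken from the original sources) of the
    # git-fast-import stream that this method has written so far and is
    # about to finish:
    #
    #   commit refs/heads/BRANCHNAME
    #   mark :1000000042
    #   committer jrandom <jrandom> 1046066725 +0000
    #   data <length of log message>
    #   <log message plus the Sprout/Cherrypick/Delete trailer built above>
    #   from :1000000007        (only on initial creation of the LOD)
    #   M 100644 :12345 path/to/file.c
    #   D path/to/removed-file.c
    #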
if is_initial_lod_creation: self.f.write( 'from :%d\n' % (self._get_source_mark(p_source_lod, p_source_revnum),) ) for (source_revnum, source_lod, cvs_symbols,) in source_groups: for cvs_symbol in cvs_symbols: self.revision_writer.branch_file(cvs_symbol) if is_initial_lod_creation: for cvs_file in cvs_files_to_delete: self.f.write('D %s\n' % (cvs_file.cvs_path,)) self.f.write('\n') return mark def process_branch_commit(self, svn_commit): self._mirror.start_commit(svn_commit.revnum) source_groups = self._get_source_groups(svn_commit) if self._is_simple_copy(svn_commit, source_groups): (source_revnum, source_lod, cvs_symbols) = source_groups[0] logger.debug( '%s will be created via a simple copy from %s:r%d' % (svn_commit.symbol, source_lod, source_revnum,) ) mark = self._get_source_mark(source_lod, source_revnum) self._set_symbol(svn_commit.symbol, mark) self._mirror.copy_lod(source_lod, svn_commit.symbol, source_revnum) self._set_lod_mark(svn_commit.symbol, svn_commit.revnum, mark) else: logger.debug( '%s will be created via fixup commit(s)' % (svn_commit.symbol,) ) self._process_symbol_commit( svn_commit, 'refs/heads/%s' % (svn_commit.symbol.name,), source_groups, ) self._mirror.end_commit() def _set_symbol(self, symbol, mark): if isinstance(symbol, Branch): category = 'heads' elif isinstance(symbol, Tag): category = 'tags' else: raise InternalError() self.f.write('reset refs/%s/%s\n' % (category, symbol.name,)) self.f.write('from :%d\n' % (mark,)) def get_tag_fixup_branch_name(self, svn_commit): # The branch name to use for the "tag fixup branches". The # git-fast-import documentation suggests using 'TAG_FIXUP' # (outside of the refs/heads namespace), but this is currently # broken. Use a name containing '.', which is not allowed in CVS # symbols, to avoid conflicts (though of course a conflict could # still result if the user requests symbol transformations). return 'refs/heads/TAG.FIXUP' def process_tag_commit(self, svn_commit): # FIXME: For now we create a fixup branch with the same name as # the tag, then the tag. We never delete the fixup branch. 
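    # Overview sketch (tag name and mark values are hypothetical, for
    # illustration only) of the stream emitted when the tag cannot be
    # created as a simple copy:
    #
    #   commit refs/heads/TAG.FIXUP     (one or more fixup commits,
    #   ...                              written by _process_symbol_commit)
    #   reset refs/tags/TAGNAME         (written by _set_symbol)
    #   from :<mark of the last fixup commit>
    #   reset refs/heads/TAG.FIXUP
    #
    # and, if tie_tag_fixup_branches is set, one further commit on the
    # source branch containing a 'merge :<mark>' line that ties the
    # fixup branch's ancestry back into that branch.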
self._mirror.start_commit(svn_commit.revnum) source_groups = self._get_source_groups(svn_commit) if self._is_simple_copy(svn_commit, source_groups): (source_revnum, source_lod, cvs_symbols) = source_groups[0] logger.debug( '%s will be created via a simple copy from %s:r%d' % (svn_commit.symbol, source_lod, source_revnum,) ) mark = self._get_source_mark(source_lod, source_revnum) self._set_symbol(svn_commit.symbol, mark) self._mirror.copy_lod(source_lod, svn_commit.symbol, source_revnum) self._set_lod_mark(svn_commit.symbol, svn_commit.revnum, mark) else: logger.debug( '%s will be created via a fixup branch' % (svn_commit.symbol,) ) fixup_branch_name = self.get_tag_fixup_branch_name(svn_commit) # Create the fixup branch (which might involve making more than # one commit): mark = self._process_symbol_commit( svn_commit, fixup_branch_name, source_groups ) # Store the mark of the last commit to the fixup branch as the # value of the tag: self._set_symbol(svn_commit.symbol, mark) self.f.write('reset %s\n' % (fixup_branch_name,)) self.f.write('\n') if self.tie_tag_fixup_branches: source_lod = source_groups[0][1] source_lod_git_branch = \ 'refs/heads/%s' % (getattr(source_lod, 'name', 'master'),) mark2 = self._create_commit_mark(source_lod, svn_commit.revnum) author = self._map_author(Ctx().username) log_msg = self._get_log_msg_for_ancestry_tie(svn_commit) self.f.write('commit %s\n' % (source_lod_git_branch,)) self.f.write('mark :%d\n' % (mark2,)) self.f.write('committer %s %d +0000\n' % (author, svn_commit.date,)) self.f.write('data %d\n' % (len(log_msg),)) self.f.write('%s\n' % (log_msg,)) self.f.write( 'merge :%d\n' % (mark,) ) self.f.write('\n') self._mirror.end_commit() def _get_log_msg_for_ancestry_tie(self, svn_commit): return Ctx().text_wrapper.fill( Ctx().tie_tag_ancestry_message % { 'symbol_name' : svn_commit.symbol.name, } ) def cleanup(self): DVCSOutputOption.cleanup(self) self.revision_writer.finish() self.f.close() del self.f cvs2svn-2.4.0/cvs2svn_lib/persistence_manager.py0000664000076500007650000001004211710517256023027 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
# ==================================================================== """This module contains class PersistenceManager.""" from cvs2svn_lib import config from cvs2svn_lib.common import DB_OPEN_NEW from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.common import SVN_INVALID_REVNUM from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.record_table import SignedIntegerPacker from cvs2svn_lib.record_table import RecordTable from cvs2svn_lib.serializer import PrimedPickleSerializer from cvs2svn_lib.indexed_database import IndexedDatabase from cvs2svn_lib.svn_commit import SVNRevisionCommit from cvs2svn_lib.svn_commit import SVNInitialProjectCommit from cvs2svn_lib.svn_commit import SVNPrimaryCommit from cvs2svn_lib.svn_commit import SVNBranchCommit from cvs2svn_lib.svn_commit import SVNTagCommit from cvs2svn_lib.svn_commit import SVNPostCommit class PersistenceManager: """The PersistenceManager allows us to effectively store SVNCommits to disk and retrieve them later using only their subversion revision number as the key. It also returns the subversion revision number for a given CVSRevision's unique key. All information pertinent to each SVNCommit is stored in a series of on-disk databases so that SVNCommits can be retrieved on-demand. MODE is one of the constants DB_OPEN_NEW or DB_OPEN_READ. In 'new' mode, PersistenceManager will initialize a new set of on-disk databases and be fully-featured. In 'read' mode, PersistenceManager will open existing on-disk databases and the set_* methods will be unavailable.""" def __init__(self, mode): self.mode = mode if mode not in (DB_OPEN_NEW, DB_OPEN_READ): raise RuntimeError("Invalid 'mode' argument to PersistenceManager") primer = ( SVNInitialProjectCommit, SVNPrimaryCommit, SVNPostCommit, SVNBranchCommit, SVNTagCommit, ) serializer = PrimedPickleSerializer(primer) self.svn_commit_db = IndexedDatabase( artifact_manager.get_temp_file(config.SVN_COMMITS_INDEX_TABLE), artifact_manager.get_temp_file(config.SVN_COMMITS_STORE), mode, serializer) self.cvs2svn_db = RecordTable( artifact_manager.get_temp_file(config.CVS_REVS_TO_SVN_REVNUMS), mode, SignedIntegerPacker(SVN_INVALID_REVNUM)) def get_svn_revnum(self, cvs_rev_id): """Return the Subversion revision number in which CVS_REV_ID was committed, or SVN_INVALID_REVNUM if there is no mapping for CVS_REV_ID.""" return self.cvs2svn_db.get(cvs_rev_id, SVN_INVALID_REVNUM) def get_svn_commit(self, svn_revnum): """Return an SVNCommit that corresponds to SVN_REVNUM. If no SVNCommit exists for revnum SVN_REVNUM, then return None.""" return self.svn_commit_db.get(svn_revnum, None) def put_svn_commit(self, svn_commit): """Record the bidirectional mapping between SVN_REVNUM and CVS_REVS and record associated attributes.""" if self.mode == DB_OPEN_READ: raise RuntimeError( 'Write operation attempted on read-only PersistenceManager' ) self.svn_commit_db[svn_commit.revnum] = svn_commit if isinstance(svn_commit, SVNRevisionCommit): for cvs_rev in svn_commit.cvs_revs: self.cvs2svn_db[cvs_rev.id] = svn_commit.revnum def close(self): self.cvs2svn_db.close() self.cvs2svn_db = None self.svn_commit_db.close() self.svn_commit_db = None cvs2svn-2.4.0/cvs2svn_lib/svn_dump.py0000664000076500007650000003525111710517256020655 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2010 CollabNet. All rights reserved. 
# # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains code to output to Subversion dumpfile format.""" import subprocess try: from hashlib import md5 except ImportError: from md5 import new as md5 from cvs2svn_lib.common import CommandError from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import InternalError from cvs2svn_lib.common import path_split from cvs2svn_lib.context import Ctx from cvs2svn_lib.cvs_path import CVSDirectory from cvs2svn_lib.cvs_path import CVSFile from cvs2svn_lib.svn_repository_delegate import SVNRepositoryDelegate # Things that can happen to a file. OP_ADD = 'add' OP_CHANGE = 'change' class DumpstreamDelegate(SVNRepositoryDelegate): """Write output in Subversion dumpfile format.""" def __init__(self, revision_reader, dumpfile): """Return a new DumpstreamDelegate instance. DUMPFILE should be a file-like object opened in binary mode, to which the dump stream will be written. The only methods called on the object are write() and close().""" self._revision_reader = revision_reader self._dumpfile = dumpfile self._write_dumpfile_header() # A set of the basic project infrastructure project directories # that have been created so far, as SVN paths. (The root # directory is considered to be present at initialization.) This # includes all of the LOD paths, and all of their parent # directories etc. self._basic_directories = set(['']) def _write_dumpfile_header(self): """Initialize the dumpfile with the standard headers. Since the CVS repository doesn't have a UUID, and the Subversion repository will be created with one anyway, we don't specify a UUID in the dumpfile.""" self._dumpfile.write('SVN-fs-dump-format-version: 2\n\n') def _utf8_path(self, path): """Return a copy of PATH encoded in UTF-8.""" # Convert each path component separately (as they may each use # different encodings). try: return '/'.join([ Ctx().cvs_filename_decoder(piece).encode('utf8') for piece in path.split('/') ]) except UnicodeError: raise FatalError( "Unable to convert a path '%s' to internal encoding.\n" "Consider rerunning with one or more '--encoding' parameters or\n" "with '--fallback-encoding'." % (path,)) @staticmethod def _string_for_props(properties): """Return PROPERTIES in the form needed for the dumpfile.""" prop_strings = [] for (k, v) in sorted(properties.iteritems()): if k.startswith('_'): # Such properties are for internal use only. pass elif v is None: # None indicates that the property should be left unset. pass else: prop_strings.append('K %d\n%s\nV %d\n%s\n' % (len(k), k, len(v), v)) prop_strings.append('PROPS-END\n') return ''.join(prop_strings) def start_commit(self, revnum, revprops): """Emit the start of SVN_COMMIT (an SVNCommit).""" # The start of a new commit typically looks like this: # # Revision-number: 1 # Prop-content-length: 129 # Content-length: 129 # # K 7 # svn:log # V 27 # Log message for revision 1. 
# K 10 # svn:author # V 7 # jrandom # K 8 # svn:date # V 27 # 2003-04-22T22:57:58.132837Z # PROPS-END # # Notice that the length headers count everything -- not just the # length of the data but also the lengths of the lengths, including # the 'K ' or 'V ' prefixes. # # The reason there are both Prop-content-length and Content-length # is that the former includes just props, while the latter includes # everything. That's the generic header form for any entity in a # dumpfile. But since revisions only have props, the two lengths # are always the same for revisions. # Calculate the output needed for the property definitions. all_prop_strings = self._string_for_props(revprops) total_len = len(all_prop_strings) # Print the revision header and revprops self._dumpfile.write( 'Revision-number: %d\n' 'Prop-content-length: %d\n' 'Content-length: %d\n' '\n' '%s' '\n' % (revnum, total_len, total_len, all_prop_strings) ) def end_commit(self): pass def _make_any_dir(self, path): """Emit the creation of directory PATH.""" self._dumpfile.write( "Node-path: %s\n" "Node-kind: dir\n" "Node-action: add\n" "\n" "\n" % self._utf8_path(path) ) def _register_basic_directory(self, path, create): """Register the creation of PATH if it is not already there. Create any parent directories that do not already exist. If CREATE is set, also create PATH if it doesn't already exist. This method should only be used for the LOD paths and the directories containing them, not for directories within an LOD path.""" if path not in self._basic_directories: # Make sure that the parent directory is present: self._register_basic_directory(path_split(path)[0], True) if create: self._make_any_dir(path) self._basic_directories.add(path) def initialize_project(self, project): """Create any initial directories for the project. The trunk, tags, and branches directories directories are created the first time the project is seen. Be sure not to create parent directories that already exist (e.g., because two directories share part of their paths either within or across projects).""" for path in project.get_initial_directories(): self._register_basic_directory(path, True) def initialize_lod(self, lod): lod_path = lod.get_path() if lod_path: self._register_basic_directory(lod_path, True) def mkdir(self, lod, cvs_directory): self._make_any_dir(lod.get_path(cvs_directory.cvs_path)) def _add_or_change_path(self, cvs_rev, op): """Emit the addition or change corresponding to CVS_REV. OP is either the constant OP_ADD or OP_CHANGE.""" assert op in [OP_ADD, OP_CHANGE] # The property handling here takes advantage of an undocumented # but IMHO consistent feature of the Subversion dumpfile-loading # code. When a node's properties aren't mentioned (that is, the # "Prop-content-length:" header is absent, no properties are # listed at all, and there is no "PROPS-END\n" line) then no # change is made to the node's properties. # # This is consistent with the way dumpfiles behave w.r.t. text # content changes, so I'm comfortable relying on it. If you # commit a change to *just* the properties of some node that # already has text contents from a previous revision, then in the # dumpfile output for the prop change, no "Text-content-length:" # nor "Text-content-md5:" header will be present, and the text of # the file will not be given. But this does not cause the file's # text to be erased! It simply remains unchanged. 
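    # For illustration (hypothetical path and sizes; not from the
    # original sources), a change record that omits the property block
    # entirely looks like this and leaves the node's existing
    # properties untouched:
    #
    #   Node-path: trunk/foo.c
    #   Node-kind: file
    #   Node-action: change
    #   Text-content-length: 42
    #   Text-content-md5: <hex digest of the 42 bytes>
    #   Content-length: 42
    #
    #   <42 bytes of file text>
    #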
# # This works out great for cvs2svn, due to lucky coincidences: # # For files, we set most properties in the first revision and # never change them. (The only exception is the 'cvs2svn:cvs-rev' # property.) If 'cvs2svn:cvs-rev' is not being used, then there # is no need to remember the full set of properties on a given # file once we've set it. # # For directories, the only property we set is "svn:ignore", and # while we may change it after the first revision, we always do so # based on the contents of a ".cvsignore" file -- in other words, # CVS is doing the remembering for us, so we still don't have to # preserve the previous value of the property ourselves. # Calculate the (sorted-by-name) property string and length, if any. svn_props = cvs_rev.get_properties() if cvs_rev.properties_changed: prop_contents = self._string_for_props(svn_props) props_header = 'Prop-content-length: %d\n' % len(prop_contents) else: prop_contents = '' props_header = '' data = self._revision_reader.get_content(cvs_rev) # treat .cvsignore as a directory property dir_path, basename = path_split(cvs_rev.get_svn_path()) if basename == '.cvsignore': ignore_contents = self._string_for_props({ 'svn:ignore' : ''.join((s + '\n') for s in generate_ignores(data)) }) ignore_len = len(ignore_contents) # write headers, then props self._dumpfile.write( 'Node-path: %s\n' 'Node-kind: dir\n' 'Node-action: change\n' 'Prop-content-length: %d\n' 'Content-length: %d\n' '\n' '%s' % (self._utf8_path(dir_path), ignore_len, ignore_len, ignore_contents) ) if not Ctx().keep_cvsignore: return checksum = md5() checksum.update(data) # The content length is the length of property data, text data, # and any metadata around/inside around them: self._dumpfile.write( 'Node-path: %s\n' 'Node-kind: file\n' 'Node-action: %s\n' '%s' # no property header if no props 'Text-content-length: %d\n' 'Text-content-md5: %s\n' 'Content-length: %d\n' '\n' % ( self._utf8_path(cvs_rev.get_svn_path()), op, props_header, len(data), checksum.hexdigest(), len(data) + len(prop_contents), ) ) if prop_contents: self._dumpfile.write(prop_contents) self._dumpfile.write(data) # This record is done (write two newlines -- one to terminate # contents that weren't themselves newline-termination, one to # provide a blank line for readability. self._dumpfile.write('\n\n') def add_path(self, cvs_rev): """Emit the addition corresponding to CVS_REV, a CVSRevisionAdd.""" self._add_or_change_path(cvs_rev, OP_ADD) def change_path(self, cvs_rev): """Emit the change corresponding to CVS_REV, a CVSRevisionChange.""" self._add_or_change_path(cvs_rev, OP_CHANGE) def delete_lod(self, lod): """Emit the deletion of LOD.""" self._dumpfile.write( 'Node-path: %s\n' 'Node-action: delete\n' '\n' % (self._utf8_path(lod.get_path()),) ) self._basic_directories.remove(lod.get_path()) def delete_path(self, lod, cvs_path): dir_path, basename = path_split(lod.get_path(cvs_path.get_cvs_path())) if basename == '.cvsignore': # When a .cvsignore file is deleted, the directory's svn:ignore # property needs to be deleted. 
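    # Illustrative example (values are hypothetical): for a directory
    # whose .cvsignore contains '*.o', _string_for_props() would
    # produce
    #
    #   K 10
    #   svn:ignore
    #   V 4
    #   *.o
    #
    #   PROPS-END
    #
    # whereas writing only 'PROPS-END\n', as below, defines an empty
    # property list and thereby removes the directory's svn:ignore
    # property.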
ignore_contents = 'PROPS-END\n' ignore_len = len(ignore_contents) # write headers, then props self._dumpfile.write( 'Node-path: %s\n' 'Node-kind: dir\n' 'Node-action: change\n' 'Prop-content-length: %d\n' 'Content-length: %d\n' '\n' '%s' % (self._utf8_path(dir_path), ignore_len, ignore_len, ignore_contents) ) if not Ctx().keep_cvsignore: return self._dumpfile.write( 'Node-path: %s\n' 'Node-action: delete\n' '\n' % (self._utf8_path(lod.get_path(cvs_path.cvs_path)),) ) def copy_lod(self, src_lod, dest_lod, src_revnum): # Register the main LOD directory, and create parent directories # as needed: self._register_basic_directory(dest_lod.get_path(), False) self._dumpfile.write( 'Node-path: %s\n' 'Node-kind: dir\n' 'Node-action: add\n' 'Node-copyfrom-rev: %d\n' 'Node-copyfrom-path: %s\n' '\n' % (self._utf8_path(dest_lod.get_path()), src_revnum, self._utf8_path(src_lod.get_path())) ) def copy_path(self, cvs_path, src_lod, dest_lod, src_revnum): if isinstance(cvs_path, CVSFile): node_kind = 'file' if cvs_path.rcs_basename == '.cvsignore': # FIXME: Here we have to adjust the containing directory's # svn:ignore property to reflect the addition of the # .cvsignore file to the LOD! This is awkward because we # don't have the contents of the .cvsignore file available. if not Ctx().keep_cvsignore: return elif isinstance(cvs_path, CVSDirectory): node_kind = 'dir' else: raise InternalError() self._dumpfile.write( 'Node-path: %s\n' 'Node-kind: %s\n' 'Node-action: add\n' 'Node-copyfrom-rev: %d\n' 'Node-copyfrom-path: %s\n' '\n' % ( self._utf8_path(dest_lod.get_path(cvs_path.cvs_path)), node_kind, src_revnum, self._utf8_path(src_lod.get_path(cvs_path.cvs_path)) ) ) def finish(self): """Perform any cleanup necessary after all revisions have been committed.""" self._dumpfile.close() def generate_ignores(raw_ignore_val): ignore_vals = [ ] for ignore in raw_ignore_val.split(): # Reset the list if we encounter a '!' # See http://cvsbook.red-bean.com/cvsbook.html#cvsignore if ignore == '!': ignore_vals = [ ] else: ignore_vals.append(ignore) return ignore_vals class LoaderPipe(object): """A file-like object that writes to 'svnadmin load'. Some error checking and reporting are done when writing.""" def __init__(self, target): self.loader_pipe = subprocess.Popen( [Ctx().svnadmin_executable, 'load', '-q', target], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, ) self.loader_pipe.stdout.close() def write(self, s): try: self.loader_pipe.stdin.write(s) except IOError: raise FatalError( 'svnadmin failed with the following output while ' 'loading the dumpfile:\n%s' % (self.loader_pipe.stderr.read(),) ) def close(self): self.loader_pipe.stdin.close() error_output = self.loader_pipe.stderr.read() exit_status = self.loader_pipe.wait() del self.loader_pipe if exit_status: raise CommandError('svnadmin load', exit_status, error_output) cvs2svn-2.4.0/cvs2svn_lib/serializer.py0000664000076500007650000001044211244046037021162 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. 
For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Picklers and unpicklers that are primed with known objects.""" import cStringIO import marshal import cPickle import zlib class Serializer: """An object able to serialize/deserialize some class of objects.""" def dumpf(self, f, object): """Serialize OBJECT to file-like object F.""" raise NotImplementedError() def dumps(self, object): """Return a string containing OBJECT in serialized form.""" raise NotImplementedError() def loadf(self, f): """Return the next object deserialized from file-like object F.""" raise NotImplementedError() def loads(self, s): """Return the object deserialized from string S.""" raise NotImplementedError() class MarshalSerializer(Serializer): """This class uses the marshal module to serialize/deserialize. This means that it shares the limitations of the marshal module, namely only being able to serialize a few simple python data types without reference loops.""" def dumpf(self, f, object): marshal.dump(object, f) def dumps(self, object): return marshal.dumps(object) def loadf(self, f): return marshal.load(f) def loads(self, s): return marshal.loads(s) class PrimedPickleSerializer(Serializer): """This class acts as a pickler/unpickler with a pre-initialized memo. The picklers and unpicklers are 'pre-trained' to recognize the objects that are in the primer. If objects are recognized from PRIMER, then only their persistent IDs need to be pickled instead of the whole object. (Note that the memos needed for pickling and unpickling are different.) A new pickler/unpickler is created for each use, each time with the memo initialized appropriately for pickling or unpickling.""" def __init__(self, primer): """Prepare to make picklers/unpicklers with the specified primer. The Pickler and Unpickler are 'primed' by pre-pickling PRIMER, which can be an arbitrary object (e.g., a list of objects that are expected to occur frequently in the objects to be serialized).""" f = cStringIO.StringIO() pickler = cPickle.Pickler(f, -1) pickler.dump(primer) self.pickler_memo = pickler.memo unpickler = cPickle.Unpickler(cStringIO.StringIO(f.getvalue())) unpickler.load() self.unpickler_memo = unpickler.memo def dumpf(self, f, object): """Serialize OBJECT to file-like object F.""" pickler = cPickle.Pickler(f, -1) pickler.memo = self.pickler_memo.copy() pickler.dump(object) def dumps(self, object): """Return a string containing OBJECT in serialized form.""" f = cStringIO.StringIO() self.dumpf(f, object) return f.getvalue() def loadf(self, f): """Return the next object deserialized from file-like object F.""" unpickler = cPickle.Unpickler(f) unpickler.memo = self.unpickler_memo.copy() return unpickler.load() def loads(self, s): """Return the object deserialized from string S.""" return self.loadf(cStringIO.StringIO(s)) class CompressingSerializer(Serializer): """This class wraps other Serializers to compress their serialized data.""" def __init__(self, wrapee): """Constructor. 
WRAPEE is the Serializer whose bitstream ought to be compressed.""" self.wrapee = wrapee def dumpf(self, f, object): marshal.dump(zlib.compress(self.wrapee.dumps(object), 9), f) def dumps(self, object): return marshal.dumps(zlib.compress(self.wrapee.dumps(object), 9)) def loadf(self, f): return self.wrapee.loads(zlib.decompress(marshal.load(f))) def loads(self, s): return self.wrapee.loads(zlib.decompress(marshal.loads(s))) cvs2svn-2.4.0/cvs2svn_lib/database.py0000664000076500007650000001250311710517256020561 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains database facilities used by cvs2svn.""" import sys import os import cPickle from cvs2svn_lib.common import DB_OPEN_NEW from cvs2svn_lib.common import warning_prefix from cvs2svn_lib.common import error_prefix from cvs2svn_lib.log import logger # DBM module selection # 1. If we have bsddb3, it is probably newer than bsddb. Fake bsddb = bsddb3, # so that the dbhash module used by anydbm will use bsddb3. try: import bsddb3 sys.modules['bsddb'] = bsddb3 except ImportError: pass # 2. These DBM modules are not good for cvs2svn. import anydbm if anydbm._defaultmod.__name__ in ['dumbdbm', 'dbm']: logger.error( '%s: cvs2svn uses the anydbm package, which depends on lower level ' 'dbm\n' 'libraries. Your system has %s, with which cvs2svn is known to have\n' 'problems. To use cvs2svn, you must install a Python dbm library ' 'other than\n' 'dumbdbm or dbm. See ' 'http://python.org/doc/current/lib/module-anydbm.html\n' 'for more information.\n' % (error_prefix, anydbm._defaultmod.__name__,) ) sys.exit(1) # 3. If we are using the old bsddb185 module, then try prefer gdbm instead. # Unfortunately, gdbm appears not to be trouble free, either. if hasattr(anydbm._defaultmod, 'bsddb') \ and not hasattr(anydbm._defaultmod.bsddb, '__version__'): try: gdbm = __import__('gdbm') except ImportError: logger.warn( '%s: The version of the bsddb module found on your computer ' 'has been\n' 'reported to malfunction on some datasets, causing KeyError ' 'exceptions.\n' % (warning_prefix,) ) else: anydbm._defaultmod = gdbm class Database: """A database that uses a Serializer to store objects of a certain type. The serializer is stored in the database under the key self.serializer_key. (This implies that self.serializer_key may not be used as a key for normal entries.) The backing database is an anydbm-based DBM. """ serializer_key = '_.%$1\t;_ ' def __init__(self, filename, mode, serializer=None): """Constructor. The database stores its Serializer, so none needs to be supplied when opening an existing database.""" # pybsddb3 has a bug which prevents it from working with # Berkeley DB 4.2 if you open the db with 'n' ("new"). 
This # causes the DB_TRUNCATE flag to be passed, which is disallowed # for databases protected by lock and transaction support # (bsddb databases use locking from bsddb version 4.2.4 onwards). # # Therefore, manually perform the removal (we can do this, because # we know that for bsddb - but *not* anydbm in general - the database # consists of one file with the name we specify, rather than several # based on that name). if mode == DB_OPEN_NEW and anydbm._defaultmod.__name__ == 'dbhash': if os.path.isfile(filename): os.unlink(filename) self.db = anydbm.open(filename, 'c') else: self.db = anydbm.open(filename, mode) # Import implementations for many mapping interface methods. for meth_name in ('__delitem__', '__iter__', 'has_key', '__contains__', 'iterkeys', 'clear'): meth_ref = getattr(self.db, meth_name, None) if meth_ref: setattr(self, meth_name, meth_ref) if mode == DB_OPEN_NEW: self.serializer = serializer self.db[self.serializer_key] = cPickle.dumps(self.serializer) else: self.serializer = cPickle.loads(self.db[self.serializer_key]) def __getitem__(self, key): return self.serializer.loads(self.db[key]) def __setitem__(self, key, value): self.db[key] = self.serializer.dumps(value) def __delitem__(self, key): # gdbm defines a __delitem__ method, but it cannot be assigned. So # this method provides a fallback definition via explicit delegation: del self.db[key] def keys(self): retval = self.db.keys() retval.remove(self.serializer_key) return retval def __iter__(self): for key in self.keys(): yield key def has_key(self, key): try: self.db[key] return True except KeyError: return False def __contains__(self, key): return self.has_key(key) def iterkeys(self): return self.__iter__() def clear(self): for key in self.keys(): del self[key] def items(self): return [(key, self[key],) for key in self.keys()] def values(self): return [self[key] for key in self.keys()] def get(self, key, default=None): try: return self[key] except KeyError: return default def close(self): self.db.close() self.db = None cvs2svn-2.4.0/cvs2svn_lib/cvs_revision_manager.py0000664000076500007650000000720111710517256023217 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Access the CVS repository via CVS's 'cvs' command.""" from cvs2svn_lib.common import FatalError from cvs2svn_lib.process import check_command_runs from cvs2svn_lib.process import CommandFailedException from cvs2svn_lib.abstract_rcs_revision_manager import AbstractRCSRevisionReader class CVSRevisionReader(AbstractRCSRevisionReader): """A RevisionReader that reads the contents via CVS.""" # Different versions of CVS support different global options. 
Here # are the global options that we try to use, in order of decreasing # preference: _possible_global_options = [ ['-Q', '-R', '-f'], ['-Q', '-R'], ['-Q', '-f'], ['-Q'], ['-q', '-R', '-f'], ['-q', '-R'], ['-q', '-f'], ['-q'], ] def __init__(self, cvs_executable, global_options=None): """Initialize a CVSRevisionReader. CVS_EXECUTABLE is the CVS command (possibly including the full path to the executable; otherwise it is sought in the $PATH). GLOBAL_ARGUMENTS, if specified, should be a list of global options that are passed to the CVS command before the subcommand. If GLOBAL_ARGUMENTS is not specified, then each of the possibilities listed in _possible_global_options is checked in order until one is found that runs successfully and without any output to stderr.""" self.cvs_executable = cvs_executable if global_options is None: for global_options in self._possible_global_options: try: self._check_cvs_runs(global_options) except CommandFailedException, e: pass else: break else: raise FatalError( '%s\n' 'Please check that cvs is installed and in your PATH.' % (e,) ) else: try: self._check_cvs_runs(global_options) except CommandFailedException, e: raise FatalError( '%s\n' 'Please check that cvs is installed and in your PATH and that\n' 'the global options that you specified (%r) are correct.' % (e, global_options,) ) # The global options were OK; use them for all CVS invocations. self.global_options = global_options def _check_cvs_runs(self, global_options): """Check that CVS can be started. Try running 'cvs --version' with the current setting for self.cvs_executable and the specified global_options. If not successful, raise a CommandFailedException.""" check_command_runs( [self.cvs_executable] + global_options + ['--version'], self.cvs_executable, ) def get_pipe_command(self, cvs_rev, k_option): project = cvs_rev.cvs_file.project return [ self.cvs_executable ] + self.global_options + [ '-d', ':local:' + project.cvs_repository_root, 'co', '-r' + cvs_rev.rev, '-p' ] + k_option + [ project.cvs_module + cvs_rev.cvs_path ] cvs2svn-2.4.0/cvs2svn_lib/symbol_transform.py0000664000076500007650000002744411500107341022412 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes to transform symbol names.""" import os import re from cvs2svn_lib.log import logger from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import IllegalSVNPathError from cvs2svn_lib.common import normalize_svn_path from cvs2svn_lib.common import is_branch_revision_number class SymbolTransform: """Transform symbol names arbitrarily.""" def transform(self, cvs_file, symbol_name, revision): """Possibly transform SYMBOL_NAME, which was found in CVS_FILE. Return the transformed symbol name. If this SymbolTransform doesn't apply, return the original SYMBOL_NAME. 
    If this symbol should be ignored entirely, return None.  (Please
    note that ignoring a branch via this mechanism only causes the
    branch *name* to be ignored; the branch contents will still be
    converted.  Usually branches should be excluded using --exclude.)

    REVISION contains the CVS revision number to which the symbol was
    attached in the file as a string (with zeros removed).

    This method is free to use the information in CVS_FILE (including
    CVS_FILE.project) to decide whether and/or how to transform
    SYMBOL_NAME."""

    raise NotImplementedError()


class ReplaceSubstringsSymbolTransform(SymbolTransform):
  """Replace specific substrings in symbol names.

  If the substring occurs multiple times, replace all copies."""

  def __init__(self, old, new):
    self.old = old
    self.new = new

  def transform(self, cvs_file, symbol_name, revision):
    return symbol_name.replace(self.old, self.new)


class NormalizePathsSymbolTransform(SymbolTransform):
  def transform(self, cvs_file, symbol_name, revision):
    try:
      return normalize_svn_path(symbol_name)
    except IllegalSVNPathError, e:
      raise FatalError('Problem with %s: %s' % (symbol_name, e,))


class CompoundSymbolTransform(SymbolTransform):
  """A SymbolTransform that applies other SymbolTransforms in series.

  Each of the contained SymbolTransforms is applied, one after the
  other.  If any of them returns None, then None is returned (the
  following SymbolTransforms are ignored)."""

  def __init__(self, symbol_transforms):
    """Initialize a CompoundSymbolTransform.

    SYMBOL_TRANSFORMS is an iterable of SymbolTransform instances."""

    self.symbol_transforms = list(symbol_transforms)

  def transform(self, cvs_file, symbol_name, revision):
    for symbol_transform in self.symbol_transforms:
      symbol_name = symbol_transform.transform(
          cvs_file, symbol_name, revision
          )
      if symbol_name is None:
        # Don't continue with other symbol transforms:
        break

    return symbol_name


class RegexpSymbolTransform(SymbolTransform):
  """Transform symbols by using a regexp textual substitution."""

  def __init__(self, pattern, replacement):
    """Create a SymbolTransform that transforms symbols matching PATTERN.

    PATTERN is a regular expression that should match the whole symbol
    name.  REPLACEMENT is the replacement text, which may include
    patterns like r'\1' or r'\g<1>' or r'\g<name>' (where 'name' is a
    reference to a named substring in the pattern of the form
    r'(?P<name>...)')."""

    self.pattern = re.compile('^' + pattern + '$')
    self.replacement = replacement

  def transform(self, cvs_file, symbol_name, revision):
    return self.pattern.sub(self.replacement, symbol_name)


class SymbolMapper(SymbolTransform):
  """A SymbolTransform that transforms specific symbol definitions.

  The user has to specify the exact CVS filename, symbol name, and
  revision number to be transformed, and the new name (or None if the
  symbol should be ignored).  The mappings can be set via a
  constructor argument or by calling __setitem__()."""

  def __init__(self, items=[]):
    """Initialize the mapper.
ITEMS is a list of tuples (cvs_filename, symbol_name, revision, new_name) which will be set as mappings.""" # A map {(cvs_filename, symbol_name, revision) : new_name}: self._map = {} for (cvs_filename, symbol_name, revision, new_name) in items: self[cvs_filename, symbol_name, revision] = new_name def __setitem__(self, (cvs_filename, symbol_name, revision), new_name): """Set a mapping for a particular file, symbol, and revision.""" cvs_filename = os.path.normcase(os.path.normpath(cvs_filename)) key = (cvs_filename, symbol_name, revision) if key in self._map: logger.warn( 'Overwriting symbol transform for\n' ' filename=%r symbol=%s revision=%s' % (cvs_filename, symbol_name, revision,) ) self._map[key] = new_name def transform(self, cvs_file, symbol_name, revision): # cvs_file.rcs_path is guaranteed to already be normalised the way # os.path.normpath() normalises paths. No need to call it again. cvs_filename = os.path.normcase(cvs_file.rcs_path) return self._map.get( (cvs_filename, symbol_name, revision), symbol_name ) class SubtreeSymbolMapper(SymbolTransform): """A SymbolTransform that transforms symbols within a whole repo subtree. The user has to specify a CVS repository path (a filename or directory) and the original symbol name. All symbols under that path will be renamed to the specified new name (which can be None if the symbol should be ignored). The mappings can be set via a constructor argument or by calling __setitem__(). Only the most specific rule is applied.""" def __init__(self, items=[]): """Initialize the mapper. ITEMS is a list of tuples (cvs_path, symbol_name, new_name) which will be set as mappings. cvs_path is a string naming a directory within the CVS repository.""" # A map {symbol_name : {cvs_path : new_name}}: self._map = {} for (cvs_path, symbol_name, new_name) in items: self[cvs_path, symbol_name] = new_name def __setitem__(self, (cvs_path, symbol_name), new_name): """Set a mapping for a particular file and symbol.""" try: symbol_map = self._map[symbol_name] except KeyError: symbol_map = {} self._map[symbol_name] = symbol_map cvs_path = os.path.normcase(os.path.normpath(cvs_path)) if cvs_path in symbol_map: logger.warn( 'Overwriting symbol transform for\n' ' directory=%r symbol=%s' % (cvs_path, symbol_name,) ) symbol_map[cvs_path] = new_name def transform(self, cvs_file, symbol_name, revision): try: symbol_map = self._map[symbol_name] except KeyError: # No rules for that symbol name return symbol_name # cvs_file.rcs_path is guaranteed to already be normalised the way # os.path.normpath() normalises paths. No need to call it again. cvs_path = os.path.normcase(cvs_file.rcs_path) while True: try: return symbol_map[cvs_path] except KeyError: new_cvs_path = os.path.dirname(cvs_path) if new_cvs_path == cvs_path: # No rules found for that path; return symbol name unaltered. return symbol_name else: cvs_path = new_cvs_path class IgnoreSymbolTransform(SymbolTransform): """Ignore symbols matching a specified regular expression.""" def __init__(self, pattern): """Create an SymbolTransform that ignores symbols matching PATTERN. PATTERN is a regular expression that should match the whole symbol name.""" self.pattern = re.compile('^' + pattern + '$') def transform(self, cvs_file, symbol_name, revision): if self.pattern.match(symbol_name): return None else: return symbol_name class SubtreeSymbolTransform(SymbolTransform): """A wrapper around another SymbolTransform, that limits it to a specified subtree.""" def __init__(self, cvs_path, inner_symbol_transform): """Constructor. 
CVS_PATH is the path in the repository. INNER_SYMBOL_TRANSFORM is the SymbolTransform to wrap.""" assert isinstance(cvs_path, str) self.__subtree = os.path.normcase(os.path.normpath(cvs_path)) self.__subtree_len = len(self.__subtree) self.__inner = inner_symbol_transform def __does_rule_apply_to(self, cvs_file): # # NOTE: This turns out to be a hot path through the code. # # It used to use logic similar to SubtreeSymbolMapper.transform(). And # it used to take 44% of cvs2svn's total runtime on one real-world test. # Now it's been optimized, it takes about 2%. # # This is called about: # (num_files * num_symbols_per_file * num_subtree_symbol_transforms) # times. On a large repository with several of these transforms, # that can exceed 100,000,000 calls. # # cvs_file.rcs_path is guaranteed to already be normalised the way # os.path.normpath() normalises paths. So we don't need to call # os.path.normpath() again. (The os.path.normpath() function does # quite a lot, so it's expensive). # # os.path.normcase is a no-op on POSIX systems (and therefore fast). # Even on Windows it's only a memory allocation and case-change, it # should be quite fast. cvs_path = os.path.normcase(cvs_file.rcs_path) # Do most of the work in a single call, without allocating memory. if not cvs_path.startswith(self.__subtree): # Different prefix. # This is the common "no match" case. return False if len(cvs_path) == self.__subtree_len: # Exact match. # # This is quite rare, as self.__subtree is usually a directory and # cvs_path is always a file. return True # We know cvs_path starts with self.__subtree, check the next character # is a '/' (or if we're on Windows, a '\\'). If so, then cvs_path is a # file under the self.__subtree directory tree, so we match. If not, # then it's not a match. return cvs_path[self.__subtree_len] == os.path.sep def transform(self, cvs_file, symbol_name, revision): if self.__does_rule_apply_to(cvs_file): return self.__inner.transform(cvs_file, symbol_name, revision) else: # Rule does not apply to that path; return symbol name unaltered. return symbol_name class TagOnlyTransform(SymbolTransform): """A wrapper around another SymbolTransform, that limits it to CVS tags (not CVS branches).""" def __init__(self, inner_symbol_transform): """Constructor. INNER_SYMBOL_TRANSFORM is the SymbolTransform to wrap.""" self.__inner = inner_symbol_transform def transform(self, cvs_file, symbol_name, revision): if is_branch_revision_number(revision): # It's a branch return symbol_name else: # It's a tag return self.__inner.transform(cvs_file, symbol_name, revision) class BranchOnlyTransform(SymbolTransform): """A wrapper around another SymbolTransform, that limits it to CVS branches (not CVS tags).""" def __init__(self, inner_symbol_transform): """Constructor. INNER_SYMBOL_TRANSFORM is the SymbolTransform to wrap.""" self.__inner = inner_symbol_transform def transform(self, cvs_file, symbol_name, revision): if is_branch_revision_number(revision): # It's a branch return self.__inner.transform(cvs_file, symbol_name, revision) else: # It's a tag return symbol_name cvs2svn-2.4.0/cvs2svn_lib/metadata_database.py0000664000076500007650000000704511710517256022426 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. 
The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes to manage CVSRevision metadata.""" try: from hashlib import sha1 except ImportError: from sha import new as sha1 from cvs2svn_lib.context import Ctx from cvs2svn_lib.indexed_database import IndexedDatabase from cvs2svn_lib.key_generator import KeyGenerator from cvs2svn_lib.serializer import PrimedPickleSerializer from cvs2svn_lib.metadata import Metadata def MetadataDatabase(store_filename, index_table_filename, mode): """A database to store Metadata instances that describe CVSRevisions. This database manages a map id -> Metadata instance where id is a unique identifier for the metadata.""" return IndexedDatabase( store_filename, index_table_filename, mode, PrimedPickleSerializer((Metadata,)), ) class MetadataLogger: """Store and generate IDs for the metadata associated with CVSRevisions. We want CVSRevisions that might be able to be combined to have the same metadata ID, so we want a one-to-one relationship id <-> metadata. We could simply construct a map {metadata : id}, but the map would grow too large. Therefore, we generate a digest containing the significant parts of the metadata, and construct a map {digest : id}. To get the ID for a new set of metadata, we first create the digest. If there is already an ID registered for that digest, we simply return it. If not, we generate a new ID, store the metadata in the metadata database under that ID, record the mapping {digest : id}, and return the new id. What metadata is included in the digest? The author, log_msg, project_id (if Ctx().cross_project_commits is not set), and branch_name (if Ctx().cross_branch_commits is not set).""" def __init__(self, metadata_db): self._metadata_db = metadata_db # A map { digest : id }: self._digest_to_id = {} # A key_generator to generate keys for metadata that haven't been # seen yet: self.key_generator = KeyGenerator() def store(self, project, branch_name, author, log_msg): """Store the metadata and return its id. Locate the record for a commit with the specified (PROJECT, BRANCH_NAME, AUTHOR, LOG_MSG) and return its id. (Depending on policy, not all of these items are necessarily used when creating the unique id.) If there is no such record, create one and return its newly-generated id.""" key = [author, log_msg] if not Ctx().cross_project_commits: key.append('%x' % project.id) if not Ctx().cross_branch_commits: key.append(branch_name or '') digest = sha1('\0'.join(key)).digest() try: # See if it is already known: return self._digest_to_id[digest] except KeyError: id = self.key_generator.gen_id() self._digest_to_id[digest] = id self._metadata_db[id] = Metadata(id, author, log_msg) return id cvs2svn-2.4.0/cvs2svn_lib/collect_data.py0000664000076500007650000012563311710517256021444 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. 
The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Data collection classes. This module contains the code used to collect data from the CVS repository. It parses *,v files, recording all useful information except for the actual file contents. As a *,v file is parsed, the information pertaining to the file is accumulated in memory, mostly in _RevisionData, _BranchData, and _TagData objects. When parsing is complete, a final pass is made over the data to create some final dependency links, collect statistics, etc., then the _*Data objects are converted into CVSItem objects (CVSRevision, CVSBranch, and CVSTag respectively) and the CVSItems are dumped into databases. During the data collection, persistent unique ids are allocated to many types of objects: CVSFile, Symbol, and CVSItems. CVSItems are a special case. CVSItem ids are unique across all CVSItem types, and the ids are carried over from the corresponding data collection objects: _RevisionData -> CVSRevision _BranchData -> CVSBranch _TagData -> CVSTag In a later pass it is possible to convert tags <-> branches. But even if this occurs, the new branch or tag uses the same id as the old tag or branch. """ import re from cvs2svn_lib import config from cvs2svn_lib.common import DB_OPEN_NEW from cvs2svn_lib.common import warning_prefix from cvs2svn_lib.common import error_prefix from cvs2svn_lib.common import is_trunk_revision from cvs2svn_lib.common import is_branch_revision_number from cvs2svn_lib.log import logger from cvs2svn_lib.context import Ctx from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.cvs_path import CVSFile from cvs2svn_lib.cvs_path import CVSDirectory from cvs2svn_lib.symbol import Symbol from cvs2svn_lib.symbol import Trunk from cvs2svn_lib.cvs_item import CVSRevision from cvs2svn_lib.cvs_item import CVSBranch from cvs2svn_lib.cvs_item import CVSTag from cvs2svn_lib.cvs_item import cvs_revision_type_map from cvs2svn_lib.cvs_file_items import VendorBranchError from cvs2svn_lib.cvs_file_items import CVSFileItems from cvs2svn_lib.key_generator import KeyGenerator from cvs2svn_lib.cvs_item_database import NewCVSItemStore from cvs2svn_lib.symbol_statistics import SymbolStatisticsCollector from cvs2svn_lib.metadata_database import MetadataDatabase from cvs2svn_lib.metadata_database import MetadataLogger from cvs2svn_lib.rcsparser import Sink from cvs2svn_lib.rcsparser import parse from cvs2svn_lib.rcsparser import RCSParseError # A regular expression defining "valid" revision numbers (used to # check that symbol definitions are reasonable). _valid_revision_re = re.compile(r''' ^ (?:\d+\.)+ # Digit groups with trailing dots \d+ # And the last digit group. $ ''', re.VERBOSE) _branch_revision_re = re.compile(r''' ^ ((?:\d+\.\d+\.)+) # A nonzero even number of digit groups w/trailing dot (?:0\.)? # CVS sticks an extra 0 here; RCS does not (\d+) # And the last digit group $ ''', re.VERBOSE) def is_same_line_of_development(rev1, rev2): """Return True if rev1 and rev2 are on the same line of development (i.e., both on trunk, or both on the same branch); return False otherwise. 
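  For example, '1.3' and '1.4' are on the same line of development
  (both are trunk revisions), as are '1.5.2.1' and '1.5.2.3' (both lie
  on branch '1.5.2'); but '1.3' and '1.5.2.1' are not.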
Either rev1 or rev2 can be None, in which case automatically return False.""" if rev1 is None or rev2 is None: return False if is_trunk_revision(rev1) and is_trunk_revision(rev2): # Trunk revisions have to be handled specially because the main # trunk version number can be changed; e.g., from 1 to 2. return True if rev1[0:rev1.rfind('.')] == rev2[0:rev2.rfind('.')]: return True return False class _RevisionData: """We track the state of each revision so that in set_revision_info, we can determine if our op is an add/change/delete. We can do this because in set_revision_info, we'll have all of the _RevisionData for a file at our fingertips, and we need to examine the state of our prev_rev to determine if we're an add or a change. Without the state of the prev_rev, we are unable to distinguish between an add and a change.""" def __init__(self, cvs_rev_id, rev, timestamp, author, state): # The id of this revision: self.cvs_rev_id = cvs_rev_id self.rev = rev self.timestamp = timestamp self.author = author self.state = state # If this is the first revision on a branch, then this is the # branch_data of that branch; otherwise it is None. self.parent_branch_data = None # The revision number of the parent of this revision along the # same line of development, if any. For the first revision R on a # branch, we consider the revision from which R sprouted to be the # 'parent'. If this is the root revision in the file's revision # tree, then this field is None. # # Note that this revision can't be determined arithmetically (due # to cvsadmin -o), which is why this field is necessary. self.parent = None # The revision number of the primary child of this revision (the # child along the same line of development), if any; otherwise, # None. self.child = None # The _BranchData instances of branches that sprout from this # revision, sorted in ascending order by branch number. It would # be inconvenient to initialize it here because we would have to # scan through all branches known by the _SymbolDataCollector to # find the ones having us as the parent. Instead, this # information is filled in by # _FileDataCollector._resolve_dependencies() and sorted by # _FileDataCollector._sort_branches(). self.branches_data = [] # The revision numbers of the first commits on any branches on # which commits occurred. This dependency is kept explicitly # because otherwise a revision-only topological sort would miss # the dependency that exists via branches_data. self.branches_revs_data = [] # The _TagData instances of tags that are connected to this # revision. self.tags_data = [] # A token that may be set by a RevisionCollector, then used by # RevisionReader to obtain the text again. self.revision_reader_token = None def get_first_on_branch_id(self): return self.parent_branch_data and self.parent_branch_data.id class _SymbolData: """Collection area for information about a symbol in a single CVSFile. SYMBOL is an instance of Symbol, undifferentiated as a Branch or a Tag regardless of whether self is a _BranchData or a _TagData.""" def __init__(self, id, symbol): """Initialize an object for SYMBOL.""" # The unique id that will be used for this particular symbol in # this particular file. This same id will be used for the CVSItem # that is derived from this instance. self.id = id # An instance of Symbol. 
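    # (The Symbol instance is shared by all files in the project that
    # define a symbol with this name; see
    # _ProjectDataCollector.get_symbol().)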
self.symbol = symbol class _BranchData(_SymbolData): """Collection area for information about a Branch in a single CVSFile.""" def __init__(self, id, symbol, branch_number): _SymbolData.__init__(self, id, symbol) # The branch number (e.g., '1.5.2') of this branch. self.branch_number = branch_number # The revision number of the revision from which this branch # sprouts (e.g., '1.5'). self.parent = self.branch_number[:self.branch_number.rindex(".")] # The revision number of the first commit on this branch, if any # (e.g., '1.5.2.1'); otherwise, None. self.child = None class _TagData(_SymbolData): """Collection area for information about a Tag in a single CVSFile.""" def __init__(self, id, symbol, rev): _SymbolData.__init__(self, id, symbol) # The revision number being tagged (e.g., '1.5.2.3'). self.rev = rev class _SymbolDataCollector(object): """Collect information about symbols in a single CVSFile.""" def __init__(self, fdc, cvs_file): self.fdc = fdc self.cvs_file = cvs_file self.pdc = self.fdc.pdc self.collect_data = self.fdc.collect_data # A list [(name, revision), ...] of symbols defined in the header # of the file. The name has already been transformed using the # symbol transform rules. If the symbol transform rules indicate # that the symbol should be ignored, then it is never added to # this list. This list is processed then deleted in # process_symbols(). self._symbol_defs = [] # A set containing the transformed names of symbols in this file # (used to detect duplicates during processing of unlabeled # branches): self._defined_symbols = set() # Map { branch_number : _BranchData }, where branch_number has an # odd number of digits. self.branches_data = { } # Map { revision : [ tag_data ] }, where revision has an even # number of digits, and the value is a list of _TagData objects # for tags that apply to that revision. self.tags_data = { } def _add_branch(self, name, branch_number): """Record that BRANCH_NUMBER is the branch number for branch NAME, and derive and record the revision from which NAME sprouts. BRANCH_NUMBER is an RCS branch number with an odd number of components, for example '1.7.2' (never '1.7.0.2'). Return the _BranchData instance (which is usually newly-created).""" branch_data = self.branches_data.get(branch_number) if branch_data is not None: logger.warn( "%s: in '%s':\n" " branch '%s' already has name '%s',\n" " cannot also have name '%s', ignoring the latter\n" % (warning_prefix, self.cvs_file.rcs_path, branch_number, branch_data.symbol.name, name) ) return branch_data symbol = self.pdc.get_symbol(name) branch_data = _BranchData( self.collect_data.item_key_generator.gen_id(), symbol, branch_number ) self.branches_data[branch_number] = branch_data return branch_data def _construct_distinct_name(self, name, original_name): """Construct a distinct symbol name from NAME. If NAME is distinct, return it. If it is already used in this file (as determined from its presence in self._defined_symbols), construct and return a new name that is not already used.""" if name not in self._defined_symbols: return name else: index = 1 while True: dup_name = '%s-DUPLICATE-%d' % (name, index,) if dup_name not in self._defined_symbols: self.collect_data.record_fatal_error( "Symbol name '%s' is already used in '%s'.\n" "The unlabeled branch '%s' must be renamed using " "--symbol-transform." 
% (name, self.cvs_file.rcs_path, original_name,) ) return dup_name def _add_unlabeled_branch(self, branch_number): original_name = "unlabeled-" + branch_number name = self.transform_symbol(original_name, branch_number) if name is None: self.collect_data.record_fatal_error( "The unlabeled branch '%s' in '%s' contains commits.\n" "It may not be ignored via a symbol transform. (Use --exclude " "instead.)" % (original_name, self.cvs_file.rcs_path,) ) # Retain the original name to allow the conversion to continue: name = original_name distinct_name = self._construct_distinct_name(name, original_name) self._defined_symbols.add(distinct_name) return self._add_branch(distinct_name, branch_number) def _add_tag(self, name, revision): """Record that tag NAME refers to the specified REVISION.""" symbol = self.pdc.get_symbol(name) tag_data = _TagData( self.collect_data.item_key_generator.gen_id(), symbol, revision ) self.tags_data.setdefault(revision, []).append(tag_data) return tag_data def transform_symbol(self, name, revision): """Transform a symbol according to the project's symbol transforms. Transform the symbol with the original name NAME and canonicalized revision number REVISION. Return the new symbol name or None if the symbol should be ignored entirely. Log the results of the symbol transform if necessary.""" old_name = name # Apply any user-defined symbol transforms to the symbol name: name = self.cvs_file.project.transform_symbol( self.cvs_file, name, revision ) if name is None: # Ignore symbol: self.pdc.log_symbol_transform(old_name, None) logger.verbose( " symbol '%s'=%s ignored in %s" % (old_name, revision, self.cvs_file.rcs_path,) ) else: if name != old_name: self.pdc.log_symbol_transform(old_name, name) logger.verbose( " symbol '%s'=%s transformed to '%s' in %s" % (old_name, revision, name, self.cvs_file.rcs_path,) ) return name def define_symbol(self, name, revision): """Record a symbol definition for later processing.""" # Canonicalize the revision number: revision = _branch_revision_re.sub(r'\1\2', revision) # Apply any user-defined symbol transforms to the symbol name: name = self.transform_symbol(name, revision) if name is not None: # Verify that the revision number is valid: if _valid_revision_re.match(revision): # The revision number is valid; record it for later processing: self._symbol_defs.append( (name, revision) ) else: logger.warn( 'In %r:\n' ' branch %r references invalid revision %s\n' ' and will be ignored.' % (self.cvs_file.rcs_path, name, revision,) ) def _eliminate_trivial_duplicate_defs(self, symbol_defs): """Iterate through SYMBOL_DEFS, Removing identical duplicate definitions. Duplicate definitions of symbol names have been seen in the wild, and they can also happen when --symbol-transform is used. If a symbol is defined to the same revision number repeatedly, then ignore all but the last definition.""" # Make a copy, since we have to iterate through the definitions # twice: symbol_defs = list(symbol_defs) # A map { (name, revision) : [index,...] 
} of the indexes where # symbol definitions name=revision were found: known_definitions = {} for (i, symbol_def) in enumerate(symbol_defs): known_definitions.setdefault(symbol_def, []).append(i) # A set of the indexes of entries that have to be removed from # symbol_defs: dup_indexes = set() for ((name, revision), indexes) in known_definitions.iteritems(): if len(indexes) > 1: logger.verbose( "in %r:\n" " symbol %s:%s defined multiple times; ignoring duplicates\n" % (self.cvs_file.rcs_path, name, revision,) ) dup_indexes.update(indexes[:-1]) for (i, symbol_def) in enumerate(symbol_defs): if i not in dup_indexes: yield symbol_def def _process_duplicate_defs(self, symbol_defs): """Iterate through SYMBOL_DEFS, processing duplicate names. Duplicate definitions of symbol names have been seen in the wild, and they can also happen when --symbol-transform is used. If a symbol is defined multiple times, then it is a fatal error. This method should be called after _eliminate_trivial_duplicate_defs().""" # Make a copy, since we have to access multiple times: symbol_defs = list(symbol_defs) # A map {name : [index,...]} mapping the names of symbols to a # list of their definitions' indexes in symbol_defs: known_symbols = {} for (i, (name, revision)) in enumerate(symbol_defs): known_symbols.setdefault(name, []).append(i) known_symbols = known_symbols.items() known_symbols.sort() dup_indexes = set() for (name, indexes) in known_symbols: if len(indexes) > 1: # This symbol was defined multiple times. self.collect_data.record_fatal_error( "Multiple definitions of the symbol '%s' in '%s': %s" % ( name, self.cvs_file.rcs_path, ' '.join([symbol_defs[i][1] for i in indexes]), ) ) # Ignore all but the last definition for now, to allow the # conversion to proceed: dup_indexes.update(indexes[:-1]) for (i, symbol_def) in enumerate(symbol_defs): if i not in dup_indexes: yield symbol_def def _process_symbol(self, name, revision): """Process a symbol called NAME, which is associated with REVISON. REVISION is a canonical revision number with zeros removed, for example: '1.7', '1.7.2', or '1.1.1' or '1.1.1.1'. NAME is a transformed branch or tag name.""" # Add symbol to our records: if is_branch_revision_number(revision): self._add_branch(name, revision) else: self._add_tag(name, revision) def process_symbols(self): """Process the symbol definitions from SELF._symbol_defs.""" symbol_defs = self._symbol_defs del self._symbol_defs symbol_defs = self._eliminate_trivial_duplicate_defs(symbol_defs) symbol_defs = self._process_duplicate_defs(symbol_defs) for (name, revision) in symbol_defs: self._defined_symbols.add(name) self._process_symbol(name, revision) @staticmethod def rev_to_branch_number(revision): """Return the branch_number of the branch on which REVISION lies. REVISION is a branch revision number with an even number of components; for example '1.7.2.1' (never '1.7.2' nor '1.7.0.2'). The return value is the branch number (for example, '1.7.2'). Return none iff REVISION is a trunk revision such as '1.2'.""" if is_trunk_revision(revision): return None return revision[:revision.rindex(".")] def rev_to_branch_data(self, revision): """Return the branch_data of the branch on which REVISION lies. REVISION must be a branch revision number with an even number of components; for example '1.7.2.1' (never '1.7.2' nor '1.7.0.2'). 
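    The lookup is done by branch number, which is REVISION with its
    final component dropped (e.g., branch '1.7.2' for REVISION
    '1.7.2.1').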
Raise KeyError iff REVISION is unknown.""" assert not is_trunk_revision(revision) return self.branches_data[self.rev_to_branch_number(revision)] def rev_to_lod(self, revision): """Return the line of development on which REVISION lies. REVISION must be a revision number with an even number of components. Raise KeyError iff REVISION is unknown.""" if is_trunk_revision(revision): return self.pdc.trunk else: return self.rev_to_branch_data(revision).symbol class _FileDataCollector(Sink): """Class responsible for collecting RCS data for a particular file. Any collected data that need to be remembered are stored into the referenced CollectData instance.""" def __init__(self, pdc, cvs_file): """Create an object that is prepared to receive data for CVS_FILE. CVS_FILE is a CVSFile instance. COLLECT_DATA is used to store the information collected about the file.""" self.pdc = pdc self.cvs_file = cvs_file self.collect_data = self.pdc.collect_data self.project = self.cvs_file.project # A place to store information about the symbols in this file: self.sdc = _SymbolDataCollector(self, self.cvs_file) # { revision : _RevisionData instance } self._rev_data = { } # Lists [ (parent, child) ] of revision number pairs indicating # that revision child depends on revision parent along the main # line of development. self._primary_dependencies = [] # If set, this is an RCS branch number -- rcsparse calls this the # "principal branch", but CVS and RCS refer to it as the "default # branch", so that's what we call it, even though the rcsparse API # setter method is still 'set_principal_branch'. self.default_branch = None # True iff revision 1.1 of the file appears to have been imported # (as opposed to added normally). self._file_imported = False def _get_rev_id(self, revision): if revision is None: return None return self._rev_data[revision].cvs_rev_id def set_principal_branch(self, branch): """This is a callback method declared in Sink.""" if branch.find('.') == -1: # This just sets the default branch to trunk. Normally this # shouldn't occur, but it has been seen in at least one CVS # repository. Just ignore it. return m = _branch_revision_re.match(branch) if not m: self.collect_data.record_fatal_error( 'The default branch %s in file %r is not a valid branch number' % (branch, self.cvs_file.rcs_path,) ) return branch = m.group(1) + m.group(2) if branch.count('.') != 2: # We don't know how to deal with a non-top-level default # branch (what does CVS do?). So if this case is detected, # punt: self.collect_data.record_fatal_error( 'The default branch %s in file %r is not a top-level branch' % (branch, self.cvs_file.rcs_path,) ) return self.default_branch = branch def define_tag(self, name, revision): """Remember the symbol name and revision, but don't process them yet. This is a callback method declared in Sink.""" self.sdc.define_symbol(name, revision) def set_expansion(self, mode): """This is a callback method declared in Sink.""" self.cvs_file.mode = mode def admin_completed(self): """This is a callback method declared in Sink.""" self.sdc.process_symbols() def define_revision(self, revision, timestamp, author, state, branches, next): """This is a callback method declared in Sink.""" for branch in branches: try: branch_data = self.sdc.rev_to_branch_data(branch) except KeyError: # Normally we learn about the branches from the branch names # and numbers parsed from the symbolic name header. But this # must have been an unlabeled branch that slipped through the # net. 
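        # (In practice such unlabeled branches usually arise when a
        # branch's symbolic name was later deleted with 'cvs tag -d'
        # even though revisions had already been committed on it.)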
Generate a name for it and create a _BranchData record # for it now. branch_data = self.sdc._add_unlabeled_branch( self.sdc.rev_to_branch_number(branch)) assert branch_data.child is None branch_data.child = branch if revision in self._rev_data: # This revision has already been seen. logger.error('File %r contains duplicate definitions of revision %s.' % (self.cvs_file.rcs_path, revision,)) raise RuntimeError() # Record basic information about the revision: rev_data = _RevisionData( self.collect_data.item_key_generator.gen_id(), revision, int(timestamp), author, state) self._rev_data[revision] = rev_data # When on trunk, the RCS 'next' revision number points to what # humans might consider to be the 'previous' revision number. For # example, 1.3's RCS 'next' is 1.2. # # However, on a branch, the RCS 'next' revision number really does # point to what humans would consider to be the 'next' revision # number. For example, 1.1.2.1's RCS 'next' would be 1.1.2.2. # # In other words, in RCS, 'next' always means "where to find the next # deltatext that you need this revision to retrieve. # # That said, we don't *want* RCS's behavior here, so we determine # whether we're on trunk or a branch and set the dependencies # accordingly. if next: if is_trunk_revision(revision): self._primary_dependencies.append( (next, revision,) ) else: self._primary_dependencies.append( (revision, next,) ) def tree_completed(self): """The revision tree has been parsed. Analyze it for consistency and connect some loose ends. This is a callback method declared in Sink.""" self._resolve_primary_dependencies() self._resolve_branch_dependencies() self._sort_branches() self._resolve_tag_dependencies() # Compute the preliminary CVSFileItems for this file: cvs_items = [] cvs_items.extend(self._get_cvs_revisions()) cvs_items.extend(self._get_cvs_branches()) cvs_items.extend(self._get_cvs_tags()) self._cvs_file_items = CVSFileItems( self.cvs_file, self.pdc.trunk, cvs_items ) self._cvs_file_items.check_link_consistency() def _resolve_primary_dependencies(self): """Resolve the dependencies listed in self._primary_dependencies.""" for (parent, child,) in self._primary_dependencies: parent_data = self._rev_data[parent] assert parent_data.child is None parent_data.child = child child_data = self._rev_data[child] assert child_data.parent is None child_data.parent = parent def _resolve_branch_dependencies(self): """Resolve dependencies involving branches.""" for branch_data in self.sdc.branches_data.values(): # The branch_data's parent has the branch as a child regardless # of whether the branch had any subsequent commits: try: parent_data = self._rev_data[branch_data.parent] except KeyError: logger.warn( 'In %r:\n' ' branch %r references non-existing revision %s\n' ' and will be ignored.' 
% (self.cvs_file.rcs_path, branch_data.symbol.name, branch_data.parent,)) del self.sdc.branches_data[branch_data.branch_number] else: parent_data.branches_data.append(branch_data) # If the branch has a child (i.e., something was committed on # the branch), then we store a reference to the branch_data # there, define the child's parent to be the branch's parent, # and list the child in the branch parent's branches_revs_data: if branch_data.child is not None: child_data = self._rev_data[branch_data.child] assert child_data.parent_branch_data is None child_data.parent_branch_data = branch_data assert child_data.parent is None child_data.parent = branch_data.parent parent_data.branches_revs_data.append(branch_data.child) def _sort_branches(self): """Sort the branches sprouting from each revision in creation order. Creation order is taken to be the reverse of the order that they are listed in the symbols part of the RCS file. (If a branch is created then deleted, a later branch can be assigned the recycled branch number; therefore branch numbers are not an indication of creation order.)""" for rev_data in self._rev_data.values(): rev_data.branches_data.sort(lambda a, b: - cmp(a.id, b.id)) def _resolve_tag_dependencies(self): """Resolve dependencies involving tags.""" for (rev, tag_data_list) in self.sdc.tags_data.items(): try: parent_data = self._rev_data[rev] except KeyError: logger.warn( 'In %r:\n' ' the following tag(s) reference non-existing revision %s\n' ' and will be ignored:\n' ' %s' % ( self.cvs_file.rcs_path, rev, ', '.join([repr(tag_data.symbol.name) for tag_data in tag_data_list]),)) del self.sdc.tags_data[rev] else: for tag_data in tag_data_list: assert tag_data.rev == rev # The tag_data's rev has the tag as a child: parent_data.tags_data.append(tag_data) def _get_cvs_branches(self): """Generate the CVSBranches present in this file.""" for branch_data in self.sdc.branches_data.values(): yield CVSBranch( branch_data.id, self.cvs_file, branch_data.symbol, branch_data.branch_number, self.sdc.rev_to_lod(branch_data.parent), self._get_rev_id(branch_data.parent), self._get_rev_id(branch_data.child), None, ) def _get_cvs_tags(self): """Generate the CVSTags present in this file.""" for tags_data in self.sdc.tags_data.values(): for tag_data in tags_data: yield CVSTag( tag_data.id, self.cvs_file, tag_data.symbol, self.sdc.rev_to_lod(tag_data.rev), self._get_rev_id(tag_data.rev), None, ) def set_description(self, description): """This is a callback method declared in Sink.""" self.cvs_file.description = description self.cvs_file.determine_file_properties(Ctx().file_property_setters) def set_revision_info(self, revision, log, text): """This is a callback method declared in Sink.""" rev_data = self._rev_data[revision] cvs_rev = self._cvs_file_items[rev_data.cvs_rev_id] if cvs_rev.metadata_id is not None: # Users have reported problems with repositories in which the # deltatext block for revision 1.1 appears twice. It is not # known whether this results from a CVS/RCS bug, or from botched # hand-editing of the repository. In any case, empirically, cvs # and rcs both use the first version when checking out data, so # that's what we will do. (For the record: "cvs log" fails on # such a file; "rlog" prints the log message from the first # block and ignores the second one.) 
logger.warn( "%s: in '%s':\n" " Deltatext block for revision %s appeared twice;\n" " ignoring the second occurrence.\n" % (warning_prefix, self.cvs_file.rcs_path, revision,) ) return if is_trunk_revision(revision): branch_name = None else: branch_name = self.sdc.rev_to_branch_data(revision).symbol.name cvs_rev.metadata_id = self.collect_data.metadata_logger.store( self.project, branch_name, rev_data.author, log ) cvs_rev.deltatext_exists = bool(text) # If this is revision 1.1, determine whether the file appears to # have been created via 'cvs add' instead of 'cvs import'. The # test is that the log message CVS uses for 1.1 in imports is # "Initial revision\n" with no period. (This fact helps determine # whether this file might have had a default branch in the past.) if revision == '1.1': self._file_imported = (log == 'Initial revision\n') def parse_completed(self): """Finish the processing of this file. This is a callback method declared in Sink.""" # Make sure that there was an info section for each revision: for cvs_item in self._cvs_file_items.values(): if isinstance(cvs_item, CVSRevision) and cvs_item.metadata_id is None: self.collect_data.record_fatal_error( '%r has no deltatext section for revision %s' % (self.cvs_file.rcs_path, cvs_item.rev,) ) def _determine_operation(self, rev_data): prev_rev_data = self._rev_data.get(rev_data.parent) return cvs_revision_type_map[( rev_data.state != 'dead', prev_rev_data is not None and prev_rev_data.state != 'dead', )] def _get_cvs_revisions(self): """Generate the CVSRevisions present in this file.""" for rev_data in self._rev_data.itervalues(): yield self._get_cvs_revision(rev_data) def _get_cvs_revision(self, rev_data): """Create and return a CVSRevision for REV_DATA.""" branch_ids = [ branch_data.id for branch_data in rev_data.branches_data ] branch_commit_ids = [ self._get_rev_id(rev) for rev in rev_data.branches_revs_data ] tag_ids = [ tag_data.id for tag_data in rev_data.tags_data ] revision_type = self._determine_operation(rev_data) return revision_type( self._get_rev_id(rev_data.rev), self.cvs_file, rev_data.timestamp, None, self._get_rev_id(rev_data.parent), self._get_rev_id(rev_data.child), rev_data.rev, True, self.sdc.rev_to_lod(rev_data.rev), rev_data.get_first_on_branch_id(), False, None, None, tag_ids, branch_ids, branch_commit_ids, rev_data.revision_reader_token ) def get_cvs_file_items(self): """Finish up and return a CVSFileItems instance for this file. This method must only be called once.""" self._process_ntdbrs() # Break a circular reference loop, allowing the memory for self # and sdc to be freed. del self.sdc return self._cvs_file_items def _process_ntdbrs(self): """Fix up any non-trunk default branch revisions (if present). If a non-trunk default branch is determined to have existed, yield the _RevisionData.ids for all revisions that were once non-trunk default revisions, in dependency order. There are two cases to handle: One case is simple. The RCS file lists a default branch explicitly in its header, such as '1.1.1'. In this case, we know that every revision on the vendor branch is to be treated as head of trunk at that point in time. But there's also a degenerate case. The RCS file does not currently have a default branch, yet we can deduce that for some period in the past it probably *did* have one. For example, the file has vendor revisions 1.1.1.1 -> 1.1.1.96, all of which are dated before 1.2, and then it has 1.1.1.97 -> 1.1.1.100 dated after 1.2. 
In this case, we should record 1.1.1.96 as the last vendor revision to have been the head of the default branch. If any non-trunk default branch revisions are found: - Set their ntdbr members to True. - Connect the last one with revision 1.2. - Remove revision 1.1 if it is not needed. """ try: if self.default_branch: try: vendor_cvs_branch_id = self.sdc.branches_data[self.default_branch].id except KeyError: logger.warn( '%s: In %s:\n' ' vendor branch %r is not present in file and will be ignored.' % (warning_prefix, self.cvs_file.rcs_path, self.default_branch,) ) self.default_branch = None return vendor_lod_items = self._cvs_file_items.get_lod_items( self._cvs_file_items[vendor_cvs_branch_id] ) if not self._cvs_file_items.process_live_ntdb(vendor_lod_items): return elif self._file_imported: vendor_branch_data = self.sdc.branches_data.get('1.1.1') if vendor_branch_data is None: return else: vendor_lod_items = self._cvs_file_items.get_lod_items( self._cvs_file_items[vendor_branch_data.id] ) if not self._cvs_file_items.process_historical_ntdb( vendor_lod_items ): return else: return except VendorBranchError, e: self.collect_data.record_fatal_error(str(e)) return if self._file_imported: self._cvs_file_items.imported_remove_1_1(vendor_lod_items) self._cvs_file_items.check_link_consistency() class _ProjectDataCollector: def __init__(self, collect_data, project): self.collect_data = collect_data self.project = project self.num_files = 0 # The Trunk LineOfDevelopment object for this project: self.trunk = Trunk( self.collect_data.symbol_key_generator.gen_id(), self.project ) self.project.trunk_id = self.trunk.id # This causes a record for self.trunk to spring into existence: self.collect_data.register_trunk(self.trunk) # A map { name -> Symbol } for all known symbols in this project. # The symbols listed here are undifferentiated into Branches and # Tags because the same name might appear as a branch in one file # and a tag in another. self.symbols = {} # A map { (old_name, new_name) : count } indicating how many files # were affected by each each symbol name transformation: self.symbol_transform_counts = {} def get_symbol(self, name): """Return the Symbol object for the symbol named NAME in this project. If such a symbol does not yet exist, allocate a new symbol_id, create a Symbol instance, store it in self.symbols, and return it.""" symbol = self.symbols.get(name) if symbol is None: symbol = Symbol( self.collect_data.symbol_key_generator.gen_id(), self.project, name) self.symbols[name] = symbol return symbol def log_symbol_transform(self, old_name, new_name): """Record that OLD_NAME was transformed to NEW_NAME in one file. 
This information is used to generated a statistical summary of symbol transforms.""" try: self.symbol_transform_counts[old_name, new_name] += 1 except KeyError: self.symbol_transform_counts[old_name, new_name] = 1 def summarize_symbol_transforms(self): if self.symbol_transform_counts and logger.is_on(logger.NORMAL): logger.normal('Summary of symbol transforms:') transforms = self.symbol_transform_counts.items() transforms.sort() for ((old_name, new_name), count) in transforms: if new_name is None: logger.normal(' "%s" ignored in %d files' % (old_name, count,)) else: logger.normal( ' "%s" transformed to "%s" in %d files' % (old_name, new_name, count,) ) def process_file(self, cvs_file): logger.normal(cvs_file.rcs_path) fdc = _FileDataCollector(self, cvs_file) try: parse(open(cvs_file.rcs_path, 'rb'), fdc) except (RCSParseError, RuntimeError): self.collect_data.record_fatal_error( "%r is not a valid ,v file" % (cvs_file.rcs_path,) ) # Abort the processing of this file, but let the pass continue # with other files: return except ValueError, e: self.collect_data.record_fatal_error( "%r is not a valid ,v file (%s)" % (cvs_file.rcs_path, str(e),) ) # Abort the processing of this file, but let the pass continue # with other files: return except: logger.warn("Exception occurred while parsing %s" % cvs_file.rcs_path) raise else: self.num_files += 1 return fdc.get_cvs_file_items() class CollectData: """Repository for data collected by parsing the CVS repository files. This class manages the databases into which information collected from the CVS repository is stored. The data are stored into this class by _FileDataCollector instances, one of which is created for each file to be parsed.""" def __init__(self, stats_keeper): self._cvs_item_store = NewCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_STORE)) self.metadata_db = MetadataDatabase( artifact_manager.get_temp_file(config.METADATA_STORE), artifact_manager.get_temp_file(config.METADATA_INDEX_TABLE), DB_OPEN_NEW, ) self.metadata_logger = MetadataLogger(self.metadata_db) self.fatal_errors = [] self.num_files = 0 self.symbol_stats = SymbolStatisticsCollector() self.stats_keeper = stats_keeper # Key generator for CVSItems: self.item_key_generator = KeyGenerator() # Key generator for Symbols: self.symbol_key_generator = KeyGenerator() def record_fatal_error(self, err): """Record that fatal error ERR was found. ERR is a string (without trailing newline) describing the error. Output the error to stderr immediately, and record a copy to be output again in a summary at the end of CollectRevsPass.""" err = '%s: %s' % (error_prefix, err,) logger.error(err + '\n') self.fatal_errors.append(err) def add_cvs_directory(self, cvs_directory): """Record CVS_DIRECTORY.""" Ctx()._cvs_path_db.log_path(cvs_directory) def add_cvs_file_items(self, cvs_file_items): """Record the information from CVS_FILE_ITEMS. 
Store the CVSFile to _cvs_path_db under its persistent id, store the CVSItems, and record the CVSItems to self.stats_keeper.""" Ctx()._cvs_path_db.log_path(cvs_file_items.cvs_file) self._cvs_item_store.add(cvs_file_items) self.stats_keeper.record_cvs_file(cvs_file_items.cvs_file) for cvs_item in cvs_file_items.values(): self.stats_keeper.record_cvs_item(cvs_item) def register_trunk(self, trunk): """Create a symbol statistics record for the specified trunk LOD.""" # This causes a record to spring into existence: self.symbol_stats[trunk] def _process_cvs_file_items(self, cvs_file_items): """Process the CVSFileItems from one CVSFile.""" # Remove an initial delete on trunk if it is not needed: cvs_file_items.remove_unneeded_initial_trunk_delete(self.metadata_db) # Remove initial branch deletes that are not needed: cvs_file_items.remove_initial_branch_deletes(self.metadata_db) # If this is a --trunk-only conversion, discard all branches and # tags, then draft any non-trunk default branch revisions to # trunk: if Ctx().trunk_only: cvs_file_items.exclude_non_trunk() cvs_file_items.check_link_consistency() self.add_cvs_file_items(cvs_file_items) self.symbol_stats.register(cvs_file_items) def process_project(self, project, cvs_paths): pdc = _ProjectDataCollector(self, project) found_rcs_file = False for cvs_path in cvs_paths: if isinstance(cvs_path, CVSDirectory): self.add_cvs_directory(cvs_path) else: cvs_file_items = pdc.process_file(cvs_path) self._process_cvs_file_items(cvs_file_items) found_rcs_file = True if not found_rcs_file: self.record_fatal_error( 'No RCS files found under %r!\n' 'Are you absolutely certain you are pointing cvs2svn\n' 'at a CVS repository?\n' % (project.project_cvs_repos_path,) ) pdc.summarize_symbol_transforms() self.num_files += pdc.num_files logger.verbose('Processed', self.num_files, 'files') def _register_empty_subdirectories(self): """Set the CVSDirectory.empty_subdirectory_id members.""" directories = set( path for path in Ctx()._cvs_path_db.itervalues() if isinstance(path, CVSDirectory) ) for path in Ctx()._cvs_path_db.itervalues(): if isinstance(path, CVSFile): directory = path.parent_directory while directory is not None and directory in directories: directories.remove(directory) directory = directory.parent_directory for directory in directories: if directory.parent_directory is not None: directory.parent_directory.empty_subdirectory_ids.append(directory.id) def close(self): """Close the data structures associated with this instance. Return a list of fatal errors encountered while processing input. Each list entry is a string describing one fatal error.""" self.symbol_stats.purge_ghost_symbols() self.symbol_stats.close() self.symbol_stats = None self.metadata_logger = None self.metadata_db.close() self.metadata_db = None self._cvs_item_store.close() self._cvs_item_store = None self._register_empty_subdirectories() retval = self.fatal_errors self.fatal_errors = None return retval cvs2svn-2.4.0/cvs2svn_lib/record_table.py0000664000076500007650000002750711434364604021454 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. 
# If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Classes to manage Databases of fixed-length records. The databases map small, non-negative integers to fixed-size records. The records are written in index order to a disk file. Gaps in the index sequence leave gaps in the data file, so for best space efficiency the indexes of existing records should be approximately continuous. To use a RecordTable, you need a class derived from Packer which can serialize/deserialize your records into fixed-size strings. Deriving classes have to specify how to pack records into strings and unpack strings into records by overwriting the pack() and unpack() methods respectively. Note that these classes keep track of gaps in the records that have been written by filling them with packer.empty_value. If a record is read which contains packer.empty_value, then a KeyError is raised.""" import os import types import struct import mmap from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.common import DB_OPEN_WRITE from cvs2svn_lib.common import DB_OPEN_NEW from cvs2svn_lib.log import logger # A unique value that can be used to stand for "unset" without # preventing the use of None. _unset = object() class Packer(object): def __init__(self, record_len, empty_value=None): self.record_len = record_len if empty_value is None: self.empty_value = '\0' * self.record_len else: assert type(empty_value) is types.StringType assert len(empty_value) == self.record_len self.empty_value = empty_value def pack(self, v): """Pack record V into a string of length self.record_len.""" raise NotImplementedError() def unpack(self, s): """Unpack string S into a record.""" raise NotImplementedError() class StructPacker(Packer): def __init__(self, format, empty_value=_unset): self.format = format if empty_value is not _unset: empty_value = self.pack(empty_value) else: empty_value = None Packer.__init__(self, struct.calcsize(self.format), empty_value=empty_value) def pack(self, v): return struct.pack(self.format, v) def unpack(self, v): return struct.unpack(self.format, v)[0] class UnsignedIntegerPacker(StructPacker): def __init__(self, empty_value=0): StructPacker.__init__(self, '=I', empty_value) class SignedIntegerPacker(StructPacker): def __init__(self, empty_value=0): StructPacker.__init__(self, '=i', empty_value) class FileOffsetPacker(Packer): """A packer suitable for file offsets. We store the 5 least significant bytes of the file offset. This is enough bits to represent 1 TiB. Of course if the computer doesn't have large file support, only the lowest 31 bits can be nonzero, and the offsets are limited to 2 GiB.""" # Convert file offsets to 8-bit little-endian unsigned longs... 
INDEX_FORMAT = '= self._max_memory_cache: self.flush() self._limit = max(self._limit, i + 1) def _get_packed_record(self, i): try: return self._cache[i][1] except KeyError: if not 0 <= i < self._limit_written: raise KeyError(i) self.f.seek(i * self._record_len) s = self.f.read(self._record_len) self._cache[i] = (False, s) if len(self._cache) >= self._max_memory_cache: self.flush() return s def close(self): self.flush() self._cache = None self.f.close() self.f = None class MmapRecordTable(AbstractRecordTable): GROWTH_INCREMENT = 65536 def __init__(self, filename, mode, packer): AbstractRecordTable.__init__(self, filename, mode, packer) if self.mode == DB_OPEN_NEW: self.python_file = open(self.filename, 'wb+') self.python_file.write('\0' * self.GROWTH_INCREMENT) self.python_file.flush() self._filesize = self.GROWTH_INCREMENT self.f = mmap.mmap( self.python_file.fileno(), self._filesize, access=mmap.ACCESS_WRITE ) # The index just beyond the last record ever written: self._limit = 0 elif self.mode == DB_OPEN_WRITE: self.python_file = open(self.filename, 'rb+') self._filesize = os.path.getsize(self.filename) self.f = mmap.mmap( self.python_file.fileno(), self._filesize, access=mmap.ACCESS_WRITE ) # The index just beyond the last record ever written: self._limit = os.path.getsize(self.filename) // self._record_len elif self.mode == DB_OPEN_READ: self.python_file = open(self.filename, 'rb') self._filesize = os.path.getsize(self.filename) self.f = mmap.mmap( self.python_file.fileno(), self._filesize, access=mmap.ACCESS_READ ) # The index just beyond the last record ever written: self._limit = os.path.getsize(self.filename) // self._record_len else: raise RuntimeError('Invalid mode %r' % self.mode) def flush(self): self.f.flush() def _set_packed_record(self, i, s): if self.mode == DB_OPEN_READ: raise RecordTableAccessError() if i < 0: raise KeyError() if i >= self._limit: # This write extends the range of valid indices. First check # whether the file has to be enlarged: new_size = (i + 1) * self._record_len if new_size > self._filesize: self._filesize = ( (new_size + self.GROWTH_INCREMENT - 1) // self.GROWTH_INCREMENT * self.GROWTH_INCREMENT ) self.f.resize(self._filesize) if i > self._limit: # Pad up to the new record with empty_value: self.f[self._limit * self._record_len:i * self._record_len] = \ self.packer.empty_value * (i - self._limit) self._limit = i + 1 self.f[i * self._record_len:(i + 1) * self._record_len] = s def _get_packed_record(self, i): if not 0 <= i < self._limit: raise KeyError(i) return self.f[i * self._record_len:(i + 1) * self._record_len] def close(self): self.flush() self.f.close() self.python_file.close() cvs2svn-2.4.0/cvs2svn_lib/bzr_output_option.py0000664000076500007650000000414711317026325022622 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
# ==================================================================== """Classes for outputting the converted repository to bzr. Relies heavily on the support for outputting to git, with a few tweaks to make the dialect of the fast-import file more suited to bzr. """ from cvs2svn_lib.git_output_option import GitOutputOption from cvs2svn_lib.symbol import Trunk class BzrOutputOption(GitOutputOption): """An OutputOption that outputs to a git-fast-import formatted file, in a dialect more suited to bzr. """ name = "Bzr" def __init__( self, dump_filename, revision_writer, author_transforms=None, tie_tag_fixup_branches=True, ): """Constructor. See superclass for meaning of parameters. """ GitOutputOption.__init__(self, dump_filename, revision_writer, author_transforms, tie_tag_fixup_branches) def get_tag_fixup_branch_name(self, svn_commit): # Use a name containing '.', which is not allowed in CVS symbols, to avoid # conflicts (though of course a conflict could still result if the user # requests symbol transformations). return 'refs/heads/tag-fixup.%s' % svn_commit.symbol.name def describe_lod_to_user(self, lod): """This needs to make sense to users of the fastimported result.""" # This sort of needs to replicate the entire branch name mapping logic from # bzr-fastimport :-/ if isinstance(lod, Trunk): return 'trunk' else: return lod.name cvs2svn-2.4.0/cvs2svn_lib/man_writer.py0000664000076500007650000001326311434364604021170 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains the ManWriter class for outputting manpages.""" import optparse import re whitespace_re = re.compile(r'\s+') def wrap(s, width=70): # Convert all whitespace substrings to single spaces: s = whitespace_re.sub(' ', s) s = s.strip() retval = [] while s: if len(s) <= width: retval.append(s) break i = s.rfind(' ', 0, width + 1) if i == -1: # There were no spaces within the first width+1 characters; break # at the next space after width: i = s.find(' ', width + 1) if i == -1: # There were no spaces in s at all. 
retval.append(s) break retval.append(s[:i].rstrip()) s = s[i+1:].lstrip() for (i,line) in enumerate(retval): if line.startswith('\'') or line.startswith('.'): # These are roff control characters and have to be escaped: retval[i] = '\\' + line return '\n'.join(retval) class ManOption(optparse.Option): """An optparse.Option that holds an explicit string for the man page.""" def __init__(self, *args, **kw): self.man_help = kw.pop('man_help') optparse.Option.__init__(self, *args, **kw) class ManWriter(object): def __init__( self, parser, section=None, date=None, source=None, manual=None, short_desc=None, synopsis=None, long_desc=None, files=None, authors=None, see_also=None, ): self.parser = parser self.section = section self.date = date self.source = source self.manual = manual self.short_desc = short_desc self.synopsis = synopsis self.long_desc = long_desc self.files = files self.authors = authors self.see_also = see_also def write_title(self, f): f.write('.\\" Process this file with\n') f.write( '.\\" groff -man -Tascii %s.%s\n' % ( self.parser.get_prog_name(), self.section, ) ) f.write( '.TH %s "%s" "%s" "%s" "%s"\n' % ( self.parser.get_prog_name().upper(), self.section, self.date.strftime('%b %d, %Y'), self.source, self.manual, ) ) def write_name(self, f): f.write('.SH "NAME"\n') f.write( '%s \- %s\n' % ( self.parser.get_prog_name(), self.short_desc, ) ) def write_synopsis(self, f): f.write('.SH "SYNOPSIS"\n') f.write(self.synopsis) def write_description(self, f): f.write('.SH "DESCRIPTION"\n') f.write(self.long_desc) def _get_option_strings(self, option): """Return a list of option strings formatted with their metavariables. This method is very similar to optparse.HelpFormatter.format_option_strings(). """ if option.takes_value(): metavar = (option.metavar or option.dest).lower() short_opts = [ '\\fB%s\\fR \\fI%s\\fR' % (opt, metavar) for opt in option._short_opts ] long_opts = [ '\\fB%s\\fR=\\fI%s\\fR' % (opt, metavar) for opt in option._long_opts ] else: short_opts = [ '\\fB%s\\fR' % (opt,) for opt in option._short_opts ] long_opts = [ '\\fB%s\\fR' % (opt,) for opt in option._long_opts ] return short_opts + long_opts def _write_option(self, f, option): man_help = getattr(option, 'man_help', option.help) if man_help is not optparse.SUPPRESS_HELP: man_help = wrap(man_help) f.write('.IP "%s"\n' % (', '.join(self._get_option_strings(option)),)) f.write('%s\n' % (man_help,)) def _write_container_help(self, f, container): for option in container.option_list: if option.help is not optparse.SUPPRESS_HELP: self._write_option(f, option) def write_options(self, f): f.write('.SH "OPTIONS"\n') if self.parser.option_list: (self._write_container_help(f, self.parser)) for group in self.parser.option_groups: f.write('.SH "%s"\n' % (group.title.upper(),)) if group.description: f.write(self.format_description(group.description) + '\n') self._write_container_help(f, group) def write_files(self, f): f.write('.SH "FILES"\n') f.write(self.files) def write_authors(self, f): f.write('.SH "AUTHORS"\n') f.write("Main authors are:\n") for author in self.authors: f.write(".br\n") f.write(author + "\n") f.write(".PP\n") f.write( "Manpage was written for the Debian GNU/Linux system by\n" "Laszlo 'GCS' Boszormenyi (but may be used by others).\n") def write_see_also(self, f): f.write('.SH "SEE ALSO"\n') f.write(', '.join([ '%s(%s)' % (name, section,) for (name, section,) in self.see_also ]) + '\n') def write_manpage(self, f): self.write_title(f) self.write_name(f) self.write_synopsis(f) self.write_description(f) 
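# --- Illustrative sketch; not part of the original man_writer module ----------
# A hypothetical use of the ManOption and ManWriter classes defined above:
# the option carries a separate, roff-oriented man_help string alongside the
# normal optparse help, and write_manpage() emits the whole page to a file
# object.  All program names, option names, and descriptions below are
# invented for illustration.
import datetime
import optparse
import sys

parser = optparse.OptionParser(prog='cvs2svn')
parser.add_option(ManOption(
    '--dry-run', action='store_true', help='do not write any output',
    man_help='Do everything except writing the converted repository.',
    ))

writer = ManWriter(
    parser,
    section='1',
    date=datetime.date.today(),
    source='cvs2svn',
    manual='User Commands',
    short_desc='convert a CVS repository to Subversion',
    synopsis='.B cvs2svn\n[\\fIOPTION\\fR]... \\fIPATH\\fR\n',
    long_desc='Convert a CVS repository into a Subversion repository.\n',
    files='None.\n',
    authors=['Example Author <example@example.org>'],
    see_also=[('svn', '1')],
    )
writer.write_manpage(sys.stdout)
# --- end of illustrative sketch ----------------------------------------------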
self.write_options(f) self.write_files(f) self.write_authors(f) self.write_see_also(f) cvs2svn-2.4.0/cvs2svn_lib/cvs_file_items.py0000664000076500007650000011531111710517256022011 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains a class to manage the CVSItems related to one file.""" import re from cvs2svn_lib.common import InternalError from cvs2svn_lib.common import FatalError from cvs2svn_lib.context import Ctx from cvs2svn_lib.log import logger from cvs2svn_lib.symbol import Trunk from cvs2svn_lib.symbol import Branch from cvs2svn_lib.symbol import Tag from cvs2svn_lib.symbol import ExcludedSymbol from cvs2svn_lib.cvs_item import CVSRevision from cvs2svn_lib.cvs_item import CVSRevisionModification from cvs2svn_lib.cvs_item import CVSRevisionAdd from cvs2svn_lib.cvs_item import CVSRevisionChange from cvs2svn_lib.cvs_item import CVSRevisionAbsent from cvs2svn_lib.cvs_item import CVSRevisionNoop from cvs2svn_lib.cvs_item import CVSSymbol from cvs2svn_lib.cvs_item import CVSBranch from cvs2svn_lib.cvs_item import CVSTag from cvs2svn_lib.cvs_item import cvs_revision_type_map from cvs2svn_lib.cvs_item import cvs_branch_type_map from cvs2svn_lib.cvs_item import cvs_tag_type_map class VendorBranchError(Exception): """There is an error in the structure of the file revision tree.""" pass class LODItems(object): def __init__(self, lod, cvs_branch, cvs_revisions, cvs_branches, cvs_tags): # The LineOfDevelopment described by this instance. self.lod = lod # The CVSBranch starting this LOD, if any; otherwise, None. self.cvs_branch = cvs_branch # The list of CVSRevisions on this LOD, if any. The CVSRevisions # are listed in dependency order. self.cvs_revisions = cvs_revisions # A list of CVSBranches that sprout from this LOD (either from # cvs_branch or from one of the CVSRevisions). self.cvs_branches = cvs_branches # A list of CVSTags that sprout from this LOD (either from # cvs_branch or from one of the CVSRevisions). self.cvs_tags = cvs_tags def is_trivial_import(self): """Return True iff this LOD is a trivial import branch in this file. A trivial import branch is a branch that was used for a single import and nothing else. Such a branch is eligible for being grafted onto trunk, even if it has branch blockers.""" return ( len(self.cvs_revisions) == 1 and self.cvs_revisions[0].ntdbr ) def is_pure_ntdb(self): """Return True iff this LOD is a pure NTDB in this file. A pure non-trunk default branch is defined to be a branch that contains only NTDB revisions (and at least one of them). Such a branch is eligible for being grafted onto trunk, even if it has branch blockers.""" return ( self.cvs_revisions and self.cvs_revisions[-1].ntdbr ) def iter_blockers(self): if self.is_pure_ntdb(): # Such a branch has no blockers, because the blockers can be # grafted to trunk. 
pass else: # Other branches are only blocked by symbols that sprout from # non-NTDB revisions: non_ntdbr_revision_ids = set() for cvs_revision in self.cvs_revisions: if not cvs_revision.ntdbr: non_ntdbr_revision_ids.add(cvs_revision.id) for cvs_tag in self.cvs_tags: if cvs_tag.source_id in non_ntdbr_revision_ids: yield cvs_tag for cvs_branch in self.cvs_branches: if cvs_branch.source_id in non_ntdbr_revision_ids: yield cvs_branch class CVSFileItems(object): def __init__(self, cvs_file, trunk, cvs_items, original_ids=None): # The file whose data this instance holds. self.cvs_file = cvs_file # The symbol that represents "Trunk" in this file. self.trunk = trunk # A map from CVSItem.id to CVSItem: self._cvs_items = {} # The cvs_item_id of each root in the CVSItem forest. (A root is # defined to be any CVSRevision with no prev_id.) self.root_ids = set() for cvs_item in cvs_items: self.add(cvs_item) if isinstance(cvs_item, CVSRevision) and cvs_item.prev_id is None: self.root_ids.add(cvs_item.id) # self.original_ids is a dict {cvs_rev.rev : cvs_rev.id} holding # the IDs originally allocated to each CVS revision number. This # member is stored for the convenience of RevisionCollectors. if original_ids is not None: self.original_ids = original_ids else: self.original_ids = {} for cvs_item in cvs_items: if isinstance(cvs_item, CVSRevision): self.original_ids[cvs_item.rev] = cvs_item.id def __getstate__(self): return (self.cvs_file.id, self.values(), self.original_ids,) def __setstate__(self, state): (cvs_file_id, cvs_items, original_ids,) = state cvs_file = Ctx()._cvs_path_db.get_path(cvs_file_id) CVSFileItems.__init__( self, cvs_file, cvs_file.project.get_trunk(), cvs_items, original_ids=original_ids, ) def add(self, cvs_item): self._cvs_items[cvs_item.id] = cvs_item def __getitem__(self, id): """Return the CVSItem with the specified ID.""" return self._cvs_items[id] def get(self, id, default=None): return self._cvs_items.get(id, default) def __delitem__(self, id): assert id not in self.root_ids del self._cvs_items[id] def values(self): return self._cvs_items.values() def check_link_consistency(self): """Check that the CVSItems are linked correctly with each other.""" for cvs_item in self.values(): try: cvs_item.check_links(self) except AssertionError: logger.error( 'Link consistency error in %s\n' 'This is probably a bug internal to cvs2svn. Please file a bug\n' 'report including the following stack trace (see FAQ for more ' 'info).' % (cvs_item,)) raise def _get_lod(self, lod, cvs_branch, start_id): """Return the indicated LODItems. LOD is the corresponding LineOfDevelopment. CVS_BRANCH is the CVSBranch instance that starts the LOD if any; otherwise it is None. START_ID is the id of the first CVSRevision on this LOD, or None if there are none.""" cvs_revisions = [] cvs_branches = [] cvs_tags = [] def process_subitems(cvs_item): """Process the branches and tags that are rooted in CVS_ITEM. CVS_ITEM can be a CVSRevision or a CVSBranch.""" for branch_id in cvs_item.branch_ids[:]: cvs_branches.append(self[branch_id]) for tag_id in cvs_item.tag_ids: cvs_tags.append(self[tag_id]) if cvs_branch is not None: # Include the symbols sprouting directly from the CVSBranch: process_subitems(cvs_branch) id = start_id while id is not None: cvs_rev = self[id] cvs_revisions.append(cvs_rev) process_subitems(cvs_rev) id = cvs_rev.next_id return LODItems(lod, cvs_branch, cvs_revisions, cvs_branches, cvs_tags) def get_lod_items(self, cvs_branch): """Return an LODItems describing the branch that starts at CVS_BRANCH. 
CVS_BRANCH must be an instance of CVSBranch contained in this CVSFileItems.""" return self._get_lod(cvs_branch.symbol, cvs_branch, cvs_branch.next_id) def iter_root_lods(self): """Iterate over the LODItems for all root LODs (non-recursively).""" for id in list(self.root_ids): cvs_item = self[id] if isinstance(cvs_item, CVSRevision): # This LOD doesn't have a CVSBranch associated with it. # Either it is Trunk, or it is a branch whose CVSBranch has # been deleted. yield self._get_lod(cvs_item.lod, None, id) elif isinstance(cvs_item, CVSBranch): # This is a Branch that has been severed from the rest of the # tree. yield self._get_lod(cvs_item.symbol, cvs_item, cvs_item.next_id) else: raise InternalError('Unexpected root item: %s' % (cvs_item,)) def _iter_tree(self, lod, cvs_branch, start_id): """Iterate over the tree that starts at the specified line of development. LOD is the LineOfDevelopment where the iteration should start. CVS_BRANCH is the CVSBranch instance that starts the LOD if any; otherwise it is None. START_ID is the id of the first CVSRevision on this LOD, or None if there are none. There are two cases handled by this routine: trunk (where LOD is a Trunk instance, CVS_BRANCH is None, and START_ID is the id of the 1.1 revision) and a branch (where LOD is a Branch instance, CVS_BRANCH is a CVSBranch instance, and START_ID is either the id of the first CVSRevision on the branch or None if there are no CVSRevisions on the branch). Note that CVS_BRANCH and START_ID cannot simultaneously be None. Yield an LODItems instance for each line of development.""" cvs_revisions = [] cvs_branches = [] cvs_tags = [] def process_subitems(cvs_item): """Process the branches and tags that are rooted in CVS_ITEM. CVS_ITEM can be a CVSRevision or a CVSBranch.""" for branch_id in cvs_item.branch_ids[:]: # Recurse into the branch: branch = self[branch_id] for lod_items in self._iter_tree( branch.symbol, branch, branch.next_id ): yield lod_items # The caller might have deleted the branch that we just # yielded. If it is no longer present, then do not add it to # the list of cvs_branches. try: cvs_branches.append(self[branch_id]) except KeyError: pass for tag_id in cvs_item.tag_ids: cvs_tags.append(self[tag_id]) if cvs_branch is not None: # Include the symbols sprouting directly from the CVSBranch: for lod_items in process_subitems(cvs_branch): yield lod_items id = start_id while id is not None: cvs_rev = self[id] cvs_revisions.append(cvs_rev) for lod_items in process_subitems(cvs_rev): yield lod_items id = cvs_rev.next_id yield LODItems(lod, cvs_branch, cvs_revisions, cvs_branches, cvs_tags) def iter_lods(self): """Iterate over LinesOfDevelopment in this file, in depth-first order. For each LOD, yield an LODItems instance. The traversal starts at each root node but returns the LODs in depth-first order. It is allowed to modify the CVSFileItems instance while the traversal is occurring, but only in ways that don't affect the tree structure above (i.e., towards the trunk from) the current LOD.""" # Make a list out of root_ids so that callers can change it: for id in list(self.root_ids): cvs_item = self[id] if isinstance(cvs_item, CVSRevision): # This LOD doesn't have a CVSBranch associated with it. # Either it is Trunk, or it is a branch whose CVSBranch has # been deleted. lod = cvs_item.lod cvs_branch = None elif isinstance(cvs_item, CVSBranch): # This is a Branch that has been severed from the rest of the # tree. 
lod = cvs_item.symbol id = cvs_item.next_id cvs_branch = cvs_item else: raise InternalError('Unexpected root item: %s' % (cvs_item,)) for lod_items in self._iter_tree(lod, cvs_branch, id): yield lod_items def iter_deltatext_ancestors(self, cvs_rev): """Generate the delta-dependency ancestors of CVS_REV. Generate then ancestors of CVS_REV in deltatext order; i.e., back along branches towards trunk, then outwards along trunk towards HEAD.""" while True: # Determine the next candidate source revision: if isinstance(cvs_rev.lod, Trunk): if cvs_rev.next_id is None: # HEAD has no ancestors, so we are done: return else: cvs_rev = self[cvs_rev.next_id] else: cvs_rev = self[cvs_rev.prev_id] yield cvs_rev def _sever_branch(self, lod_items): """Sever the branch from its source and discard the CVSBranch. LOD_ITEMS describes a branch that should be severed from its source, deleting the CVSBranch and creating a new root. Also set LOD_ITEMS.cvs_branch to None. If LOD_ITEMS has no source (e.g., because it is the trunk branch or because it has already been severed), do nothing. This method can only be used before symbols have been grafted onto CVSBranches. It does not adjust NTDBR, NTDBR_PREV_ID or NTDBR_NEXT_ID even if LOD_ITEMS describes a NTDB.""" cvs_branch = lod_items.cvs_branch if cvs_branch is None: return assert not cvs_branch.tag_ids assert not cvs_branch.branch_ids source_rev = self[cvs_branch.source_id] # We only cover the following case, even though after # FilterSymbolsPass cvs_branch.source_id might refer to another # CVSBranch. assert isinstance(source_rev, CVSRevision) # Delete the CVSBranch itself: lod_items.cvs_branch = None del self[cvs_branch.id] # Delete the reference from the source revision to the CVSBranch: source_rev.branch_ids.remove(cvs_branch.id) # Delete the reference from the first revision on the branch to # the CVSBranch: if lod_items.cvs_revisions: first_rev = lod_items.cvs_revisions[0] # Delete the reference from first_rev to the CVSBranch: first_rev.first_on_branch_id = None # Delete the reference from the source revision to the first # revision on the branch: source_rev.branch_commit_ids.remove(first_rev.id) # ...and vice versa: first_rev.prev_id = None # Change the type of first_rev (e.g., from Change to Add): first_rev.__class__ = cvs_revision_type_map[ (isinstance(first_rev, CVSRevisionModification), False,) ] # Now first_rev is a new root: self.root_ids.add(first_rev.id) def adjust_ntdbrs(self, ntdbr_cvs_revs): """Adjust the specified non-trunk default branch revisions. NTDBR_CVS_REVS is a list of CVSRevision instances in this file that have been determined to be non-trunk default branch revisions. The first revision on the default branch is handled strangely by CVS. If a file is imported (as opposed to being added), CVS creates a 1.1 revision, then creates a vendor branch 1.1.1 based on 1.1, then creates a 1.1.1.1 revision that is identical to the 1.1 revision (i.e., its deltatext is empty). The log message that the user typed when importing is stored with the 1.1.1.1 revision. The 1.1 revision always contains a standard, generated log message, 'Initial revision\n'. When we detect a straightforward import like this, we want to handle it by deleting the 1.1 revision (which doesn't contain any useful information) and making 1.1.1.1 into an independent root in the file's dependency tree. In SVN, 1.1.1.1 will be added directly to the vendor branch with its initial content. Then in a special 'post-commit', the 1.1.1.1 revision is copied back to trunk. 
If the user imports again to the same vendor branch, then CVS creates revisions 1.1.1.2, 1.1.1.3, etc. on the vendor branch, *without* counterparts in trunk (even though these revisions effectively play the role of trunk revisions). So after we add such revisions to the vendor branch, we also copy them back to trunk in post-commits. Set the ntdbr members of the revisions listed in NTDBR_CVS_REVS to True. Also, if there is a 1.2 revision, then set that revision to depend on the last non-trunk default branch revision and possibly adjust its type accordingly.""" for cvs_rev in ntdbr_cvs_revs: cvs_rev.ntdbr = True # Look for a 1.2 revision: rev_1_1 = self[ntdbr_cvs_revs[0].prev_id] rev_1_2 = self.get(rev_1_1.next_id) if rev_1_2 is not None: # Revision 1.2 logically follows the imported revisions, not # 1.1. Accordingly, connect it to the last NTDBR and possibly # change its type. last_ntdbr = ntdbr_cvs_revs[-1] rev_1_2.ntdbr_prev_id = last_ntdbr.id last_ntdbr.ntdbr_next_id = rev_1_2.id rev_1_2.__class__ = cvs_revision_type_map[( isinstance(rev_1_2, CVSRevisionModification), isinstance(last_ntdbr, CVSRevisionModification), )] def process_live_ntdb(self, vendor_lod_items): """VENDOR_LOD_ITEMS is a live default branch; process it. In this case, all revisions on the default branch are NTDBRs and it is an error if there is also a '1.2' revision. Return True iff this transformation really does something. Raise a VendorBranchError if there is a '1.2' revision.""" rev_1_1 = self[vendor_lod_items.cvs_branch.source_id] rev_1_2_id = rev_1_1.next_id if rev_1_2_id is not None: raise VendorBranchError( 'File \'%s\' has default branch=%s but also a revision %s' % (self.cvs_file.rcs_path, vendor_lod_items.cvs_branch.branch_number, self[rev_1_2_id].rev,) ) ntdbr_cvs_revs = list(vendor_lod_items.cvs_revisions) if ntdbr_cvs_revs: self.adjust_ntdbrs(ntdbr_cvs_revs) return True else: return False def process_historical_ntdb(self, vendor_lod_items): """There appears to have been a non-trunk default branch in the past. There is currently no default branch, but the branch described by file appears to have been imported. So our educated guess is that all revisions on the '1.1.1' branch (described by VENDOR_LOD_ITEMS) with timestamps prior to the timestamp of '1.2' were non-trunk default branch revisions. Return True iff this transformation really does something. This really only handles standard '1.1.1.*'-style vendor revisions. One could conceivably have a file whose default branch is 1.1.3 or whatever, or was that at some point in time, with vendor revisions 1.1.3.1, 1.1.3.2, etc. But with the default branch gone now, we'd have no basis for assuming that the non-standard vendor branch had ever been the default branch anyway. Note that we rely on comparisons between the timestamps of the revisions on the vendor branch and that of revision 1.2, even though the timestamps might be incorrect due to clock skew. We could do a slightly better job if we used the changeset timestamps, as it is possible that the dependencies that went into determining those timestamps are more accurate. But that would require an extra pass or two.""" rev_1_1 = self[vendor_lod_items.cvs_branch.source_id] rev_1_2_id = rev_1_1.next_id if rev_1_2_id is None: rev_1_2_timestamp = None else: rev_1_2_timestamp = self[rev_1_2_id].timestamp ntdbr_cvs_revs = [] for cvs_rev in vendor_lod_items.cvs_revisions: if rev_1_2_timestamp is not None \ and cvs_rev.timestamp >= rev_1_2_timestamp: # That's the end of the once-default branch. 
break ntdbr_cvs_revs.append(cvs_rev) if ntdbr_cvs_revs: self.adjust_ntdbrs(ntdbr_cvs_revs) return True else: return False def imported_remove_1_1(self, vendor_lod_items): """This file was imported. Remove the 1.1 revision if possible. VENDOR_LOD_ITEMS is the LODItems instance for the vendor branch. See adjust_ntdbrs() for more information.""" assert vendor_lod_items.cvs_revisions cvs_rev = vendor_lod_items.cvs_revisions[0] if isinstance(cvs_rev, CVSRevisionModification) \ and not cvs_rev.deltatext_exists: cvs_branch = vendor_lod_items.cvs_branch rev_1_1 = self[cvs_branch.source_id] assert isinstance(rev_1_1, CVSRevision) logger.debug('Removing unnecessary revision %s' % (rev_1_1,)) # Delete the 1.1.1 CVSBranch and sever the vendor branch from trunk: self._sever_branch(vendor_lod_items) # Delete rev_1_1: self.root_ids.remove(rev_1_1.id) del self[rev_1_1.id] rev_1_2_id = rev_1_1.next_id if rev_1_2_id is not None: rev_1_2 = self[rev_1_2_id] rev_1_2.prev_id = None self.root_ids.add(rev_1_2.id) # Move any tags and branches from rev_1_1 to cvs_rev: cvs_rev.tag_ids.extend(rev_1_1.tag_ids) for id in rev_1_1.tag_ids: cvs_tag = self[id] cvs_tag.source_lod = cvs_rev.lod cvs_tag.source_id = cvs_rev.id cvs_rev.branch_ids[0:0] = rev_1_1.branch_ids for id in rev_1_1.branch_ids: cvs_branch = self[id] cvs_branch.source_lod = cvs_rev.lod cvs_branch.source_id = cvs_rev.id cvs_rev.branch_commit_ids[0:0] = rev_1_1.branch_commit_ids for id in rev_1_1.branch_commit_ids: cvs_rev2 = self[id] cvs_rev2.prev_id = cvs_rev.id def _is_unneeded_initial_trunk_delete(self, cvs_item, metadata_db): if not isinstance(cvs_item, CVSRevisionNoop): # This rule can only be applied to dead revisions. return False if cvs_item.rev != '1.1': return False if not isinstance(cvs_item.lod, Trunk): return False if cvs_item.closed_symbols: return False if cvs_item.ntdbr: return False log_msg = metadata_db[cvs_item.metadata_id].log_msg return bool( re.match( r'file .* was initially added on branch .*\.\n$', log_msg, ) or re.match( # This variant commit message was reported by one user: r'file .* was added on branch .*\n$', log_msg, ) ) def remove_unneeded_initial_trunk_delete(self, metadata_db): """Remove unneeded deletes for this file. If a file is added on a branch, then a trunk revision is added at the same time in the 'Dead' state. This revision doesn't do anything useful, so delete it.""" for id in self.root_ids: cvs_item = self[id] if self._is_unneeded_initial_trunk_delete(cvs_item, metadata_db): logger.debug('Removing unnecessary delete %s' % (cvs_item,)) # Sever any CVSBranches rooted at cvs_item. 
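# --- Illustrative sketch; not part of the original module ---------------------
# The regular expressions in _is_unneeded_initial_trunk_delete() above match
# the log messages that CVS generates for the dead 1.1 revision of a file
# that was added on a branch.  A hypothetical example of such a message
# (file and branch names invented):
import re

log_msg = 'file foo.c was initially added on branch my-branch.\n'
assert re.match(r'file .* was initially added on branch .*\.\n$', log_msg)
# --- end of illustrative sketch ----------------------------------------------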
for cvs_branch_id in cvs_item.branch_ids[:]: cvs_branch = self[cvs_branch_id] self._sever_branch(self.get_lod_items(cvs_branch)) # Tagging a dead revision doesn't do anything, so remove any # CVSTags that refer to cvs_item: while cvs_item.tag_ids: del self[cvs_item.tag_ids.pop()] # Now delete cvs_item itself: self.root_ids.remove(cvs_item.id) del self[cvs_item.id] if cvs_item.next_id is not None: cvs_rev_next = self[cvs_item.next_id] cvs_rev_next.prev_id = None self.root_ids.add(cvs_rev_next.id) # This can only happen once per file, so we're done: return def _is_unneeded_initial_branch_delete(self, lod_items, metadata_db): """Return True iff the initial revision in LOD_ITEMS can be deleted.""" if not lod_items.cvs_revisions: return False cvs_revision = lod_items.cvs_revisions[0] if cvs_revision.ntdbr: return False if not isinstance(cvs_revision, CVSRevisionAbsent): return False if cvs_revision.branch_ids: return False log_msg = metadata_db[cvs_revision.metadata_id].log_msg return bool(re.match( r'file .* was added on branch .* on ' r'\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2}( [\+\-]\d{4})?' '\n$', log_msg, )) def remove_initial_branch_deletes(self, metadata_db): """If the first revision on a branch is an unnecessary delete, remove it. If a file is added on a branch (whether or not it already existed on trunk), then new versions of CVS add a first branch revision in the 'dead' state (to indicate that the file did not exist on the branch when the branch was created) followed by the second branch revision, which is an add. When we encounter this situation, we sever the branch from trunk and delete the first branch revision.""" for lod_items in self.iter_lods(): if self._is_unneeded_initial_branch_delete(lod_items, metadata_db): cvs_revision = lod_items.cvs_revisions[0] logger.debug( 'Removing unnecessary initial branch delete %s' % (cvs_revision,) ) # Sever the branch from its source if necessary: self._sever_branch(lod_items) # Delete the first revision on the branch: self.root_ids.remove(cvs_revision.id) del self[cvs_revision.id] # If it had a successor, adjust its backreference and add it # to the root_ids: if cvs_revision.next_id is not None: cvs_rev_next = self[cvs_revision.next_id] cvs_rev_next.prev_id = None self.root_ids.add(cvs_rev_next.id) # Tagging a dead revision doesn't do anything, so remove any # tags that were set on it: for tag_id in cvs_revision.tag_ids: del self[tag_id] def _exclude_tag(self, cvs_tag): """Exclude the specified CVS_TAG.""" del self[cvs_tag.id] # A CVSTag is the successor of the CVSRevision that it # sprouts from. Delete this tag from that revision's # tag_ids: self[cvs_tag.source_id].tag_ids.remove(cvs_tag.id) def _exclude_branch(self, lod_items): """Exclude the branch described by LOD_ITEMS, including its revisions. (Do not update the LOD_ITEMS instance itself.) If the LOD starts with non-trunk default branch revisions, leave the branch and the NTDB revisions in place, but delete any subsequent revisions that are not NTDB revisions. 
In this case, return True; otherwise return False""" if lod_items.cvs_revisions and lod_items.cvs_revisions[0].ntdbr: for cvs_rev in lod_items.cvs_revisions: if not cvs_rev.ntdbr: # We've found the first non-NTDBR, and it's stored in cvs_rev: break else: # There was no revision following the NTDBRs: cvs_rev = None if cvs_rev: last_ntdbr = self[cvs_rev.prev_id] last_ntdbr.next_id = None while True: del self[cvs_rev.id] if cvs_rev.next_id is None: break cvs_rev = self[cvs_rev.next_id] return True else: if lod_items.cvs_branch is not None: # Delete the CVSBranch itself: cvs_branch = lod_items.cvs_branch del self[cvs_branch.id] # A CVSBranch is the successor of the CVSRevision that it # sprouts from. Delete this branch from that revision's # branch_ids: self[cvs_branch.source_id].branch_ids.remove(cvs_branch.id) if lod_items.cvs_revisions: # The first CVSRevision on the branch has to be either detached # from the revision from which the branch sprang, or removed # from self.root_ids: cvs_rev = lod_items.cvs_revisions[0] if cvs_rev.prev_id is None: self.root_ids.remove(cvs_rev.id) else: self[cvs_rev.prev_id].branch_commit_ids.remove(cvs_rev.id) for cvs_rev in lod_items.cvs_revisions: del self[cvs_rev.id] return False def graft_ntdbr_to_trunk(self): """Graft the non-trunk default branch revisions to trunk. They should already be alone on a branch that may or may not have a CVSBranch connecting it to trunk.""" for lod_items in self.iter_lods(): if lod_items.cvs_revisions and lod_items.cvs_revisions[0].ntdbr: assert lod_items.is_pure_ntdb() first_rev = lod_items.cvs_revisions[0] last_rev = lod_items.cvs_revisions[-1] rev_1_1 = self.get(first_rev.prev_id) rev_1_2 = self.get(last_rev.ntdbr_next_id) self._sever_branch(lod_items) if rev_1_1 is not None: rev_1_1.next_id = first_rev.id first_rev.prev_id = rev_1_1.id self.root_ids.remove(first_rev.id) first_rev.__class__ = cvs_revision_type_map[( isinstance(first_rev, CVSRevisionModification), isinstance(rev_1_1, CVSRevisionModification), )] if rev_1_2 is not None: rev_1_2.ntdbr_prev_id = None last_rev.ntdbr_next_id = None if rev_1_2.prev_id is None: self.root_ids.remove(rev_1_2.id) rev_1_2.prev_id = last_rev.id last_rev.next_id = rev_1_2.id # The effective_pred_id of rev_1_2 was not changed, so we # don't have to change rev_1_2's type. for cvs_rev in lod_items.cvs_revisions: cvs_rev.ntdbr = False cvs_rev.lod = self.trunk for cvs_branch in lod_items.cvs_branches: cvs_branch.source_lod = self.trunk for cvs_tag in lod_items.cvs_tags: cvs_tag.source_lod = self.trunk return def exclude_non_trunk(self): """Delete all tags and branches.""" ntdbr_excluded = False for lod_items in self.iter_lods(): for cvs_tag in lod_items.cvs_tags[:]: self._exclude_tag(cvs_tag) lod_items.cvs_tags.remove(cvs_tag) if not isinstance(lod_items.lod, Trunk): assert not lod_items.cvs_branches ntdbr_excluded |= self._exclude_branch(lod_items) if ntdbr_excluded: self.graft_ntdbr_to_trunk() def filter_excluded_symbols(self): """Delete any excluded symbols and references to them.""" ntdbr_excluded = False for lod_items in self.iter_lods(): # Delete any excluded tags: for cvs_tag in lod_items.cvs_tags[:]: if isinstance(cvs_tag.symbol, ExcludedSymbol): self._exclude_tag(cvs_tag) lod_items.cvs_tags.remove(cvs_tag) # Delete the whole branch if it is to be excluded: if isinstance(lod_items.lod, ExcludedSymbol): # A symbol can only be excluded if no other symbols spring # from it. This was already checked in CollateSymbolsPass, so # these conditions should already be satisfied. 
assert not list(lod_items.iter_blockers()) ntdbr_excluded |= self._exclude_branch(lod_items) if ntdbr_excluded: self.graft_ntdbr_to_trunk() def _mutate_branch_to_tag(self, cvs_branch): """Mutate the branch CVS_BRANCH into a tag.""" if cvs_branch.next_id is not None: # This shouldn't happen because it was checked in # CollateSymbolsPass: raise FatalError('Attempt to exclude a branch with commits.') cvs_tag = CVSTag( cvs_branch.id, cvs_branch.cvs_file, cvs_branch.symbol, cvs_branch.source_lod, cvs_branch.source_id, cvs_branch.revision_reader_token, ) self.add(cvs_tag) cvs_revision = self[cvs_tag.source_id] cvs_revision.branch_ids.remove(cvs_tag.id) cvs_revision.tag_ids.append(cvs_tag.id) def _mutate_tag_to_branch(self, cvs_tag): """Mutate the tag into a branch.""" cvs_branch = CVSBranch( cvs_tag.id, cvs_tag.cvs_file, cvs_tag.symbol, None, cvs_tag.source_lod, cvs_tag.source_id, None, cvs_tag.revision_reader_token, ) self.add(cvs_branch) cvs_revision = self[cvs_branch.source_id] cvs_revision.tag_ids.remove(cvs_branch.id) cvs_revision.branch_ids.append(cvs_branch.id) def _mutate_symbol(self, cvs_symbol): """Mutate CVS_SYMBOL if necessary.""" symbol = cvs_symbol.symbol if isinstance(cvs_symbol, CVSBranch) and isinstance(symbol, Tag): self._mutate_branch_to_tag(cvs_symbol) elif isinstance(cvs_symbol, CVSTag) and isinstance(symbol, Branch): self._mutate_tag_to_branch(cvs_symbol) def mutate_symbols(self): """Force symbols to be tags/branches based on self.symbol_db.""" for cvs_item in self.values(): if isinstance(cvs_item, CVSRevision): # This CVSRevision may be affected by the mutation of any # CVSSymbols that it references, but there is nothing to do # here directly. pass elif isinstance(cvs_item, CVSSymbol): self._mutate_symbol(cvs_item) else: raise RuntimeError('Unknown cvs item type') def _adjust_tag_parent(self, cvs_tag): """Adjust the parent of CVS_TAG if possible and preferred. CVS_TAG is an instance of CVSTag. This method must be called in leaf-to-trunk order.""" # The Symbol that cvs_tag would like to have as a parent: preferred_parent = Ctx()._symbol_db.get_symbol( cvs_tag.symbol.preferred_parent_id) if cvs_tag.source_lod == preferred_parent: # The preferred parent is already the parent. return # The CVSRevision that is its direct parent: source = self[cvs_tag.source_id] assert isinstance(source, CVSRevision) if isinstance(preferred_parent, Trunk): # It is not possible to graft *onto* Trunk: return # Try to find the preferred parent among the possible parents: for branch_id in source.branch_ids: if self[branch_id].symbol == preferred_parent: # We found it! break else: # The preferred parent is not a possible parent in this file. return parent = self[branch_id] assert isinstance(parent, CVSBranch) logger.debug('Grafting %s from %s (on %s) onto %s' % ( cvs_tag, source, source.lod, parent,)) # Switch parent: source.tag_ids.remove(cvs_tag.id) parent.tag_ids.append(cvs_tag.id) cvs_tag.source_lod = parent.symbol cvs_tag.source_id = parent.id def _adjust_branch_parents(self, cvs_branch): """Adjust the parent of CVS_BRANCH if possible and preferred. CVS_BRANCH is an instance of CVSBranch. This method must be called in leaf-to-trunk order.""" # The Symbol that cvs_branch would like to have as a parent: preferred_parent = Ctx()._symbol_db.get_symbol( cvs_branch.symbol.preferred_parent_id) if cvs_branch.source_lod == preferred_parent: # The preferred parent is already the parent. 
return # The CVSRevision that is its direct parent: source = self[cvs_branch.source_id] # This is always a CVSRevision because we haven't adjusted it yet: assert isinstance(source, CVSRevision) if isinstance(preferred_parent, Trunk): # It is not possible to graft *onto* Trunk: return # Try to find the preferred parent among the possible parents: for branch_id in source.branch_ids: possible_parent = self[branch_id] if possible_parent.symbol == preferred_parent: # We found it! break elif possible_parent.symbol == cvs_branch.symbol: # Only branches that precede the branch to be adjusted are # considered possible parents. Leave parentage unchanged: return else: # This point should never be reached. raise InternalError( 'Possible parent search did not terminate as expected') parent = possible_parent assert isinstance(parent, CVSBranch) logger.debug('Grafting %s from %s (on %s) onto %s' % ( cvs_branch, source, source.lod, parent,)) # Switch parent: source.branch_ids.remove(cvs_branch.id) parent.branch_ids.append(cvs_branch.id) cvs_branch.source_lod = parent.symbol cvs_branch.source_id = parent.id def adjust_parents(self): """Adjust the parents of symbols to their preferred parents. If a CVSSymbol has a preferred parent that is different than its current parent, and if the preferred parent is an allowed parent of the CVSSymbol in this file, then graft the CVSSymbol onto its preferred parent.""" for lod_items in self.iter_lods(): for cvs_tag in lod_items.cvs_tags: self._adjust_tag_parent(cvs_tag) # It is important to process branches in reverse order, so that # a branch graft target (which necessarily occurs earlier in the # list than the branch itself) is not moved before the branch # itself. for cvs_branch in reversed(lod_items.cvs_branches): self._adjust_branch_parents(cvs_branch) def _get_revision_source(self, cvs_symbol): """Return the CVSRevision that is the ultimate source of CVS_SYMBOL.""" while True: cvs_item = self[cvs_symbol.source_id] if isinstance(cvs_item, CVSRevision): return cvs_item else: cvs_symbol = cvs_item def refine_symbols(self): """Refine the types of the CVSSymbols in this file. Adjust the symbol types based on whether the source exists: CVSBranch vs. CVSBranchNoop and CVSTag vs. 
CVSTagNoop.""" for lod_items in self.iter_lods(): for cvs_tag in lod_items.cvs_tags: source = self._get_revision_source(cvs_tag) cvs_tag.__class__ = cvs_tag_type_map[ isinstance(source, CVSRevisionModification) ] for cvs_branch in lod_items.cvs_branches: source = self._get_revision_source(cvs_branch) cvs_branch.__class__ = cvs_branch_type_map[ isinstance(source, CVSRevisionModification) ] def determine_revision_properties(self, revision_property_setters): """Set the properties and properties_changed fields on CVSRevisions.""" for lod_items in self.iter_lods(): for cvs_rev in lod_items.cvs_revisions: cvs_rev.properties = {} for revision_property_setter in revision_property_setters: revision_property_setter.set_properties(cvs_rev) for lod_items in self.iter_lods(): for cvs_rev in lod_items.cvs_revisions: if isinstance(cvs_rev, CVSRevisionAdd): cvs_rev.properties_changed = True elif isinstance(cvs_rev, CVSRevisionChange): prev_properties = self[ cvs_rev.get_effective_prev_id() ].get_properties() properties = cvs_rev.get_properties() cvs_rev.properties_changed = properties != prev_properties else: cvs_rev.properties_changed = False def record_opened_symbols(self): """Set CVSRevision.opened_symbols for the surviving revisions.""" for cvs_item in self.values(): if isinstance(cvs_item, (CVSRevision, CVSBranch)): cvs_item.opened_symbols = [] for cvs_symbol_opened_id in cvs_item.get_cvs_symbol_ids_opened(): cvs_symbol_opened = self[cvs_symbol_opened_id] cvs_item.opened_symbols.append( (cvs_symbol_opened.symbol.id, cvs_symbol_opened.id,) ) def record_closed_symbols(self): """Set CVSRevision.closed_symbols for the surviving revisions. A CVSRevision closes the symbols that were opened by the CVSItems that the CVSRevision closes. Got it? This method must be called after record_opened_symbols().""" for cvs_item in self.values(): if isinstance(cvs_item, CVSRevision): cvs_item.closed_symbols = [] for cvs_item_closed_id in cvs_item.get_ids_closed(): cvs_item_closed = self[cvs_item_closed_id] cvs_item.closed_symbols.extend(cvs_item_closed.opened_symbols) cvs2svn-2.4.0/cvs2svn_lib/checkout_internal.py0000664000076500007650000006601011710517256022520 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes that implement the --use-internal-co option. The idea is to patch up the revisions' contents incrementally, thus avoiding the huge number of process spawns and the O(n^2) overhead of using 'co' and 'cvs'. InternalRevisionCollector saves the RCS deltas and RCS revision trees to databases. Notably, deltas from the trunk need to be reversed, as CVS stores them so they apply from HEAD backwards. InternalRevisionReader produces the revisions' contents on demand. 
To generate the text for a typical revision, we need the revision's delta text plus the fulltext of the previous revision. Therefore, we maintain a checkout database containing a copy of the fulltext of any revision for which subsequent revisions still need to be retrieved. It is crucial to remove text from this database as soon as it is no longer needed, to prevent it from growing enormous. There are two reasons that the text from a revision can be needed: (1) because the revision itself still needs to be output to a dumpfile; (2) because another revision needs it as the base of its delta. We maintain a reference count for each revision, which includes *both* possibilities. The first time a revision's text is needed, it is generated by applying the revision's deltatext to the previous revision's fulltext, and the resulting fulltext is stored in the checkout database. Each time a revision's fulltext is retrieved, its reference count is decremented. When the reference count goes to zero, then the fulltext is deleted from the checkout database. The administrative data for managing this consists of one TextRecord entry for each revision. Each TextRecord has an id, which is the same id as used for the corresponding CVSRevision instance. It also maintains a count of the times it is expected to be retrieved. TextRecords come in several varieties: FullTextRecord -- Used for revisions whose fulltext is derived directly from the RCS file by the InternalRevisionCollector (i.e., typically revision 1.1 of each file). DeltaTextRecord -- Used for revisions that are defined via a delta relative to some other TextRecord. These records record the id of the TextRecord that holds the base text against which the delta is defined. When the text for a DeltaTextRecord is retrieved, the DeltaTextRecord instance is deleted and a CheckedOutTextRecord instance is created to take its place. CheckedOutTextRecord -- Used during OutputPass for a revision that started out as a DeltaTextRecord, but has already been retrieved (and therefore its fulltext is stored in the checkout database). While a file is being processed during FilterSymbolsPass, the fulltext and deltas are stored to the delta database, and TextRecord instances are created to keep track of things. The reference counts are all initialized: each record referred to by a delta has its refcount incremented, and each record that corresponds to a non-delete CVSRevision is incremented. After that, any records with refcount==0 are removed. When one record is removed, that can cause another record's reference count to go to zero and be removed too, recursively. 
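# --- Illustrative sketch; not part of the original module ---------------------
# A generic, self-contained model (not the cvs2svn implementation) of the
# cascading removal described above: each record holds a reference count plus
# the id of the record it depends on; when a refcount reaches zero the record
# is removed and the refcount of its dependency is decremented in turn.  The
# real TextRecordDatabase.discard() defined below does this iteratively, via
# a queue of deferred deletes, to avoid deep recursion on long chains.
class _ToyRecord(object):
  def __init__(self, pred_id, refcount):
    self.pred_id = pred_id      # id of the record this one depends on
    self.refcount = refcount    # number of outstanding uses


def release(records, id):
  """Decrement the refcount of records[id]; cascade removals iteratively."""
  pending = [id]
  while pending:
    current_id = pending.pop()
    record = records[current_id]
    record.refcount -= 1
    if record.refcount == 0:
      del records[current_id]
      if record.pred_id is not None:
        pending.append(record.pred_id)


records = {
    1: _ToyRecord(None, 2),  # e.g. fulltext of 1.1: output + base of 1.2 delta
    2: _ToyRecord(1, 2),     # delta 1.1 -> 1.2: output + base of 1.3 delta
    3: _ToyRecord(2, 1),     # delta 1.2 -> 1.3: output only
    }
release(records, 3)          # 3 is freed; 2's refcount drops to 1
release(records, 2)          # 2 is freed; 1's refcount drops to 1
release(records, 1)          # 1 is freed
assert records == {}
# --- end of illustrative sketch ----------------------------------------------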
When a TextRecord is deleted at this stage, its deltatext is also deleted from the delta database.""" from cvs2svn_lib import config from cvs2svn_lib.common import DB_OPEN_NEW from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.common import warning_prefix from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import InternalError from cvs2svn_lib.common import canonicalize_eol from cvs2svn_lib.common import is_trunk_revision from cvs2svn_lib.context import Ctx from cvs2svn_lib.log import logger from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.cvs_item import CVSRevisionModification from cvs2svn_lib.indexed_database import IndexedDatabase from cvs2svn_lib.rcs_stream import RCSStream from cvs2svn_lib.rcs_stream import MalformedDeltaException from cvs2svn_lib.keyword_expander import expand_keywords from cvs2svn_lib.keyword_expander import collapse_keywords from cvs2svn_lib.revision_manager import RevisionCollector from cvs2svn_lib.revision_manager import RevisionReader from cvs2svn_lib.serializer import MarshalSerializer from cvs2svn_lib.serializer import CompressingSerializer from cvs2svn_lib.serializer import PrimedPickleSerializer from cvs2svn_lib.apple_single_filter import get_maybe_apple_single from cvs2svn_lib.rcsparser import Sink from cvs2svn_lib.rcsparser import parse class TextRecord(object): """Bookkeeping data for the text of a single CVSRevision.""" __slots__ = ['id', 'refcount'] def __init__(self, id): # The cvs_rev_id of the revision whose text this is. self.id = id # The number of times that the text of this revision will be # retrieved. self.refcount = 0 def __getstate__(self): return (self.id, self.refcount,) def __setstate__(self, state): (self.id, self.refcount,) = state def increment_dependency_refcounts(self, text_record_db): """Increment the refcounts of any records that this one depends on.""" pass def decrement_refcount(self, text_record_db): """Decrement the number of times our text still has to be checked out. If the reference count goes to zero, call discard().""" self.refcount -= 1 if self.refcount == 0: text_record_db.discard(self.id) def checkout(self, text_record_db): """Workhorse of the checkout process. Return the text for this revision, decrement our reference count, and update the databases depending on whether there will be future checkouts.""" raise NotImplementedError() def free(self, text_record_db): """This instance will never again be checked out; free it. Also free any associated resources and decrement the refcounts of any other TextRecords that this one depends on.""" raise NotImplementedError() class FullTextRecord(TextRecord): """A record whose revision's fulltext is stored in the delta_db. These records are used for revisions whose fulltext was determined by the InternalRevisionCollector during FilterSymbolsPass. The fulltext for such a revision is is stored in the delta_db as a single string.""" __slots__ = [] def __getstate__(self): return (self.id, self.refcount,) def __setstate__(self, state): (self.id, self.refcount,) = state def checkout(self, text_record_db): text = text_record_db.delta_db[self.id] self.decrement_refcount(text_record_db) return text def free(self, text_record_db): del text_record_db.delta_db[self.id] def __str__(self): return 'FullTextRecord(%x, %d)' % (self.id, self.refcount,) class DeltaTextRecord(TextRecord): """A record whose revision's delta is stored as an RCS delta. 
The text of this revision must be derived by applying an RCS delta to the text of the predecessor revision. The RCS delta is stored in the delta_db.""" __slots__ = ['pred_id'] def __init__(self, id, pred_id): TextRecord.__init__(self, id) # The cvs_rev_id of the revision relative to which this delta is # defined. self.pred_id = pred_id def __getstate__(self): return (self.id, self.refcount, self.pred_id,) def __setstate__(self, state): (self.id, self.refcount, self.pred_id,) = state def increment_dependency_refcounts(self, text_record_db): text_record_db[self.pred_id].refcount += 1 def checkout(self, text_record_db): base_text = text_record_db[self.pred_id].checkout(text_record_db) rcs_stream = RCSStream(base_text) delta_text = text_record_db.delta_db[self.id] rcs_stream.apply_diff(delta_text) text = rcs_stream.get_text() del rcs_stream self.refcount -= 1 if self.refcount == 0: # This text will never be needed again; just delete ourselves # without ever having stored the fulltext to the checkout # database: del text_record_db[self.id] else: # Store a new CheckedOutTextRecord in place of ourselves: text_record_db.checkout_db['%x' % self.id] = text new_text_record = CheckedOutTextRecord(self.id) new_text_record.refcount = self.refcount text_record_db.replace(new_text_record) return text def free(self, text_record_db): del text_record_db.delta_db[self.id] text_record_db[self.pred_id].decrement_refcount(text_record_db) def __str__(self): return 'DeltaTextRecord(%x -> %x, %d)' % ( self.pred_id, self.id, self.refcount, ) class CheckedOutTextRecord(TextRecord): """A record whose revision's fulltext is stored in the text_record_db. These records are used for revisions whose fulltext has been computed already during OutputPass. The fulltext for such a revision is stored in the text_record_db as a single string.""" __slots__ = [] def __getstate__(self): return (self.id, self.refcount,) def __setstate__(self, state): (self.id, self.refcount,) = state def checkout(self, text_record_db): text = text_record_db.checkout_db['%x' % self.id] self.decrement_refcount(text_record_db) return text def free(self, text_record_db): del text_record_db.checkout_db['%x' % self.id] def __str__(self): return 'CheckedOutTextRecord(%x, %d)' % (self.id, self.refcount,) class NullDatabase(object): """A do-nothing database that can be used with TextRecordDatabase. Use this when you don't actually want to allow anything to be deleted.""" def __delitem__(self, id): pass class TextRecordDatabase: """Holds the TextRecord instances that are currently live. During FilterSymbolsPass, files are processed one by one and a new TextRecordDatabase instance is used for each file. During OutputPass, a single TextRecordDatabase instance is used for the duration of OutputPass; individual records are added and removed when they are active.""" def __init__(self, delta_db, checkout_db): # A map { cvs_rev_id -> TextRecord }. self.text_records = {} # A database-like object using cvs_rev_ids as keys and containing # fulltext/deltatext strings as values. Its __getitem__() method # is used to retrieve deltas when they are needed, and its # __delitem__() method is used to delete deltas when they can be # freed. The modifiability of the delta database varies from pass # to pass, so the object stored here varies as well: # # FilterSymbolsPass: a NullDatabase. The delta database cannot be # modified during this pass, and we have no need to retrieve # deltas, so we just use a dummy object here. # # OutputPass: a disabled IndexedDatabase. 
During this pass we # need to retrieve deltas, but we are not allowed to modify # the delta database. So we use an IndexedDatabase whose # __del__() method has been disabled to do nothing. self.delta_db = delta_db # A database-like object using cvs_rev_ids as keys and containing # fulltext strings as values. This database is only set during # OutputPass. self.checkout_db = checkout_db # If this is set to a list, then the list holds the ids of # text_records that have to be deleted; when discard() is called, # it adds the requested id to the list but does not delete it. If # this member is set to None, then text_records are deleted # immediately when discard() is called. self.deferred_deletes = None def __getstate__(self): return (self.text_records.values(),) def __setstate__(self, state): (text_records,) = state self.text_records = {} for text_record in text_records: self.add(text_record) self.delta_db = NullDatabase() self.checkout_db = NullDatabase() self.deferred_deletes = None def add(self, text_record): """Add TEXT_RECORD to our database. There must not already be a record with the same id.""" assert not self.text_records.has_key(text_record.id) self.text_records[text_record.id] = text_record def __getitem__(self, id): return self.text_records[id] def __delitem__(self, id): """Free the record with the specified ID.""" del self.text_records[id] def replace(self, text_record): """Store TEXT_RECORD in place of the existing record with the same id. Do not do anything with the old record.""" assert self.text_records.has_key(text_record.id) self.text_records[text_record.id] = text_record def discard(self, *ids): """The text records with IDS are no longer needed; discard them. This involves calling their free() methods and also removing them from SELF. If SELF.deferred_deletes is not None, then the ids to be deleted are added to the list instead of deleted immediately. This mechanism is to prevent a stack overflow from the avalanche of deletes that can result from deleting a long chain of revisions.""" if self.deferred_deletes is None: # This is an outer-level delete. self.deferred_deletes = list(ids) while self.deferred_deletes: id = self.deferred_deletes.pop() text_record = self[id] if text_record.refcount != 0: raise InternalError( 'TextRecordDatabase.discard(%s) called with refcount = %d' % (text_record, text_record.refcount,) ) # This call might cause other text_record ids to be added to # self.deferred_deletes: text_record.free(self) del self[id] self.deferred_deletes = None else: self.deferred_deletes.extend(ids) def itervalues(self): return self.text_records.itervalues() def recompute_refcounts(self, cvs_file_items): """Recompute the refcounts of the contained TextRecords. 
Use CVS_FILE_ITEMS to determine which records will be needed by cvs2svn.""" # First clear all of the refcounts: for text_record in self.itervalues(): text_record.refcount = 0 # Now increment the reference count of records that are needed as # the source of another record's deltas: for text_record in self.itervalues(): text_record.increment_dependency_refcounts(self.text_records) # Now increment the reference count of records that will be needed # by cvs2svn: for lod_items in cvs_file_items.iter_lods(): for cvs_rev in lod_items.cvs_revisions: if isinstance(cvs_rev, CVSRevisionModification): self[cvs_rev.id].refcount += 1 def free_unused(self): """Free any TextRecords whose reference counts are zero.""" # The deletion of some of these text records might cause others to # be unused, in which case they will be deleted automatically. # But since the initially-unused records are not referred to by # any others, we don't have to be afraid that they will be deleted # before we get to them. But it *is* crucial that we create the # whole unused list before starting the loop. unused = [ text_record.id for text_record in self.itervalues() if text_record.refcount == 0 ] self.discard(*unused) def log_leftovers(self): """If any TextRecords still exist, log them.""" if self.text_records: logger.warn( "%s: internal problem: leftover revisions in the checkout cache:" % warning_prefix) for text_record in self.itervalues(): logger.warn(' %s' % (text_record,)) def __repr__(self): """Debugging output of the current contents of the TextRecordDatabase.""" retval = ['TextRecordDatabase:'] for text_record in self.itervalues(): retval.append(' %s' % (text_record,)) return '\n'.join(retval) class _Sink(Sink): def __init__(self, revision_collector, cvs_file_items): self.revision_collector = revision_collector self.cvs_file_items = cvs_file_items # A map {rev : base_rev} indicating that the text for rev is # stored in CVS as a delta relative to base_rev. self.base_revisions = {} # The revision that is stored with its fulltext in CVS (usually # the oldest revision on trunk): self.head_revision = None # The first logical revision on trunk (usually '1.1'): self.revision_1_1 = None # Keep track of the revisions whose revision info has been seen so # far (to avoid repeated revision info blocks): self.revisions_seen = set() def set_head_revision(self, revision): self.head_revision = revision def define_revision( self, revision, timestamp, author, state, branches, next ): if next: self.base_revisions[next] = revision else: if is_trunk_revision(revision): self.revision_1_1 = revision for branch in branches: self.base_revisions[branch] = revision def set_revision_info(self, revision, log, text): if revision in self.revisions_seen: # One common form of CVS repository corruption is that the # Deltatext block for revision 1.1 appears twice. CollectData # has already warned about this problem; here we can just ignore # it. return else: self.revisions_seen.add(revision) cvs_rev_id = self.cvs_file_items.original_ids[revision] if is_trunk_revision(revision): # On trunk, revisions are encountered in reverse order (1. # ... 1.1) and deltas are inverted. The first text that we see # is the fulltext for the HEAD revision. After that, the text # corresponding to revision 1.N is the delta (1. -> # 1.)). We have to invert the deltas here so that we can # read the revisions out in dependency order; that is, for # revision 1.1 we want the fulltext, and for revision 1. we # want the delta (1. -> 1.). 
This means that we can't # compute the delta for a revision until we see its logical # parent. When we finally see revision 1.1 (which is recognized # because it doesn't have a parent), we can record the diff (1.1 # -> 1.2) for revision 1.2, and also the fulltext for 1.1. if revision == self.head_revision: # This is HEAD, as fulltext. Initialize the RCSStream so # that we can compute deltas backwards in time. self._rcs_stream = RCSStream(text) self._rcs_stream_revision = revision else: # Any other trunk revision is a backward delta. Apply the # delta to the RCSStream to mutate it to the contents of this # revision, and also to get the reverse delta, which we store # as the forward delta of our child revision. try: text = self._rcs_stream.invert_diff(text) except MalformedDeltaException, e: logger.error( 'Malformed RCS delta in %s, revision %s: %s' % (self.cvs_file_items.cvs_file.rcs_path, revision, e) ) raise RuntimeError() text_record = DeltaTextRecord( self.cvs_file_items.original_ids[self._rcs_stream_revision], cvs_rev_id ) self.revision_collector._writeout(text_record, text) self._rcs_stream_revision = revision if revision == self.revision_1_1: # This is revision 1.1. Write its fulltext: text_record = FullTextRecord(cvs_rev_id) self.revision_collector._writeout( text_record, self._rcs_stream.get_text() ) # There will be no more trunk revisions delivered, so free the # RCSStream. del self._rcs_stream del self._rcs_stream_revision else: # On branches, revisions are encountered in logical order # (<branch>.1 ... <branch>.<N>) and the text corresponding to # revision <branch>.<N> is the forward delta (<branch>.<N-1> -> # <branch>.<N>). That's what we need, so just store it. # FIXME: It would be nice to avoid writing out branch deltas # when --trunk-only. (They will be deleted when finish_file() # is called, but if the delta db is in an IndexedDatabase the # deletions won't actually recover any disk space.) text_record = DeltaTextRecord( cvs_rev_id, self.cvs_file_items.original_ids[self.base_revisions[revision]] ) self.revision_collector._writeout(text_record, text) return None class InternalRevisionCollector(RevisionCollector): """The RevisionCollector used by InternalRevisionReader.""" def __init__(self, compress): RevisionCollector.__init__(self) self._compress = compress def register_artifacts(self, which_pass): artifact_manager.register_temp_file( config.RCS_DELTAS_INDEX_TABLE, which_pass ) artifact_manager.register_temp_file(config.RCS_DELTAS_STORE, which_pass) artifact_manager.register_temp_file( config.RCS_TREES_INDEX_TABLE, which_pass ) artifact_manager.register_temp_file(config.RCS_TREES_STORE, which_pass) def start(self): serializer = MarshalSerializer() if self._compress: serializer = CompressingSerializer(serializer) self._delta_db = IndexedDatabase( artifact_manager.get_temp_file(config.RCS_DELTAS_STORE), artifact_manager.get_temp_file(config.RCS_DELTAS_INDEX_TABLE), DB_OPEN_NEW, serializer, ) primer = (FullTextRecord, DeltaTextRecord) self._rcs_trees = IndexedDatabase( artifact_manager.get_temp_file(config.RCS_TREES_STORE), artifact_manager.get_temp_file(config.RCS_TREES_INDEX_TABLE), DB_OPEN_NEW, PrimedPickleSerializer(primer), ) def _writeout(self, text_record, text): self.text_record_db.add(text_record) self._delta_db[text_record.id] = text def process_file(self, cvs_file_items): """Read revision information for the file described by CVS_FILE_ITEMS.
Compute the text record refcounts, discard any records that are unneeded, and store the text records for the file to the _rcs_trees database.""" # A map from cvs_rev_id to TextRecord instance: self.text_record_db = TextRecordDatabase(self._delta_db, NullDatabase()) parse( open(cvs_file_items.cvs_file.rcs_path, 'rb'), _Sink(self, cvs_file_items), ) self.text_record_db.recompute_refcounts(cvs_file_items) self.text_record_db.free_unused() self._rcs_trees[cvs_file_items.cvs_file.id] = self.text_record_db del self.text_record_db def finish(self): self._delta_db.close() self._rcs_trees.close() class InternalRevisionReader(RevisionReader): """A RevisionReader that reads the contents from its own delta store.""" def __init__(self, compress): # Only import Database if an InternalRevisionReader is really # instantiated, because the import fails if a decent dbm is not # installed. from cvs2svn_lib.database import Database self._Database = Database self._compress = compress def register_artifacts(self, which_pass): artifact_manager.register_temp_file(config.CVS_CHECKOUT_DB, which_pass) artifact_manager.register_temp_file_needed( config.RCS_DELTAS_STORE, which_pass ) artifact_manager.register_temp_file_needed( config.RCS_DELTAS_INDEX_TABLE, which_pass ) artifact_manager.register_temp_file_needed( config.RCS_TREES_STORE, which_pass ) artifact_manager.register_temp_file_needed( config.RCS_TREES_INDEX_TABLE, which_pass ) def start(self): self._delta_db = IndexedDatabase( artifact_manager.get_temp_file(config.RCS_DELTAS_STORE), artifact_manager.get_temp_file(config.RCS_DELTAS_INDEX_TABLE), DB_OPEN_READ, ) self._delta_db.__delitem__ = lambda id: None self._tree_db = IndexedDatabase( artifact_manager.get_temp_file(config.RCS_TREES_STORE), artifact_manager.get_temp_file(config.RCS_TREES_INDEX_TABLE), DB_OPEN_READ, ) serializer = MarshalSerializer() if self._compress: serializer = CompressingSerializer(serializer) self._co_db = self._Database( artifact_manager.get_temp_file(config.CVS_CHECKOUT_DB), DB_OPEN_NEW, serializer, ) # The set of CVSFile instances whose TextRecords have already been # read: self._loaded_files = set() # A map { CVSFILE : _FileTree } for files that currently have live # revisions: self._text_record_db = TextRecordDatabase(self._delta_db, self._co_db) def _get_text_record(self, cvs_rev): """Return the TextRecord instance for CVS_REV. If the TextRecords for CVS_REV.cvs_file haven't been loaded yet, do so now.""" if cvs_rev.cvs_file not in self._loaded_files: for text_record in self._tree_db[cvs_rev.cvs_file.id].itervalues(): self._text_record_db.add(text_record) self._loaded_files.add(cvs_rev.cvs_file) return self._text_record_db[cvs_rev.id] def get_content(self, cvs_rev): """Check out the text for revision CVS_REV from the repository. Return the text. If CVS_REV has a property _keyword_handling, use it to determine how to handle RCS keywords in the output: 'collapsed' -- collapse keywords 'expanded' -- expand keywords 'untouched' -- output keywords in the form they are found in the RCS file Note that $Log$ never actually generates a log (which makes test 'requires_cvs()' fail). Revisions may be requested in any order, but if they are not requested in dependency order the checkout database will become very large. Revisions may be skipped.
Each revision may be requested only once.""" try: text = self._get_text_record(cvs_rev).checkout(self._text_record_db) except MalformedDeltaException, (msg): raise FatalError( 'Malformed RCS delta in %s, revision %s: %s' % (cvs_rev.cvs_file.rcs_path, cvs_rev.rev, msg) ) keyword_handling = cvs_rev.get_property('_keyword_handling') if keyword_handling == 'untouched': # Leave keywords in the form that they were checked in. pass elif keyword_handling == 'collapsed': text = collapse_keywords(text) elif keyword_handling == 'expanded': text = expand_keywords(text, cvs_rev) else: raise FatalError( 'Undefined _keyword_handling property (%r) for %s' % (keyword_handling, cvs_rev,) ) if Ctx().decode_apple_single: # Insert a filter to decode any files that are in AppleSingle # format: text = get_maybe_apple_single(text) eol_fix = cvs_rev.get_property('_eol_fix') if eol_fix: text = canonicalize_eol(text, eol_fix) return text def finish(self): self._text_record_db.log_leftovers() del self._text_record_db self._delta_db.close() self._tree_db.close() self._co_db.close() cvs2svn-2.4.0/cvs2svn_lib/indexed_database.py0000664000076500007650000001300111710517256022253 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains database facilities used by cvs2svn.""" import cPickle from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.common import DB_OPEN_WRITE from cvs2svn_lib.common import DB_OPEN_NEW from cvs2svn_lib.record_table import FileOffsetPacker from cvs2svn_lib.record_table import RecordTable class IndexedDatabase: """A file of objects that are written sequentially and read randomly. The objects are indexed by small non-negative integers, and a RecordTable is used to store the index -> fileoffset map. fileoffset=0 is used to represent an empty record. (An offset of 0 cannot occur for a legitimate record because the serializer is written there.) The main file consists of a sequence of pickles (or other serialized data format). The zeroth record is a pickled Serializer. Subsequent ones are objects serialized using the serializer. The offset of each object in the file is stored to an index table so that the data can later be retrieved randomly. Objects are always stored to the end of the file. If an object is deleted or overwritten, the fact is recorded in the index_table but the space in the pickle file is not garbage collected. This has the advantage that one can create a modified version of a database that shares the main data file with an old version by copying the index file. But it has the disadvantage that space is wasted whenever objects are written multiple times.""" def __init__(self, filename, index_filename, mode, serializer=None): """Initialize an IndexedDatabase, writing the serializer if necessary. 
SERIALIZER is only used if MODE is DB_OPEN_NEW; otherwise the serializer is read from the file.""" self.filename = filename self.index_filename = index_filename self.mode = mode if self.mode == DB_OPEN_NEW: self.f = open(self.filename, 'wb+') elif self.mode == DB_OPEN_WRITE: self.f = open(self.filename, 'rb+') elif self.mode == DB_OPEN_READ: self.f = open(self.filename, 'rb') else: raise RuntimeError('Invalid mode %r' % self.mode) self.index_table = RecordTable( self.index_filename, self.mode, FileOffsetPacker() ) if self.mode == DB_OPEN_NEW: assert serializer is not None self.serializer = serializer cPickle.dump(self.serializer, self.f, -1) else: # Read the memo from the first pickle: self.serializer = cPickle.load(self.f) # Seek to the end of the file, and record that position: self.f.seek(0, 2) self.fp = self.f.tell() self.eofp = self.fp def __setitem__(self, index, item): """Write ITEM into the database indexed by INDEX.""" # Make sure we're at the end of the file: if self.fp != self.eofp: self.f.seek(self.eofp) self.index_table[index] = self.eofp s = self.serializer.dumps(item) self.f.write(s) self.eofp += len(s) self.fp = self.eofp def _fetch(self, offset): if self.fp != offset: self.f.seek(offset) # There is no easy way to tell how much data will be read, so just # indicate that we don't know the current file pointer: self.fp = None return self.serializer.loadf(self.f) def iterkeys(self): return self.index_table.iterkeys() def itervalues(self): for offset in self.index_table.itervalues(): yield self._fetch(offset) def __getitem__(self, index): offset = self.index_table[index] return self._fetch(offset) def get(self, item, default=None): try: return self[item] except KeyError: return default def get_many(self, indexes, default=None): """Yield (index,item) tuples for INDEXES, in arbitrary order. Yield (index,default) for indexes with no defined values.""" offsets = [] for (index, offset) in self.index_table.get_many(indexes): if offset is None: yield (index, default) else: offsets.append((offset, index)) # Sort the offsets to reduce disk seeking: offsets.sort() for (offset,index) in offsets: yield (index, self._fetch(offset)) def __delitem__(self, index): # We don't actually free the data in self.f. del self.index_table[index] def close(self): self.index_table.close() self.index_table = None self.f.close() self.f = None def __str__(self): return 'IndexedDatabase(%r)' % (self.filename,) class IndexedStore(IndexedDatabase): """A file of items that is written sequentially and read randomly. This is just like IndexedDatabase, except that it has an additional add() method which assumes that the object to be written to the database has an 'id' member, which is used as its database index. See IndexedDatabase for more information.""" def add(self, item): """Write ITEM into the database indexed by ITEM.id.""" self[item.id] = item cvs2svn-2.4.0/cvs2svn_lib/svn_repository_delegate.py0000664000076500007650000000667011500107341023747 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. 
# # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains the SVNRepositoryDelegate class.""" class SVNRepositoryDelegate: """Abstract superclass for any delegate to SVNOutputOption. Subclasses must implement all of the methods below. For each method, a subclass implements, in its own way, the Subversion operation implied by the method's name. For example, for the add_path method, the DumpstreamDelegate writes out a 'Node-add:' command to a Subversion dumpfile.""" def start_commit(self, revnum, revprops): """An SVN commit is starting. Perform any actions needed to start an SVN commit with revision number REVNUM and revision properties REVPROPS.""" raise NotImplementedError() def end_commit(self): """An SVN commit is ending.""" raise NotImplementedError() def initialize_project(self, project): """Initialize PROJECT. For Subversion, this means to create the trunk, branches, and tags directories for PROJECT.""" raise NotImplementedError() def initialize_lod(self, lod): """Initialize LOD with no contents. LOD is an instance of LineOfDevelopment. It is also possible for an LOD to be created by copying from another LOD; such events are indicated via the copy_lod() callback.""" raise NotImplementedError() def mkdir(self, lod, cvs_directory): """Create CVS_DIRECTORY within LOD. LOD is a LineOfDevelopment; CVS_DIRECTORY is a CVSDirectory.""" raise NotImplementedError() def add_path(self, cvs_rev): """Add the path corresponding to CVS_REV to the repository. CVS_REV is a CVSRevisionAdd.""" raise NotImplementedError() def change_path(self, cvs_rev): """Change the path corresponding to CVS_REV in the repository. CVS_REV is a CVSRevisionChange.""" raise NotImplementedError() def delete_lod(self, lod): """Delete LOD from the repository. LOD is a LineOfDevelopment instance.""" raise NotImplementedError() def delete_path(self, lod, cvs_path): """Delete CVS_PATH from LOD. LOD is a LineOfDevelopment; CVS_PATH is a CVSPath.""" raise NotImplementedError() def copy_lod(self, src_lod, dest_lod, src_revnum): """Copy SRC_LOD in SRC_REVNUM to DEST_LOD. SRC_LOD and DEST_LOD are both LODs, and SRC_REVNUM is a subversion revision number (int).""" raise NotImplementedError() def copy_path(self, cvs_path, src_lod, dest_lod, src_revnum): """Copy CVS_PATH in SRC_LOD@SRC_REVNUM to DEST_LOD. CVS_PATH is a CVSPath, SRC_LOD and DEST_LOD are LODs, and SRC_REVNUM is a subversion revision number (int).""" raise NotImplementedError() def finish(self): """All SVN revisions have been committed. Perform any necessary cleanup.""" raise NotImplementedError() cvs2svn-2.4.0/cvs2svn_lib/rcsparser.py0000664000076500007650000000402211710517256021016 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2011 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. 
For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """The interface between cvs2svn and cvs2svn_rcsparse.""" # These identifiers are imported to be exported: from cvs2svn_rcsparse.common import Sink from cvs2svn_rcsparse.common import RCSParseError selected_parser = None def select_texttools_parser(): """Configure this module to use the texttools parser. The texttools parser is faster but depends on mx.TextTools, which is not part of the Python standard library. If it is not installed, this function will raise an ImportError.""" global selected_parser import cvs2svn_rcsparse.texttools selected_parser = cvs2svn_rcsparse.texttools.Parser def select_python_parser(): """Configure this module to use the Python parser. The Python parser is slower but works everywhere.""" global selected_parser import cvs2svn_rcsparse.default selected_parser = cvs2svn_rcsparse.default.Parser def select_parser(): """Configure this module to use the best parser available.""" try: select_texttools_parser() except ImportError: select_python_parser() def parse(file, sink): """Parse an RCS file. The arguments are the same as those of cvs2svn_rcsparse.common._Parser.parse() (see that method's docstring for more details). """ if selected_parser is None: select_parser() return selected_parser().parse(file, sink) cvs2svn-2.4.0/cvs2svn_lib/symbol_strategy.py0000664000076500007650000005307011434364604022250 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """SymbolStrategy classes determine how to convert symbols.""" import re from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import path_join from cvs2svn_lib.common import normalize_svn_path from cvs2svn_lib.log import logger from cvs2svn_lib.symbol import Trunk from cvs2svn_lib.symbol import TypedSymbol from cvs2svn_lib.symbol import Branch from cvs2svn_lib.symbol import Tag from cvs2svn_lib.symbol import ExcludedSymbol from cvs2svn_lib.symbol_statistics import SymbolPlanError class StrategyRule: """A single rule that might determine how to convert a symbol.""" def start(self, symbol_statistics): """This method is called once before get_symbol() is ever called. The StrategyRule can override this method to do whatever it wants to prepare itself for work. SYMBOL_STATISTICS is an instance of SymbolStatistics containing the statistics for all symbols in all projects.""" pass def get_symbol(self, symbol, stats): """Return an object describing what to do with the symbol in STATS. SYMBOL holds a Trunk or Symbol object as it has been determined so far. Hopefully one of these method calls will turn any naked Symbol instances into TypedSymbols. 
If this rule applies to the SYMBOL (whose statistics are collected in STATS), then return a new or modified AbstractSymbol object. If this rule doesn't apply, return SYMBOL unchanged.""" raise NotImplementedError() def finish(self): """This method is called once after get_symbol() is done being called. The StrategyRule can override this method do whatever it wants to release resources, etc.""" pass class _RegexpStrategyRule(StrategyRule): """A Strategy rule that bases its decisions on regexp matches. If self.regexp matches a symbol name, return self.action(symbol); otherwise, return the symbol unchanged.""" def __init__(self, pattern, action): """Initialize a _RegexpStrategyRule. PATTERN is a string that will be treated as a regexp pattern. PATTERN must match a full symbol name for the rule to apply (i.e., it is anchored at the beginning and end of the symbol name). ACTION is the class representing how the symbol should be converted. It should be one of the classes Branch, Tag, or ExcludedSymbol. If PATTERN matches a symbol name, then get_symbol() returns ACTION(name, id); otherwise it returns SYMBOL unchanged.""" try: self.regexp = re.compile('^' + pattern + '$') except re.error: raise FatalError("%r is not a valid regexp." % (pattern,)) self.action = action def log(self, symbol): raise NotImplementedError() def get_symbol(self, symbol, stats): if isinstance(symbol, (Trunk, TypedSymbol)): return symbol elif self.regexp.match(symbol.name): self.log(symbol) return self.action(symbol) else: return symbol class ForceBranchRegexpStrategyRule(_RegexpStrategyRule): """Force symbols matching pattern to be branches.""" def __init__(self, pattern): _RegexpStrategyRule.__init__(self, pattern, Branch) def log(self, symbol): logger.verbose( 'Converting symbol %s as a branch because it matches regexp "%s".' % (symbol, self.regexp.pattern,) ) class ForceTagRegexpStrategyRule(_RegexpStrategyRule): """Force symbols matching pattern to be tags.""" def __init__(self, pattern): _RegexpStrategyRule.__init__(self, pattern, Tag) def log(self, symbol): logger.verbose( 'Converting symbol %s as a tag because it matches regexp "%s".' % (symbol, self.regexp.pattern,) ) class ExcludeRegexpStrategyRule(_RegexpStrategyRule): """Exclude symbols matching pattern.""" def __init__(self, pattern): _RegexpStrategyRule.__init__(self, pattern, ExcludedSymbol) def log(self, symbol): logger.verbose( 'Excluding symbol %s because it matches regexp "%s".' % (symbol, self.regexp.pattern,) ) class ExcludeTrivialImportBranchRule(StrategyRule): """If a symbol is a trivial import branch, exclude it. A trivial import branch is defined to be a branch that only had a single import on it (no other kinds of commits) in every file in which it appeared. In most cases these branches are worthless.""" def get_symbol(self, symbol, stats): if isinstance(symbol, (Trunk, TypedSymbol)): return symbol if stats.tag_create_count == 0 \ and stats.branch_create_count == stats.trivial_import_count: logger.verbose( 'Excluding branch %s because it is a trivial import branch.' % (symbol,) ) return ExcludedSymbol(symbol) else: return symbol class ExcludeVendorBranchRule(StrategyRule): """If a symbol is a pure vendor branch, exclude it. 
A pure vendor branch is defined to be a branch that only had imports on it (no other kinds of commits) in every file in which it appeared.""" def get_symbol(self, symbol, stats): if isinstance(symbol, (Trunk, TypedSymbol)): return symbol if stats.tag_create_count == 0 \ and stats.branch_create_count == stats.pure_ntdb_count: logger.verbose( 'Excluding branch %s because it is a pure vendor branch.' % (symbol,) ) return ExcludedSymbol(symbol) else: return symbol class UnambiguousUsageRule(StrategyRule): """If a symbol is used unambiguously as a tag/branch, convert it as such.""" def get_symbol(self, symbol, stats): if isinstance(symbol, (Trunk, TypedSymbol)): return symbol is_tag = stats.tag_create_count > 0 is_branch = stats.branch_create_count > 0 or stats.branch_commit_count > 0 if is_tag and is_branch: # Can't decide return symbol elif is_branch: logger.verbose( 'Converting symbol %s as a branch because it is always used ' 'as a branch.' % (symbol,) ) return Branch(symbol) elif is_tag: logger.verbose( 'Converting symbol %s as a tag because it is always used ' 'as a tag.' % (symbol,) ) return Tag(symbol) else: # The symbol didn't appear at all: return symbol class BranchIfCommitsRule(StrategyRule): """If there was ever a commit on the symbol, convert it as a branch.""" def get_symbol(self, symbol, stats): if isinstance(symbol, (Trunk, TypedSymbol)): return symbol elif stats.branch_commit_count > 0: logger.verbose( 'Converting symbol %s as a branch because there are commits on it.' % (symbol,) ) return Branch(symbol) else: return symbol class HeuristicStrategyRule(StrategyRule): """Convert symbol based on how often it was used as a branch/tag. Whichever happened more often determines how the symbol is converted.""" def get_symbol(self, symbol, stats): if isinstance(symbol, (Trunk, TypedSymbol)): return symbol elif stats.tag_create_count >= stats.branch_create_count: logger.verbose( 'Converting symbol %s as a tag because it is more often used ' 'as a tag.' % (symbol,) ) return Tag(symbol) else: logger.verbose( 'Converting symbol %s as a branch because it is more often used ' 'as a branch.' % (symbol,) ) return Branch(symbol) class _CatchAllRule(StrategyRule): """Base class for catch-all rules. Usually this rule will appear after a list of more careful rules (including a general rule like UnambiguousUsageRule) and will therefore only apply to the symbols not handled earlier.""" def __init__(self, action): self._action = action def log(self, symbol): raise NotImplementedError() def get_symbol(self, symbol, stats): if isinstance(symbol, (Trunk, TypedSymbol)): return symbol else: self.log(symbol) return self._action(symbol) class AllBranchRule(_CatchAllRule): """Convert all symbols as branches. Usually this rule will appear after a list of more careful rules (including a general rule like UnambiguousUsageRule) and will therefore only apply to the symbols not handled earlier.""" def __init__(self): _CatchAllRule.__init__(self, Branch) def log(self, symbol): logger.verbose( 'Converting symbol %s as a branch because no other rules applied.' % (symbol,) ) class AllTagRule(_CatchAllRule): """Convert all symbols as tags. We don't worry about conflicts here; they will be caught later by SymbolStatistics.check_consistency(). 
Usually this rule will appear after a list of more careful rules (including a general rule like UnambiguousUsageRule) and will therefore only apply to the symbols not handled earlier.""" def __init__(self): _CatchAllRule.__init__(self, Tag) def log(self, symbol): logger.verbose( 'Converting symbol %s as a tag because no other rules applied.' % (symbol,) ) class AllExcludedRule(_CatchAllRule): """Exclude all symbols. Usually this rule will appear after a list of more careful rules (including a SymbolHintsFileRule or several ManualSymbolRules) and will therefore only apply to the symbols not handled earlier.""" def __init__(self): _CatchAllRule.__init__(self, ExcludedSymbol) def log(self, symbol): logger.verbose( 'Excluding symbol %s by catch-all rule.' % (symbol,) ) class TrunkPathRule(StrategyRule): """Set the base path for Trunk.""" def __init__(self, trunk_path): self.trunk_path = trunk_path def get_symbol(self, symbol, stats): if isinstance(symbol, Trunk) and symbol.base_path is None: symbol.base_path = self.trunk_path return symbol class SymbolPathRule(StrategyRule): """Set the base paths for symbol LODs.""" def __init__(self, symbol_type, base_path): self.symbol_type = symbol_type self.base_path = base_path def get_symbol(self, symbol, stats): if isinstance(symbol, self.symbol_type) and symbol.base_path is None: symbol.base_path = path_join(self.base_path, symbol.name) return symbol class BranchesPathRule(SymbolPathRule): """Set the base paths for Branch LODs.""" def __init__(self, branch_path): SymbolPathRule.__init__(self, Branch, branch_path) class TagsPathRule(SymbolPathRule): """Set the base paths for Tag LODs.""" def __init__(self, tag_path): SymbolPathRule.__init__(self, Tag, tag_path) class HeuristicPreferredParentRule(StrategyRule): """Use a heuristic rule to pick preferred parents. Pick the parent that should be preferred for any TypedSymbols. As parent, use the symbol that appeared most often as a possible parent of the symbol in question. If multiple symbols are tied, choose the one that comes first according to the Symbol class's natural sort order.""" def _get_preferred_parent(self, stats): """Return the LODs that are most often possible parents in STATS. Return the set of LinesOfDevelopment that appeared most often as possible parents. The return value might contain multiple symbols if multiple LinesOfDevelopment appeared the same number of times.""" best_count = -1 best_symbol = None for (symbol, count) in stats.possible_parents.items(): if count > best_count or (count == best_count and symbol < best_symbol): best_count = count best_symbol = symbol if best_symbol is None: return None else: return best_symbol def get_symbol(self, symbol, stats): if isinstance(symbol, TypedSymbol) and symbol.preferred_parent_id is None: preferred_parent = self._get_preferred_parent(stats) if preferred_parent is None: logger.verbose('%s has no preferred parent' % (symbol,)) else: symbol.preferred_parent_id = preferred_parent.id logger.verbose( 'The preferred parent of %s is %s' % (symbol, preferred_parent,) ) return symbol class ManualTrunkRule(StrategyRule): """Change the SVN path of Trunk LODs. Members: project_id -- (int or None) The id of the project whose trunk should be affected by this rule. If project_id is None, then the rule is not project-specific. svn_path -- (str) The SVN path that should be used as the base directory for this trunk. This member must not be None, though it may be the empty string for a single-project, trunk-only conversion. 
""" def __init__(self, project_id, svn_path): self.project_id = project_id self.svn_path = normalize_svn_path(svn_path, allow_empty=True) def get_symbol(self, symbol, stats): if (self.project_id is not None and self.project_id != stats.lod.project.id): return symbol if isinstance(symbol, Trunk): symbol.base_path = self.svn_path return symbol def convert_as_branch(symbol): logger.verbose( 'Converting symbol %s as a branch because of manual setting.' % (symbol,) ) return Branch(symbol) def convert_as_tag(symbol): logger.verbose( 'Converting symbol %s as a tag because of manual setting.' % (symbol,) ) return Tag(symbol) def exclude(symbol): logger.verbose( 'Excluding symbol %s because of manual setting.' % (symbol,) ) return ExcludedSymbol(symbol) class ManualSymbolRule(StrategyRule): """Change how particular symbols are converted. Members: project_id -- (int or None) The id of the project whose trunk should be affected by this rule. If project_id is None, then the rule is not project-specific. symbol_name -- (str) The name of the symbol that should be affected by this rule. conversion -- (callable or None) A callable that converts the symbol to its preferred output type. This should normally be one of (convert_as_branch, convert_as_tag, exclude). If this member is None, then this rule does not affect the symbol's output type. svn_path -- (str) The SVN path that should be used as the base directory for this trunk. This member must not be None, though it may be the empty string for a single-project, trunk-only conversion. parent_lod_name -- (str or None) The name of the line of development that should be preferred as the parent of this symbol. (The preferred parent is the line of development from which the symbol should sprout.) If this member is set to the string '.trunk.', then the symbol will be set to sprout directly from trunk. If this member is set to None, then this rule won't affect the symbol's parent. """ def __init__( self, project_id, symbol_name, conversion, svn_path, parent_lod_name ): self.project_id = project_id self.symbol_name = symbol_name self.conversion = conversion if svn_path is None: self.svn_path = None else: self.svn_path = normalize_svn_path(svn_path, allow_empty=True) self.parent_lod_name = parent_lod_name def _get_parent_by_id(self, parent_lod_name, stats): """Return the LOD object for the parent with name PARENT_LOD_NAME. STATS is the _Stats object describing a symbol whose parent needs to be determined from its name. If none of its possible parents has name PARENT_LOD_NAME, raise a SymbolPlanError.""" for pp in stats.possible_parents.keys(): if isinstance(pp, Trunk): pass elif pp.name == parent_lod_name: return pp else: parent_counts = stats.possible_parents.items() parent_counts.sort(lambda a,b: - cmp(a[1], b[1])) lines = [ '%s is not a valid parent for %s;' % (parent_lod_name, stats.lod,), ' possible parents (with counts):' ] for (symbol, count) in parent_counts: if isinstance(symbol, Trunk): lines.append(' .trunk. 
: %d' % count) else: lines.append(' %s : %d' % (symbol.name, count)) raise SymbolPlanError('\n'.join(lines)) def get_symbol(self, symbol, stats): if (self.project_id is not None and self.project_id != stats.lod.project.id): return symbol elif isinstance(symbol, Trunk): return symbol elif self.symbol_name == stats.lod.name: if self.conversion is not None: symbol = self.conversion(symbol) if self.parent_lod_name is None: pass elif self.parent_lod_name == '.trunk.': symbol.preferred_parent_id = stats.lod.project.trunk_id else: symbol.preferred_parent_id = self._get_parent_by_id( self.parent_lod_name, stats ).id if self.svn_path is not None: symbol.base_path = self.svn_path return symbol class SymbolHintsFileRule(StrategyRule): """Use manual symbol configurations read from a file. The input file is line-oriented with the following format: <project-id> <symbol-name> <conversion> [<svn-path> [<parent-lod-name>]] Where the fields are separated by whitespace and project-id -- the numerical id of the Project to which the symbol belongs (numbered starting with 0). This field can be '.' if the rule is not project-specific. symbol-name -- the name of the symbol being specified, or '.trunk.' if the rule should apply to trunk. conversion -- how the symbol should be treated in the conversion. This is one of the following values: 'branch', 'tag', or 'exclude'. This field can be '.' if the rule shouldn't affect how the symbol is treated in the conversion. svn-path -- the SVN path that should serve as the root path of this LOD. The path should be expressed as a path relative to the SVN root directory, with or without a leading '/'. This field can be omitted or '.' if the rule shouldn't affect the LOD's SVN path. parent-lod-name -- the name of the LOD that should serve as this symbol's parent. This field can be omitted or '.' if the rule shouldn't affect the symbol's parent, or it can be '.trunk.' to indicate that the symbol should sprout from the project's trunk.""" comment_re = re.compile(r'^(\#|$)') conversion_map = { 'branch' : convert_as_branch, 'tag' : convert_as_tag, 'exclude' : exclude, '.' : None, } def __init__(self, filename): self.filename = filename def start(self, symbol_statistics): self._rules = [] f = open(self.filename, 'r') for l in f: l = l.rstrip() s = l.lstrip() if self.comment_re.match(s): continue fields = s.split() if len(fields) < 3: raise FatalError( 'The following line in "%s" cannot be parsed:\n "%s"' % (self.filename, l,) ) project_id = fields.pop(0) symbol_name = fields.pop(0) conversion = fields.pop(0) if fields: svn_path = fields.pop(0) if svn_path == '.': svn_path = None elif svn_path[0] == '/': svn_path = svn_path[1:] else: svn_path = None if fields: parent_lod_name = fields.pop(0) else: parent_lod_name = '.'
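# For illustration only (the symbol and path names below are made up, not
# taken from any real conversion): a fully-populated hints line such as
#
#   0 FOO_1_0 branch branches/foo-1.0 .trunk.
#
# would at this point have been split into project_id='0',
# symbol_name='FOO_1_0', conversion='branch', svn_path='branches/foo-1.0'
# and parent_lod_name='.trunk.'; the remaining code below converts and
# validates these raw string fields.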
if fields: raise FatalError( 'The following line in "%s" cannot be parsed:\n "%s"' % (self.filename, l,) ) if project_id == '.': project_id = None else: try: project_id = int(project_id) except ValueError: raise FatalError( 'Illegal project_id in the following line:\n "%s"' % (l,) ) if symbol_name == '.trunk.': if conversion not in ['.', 'trunk']: raise FatalError('Trunk cannot be converted as a different type') if parent_lod_name != '.': raise FatalError('Trunk\'s parent cannot be set') if svn_path is None: # This rule doesn't do anything: pass else: self._rules.append(ManualTrunkRule(project_id, svn_path)) else: try: conversion = self.conversion_map[conversion] except KeyError: raise FatalError( 'Illegal conversion in the following line:\n "%s"' % (l,) ) if parent_lod_name == '.': parent_lod_name = None if conversion is None \ and svn_path is None \ and parent_lod_name is None: # There is nothing to be done: pass else: self._rules.append( ManualSymbolRule( project_id, symbol_name, conversion, svn_path, parent_lod_name ) ) for rule in self._rules: rule.start(symbol_statistics) def get_symbol(self, symbol, stats): for rule in self._rules: symbol = rule.get_symbol(symbol, stats) return symbol def finish(self): for rule in self._rules: rule.finish() del self._rules cvs2svn-2.4.0/cvs2svn_lib/sort.py0000664000076500007650000001755711434364604020022 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Functions to sort large files. The functions in this module were originally downloaded from the following URL: http://code.activestate.com/recipes/466302/ It was apparently submitted by Nicolas Lehuen on Tue, 17 Jan 2006. According to the terms of service of that website, the code is usable under the MIT license. """ import os import shutil import heapq import itertools import tempfile # The buffer size to use for open files: BUFSIZE = 64 * 1024 def get_default_max_merge(): """Return the default maximum number of files to merge at once.""" # The maximum number of files to merge at once. This number cannot # be unlimited because there are operating system restrictions on # the number of files that a process can have open at once. So... try: # If this constant is available via sysconf, we use half the # number available to the process as a whole. _SC_OPEN_MAX = os.sysconf('SC_OPEN_MAX') if _SC_OPEN_MAX == -1: # This also indicates an error: raise ValueError() return min(_SC_OPEN_MAX // 2, 100) except: # Otherwise, simply limit the number to this constant, which will # hopefully be OK on all operating systems: return 50 DEFAULT_MAX_MERGE = get_default_max_merge() def merge(iterables, key=None): """Merge (in the sense of mergesort) ITERABLES. Generate the output in order. 
If KEY is specified, it should be a function that returns the sort key.""" if key is None: key = lambda x : x values = [] for index, iterable in enumerate(iterables): try: iterator = iter(iterable) value = iterator.next() except StopIteration: pass else: values.append((key(value), index, value, iterator)) heapq.heapify(values) while values: k, index, value, iterator = heapq.heappop(values) yield value try: value = iterator.next() except StopIteration: pass else: heapq.heappush(values, (key(value), index, value, iterator)) def merge_files_onepass(input_filenames, output_filename, key=None): """Merge a number of input files into one output file. This is a merge in the sense of mergesort; namely, it is assumed that the input files are each sorted, and (under that assumption) the output file will also be sorted.""" input_filenames = list(input_filenames) if len(input_filenames) == 1: shutil.move(input_filenames[0], output_filename) else: output_file = file(output_filename, 'wb', BUFSIZE) try: chunks = [] try: for input_filename in input_filenames: chunks.append(open(input_filename, 'rb', BUFSIZE)) output_file.writelines(merge(chunks, key)) finally: for chunk in chunks: try: chunk.close() except: pass finally: output_file.close() def _try_delete_files(filenames): """Try to remove the named files. Ignore errors.""" for filename in filenames: try: os.remove(filename) except: pass def tempfile_generator(tempdirs=[]): """Yield filenames of temporary files.""" # Create an iterator that will choose directories to hold the # temporary files: if tempdirs: tempdirs = itertools.cycle(tempdirs) else: tempdirs = itertools.repeat(tempfile.gettempdir()) i = 0 while True: (fd, filename) = tempfile.mkstemp( '', 'sort%06i-' % (i,), tempdirs.next(), False ) os.close(fd) yield filename i += 1 def _merge_file_generation( input_filenames, delete_inputs, key=None, max_merge=DEFAULT_MAX_MERGE, tempfiles=None, ): """Merge multiple input files into fewer output files. This is a merge in the sense of mergesort; namely, it is assumed that the input files are each sorted, and (under that assumption) the output file will also be sorted. At most MAX_MERGE input files will be merged at once, to avoid exceeding operating system restrictions on the number of files that can be open at one time. If DELETE_INPUTS is True, then the input files will be deleted when they are no longer needed. If temporary files need to be used, they will be created using the specified TEMPFILES tempfile generator. Generate the names of the output files.""" if max_merge <= 1: raise ValueError('max_merge must be greater than one') if tempfiles is None: tempfiles = tempfile_generator() filenames = list(input_filenames) if len(filenames) <= 1: raise ValueError('It makes no sense to merge a single file') while filenames: group = filenames[:max_merge] del filenames[:max_merge] group_output = tempfiles.next() merge_files_onepass(group, group_output, key=key) if delete_inputs: _try_delete_files(group) yield group_output def merge_files( input_filenames, output_filename, key=None, delete_inputs=False, max_merge=DEFAULT_MAX_MERGE, tempfiles=None, ): """Merge a number of input files into one output file. This is a merge in the sense of mergesort; namely, it is assumed that the input files are each sorted, and (under that assumption) the output file will also be sorted. At most MAX_MERGE input files will be merged at once, to avoid exceeding operating system restrictions on the number of files that can be open at one time. 
If DELETE_INPUTS is True, then the input files will be deleted when they are no longer needed. If temporary files need to be used, they will be created using the specified TEMPFILES tempfile generator.""" filenames = list(input_filenames) if not filenames: # Create an empty file: open(output_filename, 'wb').close() else: if tempfiles is None: tempfiles = tempfile_generator() while len(filenames) > max_merge: # Reduce the number of files by performing groupwise merges: filenames = list( _merge_file_generation( filenames, delete_inputs, key=key, max_merge=max_merge, tempfiles=tempfiles ) ) # After the first iteration, we are only working with temporary # files so they can definitely be deleted them when we are done # with them: delete_inputs = True # The last merge writes the results directly into the output # file: merge_files_onepass(filenames, output_filename, key=key) if delete_inputs: _try_delete_files(filenames) def sort_file( input, output, key=None, buffer_size=32000, tempdirs=[], max_merge=DEFAULT_MAX_MERGE, ): tempfiles = tempfile_generator(tempdirs) filenames = [] input_file = file(input, 'rb', BUFSIZE) try: try: input_iterator = iter(input_file) while True: current_chunk = list(itertools.islice(input_iterator, buffer_size)) if not current_chunk: break current_chunk.sort(key=key) filename = tempfiles.next() filenames.append(filename) f = open(filename, 'w+b', BUFSIZE) try: f.writelines(current_chunk) finally: f.close() finally: input_file.close() merge_files( filenames, output, key=key, delete_inputs=True, max_merge=max_merge, tempfiles=tempfiles, ) finally: _try_delete_files(filenames) cvs2svn-2.4.0/cvs2svn_lib/hg_run_options.py0000664000076500007650000001064011434364604022052 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== from cvs2svn_lib.context import Ctx from cvs2svn_lib.run_options import IncompatibleOption from cvs2svn_lib.dvcs_common import DVCSRunOptions from cvs2svn_lib.hg_output_option import HgOutputOption class HgRunOptions(DVCSRunOptions): short_desc = 'convert a cvs repository into a Mercurial repository' synopsis = """\ .B cvs2hg [\\fIOPTION\\fR]... \\fIOUTPUT-OPTION CVS-REPOS-PATH\\fR .br .B cvs2hg [\\fIOPTION\\fR]... \\fI--options=PATH\\fR """ # XXX paragraph 2 copied straight from svn_run_options.py long_desc = """\ Create a new Mercurial repository based on the version history stored in a CVS repository. Each CVS commit will be mirrored in the Mercurial repository, including commit time and author (with optional remapping to Mercurial-style long usernames). .P \\fICVS-REPOS-PATH\\fR is the filesystem path of the part of the CVS repository that you want to convert. It is not possible to convert a CVS repository to which you only have remote access; see the FAQ for more information. 
This path doesn't have to be the top level directory of a CVS repository; it can point at a project within a repository, in which case only that project will be converted. This path or one of its parent directories has to contain a subdirectory called CVSROOT (though the CVSROOT directory can be empty). .P Unlike CVS or Subversion, Mercurial expects each repository to hold one independent project. If your CVS repository contains multiple independent projects, you should probably convert them to multiple independent Mercurial repositories with multiple runs of .B cvs2hg. """ # XXX copied from svn_run_options.py files = """\ A directory called \\fIcvs2svn-tmp\\fR (or the directory specified by \\fB--tmpdir\\fR) is used as scratch space for temporary data files. """ # XXX the cvs2{svn,git,bzr,hg} man pages should probably reference # each other see_also = [ ('cvs', '1'), ('hg', '1'), ] def __init__(self, *args, **kwargs): # Override some default values ctx = Ctx() ctx.username = "cvs2hg" ctx.symbol_commit_message = ( "artificial changeset to create " "%(symbol_type)s '%(symbol_name)s'") ctx.post_commit_message = ( "artificial changeset: compensate for changes in %(revnum)s " "(on non-trunk default branch in CVS)") super(HgRunOptions, self).__init__(*args, **kwargs) # This is a straight copy of SVNRunOptions._get_extraction_options_group(); # would be nice to refactor, but it's a bit awkward because GitRunOptions # doesn't support --use-internal-co option. def _get_extraction_options_group(self): group = DVCSRunOptions._get_extraction_options_group(self) self._add_use_internal_co_option(group) self._add_use_cvs_option(group) self._add_use_rcs_option(group) return group def _get_output_options_group(self): group = super(HgRunOptions, self)._get_output_options_group() # XXX what if the hg repo already exists? die, clobber, or append? # (currently we die at the start of OutputPass) group.add_option(IncompatibleOption( '--hgrepos', type='string', action='store', help='create Mercurial repository in PATH', man_help=( 'Convert to a Mercurial repository in \\fIpath\\fR. This creates ' 'a new Mercurial repository at \\fIpath\\fR. \\fIpath\\fR must ' 'not already exist.' ), metavar='PATH', )) # XXX --dry-run? return group def process_extraction_options(self): """Process options related to extracting data from the CVS repository.""" self.process_all_extraction_options() def process_output_options(self): Ctx().output_option = HgOutputOption( self.options.hgrepos, author_transforms={}, ) cvs2svn-2.4.0/cvs2svn_lib/config.py0000664000076500007650000002053411434364604020265 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
# ==================================================================== """This module contains various configuration constants used by cvs2svn.""" SVN_KEYWORDS_VALUE = 'Author Date Id Revision' # The default names for the trunk/branches/tags directory for each # project: DEFAULT_TRUNK_BASE = 'trunk' DEFAULT_BRANCHES_BASE = 'branches' DEFAULT_TAGS_BASE = 'tags' SVNADMIN_EXECUTABLE = 'svnadmin' CO_EXECUTABLE = 'co' CVS_EXECUTABLE = 'cvs' # A pickled list of the projects defined for this conversion. PROJECTS = 'projects.pck' # A file holding the Serializer to be used for CVS_REVS_*_DATAFILE and # CVS_SYMBOLS_*_DATAFILE: ITEM_SERIALIZER = 'item-serializer.pck' # The first file contains the CVSRevisions in a form that can be # sorted to deduce preliminary Changesets. The second file is the # sorted version of the first. CVS_REVS_DATAFILE = 'revs.dat' CVS_REVS_SORTED_DATAFILE = 'revs-s.dat' # The first file contains the CVSSymbols in a form that can be sorted # to deduce preliminary Changesets. The second file is the sorted # version of the first. CVS_SYMBOLS_DATAFILE = 'symbols.dat' CVS_SYMBOLS_SORTED_DATAFILE = 'symbols-s.dat' # A mapping from CVSItem id to Changeset id. CVS_ITEM_TO_CHANGESET = 'cvs-item-to-changeset.dat' # A mapping from CVSItem id to Changeset id, after the # RevisionChangeset loops have been broken. CVS_ITEM_TO_CHANGESET_REVBROKEN = 'cvs-item-to-changeset-revbroken.dat' # A mapping from CVSItem id to Changeset id, after the SymbolChangeset # loops have been broken. CVS_ITEM_TO_CHANGESET_SYMBROKEN = 'cvs-item-to-changeset-symbroken.dat' # A mapping from CVSItem id to Changeset id, after all Changeset # loops have been broken. CVS_ITEM_TO_CHANGESET_ALLBROKEN = 'cvs-item-to-changeset-allbroken.dat' # A mapping from id to Changeset. CHANGESETS_INDEX = 'changesets-index.dat' CHANGESETS_STORE = 'changesets.pck' # A mapping from id to Changeset, after the RevisionChangeset loops # have been broken. CHANGESETS_REVBROKEN_INDEX = 'changesets-revbroken-index.dat' CHANGESETS_REVBROKEN_STORE = 'changesets-revbroken.pck' # A mapping from id to Changeset, after the RevisionChangesets have # been sorted and converted into OrderedChangesets. CHANGESETS_REVSORTED_INDEX = 'changesets-revsorted-index.dat' CHANGESETS_REVSORTED_STORE = 'changesets-revsorted.pck' # A mapping from id to Changeset, after the SymbolChangeset loops have # been broken. CHANGESETS_SYMBROKEN_INDEX = 'changesets-symbroken-index.dat' CHANGESETS_SYMBROKEN_STORE = 'changesets-symbroken.pck' # A mapping from id to Changeset, after all Changeset loops have been # broken. CHANGESETS_ALLBROKEN_INDEX = 'changesets-allbroken-index.dat' CHANGESETS_ALLBROKEN_STORE = 'changesets-allbroken.pck' # The RevisionChangesets in commit order. Each line contains the # changeset id and timestamp of one changeset, in hexadecimal, in the # order that the changesets should be committed to svn. CHANGESETS_SORTED_DATAFILE = 'changesets-s.txt' # A file containing a marshalled copy of all the statistics that have # been gathered so far is written at the end of each pass as a # marshalled dictionary. This is the pattern used to generate the # filenames. STATISTICS_FILE = 'statistics-%02d.pck' # This text file contains records (1 per line) that describe openings # and closings for copies to tags and branches. The format is as # follows: # # SYMBOL_ID SVN_REVNUM TYPE CVS_SYMBOL_ID # # where type is either OPENING or CLOSING. CVS_SYMBOL_ID is the id of # the CVSSymbol whose opening or closing is being described (in hex). 
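# For illustration only (the values below are hypothetical; field widths and
# encodings are not significant), a single record in this file might look
# like:
#
#   00001a3f 000057 OPENING 0002b4c1
#
# i.e. the symbol id, the SVN revision number, OPENING or CLOSING, and the
# id of the CVSSymbol being opened or closed, as described above.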
SYMBOL_OPENINGS_CLOSINGS = 'symbolic-names.txt' # A sorted version of the above file. SYMBOL_ID and SVN_REVNUM are # the primary and secondary sorting criteria. It is important that # SYMBOL_IDs be located together to make it quick to read them at # once. The order of SVN_REVNUM is only important because it is # assumed by some internal consistency checks. SYMBOL_OPENINGS_CLOSINGS_SORTED = 'symbolic-names-s.txt' # Skeleton version of the repository filesystem. See class # RepositoryMirror for how these work. MIRROR_NODES_INDEX_TABLE = 'mirror-nodes-index.dat' MIRROR_NODES_STORE = 'mirror-nodes.pck' # Offsets pointing to the beginning of each symbol's records in # SYMBOL_OPENINGS_CLOSINGS_SORTED. This file contains a pickled map # from symbol_id to file offset. SYMBOL_OFFSETS_DB = 'symbol-offsets.pck' # Pickled map of CVSPath.id to instance. CVS_PATHS_DB = 'cvs-paths.pck' # A series of records. The first is a pickled serializer. Each # subsequent record is a serialized list of all CVSItems applying to a # CVSFile. CVS_ITEMS_STORE = 'cvs-items.pck' # The same as above, but with the CVSItems ordered in groups based on # their initial changesets. CVSItems will usually be accessed one # changeset at a time, so this ordering helps disk locality (even # though some of the changesets will later be broken up). CVS_ITEMS_SORTED_INDEX_TABLE = 'cvs-items-sorted-index.dat' CVS_ITEMS_SORTED_STORE = 'cvs-items-sorted.pck' # A record of all symbolic names that will be processed in the # conversion. This file contains a pickled list of TypedSymbol # objects. SYMBOL_DB = 'symbols.pck' # A pickled list of the statistics for all symbols. Each entry in the # list is an instance of cvs2svn_lib.symbol_statistics._Stats. SYMBOL_STATISTICS = 'symbol-statistics.pck' # These two databases provide a bidirectional mapping between # CVSRevision.ids (in hex) and Subversion revision numbers. # # The first maps CVSRevision.id to the SVN revision number of which it # is a part (more than one CVSRevision can map to the same SVN # revision number). # # The second maps Subversion revision numbers (as hex strings) to # pickled SVNCommit instances. CVS_REVS_TO_SVN_REVNUMS = 'cvs-revs-to-svn-revnums.dat' # This database maps Subversion revision numbers to pickled SVNCommit # instances. SVN_COMMITS_INDEX_TABLE = 'svn-commits-index.dat' SVN_COMMITS_STORE = 'svn-commits.pck' # How many bytes to read at a time from a pipe. 128 kiB should be # large enough to be efficient without wasting too much memory. PIPE_READ_SIZE = 128 * 1024 # Records the author and log message for each changeset. The database # contains a map metadata_id -> (author, logmessage). Each # CVSRevision that is eligible to be combined into the same SVN commit # is assigned the same id. Note that the (author, logmessage) pairs # are not necessarily all distinct; other data are taken into account # when constructing ids. METADATA_INDEX_TABLE = 'metadata-index.dat' METADATA_STORE = 'metadata.pck' # The same, after it has been cleaned up for the chosen output option: METADATA_CLEAN_INDEX_TABLE = 'metadata-clean-index.dat' METADATA_CLEAN_STORE = 'metadata-clean.pck' # The following four databases are used in conjunction with --use-internal-co. # Records the RCS deltas for all CVS revisions. The deltas are to be # applied forward, i.e. those from trunk are reversed wrt RCS. RCS_DELTAS_INDEX_TABLE = 'rcs-deltas-index.dat' RCS_DELTAS_STORE = 'rcs-deltas.pck' # Records the revision tree of each RCS file. The format is a list of # list of integers. 
The outer list holds lines of development, the inner list # revisions within the LODs, revisions are CVSItem ids. Branches "closer # to the trunk" appear later. Revisions are sorted by reverse chronological # order. The last revision of each branch is the revision it sprouts from. # Revisions that represent deletions at the end of a branch are omitted. RCS_TREES_INDEX_TABLE = 'rcs-trees-index.dat' RCS_TREES_STORE = 'rcs-trees.pck' # At any given time during OutputPass, holds the full text of each CVS # revision that was checked out already and still has descendants that will # be checked out. CVS_CHECKOUT_DB = 'cvs-checkout.db' # End of DBs related to --use-internal-co. # flush a commit if a 5 minute gap occurs. COMMIT_THRESHOLD = 5 * 60 cvs2svn-2.4.0/cvs2svn_lib/project.py0000664000076500007650000001673411710517256020475 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains database facilities used by cvs2svn.""" import os import cPickle from cvs2svn_lib.context import Ctx from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import IllegalSVNPathError from cvs2svn_lib.common import normalize_svn_path from cvs2svn_lib.common import verify_paths_disjoint from cvs2svn_lib.symbol_transform import CompoundSymbolTransform class FileInAndOutOfAtticException(Exception): def __init__(self, non_attic_path, attic_path): Exception.__init__( self, "A CVS repository cannot contain both %s and %s" % (non_attic_path, attic_path)) self.non_attic_path = non_attic_path self.attic_path = attic_path def normalize_ttb_path(opt, path, allow_empty=False): try: return normalize_svn_path(path, allow_empty) except IllegalSVNPathError, e: raise FatalError('Problem with %s: %s' % (opt, e,)) class Project(object): """A project within a CVS repository.""" def __init__( self, id, project_cvs_repos_path, initial_directories=[], symbol_transforms=None, exclude_paths=[], ): """Create a new Project record. ID is a unique id for this project. PROJECT_CVS_REPOS_PATH is the main CVS directory for this project (within the filesystem). INITIAL_DIRECTORIES is an iterable of all SVN directories that should be created when the project is first created. Normally, this should include the trunk, branches, and tags directory. SYMBOL_TRANSFORMS is an iterable of SymbolTransform instances which will be used to transform any symbol names within this project. EXCLUDE_PATHS is an iterable of paths that should be excluded from the conversion. The paths should be relative to PROJECT_CVS_REPOS_PATH and use slashes ('/'). Paths for individual files should include the ',v' extension. 
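For illustration only (the id, repository path, and excluded file below
are hypothetical; PROJECT_CVS_REPOS_PATH must actually exist on disk), a
Project might be instantiated roughly like this:

  Project(
      1,
      '/cvsroot/myproject',
      initial_directories=['trunk', 'branches', 'tags'],
      exclude_paths=['old-stuff/obsolete.txt,v'],
      )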
""" self.id = id self.project_cvs_repos_path = os.path.normpath(project_cvs_repos_path) if not os.path.isdir(self.project_cvs_repos_path): raise FatalError("The specified CVS repository path '%s' is not an " "existing directory." % self.project_cvs_repos_path) self.cvs_repository_root, self.cvs_module = \ self.determine_repository_root( os.path.abspath(self.project_cvs_repos_path)) # The SVN directories to add when the project is first created: self._initial_directories = [] for path in initial_directories: try: path = normalize_svn_path(path, False) except IllegalSVNPathError, e: raise FatalError( 'Initial directory %r is not a legal SVN path: %s' % (path, e,) ) self._initial_directories.append(path) verify_paths_disjoint(*self._initial_directories) # A list of transformation rules (regexp, replacement) applied to # symbol names in this project. if symbol_transforms is None: symbol_transforms = [] self.symbol_transform = CompoundSymbolTransform(symbol_transforms) self.exclude_paths = set(exclude_paths) # The ID of the Trunk instance for this Project. This member is # filled in during CollectRevsPass. self.trunk_id = None # The ID of the CVSDirectory representing the root directory of # this project. This member is filled in during CollectRevsPass. self.root_cvs_directory_id = None def __eq__(self, other): return self.id == other.id def __cmp__(self, other): return cmp(self.cvs_module, other.cvs_module) \ or cmp(self.id, other.id) def __hash__(self): return self.id @staticmethod def determine_repository_root(path): """Ascend above the specified PATH if necessary to find the cvs_repository_root (a directory containing a CVSROOT directory) and the cvs_module (the path of the conversion root within the cvs repository). Return the root path and the module path of this project relative to the root. NB: cvs_module must be seperated by '/', *not* by os.sep.""" def is_cvs_repository_root(path): return os.path.isdir(os.path.join(path, 'CVSROOT')) original_path = path cvs_module = '' while not is_cvs_repository_root(path): # Step up one directory: prev_path = path path, module_component = os.path.split(path) if path == prev_path: # Hit the root (of the drive, on Windows) without finding a # CVSROOT dir. raise FatalError( "the path '%s' is not a CVS repository, nor a path " "within a CVS repository. A CVS repository contains " "a CVSROOT directory within its root directory." % (original_path,)) cvs_module = module_component + "/" + cvs_module return path, cvs_module def transform_symbol(self, cvs_file, symbol_name, revision): """Transform the symbol SYMBOL_NAME. SYMBOL_NAME refers to revision number REVISION in CVS_FILE. REVISION is the CVS revision number as a string, with zeros removed (e.g., '1.7' or '1.7.2'). Use the renaming rules specified with --symbol-transform to possibly rename the symbol. Return the transformed symbol name, the original name if it should not be transformed, or None if the symbol should be omitted from the conversion.""" return self.symbol_transform.transform(cvs_file, symbol_name, revision) def get_trunk(self): """Return the Trunk instance for this project. This method can only be called after self.trunk_id has been initialized in CollectRevsPass.""" return Ctx()._symbol_db.get_symbol(self.trunk_id) def get_root_cvs_directory(self): """Return the root CVSDirectory instance for this project. 
This method can only be called after self.root_cvs_directory_id has been initialized in CollectRevsPass.""" return Ctx()._cvs_path_db.get_path(self.root_cvs_directory_id) def get_initial_directories(self): """Generate the project's initial SVN directories. Yield as strings the SVN paths of directories that should be created when the project is first created.""" # Yield the path of the Trunk symbol for this project (which might # differ from the one passed to the --trunk option because of # SymbolStrategyRules). The trunk path might be '' during a # trunk-only conversion, but that is OK because DumpstreamDelegate # considers that directory to exist already and will therefore # ignore it: yield self.get_trunk().base_path for path in self._initial_directories: yield path def __str__(self): return self.project_cvs_repos_path def read_projects(filename): retval = {} for project in cPickle.load(open(filename, 'rb')): retval[project.id] = project return retval def write_projects(filename): cPickle.dump(Ctx()._projects.values(), open(filename, 'wb'), -1) cvs2svn-2.4.0/cvs2svn_lib/process.py0000664000076500007650000000707311500107341020464 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains generic utilities used by cvs2svn.""" import subprocess from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import CommandError from cvs2svn_lib.log import logger def call_command(command, **kw): """Call the specified command, checking that it exits successfully. Raise a FatalError if the command cannot be executed, or if it exits with a non-zero exit code. Pass KW as keyword arguments to subprocess.call().""" logger.debug('Running command %r' % (command,)) try: retcode = subprocess.call(command, **kw) if retcode < 0: raise FatalError( 'Command terminated by signal %d: "%s"' % (-retcode, ' '.join(command),) ) elif retcode > 0: raise FatalError( 'Command failed with return code %d: "%s"' % (retcode, ' '.join(command),) ) except OSError, e: raise FatalError( 'Command execution failed (%s): "%s"' % (e, ' '.join(command),) ) class CommandFailedException(Exception): """Exception raised if check_command_runs() fails.""" pass def check_command_runs(command, commandname): """Check whether the command CMD can be executed without errors. CMD is a list or string, as accepted by subprocess.Popen(). CMDNAME is the name of the command as it should be included in exception error messages. This function checks three things: (1) the command can be run without throwing an OSError; (2) it exits with status=0; (3) it doesn't output anything to stderr. 
If any of these conditions is not met, raise a CommandFailedException describing the problem.""" logger.debug('Running command %r' % (command,)) try: pipe = subprocess.Popen( command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, ) except OSError, e: raise CommandFailedException('error executing %s: %s' % (commandname, e,)) (stdout, stderr) = pipe.communicate() if pipe.returncode or stderr: msg = 'error executing %s; returncode=%s' % (commandname, pipe.returncode,) if stderr: msg += ', error output:\n%s' % (stderr,) raise CommandFailedException(msg) def get_command_output(command): """Run COMMAND and return its stdout. COMMAND is a list of strings. Run the command and return its stdout as a string. If the command exits with a nonzero return code or writes something to stderr, raise a CommandError.""" """A file-like object from which revision contents can be read.""" logger.debug('Running command %r' % (command,)) pipe = subprocess.Popen( command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, ) (stdout, stderr) = pipe.communicate() if pipe.returncode or stderr: raise CommandError(' '.join(command), pipe.returncode, stderr) return stdout cvs2svn-2.4.0/cvs2svn_lib/common.py0000664000076500007650000003050611710517256020310 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains common facilities used by cvs2svn.""" import time import codecs from cvs2svn_lib.log import logger # Always use these constants for opening databases. DB_OPEN_READ = 'r' DB_OPEN_WRITE = 'w' DB_OPEN_NEW = 'n' SVN_INVALID_REVNUM = -1 # Warnings and errors start with these strings. They are typically # followed by a colon and a space, as in "%s: " ==> "WARNING: ". warning_prefix = "WARNING" error_prefix = "ERROR" class FatalException(Exception): """Exception thrown on a non-recoverable error. If this exception is thrown by main(), it is caught by the global layer of the program, its string representation is printed (followed by a newline), and the program is ended with an exit code of 1.""" pass class InternalError(Exception): """Exception thrown in the case of a cvs2svn internal error (aka, bug).""" pass class FatalError(FatalException): """A FatalException that prepends error_prefix to the message.""" def __init__(self, msg): """Use (error_prefix + ': ' + MSG) as the error message.""" FatalException.__init__(self, '%s: %s' % (error_prefix, msg,)) class CommandError(FatalError): """A FatalError caused by a failed command invocation. 
The error message includes the command name, exit code, and output.""" def __init__(self, command, exit_status, error_output=''): self.command = command self.exit_status = exit_status self.error_output = error_output if error_output.rstrip(): FatalError.__init__( self, 'The command %r failed with exit status=%s\n' 'and the following output:\n' '%s' % (self.command, self.exit_status, self.error_output.rstrip())) else: FatalError.__init__( self, 'The command %r failed with exit status=%s and no output' % (self.command, self.exit_status)) def canonicalize_eol(text, eol): """Replace any end-of-line sequences in TEXT with the string EOL.""" text = text.replace('\r\n', '\n') text = text.replace('\r', '\n') if eol != '\n': text = text.replace('\n', eol) return text def path_join(*components): """Join two or more pathname COMPONENTS, inserting '/' as needed. Empty component are skipped.""" return '/'.join(filter(None, components)) def path_split(path): """Split the svn pathname PATH into a pair, (HEAD, TAIL). This is similar to os.path.split(), but always uses '/' as path separator. PATH is an svn path, which should not start with a '/'. HEAD is everything before the last slash, and TAIL is everything after. If PATH ends in a slash, TAIL will be empty. If there is no slash in PATH, HEAD will be empty. If PATH is empty, both HEAD and TAIL are empty.""" pos = path.rfind('/') if pos == -1: return ('', path,) else: return (path[:pos], path[pos+1:],) class IllegalSVNPathError(FatalException): pass def normalize_svn_path(path, allow_empty=False): """Normalize an SVN path (e.g., one supplied by a user). 1. Strip leading, trailing, and duplicated '/'. 2. If ALLOW_EMPTY is not set, verify that PATH is not empty. Return the normalized path. If the path is invalid, raise an IllegalSVNPathError.""" norm_path = path_join(*path.split('/')) if not allow_empty and not norm_path: raise IllegalSVNPathError("Path is empty") return norm_path class PathRepeatedException(Exception): def __init__(self, path, count): self.path = path self.count = count Exception.__init__( self, 'Path %s is repeated %d times' % (self.path, self.count,) ) class PathsNestedException(Exception): def __init__(self, nest, nestlings): self.nest = nest self.nestlings = nestlings Exception.__init__( self, 'Path %s contains the following other paths: %s' % (self.nest, ', '.join(self.nestlings),) ) class PathsNotDisjointException(FatalException): """An exception that collects multiple other disjointness exceptions.""" def __init__(self, problems): self.problems = problems Exception.__init__( self, 'The following paths are not disjoint:\n' ' %s\n' % ('\n '.join([str(problem) for problem in self.problems]),) ) def verify_paths_disjoint(*paths): """Verify that all of the paths in the argument list are disjoint. If any of the paths is nested in another one (i.e., in the sense that 'a/b/c/d' is nested in 'a/b'), or any two paths are identical, raise a PathsNotDisjointException containing exceptions detailing the individual problems.""" def split(path): if not path: return [] else: return path.split('/') def contains(split_path1, split_path2): """Return True iff SPLIT_PATH1 contains SPLIT_PATH2.""" return ( len(split_path1) < len(split_path2) and split_path2[:len(split_path1)] == split_path1 ) paths = [(split(path), path) for path in paths] # If all overlapping elements are equal, a shorter list is # considered "less than" a longer one. 
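# (For example -- an illustrative case, not taken from real data --
# split('a') == ['a'] sorts before split('a/b') == ['a', 'b'], so a
# containing path ends up immediately before the paths nested in it.)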
Therefore if any paths are # nested, this sort will leave at least one such pair adjacent, in # the order [nest,nestling]. paths.sort() problems = [] # Create exceptions for any repeated paths, and delete the repeats # from the paths array: i = 0 while i < len(paths): split_path, path = paths[i] j = i + 1 while j < len(paths) and split_path == paths[j][0]: j += 1 if j - i > 1: problems.append(PathRepeatedException(path, j - i)) # Delete all but the first copy: del paths[i + 1:j] i += 1 # Create exceptions for paths nested in each other: i = 0 while i < len(paths): split_path, path = paths[i] j = i + 1 while j < len(paths) and contains(split_path, paths[j][0]): j += 1 if j - i > 1: problems.append(PathsNestedException( path, [path2 for (split_path2, path2) in paths[i + 1:j]] )) i += 1 if problems: raise PathsNotDisjointException(problems) def is_trunk_revision(rev): """Return True iff REV is a trunk revision. REV is a CVS revision number (e.g., '1.6' or '1.6.4.5'). Return True iff the revision is on trunk.""" return rev.count('.') == 1 def is_branch_revision_number(rev): """Return True iff REV is a branch revision number. REV is a CVS revision number in canonical form (i.e., with zeros removed). Return True iff it refers to a whole branch, as opposed to a single revision.""" return rev.count('.') % 2 == 0 def format_date(date): """Return an svn-compatible date string for DATE (seconds since epoch). A Subversion date looks like '2002-09-29T14:44:59.000000Z'.""" return time.strftime("%Y-%m-%dT%H:%M:%S.000000Z", time.gmtime(date)) class CVSTextDecoder: """Callable that decodes CVS strings into Unicode. Members: decoders -- a list [(name, CodecInfo.decode), ...] containing the names and decoders that will be used in 'strict' mode to attempt to decode inputs. fallback_decoder -- a tuple (name, CodecInfo.decode) containing the name and decoder that will be used in 'replace' mode if none of the decoders in DECODERS succeeds. If no fallback_decoder has been specified, this member contains None. eol_fix -- a string to which all EOL sequences will be converted, or None if they should be left unchanged. """ def __init__(self, encodings, fallback_encoding=None, eol_fix=None): """Create a CVSTextDecoder instance. ENCODINGS is a list containing the names of encodings that are attempted to be used as source encodings in 'strict' mode. FALLBACK_ENCODING, if specified, is the name of an encoding that should be used as a source encoding in lossy 'replace' mode if all of ENCODINGS failed. EOL_FIX is the string to which all EOL sequences should be converted. If it is set to None, then EOL sequences are left unchanged. Raise LookupError if any of the specified encodings is unknown.""" self.decoders = [] for encoding in encodings: self.add_encoding(encoding) self.set_fallback_encoding(fallback_encoding) self.eol_fix = eol_fix def add_encoding(self, encoding): """Add an encoding to be tried in 'strict' mode. ENCODING is the name of an encoding. If it is unknown, raise a LookupError.""" for (name, decoder) in self.decoders: if name == encoding: return else: self.decoders.append( (encoding, codecs.lookup(encoding)[1]) ) def set_fallback_encoding(self, encoding): """Set the fallback encoding, to be tried in 'replace' mode. ENCODING is the name of an encoding. If it is unknown, raise a LookupError.""" if encoding is None: self.fallback_decoder = None else: self.fallback_decoder = (encoding, codecs.lookup(encoding)[1]) def decode(self, s): """Try to decode string S using our configured source encodings. 
Return the string as a Unicode string. If S is already a unicode string, do nothing. Raise UnicodeError if the string cannot be decoded using any of the source encodings and no fallback encoding was specified.""" if isinstance(s, unicode): return s for (name, decoder) in self.decoders: try: return decoder(s)[0] except ValueError: logger.verbose("Encoding '%s' failed for string %r" % (name, s)) if self.fallback_decoder is not None: (name, decoder) = self.fallback_decoder return decoder(s, 'replace')[0] else: raise UnicodeError() def __call__(self, s): s = self.decode(s) if self.eol_fix is not None: s = canonicalize_eol(s, self.eol_fix) return s class Timestamper: """Return monotonic timestamps derived from changeset timestamps.""" def __init__(self): # The last timestamp that has been returned: self.timestamp = 0.0 # The maximum timestamp that is considered reasonable: self.max_timestamp = time.time() + 24.0 * 60.0 * 60.0 def get(self, timestamp, change_expected): """Return a reasonable timestamp derived from TIMESTAMP. Push TIMESTAMP into the future if necessary to ensure that it is at least one second later than every other timestamp that has been returned by previous calls to this method. If CHANGE_EXPECTED is not True, then log a message if the timestamp has to be changed.""" if timestamp > self.max_timestamp: # If a timestamp is in the future, it is assumed that it is # bogus. Shift it backwards in time to prevent it forcing other # timestamps to be pushed even further in the future. # Note that this is not nearly a complete solution to the bogus # timestamp problem. A timestamp in the future still affects # the ordering of changesets, and a changeset having such a # timestamp will not be committed until all changesets with # earlier timestamps have been committed, even if other # changesets with even earlier timestamps depend on this one. self.timestamp = self.timestamp + 1.0 if not change_expected: logger.warn( 'Timestamp "%s" is in the future; changed to "%s".' % (time.asctime(time.gmtime(timestamp)), time.asctime(time.gmtime(self.timestamp)),) ) elif timestamp < self.timestamp + 1.0: self.timestamp = self.timestamp + 1.0 if not change_expected and logger.is_on(logger.VERBOSE): logger.verbose( 'Timestamp "%s" adjusted to "%s" to ensure monotonicity.' % (time.asctime(time.gmtime(timestamp)), time.asctime(time.gmtime(self.timestamp)),) ) else: self.timestamp = timestamp return self.timestamp cvs2svn-2.4.0/cvs2svn_lib/svn_commit.py0000664000076500007650000002564411500107341021170 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains the SVNCommit classes. There are five types of SVNCommits: SVNInitialProjectCommit -- Initializes a project (creates its trunk, branches, and tags directories). 
SVNPrimaryCommit -- Commits one or more CVSRevisions on one or more lines of development. SVNBranchCommit -- Creates or fills a branch; that is, copies files from a source line of development to a target branch. SVNTagCommit -- Creates or fills a tag; that is, copies files from a source line of development to a target tag. SVNPostCommit -- Updates trunk to reflect changes on a non-trunk default branch. """ from cvs2svn_lib.common import InternalError from cvs2svn_lib.context import Ctx from cvs2svn_lib.symbol import Branch from cvs2svn_lib.symbol import Tag class SVNCommit: """This represents one commit to the Subversion Repository.""" def __init__(self, date, revnum): """Instantiate an SVNCommit. REVNUM is the SVN revision number of this commit.""" # The date of the commit, as an integer. While the SVNCommit is # being built up, this contains the latest date seen so far. This # member is set externally. self.date = date # The SVN revision number of this commit, as an integer. self.revnum = revnum def __getstate__(self): return (self.date, self.revnum,) def __setstate__(self, state): (self.date, self.revnum,) = state def get_cvs_items(self): """Return a list containing the CVSItems in this commit.""" raise NotImplementedError() def get_author(self): """Return the author or this commit, or None if none is to be used. The return value is exactly as the author appeared in the RCS file, with undefined character encoding.""" raise NotImplementedError() def get_log_msg(self): """Return a log message for this commit. The return value is exactly as the log message appeared in the RCS file, with undefined character encoding.""" raise NotImplementedError() def get_warning_summary(self): """Return a summary of this commit that can be used in warnings.""" return '(subversion rev %s)' % (self.revnum,) def get_description(self): """Return a partial description of this SVNCommit, for logging.""" raise NotImplementedError() def output(self, output_option): """Cause this commit to be output to OUTPUT_OPTION. This method is used for double-dispatch. Derived classes should call the OutputOption.process_*_commit() method appropriate for the type of SVNCommit.""" raise NotImplementedError() def __str__(self): """ Print a human-readable description of this SVNCommit. 
This description is not intended to be machine-parseable.""" ret = "SVNCommit #: " + str(self.revnum) + "\n" ret += " debug description: " + self.get_description() + "\n" return ret class SVNInitialProjectCommit(SVNCommit): def __init__(self, date, projects, revnum): SVNCommit.__init__(self, date, revnum) self.projects = list(projects) def __getstate__(self): return ( SVNCommit.__getstate__(self), [project.id for project in self.projects], ) def __setstate__(self, state): (svn_commit_state, project_ids,) = state SVNCommit.__setstate__(self, svn_commit_state) self.projects = [ Ctx()._projects[project_id] for project_id in project_ids ] def get_cvs_items(self): return [] def get_author(self): return Ctx().username def get_log_msg(self): return Ctx().text_wrapper.fill( Ctx().initial_project_commit_message % {} ) def get_description(self): return 'Project initialization' def output(self, output_option): output_option.process_initial_project_commit(self) class SVNRevisionCommit(SVNCommit): """A SVNCommit that includes actual CVS revisions.""" def __init__(self, cvs_revs, date, revnum): SVNCommit.__init__(self, date, revnum) self.cvs_revs = list(cvs_revs) # This value is set lazily by _get_metadata(): self._metadata = None def __getstate__(self): """Return the part of the state represented by this mixin.""" return ( SVNCommit.__getstate__(self), [cvs_rev.id for cvs_rev in self.cvs_revs], ) def __setstate__(self, state): """Restore the part of the state represented by this mixin.""" (svn_commit_state, cvs_rev_ids) = state SVNCommit.__setstate__(self, svn_commit_state) self.cvs_revs = [ cvs_rev for (id, cvs_rev) in Ctx()._cvs_items_db.get_many(cvs_rev_ids) ] self._metadata = None def get_cvs_items(self): return self.cvs_revs def _get_metadata(self): """Return the Metadata instance for this commit.""" if self._metadata is None: # Set self._metadata for this commit from that of the first cvs # revision. if not self.cvs_revs: raise InternalError('SVNPrimaryCommit contains no CVS revisions') metadata_id = self.cvs_revs[0].metadata_id self._metadata = Ctx()._metadata_db[metadata_id] return self._metadata def get_author(self): return self._get_metadata().author def get_warning_summary(self): retval = [] retval.append(SVNCommit.get_warning_summary(self) + ' Related files:') for cvs_rev in self.cvs_revs: retval.append(' ' + cvs_rev.cvs_file.rcs_path) return '\n'.join(retval) def __str__(self): """Return the revision part of a description of this SVNCommit. Derived classes should append the output of this method to the output of SVNCommit.__str__().""" ret = [] ret.append(SVNCommit.__str__(self)) ret.append(' cvs_revs:\n') for cvs_rev in self.cvs_revs: ret.append(' %x\n' % (cvs_rev.id,)) return ''.join(ret) class SVNPrimaryCommit(SVNRevisionCommit): def __init__(self, cvs_revs, date, revnum): SVNRevisionCommit.__init__(self, cvs_revs, date, revnum) def get_log_msg(self): """Return the actual log message for this commit.""" return self._get_metadata().log_msg def get_description(self): return 'commit' def output(self, output_option): output_option.process_primary_commit(self) class SVNPostCommit(SVNRevisionCommit): def __init__(self, motivating_revnum, cvs_revs, date, revnum): SVNRevisionCommit.__init__(self, cvs_revs, date, revnum) # The subversion revision number of the *primary* commit where the # default branch changes actually happened. (NOTE: Secondary # commits that fill branches and tags also have a motivating # commit, but we do not record it because it is (currently) not # needed for anything.) 
motivating_revnum is used when generating # the log message for the commit that synchronizes the default # branch with trunk. # # It is possible for multiple synchronization commits to refer to # the same motivating commit revision number, and it is possible # for a single synchronization commit to contain CVSRevisions on # multiple different default branches. self.motivating_revnum = motivating_revnum def __getstate__(self): return ( SVNRevisionCommit.__getstate__(self), self.motivating_revnum, ) def __setstate__(self, state): (rev_state, self.motivating_revnum,) = state SVNRevisionCommit.__setstate__(self, rev_state) def get_cvs_items(self): # It might seem that we should return # SVNRevisionCommit.get_cvs_items(self) here, but this commit # doesn't really include those CVSItems, but rather followup # commits to those. return [] def get_log_msg(self): """Return a manufactured log message for this commit.""" return Ctx().text_wrapper.fill( Ctx().post_commit_message % {'revnum' : self.motivating_revnum} ) def get_description(self): return 'post-commit default branch(es)' def output(self, output_option): output_option.process_post_commit(self) class SVNSymbolCommit(SVNCommit): def __init__(self, symbol, cvs_symbol_ids, date, revnum): SVNCommit.__init__(self, date, revnum) # The TypedSymbol that is filled in this SVNCommit. self.symbol = symbol self.cvs_symbol_ids = cvs_symbol_ids def __getstate__(self): return ( SVNCommit.__getstate__(self), self.symbol.id, self.cvs_symbol_ids, ) def __setstate__(self, state): (svn_commit_state, symbol_id, self.cvs_symbol_ids) = state SVNCommit.__setstate__(self, svn_commit_state) self.symbol = Ctx()._symbol_db.get_symbol(symbol_id) def get_cvs_items(self): return [ cvs_symbol for (id, cvs_symbol) in Ctx()._cvs_items_db.get_many(self.cvs_symbol_ids) ] def _get_symbol_type(self): """Return the type of the self.symbol ('branch' or 'tag').""" raise NotImplementedError() def get_author(self): return Ctx().username def get_log_msg(self): """Return a manufactured log message for this commit.""" return Ctx().text_wrapper.fill( Ctx().symbol_commit_message % { 'symbol_type' : self._get_symbol_type(), 'symbol_name' : self.symbol.name, } ) def get_description(self): return 'copying to %s %r' % (self._get_symbol_type(), self.symbol.name,) def __str__(self): """ Print a human-readable description of this SVNCommit. This description is not intended to be machine-parseable.""" return ( SVNCommit.__str__(self) + " symbolic name: %s\n" % (self.symbol.name,) ) class SVNBranchCommit(SVNSymbolCommit): def __init__(self, symbol, cvs_symbol_ids, date, revnum): if not isinstance(symbol, Branch): raise InternalError('Incorrect symbol type %r' % (symbol,)) SVNSymbolCommit.__init__(self, symbol, cvs_symbol_ids, date, revnum) def _get_symbol_type(self): return 'branch' def output(self, output_option): output_option.process_branch_commit(self) class SVNTagCommit(SVNSymbolCommit): def __init__(self, symbol, cvs_symbol_ids, date, revnum): if not isinstance(symbol, Tag): raise InternalError('Incorrect symbol type %r' % (symbol,)) SVNSymbolCommit.__init__(self, symbol, cvs_symbol_ids, date, revnum) def _get_symbol_type(self): return 'tag' def output(self, output_option): output_option.process_tag_commit(self) cvs2svn-2.4.0/cvs2svn_lib/symbol_database.py0000664000076500007650000000375611244047703022155 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. 
All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains the SymbolDatabase class.""" import cPickle from cvs2svn_lib import config from cvs2svn_lib.artifact_manager import artifact_manager class SymbolDatabase: """Read-only access to symbol database. This class allows iteration and lookups id -> symbol, where symbol is a TypedSymbol instance. The whole database is read into memory upon construction.""" def __init__(self): # A map { id : TypedSymbol } self._symbols = {} f = open(artifact_manager.get_temp_file(config.SYMBOL_DB), 'rb') symbols = cPickle.load(f) f.close() for symbol in symbols: self._symbols[symbol.id] = symbol def get_symbol(self, id): """Return the symbol instance with id ID. Raise KeyError if the symbol is not known.""" return self._symbols[id] def __iter__(self): """Iterate over the Symbol instances within this database.""" return self._symbols.itervalues() def close(self): self._symbols = None def create_symbol_database(symbols): """Create and fill a symbol database. Record each symbol that is listed in SYMBOLS, which is an iterable containing Trunk and TypedSymbol objects.""" f = open(artifact_manager.get_temp_file(config.SYMBOL_DB), 'wb') cPickle.dump(symbols, f, -1) f.close() cvs2svn-2.4.0/cvs2svn_lib/artifact_manager.py0000664000076500007650000002010111434364604022275 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module manages the artifacts produced by conversion passes.""" from cvs2svn_lib.log import logger from cvs2svn_lib.artifact import TempFile class ArtifactNotActiveError(Exception): """An artifact was requested when no passes that have registered that they need it are active.""" def __init__(self, artifact_name): Exception.__init__( self, 'Artifact %s is not currently active' % artifact_name) class ArtifactManager: """Manage artifacts that are created by one pass but needed by others. This class is responsible for cleaning up artifacts once they are no longer needed. The trick is that cvs2svn can be run pass by pass, so not all passes might be executed during a specific program run. To use this class: - Call artifact_manager.set_artifact(name, artifact) once for each known artifact. 
- Call artifact_manager.creates(which_pass, artifact) to indicate that WHICH_PASS is the pass that creates ARTIFACT. - Call artifact_manager.uses(which_pass, artifact) to indicate that WHICH_PASS needs to use ARTIFACT. There are also helper methods register_temp_file(), register_artifact_needed(), and register_temp_file_needed() which combine some useful operations. Then, in pass order: - Call pass_skipped() for any passes that were already executed during a previous cvs2svn run. - Call pass_started() when a pass is about to start execution. - If a pass that has been started will be continued during the next program run, then call pass_continued(). - If a pass that has been started finishes execution, call pass_done(), to allow any artifacts that won't be needed anymore to be cleaned up. - Call pass_deferred() for any passes that have been deferred to a future cvs2svn run. Finally: - Call check_clean() to verify that all artifacts have been accounted for.""" def __init__(self): # A map { artifact_name : artifact } of known artifacts. self._artifacts = { } # A map { pass : set_of_artifacts }, where set_of_artifacts is a # set of artifacts needed by the pass. self._pass_needs = { } # A set of passes that are currently being executed. self._active_passes = set() def set_artifact(self, name, artifact): """Add ARTIFACT to the list of artifacts that we manage. Store it under NAME.""" assert name not in self._artifacts self._artifacts[name] = artifact def get_artifact(self, name): """Return the artifact with the specified name. If the artifact does not currently exist, raise a KeyError. If it is not registered as being needed by one of the active passes, raise an ArtifactNotActiveError.""" artifact = self._artifacts[name] for active_pass in self._active_passes: if artifact in self._pass_needs[active_pass]: # OK return artifact else: raise ArtifactNotActiveError(name) def creates(self, which_pass, artifact): """Register that WHICH_PASS creates ARTIFACT. ARTIFACT must already have been registered.""" # An artifact is automatically "needed" in the pass in which it is # created: self.uses(which_pass, artifact) def uses(self, which_pass, artifact): """Register that WHICH_PASS uses ARTIFACT. ARTIFACT must already have been registered.""" artifact._passes_needed.add(which_pass) if which_pass in self._pass_needs: self._pass_needs[which_pass].add(artifact) else: self._pass_needs[which_pass] = set([artifact]) def register_temp_file(self, basename, which_pass): """Register a temporary file with base name BASENAME as an artifact. Return the filename of the temporary file.""" artifact = TempFile(basename) self.set_artifact(basename, artifact) self.creates(which_pass, artifact) def get_temp_file(self, basename): """Return the filename of the temporary file with the specified BASENAME. If the temporary file is not an existing, registered TempFile, raise a KeyError.""" return self.get_artifact(basename).filename def register_artifact_needed(self, artifact_name, which_pass): """Register that WHICH_PASS uses the artifact named ARTIFACT_NAME. An artifact with this name must already have been registered.""" artifact = self._artifacts[artifact_name] artifact._passes_needed.add(which_pass) if which_pass in self._pass_needs: self._pass_needs[which_pass].add(artifact) else: self._pass_needs[which_pass] = set([artifact,]) def register_temp_file_needed(self, basename, which_pass): """Register that a temporary file is needed by WHICH_PASS. 
Register that the temporary file with base name BASENAME is needed by WHICH_PASS.""" self.register_artifact_needed(basename, which_pass) def _unregister_artifacts(self, which_pass): """Unregister any artifacts that were needed for WHICH_PASS. Return a list of artifacts that are no longer needed at all.""" try: artifacts = list(self._pass_needs[which_pass]) except KeyError: # No artifacts were needed for that pass: return [] del self._pass_needs[which_pass] unneeded_artifacts = [] for artifact in artifacts: artifact._passes_needed.remove(which_pass) if not artifact._passes_needed: unneeded_artifacts.append(artifact) return unneeded_artifacts def pass_skipped(self, which_pass): """WHICH_PASS was executed during a previous cvs2svn run. Its artifacts were created then, and any artifacts that would normally be cleaned up after this pass have already been cleaned up.""" self._unregister_artifacts(which_pass) def pass_started(self, which_pass): """WHICH_PASS is starting.""" self._active_passes.add(which_pass) def pass_continued(self, which_pass): """WHICH_PASS will be continued during the next program run. WHICH_PASS, which has already been started, will be continued during the next program run. Unregister any artifacts that would be cleaned up at the end of WHICH_PASS without actually cleaning them up.""" self._active_passes.remove(which_pass) self._unregister_artifacts(which_pass) def pass_done(self, which_pass, skip_cleanup): """WHICH_PASS is done. Clean up all artifacts that are no longer needed. If SKIP_CLEANUP is True, then just do the bookkeeping without actually calling artifact.cleanup().""" self._active_passes.remove(which_pass) artifacts = self._unregister_artifacts(which_pass) if not skip_cleanup: for artifact in artifacts: artifact.cleanup() def pass_deferred(self, which_pass): """WHICH_PASS is being deferred until a future cvs2svn run. Unregister any artifacts that would be cleaned up during WHICH_PASS.""" self._unregister_artifacts(which_pass) def check_clean(self): """All passes have been processed. Output a warning messages if all artifacts have not been accounted for. (This is mainly a consistency check, that no artifacts were registered under nonexistent passes.)""" unclean_artifacts = [ str(artifact) for artifact in self._artifacts.values() if artifact._passes_needed] if unclean_artifacts: logger.warn( 'INTERNAL: The following artifacts were not cleaned up:\n %s\n' % ('\n '.join(unclean_artifacts))) # The default ArtifactManager instance: artifact_manager = ArtifactManager() cvs2svn-2.4.0/cvs2svn_lib/rcs_stream.py0000664000076500007650000002270211434364604021161 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module processes RCS diffs (deltas).""" from cStringIO import StringIO import re def msplit(s): """Split S into an array of lines. 
Only \n is a line separator. The line endings are part of the lines.""" # return s.splitlines(True) clobbers \r re = [ i + "\n" for i in s.split("\n") ] re[-1] = re[-1][:-1] if not re[-1]: del re[-1] return re class MalformedDeltaException(Exception): """A malformed RCS delta was encountered.""" pass ed_command_re = re.compile(r'^([ad])(\d+)\s(\d+)\n$') def generate_edits(diff): """Generate edit commands from an RCS diff block. DIFF is a string holding an entire RCS file delta. Generate a tuple (COMMAND, INPUT_POS, ARG) for each block implied by DIFF. Tuples describe the ed commands: ('a', INPUT_POS, LINES) : add LINES at INPUT_POS. LINES is a list of strings. ('d', INPUT_POS, COUNT) : delete COUNT input lines starting at line INPUT_POS. In all cases, INPUT_POS is expressed as a zero-offset line number within the input revision.""" diff = msplit(diff) i = 0 while i < len(diff): m = ed_command_re.match(diff[i]) if not m: raise MalformedDeltaException('Bad ed command') i += 1 command = m.group(1) start = int(m.group(2)) count = int(m.group(3)) if command == 'd': # "d" - Delete command yield ('d', start - 1, count) else: # "a" - Add command if i + count > len(diff): raise MalformedDeltaException('Add block truncated') yield ('a', start, diff[i:i + count]) i += count def merge_blocks(blocks): """Merge adjacent 'r'eplace or 'c'opy blocks.""" i = iter(blocks) try: (command1, old_lines1, new_lines1) = i.next() except StopIteration: return for (command2, old_lines2, new_lines2) in i: if command1 == 'r' and command2 == 'r': old_lines1 += old_lines2 new_lines1 += new_lines2 elif command1 == 'c' and command2 == 'c': old_lines1 += old_lines2 new_lines1 = old_lines1 else: yield (command1, old_lines1, new_lines1) (command1, old_lines1, new_lines1) = (command2, old_lines2, new_lines2) yield (command1, old_lines1, new_lines1) def invert_blocks(blocks): """Invert the blocks in BLOCKS. BLOCKS is an iterable over blocks. Invert them, in the sense that the input becomes the output and the output the input.""" for (command, old_lines, new_lines) in blocks: yield (command, new_lines, old_lines) def generate_edits_from_blocks(blocks): """Convert BLOCKS into an equivalent series of RCS edits. The edits are generated as tuples in the format described in the docstring for generate_edits(). It is important that deletes are emitted before adds in the output for two reasons: 1. The last line in the last 'add' block might end in a line that is not terminated with a newline, in which case no other command is allowed to follow it. 2. This is the canonical order used by RCS; this ensures that inverting twice gives back the original delta.""" # Merge adjacent 'r'eplace blocks to ensure that we emit adds and # deletes in the right order: blocks = merge_blocks(blocks) input_position = 0 for (command, old_lines, new_lines) in blocks: if command == 'c': input_position += len(old_lines) elif command == 'r': if old_lines: yield ('d', input_position, len(old_lines)) input_position += len(old_lines) if new_lines: yield ('a', input_position, new_lines) def write_edits(f, edits): """Write EDITS to file-like object f as an RCS diff.""" for (command, input_position, arg) in edits: if command == 'd': f.write('d%d %d\n' % (input_position + 1, arg,)) elif command == 'a': lines = arg f.write('a%d %d\n' % (input_position, len(lines),)) f.writelines(lines) del lines else: raise MalformedDeltaException('Unknown command %r' % (command,)) class RCSStream: """This class allows RCS deltas to be accumulated. 
This file holds the contents of a single RCS version in memory as an array of lines. It is able to apply an RCS delta to the version, thereby transforming the stored text into the following RCS version. While doing so, it can optionally also return the inverted delta. This class holds revisions in memory. It uses temporary memory space of a few times the size of a single revision plus a few times the size of a single delta.""" def __init__(self, text): """Instantiate and initialize the file content with TEXT.""" self.set_text(text) def get_text(self): """Return the current file content.""" return "".join(self._lines) def set_lines(self, lines): """Set the current contents to the specified LINES. LINES is an iterable over well-formed lines; i.e., each line contains exactly one LF as its last character, except that the list line can be unterminated. LINES will be consumed immediately; if it is a sequence, it will be copied.""" self._lines = list(lines) def set_text(self, text): """Set the current file content.""" self._lines = msplit(text) def generate_blocks(self, edits): """Generate edit blocks from an iterable of RCS edits. EDITS is an iterable over RCS edits, as generated by generate_edits(). Generate a tuple (COMMAND, OLD_LINES, NEW_LINES) for each block implied by EDITS when applied to the current contents of SELF. OLD_LINES and NEW_LINES are lists of strings, where each string is one line. OLD_LINES and NEW_LINES are newly-allocated lists, though they might both point at the same list. Blocks consist of copy and replace commands: ('c', OLD_LINES, NEW_LINES) : copy the lines from one version to the other, unaltered. In this case OLD_LINES==NEW_LINES. ('r', OLD_LINES, NEW_LINES) : replace OLD_LINES with NEW_LINES. Either OLD_LINES or NEW_LINES (or both) might be empty.""" # The number of lines from the old version that have been processed # so far: input_pos = 0 for (command, start, arg) in edits: if command == 'd': # "d" - Delete command count = arg if start < input_pos: raise MalformedDeltaException('Deletion before last edit') if start > len(self._lines): raise MalformedDeltaException('Deletion past file end') if start + count > len(self._lines): raise MalformedDeltaException('Deletion beyond file end') if input_pos < start: copied_lines = self._lines[input_pos:start] yield ('c', copied_lines, copied_lines) del copied_lines yield ('r', self._lines[start:start + count], []) input_pos = start + count else: # "a" - Add command lines = arg if start < input_pos: raise MalformedDeltaException('Insertion before last edit') if start > len(self._lines): raise MalformedDeltaException('Insertion past file end') if input_pos < start: copied_lines = self._lines[input_pos:start] yield ('c', copied_lines, copied_lines) del copied_lines input_pos = start yield ('r', [], lines) # Pass along the part of the input that follows all of the delta # blocks: copied_lines = self._lines[input_pos:] if copied_lines: yield ('c', copied_lines, copied_lines) def apply_diff(self, diff): """Apply the RCS diff DIFF to the current file content.""" lines = [] blocks = self.generate_blocks(generate_edits(diff)) for (command, old_lines, new_lines) in blocks: lines += new_lines self._lines = lines def apply_and_invert_edits(self, edits): """Apply EDITS and generate their inverse. Apply EDITS to the current file content. 
Simultaneously generate edits suitable for reverting the change.""" blocks = self.generate_blocks(edits) # Blocks have to be merged so that adjacent delete,add edits are # generated in that order: blocks = merge_blocks(blocks) # Convert the iterable into a list (1) so that we can modify # self._lines in-place, (2) because we need it twice. blocks = list(blocks) self._lines = [] for (command, old_lines, new_lines) in blocks: self._lines += new_lines return generate_edits_from_blocks(invert_blocks(blocks)) def invert_diff(self, diff): """Apply DIFF and generate its inverse. Apply the RCS diff DIFF to the current file content. Simultaneously generate an RCS diff suitable for reverting the change, and return it as a string.""" inverse_diff = StringIO() write_edits( inverse_diff, self.apply_and_invert_edits(generate_edits(diff)) ) return inverse_diff.getvalue() cvs2svn-2.4.0/cvs2svn_lib/bzr_run_options.py0000664000076500007650000001210511710517256022247 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module manages cvs2bzr run options.""" from cvs2svn_lib.common import FatalError from cvs2svn_lib.context import Ctx from cvs2svn_lib.dvcs_common import DVCSRunOptions from cvs2svn_lib.run_options import ContextOption from cvs2svn_lib.run_options import IncompatibleOption from cvs2svn_lib.run_options import not_both from cvs2svn_lib.revision_manager import NullRevisionCollector from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader from cvs2svn_lib.output_option import NullOutputOption from cvs2svn_lib.git_output_option import GitRevisionInlineWriter from cvs2svn_lib.bzr_output_option import BzrOutputOption class BzrRunOptions(DVCSRunOptions): short_desc = 'convert a cvs repository into a Bazaar repository' synopsis = """\ .B cvs2bzr [\\fIOPTION\\fR]... \\fIOUTPUT-OPTIONS CVS-REPOS-PATH\\fR .br .B cvs2bzr [\\fIOPTION\\fR]... \\fI--options=PATH\\fR """ long_desc = """\ Create a new Bazaar repository based on the version history stored in a CVS repository. Each CVS commit will be mirrored in the Bazaar repository, including such information as date of commit and id of the committer. .P The output of this program is a "fast-import dumpfile", which can be loaded into a Bazaar repository using the Bazaar FastImport Plugin, available from https://launchpad.net/bzr-fastimport. .P \\fICVS-REPOS-PATH\\fR is the filesystem path of the part of the CVS repository that you want to convert. This path doesn't have to be the top level directory of a CVS repository; it can point at a project within a repository, in which case only that project will be converted. This path or one of its parent directories has to contain a subdirectory called CVSROOT (though the CVSROOT directory can be empty). 
.P It is not possible directly to convert a CVS repository to which you only have remote access, but the FAQ describes tools that may be used to create a local copy of a remote CVS repository. """ files = """\ A directory called \\fIcvs2svn-tmp\\fR (or the directory specified by \\fB--tmpdir\\fR) is used as scratch space for temporary data files. """ see_also = [ ('cvs', '1'), ('bzr', '1'), ] def _get_output_options_group(self): group = super(BzrRunOptions, self)._get_output_options_group() group.add_option(IncompatibleOption( '--dumpfile', type='string', action='store', help='path to which the data should be written', man_help=( 'Write the blobs and revision data to \\fIpath\\fR.' ), metavar='PATH', )) group.add_option(ContextOption( '--dry-run', action='store_true', help=( 'do not create any output; just print what would happen.' ), man_help=( 'Do not create any output; just print what would happen.' ), )) return group def _get_extraction_options_group(self): group = super(BzrRunOptions, self)._get_extraction_options_group() self._add_use_cvs_option(group) self._add_use_rcs_option(group) return group def process_extraction_options(self): """Process options related to extracting data from the CVS repository.""" options = self.options not_both(options.use_rcs, '--use-rcs', options.use_cvs, '--use-cvs') # cvs2bzr defers acting on extraction options to process_output_options def process_output_options(self): """Process options related to fastimport output.""" ctx = Ctx() options = self.options if options.use_rcs: revision_reader = RCSRevisionReader( co_executable=options.co_executable ) else: # --use-cvs is the default: revision_reader = CVSRevisionReader( cvs_executable=options.cvs_executable ) if not ctx.dry_run and not options.dumpfile: raise FatalError("must pass '--dry-run' or '--dumpfile' option.") # See cvs2bzr-example.options for explanations of these ctx.revision_collector = NullRevisionCollector() ctx.revision_reader = None if ctx.dry_run: ctx.output_option = NullOutputOption() else: ctx.output_option = BzrOutputOption( options.dumpfile, GitRevisionInlineWriter(revision_reader), # Optional map from CVS author names to bzr author names: author_transforms={}, # FIXME ) cvs2svn-2.4.0/cvs2svn_lib/main.py0000664000076500007650000001072711434364604017747 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== import os import errno import gc try: # Try to get access to a bunch of encodings for use with --encoding. # See http://cjkpython.i18n.org/ for details. 
import iconv_codec except ImportError: pass from cvs2svn_lib.common import FatalError from cvs2svn_lib.svn_run_options import SVNRunOptions from cvs2svn_lib.git_run_options import GitRunOptions from cvs2svn_lib.bzr_run_options import BzrRunOptions from cvs2svn_lib.context import Ctx from cvs2svn_lib.pass_manager import PassManager from cvs2svn_lib.passes import passes def main(progname, run_options, pass_manager): # Disable garbage collection, as we try not to create any circular # data structures: gc.disable() # Convenience var, so we don't have to keep instantiating this Borg. ctx = Ctx() # Make sure the tmp directory exists. Note that we don't check if # it's empty -- we want to be able to use, for example, "." to hold # tempfiles. if not os.path.exists(ctx.tmpdir): erase_tmpdir = True os.mkdir(ctx.tmpdir) elif not os.path.isdir(ctx.tmpdir): raise FatalError( "cvs2svn tried to use '%s' for temporary files, but that path\n" " exists and is not a directory. Please make it be a directory,\n" " or specify some other directory for temporary files." % (ctx.tmpdir,)) else: erase_tmpdir = False # But do lock the tmpdir, to avoid process clash. try: os.mkdir(os.path.join(ctx.tmpdir, 'cvs2svn.lock')) except OSError, e: if e.errno == errno.EACCES: raise FatalError("Permission denied:" + " No write access to directory '%s'." % ctx.tmpdir) if e.errno == errno.EEXIST: raise FatalError( "cvs2svn is using directory '%s' for temporary files, but\n" " subdirectory '%s/cvs2svn.lock' exists, indicating that another\n" " cvs2svn process is currently using '%s' as its temporary\n" " workspace. If you are certain that is not the case,\n" " then remove the '%s/cvs2svn.lock' subdirectory." % (ctx.tmpdir, ctx.tmpdir, ctx.tmpdir, ctx.tmpdir,)) raise try: if run_options.profiling: try: import cProfile except ImportError: # Old version of Python without cProfile. Use hotshot instead. import hotshot prof = hotshot.Profile('cvs2svn.hotshot') prof.runcall(pass_manager.run, run_options) prof.close() else: # Recent version of Python (2.5+) with cProfile. def run_with_profiling(): pass_manager.run(run_options) cProfile.runctx( 'run_with_profiling()', globals(), locals(), 'cvs2svn.cProfile' ) else: pass_manager.run(run_options) finally: try: os.rmdir(os.path.join(ctx.tmpdir, 'cvs2svn.lock')) except: pass if erase_tmpdir: try: os.rmdir(ctx.tmpdir) except: pass def svn_main(progname, cmd_args): pass_manager = PassManager(passes) run_options = SVNRunOptions(progname, cmd_args, pass_manager) main(progname, run_options, pass_manager) def git_main(progname, cmd_args): pass_manager = PassManager(passes) run_options = GitRunOptions(progname, cmd_args, pass_manager) main(progname, run_options, pass_manager) def bzr_main(progname, cmd_args): pass_manager = PassManager(passes) run_options = BzrRunOptions(progname, cmd_args, pass_manager) main(progname, run_options, pass_manager) def hg_main(progname, cmd_args): # Import late so cvs2{svn,git} do not depend on being able to import # the Mercurial API. from cvs2svn_lib.hg_run_options import HgRunOptions pass_manager = PassManager(passes) run_options = HgRunOptions(progname, cmd_args, pass_manager) main(progname, run_options, pass_manager) cvs2svn-2.4.0/cvs2svn_lib/artifact.py0000664000076500007650000000326711434364604020621 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. 
# # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module defines Artifact types to be used with an ArtifactManager.""" import os from cvs2svn_lib.context import Ctx from cvs2svn_lib.log import logger class Artifact(object): """An object that is created, used across passes, then cleaned up.""" def __init__(self): # The set of passes that need this artifact. This field is # maintained by ArtifactManager. self._passes_needed = set() def cleanup(self): """This artifact is no longer needed; clean it up.""" pass class TempFile(Artifact): """A temporary file that can be used across cvs2svn passes.""" def __init__(self, basename): Artifact.__init__(self) self.basename = basename def _get_filename(self): return Ctx().get_temp_filename(self.basename) filename = property(_get_filename) def cleanup(self): logger.verbose("Deleting", self.filename) os.unlink(self.filename) def __str__(self): return 'Temporary file %r' % (self.filename,) cvs2svn-2.4.0/cvs2svn_lib/property_setters.py0000664000076500007650000003577711710517256022474 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes to set Subversion properties on files.""" import os import re import fnmatch import ConfigParser from cStringIO import StringIO from cvs2svn_lib.common import warning_prefix from cvs2svn_lib.log import logger def _squash_case(s): return s.lower() def _preserve_case(s): return s def cvs_file_is_binary(cvs_file): return cvs_file.mode == 'b' class FilePropertySetter(object): """Abstract class for objects that set properties on a CVSFile.""" def maybe_set_property(self, cvs_file, name, value): """Set a property on CVS_FILE if it does not already have a value. This method is here for the convenience of derived classes.""" if name not in cvs_file.properties: cvs_file.properties[name] = value def set_properties(self, cvs_file): """Set any properties needed for CVS_FILE. CVS_FILE is an instance of CVSFile. 
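    A minimal illustrative subclass (a sketch for orientation only, not
    part of the original API documentation) might look like:

      class HypotheticalOwnerSetter(FilePropertySetter):
        def set_properties(self, cvs_file):
          # Hypothetical property name and value, for illustration only:
          self.maybe_set_property(cvs_file, 'x-example:owner', 'unknown')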
This method should modify CVS_FILE.properties in place.""" raise NotImplementedError() class ExecutablePropertySetter(FilePropertySetter): """Set the svn:executable property based on cvs_file.executable.""" def set_properties(self, cvs_file): if cvs_file.executable: self.maybe_set_property(cvs_file, 'svn:executable', '*') class DescriptionPropertySetter(FilePropertySetter): """Set the cvs:description property based on cvs_file.description.""" def __init__(self, propname='cvs:description'): self.propname = propname def set_properties(self, cvs_file): if cvs_file.description: self.maybe_set_property(cvs_file, self.propname, cvs_file.description) class CVSBinaryFileEOLStyleSetter(FilePropertySetter): """Set the eol-style to None for files with CVS mode '-kb'.""" def set_properties(self, cvs_file): if cvs_file.mode == 'b': self.maybe_set_property(cvs_file, 'svn:eol-style', None) class MimeMapper(FilePropertySetter): """A class that provides mappings from file names to MIME types.""" propname = 'svn:mime-type' def __init__( self, mime_types_file=None, mime_mappings=None, ignore_case=False ): """Constructor. Arguments: mime_types_file -- a path to a MIME types file on disk. Each line of the file should contain the MIME type, then a whitespace-separated list of file extensions; e.g., one line might be 'text/plain txt c h cpp hpp'. (See http://en.wikipedia.org/wiki/Mime.types for information about mime.types files): mime_mappings -- a dictionary mapping a file extension to a MIME type; e.g., {'txt': 'text/plain', 'cpp': 'text/plain'}. ignore_case -- True iff case should be ignored in filename extensions. Setting this option to True can be useful if your CVS repository was used on systems with case-insensitive filenames, in which case you might have a mix of uppercase and lowercase filenames.""" self.mappings = { } if ignore_case: self.transform_case = _squash_case else: self.transform_case = _preserve_case if mime_types_file is None and mime_mappings is None: logger.error('Should specify MIME types file or dict.\n') if mime_types_file is not None: for line in file(mime_types_file): if line.startswith("#"): continue # format of a line is something like # text/plain c h cpp extensions = line.split() if len(extensions) < 2: continue type = extensions.pop(0) for ext in extensions: ext = self.transform_case(ext) if ext in self.mappings and self.mappings[ext] != type: logger.error( "%s: ambiguous MIME mapping for *.%s (%s or %s)\n" % (warning_prefix, ext, self.mappings[ext], type) ) self.mappings[ext] = type if mime_mappings is not None: for ext, type in mime_mappings.iteritems(): ext = self.transform_case(ext) if ext in self.mappings and self.mappings[ext] != type: logger.error( "%s: ambiguous MIME mapping for *.%s (%s or %s)\n" % (warning_prefix, ext, self.mappings[ext], type) ) self.mappings[ext] = type def set_properties(self, cvs_file): if self.propname in cvs_file.properties: return basename, extension = os.path.splitext(cvs_file.rcs_basename) # Extension includes the dot, so strip it (will leave extension # empty if filename ends with a dot, which is ok): extension = extension[1:] # If there is no extension (or the file ends with a period), use # the base name for mapping. 
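    # (Illustrative, not from the original comments: with
    # mime_mappings={'txt': 'text/plain', 'Makefile': 'text/x-makefile'},
    # 'notes.txt' is looked up under the key 'txt', while 'Makefile',
    # which has no extension, is looked up under its basename.)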
This allows us to set mappings for # files such as README or Makefile: if not extension: extension = basename extension = self.transform_case(extension) mime_type = self.mappings.get(extension, None) if mime_type is not None: cvs_file.properties[self.propname] = mime_type class AutoPropsPropertySetter(FilePropertySetter): """Set arbitrary svn properties based on an auto-props configuration. This class supports case-sensitive or case-insensitive pattern matching. The command-line default is case-insensitive behavior, consistent with Subversion (see http://subversion.tigris.org/issues/show_bug.cgi?id=2036). As a special extension to Subversion's auto-props handling, if a property name is preceded by a '!' then that property is forced to be left unset. If a property specified in auto-props has already been set to a different value, print a warning and leave the old property value unchanged. Python's treatment of whitespaces in the ConfigParser module is buggy and inconsistent. Usually spaces are preserved, but if there is at least one semicolon in the value, and the *first* semicolon is preceded by a space, then that is treated as the start of a comment and the rest of the line is silently discarded.""" property_name_pattern = r'(?P[^\!\=\s]+)' property_unset_re = re.compile( r'^\!\s*' + property_name_pattern + r'$' ) property_set_re = re.compile( r'^' + property_name_pattern + r'\s*\=\s*(?P.*)$' ) property_novalue_re = re.compile( r'^' + property_name_pattern + r'$' ) quoted_re = re.compile( r'^([\'\"]).*\1$' ) comment_re = re.compile(r'\s;') class Pattern: """Describes the properties to be set for files matching a pattern.""" def __init__(self, pattern, propdict): # A glob-like pattern: self.pattern = pattern # A dictionary of properties that should be set: self.propdict = propdict def match(self, basename): """Does the file with the specified basename match pattern?""" return fnmatch.fnmatch(basename, self.pattern) def __init__(self, configfilename, ignore_case=True): config = ConfigParser.ConfigParser() if ignore_case: self.transform_case = _squash_case else: config.optionxform = _preserve_case self.transform_case = _preserve_case configtext = open(configfilename).read() if self.comment_re.search(configtext): logger.warn( '%s: Please be aware that a space followed by a\n' 'semicolon is sometimes treated as a comment in configuration\n' 'files. This pattern was seen in\n' ' %s\n' 'Please make sure that you have not inadvertently commented\n' 'out part of an important line.' % (warning_prefix, configfilename,) ) config.readfp(StringIO(configtext), configfilename) self.patterns = [] sections = config.sections() sections.sort() for section in sections: if self.transform_case(section) == 'auto-props': patterns = config.options(section) patterns.sort() for pattern in patterns: value = config.get(section, pattern) if value: self._add_pattern(pattern, value) def _add_pattern(self, pattern, props): propdict = {} if self.quoted_re.match(pattern): logger.warn( '%s: Quoting is not supported in auto-props; please verify rule\n' 'for %r. (Using pattern including quotation marks.)\n' % (warning_prefix, pattern,) ) for prop in props.split(';'): prop = prop.strip() m = self.property_unset_re.match(prop) if m: name = m.group('name') logger.debug( 'auto-props: For %r, leaving %r unset.' 
% (pattern, name,) ) propdict[name] = None continue m = self.property_set_re.match(prop) if m: name = m.group('name') value = m.group('value') if self.quoted_re.match(value): logger.warn( '%s: Quoting is not supported in auto-props; please verify\n' 'rule %r for pattern %r. (Using value\n' 'including quotation marks.)\n' % (warning_prefix, prop, pattern,) ) logger.debug( 'auto-props: For %r, setting %r to %r.' % (pattern, name, value,) ) propdict[name] = value continue m = self.property_novalue_re.match(prop) if m: name = m.group('name') logger.debug( 'auto-props: For %r, setting %r to the empty string' % (pattern, name,) ) propdict[name] = '' continue logger.warn( '%s: in auto-props line for %r, value %r cannot be parsed (ignored)' % (warning_prefix, pattern, prop,) ) self.patterns.append(self.Pattern(self.transform_case(pattern), propdict)) def get_propdict(self, cvs_file): basename = self.transform_case(cvs_file.rcs_basename) propdict = {} for pattern in self.patterns: if pattern.match(basename): for (key,value) in pattern.propdict.items(): if key in propdict: if propdict[key] != value: logger.warn( "Contradictory values set for property '%s' for file %s." % (key, cvs_file,)) else: propdict[key] = value return propdict def set_properties(self, cvs_file): propdict = self.get_propdict(cvs_file) for (k,v) in propdict.items(): if k in cvs_file.properties: if cvs_file.properties[k] != v: logger.warn( "Property '%s' already set to %r for file %s; " "auto-props value (%r) ignored." % (k, cvs_file.properties[k], cvs_file.cvs_path, v,) ) else: cvs_file.properties[k] = v class CVSBinaryFileDefaultMimeTypeSetter(FilePropertySetter): """If the file is binary and its svn:mime-type property is not yet set, set it to 'application/octet-stream'.""" def set_properties(self, cvs_file): if cvs_file.mode == 'b': self.maybe_set_property( cvs_file, 'svn:mime-type', 'application/octet-stream' ) class EOLStyleFromMimeTypeSetter(FilePropertySetter): """Set svn:eol-style based on svn:mime-type. If svn:mime-type is known but svn:eol-style is not, then set svn:eol-style based on svn:mime-type as follows: if svn:mime-type starts with 'text/', then set svn:eol-style to native; otherwise, force it to remain unset. 
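  For instance (an illustrative reading of the rule above), a file whose
  svn:mime-type is 'text/x-diff' would get svn:eol-style set to 'native',
  while a file whose svn:mime-type is 'image/png' would have
  svn:eol-style forced to stay unset.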
See also issue #39.""" propname = 'svn:eol-style' def set_properties(self, cvs_file): if self.propname in cvs_file.properties: return mime_type = cvs_file.properties.get('svn:mime-type', None) if mime_type: if mime_type.startswith("text/"): cvs_file.properties[self.propname] = 'native' else: cvs_file.properties[self.propname] = None class DefaultEOLStyleSetter(FilePropertySetter): """Set the eol-style if one has not already been set.""" valid_values = { None : None, # Also treat "binary" as None: 'binary' : None, 'native' : 'native', 'CRLF' : 'CRLF', 'LF' : 'LF', 'CR' : 'CR', } def __init__(self, value): """Initialize with the specified default VALUE.""" try: # Check that value is valid, and translate it to the proper case self.value = self.valid_values[value] except KeyError: raise ValueError( 'Illegal value specified for the default EOL option: %r' % (value,) ) def set_properties(self, cvs_file): self.maybe_set_property(cvs_file, 'svn:eol-style', self.value) class SVNBinaryFileKeywordsPropertySetter(FilePropertySetter): """Turn off svn:keywords for files with binary svn:eol-style.""" propname = 'svn:keywords' def set_properties(self, cvs_file): if self.propname in cvs_file.properties: return if not cvs_file.properties.get('svn:eol-style'): cvs_file.properties[self.propname] = None class KeywordsPropertySetter(FilePropertySetter): """If the svn:keywords property is not yet set, set it based on the file's mode. See issue #2.""" def __init__(self, value): """Use VALUE for the value of the svn:keywords property if it is to be set.""" self.value = value def set_properties(self, cvs_file): if cvs_file.mode in [None, 'kv', 'kvl']: self.maybe_set_property(cvs_file, 'svn:keywords', self.value) class ConditionalPropertySetter(object): """Delegate to the passed property setters when the passed predicate applies. The predicate should be a function that takes a CVSFile or CVSRevision argument and return True if the property setters should be applied.""" def __init__(self, predicate, *property_setters): self.predicate = predicate self.property_setters = property_setters def set_properties(self, cvs_file_or_rev): if self.predicate(cvs_file_or_rev): for property_setter in self.property_setters: property_setter.set_properties(cvs_file_or_rev) class RevisionPropertySetter: """Abstract class for objects that can set properties on a CVSRevision.""" def set_properties(self, cvs_rev): """Set any properties that can be determined for CVS_REV. CVS_REV is an instance of CVSRevision. This method should modify CVS_REV.properties in place.""" raise NotImplementedError() class CVSRevisionNumberSetter(RevisionPropertySetter): """Store the CVS revision number to an SVN property.""" def __init__(self, propname='cvs2svn:cvs-rev'): self.propname = propname def set_properties(self, cvs_rev): if self.propname in cvs_rev.properties: return cvs_rev.properties[self.propname] = cvs_rev.rev cvs2svn-2.4.0/cvs2svn_lib/apple_single_filter.py0000664000076500007650000002334011434364604023025 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. 
# # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """A stream filter for extracting the data fork from AppleSingle data. Some Macintosh CVS clients store resource fork data along with the contents of the file (called the data fork) by encoding both in an 'AppleSingle' data stream before storing them to CVS. This file contains a stream filter for extracting the data fork from such data streams. (Any other forks are discarded.) See the following for some random information about this format and how it is used by Macintosh CVS clients: http://users.phg-online.de/tk/netatalk/doc/Apple/v1/ http://rfc.net/rfc1740.html http://ximbiot.com/cvs/cvshome/cyclic/cvs/dev-mac.html http://www.maccvs.org/faq.html#resfiles http://www.heilancoo.net/MacCVSClient/MacCVSClientDoc/storage-formats.html """ import struct from cStringIO import StringIO class AppleSingleFormatError(IOError): """The stream was not in correct AppleSingle format.""" pass class AppleSingleIncorrectMagicError(AppleSingleFormatError): """The file didn't start with the correct magic number.""" def __init__(self, data_read, eof): AppleSingleFormatError.__init__(self) self.data_read = data_read self.eof = eof class AppleSingleEOFError(AppleSingleFormatError): """EOF was reached where AppleSingle doesn't allow it.""" pass class AppleSingleFilter(object): """A stream that reads the data fork from an AppleSingle stream. If the constructor discovers that the file is not a legitimate AppleSingle stream, then it raises an AppleSingleFormatError. In the special case that the magic number is incorrect, it raises AppleSingleIncorrectMagicError with data_read set to the data that have been read so far from the input stream. 
(This allows the caller the option to fallback to treating the input stream as a normal binary data stream.)""" # The header is: # # Magic number 4 bytes # Version number 4 bytes # File system or filler 16 bytes # Number of entries 2 bytes magic_struct = '>i' magic_len = struct.calcsize(magic_struct) # The part of the header after the magic number: rest_of_header_struct = '>i16sH' rest_of_header_len = struct.calcsize(rest_of_header_struct) # Each entry is: # # Entry ID 4 bytes # Offset 4 bytes # Length 4 bytes entry_struct = '>iii' entry_len = struct.calcsize(entry_struct) apple_single_magic = 0x00051600 apple_single_version_1 = 0x00010000 apple_single_version_2 = 0x00020000 apple_single_filler = '\0' * 16 apple_single_data_fork_entry_id = 1 def __init__(self, stream): self.stream = stream # Check for the AppleSingle magic number: s = self._read_exactly(self.magic_len) if len(s) < self.magic_len: raise AppleSingleIncorrectMagicError(s, True) (magic,) = struct.unpack(self.magic_struct, s) if magic != self.apple_single_magic: raise AppleSingleIncorrectMagicError(s, False) # Read the rest of the header: s = self._read_exactly(self.rest_of_header_len) if len(s) < self.rest_of_header_len: raise AppleSingleEOFError('AppleSingle header incomplete') (version, filler, num_entries) = \ struct.unpack(self.rest_of_header_struct, s) if version == self.apple_single_version_1: self._prepare_apple_single_v1_file(num_entries) elif version == self.apple_single_version_2: if filler != self.apple_single_filler: raise AppleSingleFormatError('Incorrect filler') self._prepare_apple_single_v2_file(num_entries) else: raise AppleSingleFormatError('Unknown AppleSingle version') def _read_exactly(self, size): """Read and return exactly SIZE characters from the stream. This method is to deal with the fact that stream.read(size) is allowed to return less than size characters. If EOF is reached before SIZE characters have been read, return the characters that have been read so far.""" retval = [] length_remaining = size while length_remaining > 0: s = self.stream.read(length_remaining) if not s: break retval.append(s) length_remaining -= len(s) return ''.join(retval) def _prepare_apple_single_file(self, num_entries): entries = self._read_exactly(num_entries * self.entry_len) if len(entries) < num_entries * self.entry_len: raise AppleSingleEOFError('Incomplete entries list') for i in range(num_entries): entry = entries[i * self.entry_len : (i + 1) * self.entry_len] (entry_id, offset, length) = struct.unpack(self.entry_struct, entry) if entry_id == self.apple_single_data_fork_entry_id: break else: raise AppleSingleFormatError('No data fork found') # The data fork is located at [offset : offset + length]. 
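    # (Illustrative arithmetic based on the struct formats above: the
    # fixed header is 4 + 22 = 26 bytes and each entry descriptor is 12
    # bytes, so with a single entry the data fork cannot start before
    # byte offset 38; smaller offsets are rejected below.)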
Read up # to the start of the data: n = offset - self.magic_len - self.rest_of_header_len - len(entries) if n < 0: raise AppleSingleFormatError('Invalid offset to AppleSingle data fork') max_chunk_size = 65536 while n > 0: s = self.stream.read(min(n, max_chunk_size)) if not s: raise AppleSingleEOFError( 'Offset to AppleSingle data fork past end of file' ) n -= len(s) self.length_remaining = length def _prepare_apple_single_v1_file(self, num_entries): self._prepare_apple_single_file(num_entries) def _prepare_apple_single_v2_file(self, num_entries): self._prepare_apple_single_file(num_entries) def read(self, size=-1): if size == 0 or self.length_remaining == 0: return '' elif size < 0: s = self._read_exactly(self.length_remaining) if len(s) < self.length_remaining: raise AppleSingleEOFError('AppleSingle data fork truncated') self.length_remaining = 0 return s else: # The length of this read is allowed to be shorter than the # requested size: s = self.stream.read(min(size, self.length_remaining)) if not s: raise AppleSingleEOFError() self.length_remaining -= len(s) return s def close(self): self.stream.close() self.stream = None class CompoundStream(object): """A stream that reads from a series of streams, one after the other.""" def __init__(self, streams, stream_index=0): self.streams = list(streams) self.stream_index = stream_index def read(self, size=-1): if size < 0: retval = [] while self.stream_index < len(self.streams): retval.append(self.streams[self.stream_index].read()) self.stream_index += 1 return ''.join(retval) else: while self.stream_index < len(self.streams): s = self.streams[self.stream_index].read(size) if s: # This may not be the full size requested, but that is OK: return s else: # That stream was empty; proceed to the next stream: self.stream_index += 1 # No streams are left: return '' def close(self): for stream in self.streams: stream.close() self.streams = None def get_maybe_apple_single_stream(stream): """Treat STREAM as AppleSingle if possible; otherwise treat it literally. If STREAM is in AppleSingle format, then return a stream that will output the data fork of the original stream. Otherwise, return a stream that will output the original file contents literally. Be careful not to read from STREAM after it has already hit EOF.""" try: return AppleSingleFilter(stream) except AppleSingleIncorrectMagicError, e: # This is OK; the file is not AppleSingle, so we read it normally: string_io = StringIO(e.data_read) if e.eof: # The original stream already reached EOF, so the part already # read contains the complete file contents. Nevertheless return # a CompoundStream to make sure that the stream's close() method # is called: return CompoundStream([stream, string_io], stream_index=1) else: # The stream needs to output the part already read followed by # whatever hasn't been read of the original stream: return CompoundStream([string_io, stream]) def get_maybe_apple_single(data): """Treat DATA as AppleSingle if possible; otherwise treat it literally. If DATA is in AppleSingle format, then return its data fork. Otherwise, return the original DATA.""" return get_maybe_apple_single_stream(StringIO(data)).read() if __name__ == '__main__': # For fun and testing, allow use of this file as a pipe if it is # invoked as a script. Specifically, if stdin is in AppleSingle # format, then output only its data fork; otherwise, output it # unchanged. # # This might not work on systems where sys.stdin is opened in text # mode. 
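  # An illustrative invocation (the file names are hypothetical):
  #
  #   python cvs2svn_lib/apple_single_filter.py <wrapped.bin >data_fork.bin
  #
  # The equivalent from Python code would be something like:
  #
  #   from cvs2svn_lib.apple_single_filter import get_maybe_apple_single
  #   data_fork = get_maybe_apple_single(open('wrapped.bin', 'rb').read())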
# # Remember to set PYTHONPATH to point to the main cvs2svn directory. import sys #CHUNK_SIZE = -1 CHUNK_SIZE = 100 if CHUNK_SIZE < 0: sys.stdout.write(get_maybe_apple_single(sys.stdin.read())) else: f = get_maybe_apple_single_stream(sys.stdin) while True: s = f.read(CHUNK_SIZE) if not s: break sys.stdout.write(s) cvs2svn-2.4.0/cvs2svn_lib/time_range.py0000664000076500007650000000326711357430075021136 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains a class to manage time ranges.""" class TimeRange(object): __slots__ = ('t_min', 't_max') def __init__(self): # Start out with a t_min higher than any incoming time T, and a # t_max lower than any incoming T. This way the first T will push # t_min down to T, and t_max up to T, naturally (without any # special-casing), and successive times will then ratchet them # outward as appropriate. self.t_min = 1L<<32 self.t_max = 0 def add(self, timestamp): """Expand the range to encompass TIMESTAMP.""" if timestamp < self.t_min: self.t_min = timestamp if timestamp > self.t_max: self.t_max = timestamp def __cmp__(self, other): # Sorted by t_max, and break ties using t_min. return cmp(self.t_max, other.t_max) or cmp(self.t_min, other.t_min) def __lt__(self, other): c = cmp(self.t_max, other.t_max) if 0 == c: return self.t_min < other.t_min return c < 0 cvs2svn-2.4.0/cvs2svn_lib/revision_manager.py0000664000076500007650000001127211500107341022332 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module describes the interface to the CVS repository.""" class RevisionCollector(object): """Optionally collect revision information for CVS files.""" def __init__(self): """Initialize the RevisionCollector. Please note that a RevisionCollector is instantiated in every program run, even if the data-collection pass will not be executed. (This is to allow it to register the artifacts that it produces.) 
Therefore, the __init__() method should not do much, and more substantial preparation for use (like actually creating the artifacts) should be done in start().""" pass def register_artifacts(self, which_pass): """Register artifacts that will be needed while collecting data. WHICH_PASS is the pass that will call our callbacks, so it should be used to do the registering (e.g., call WHICH_PASS.register_temp_file() and/or WHICH_PASS.register_temp_file_needed()).""" pass def start(self): """Data will soon start being collected. Any non-idempotent initialization should be done here.""" pass def process_file(self, cvs_file_items): """Collect data for the file described by CVS_FILE_ITEMS. CVS_FILE_ITEMS has already been transformed into the logical representation of the file's history as it should be output. Therefore it is not necessarily identical to the history as recorded in the RCS file. This method is allowed to store a pickleable object to the CVSItem.revision_reader_token member of CVSItems in CVS_FILE_ITEMS. These data are stored with the items and available for the use of the RevisionReader.""" raise NotImplementedError() def finish(self): """All recording is done; clean up.""" pass class NullRevisionCollector(RevisionCollector): """A do-nothing variety of RevisionCollector.""" def process_file(self, cvs_file_items): pass class RevisionReader(object): """An object that can read the contents of CVSRevisions.""" def register_artifacts(self, which_pass): """Register artifacts that will be needed during branch exclusion. WHICH_PASS is the pass that will call our callbacks, so it should be used to do the registering (e.g., call WHICH_PASS.register_temp_file() and/or WHICH_PASS.register_temp_file_needed()).""" pass def start(self): """Prepare for calls to get_content().""" pass def get_content(self, cvs_rev): """Return the contents of CVS_REV. CVS_REV is a CVSRevision. The way that the contents are extracted is influenced by properties that are set on CVS_REV: * The CVS_REV property _keyword_handling specifies how RCS/CVS keywords should be handled: * 'collapsed' -- collapse RCS/CVS keywords in the output; e.g., '$Author: jrandom $' -> '$Author$'. * 'expanded' -- expand RCS/CVS keywords in the output; e.g., '$Author$' -> '$Author: jrandom $'. * 'untouched' -- leave RCS/CVS keywords untouched. For a file that had keyword expansion enabled in CVS, this typically means that the keyword comes out expanded as for the *previous* revision, because CVS expands keywords on checkout, not checkin. * unset -- undefined behavior; depends on which revision manager is being used. * The CVS_REV property _eol_fix specifies how EOL sequences should be handled in the output. If the property is unset or empty, then leave EOL sequences untouched. If it is non-empty, then convert all end-of-line sequences to the value of that property (typically '\n' or '\r\n'). See doc/properties.txt and doc/text-transformations.txt for more information. If Ctx().decode_apple_single is set, then extract the data fork from any content that looks like AppleSingle format.""" raise NotImplementedError() def finish(self): """Inform the reader that all calls to get_content() are done. Start may be called again at a later point.""" pass cvs2svn-2.4.0/cvs2svn_lib/passes.py0000664000076500007650000017440511710517256020325 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. 
# # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module defines the passes that make up a conversion.""" import sys import shutil import cPickle from cvs2svn_lib import config from cvs2svn_lib.context import Ctx from cvs2svn_lib.common import warning_prefix from cvs2svn_lib.common import FatalException from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import InternalError from cvs2svn_lib.common import DB_OPEN_NEW from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.common import DB_OPEN_WRITE from cvs2svn_lib.common import Timestamper from cvs2svn_lib.sort import sort_file from cvs2svn_lib.log import logger from cvs2svn_lib.pass_manager import Pass from cvs2svn_lib.serializer import PrimedPickleSerializer from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.cvs_path_database import CVSPathDatabase from cvs2svn_lib.metadata_database import MetadataDatabase from cvs2svn_lib.project import read_projects from cvs2svn_lib.project import write_projects from cvs2svn_lib.symbol import LineOfDevelopment from cvs2svn_lib.symbol import Trunk from cvs2svn_lib.symbol import Symbol from cvs2svn_lib.symbol import Branch from cvs2svn_lib.symbol import Tag from cvs2svn_lib.symbol import ExcludedSymbol from cvs2svn_lib.symbol_database import SymbolDatabase from cvs2svn_lib.symbol_database import create_symbol_database from cvs2svn_lib.symbol_statistics import SymbolPlanError from cvs2svn_lib.symbol_statistics import IndeterminateSymbolException from cvs2svn_lib.symbol_statistics import SymbolStatistics from cvs2svn_lib.cvs_item import CVSRevision from cvs2svn_lib.cvs_item import CVSSymbol from cvs2svn_lib.cvs_item_database import OldCVSItemStore from cvs2svn_lib.cvs_item_database import IndexedCVSItemStore from cvs2svn_lib.cvs_item_database import cvs_item_primer from cvs2svn_lib.cvs_item_database import NewSortableCVSRevisionDatabase from cvs2svn_lib.cvs_item_database import OldSortableCVSRevisionDatabase from cvs2svn_lib.cvs_item_database import NewSortableCVSSymbolDatabase from cvs2svn_lib.cvs_item_database import OldSortableCVSSymbolDatabase from cvs2svn_lib.key_generator import KeyGenerator from cvs2svn_lib.changeset import RevisionChangeset from cvs2svn_lib.changeset import OrderedChangeset from cvs2svn_lib.changeset import SymbolChangeset from cvs2svn_lib.changeset import BranchChangeset from cvs2svn_lib.changeset import create_symbol_changeset from cvs2svn_lib.changeset_graph import ChangesetGraph from cvs2svn_lib.changeset_graph_link import ChangesetGraphLink from cvs2svn_lib.changeset_database import ChangesetDatabase from cvs2svn_lib.changeset_database import CVSItemToChangesetTable from cvs2svn_lib.svn_commit import SVNRevisionCommit from cvs2svn_lib.openings_closings import SymbolingsLogger from cvs2svn_lib.svn_commit_creator import SVNCommitCreator from cvs2svn_lib.persistence_manager import PersistenceManager from cvs2svn_lib.repository_walker import walk_repository from cvs2svn_lib.collect_data import CollectData from 
cvs2svn_lib.check_dependencies_pass \ import CheckItemStoreDependenciesPass from cvs2svn_lib.check_dependencies_pass \ import CheckIndexedItemStoreDependenciesPass class CollectRevsPass(Pass): """This pass was formerly known as pass1.""" def register_artifacts(self): self._register_temp_file(config.PROJECTS) self._register_temp_file(config.SYMBOL_STATISTICS) self._register_temp_file(config.METADATA_INDEX_TABLE) self._register_temp_file(config.METADATA_STORE) self._register_temp_file(config.CVS_PATHS_DB) self._register_temp_file(config.CVS_ITEMS_STORE) def run(self, run_options, stats_keeper): logger.quiet("Examining all CVS ',v' files...") Ctx()._projects = {} Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_NEW) cd = CollectData(stats_keeper) # Key generator for CVSFiles: file_key_generator = KeyGenerator() for project in run_options.projects: Ctx()._projects[project.id] = project cd.process_project( project, walk_repository(project, file_key_generator, cd.record_fatal_error), ) run_options.projects = None fatal_errors = cd.close() if fatal_errors: raise FatalException("Pass 1 complete.\n" + "=" * 75 + "\n" + "Error summary:\n" + "\n".join(fatal_errors) + "\n" + "Exited due to fatal error(s).") Ctx()._cvs_path_db.close() write_projects(artifact_manager.get_temp_file(config.PROJECTS)) logger.quiet("Done") class CleanMetadataPass(Pass): """Clean up CVS revision metadata and write it to a new database.""" def register_artifacts(self): self._register_temp_file(config.METADATA_CLEAN_INDEX_TABLE) self._register_temp_file(config.METADATA_CLEAN_STORE) self._register_temp_file_needed(config.METADATA_INDEX_TABLE) self._register_temp_file_needed(config.METADATA_STORE) def _get_clean_author(self, author): """Return AUTHOR, converted appropriately to UTF8. Raise a UnicodeException if it cannot be converted using the configured cvs_author_decoder.""" try: return self._authors[author] except KeyError: pass try: clean_author = Ctx().cvs_author_decoder(author) except UnicodeError: self._authors[author] = author raise UnicodeError('Problem decoding author \'%s\'' % (author,)) try: clean_author = clean_author.encode('utf8') except UnicodeError: self._authors[author] = author raise UnicodeError('Problem encoding author \'%s\'' % (author,)) self._authors[author] = clean_author return clean_author def _get_clean_log_msg(self, log_msg): """Return LOG_MSG, converted appropriately to UTF8. 
Raise a UnicodeException if it cannot be converted using the configured cvs_log_decoder.""" try: clean_log_msg = Ctx().cvs_log_decoder(log_msg) except UnicodeError: raise UnicodeError( 'Problem decoding log message:\n' '%s\n' '%s\n' '%s' % ('-' * 75, log_msg, '-' * 75,) ) try: return clean_log_msg.encode('utf8') except UnicodeError: raise UnicodeError( 'Problem encoding log message:\n' '%s\n' '%s\n' '%s' % ('-' * 75, log_msg, '-' * 75,) ) def _clean_metadata(self, metadata): """Clean up METADATA by overwriting its members as necessary.""" try: metadata.author = self._get_clean_author(metadata.author) except UnicodeError, e: logger.warn('%s: %s' % (warning_prefix, e,)) self.warnings = True try: metadata.log_msg = self._get_clean_log_msg(metadata.log_msg) except UnicodeError, e: logger.warn('%s: %s' % (warning_prefix, e,)) self.warnings = True def run(self, run_options, stats_keeper): logger.quiet("Converting metadata to UTF8...") metadata_db = MetadataDatabase( artifact_manager.get_temp_file(config.METADATA_STORE), artifact_manager.get_temp_file(config.METADATA_INDEX_TABLE), DB_OPEN_READ, ) metadata_clean_db = MetadataDatabase( artifact_manager.get_temp_file(config.METADATA_CLEAN_STORE), artifact_manager.get_temp_file(config.METADATA_CLEAN_INDEX_TABLE), DB_OPEN_NEW, ) self.warnings = False # A map {author : clean_author} for those known (to avoid # repeating warnings): self._authors = {} for id in metadata_db.iterkeys(): metadata = metadata_db[id] # Record the original author name because it might be needed for # expanding CVS keywords: metadata.original_author = metadata.author self._clean_metadata(metadata) metadata_clean_db[id] = metadata if self.warnings: raise FatalError( 'There were warnings converting author names and/or log messages\n' 'to Unicode (see messages above). Please restart this pass\n' 'with one or more \'--encoding\' parameters or with\n' '\'--fallback-encoding\'.' ) metadata_clean_db.close() metadata_db.close() logger.quiet("Done") class CollateSymbolsPass(Pass): """Divide symbols into branches, tags, and excludes.""" conversion_names = { Trunk : 'trunk', Branch : 'branch', Tag : 'tag', ExcludedSymbol : 'exclude', Symbol : '.', } def register_artifacts(self): self._register_temp_file(config.SYMBOL_DB) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_STATISTICS) def get_symbol(self, run_options, stats): """Use StrategyRules to decide what to do with a symbol. STATS is an instance of symbol_statistics._Stats describing an instance of Symbol or Trunk. To determine how the symbol is to be converted, consult the StrategyRules in the project's symbol_strategy_rules. Each rule is allowed a chance to change the way the symbol will be converted. If the symbol is not a Trunk or TypedSymbol after all rules have run, raise IndeterminateSymbolException.""" symbol = stats.lod rules = run_options.project_symbol_strategy_rules[symbol.project.id] for rule in rules: symbol = rule.get_symbol(symbol, stats) assert symbol is not None stats.check_valid(symbol) return symbol def log_symbol_summary(self, stats, symbol): if not self.symbol_info_file: return if isinstance(symbol, Trunk): name = '.trunk.' preferred_parent_name = '.' else: name = stats.lod.name if symbol.preferred_parent_id is None: preferred_parent_name = '.' else: preferred_parent = self.symbol_stats[symbol.preferred_parent_id].lod if isinstance(preferred_parent, Trunk): preferred_parent_name = '.trunk.' 
else: preferred_parent_name = preferred_parent.name if isinstance(symbol, LineOfDevelopment) and symbol.base_path: symbol_path = symbol.base_path else: symbol_path = '.' self.symbol_info_file.write( '%-5d %-30s %-10s %s %s\n' % ( stats.lod.project.id, name, self.conversion_names[symbol.__class__], symbol_path, preferred_parent_name, ) ) self.symbol_info_file.write(' # %s\n' % (stats,)) parent_counts = stats.possible_parents.items() if parent_counts: self.symbol_info_file.write(' # Possible parents:\n') parent_counts.sort(lambda a,b: cmp((b[1], a[0]), (a[1], b[0]))) for (pp, count) in parent_counts: if isinstance(pp, Trunk): self.symbol_info_file.write( ' # .trunk. : %d\n' % (count,) ) else: self.symbol_info_file.write( ' # %s : %d\n' % (pp.name, count,) ) def get_symbols(self, run_options): """Return a map telling how to convert symbols. The return value is a map {AbstractSymbol : (Trunk|TypedSymbol)}, indicating how each symbol should be converted. Trunk objects in SYMBOL_STATS are passed through unchanged. One object is included in the return value for each line of development described in SYMBOL_STATS. Raise FatalError if there was an error.""" errors = [] mismatches = [] if Ctx().symbol_info_filename is not None: self.symbol_info_file = open(Ctx().symbol_info_filename, 'w') self.symbol_info_file.write( '# Columns: project_id symbol_name conversion symbol_path ' 'preferred_parent_name\n' ) else: self.symbol_info_file = None # Initialize each symbol strategy rule a single time, even if it # is used in more than one project. First define a map from # object id to symbol strategy rule: rules = {} for rule_list in run_options.project_symbol_strategy_rules: for rule in rule_list: rules[id(rule)] = rule for rule in rules.itervalues(): rule.start(self.symbol_stats) retval = {} for stats in self.symbol_stats: try: symbol = self.get_symbol(run_options, stats) except IndeterminateSymbolException, e: self.log_symbol_summary(stats, stats.lod) mismatches.append(e.stats) except SymbolPlanError, e: self.log_symbol_summary(stats, stats.lod) errors.append(e) else: self.log_symbol_summary(stats, symbol) retval[stats.lod] = symbol for rule in rules.itervalues(): rule.finish() if self.symbol_info_file: self.symbol_info_file.close() del self.symbol_info_file if errors or mismatches: s = ['Problems determining how symbols should be converted:\n'] for e in errors: s.append('%s\n' % (e,)) if mismatches: s.append( 'It is not clear how the following symbols ' 'should be converted.\n' 'Use --symbol-hints, --force-tag, --force-branch, --exclude, ' 'and/or\n' '--symbol-default to resolve the ambiguity.\n' ) for stats in mismatches: s.append(' %s\n' % (stats,)) raise FatalError(''.join(s)) else: return retval def run(self, run_options, stats_keeper): Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) self.symbol_stats = SymbolStatistics( artifact_manager.get_temp_file(config.SYMBOL_STATISTICS) ) symbol_map = self.get_symbols(run_options) # Check the symbols for consistency and bail out if there were errors: self.symbol_stats.check_consistency(symbol_map) # Check that the symbols all have SVN paths set and that the paths # are disjoint: Ctx().output_option.check_symbols(symbol_map) for symbol in symbol_map.itervalues(): if isinstance(symbol, ExcludedSymbol): self.symbol_stats.exclude_symbol(symbol) create_symbol_database(symbol_map.values()) del self.symbol_stats logger.quiet("Done") class FilterSymbolsPass(Pass): """Delete any branches/tags that are to be excluded. 
Also delete revisions on excluded branches, and delete other references to the excluded symbols.""" def register_artifacts(self): self._register_temp_file(config.ITEM_SERIALIZER) self._register_temp_file(config.CVS_REVS_DATAFILE) self._register_temp_file(config.CVS_SYMBOLS_DATAFILE) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.METADATA_CLEAN_STORE) self._register_temp_file_needed(config.METADATA_CLEAN_INDEX_TABLE) self._register_temp_file_needed(config.CVS_PATHS_DB) self._register_temp_file_needed(config.CVS_ITEMS_STORE) Ctx().revision_collector.register_artifacts(self) def run(self, run_options, stats_keeper): Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._metadata_db = MetadataDatabase( artifact_manager.get_temp_file(config.METADATA_CLEAN_STORE), artifact_manager.get_temp_file(config.METADATA_CLEAN_INDEX_TABLE), DB_OPEN_READ, ) Ctx()._symbol_db = SymbolDatabase() cvs_item_store = OldCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_STORE)) cvs_item_serializer = PrimedPickleSerializer(cvs_item_primer) f = open(artifact_manager.get_temp_file(config.ITEM_SERIALIZER), 'wb') cPickle.dump(cvs_item_serializer, f, -1) f.close() rev_db = NewSortableCVSRevisionDatabase( artifact_manager.get_temp_file(config.CVS_REVS_DATAFILE), cvs_item_serializer, ) symbol_db = NewSortableCVSSymbolDatabase( artifact_manager.get_temp_file(config.CVS_SYMBOLS_DATAFILE), cvs_item_serializer, ) revision_collector = Ctx().revision_collector logger.quiet("Filtering out excluded symbols and summarizing items...") stats_keeper.reset_cvs_rev_info() revision_collector.start() # Process the cvs items store one file at a time: for cvs_file_items in cvs_item_store.iter_cvs_file_items(): logger.verbose(cvs_file_items.cvs_file.rcs_path) cvs_file_items.filter_excluded_symbols() cvs_file_items.mutate_symbols() cvs_file_items.adjust_parents() cvs_file_items.refine_symbols() cvs_file_items.determine_revision_properties( Ctx().revision_property_setters ) cvs_file_items.record_opened_symbols() cvs_file_items.record_closed_symbols() cvs_file_items.check_link_consistency() # Give the revision collector a chance to collect data about the # file: revision_collector.process_file(cvs_file_items) # Store whatever is left to the new file and update statistics: stats_keeper.record_cvs_file(cvs_file_items.cvs_file) for cvs_item in cvs_file_items.values(): stats_keeper.record_cvs_item(cvs_item) if isinstance(cvs_item, CVSRevision): rev_db.add(cvs_item) elif isinstance(cvs_item, CVSSymbol): symbol_db.add(cvs_item) stats_keeper.set_stats_reflect_exclude(True) rev_db.close() symbol_db.close() revision_collector.finish() cvs_item_store.close() Ctx()._symbol_db.close() Ctx()._cvs_path_db.close() logger.quiet("Done") class SortRevisionsPass(Pass): """Sort the revisions file.""" def register_artifacts(self): self._register_temp_file(config.CVS_REVS_SORTED_DATAFILE) self._register_temp_file_needed(config.CVS_REVS_DATAFILE) def run(self, run_options, stats_keeper): logger.quiet("Sorting CVS revision summaries...") sort_file( artifact_manager.get_temp_file(config.CVS_REVS_DATAFILE), artifact_manager.get_temp_file( config.CVS_REVS_SORTED_DATAFILE ), tempdirs=[Ctx().tmpdir], ) logger.quiet("Done") class SortSymbolsPass(Pass): """Sort the symbols file.""" def register_artifacts(self): self._register_temp_file(config.CVS_SYMBOLS_SORTED_DATAFILE) 
self._register_temp_file_needed(config.CVS_SYMBOLS_DATAFILE) def run(self, run_options, stats_keeper): logger.quiet("Sorting CVS symbol summaries...") sort_file( artifact_manager.get_temp_file(config.CVS_SYMBOLS_DATAFILE), artifact_manager.get_temp_file( config.CVS_SYMBOLS_SORTED_DATAFILE ), tempdirs=[Ctx().tmpdir], ) logger.quiet("Done") class InitializeChangesetsPass(Pass): """Create preliminary CommitSets.""" def register_artifacts(self): self._register_temp_file(config.CVS_ITEM_TO_CHANGESET) self._register_temp_file(config.CHANGESETS_STORE) self._register_temp_file(config.CHANGESETS_INDEX) self._register_temp_file(config.CVS_ITEMS_SORTED_STORE) self._register_temp_file(config.CVS_ITEMS_SORTED_INDEX_TABLE) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.CVS_PATHS_DB) self._register_temp_file_needed(config.ITEM_SERIALIZER) self._register_temp_file_needed(config.CVS_REVS_SORTED_DATAFILE) self._register_temp_file_needed( config.CVS_SYMBOLS_SORTED_DATAFILE) def get_revision_changesets(self): """Generate revision changesets, one at a time. Each time, yield a list of CVSRevisions that might potentially consititute a changeset.""" # Create changesets for CVSRevisions: old_metadata_id = None old_timestamp = None changeset_items = [] db = OldSortableCVSRevisionDatabase( artifact_manager.get_temp_file( config.CVS_REVS_SORTED_DATAFILE ), self.cvs_item_serializer, ) for cvs_rev in db: if cvs_rev.metadata_id != old_metadata_id \ or cvs_rev.timestamp > old_timestamp + config.COMMIT_THRESHOLD: # Start a new changeset. First finish up the old changeset, # if any: if changeset_items: yield changeset_items changeset_items = [] old_metadata_id = cvs_rev.metadata_id changeset_items.append(cvs_rev) old_timestamp = cvs_rev.timestamp # Finish up the last changeset, if any: if changeset_items: yield changeset_items def get_symbol_changesets(self): """Generate symbol changesets, one at a time. Each time, yield a list of CVSSymbols that might potentially consititute a changeset.""" old_symbol_id = None changeset_items = [] db = OldSortableCVSSymbolDatabase( artifact_manager.get_temp_file( config.CVS_SYMBOLS_SORTED_DATAFILE ), self.cvs_item_serializer, ) for cvs_symbol in db: if cvs_symbol.symbol.id != old_symbol_id: # Start a new changeset. First finish up the old changeset, # if any: if changeset_items: yield changeset_items changeset_items = [] old_symbol_id = cvs_symbol.symbol.id changeset_items.append(cvs_symbol) # Finish up the last changeset, if any: if changeset_items: yield changeset_items @staticmethod def compare_items(a, b): return ( cmp(a.timestamp, b.timestamp) or cmp(a.cvs_file.cvs_path, b.cvs_file.cvs_path) or cmp([int(x) for x in a.rev.split('.')], [int(x) for x in b.rev.split('.')]) or cmp(a.id, b.id)) def break_internal_dependencies(self, changeset_items): """Split up CHANGESET_ITEMS if necessary to break internal dependencies. CHANGESET_ITEMS is a list of CVSRevisions that could possibly belong in a single RevisionChangeset, but there might be internal dependencies among the items. Return a list of lists, where each sublist is a list of CVSRevisions and at least one internal dependency has been eliminated. Iff CHANGESET_ITEMS does not have to be split, then the return value will contain a single value, namely the original value of CHANGESET_ITEMS. 
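    Illustrative example (not taken from the original documentation): if
    the sorted items are [a, b, c] and the only internal dependency is
    a -> c, the running per-gap counts work out to [1, 1, 0], so the
    list is split after index 0 or index 1, breaking the a -> c
    dependency; the tie between the two candidate split points is
    resolved using the timestamp gap between the adjacent items.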
Split CHANGESET_ITEMS at most once, even though the resulting changesets might themselves have internal dependencies.""" # We only look for succ dependencies, since by doing so we # automatically cover pred dependencies as well. First create a # list of tuples (pred, succ) of id pairs for CVSItems that depend # on each other. dependencies = [] changeset_cvs_item_ids = set([cvs_rev.id for cvs_rev in changeset_items]) for cvs_item in changeset_items: for next_id in cvs_item.get_succ_ids(): if next_id in changeset_cvs_item_ids: # Sanity check: a CVSItem should never depend on itself: if next_id == cvs_item.id: raise InternalError('Item depends on itself: %s' % (cvs_item,)) dependencies.append((cvs_item.id, next_id,)) if dependencies: # Sort the changeset_items in a defined order (chronological to the # extent that the timestamps are correct and unique). changeset_items.sort(self.compare_items) indexes = {} for (i, changeset_item) in enumerate(changeset_items): indexes[changeset_item.id] = i # How many internal dependencies would be broken by breaking the # Changeset after a particular index? breaks = [0] * len(changeset_items) for (pred, succ,) in dependencies: pred_index = indexes[pred] succ_index = indexes[succ] breaks[min(pred_index, succ_index)] += 1 breaks[max(pred_index, succ_index)] -= 1 best_i = None best_count = -1 best_time = 0 for i in range(1, len(breaks)): breaks[i] += breaks[i - 1] for i in range(0, len(breaks) - 1): if breaks[i] > best_count: best_i = i best_count = breaks[i] best_time = (changeset_items[i + 1].timestamp - changeset_items[i].timestamp) elif breaks[i] == best_count \ and (changeset_items[i + 1].timestamp - changeset_items[i].timestamp) < best_time: best_i = i best_count = breaks[i] best_time = (changeset_items[i + 1].timestamp - changeset_items[i].timestamp) # Reuse the old changeset.id for the first of the split changesets. return [changeset_items[:best_i + 1], changeset_items[best_i + 1:]] else: return [changeset_items] def break_all_internal_dependencies(self, changeset_items): """Keep breaking CHANGESET_ITEMS up to break all internal dependencies. CHANGESET_ITEMS is a list of CVSRevisions that could conceivably be part of a single changeset. Break this list into sublists, where the CVSRevisions in each sublist are free of mutual dependencies.""" # This method is written non-recursively to avoid any possible # problems with recursion depth. changesets_to_split = [changeset_items] while changesets_to_split: changesets = self.break_internal_dependencies(changesets_to_split.pop()) if len(changesets) == 1: [changeset_items] = changesets yield changeset_items else: # The changeset had to be split; see if either of the # fragments have to be split: changesets.reverse() changesets_to_split.extend(changesets) def get_changesets(self): """Generate (Changeset, [CVSItem,...]) for all changesets. The Changesets already have their internal dependencies broken. The [CVSItem,...] 
list is the list of CVSItems in the corresponding Changeset.""" for changeset_items in self.get_revision_changesets(): for split_changeset_items \ in self.break_all_internal_dependencies(changeset_items): yield ( RevisionChangeset( self.changeset_key_generator.gen_id(), [cvs_rev.id for cvs_rev in split_changeset_items] ), split_changeset_items, ) for changeset_items in self.get_symbol_changesets(): yield ( create_symbol_changeset( self.changeset_key_generator.gen_id(), changeset_items[0].symbol, [cvs_symbol.id for cvs_symbol in changeset_items] ), changeset_items, ) def run(self, run_options, stats_keeper): logger.quiet("Creating preliminary commit sets...") Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._symbol_db = SymbolDatabase() f = open(artifact_manager.get_temp_file(config.ITEM_SERIALIZER), 'rb') self.cvs_item_serializer = cPickle.load(f) f.close() changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_STORE), artifact_manager.get_temp_file(config.CHANGESETS_INDEX), DB_OPEN_NEW, ) cvs_item_to_changeset_id = CVSItemToChangesetTable( artifact_manager.get_temp_file(config.CVS_ITEM_TO_CHANGESET), DB_OPEN_NEW, ) self.sorted_cvs_items_db = IndexedCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_STORE), artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_INDEX_TABLE), DB_OPEN_NEW) self.changeset_key_generator = KeyGenerator() for (changeset, changeset_items) in self.get_changesets(): if logger.is_on(logger.DEBUG): logger.debug(repr(changeset)) changeset_db.store(changeset) for cvs_item in changeset_items: self.sorted_cvs_items_db.add(cvs_item) cvs_item_to_changeset_id[cvs_item.id] = changeset.id self.sorted_cvs_items_db.close() cvs_item_to_changeset_id.close() changeset_db.close() Ctx()._symbol_db.close() Ctx()._cvs_path_db.close() del self.cvs_item_serializer logger.quiet("Done") class ProcessedChangesetLogger: def __init__(self): self.processed_changeset_ids = [] def log(self, changeset_id): if logger.is_on(logger.DEBUG): self.processed_changeset_ids.append(changeset_id) def flush(self): if self.processed_changeset_ids: logger.debug( 'Consumed changeset ids %s' % (', '.join(['%x' % id for id in self.processed_changeset_ids]),)) del self.processed_changeset_ids[:] class BreakRevisionChangesetCyclesPass(Pass): """Break up any dependency cycles involving only RevisionChangesets.""" def register_artifacts(self): self._register_temp_file(config.CHANGESETS_REVBROKEN_STORE) self._register_temp_file(config.CHANGESETS_REVBROKEN_INDEX) self._register_temp_file(config.CVS_ITEM_TO_CHANGESET_REVBROKEN) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.CVS_PATHS_DB) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_STORE) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_INDEX_TABLE) self._register_temp_file_needed(config.CHANGESETS_STORE) self._register_temp_file_needed(config.CHANGESETS_INDEX) self._register_temp_file_needed(config.CVS_ITEM_TO_CHANGESET) def get_source_changesets(self): old_changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_STORE), artifact_manager.get_temp_file(config.CHANGESETS_INDEX), DB_OPEN_READ) changeset_ids = old_changeset_db.keys() for changeset_id in changeset_ids: yield old_changeset_db[changeset_id] old_changeset_db.close() del old_changeset_db def break_cycle(self, cycle): """Break up one or more 
changesets in CYCLE to help break the cycle. CYCLE is a list of Changesets where cycle[i] depends on cycle[i - 1] Break up one or more changesets in CYCLE to make progress towards breaking the cycle. Update self.changeset_graph accordingly. It is not guaranteed that the cycle will be broken by one call to this routine, but at least some progress must be made.""" self.processed_changeset_logger.flush() best_i = None best_link = None for i in range(len(cycle)): # It's OK if this index wraps to -1: link = ChangesetGraphLink( cycle[i - 1], cycle[i], cycle[i + 1 - len(cycle)]) if best_i is None or link < best_link: best_i = i best_link = link if logger.is_on(logger.DEBUG): logger.debug( 'Breaking cycle %s by breaking node %x' % ( ' -> '.join(['%x' % node.id for node in (cycle + [cycle[0]])]), best_link.changeset.id,)) new_changesets = best_link.break_changeset(self.changeset_key_generator) self.changeset_graph.delete_changeset(best_link.changeset) for changeset in new_changesets: self.changeset_graph.add_new_changeset(changeset) def run(self, run_options, stats_keeper): logger.quiet("Breaking revision changeset dependency cycles...") Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._symbol_db = SymbolDatabase() Ctx()._cvs_items_db = IndexedCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_STORE), artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_INDEX_TABLE), DB_OPEN_READ) shutil.copyfile( artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET), artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_REVBROKEN)) cvs_item_to_changeset_id = CVSItemToChangesetTable( artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_REVBROKEN), DB_OPEN_WRITE) changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_REVBROKEN_STORE), artifact_manager.get_temp_file(config.CHANGESETS_REVBROKEN_INDEX), DB_OPEN_NEW) self.changeset_graph = ChangesetGraph( changeset_db, cvs_item_to_changeset_id ) max_changeset_id = 0 for changeset in self.get_source_changesets(): changeset_db.store(changeset) if isinstance(changeset, RevisionChangeset): self.changeset_graph.add_changeset(changeset) max_changeset_id = max(max_changeset_id, changeset.id) self.changeset_key_generator = KeyGenerator(max_changeset_id + 1) self.processed_changeset_logger = ProcessedChangesetLogger() # Consume the graph, breaking cycles using self.break_cycle(): for (changeset, time_range) in self.changeset_graph.consume_graph( cycle_breaker=self.break_cycle ): self.processed_changeset_logger.log(changeset.id) self.processed_changeset_logger.flush() del self.processed_changeset_logger self.changeset_graph.close() self.changeset_graph = None Ctx()._cvs_items_db.close() Ctx()._symbol_db.close() Ctx()._cvs_path_db.close() logger.quiet("Done") class RevisionTopologicalSortPass(Pass): """Sort RevisionChangesets into commit order. 
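# Illustrative sketch (not part of cvs2svn): break_cycle() above walks the
# cycle, builds a ChangesetGraphLink for every changeset together with its
# two neighbours, and breaks the changeset belonging to the "best" (i.e.,
# smallest) link.  The helper below shows only that selection step, using a
# purely hypothetical cost callable in place of the real link comparison.

def choose_changeset_to_break(cycle, link_cost):
  """Return the index into CYCLE of the changeset that is cheapest to break.

  CYCLE is a list in which cycle[i] depends on cycle[i - 1]; LINK_COST is a
  callable taking (pred, changeset, succ) and returning a comparable cost."""

  best_i = None
  best_cost = None
  for i in range(len(cycle)):
    # Like the code above, rely on the fact that index i - 1 wraps around to
    # the end of the list and i + 1 - len(cycle) wraps to the start:
    cost = link_cost(cycle[i - 1], cycle[i], cycle[i + 1 - len(cycle)])
    if best_cost is None or cost < best_cost:
      best_i = i
      best_cost = cost
  return best_i

# Example (hypothetical cost = number of items in the changeset):
#   choose_changeset_to_break([['a', 'b'], ['c'], ['d', 'e']],
#                             lambda pred, node, succ: len(node))  # -> 1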
Also convert them to OrderedChangesets, without changing their ids.""" def register_artifacts(self): self._register_temp_file(config.CHANGESETS_REVSORTED_STORE) self._register_temp_file(config.CHANGESETS_REVSORTED_INDEX) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.CVS_PATHS_DB) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_STORE) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_INDEX_TABLE) self._register_temp_file_needed(config.CHANGESETS_REVBROKEN_STORE) self._register_temp_file_needed(config.CHANGESETS_REVBROKEN_INDEX) self._register_temp_file_needed(config.CVS_ITEM_TO_CHANGESET_REVBROKEN) def get_source_changesets(self, changeset_db): changeset_ids = changeset_db.keys() for changeset_id in changeset_ids: yield changeset_db[changeset_id] def get_changesets(self): changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_REVBROKEN_STORE), artifact_manager.get_temp_file(config.CHANGESETS_REVBROKEN_INDEX), DB_OPEN_READ, ) changeset_graph = ChangesetGraph( changeset_db, CVSItemToChangesetTable( artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_REVBROKEN ), DB_OPEN_READ, ) ) for changeset in self.get_source_changesets(changeset_db): if isinstance(changeset, RevisionChangeset): changeset_graph.add_changeset(changeset) else: yield changeset changeset_ids = [] # Sentry: changeset_ids.append(None) for (changeset, time_range) in changeset_graph.consume_graph(): changeset_ids.append(changeset.id) # Sentry: changeset_ids.append(None) for i in range(1, len(changeset_ids) - 1): changeset = changeset_db[changeset_ids[i]] yield OrderedChangeset( changeset.id, changeset.cvs_item_ids, i - 1, changeset_ids[i - 1], changeset_ids[i + 1]) changeset_graph.close() def run(self, run_options, stats_keeper): logger.quiet("Generating CVSRevisions in commit order...") Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._symbol_db = SymbolDatabase() Ctx()._cvs_items_db = IndexedCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_STORE), artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_INDEX_TABLE), DB_OPEN_READ) changesets_revordered_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_REVSORTED_STORE), artifact_manager.get_temp_file(config.CHANGESETS_REVSORTED_INDEX), DB_OPEN_NEW) for changeset in self.get_changesets(): changesets_revordered_db.store(changeset) changesets_revordered_db.close() Ctx()._cvs_items_db.close() Ctx()._symbol_db.close() Ctx()._cvs_path_db.close() logger.quiet("Done") class BreakSymbolChangesetCyclesPass(Pass): """Break up any dependency cycles involving only SymbolChangesets.""" def register_artifacts(self): self._register_temp_file(config.CHANGESETS_SYMBROKEN_STORE) self._register_temp_file(config.CHANGESETS_SYMBROKEN_INDEX) self._register_temp_file(config.CVS_ITEM_TO_CHANGESET_SYMBROKEN) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.CVS_PATHS_DB) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_STORE) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_INDEX_TABLE) self._register_temp_file_needed(config.CHANGESETS_REVSORTED_STORE) self._register_temp_file_needed(config.CHANGESETS_REVSORTED_INDEX) self._register_temp_file_needed(config.CVS_ITEM_TO_CHANGESET_REVBROKEN) def get_source_changesets(self): 
old_changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_REVSORTED_STORE), artifact_manager.get_temp_file(config.CHANGESETS_REVSORTED_INDEX), DB_OPEN_READ) changeset_ids = old_changeset_db.keys() for changeset_id in changeset_ids: yield old_changeset_db[changeset_id] old_changeset_db.close() def break_cycle(self, cycle): """Break up one or more changesets in CYCLE to help break the cycle. CYCLE is a list of Changesets where cycle[i] depends on cycle[i - 1] Break up one or more changesets in CYCLE to make progress towards breaking the cycle. Update self.changeset_graph accordingly. It is not guaranteed that the cycle will be broken by one call to this routine, but at least some progress must be made.""" self.processed_changeset_logger.flush() best_i = None best_link = None for i in range(len(cycle)): # It's OK if this index wraps to -1: link = ChangesetGraphLink( cycle[i - 1], cycle[i], cycle[i + 1 - len(cycle)]) if best_i is None or link < best_link: best_i = i best_link = link if logger.is_on(logger.DEBUG): logger.debug( 'Breaking cycle %s by breaking node %x' % ( ' -> '.join(['%x' % node.id for node in (cycle + [cycle[0]])]), best_link.changeset.id,)) new_changesets = best_link.break_changeset(self.changeset_key_generator) self.changeset_graph.delete_changeset(best_link.changeset) for changeset in new_changesets: self.changeset_graph.add_new_changeset(changeset) def run(self, run_options, stats_keeper): logger.quiet("Breaking symbol changeset dependency cycles...") Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._symbol_db = SymbolDatabase() Ctx()._cvs_items_db = IndexedCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_STORE), artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_INDEX_TABLE), DB_OPEN_READ) shutil.copyfile( artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_REVBROKEN), artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_SYMBROKEN)) cvs_item_to_changeset_id = CVSItemToChangesetTable( artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_SYMBROKEN), DB_OPEN_WRITE) changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_SYMBROKEN_STORE), artifact_manager.get_temp_file(config.CHANGESETS_SYMBROKEN_INDEX), DB_OPEN_NEW) self.changeset_graph = ChangesetGraph( changeset_db, cvs_item_to_changeset_id ) max_changeset_id = 0 for changeset in self.get_source_changesets(): changeset_db.store(changeset) if isinstance(changeset, SymbolChangeset): self.changeset_graph.add_changeset(changeset) max_changeset_id = max(max_changeset_id, changeset.id) self.changeset_key_generator = KeyGenerator(max_changeset_id + 1) self.processed_changeset_logger = ProcessedChangesetLogger() # Consume the graph, breaking cycles using self.break_cycle(): for (changeset, time_range) in self.changeset_graph.consume_graph( cycle_breaker=self.break_cycle ): self.processed_changeset_logger.log(changeset.id) self.processed_changeset_logger.flush() del self.processed_changeset_logger self.changeset_graph.close() self.changeset_graph = None Ctx()._cvs_items_db.close() Ctx()._symbol_db.close() Ctx()._cvs_path_db.close() logger.quiet("Done") class BreakAllChangesetCyclesPass(Pass): """Break up any dependency cycles that are closed by SymbolChangesets.""" def register_artifacts(self): self._register_temp_file(config.CHANGESETS_ALLBROKEN_STORE) self._register_temp_file(config.CHANGESETS_ALLBROKEN_INDEX) 
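# Illustrative sketch (not part of cvs2svn): RevisionTopologicalSortPass
# (earlier in this file) brackets the topologically sorted changeset ids
# with None sentinels so that every real entry can look up its predecessor
# and successor without special-casing the two ends.  A minimal version of
# that trick:

def link_in_commit_order(sorted_ids):
  """Yield (ordinal, id, prev_id, next_id) for each id in SORTED_IDS.

  prev_id and next_id are None at the two ends, mirroring the information
  stored in each OrderedChangeset."""

  ids = [None] + list(sorted_ids) + [None]
  for i in range(1, len(ids) - 1):
    yield (i - 1, ids[i], ids[i - 1], ids[i + 1])

# Example:
#   list(link_in_commit_order([7, 8, 9]))
#   -> [(0, 7, None, 8), (1, 8, 7, 9), (2, 9, 8, None)]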
self._register_temp_file(config.CVS_ITEM_TO_CHANGESET_ALLBROKEN) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.CVS_PATHS_DB) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_STORE) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_INDEX_TABLE) self._register_temp_file_needed(config.CHANGESETS_SYMBROKEN_STORE) self._register_temp_file_needed(config.CHANGESETS_SYMBROKEN_INDEX) self._register_temp_file_needed(config.CVS_ITEM_TO_CHANGESET_SYMBROKEN) def get_source_changesets(self): old_changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_SYMBROKEN_STORE), artifact_manager.get_temp_file(config.CHANGESETS_SYMBROKEN_INDEX), DB_OPEN_READ) changeset_ids = old_changeset_db.keys() for changeset_id in changeset_ids: yield old_changeset_db[changeset_id] old_changeset_db.close() def _split_retrograde_changeset(self, changeset): """CHANGESET is retrograde. Split it into non-retrograde changesets.""" logger.debug('Breaking retrograde changeset %x' % (changeset.id,)) self.changeset_graph.delete_changeset(changeset) # A map { cvs_branch_id : (max_pred_ordinal, min_succ_ordinal) } ordinal_limits = {} for cvs_branch in changeset.iter_cvs_items(): max_pred_ordinal = 0 min_succ_ordinal = sys.maxint for pred_id in cvs_branch.get_pred_ids(): pred_ordinal = self.ordinals.get( self.cvs_item_to_changeset_id[pred_id], 0) max_pred_ordinal = max(max_pred_ordinal, pred_ordinal) for succ_id in cvs_branch.get_succ_ids(): succ_ordinal = self.ordinals.get( self.cvs_item_to_changeset_id[succ_id], sys.maxint) min_succ_ordinal = min(min_succ_ordinal, succ_ordinal) assert max_pred_ordinal < min_succ_ordinal ordinal_limits[cvs_branch.id] = (max_pred_ordinal, min_succ_ordinal,) # Find the earliest successor ordinal: min_min_succ_ordinal = sys.maxint for (max_pred_ordinal, min_succ_ordinal) in ordinal_limits.values(): min_min_succ_ordinal = min(min_min_succ_ordinal, min_succ_ordinal) early_item_ids = [] late_item_ids = [] for (id, (max_pred_ordinal, min_succ_ordinal)) in ordinal_limits.items(): if max_pred_ordinal >= min_min_succ_ordinal: late_item_ids.append(id) else: early_item_ids.append(id) assert early_item_ids assert late_item_ids early_changeset = changeset.create_split_changeset( self.changeset_key_generator.gen_id(), early_item_ids) late_changeset = changeset.create_split_changeset( self.changeset_key_generator.gen_id(), late_item_ids) self.changeset_graph.add_new_changeset(early_changeset) self.changeset_graph.add_new_changeset(late_changeset) early_split = self._split_if_retrograde(early_changeset.id) # Because of the way we constructed it, the early changeset should # not have to be split: assert not early_split self._split_if_retrograde(late_changeset.id) def _split_if_retrograde(self, changeset_id): node = self.changeset_graph[changeset_id] pred_ordinals = [ self.ordinals[id] for id in node.pred_ids if id in self.ordinals ] pred_ordinals.sort() succ_ordinals = [ self.ordinals[id] for id in node.succ_ids if id in self.ordinals ] succ_ordinals.sort() if pred_ordinals and succ_ordinals \ and pred_ordinals[-1] >= succ_ordinals[0]: self._split_retrograde_changeset(self.changeset_db[node.id]) return True else: return False def break_segment(self, segment): """Break a changeset in SEGMENT[1:-1]. 
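# Illustrative sketch (not part of cvs2svn): the partitioning rule used by
# _split_retrograde_changeset() above, reduced to plain data.  Each item
# knows the ordinal of its latest predecessor and earliest successor; any
# item whose predecessors reach at or past the earliest successor ordinal of
# the whole changeset is moved to the "late" half so that neither half is
# retrograde.

def split_retrograde(ordinal_limits):
  """Partition item ids into (early_ids, late_ids).

  ORDINAL_LIMITS maps item id -> (max_pred_ordinal, min_succ_ordinal)."""

  min_min_succ = min(succ for (pred, succ) in ordinal_limits.values())
  early_ids = []
  late_ids = []
  for (item_id, (max_pred, min_succ)) in sorted(ordinal_limits.items()):
    if max_pred >= min_min_succ:
      late_ids.append(item_id)
    else:
      early_ids.append(item_id)
  return (early_ids, late_ids)

# Example: item 2's predecessors (ordinal 7) come after item 1's earliest
# successor (ordinal 5), so item 2 has to be split off:
#   split_retrograde({1: (0, 5), 2: (7, 9)})  # -> ([1], [2])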
The range SEGMENT[1:-1] is not empty, and all of the changesets in that range are SymbolChangesets.""" best_i = None best_link = None for i in range(1, len(segment) - 1): link = ChangesetGraphLink(segment[i - 1], segment[i], segment[i + 1]) if best_i is None or link < best_link: best_i = i best_link = link if logger.is_on(logger.DEBUG): logger.debug( 'Breaking segment %s by breaking node %x' % ( ' -> '.join(['%x' % node.id for node in segment]), best_link.changeset.id,)) new_changesets = best_link.break_changeset(self.changeset_key_generator) self.changeset_graph.delete_changeset(best_link.changeset) for changeset in new_changesets: self.changeset_graph.add_new_changeset(changeset) def break_cycle(self, cycle): """Break up one or more SymbolChangesets in CYCLE to help break the cycle. CYCLE is a list of SymbolChangesets where cycle[i] depends on cycle[i - 1] . Break up one or more changesets in CYCLE to make progress towards breaking the cycle. Update self.changeset_graph accordingly. It is not guaranteed that the cycle will be broken by one call to this routine, but at least some progress must be made.""" if logger.is_on(logger.DEBUG): logger.debug( 'Breaking cycle %s' % ( ' -> '.join(['%x' % changeset.id for changeset in cycle + [cycle[0]]]),)) # Unwrap the cycle into a segment then break the segment: self.break_segment([cycle[-1]] + cycle + [cycle[0]]) def run(self, run_options, stats_keeper): logger.quiet("Breaking CVSSymbol dependency loops...") Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._symbol_db = SymbolDatabase() Ctx()._cvs_items_db = IndexedCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_STORE), artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_INDEX_TABLE), DB_OPEN_READ) shutil.copyfile( artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_SYMBROKEN), artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_ALLBROKEN)) self.cvs_item_to_changeset_id = CVSItemToChangesetTable( artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_ALLBROKEN), DB_OPEN_WRITE) self.changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_ALLBROKEN_STORE), artifact_manager.get_temp_file(config.CHANGESETS_ALLBROKEN_INDEX), DB_OPEN_NEW) self.changeset_graph = ChangesetGraph( self.changeset_db, self.cvs_item_to_changeset_id ) # A map {changeset_id : ordinal} for OrderedChangesets: self.ordinals = {} # A map {ordinal : changeset_id}: ordered_changeset_map = {} # A list of all BranchChangeset ids: branch_changeset_ids = [] max_changeset_id = 0 for changeset in self.get_source_changesets(): self.changeset_db.store(changeset) self.changeset_graph.add_changeset(changeset) if isinstance(changeset, OrderedChangeset): ordered_changeset_map[changeset.ordinal] = changeset.id self.ordinals[changeset.id] = changeset.ordinal elif isinstance(changeset, BranchChangeset): branch_changeset_ids.append(changeset.id) max_changeset_id = max(max_changeset_id, changeset.id) # An array of ordered_changeset ids, indexed by ordinal: ordered_changesets = [] for ordinal in range(len(ordered_changeset_map)): id = ordered_changeset_map[ordinal] ordered_changesets.append(id) ordered_changeset_ids = set(ordered_changeset_map.values()) del ordered_changeset_map self.changeset_key_generator = KeyGenerator(max_changeset_id + 1) # First we scan through all BranchChangesets looking for # changesets that are individually "retrograde" and splitting # those up: for changeset_id 
in branch_changeset_ids: self._split_if_retrograde(changeset_id) del self.ordinals next_ordered_changeset = 0 self.processed_changeset_logger = ProcessedChangesetLogger() while self.changeset_graph: # Consume any nodes that don't have predecessors: for (changeset, time_range) \ in self.changeset_graph.consume_nopred_nodes(): self.processed_changeset_logger.log(changeset.id) if changeset.id in ordered_changeset_ids: next_ordered_changeset += 1 ordered_changeset_ids.remove(changeset.id) self.processed_changeset_logger.flush() if not self.changeset_graph: break # Now work on the next ordered changeset that has not yet been # processed. BreakSymbolChangesetCyclesPass has broken any # cycles involving only SymbolChangesets, so the presence of a # cycle implies that there is at least one ordered changeset # left in the graph: assert next_ordered_changeset < len(ordered_changesets) id = ordered_changesets[next_ordered_changeset] path = self.changeset_graph.search_for_path(id, ordered_changeset_ids) if path: if logger.is_on(logger.DEBUG): logger.debug('Breaking path from %s to %s' % (path[0], path[-1],)) self.break_segment(path) else: # There were no ordered changesets among the reachable # predecessors, so do generic cycle-breaking: if logger.is_on(logger.DEBUG): logger.debug( 'Breaking generic cycle found from %s' % (self.changeset_db[id],) ) self.break_cycle(self.changeset_graph.find_cycle(id)) del self.processed_changeset_logger self.changeset_graph.close() self.changeset_graph = None self.cvs_item_to_changeset_id = None self.changeset_db = None logger.quiet("Done") class TopologicalSortPass(Pass): """Sort changesets into commit order.""" def register_artifacts(self): self._register_temp_file(config.CHANGESETS_SORTED_DATAFILE) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.CVS_PATHS_DB) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_STORE) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_INDEX_TABLE) self._register_temp_file_needed(config.CHANGESETS_ALLBROKEN_STORE) self._register_temp_file_needed(config.CHANGESETS_ALLBROKEN_INDEX) self._register_temp_file_needed(config.CVS_ITEM_TO_CHANGESET_ALLBROKEN) def get_source_changesets(self, changeset_db): for changeset_id in changeset_db.keys(): yield changeset_db[changeset_id] def get_changesets(self): """Generate (changeset, timestamp) pairs in commit order.""" changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_ALLBROKEN_STORE), artifact_manager.get_temp_file(config.CHANGESETS_ALLBROKEN_INDEX), DB_OPEN_READ) changeset_graph = ChangesetGraph( changeset_db, CVSItemToChangesetTable( artifact_manager.get_temp_file( config.CVS_ITEM_TO_CHANGESET_ALLBROKEN ), DB_OPEN_READ, ), ) symbol_changeset_ids = set() for changeset in self.get_source_changesets(changeset_db): changeset_graph.add_changeset(changeset) if isinstance(changeset, SymbolChangeset): symbol_changeset_ids.add(changeset.id) # Ensure a monotonically-increasing timestamp series by keeping # track of the previous timestamp and ensuring that the following # one is larger. 
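# Illustrative sketch (not part of cvs2svn): one simple way to enforce the
# monotonically-increasing timestamp series described in the comment above.
# The real Timestamper class used below has a richer interface (for example
# it is also passed whether the changeset is a symbol changeset); this
# miniature shows only the monotonicity bookkeeping.

class MonotonicTimestamper(object):
  def __init__(self):
    self._previous = None

  def get(self, timestamp):
    """Return TIMESTAMP, bumped if necessary so that results increase."""

    if self._previous is not None and timestamp <= self._previous:
      timestamp = self._previous + 1
    self._previous = timestamp
    return timestamp

# Example:
#   ts = MonotonicTimestamper()
#   [ts.get(t) for t in [100, 100, 90, 200]]  # -> [100, 101, 102, 200]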
timestamper = Timestamper() for (changeset, time_range) in changeset_graph.consume_graph(): timestamp = timestamper.get( time_range.t_max, changeset.id in symbol_changeset_ids ) yield (changeset, timestamp) changeset_graph.close() def run(self, run_options, stats_keeper): logger.quiet("Generating CVSRevisions in commit order...") Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._symbol_db = SymbolDatabase() Ctx()._cvs_items_db = IndexedCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_STORE), artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_INDEX_TABLE), DB_OPEN_READ) sorted_changesets = open( artifact_manager.get_temp_file(config.CHANGESETS_SORTED_DATAFILE), 'w') for (changeset, timestamp) in self.get_changesets(): sorted_changesets.write('%x %08x\n' % (changeset.id, timestamp,)) sorted_changesets.close() Ctx()._cvs_items_db.close() Ctx()._symbol_db.close() Ctx()._cvs_path_db.close() logger.quiet("Done") class CreateRevsPass(Pass): """Generate the SVNCommit <-> CVSRevision mapping databases. SVNCommitCreator also calls SymbolingsLogger to register CVSRevisions that represent an opening or closing for a path on a branch or tag. See SymbolingsLogger for more details. This pass was formerly known as pass5.""" def register_artifacts(self): self._register_temp_file(config.SVN_COMMITS_INDEX_TABLE) self._register_temp_file(config.SVN_COMMITS_STORE) self._register_temp_file(config.CVS_REVS_TO_SVN_REVNUMS) self._register_temp_file(config.SYMBOL_OPENINGS_CLOSINGS) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.CVS_PATHS_DB) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_STORE) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_INDEX_TABLE) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.CHANGESETS_ALLBROKEN_STORE) self._register_temp_file_needed(config.CHANGESETS_ALLBROKEN_INDEX) self._register_temp_file_needed(config.CHANGESETS_SORTED_DATAFILE) def get_changesets(self): """Generate (changeset,timestamp,) tuples in commit order.""" changeset_db = ChangesetDatabase( artifact_manager.get_temp_file(config.CHANGESETS_ALLBROKEN_STORE), artifact_manager.get_temp_file(config.CHANGESETS_ALLBROKEN_INDEX), DB_OPEN_READ) for line in file( artifact_manager.get_temp_file( config.CHANGESETS_SORTED_DATAFILE)): [changeset_id, timestamp] = [int(s, 16) for s in line.strip().split()] yield (changeset_db[changeset_id], timestamp) changeset_db.close() def get_svn_commits(self, creator): """Generate the SVNCommits, in order.""" for (changeset, timestamp) in self.get_changesets(): for svn_commit in creator.process_changeset(changeset, timestamp): yield svn_commit def log_svn_commit(self, svn_commit): """Output information about SVN_COMMIT.""" logger.normal( 'Creating Subversion r%d (%s)' % (svn_commit.revnum, svn_commit.get_description(),) ) if isinstance(svn_commit, SVNRevisionCommit): for cvs_rev in svn_commit.cvs_revs: logger.verbose(' %s %s' % (cvs_rev.cvs_path, cvs_rev.rev,)) def run(self, run_options, stats_keeper): logger.quiet("Mapping CVS revisions to Subversion commits...") Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._symbol_db = SymbolDatabase() Ctx()._cvs_items_db = IndexedCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_STORE), 
artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_INDEX_TABLE), DB_OPEN_READ) Ctx()._symbolings_logger = SymbolingsLogger() persistence_manager = PersistenceManager(DB_OPEN_NEW) creator = SVNCommitCreator() for svn_commit in self.get_svn_commits(creator): self.log_svn_commit(svn_commit) persistence_manager.put_svn_commit(svn_commit) stats_keeper.set_svn_rev_count(creator.revnum_generator.get_last_id()) del creator persistence_manager.close() Ctx()._symbolings_logger.close() Ctx()._cvs_items_db.close() Ctx()._symbol_db.close() Ctx()._cvs_path_db.close() logger.quiet("Done") class SortSymbolOpeningsClosingsPass(Pass): """This pass was formerly known as pass6.""" def register_artifacts(self): self._register_temp_file(config.SYMBOL_OPENINGS_CLOSINGS_SORTED) self._register_temp_file_needed(config.SYMBOL_OPENINGS_CLOSINGS) def run(self, run_options, stats_keeper): logger.quiet("Sorting symbolic name source revisions...") def sort_key(line): line = line.split(' ', 2) return (int(line[0], 16), int(line[1]), line[2],) sort_file( artifact_manager.get_temp_file(config.SYMBOL_OPENINGS_CLOSINGS), artifact_manager.get_temp_file( config.SYMBOL_OPENINGS_CLOSINGS_SORTED ), key=sort_key, tempdirs=[Ctx().tmpdir], ) logger.quiet("Done") class IndexSymbolsPass(Pass): """This pass was formerly known as pass7.""" def register_artifacts(self): self._register_temp_file(config.SYMBOL_OFFSETS_DB) self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.SYMBOL_OPENINGS_CLOSINGS_SORTED) def generate_offsets_for_symbolings(self): """This function iterates through all the lines in SYMBOL_OPENINGS_CLOSINGS_SORTED, writing out a file mapping SYMBOLIC_NAME to the file offset in SYMBOL_OPENINGS_CLOSINGS_SORTED where SYMBOLIC_NAME is first encountered. 
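# Illustrative sketch (not part of cvs2svn): the "remember the offset of the
# first line for each key" pattern described above, applied to any sorted
# text file whose lines start with a key.  The real implementation follows
# below; this generic version just shows why f.tell() is sampled *before*
# each readline().

def index_first_offsets(path):
  """Return a map {key : byte offset of the first line starting with key}."""

  offsets = {}
  f = open(path, 'r')
  try:
    while True:
      fpos = f.tell()
      line = f.readline()
      if not line:
        break
      key = line.split(' ', 1)[0]
      if key not in offsets:
        offsets[key] = fpos
  finally:
    f.close()
  return offsets

# A reader can later f.seek(offsets[key]) and read lines sequentially until
# the key changes, instead of scanning the whole file.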
This will allow us to seek to the various offsets in the file and sequentially read only the openings and closings that we need.""" offsets = {} f = open( artifact_manager.get_temp_file( config.SYMBOL_OPENINGS_CLOSINGS_SORTED), 'r') old_id = None while True: fpos = f.tell() line = f.readline() if not line: break id, svn_revnum, ignored = line.split(" ", 2) id = int(id, 16) if id != old_id: logger.verbose(' ', Ctx()._symbol_db.get_symbol(id).name) old_id = id offsets[id] = fpos f.close() offsets_db = file( artifact_manager.get_temp_file(config.SYMBOL_OFFSETS_DB), 'wb') cPickle.dump(offsets, offsets_db, -1) offsets_db.close() def run(self, run_options, stats_keeper): logger.quiet("Determining offsets for all symbolic names...") Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._symbol_db = SymbolDatabase() self.generate_offsets_for_symbolings() Ctx()._symbol_db.close() logger.quiet("Done.") class OutputPass(Pass): """This pass was formerly known as pass8.""" def register_artifacts(self): self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.CVS_PATHS_DB) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_STORE) self._register_temp_file_needed(config.CVS_ITEMS_SORTED_INDEX_TABLE) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.METADATA_CLEAN_INDEX_TABLE) self._register_temp_file_needed(config.METADATA_CLEAN_STORE) self._register_temp_file_needed(config.SVN_COMMITS_INDEX_TABLE) self._register_temp_file_needed(config.SVN_COMMITS_STORE) self._register_temp_file_needed(config.CVS_REVS_TO_SVN_REVNUMS) Ctx().output_option.register_artifacts(self) def run(self, run_options, stats_keeper): Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) Ctx()._metadata_db = MetadataDatabase( artifact_manager.get_temp_file(config.METADATA_CLEAN_STORE), artifact_manager.get_temp_file(config.METADATA_CLEAN_INDEX_TABLE), DB_OPEN_READ, ) Ctx()._cvs_items_db = IndexedCVSItemStore( artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_STORE), artifact_manager.get_temp_file(config.CVS_ITEMS_SORTED_INDEX_TABLE), DB_OPEN_READ) Ctx()._symbol_db = SymbolDatabase() Ctx()._persistence_manager = PersistenceManager(DB_OPEN_READ) Ctx().output_option.setup(stats_keeper.svn_rev_count()) svn_revnum = 1 svn_commit = Ctx()._persistence_manager.get_svn_commit(svn_revnum) while svn_commit: svn_commit.output(Ctx().output_option) svn_revnum += 1 svn_commit = Ctx()._persistence_manager.get_svn_commit(svn_revnum) Ctx().output_option.cleanup() Ctx()._persistence_manager.close() Ctx()._symbol_db.close() Ctx()._cvs_items_db.close() Ctx()._metadata_db.close() Ctx()._cvs_path_db.close() # The list of passes constituting a run of cvs2svn: passes = [ CollectRevsPass(), CleanMetadataPass(), CollateSymbolsPass(), #CheckItemStoreDependenciesPass(config.CVS_ITEMS_STORE), FilterSymbolsPass(), SortRevisionsPass(), SortSymbolsPass(), InitializeChangesetsPass(), #CheckIndexedItemStoreDependenciesPass( # config.CVS_ITEMS_SORTED_STORE, # config.CVS_ITEMS_SORTED_INDEX_TABLE), BreakRevisionChangesetCyclesPass(), RevisionTopologicalSortPass(), BreakSymbolChangesetCyclesPass(), BreakAllChangesetCyclesPass(), TopologicalSortPass(), CreateRevsPass(), SortSymbolOpeningsClosingsPass(), IndexSymbolsPass(), OutputPass(), ] cvs2svn-2.4.0/cvs2svn_lib/generate_blobs.py0000775000076500007650000002027211710517256021775 0ustar mhaggermhagger00000000000000#!/usr/bin/env 
python -u # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2009-2010 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Generate git blobs directly from RCS files. Usage: generate_blobs.py BLOBFILE To standard input should be written a series of pickles, each of which contains the following tuple: (RCSFILE, {CVS_REV : MARK, ...}) indicating which RCS file to read, which CVS revisions should be written to the blob file, and which marks to give each of the blobs. Since the tuples are read from stdin, either the calling program has to write to this program's stdin in binary mode and ensure that this program's standard input is opened in binary mode (e.g., using Python's '-u' option) or both can be in text mode *provided* that pickle protocol 0 is used. The program does most of its work in RAM, keeping at most one revision fulltext and one revision deltatext (plus perhaps one or two copies as scratch space) in memory at a time. But there are times when the fulltext of a revision is needed multiple times, for example when multiple branches sprout from the revision. In these cases, the fulltext is written to disk. If the fulltext is also needed for the blobfile, then the copy in the blobfils is read again when it is needed. If the fulltext is not needed in the blobfile, then it is written to a temporary file created with Python's tempfile module.""" import sys import os import tempfile import cPickle as pickle sys.path.insert(0, os.path.dirname(os.path.dirname(sys.argv[0]))) from cvs2svn_lib.rcsparser import Sink from cvs2svn_lib.rcsparser import parse from cvs2svn_lib.rcs_stream import RCSStream def read_marks(): # A map from CVS revision number (e.g., 1.2.3.4) to mark: marks = {} for l in sys.stdin: [rev, mark] = l.strip().split() marks[rev] = mark return marks class RevRecord(object): def __init__(self, rev, mark=None): self.rev = rev self.mark = mark # The rev whose fulltext is the base for this one's delta. self.base = None # Other revs that refer to this one as their base text: self.refs = set() # The (f, offset, length) where the fulltext of this revision can # be found: self.fulltext = None def is_needed(self): return bool(self.mark is not None or self.refs) def is_written(self): return self.fulltext is not None def write_blob(self, f, text): f.seek(0, 2) length = len(text) f.write('blob\n') f.write('mark :%s\n' % (self.mark,)) f.write('data %d\n' % (length,)) offset = f.tell() f.write(text) f.write('\n') self.fulltext = (f, offset, length) # This record (with its mark) has now been written, so the mark is # no longer needed. 
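# Illustrative sketch (not part of cvs2svn): how a driver process could feed
# work to this script, following the stdin protocol described in the module
# docstring above.  The file names and marks are made up, and the real
# cvs2git driver is more careful about buffering and error handling; this
# only demonstrates the (RCSFILE, {CVS_REV : MARK, ...}) pickle stream.

import subprocess
import sys
import cPickle as pickle

def send_jobs(jobs, blobfilename):
  """JOBS is a sequence of (rcsfile, {cvs_rev : mark}) tuples."""

  # '-u' keeps the child's standard input unbuffered/binary, matching the
  # note above about binary mode vs. pickle protocol 0:
  pipe = subprocess.Popen(
      [sys.executable, '-u', 'generate_blobs.py', blobfilename],
      stdin=subprocess.PIPE,
      )
  for job in jobs:
    pickle.dump(job, pipe.stdin, -1)
  pipe.stdin.close()
  return pipe.wait()

# Example (hypothetical paths and marks):
#   send_jobs([('/cvsroot/proj/foo.c,v', {'1.1' : 1, '1.2' : 2})], 'blobs.dat')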
Setting it to None might allow is_needed() to # become False: self.mark = None def write(self, f, text): f.seek(0, 2) offset = f.tell() length = len(text) f.write(text) self.fulltext = (f, offset, length) def read_fulltext(self): assert self.fulltext is not None (f, offset, length) = self.fulltext f.seek(offset) return f.read(length) def __str__(self): if self.mark is not None: return '%s (%r): %r, %s' % ( self.rev, self.mark, self.refs, self.fulltext is not None, ) else: return '%s: %r, %s' % (self.rev, self.refs, self.fulltext is not None) class WriteBlobSink(Sink): def __init__(self, blobfile, marks): self.blobfile = blobfile # A map {rev : RevRecord} for all of the revisions whose fulltext # will still be needed: self.revrecs = {} # The revisions that need marks will definitely be needed, so # create records for them now (the rest will be filled in while # reading the RCS file): for (rev, mark) in marks.items(): self.revrecs[rev] = RevRecord(rev, mark) # The RevRecord of the last fulltext that has been reconstructed, # if it still is_needed(): self.last_revrec = None # An RCSStream holding the fulltext of last_revrec: self.last_rcsstream = None # A file to temporarily hold the fulltexts of revisions for which # no blobs are needed: self.fulltext_file = tempfile.TemporaryFile() def __getitem__(self, rev): try: return self.revrecs[rev] except KeyError: revrec = RevRecord(rev) self.revrecs[rev] = revrec return revrec def define_revision(self, rev, timestamp, author, state, branches, next): revrec = self[rev] if next is not None: revrec.refs.add(next) revrec.refs.update(branches) for dependent_rev in revrec.refs: dependent_revrec = self[dependent_rev] assert dependent_revrec.base is None dependent_revrec.base = rev def tree_completed(self): """Remove unneeded RevRecords. Remove the RevRecords for any revisions whose fulltext will not be needed (neither as blob output nor as the base of another needed revision).""" revrecs_to_remove = [ revrec for revrec in self.revrecs.itervalues() if not revrec.is_needed() ] while revrecs_to_remove: revrec = revrecs_to_remove.pop() del self.revrecs[revrec.rev] base_revrec = self[revrec.base] base_revrec.refs.remove(revrec.rev) if not base_revrec.is_needed(): revrecs_to_remove.append(base_revrec) def set_revision_info(self, rev, log, text): revrec = self.revrecs.get(rev) if revrec is None: return base_rev = revrec.base if base_rev is None: # This must be the last revision on trunk, for which the # fulltext is stored directly in the RCS file: assert self.last_revrec is None if revrec.mark is not None: revrec.write_blob(self.blobfile, text) if revrec.is_needed(): self.last_revrec = revrec self.last_rcsstream = RCSStream(text) elif self.last_revrec is not None and base_rev == self.last_revrec.rev: # Our base revision is stored in self.last_rcsstream. self.last_revrec.refs.remove(rev) if self.last_revrec.is_needed(): if not self.last_revrec.is_written(): self.last_revrec.write( self.fulltext_file, self.last_rcsstream.get_text() ) self.last_rcsstream.apply_diff(text) if revrec.mark is not None: revrec.write_blob(self.blobfile, self.last_rcsstream.get_text()) if revrec.is_needed(): self.last_revrec = revrec else: self.last_revrec = None self.last_rcsstream = None else: # Our base revision is not stored in self.last_rcsstream; it # will have to be obtained from elsewhere. 
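# Illustrative sketch (not part of cvs2svn): the pruning rule implemented by
# tree_completed() above, reduced to plain dictionaries.  A revision's
# fulltext is needed only if it gets a blob itself or if some needed
# revision uses it as its delta base; removals propagate towards the base
# revisions, just as in the loop above.

def needed_fulltexts(bases, needs_blob):
  """Return the set of revisions whose fulltext must be reconstructed.

  BASES maps rev -> the rev its delta is based on (None for the revision
  whose fulltext is stored directly); NEEDS_BLOB is the set of revs that
  must be written as blobs."""

  refs = {}
  for (rev, base) in bases.items():
    refs.setdefault(rev, set())
    if base is not None:
      refs.setdefault(base, set()).add(rev)

  keep = set(refs)
  to_remove = [rev for rev in refs if rev not in needs_blob and not refs[rev]]
  while to_remove:
    rev = to_remove.pop()
    keep.discard(rev)
    base = bases.get(rev)
    if base is not None:
      refs[base].discard(rev)
      if base not in needs_blob and not refs[base]:
        to_remove.append(base)
  return keep

# Example: on trunk the head (1.3) holds the fulltext and older revisions
# are deltas against their successors.  If only 1.3 needs a blob, 1.1 and
# 1.2 never have to be reconstructed:
#   needed_fulltexts({'1.3': None, '1.2': '1.3', '1.1': '1.2'},
#                    needs_blob=set(['1.3']))  # -> set(['1.3'])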
# Store the old last_rcsstream if necessary: if self.last_revrec is not None: if not self.last_revrec.is_written(): self.last_revrec.write( self.fulltext_file, self.last_rcsstream.get_text() ) self.last_revrec = None self.last_rcsstream = None base_revrec = self[base_rev] rcsstream = RCSStream(base_revrec.read_fulltext()) base_revrec.refs.remove(rev) rcsstream.apply_diff(text) if revrec.mark is not None: revrec.write_blob(self.blobfile, rcsstream.get_text()) if revrec.is_needed(): self.last_revrec = revrec self.last_rcsstream = rcsstream del rcsstream def parse_completed(self): self.fulltext_file.close() def main(args): [blobfilename] = args blobfile = open(blobfilename, 'w+b') while True: try: (rcsfile, marks) = pickle.load(sys.stdin) except EOFError: break parse(open(rcsfile, 'rb'), WriteBlobSink(blobfile, marks)) blobfile.close() if __name__ == '__main__': main(sys.argv[1:]) cvs2svn-2.4.0/cvs2svn_lib/log.py0000664000076500007650000001016511434364604017600 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains a simple logging facility for cvs2svn.""" import sys import time import threading class _Log: """A Simple logging facility. If self.log_level is DEBUG or higher, each line will be timestamped with the number of wall-clock seconds since the time when this module was first imported. The public methods of this class are thread-safe.""" # These constants represent the log levels that this class supports. # The increase_verbosity() and decrease_verbosity() methods rely on # these constants being consecutive integers: ERROR = -2 WARN = -1 QUIET = 0 NORMAL = 1 VERBOSE = 2 DEBUG = 3 start_time = time.time() def __init__(self): self.log_level = _Log.NORMAL # The output file to use for errors: self._err = sys.stderr # The output file to use for lower-priority messages: self._out = sys.stdout # Lock to serialize writes to the log: self.lock = threading.Lock() def increase_verbosity(self): self.lock.acquire() try: self.log_level = min(self.log_level + 1, _Log.DEBUG) finally: self.lock.release() def decrease_verbosity(self): self.lock.acquire() try: self.log_level = max(self.log_level - 1, _Log.ERROR) finally: self.lock.release() def is_on(self, level): """Return True iff messages at the specified LEVEL are currently on. LEVEL should be one of the constants _Log.WARN, _Log.QUIET, etc.""" return self.log_level >= level def _timestamp(self): """Return a timestamp if needed, as a string with a trailing space.""" retval = [] if self.log_level >= _Log.DEBUG: retval.append('%f: ' % (time.time() - self.start_time,)) return ''.join(retval) def _write(self, out, *args): """Write a message to OUT. If there are multiple ARGS, they will be separated by spaces. 
If there are multiple lines, they will be output one by one with the same timestamp prefix.""" timestamp = self._timestamp() s = ' '.join(map(str, args)) lines = s.split('\n') if lines and not lines[-1]: del lines[-1] self.lock.acquire() try: for s in lines: out.write('%s%s\n' % (timestamp, s,)) # Ensure that log output doesn't get out-of-order with respect to # stderr output. out.flush() finally: self.lock.release() def write(self, *args): """Write a message to SELF._out. This is a public method to use for writing to the output log unconditionally.""" self._write(self._out, *args) def error(self, *args): """Log a message at the ERROR level.""" if self.is_on(_Log.ERROR): self._write(self._err, *args) def warn(self, *args): """Log a message at the WARN level.""" if self.is_on(_Log.WARN): self._write(self._out, *args) def quiet(self, *args): """Log a message at the QUIET level.""" if self.is_on(_Log.QUIET): self._write(self._out, *args) def normal(self, *args): """Log a message at the NORMAL level.""" if self.is_on(_Log.NORMAL): self._write(self._out, *args) def verbose(self, *args): """Log a message at the VERBOSE level.""" if self.is_on(_Log.VERBOSE): self._write(self._out, *args) def debug(self, *args): """Log a message at the DEBUG level.""" if self.is_on(_Log.DEBUG): self._write(self._out, *args) # Create an instance that everybody can use: logger = _Log() cvs2svn-2.4.0/cvs2svn_lib/fill_source.py0000664000076500007650000001476311244045075021332 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains classes describing the sources of symbol fills.""" from cvs2svn_lib.common import InternalError from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import SVN_INVALID_REVNUM from cvs2svn_lib.svn_revision_range import SVNRevisionRange from cvs2svn_lib.svn_revision_range import RevisionScores class FillSource: """Representation of a fill source. A FillSource keeps track of the paths that have to be filled in a particular symbol fill. This class holds a SVNRevisionRange instance for each CVSFile that has to be filled within the subtree of the repository rooted at self.cvs_path. The SVNRevisionRange objects are stored in a tree in which the directory nodes are dictionaries mapping CVSPaths to subnodes and the leaf nodes are the SVNRevisionRange objects telling for what source_lod and what range of revisions the leaf could serve as a source. FillSource objects are able to compute the score for arbitrary source LODs and source revision numbers. These objects are used by the symbol filler in SVNOutputOption.""" def __init__(self, cvs_path, symbol, node_tree): """Create a fill source. The best LOD and SVN REVNUM to use as the copy source can be determined by calling compute_best_source(). 
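# Illustrative sketch (not part of cvs2svn): the shape of the node tree
# described above, using plain strings for CVSPaths and (source_lod, revnum)
# tuples standing in for the SVNRevisionRange leaves.  collect_leaves()
# mirrors what _get_revision_ranges() does further down when all the ranges
# are gathered for scoring.

example_node_tree = {
    'proj': {
        'proj/foo.c': ('trunk', 7),
        'proj/sub': {
            'proj/sub/bar.c': ('trunk', 9),
            },
        },
    }

def collect_leaves(node):
  """Return all leaf entries at or under NODE (duplicates included)."""

  if not isinstance(node, dict):
    return [node]
  leaves = []
  for subnode in node.values():
    leaves.extend(collect_leaves(subnode))
  return leaves

# collect_leaves(example_node_tree) yields [('trunk', 7), ('trunk', 9)] in
# some order; the real code scores these ranges to pick the best copy source.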
Members: cvs_path -- (CVSPath): the CVSPath described by this FillSource. _symbol -- (Symbol) the symbol to be filled. _node_tree -- (dict) a tree stored as a map { CVSPath : node }, where subnodes have the same form. Leaves are SVNRevisionRange instances telling the source_lod and range of SVN revision numbers from which the CVSPath can be copied. """ self.cvs_path = cvs_path self._symbol = symbol self._node_tree = node_tree def _set_node(self, cvs_file, svn_revision_range): parent_node = self._get_node(cvs_file.parent_directory, create=True) if cvs_file in parent_node: raise InternalError( '%s appeared twice in sources for %s' % (cvs_file, self._symbol) ) parent_node[cvs_file] = svn_revision_range def _get_node(self, cvs_path, create=False): if cvs_path == self.cvs_path: return self._node_tree else: parent_node = self._get_node(cvs_path.parent_directory, create=create) try: return parent_node[cvs_path] except KeyError: if create: node = {} parent_node[cvs_path] = node return node else: raise def compute_best_source(self, preferred_source): """Determine the best source_lod and subversion revision number to copy. Return the best source found, as an SVNRevisionRange instance. If PREFERRED_SOURCE is not None and its opening is among the sources with the best scores, return it; otherwise, return the oldest such revision on the first such source_lod (ordered by the natural LOD sort order). The return value's source_lod is the best LOD to copy from, and its opening_revnum is the best SVN revision.""" # Aggregate openings and closings from our rev tree svn_revision_ranges = self._get_revision_ranges(self._node_tree) # Score the lists revision_scores = RevisionScores(svn_revision_ranges) best_source_lod, best_revnum, best_score = \ revision_scores.get_best_revnum() if ( preferred_source is not None and revision_scores.get_score(preferred_source) == best_score ): best_source_lod = preferred_source.source_lod best_revnum = preferred_source.opening_revnum if best_revnum == SVN_INVALID_REVNUM: raise FatalError( "failed to find a revision to copy from when copying %s" % self._symbol.name ) return SVNRevisionRange(best_source_lod, best_revnum) def _get_revision_ranges(self, node): """Return a list of all the SVNRevisionRanges at and under NODE. Include duplicates. This is a helper method used by compute_best_source().""" if isinstance(node, SVNRevisionRange): # It is a leaf node. return [ node ] else: # It is an intermediate node. revision_ranges = [] for key, subnode in node.items(): revision_ranges.extend(self._get_revision_ranges(subnode)) return revision_ranges def get_subsources(self): """Generate (CVSPath, FillSource) for all direct subsources.""" if not isinstance(self._node_tree, SVNRevisionRange): for cvs_path, node in self._node_tree.items(): fill_source = FillSource(cvs_path, self._symbol, node) yield (cvs_path, fill_source) def get_subsource_map(self): """Return the map {CVSPath : FillSource} of direct subsources.""" src_entries = {} for (cvs_path, fill_subsource) in self.get_subsources(): src_entries[cvs_path] = fill_subsource return src_entries def __str__(self): """For convenience only. The format is subject to change at any time.""" return '%s(%s:%s)' % ( self.__class__.__name__, self._symbol, self.cvs_path, ) def __repr__(self): """For convenience only. The format is subject to change at any time.""" return '%s%r' % (self, self._node_tree,) def get_source_set(symbol, range_map): """Return a FillSource describing the fill sources for RANGE_MAP. SYMBOL is either a Branch or a Tag. 
RANGE_MAP is a map { CVSSymbol : SVNRevisionRange } as returned by SymbolingsReader.get_range_map(). Use the SVNRevisionRanges from RANGE_MAP to create a FillSource instance describing the sources for filling SYMBOL.""" root_cvs_directory = symbol.project.get_root_cvs_directory() fill_source = FillSource(root_cvs_directory, symbol, {}) for cvs_symbol, svn_revision_range in range_map.items(): fill_source._set_node(cvs_symbol.cvs_file, svn_revision_range) return fill_source cvs2svn-2.4.0/cvs2svn_lib/git_run_options.py0000664000076500007650000001565311710517256022250 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module manages cvs2git run options.""" from cvs2svn_lib.common import FatalError from cvs2svn_lib.context import Ctx from cvs2svn_lib.dvcs_common import DVCSRunOptions from cvs2svn_lib.run_options import ContextOption from cvs2svn_lib.run_options import IncompatibleOption from cvs2svn_lib.run_options import not_both from cvs2svn_lib.revision_manager import NullRevisionCollector from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader from cvs2svn_lib.git_revision_collector import GitRevisionCollector from cvs2svn_lib.external_blob_generator import ExternalBlobGenerator from cvs2svn_lib.output_option import NullOutputOption from cvs2svn_lib.git_output_option import GitRevisionMarkWriter from cvs2svn_lib.git_output_option import GitOutputOption class GitRunOptions(DVCSRunOptions): short_desc = 'convert a cvs repository into a git repository' synopsis = """\ .B cvs2git [\\fIOPTION\\fR]... \\fIOUTPUT-OPTIONS CVS-REPOS-PATH\\fR .br .B cvs2git [\\fIOPTION\\fR]... \\fI--options=PATH\\fR """ long_desc = """\ Create a new git repository based on the version history stored in a CVS repository. Each CVS commit will be mirrored in the git repository, including such information as date of commit and id of the committer. .P The output of this program are a "blobfile" and a "dumpfile", which together can be loaded into a git repository using "git fast-import". .P \\fICVS-REPOS-PATH\\fR is the filesystem path of the part of the CVS repository that you want to convert. This path doesn't have to be the top level directory of a CVS repository; it can point at a project within a repository, in which case only that project will be converted. This path or one of its parent directories has to contain a subdirectory called CVSROOT (though the CVSROOT directory can be empty). .P It is not possible directly to convert a CVS repository to which you only have remote access, but the FAQ describes tools that may be used to create a local copy of a remote CVS repository. """ files = """\ A directory called \\fIcvs2svn-tmp\\fR (or the directory specified by \\fB--tmpdir\\fR) is used as scratch space for temporary data files. 
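# Illustrative sketch (not part of cvs2svn): loading the blobfile and
# dumpfile produced by cvs2git into a git repository with "git fast-import",
# as mentioned in the description above.  The paths are hypothetical; the
# blob data is fed first so that every mark is defined before the revision
# data refers to it.

import subprocess

def fast_import(git_dir, blobfile, dumpfile):
  """Stream BLOBFILE followed by DUMPFILE into 'git fast-import'."""

  importer = subprocess.Popen(
      ['git', 'fast-import'], cwd=git_dir, stdin=subprocess.PIPE,
      )
  for path in (blobfile, dumpfile):
    f = open(path, 'rb')
    try:
      while True:
        chunk = f.read(65536)
        if not chunk:
          break
        importer.stdin.write(chunk)
    finally:
      f.close()
  importer.stdin.close()
  return importer.wait()

# Typical usage, after initializing an empty repository in git_dir:
#   fast_import('/path/to/new/repo', 'git-blob.dat', 'git-dump.dat')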
""" see_also = [ ('cvs', '1'), ('git', '1'), ('git-fast-import', '1'), ] def _get_output_options_group(self): group = super(GitRunOptions, self)._get_output_options_group() group.add_option(IncompatibleOption( '--blobfile', type='string', action='store', help='path to which the "blob" data should be written', man_help=( 'Write the "blob" data (containing revision contents) to ' '\\fIpath\\fR.' ), metavar='PATH', )) group.add_option(IncompatibleOption( '--dumpfile', type='string', action='store', help='path to which the revision data should be written', man_help=( 'Write the revision data (branches and commits) to \\fIpath\\fR.' ), metavar='PATH', )) group.add_option(ContextOption( '--dry-run', action='store_true', help=( 'do not create any output; just print what would happen.' ), man_help=( 'Do not create any output; just print what would happen.' ), )) return group def _get_extraction_options_group(self): group = super(GitRunOptions, self)._get_extraction_options_group() self._add_use_cvs_option(group) self._add_use_rcs_option(group) self.parser.set_default('use_external_blob_generator', False) group.add_option(IncompatibleOption( '--use-external-blob-generator', action='store_true', help=( 'Use an external Python program to extract file revision ' 'contents (much faster than --use-rcs or --use-cvs but ' 'leaves keywords unexpanded and requires a separate, ' 'seekable blob file to write to in parallel to the main ' 'cvs2git script.' ), man_help=( 'Use an external Python program to extract the file revision ' 'contents from the RCS files and output them to the blobfile. ' 'This option is much faster than \\fB--use-rcs\\fR or ' '\\fB--use-cvs\\fR but leaves keywords unexpanded and requires ' 'a separate, seekable blob file to write to in parallel to the ' 'main cvs2git script.' ), )) return group def process_extraction_options(self): """Process options related to extracting data from the CVS repository.""" ctx = Ctx() options = self.options not_both(options.use_rcs, '--use-rcs', options.use_cvs, '--use-cvs') not_both(options.use_external_blob_generator, '--use-external-blob-generator', options.use_cvs, '--use-cvs') not_both(options.use_external_blob_generator, '--use-external-blob-generator', options.use_rcs, '--use-rcs') # cvs2git never needs a revision reader: ctx.revision_reader = None if ctx.dry_run: ctx.revision_collector = NullRevisionCollector() return if not (options.blobfile and options.dumpfile): raise FatalError("must pass '--blobfile' and '--dumpfile' options.") if options.use_external_blob_generator: ctx.revision_collector = ExternalBlobGenerator(options.blobfile) else: if options.use_rcs: revision_reader = RCSRevisionReader( co_executable=options.co_executable ) else: # --use-cvs is the default: revision_reader = CVSRevisionReader( cvs_executable=options.cvs_executable ) ctx.revision_collector = GitRevisionCollector( options.blobfile, revision_reader, ) def process_output_options(self): """Process options related to fastimport output.""" ctx = Ctx() if ctx.dry_run: ctx.output_option = NullOutputOption() else: ctx.output_option = GitOutputOption( self.options.dumpfile, GitRevisionMarkWriter(), # Optional map from CVS author names to git author names: author_transforms={}, # FIXME ) cvs2svn-2.4.0/cvs2svn_lib/check_dependencies_pass.py0000664000076500007650000001161111434364604023625 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. 
# # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module defines some passes that can be used for debugging cv2svn.""" from cvs2svn_lib import config from cvs2svn_lib.context import Ctx from cvs2svn_lib.common import FatalException from cvs2svn_lib.common import DB_OPEN_READ from cvs2svn_lib.log import logger from cvs2svn_lib.pass_manager import Pass from cvs2svn_lib.project import read_projects from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.cvs_path_database import CVSPathDatabase from cvs2svn_lib.symbol_database import SymbolDatabase from cvs2svn_lib.cvs_item_database import OldCVSItemStore from cvs2svn_lib.cvs_item_database import IndexedCVSItemStore class CheckDependenciesPass(Pass): """Check that the dependencies are self-consistent.""" def __init__(self): Pass.__init__(self) def register_artifacts(self): self._register_temp_file_needed(config.PROJECTS) self._register_temp_file_needed(config.SYMBOL_DB) self._register_temp_file_needed(config.CVS_PATHS_DB) def iter_cvs_items(self): raise NotImplementedError() def get_cvs_item(self, item_id): raise NotImplementedError() def run(self, run_options, stats_keeper): Ctx()._projects = read_projects( artifact_manager.get_temp_file(config.PROJECTS) ) Ctx()._cvs_path_db = CVSPathDatabase(DB_OPEN_READ) self.symbol_db = SymbolDatabase() Ctx()._symbol_db = self.symbol_db logger.quiet("Checking dependency consistency...") fatal_errors = [] for cvs_item in self.iter_cvs_items(): # Check that the pred_ids and succ_ids are mutually consistent: for pred_id in cvs_item.get_pred_ids(): pred = self.get_cvs_item(pred_id) if not cvs_item.id in pred.get_succ_ids(): fatal_errors.append( '%s lists pred=%s, but not vice versa.' % (cvs_item, pred,)) for succ_id in cvs_item.get_succ_ids(): succ = self.get_cvs_item(succ_id) if not cvs_item.id in succ.get_pred_ids(): fatal_errors.append( '%s lists succ=%s, but not vice versa.' % (cvs_item, succ,)) if fatal_errors: raise FatalException( 'Dependencies inconsistent:\n' '%s\n' 'Exited due to fatal error(s).' 
% ('\n'.join(fatal_errors),) ) self.symbol_db.close() self.symbol_db = None Ctx()._cvs_path_db.close() logger.quiet("Done") class CheckItemStoreDependenciesPass(CheckDependenciesPass): def __init__(self, cvs_items_store_file): CheckDependenciesPass.__init__(self) self.cvs_items_store_file = cvs_items_store_file def register_artifacts(self): CheckDependenciesPass.register_artifacts(self) self._register_temp_file_needed(self.cvs_items_store_file) def iter_cvs_items(self): cvs_item_store = OldCVSItemStore( artifact_manager.get_temp_file(self.cvs_items_store_file)) for cvs_file_items in cvs_item_store.iter_cvs_file_items(): self.current_cvs_file_items = cvs_file_items for cvs_item in cvs_file_items.values(): yield cvs_item del self.current_cvs_file_items cvs_item_store.close() def get_cvs_item(self, item_id): return self.current_cvs_file_items[item_id] class CheckIndexedItemStoreDependenciesPass(CheckDependenciesPass): def __init__(self, cvs_items_store_file, cvs_items_store_index_file): CheckDependenciesPass.__init__(self) self.cvs_items_store_file = cvs_items_store_file self.cvs_items_store_index_file = cvs_items_store_index_file def register_artifacts(self): CheckDependenciesPass.register_artifacts(self) self._register_temp_file_needed(self.cvs_items_store_file) self._register_temp_file_needed(self.cvs_items_store_index_file) def iter_cvs_items(self): return self.cvs_item_store.itervalues() def get_cvs_item(self, item_id): return self.cvs_item_store[item_id] def run(self, run_options, stats_keeper): self.cvs_item_store = IndexedCVSItemStore( artifact_manager.get_temp_file(self.cvs_items_store_file), artifact_manager.get_temp_file(self.cvs_items_store_index_file), DB_OPEN_READ) CheckDependenciesPass.run(self, run_options, stats_keeper) self.cvs_item_store.close() self.cvs_item_store = None cvs2svn-2.4.0/cvs2svn_lib/repository_mirror.py0000664000076500007650000007405011710517256022633 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains the RepositoryMirror class and supporting classes. RepositoryMirror represents the skeleton of a versioned file tree with multiple lines of development ('LODs'). It records the presence or absence of files and directories, but not their contents. Given three values (revnum, lod, cvs_path), it can tell you whether the specified CVSPath existed on the specified LOD in the given revision number. The file trees corresponding to the most recent revision can be modified. The individual file trees are stored using immutable tree structures. 
Each directory node is represented as a MirrorDirectory instance, which is basically a map {cvs_path : node_id}, where cvs_path is a CVSPath within the directory, and node_id is an integer ID that uniquely identifies another directory node if that node is a CVSDirectory, or None if that node is a CVSFile. If a directory node is to be modified, then first a new node is created with a copy of the original node's contents, then the copy is modified. A reference to the copy also has to be stored in the parent node, meaning that the parent node needs to be modified, and so on recursively to the root node of the file tree. This data structure allows cheap deep copies, which is useful for tagging and branching. The class must also be able to find the root directory node corresponding to a particular (revnum, lod). This is done by keeping an LODHistory instance for each LOD, which can determine the root directory node ID for that LOD for any revnum. It does so by recording changes to the root directory node ID only for revisions in which it changed. Thus it stores two arrays, revnums (a list of the revision numbers when the ID changed), and ids (a list of the corresponding IDs). To find the ID for a particular revnum, first a binary search is done in the revnums array to find the index of the last change preceding revnum, then the corresponding ID is read from the ids array. Since most revisions change only one LOD, this allows storage of the history of potentially tens of thousands of LODs over hundreds of thousands of revisions in an amount of space that scales as O(numberOfLODs + numberOfRevisions), rather than O(numberOfLODs * numberOfRevisions) as would be needed if the information were stored in the equivalent of a 2D array. The internal operation of these classes is somewhat intricate, but the interface attempts to hide the complexity, enforce the usage rules, and allow efficient access. The most important facts to remember are (1) that a directory node can be used for multiple purposes (for multiple branches and for multiple revisions on a single branch), (2) that only a node that has been created within the current revision is allowed to be mutated, and (3) that the current revision can include nodes carried over from prior revisions, which are immutable. This leads to a bewildering variety of MirrorDirectory classes. The most important distinction is between OldMirrorDirectories and CurrentMirrorDirectories. A single node can be represented multiple ways in memory at the same time, depending on whether it was looked up as part of the current revision or part of an old revision: MirrorDirectory -- the base class for all MirrorDirectory nodes. This class allows lookup of subnodes and iteration over subnodes. OldMirrorDirectory -- a MirrorDirectory that was looked up for an old revision. These instances are immutable, as only the current revision is allowed to be modified. CurrentMirrorDirectory -- a MirrorDirectory that was looked up for the current revision. Such an instance is always logically mutable, though mutating it might require the node to be copied first. Such an instance might represent a node that has already been copied during this revision and can therefore be modified freely (such nodes implement _WritableMirrorDirectoryMixin), or it might represent a node that was carried over from an old revision and hasn't been copied yet (such nodes implement _ReadOnlyMirrorDirectoryMixin). 
If the latter, then the node copies itself (and bubbles up the change) before allowing itself to be modified. But the distinction is managed internally; client classes should not have to worry about it. CurrentMirrorLODDirectory -- A CurrentMirrorDirectory representing the root directory of a line of development in the current revision. This class has two concrete subclasses, _CurrentMirrorReadOnlyLODDirectory and _CurrentMirrorWritableLODDirectory, depending on whether the node has already been copied during this revision. CurrentMirrorSubdirectory -- A CurrentMirrorDirectory representing a subdirectory within a line of development's directory tree in the current revision. This class has two concrete subclasses, _CurrentMirrorReadOnlySubdirectory and _CurrentMirrorWritableSubdirectory, depending on whether the node has already been copied during this revision. DeletedCurrentMirrorDirectory -- a MirrorDirectory that has been deleted. Such an instance is disabled so that it cannot accidentally be used. While a revision is being processed, RepositoryMirror._new_nodes holds every writable CurrentMirrorDirectory instance (i.e., every node that has been created in the revision). Since these nodes are mutable, it is important that there be exactly one instance associated with each node; otherwise there would be problems keeping the instances synchronized. These are written to the database by RepositoryMirror.end_commit(). OldMirrorDirectory and read-only CurrentMirrorDirectory instances are *not* cached; they are recreated whenever they are referenced. There might be multiple instances referring to the same node. A read-only CurrentMirrorDirectory instance is mutated in place into a writable CurrentMirrorDirectory instance if it needs to be modified. FIXME: The rules for when a MirrorDirectory instance can continue to be used vs. when it has to be read again (because it has been modified indirectly and therefore copied) are confusing and error-prone. Probably the semantics should be changed. """ import bisect from cvs2svn_lib import config from cvs2svn_lib.common import DB_OPEN_NEW from cvs2svn_lib.common import InternalError from cvs2svn_lib.log import logger from cvs2svn_lib.context import Ctx from cvs2svn_lib.cvs_path import CVSFile from cvs2svn_lib.cvs_path import CVSDirectory from cvs2svn_lib.key_generator import KeyGenerator from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.serializer import MarshalSerializer from cvs2svn_lib.indexed_database import IndexedDatabase class RepositoryMirrorError(Exception): """An error related to the RepositoryMirror.""" pass class LODExistsError(RepositoryMirrorError): """The LOD already exists in the repository. Exception raised if an attempt is made to add an LOD to the repository mirror and that LOD already exists in the youngest revision of the repository.""" pass class PathExistsError(RepositoryMirrorError): """The path already exists in the repository. Exception raised if an attempt is made to add a path to the repository mirror and that path already exists in the youngest revision of the repository.""" pass class DeletedNodeReusedError(RepositoryMirrorError): """The MirrorDirectory has already been deleted and shouldn't be reused.""" pass class CopyFromCurrentNodeError(RepositoryMirrorError): """A CurrentMirrorDirectory cannot be copied to the current revision.""" pass class MirrorDirectory(object): """Represent a node within the RepositoryMirror. 
Instances of this class act like a map {CVSPath : MirrorDirectory}, where CVSPath is an item within this directory (i.e., a file or subdirectory within this directory). The value is either another MirrorDirectory instance (for directories) or None (for files).""" def __init__(self, repo, id, entries): # The RepositoryMirror containing this directory: self.repo = repo # The id of this node: self.id = id # The entries within this directory, stored as a map {CVSPath : # node_id}. The node_ids are integers for CVSDirectories, None # for CVSFiles: self._entries = entries def __getitem__(self, cvs_path): """Return the MirrorDirectory associated with the specified subnode. Return a MirrorDirectory instance if the subnode is a CVSDirectory; None if it is a CVSFile. Raise KeyError if the specified subnode does not exist.""" raise NotImplementedError() def __len__(self): """Return the number of CVSPaths within this node.""" return len(self._entries) def __contains__(self, cvs_path): """Return True iff CVS_PATH is contained in this node.""" return cvs_path in self._entries def __iter__(self): """Iterate over the CVSPaths within this node.""" return self._entries.__iter__() def _format_entries(self): """Format the entries map for output in subclasses' __repr__() methods.""" def format_item(key, value): if value is None: return str(key) else: return '%s -> %x' % (key, value,) items = self._entries.items() items.sort() return '{%s}' % (', '.join([format_item(*item) for item in items]),) def __str__(self): """For convenience only. The format is subject to change at any time.""" return '%s<%x>' % (self.__class__.__name__, self.id,) class OldMirrorDirectory(MirrorDirectory): """Represent a historical directory within the RepositoryMirror.""" def __getitem__(self, cvs_path): id = self._entries[cvs_path] if id is None: # This represents a leaf node. return None else: return OldMirrorDirectory(self.repo, id, self.repo._node_db[id]) def __repr__(self): """For convenience only. The format is subject to change at any time.""" return '%s(%s)' % (self, self._format_entries(),) class CurrentMirrorDirectory(MirrorDirectory): """Represent a directory that currently exists in the RepositoryMirror.""" def __init__(self, repo, id, lod, cvs_path, entries): MirrorDirectory.__init__(self, repo, id, entries) self.lod = lod self.cvs_path = cvs_path def __getitem__(self, cvs_path): id = self._entries[cvs_path] if id is None: # This represents a leaf node. return None else: try: return self.repo._new_nodes[id] except KeyError: return _CurrentMirrorReadOnlySubdirectory( self.repo, id, self.lod, cvs_path, self, self.repo._node_db[id] ) def __setitem__(self, cvs_path, node): """Create or overwrite a subnode of this node. CVS_PATH is the path of the subnode. NODE will be the new value of the node; for CVSDirectories it should be a MirrorDirectory instance; for CVSFiles it should be None.""" if isinstance(node, DeletedCurrentMirrorDirectory): raise DeletedNodeReusedError( '%r has already been deleted and should not be reused' % (node,) ) elif isinstance(node, CurrentMirrorDirectory): raise CopyFromCurrentNodeError( '%r was created in the current node and cannot be copied' % (node,) ) else: self._set_entry(cvs_path, node) def __delitem__(self, cvs_path): """Remove the subnode of this node at CVS_PATH. 
If the node does not exist, then raise a KeyError.""" node = self[cvs_path] self._del_entry(cvs_path) if isinstance(node, _WritableMirrorDirectoryMixin): node._mark_deleted() def mkdir(self, cvs_directory): """Create an empty subdirectory of this node at CVS_PATH. Return the CurrentDirectory that was created.""" assert isinstance(cvs_directory, CVSDirectory) if cvs_directory in self: raise PathExistsError( 'Attempt to create directory \'%s\' in %s in repository mirror ' 'when it already exists.' % (cvs_directory, self.lod,) ) new_node = _CurrentMirrorWritableSubdirectory( self.repo, self.repo._key_generator.gen_id(), self.lod, cvs_directory, self, {} ) self._set_entry(cvs_directory, new_node) self.repo._new_nodes[new_node.id] = new_node return new_node def add_file(self, cvs_file): """Create a file within this node at CVS_FILE.""" assert isinstance(cvs_file, CVSFile) if cvs_file in self: raise PathExistsError( 'Attempt to create file \'%s\' in %s in repository mirror ' 'when it already exists.' % (cvs_file, self.lod,) ) self._set_entry(cvs_file, None) def __repr__(self): """For convenience only. The format is subject to change at any time.""" return '%s(%r, %r, %s)' % ( self, self.lod, self.cvs_path, self._format_entries(), ) class DeletedCurrentMirrorDirectory(object): """A MirrorDirectory that has been deleted. A MirrorDirectory that used to be a _WritableMirrorDirectoryMixin but then was deleted. Such instances are turned into this class so that nobody can accidentally mutate them again.""" pass class _WritableMirrorDirectoryMixin: """Mixin for MirrorDirectories that are already writable. A MirrorDirectory is writable if it has already been recreated during the current revision.""" def _set_entry(self, cvs_path, node): """Create or overwrite a subnode of this node, with no checks.""" if node is None: self._entries[cvs_path] = None else: self._entries[cvs_path] = node.id def _del_entry(self, cvs_path): """Remove the subnode of this node at CVS_PATH, with no checks.""" del self._entries[cvs_path] def _mark_deleted(self): """Mark this object and any writable descendants as being deleted.""" self.__class__ = DeletedCurrentMirrorDirectory for (cvs_path, id) in self._entries.iteritems(): if id in self.repo._new_nodes: node = self[cvs_path] if isinstance(node, _WritableMirrorDirectoryMixin): # Mark deleted and recurse: node._mark_deleted() class _ReadOnlyMirrorDirectoryMixin: """Mixin for a CurrentMirrorDirectory that hasn't yet been made writable.""" def _make_writable(self): raise NotImplementedError() def _set_entry(self, cvs_path, node): """Create or overwrite a subnode of this node, with no checks.""" self._make_writable() self._set_entry(cvs_path, node) def _del_entry(self, cvs_path): """Remove the subnode of this node at CVS_PATH, with no checks.""" self._make_writable() self._del_entry(cvs_path) class CurrentMirrorLODDirectory(CurrentMirrorDirectory): """Represent an LOD's main directory in the mirror's current version.""" def __init__(self, repo, id, lod, entries): CurrentMirrorDirectory.__init__( self, repo, id, lod, lod.project.get_root_cvs_directory(), entries ) def delete(self): """Remove the directory represented by this object.""" lod_history = self.repo._get_lod_history(self.lod) assert lod_history.exists() lod_history.update(self.repo._youngest, None) self._mark_deleted() class _CurrentMirrorReadOnlyLODDirectory( CurrentMirrorLODDirectory, _ReadOnlyMirrorDirectoryMixin ): """Represent an LOD's main directory in the mirror's current version.""" def _make_writable(self): 
self.__class__ = _CurrentMirrorWritableLODDirectory # Create a new ID: self.id = self.repo._key_generator.gen_id() self.repo._new_nodes[self.id] = self self.repo._get_lod_history(self.lod).update(self.repo._youngest, self.id) self._entries = self._entries.copy() class _CurrentMirrorWritableLODDirectory( CurrentMirrorLODDirectory, _WritableMirrorDirectoryMixin ): pass class CurrentMirrorSubdirectory(CurrentMirrorDirectory): """Represent a subdirectory in the mirror's current version.""" def __init__(self, repo, id, lod, cvs_path, parent_mirror_dir, entries): CurrentMirrorDirectory.__init__(self, repo, id, lod, cvs_path, entries) self.parent_mirror_dir = parent_mirror_dir def delete(self): """Remove the directory represented by this object.""" del self.parent_mirror_dir[self.cvs_path] class _CurrentMirrorReadOnlySubdirectory( CurrentMirrorSubdirectory, _ReadOnlyMirrorDirectoryMixin ): """Represent a subdirectory in the mirror's current version.""" def _make_writable(self): self.__class__ = _CurrentMirrorWritableSubdirectory # Create a new ID: self.id = self.repo._key_generator.gen_id() self.repo._new_nodes[self.id] = self self.parent_mirror_dir._set_entry(self.cvs_path, self) self._entries = self._entries.copy() class _CurrentMirrorWritableSubdirectory( CurrentMirrorSubdirectory, _WritableMirrorDirectoryMixin ): pass class LODHistory(object): """The history of root nodes for a line of development. Members: _mirror -- (RepositoryMirror) the RepositoryMirror that manages this LODHistory. lod -- (LineOfDevelopment) the LOD described by this LODHistory. revnums -- (list of int) the revision numbers in which the id changed, in numerical order. ids -- (list of (int or None)) the ID of the node describing the root of this LOD starting at the corresponding revision number, or None if the LOD did not exist in that revision. To find the root id for a given revision number, a binary search is done within REVNUMS to find the index of the most recent revision at the time of REVNUM, then that index is used to read the id out of IDS. A sentry is written at the zeroth index of both arrays to describe the initial situation, namely, that the LOD doesn't exist in revision r0.""" __slots__ = ['_mirror', 'lod', 'revnums', 'ids'] def __init__(self, mirror, lod): self._mirror = mirror self.lod = lod self.revnums = [0] self.ids = [None] def get_id(self, revnum): """Get the ID of the root path for this LOD in REVNUM. Raise KeyError if this LOD didn't exist in REVNUM.""" index = bisect.bisect_right(self.revnums, revnum) - 1 id = self.ids[index] if id is None: raise KeyError(revnum) return id def get_current_id(self): """Get the ID of the root path for this LOD in the current revision. Raise KeyError if this LOD doesn't currently exist.""" id = self.ids[-1] if id is None: raise KeyError() return id def exists(self): """Return True iff LOD exists in the current revision.""" return self.ids[-1] is not None def update(self, revnum, id): """Indicate that the root node of this LOD changed to ID at REVNUM. REVNUM is a revision number that must be the same as that of the previous recorded change (in which case the previous change is overwritten) or later (in which the new change is appended). ID can be a node ID, or it can be None to indicate that this LOD ceased to exist in REVNUM.""" if revnum < self.revnums[-1]: raise KeyError(revnum) elif revnum == self.revnums[-1]: # This is an attempt to overwrite an entry that was already # updated during this revision. 
Don't allow the replacement # None -> None or allow one new id to be replaced with another: old_id = self.ids[-1] if old_id is None and id is None: raise InternalError( 'ID changed from None -> None for %s, r%d' % (self.lod, revnum,) ) elif (old_id is not None and id is not None and old_id in self._mirror._new_nodes): raise InternalError( 'ID changed from %x -> %x for %s, r%d' % (old_id, id, self.lod, revnum,) ) self.ids[-1] = id else: self.revnums.append(revnum) self.ids.append(id) class _NodeDatabase(object): """A database storing all of the directory nodes. The nodes are written in groups every time write_new_nodes() is called. To the database is written a dictionary {node_id : [(cvs_path.id, node_id),...]}, where the keys are the node_ids of the new nodes. When a node is read, its whole group is read and cached under the assumption that the other nodes in the group are likely to be needed soon. The cache is retained across revisions and cleared when _cache_max_size is exceeded. The dictionaries for nodes that have been read from the database during the current revision are cached by node_id in the _cache member variable. The corresponding dictionaries are *not* copied when read. To avoid cross-talk between distinct MirrorDirectory instances that have the same node_id, users of these dictionaries have to copy them before modification.""" # How many entries should be allowed in the cache for each # CVSDirectory in the repository. (This number is very roughly the # number of complete lines of development that can be stored in the # cache at one time.) CACHE_SIZE_MULTIPLIER = 5 # But the cache will never be limited to less than this number: MIN_CACHE_LIMIT = 5000 def __init__(self): self.cvs_path_db = Ctx()._cvs_path_db self.db = IndexedDatabase( artifact_manager.get_temp_file(config.MIRROR_NODES_STORE), artifact_manager.get_temp_file(config.MIRROR_NODES_INDEX_TABLE), DB_OPEN_NEW, serializer=MarshalSerializer(), ) # A list of the maximum node_id stored by each call to # write_new_nodes(): self._max_node_ids = [0] # A map {node_id : {cvs_path : node_id}}: self._cache = {} # The number of directories in the repository: num_dirs = len([ cvs_path for cvs_path in self.cvs_path_db.itervalues() if isinstance(cvs_path, CVSDirectory) ]) self._cache_max_size = max( int(self.CACHE_SIZE_MULTIPLIER * num_dirs), self.MIN_CACHE_LIMIT, ) def _load(self, items): retval = {} for (id, value) in items: retval[self.cvs_path_db.get_path(id)] = value return retval def _dump(self, node): return [ (cvs_path.id, value) for (cvs_path, value) in node.iteritems() ] def _determine_index(self, id): """Return the index of the record holding the node with ID.""" return bisect.bisect_left(self._max_node_ids, id) def __getitem__(self, id): try: items = self._cache[id] except KeyError: index = self._determine_index(id) for (node_id, items) in self.db[index].items(): self._cache[node_id] = self._load(items) items = self._cache[id] return items def write_new_nodes(self, nodes): """Write NODES to the database. NODES is an iterable of writable CurrentMirrorDirectory instances.""" if len(self._cache) > self._cache_max_size: # The size of the cache has exceeded the threshold. 
Discard the # old cache values (but still store the new nodes into the # cache): logger.debug('Clearing node cache') self._cache.clear() data = {} max_node_id = 0 for node in nodes: max_node_id = max(max_node_id, node.id) data[node.id] = self._dump(node._entries) self._cache[node.id] = node._entries self.db[len(self._max_node_ids)] = data if max_node_id == 0: # Rewrite last value: self._max_node_ids.append(self._max_node_ids[-1]) else: self._max_node_ids.append(max_node_id) def close(self): self._cache.clear() self.db.close() self.db = None class RepositoryMirror: """Mirror a repository and its history. Mirror a repository as it is constructed, one revision at a time. For each LineOfDevelopment we store a skeleton of the directory structure within that LOD for each revnum in which it changed. For each LOD that has been seen so far, an LODHistory instance is stored in self._lod_histories. An LODHistory keeps track of each revnum in which files were added to or deleted from that LOD, as well as the node id of the root of the node tree describing the LOD contents at that revision. The LOD trees themselves are stored in the _node_db database, which maps node ids to nodes. A node is a map from CVSPath to ids of the corresponding subnodes. The _node_db is stored on disk and each access is expensive. The _node_db database only holds the nodes for old revisions. The revision that is being constructed is kept in memory in the _new_nodes map, which is cheap to access. You must invoke start_commit() before each commit and end_commit() afterwards.""" def register_artifacts(self, which_pass): """Register the artifacts that will be needed for this object.""" artifact_manager.register_temp_file( config.MIRROR_NODES_INDEX_TABLE, which_pass ) artifact_manager.register_temp_file( config.MIRROR_NODES_STORE, which_pass ) def open(self): """Set up the RepositoryMirror and prepare it for commits.""" self._key_generator = KeyGenerator() # A map from LOD to LODHistory instance for all LODs that have # been referenced so far: self._lod_histories = {} # This corresponds to the 'nodes' table in a Subversion fs. (We # don't need a 'representations' or 'strings' table because we # only track file existence, not file contents.) self._node_db = _NodeDatabase() # Start at revision 0 without a root node. self._youngest = 0 def start_commit(self, revnum): """Start a new commit.""" assert revnum > self._youngest self._youngest = revnum # A map {node_id : _WritableMirrorDirectoryMixin}. self._new_nodes = {} def end_commit(self): """Called at the end of each commit. This method copies the newly created nodes to the on-disk nodes db.""" # Copy the new nodes to the _node_db self._node_db.write_new_nodes([ node for node in self._new_nodes.values() if not isinstance(node, DeletedCurrentMirrorDirectory) ]) del self._new_nodes def _get_lod_history(self, lod): """Return the LODHistory instance describing LOD. Create a new (empty) LODHistory if it doesn't yet exist.""" try: return self._lod_histories[lod] except KeyError: lod_history = LODHistory(self, lod) self._lod_histories[lod] = lod_history return lod_history def get_old_lod_directory(self, lod, revnum): """Return the directory for the root path of LOD at revision REVNUM. 
Return an instance of MirrorDirectory if the path exists; otherwise, raise KeyError.""" lod_history = self._get_lod_history(lod) id = lod_history.get_id(revnum) return OldMirrorDirectory(self, id, self._node_db[id]) def get_old_path(self, cvs_path, lod, revnum): """Return the node for CVS_PATH from LOD at REVNUM. If CVS_PATH is a CVSDirectory, then return an instance of OldMirrorDirectory. If CVS_PATH is a CVSFile, return None. If CVS_PATH does not exist in the specified LOD and REVNUM, raise KeyError.""" node = self.get_old_lod_directory(lod, revnum) for sub_path in cvs_path.get_ancestry()[1:]: node = node[sub_path] return node def get_current_lod_directory(self, lod): """Return the directory for the root path of LOD in the current revision. Return an instance of CurrentMirrorDirectory. Raise KeyError if the path doesn't already exist.""" lod_history = self._get_lod_history(lod) id = lod_history.get_current_id() try: return self._new_nodes[id] except KeyError: return _CurrentMirrorReadOnlyLODDirectory( self, id, lod, self._node_db[id] ) def get_current_path(self, cvs_path, lod): """Return the node for CVS_PATH from LOD in the current revision. If CVS_PATH is a CVSDirectory, then return an instance of CurrentMirrorDirectory. If CVS_PATH is a CVSFile, return None. If CVS_PATH does not exist in the current revision of the specified LOD, raise KeyError.""" node = self.get_current_lod_directory(lod) for sub_path in cvs_path.get_ancestry()[1:]: node = node[sub_path] return node def add_lod(self, lod): """Create a new LOD in this repository. Return the CurrentMirrorDirectory that was created. If the LOD already exists, raise LODExistsError.""" lod_history = self._get_lod_history(lod) if lod_history.exists(): raise LODExistsError( 'Attempt to create %s in repository mirror when it already exists.' % (lod,) ) new_node = _CurrentMirrorWritableLODDirectory( self, self._key_generator.gen_id(), lod, {} ) lod_history.update(self._youngest, new_node.id) self._new_nodes[new_node.id] = new_node return new_node def copy_lod(self, src_lod, dest_lod, src_revnum): """Copy all of SRC_LOD at SRC_REVNUM to DST_LOD. In the youngest revision of the repository, the destination LOD *must not* already exist. Return the new node at DEST_LOD, as a CurrentMirrorDirectory.""" # Get the node of our src_path src_node = self.get_old_lod_directory(src_lod, src_revnum) dest_lod_history = self._get_lod_history(dest_lod) if dest_lod_history.exists(): raise LODExistsError( 'Attempt to copy to %s in repository mirror when it already exists.' % (dest_lod,) ) dest_lod_history.update(self._youngest, src_node.id) # Return src_node, except packaged up as a CurrentMirrorDirectory: return self.get_current_lod_directory(dest_lod) def close(self): """Free resources and close databases.""" self._lod_histories = None self._node_db.close() self._node_db = None cvs2svn-2.4.0/cvs2svn_lib/svn_run_options.py0000664000076500007650000004410111710517256022261 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. 
For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module manages cvs2svn run options.""" import sys import optparse from cvs2svn_lib import config from cvs2svn_lib.common import warning_prefix from cvs2svn_lib.common import error_prefix from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import normalize_svn_path from cvs2svn_lib.log import logger from cvs2svn_lib.context import Ctx from cvs2svn_lib.run_options import not_both from cvs2svn_lib.run_options import RunOptions from cvs2svn_lib.run_options import ContextOption from cvs2svn_lib.run_options import IncompatibleOption from cvs2svn_lib.project import Project from cvs2svn_lib.svn_output_option import DumpfileOutputOption from cvs2svn_lib.svn_output_option import ExistingRepositoryOutputOption from cvs2svn_lib.svn_output_option import NewRepositoryOutputOption from cvs2svn_lib.symbol_strategy import TrunkPathRule from cvs2svn_lib.symbol_strategy import BranchesPathRule from cvs2svn_lib.symbol_strategy import TagsPathRule from cvs2svn_lib.property_setters import FilePropertySetter class SVNEOLFixPropertySetter(FilePropertySetter): """Set _eol_fix property. This keyword is used to tell the RevisionReader how to munge EOLs when generating the fulltext, based on how svn:eol-style is set. If svn:eol-style is not set, it does _eol_style to None, thereby disabling any EOL munging.""" # A mapping from the value of the svn:eol-style property to the EOL # string that should appear in a dumpfile: EOL_REPLACEMENTS = { 'LF' : '\n', 'CR' : '\r', 'CRLF' : '\r\n', 'native' : '\n', } def set_properties(self, cvs_file): # Fix EOLs if necessary: eol_style = cvs_file.properties.get('svn:eol-style', None) if eol_style: self.maybe_set_property( cvs_file, '_eol_fix', self.EOL_REPLACEMENTS[eol_style] ) else: self.maybe_set_property( cvs_file, '_eol_fix', None ) class SVNKeywordHandlingPropertySetter(FilePropertySetter): """Set _keyword_handling property based on the file mode and svn:keywords. This setting tells the RevisionReader that it has to collapse RCS keywords when generating the fulltext.""" def set_properties(self, cvs_file): if cvs_file.mode == 'b' or cvs_file.mode == 'o': # Leave keywords in the form that they were checked in. value = 'untouched' elif cvs_file.mode == 'k': # This mode causes CVS to collapse keywords on checkout, so we # do the same: value = 'collapsed' elif cvs_file.properties.get('svn:keywords'): # Subversion is going to expand the keywords, so they have to be # collapsed in the dumpfile: value = 'collapsed' else: # CVS expands keywords, so we will too. value = 'expanded' self.maybe_set_property(cvs_file, '_keyword_handling', value) class SVNRunOptions(RunOptions): short_desc = 'convert a CVS repository into a Subversion repository' synopsis = """\ .B cvs2svn [\\fIOPTION\\fR]... \\fIOUTPUT-OPTION CVS-REPOS-PATH\\fR .br .B cvs2svn [\\fIOPTION\\fR]... \\fI--options=PATH\\fR """ long_desc = """\ Create a new Subversion repository based on the version history stored in a CVS repository. Each CVS commit will be mirrored in the Subversion repository, including such information as date of commit and id of the committer. .P \\fICVS-REPOS-PATH\\fR is the filesystem path of the part of the CVS repository that you want to convert. It is not possible to convert a CVS repository to which you only have remote access; see the FAQ for more information. 
This path doesn't have to be the top level directory of a CVS repository; it can point at a project within a repository, in which case only that project will be converted. This path or one of its parent directories has to contain a subdirectory called CVSROOT (though the CVSROOT directory can be empty). .P Multiple CVS repositories can be converted into a single Subversion repository in a single run of cvs2svn, but only by using an \\fB--options\\fR file. """ files = """\ A directory called \\fIcvs2svn-tmp\\fR (or the directory specified by \\fB--tmpdir\\fR) is used as scratch space for temporary data files. """ see_also = [ ('cvs', '1'), ('svn', '1'), ('svnadmin', '1'), ] def _get_output_options_group(self): group = super(SVNRunOptions, self)._get_output_options_group() group.add_option(IncompatibleOption( '--svnrepos', '-s', type='string', action='store', help='path where SVN repos should be created', man_help=( 'Write the output of the conversion into a Subversion repository ' 'located at \\fIpath\\fR. This option causes a new Subversion ' 'repository to be created at \\fIpath\\fR unless the ' '\\fB--existing-svnrepos\\fR option is also used.' ), metavar='PATH', )) self.parser.set_default('existing_svnrepos', False) group.add_option(IncompatibleOption( '--existing-svnrepos', action='store_true', help='load into existing SVN repository (for use with --svnrepos)', man_help=( 'Load the converted CVS repository into an existing Subversion ' 'repository, instead of creating a new repository. (This option ' 'should be used in combination with ' '\\fB-s\\fR/\\fB--svnrepos\\fR.) The repository must either be ' 'empty or contain no paths that overlap with those that will ' 'result from the conversion. Please note that you need write ' 'permission for the repository files.' ), )) group.add_option(IncompatibleOption( '--fs-type', type='string', action='store', help=( 'pass --fs-type=TYPE to "svnadmin create" (for use with ' '--svnrepos)' ), man_help=( 'Pass \\fI--fs-type\\fR=\\fItype\\fR to "svnadmin create" when ' 'creating a new repository.' ), metavar='TYPE', )) self.parser.set_default('bdb_txn_nosync', False) group.add_option(IncompatibleOption( '--bdb-txn-nosync', action='store_true', help=( 'pass --bdb-txn-nosync to "svnadmin create" (for use with ' '--svnrepos)' ), man_help=( 'Pass \\fI--bdb-txn-nosync\\fR to "svnadmin create" when ' 'creating a new BDB-style Subversion repository.' ), )) self.parser.set_default('create_options', []) group.add_option(IncompatibleOption( '--create-option', type='string', action='append', dest='create_options', help='pass OPT to "svnadmin create" (for use with --svnrepos)', man_help=( 'Pass \\fIopt\\fR to "svnadmin create" when creating a new ' 'Subversion repository (can be specified multiple times to ' 'pass multiple options).' ), metavar='OPT', )) group.add_option(IncompatibleOption( '--dumpfile', type='string', action='store', help='just produce a dumpfile; don\'t commit to a repos', man_help=( 'Just produce a dumpfile; don\'t commit to an SVN repository. ' 'Write the dumpfile to \\fIpath\\fR.' ), metavar='PATH', )) group.add_option(ContextOption( '--dry-run', action='store_true', help=( 'do not create a repository or a dumpfile; just print what ' 'would happen.' ), man_help=( 'Do not create a repository or a dumpfile; just print the ' 'details of what cvs2svn would do if it were really converting ' 'your repository.' 
), )) # Deprecated options: self.parser.set_default('dump_only', False) group.add_option(IncompatibleOption( '--dump-only', action='callback', callback=self.callback_dump_only, help=optparse.SUPPRESS_HELP, man_help=optparse.SUPPRESS_HELP, )) group.add_option(IncompatibleOption( '--create', action='callback', callback=self.callback_create, help=optparse.SUPPRESS_HELP, man_help=optparse.SUPPRESS_HELP, )) return group def _get_conversion_options_group(self): group = super(SVNRunOptions, self)._get_conversion_options_group() self.parser.set_default('trunk_base', config.DEFAULT_TRUNK_BASE) group.add_option(IncompatibleOption( '--trunk', type='string', action='store', dest='trunk_base', help=( 'path for trunk (default: %s)' % (config.DEFAULT_TRUNK_BASE,) ), man_help=( 'Set the top-level path to use for trunk in the Subversion ' 'repository. The default is \\fI%s\\fR.' % (config.DEFAULT_TRUNK_BASE,) ), metavar='PATH', )) self.parser.set_default('branches_base', config.DEFAULT_BRANCHES_BASE) group.add_option(IncompatibleOption( '--branches', type='string', action='store', dest='branches_base', help=( 'path for branches (default: %s)' % (config.DEFAULT_BRANCHES_BASE,) ), man_help=( 'Set the top-level path to use for branches in the Subversion ' 'repository. The default is \\fI%s\\fR.' % (config.DEFAULT_BRANCHES_BASE,) ), metavar='PATH', )) self.parser.set_default('tags_base', config.DEFAULT_TAGS_BASE) group.add_option(IncompatibleOption( '--tags', type='string', action='store', dest='tags_base', help=( 'path for tags (default: %s)' % (config.DEFAULT_TAGS_BASE,) ), man_help=( 'Set the top-level path to use for tags in the Subversion ' 'repository. The default is \\fI%s\\fR.' % (config.DEFAULT_TAGS_BASE,) ), metavar='PATH', )) group.add_option(ContextOption( '--include-empty-directories', action='store_true', dest='include_empty_directories', help=( 'include empty directories within the CVS repository ' 'in the conversion' ), man_help=( 'Treat empty subdirectories within the CVS repository as actual ' 'directories, creating them when the parent directory is created ' 'and removing them if and when the parent directory is pruned.' ), )) group.add_option(ContextOption( '--no-prune', action='store_false', dest='prune', help='don\'t prune empty directories', man_help=( 'When all files are deleted from a directory in the Subversion ' 'repository, don\'t delete the empty directory (the default is ' 'to delete any empty directories).' ), )) group.add_option(ContextOption( '--no-cross-branch-commits', action='store_false', dest='cross_branch_commits', help='prevent the creation of cross-branch commits', man_help=( 'Prevent the creation of commits that affect files on multiple ' 'branches at once.' ), )) return group def _get_extraction_options_group(self): group = super(SVNRunOptions, self)._get_extraction_options_group() self._add_use_internal_co_option(group) self._add_use_cvs_option(group) self._add_use_rcs_option(group) return group def _get_environment_options_group(self): group = super(SVNRunOptions, self)._get_environment_options_group() group.add_option(ContextOption( '--svnadmin', type='string', action='store', dest='svnadmin_executable', help='path to the "svnadmin" program', man_help=( 'Path to the \\fIsvnadmin\\fR program. 
(\\fIsvnadmin\\fR is ' 'needed when the \\fB-s\\fR/\\fB--svnrepos\\fR output option is ' 'used.)' ), metavar='PATH', compatible_with_option=True, )) return group def callback_dump_only(self, option, opt_str, value, parser): parser.values.dump_only = True logger.error( warning_prefix + ': The --dump-only option is deprecated (it is implied ' 'by --dumpfile).\n' ) def callback_create(self, option, opt_str, value, parser): logger.error( warning_prefix + ': The behaviour produced by the --create option is now the ' 'default;\n' 'passing the option is deprecated.\n' ) def process_extraction_options(self): """Process options related to extracting data from the CVS repository.""" self.process_all_extraction_options() def process_output_options(self): """Process the options related to SVN output.""" ctx = Ctx() options = self.options if options.dump_only and not options.dumpfile: raise FatalError("'--dump-only' requires '--dumpfile' to be specified.") if not options.svnrepos and not options.dumpfile and not ctx.dry_run: raise FatalError("must pass one of '-s' or '--dumpfile'.") not_both(options.svnrepos, '-s', options.dumpfile, '--dumpfile') not_both(options.dumpfile, '--dumpfile', options.existing_svnrepos, '--existing-svnrepos') not_both(options.bdb_txn_nosync, '--bdb-txn-nosync', options.existing_svnrepos, '--existing-svnrepos') not_both(options.dumpfile, '--dumpfile', options.bdb_txn_nosync, '--bdb-txn-nosync') not_both(options.fs_type, '--fs-type', options.existing_svnrepos, '--existing-svnrepos') if ( options.fs_type and options.fs_type != 'bdb' and options.bdb_txn_nosync ): raise FatalError("cannot pass --bdb-txn-nosync with --fs-type=%s." % options.fs_type) if options.svnrepos: if options.existing_svnrepos: ctx.output_option = ExistingRepositoryOutputOption(options.svnrepos) else: ctx.output_option = NewRepositoryOutputOption( options.svnrepos, fs_type=options.fs_type, bdb_txn_nosync=options.bdb_txn_nosync, create_options=options.create_options) else: ctx.output_option = DumpfileOutputOption(options.dumpfile) def add_project( self, project_cvs_repos_path, trunk_path=None, branches_path=None, tags_path=None, initial_directories=[], symbol_transforms=None, symbol_strategy_rules=[], exclude_paths=[], ): """Add a project to be converted. Most arguments are passed straight through to the Project constructor. 
SYMBOL_STRATEGY_RULES is an iterable of SymbolStrategyRules that will be applied to symbols in this project.""" if trunk_path is not None: trunk_path = normalize_svn_path(trunk_path, allow_empty=True) if branches_path is not None: branches_path = normalize_svn_path(branches_path, allow_empty=False) if tags_path is not None: tags_path = normalize_svn_path(tags_path, allow_empty=False) initial_directories = [ path for path in [trunk_path, branches_path, tags_path] if path ] + [ normalize_svn_path(path) for path in initial_directories ] symbol_strategy_rules = list(symbol_strategy_rules) # Add rules to set the SVN paths for LODs depending on whether # they are the trunk, tags, or branches: if trunk_path is not None: symbol_strategy_rules.append(TrunkPathRule(trunk_path)) if branches_path is not None: symbol_strategy_rules.append(BranchesPathRule(branches_path)) if tags_path is not None: symbol_strategy_rules.append(TagsPathRule(tags_path)) id = len(self.projects) project = Project( id, project_cvs_repos_path, initial_directories=initial_directories, symbol_transforms=symbol_transforms, exclude_paths=exclude_paths, ) self.projects.append(project) self.project_symbol_strategy_rules.append(symbol_strategy_rules) def clear_projects(self): """Clear the list of projects to be converted. This method is for the convenience of options files, which may want to import one another.""" del self.projects[:] del self.project_symbol_strategy_rules[:] def process_property_setter_options(self): super(SVNRunOptions, self).process_property_setter_options() # Property setters for internal use: Ctx().file_property_setters.append(SVNEOLFixPropertySetter()) Ctx().file_property_setters.append(SVNKeywordHandlingPropertySetter()) def process_options(self): # Consistency check for options and arguments. if len(self.args) == 0: self.usage() sys.exit(1) if len(self.args) > 1: logger.error(error_prefix + ": must pass only one CVS repository.\n") self.usage() sys.exit(1) cvsroot = self.args[0] self.process_extraction_options() self.process_output_options() self.process_symbol_strategy_options() self.process_property_setter_options() # Create the default project (using ctx.trunk, ctx.branches, and # ctx.tags): self.add_project( cvsroot, trunk_path=self.options.trunk_base, branches_path=self.options.branches_base, tags_path=self.options.tags_base, symbol_transforms=self.options.symbol_transforms, symbol_strategy_rules=self.options.symbol_strategy_rules, ) cvs2svn-2.4.0/cvs2svn_lib/dvcs_common.py0000664000076500007650000003134611710517256021332 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Miscellaneous utility code common to DVCS backends (like Git, Mercurial, or Bazaar). 
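(An editor's example, not part of this module; the user names and addresses are invented.)  The author_transforms mapping accepted by DVCSOutputOption.normalize_author_transforms() below maps CVS usernames to DVCS-style identities; each value may be either a (name, email) tuple or a ready-made string:

  author_transforms = {
    'jrandom' : ('J. Random Hacker', 'jrandom@example.com'),
    'msmith' : 'Mary Smith <msmith@example.com>',
    }

Both forms are normalized to UTF-8 strings of the form 'J. Random Hacker <jrandom@example.com>'.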
""" import sys from cvs2svn_lib import config from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import InternalError from cvs2svn_lib.run_options import RunOptions from cvs2svn_lib.log import logger from cvs2svn_lib.common import error_prefix from cvs2svn_lib.context import Ctx from cvs2svn_lib.artifact_manager import artifact_manager from cvs2svn_lib.project import Project from cvs2svn_lib.cvs_item import CVSRevisionAdd from cvs2svn_lib.cvs_item import CVSRevisionChange from cvs2svn_lib.cvs_item import CVSRevisionDelete from cvs2svn_lib.cvs_item import CVSRevisionNoop from cvs2svn_lib.svn_revision_range import RevisionScores from cvs2svn_lib.openings_closings import SymbolingsReader from cvs2svn_lib.repository_mirror import RepositoryMirror from cvs2svn_lib.output_option import OutputOption from cvs2svn_lib.property_setters import FilePropertySetter class KeywordHandlingPropertySetter(FilePropertySetter): """Set property _keyword_handling to a specified value. This keyword is used to tell the RevisionReader whether it has to collapse/expand RCS keywords when generating the fulltext or leave them alone.""" propname = '_keyword_handling' def __init__(self, value): if value not in ['collapsed', 'expanded', 'untouched', None]: raise FatalError( 'Value for %s must be "collapsed", "expanded", or "untouched"' % (self.propname,) ) self.value = value def set_properties(self, cvs_file): self.maybe_set_property(cvs_file, self.propname, self.value) class DVCSRunOptions(RunOptions): """Dumping ground for whatever is common to GitRunOptions and HgRunOptions.""" def __init__(self, progname, cmd_args, pass_manager): Ctx().cross_project_commits = False Ctx().cross_branch_commits = False RunOptions.__init__(self, progname, cmd_args, pass_manager) def set_project( self, project_cvs_repos_path, symbol_transforms=None, symbol_strategy_rules=[], exclude_paths=[], ): """Set the project to be converted. If a project had already been set, overwrite it. Most arguments are passed straight through to the Project constructor. SYMBOL_STRATEGY_RULES is an iterable of SymbolStrategyRules that will be applied to symbols in this project.""" symbol_strategy_rules = list(symbol_strategy_rules) project = Project( 0, project_cvs_repos_path, symbol_transforms=symbol_transforms, exclude_paths=exclude_paths, ) self.projects = [project] self.project_symbol_strategy_rules = [symbol_strategy_rules] def process_property_setter_options(self): super(DVCSRunOptions, self).process_property_setter_options() # Property setters for internal use: Ctx().file_property_setters.append( KeywordHandlingPropertySetter('collapsed') ) def process_options(self): # Consistency check for options and arguments. if len(self.args) == 0: self.usage() sys.exit(1) if len(self.args) > 1: logger.error(error_prefix + ": must pass only one CVS repository.\n") self.usage() sys.exit(1) cvsroot = self.args[0] self.process_extraction_options() self.process_output_options() self.process_symbol_strategy_options() self.process_property_setter_options() # Create the project: self.set_project( cvsroot, symbol_transforms=self.options.symbol_transforms, symbol_strategy_rules=self.options.symbol_strategy_rules, ) class DVCSOutputOption(OutputOption): def __init__(self): self._mirror = RepositoryMirror() self._symbolings_reader = None def normalize_author_transforms(self, author_transforms): """Convert AUTHOR_TRANSFORMS into author strings. 
AUTHOR_TRANSFORMS is a dict { CVSAUTHOR : DVCSAUTHOR } where CVSAUTHOR is the CVS author and DVCSAUTHOR is either: * a tuple (NAME, EMAIL) where NAME and EMAIL are strings. Such entries are converted into a UTF-8 string of the form 'name '. * a string already in the form 'name '. Return a similar dict { CVSAUTHOR : DVCSAUTHOR } where all keys and values are UTF-8-encoded strings. Any of the input strings may be Unicode strings (in which case they are encoded to UTF-8) or 8-bit strings (in which case they are used as-is). Also turns None into the empty dict.""" result = {} if author_transforms is not None: for (cvsauthor, dvcsauthor) in author_transforms.iteritems(): cvsauthor = to_utf8(cvsauthor) if isinstance(dvcsauthor, basestring): dvcsauthor = to_utf8(dvcsauthor) else: (name, email,) = dvcsauthor name = to_utf8(name) email = to_utf8(email) dvcsauthor = "%s <%s>" % (name, email,) result[cvsauthor] = dvcsauthor return result def register_artifacts(self, which_pass): # These artifacts are needed for SymbolingsReader: artifact_manager.register_temp_file_needed( config.SYMBOL_OPENINGS_CLOSINGS_SORTED, which_pass ) artifact_manager.register_temp_file_needed( config.SYMBOL_OFFSETS_DB, which_pass ) self._mirror.register_artifacts(which_pass) def check(self): if Ctx().cross_project_commits: raise FatalError( '%s output is not supported with cross-project commits' % self.name ) if Ctx().cross_branch_commits: raise FatalError( '%s output is not supported with cross-branch commits' % self.name ) if Ctx().username is None: raise FatalError( '%s output requires a default commit username' % self.name ) def setup(self, svn_rev_count): self._symbolings_reader = SymbolingsReader() self._mirror.open() def cleanup(self): self._mirror.close() self._symbolings_reader.close() del self._symbolings_reader def _get_source_groups(self, svn_commit): """Return groups of sources for SVN_COMMIT. SVN_COMMIT is an instance of SVNSymbolCommit. Return a list of tuples (svn_revnum, source_lod, cvs_symbols) where svn_revnum is the revision that should serve as a source, source_lod is the CVS line of development, and cvs_symbols is a list of CVSSymbolItems that can be copied from that source. The list is in arbitrary order.""" # Get a map {CVSSymbol : SVNRevisionRange}: range_map = self._symbolings_reader.get_range_map(svn_commit) # range_map, split up into one map per LOD; i.e., {LOD : # {CVSSymbol : SVNRevisionRange}}: lod_range_maps = {} for (cvs_symbol, range) in range_map.iteritems(): lod_range_map = lod_range_maps.get(range.source_lod) if lod_range_map is None: lod_range_map = {} lod_range_maps[range.source_lod] = lod_range_map lod_range_map[cvs_symbol] = range # Sort the sources so that the branch that serves most often as # parent is processed first: lod_ranges = lod_range_maps.items() lod_ranges.sort( lambda (lod1,lod_range_map1),(lod2,lod_range_map2): -cmp(len(lod_range_map1), len(lod_range_map2)) or cmp(lod1, lod2) ) source_groups = [] for (lod, lod_range_map) in lod_ranges: while lod_range_map: revision_scores = RevisionScores(lod_range_map.values()) (source_lod, revnum, score) = revision_scores.get_best_revnum() assert source_lod == lod cvs_symbols = [] for (cvs_symbol, range) in lod_range_map.items(): if revnum in range: cvs_symbols.append(cvs_symbol) del lod_range_map[cvs_symbol] source_groups.append((revnum, lod, cvs_symbols)) return source_groups def _is_simple_copy(self, svn_commit, source_groups): """Return True iff SVN_COMMIT can be created as a simple copy. SVN_COMMIT is an SVNTagCommit. 
Return True iff it can be created as a simple copy from an existing revision (i.e., if the fixup branch can be avoided for this tag creation).""" # The first requirement is that there be exactly one source: if len(source_groups) != 1: return False (svn_revnum, source_lod, cvs_symbols) = source_groups[0] # The second requirement is that the destination LOD not already # exist: try: self._mirror.get_current_lod_directory(svn_commit.symbol) except KeyError: # The LOD doesn't already exist. This is good. pass else: # The LOD already exists. It cannot be created by a copy. return False # The third requirement is that the source LOD contains exactly # the same files as we need to add to the symbol: try: source_node = self._mirror.get_old_lod_directory(source_lod, svn_revnum) except KeyError: raise InternalError('Source %r does not exist' % (source_lod,)) return ( set([cvs_symbol.cvs_file for cvs_symbol in cvs_symbols]) == set(self._get_all_files(source_node)) ) def _get_all_files(self, node): """Generate all of the CVSFiles under NODE.""" for cvs_path in node: subnode = node[cvs_path] if subnode is None: yield cvs_path else: for sub_cvs_path in self._get_all_files(subnode): yield sub_cvs_path class ExpectedDirectoryError(Exception): """A file was found where a directory was expected.""" pass class ExpectedFileError(Exception): """A directory was found where a file was expected.""" pass class MirrorUpdater(object): def register_artifacts(self, which_pass): pass def start(self, mirror): self._mirror = mirror def _mkdir_p(self, cvs_directory, lod): """Make sure that CVS_DIRECTORY exists in LOD. If not, create it. Return the node for CVS_DIRECTORY.""" try: node = self._mirror.get_current_lod_directory(lod) except KeyError: node = self._mirror.add_lod(lod) for sub_path in cvs_directory.get_ancestry()[1:]: try: node = node[sub_path] except KeyError: node = node.mkdir(sub_path) if node is None: raise ExpectedDirectoryError( 'File found at \'%s\' where directory was expected.' % (sub_path,) ) return node def add_file(self, cvs_rev, post_commit): cvs_file = cvs_rev.cvs_file if post_commit: lod = cvs_file.project.get_trunk() else: lod = cvs_rev.lod parent_node = self._mkdir_p(cvs_file.parent_directory, lod) parent_node.add_file(cvs_file) def modify_file(self, cvs_rev, post_commit): cvs_file = cvs_rev.cvs_file if post_commit: lod = cvs_file.project.get_trunk() else: lod = cvs_rev.lod if self._mirror.get_current_path(cvs_file, lod) is not None: raise ExpectedFileError( 'Directory found at \'%s\' where file was expected.' % (cvs_file,) ) def delete_file(self, cvs_rev, post_commit): cvs_file = cvs_rev.cvs_file if post_commit: lod = cvs_file.project.get_trunk() else: lod = cvs_rev.lod parent_node = self._mirror.get_current_path( cvs_file.parent_directory, lod ) if parent_node[cvs_file] is not None: raise ExpectedFileError( 'Directory found at \'%s\' where file was expected.' 
% (cvs_file,) ) del parent_node[cvs_file] def process_revision(self, cvs_rev, post_commit): if isinstance(cvs_rev, CVSRevisionAdd): self.add_file(cvs_rev, post_commit) elif isinstance(cvs_rev, CVSRevisionChange): self.modify_file(cvs_rev, post_commit) elif isinstance(cvs_rev, CVSRevisionDelete): self.delete_file(cvs_rev, post_commit) elif isinstance(cvs_rev, CVSRevisionNoop): pass else: raise InternalError('Unexpected CVSRevision type: %s' % (cvs_rev,)) def branch_file(self, cvs_symbol): cvs_file = cvs_symbol.cvs_file parent_node = self._mkdir_p(cvs_file.parent_directory, cvs_symbol.symbol) parent_node.add_file(cvs_file) def finish(self): del self._mirror def to_utf8(s): if isinstance(s, unicode): return s.encode('utf8') else: return s cvs2svn-2.4.0/cvs2svn_lib/svn_revision_range.py0000664000076500007650000001340611244045162022712 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """This module contains the SVNRevisionRange class.""" import bisect from cvs2svn_lib.common import SVN_INVALID_REVNUM class SVNRevisionRange: """The range of subversion revision numbers from which a path can be copied. self.opening_revnum is the number of the earliest such revision, and self.closing_revnum is one higher than the number of the last such revision. If self.closing_revnum is None, then no closings were registered.""" def __init__(self, source_lod, opening_revnum): self.source_lod = source_lod self.opening_revnum = opening_revnum self.closing_revnum = None def add_closing(self, closing_revnum): # When we have a non-trunk default branch, we may have multiple # closings--only register the first closing we encounter. if self.closing_revnum is None: self.closing_revnum = closing_revnum def __contains__(self, revnum): """Return True iff REVNUM is contained in the range.""" return ( self.opening_revnum <= revnum \ and (self.closing_revnum is None or revnum < self.closing_revnum) ) def __str__(self): if self.closing_revnum is None: return '[%d:]' % (self.opening_revnum,) else: return '[%d:%d]' % (self.opening_revnum, self.closing_revnum,) def __repr__(self): return str(self) class RevisionScores: """Represent the scores for a range of revisions.""" def __init__(self, svn_revision_ranges): """Initialize based on SVN_REVISION_RANGES. SVN_REVISION_RANGES is a list of SVNRevisionRange objects. The score of an svn source is defined to be the number of SVNRevisionRanges on that LOD that include the revision. A score thus indicates that copying the corresponding revision (or any following revision up to the next revision in the list) of the object in question would yield that many correct paths at or underneath the object. There may be other paths underneath it that are not correct and would need to be deleted or recopied; those can only be detected by descending and examining their scores. 
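A small worked example (an editor's illustration; the revision numbers are invented): for a single source LOD with the three ranges [2:5], [3:], and [3:7],

  revision 2     is in [2:5] only         -> score 1
  revisions 3-4  are in all three ranges  -> score 3
  revisions 5-6  are in [3:] and [3:7]    -> score 2
  revisions 7+   are in [3:] only         -> score 1

so get_best_revnum() would report revision 3 (the oldest revision with the maximal score) as the best source revision for that LOD.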
If SVN_REVISION_RANGES is empty, then all scores are undefined.""" deltas_map = {} for range in svn_revision_ranges: source_lod = range.source_lod try: deltas = deltas_map[source_lod] except: deltas = [] deltas_map[source_lod] = deltas deltas.append((range.opening_revnum, +1)) if range.closing_revnum is not None: deltas.append((range.closing_revnum, -1)) # A map: # # {SOURCE_LOD : [(REV1 SCORE1), (REV2 SCORE2), (REV3 SCORE3), ...]} # # where the tuples are sorted by revision number and the revision # numbers are distinct. Score is the number of correct paths that # would result from using the specified SOURCE_LOD and revision # number (or any other revision preceding the next revision # listed) as a source. For example, the score of any revision REV # in the range REV2 <= REV < REV3 is equal to SCORE2. self._scores_map = {} for (source_lod,deltas) in deltas_map.items(): # Sort by revision number: deltas.sort() # Initialize output list with zeroth element of deltas. This # element must exist, because it was verified that # svn_revision_ranges (and therefore openings) is not empty. scores = [ deltas[0] ] total = deltas[0][1] for (rev, change) in deltas[1:]: total += change if rev == scores[-1][0]: # Same revision as last entry; modify last entry: scores[-1] = (rev, total) else: # Previously-unseen revision; create new entry: scores.append((rev, total)) self._scores_map[source_lod] = scores def get_score(self, range): """Return the score for RANGE's opening revision. If RANGE doesn't appear explicitly in self.scores, use the score of the higest revision preceding RANGE. If there are no preceding revisions, then the score for RANGE is unknown; in this case, return -1.""" try: scores = self._scores_map[range.source_lod] except KeyError: return -1 # Remember, according to the tuple sorting rules, # # (revnum, anything,) < (revnum+1,) < (revnum+1, anything,) predecessor_index = bisect.bisect_right( scores, (range.opening_revnum + 1,) ) - 1 if predecessor_index < 0: return -1 return scores[predecessor_index][1] def get_best_revnum(self): """Find the revnum with the highest score. Return (revnum, score) for the revnum with the highest score. If the highest score is shared by multiple revisions, select the oldest revision.""" best_source_lod = None best_revnum = SVN_INVALID_REVNUM best_score = 0 source_lods = self._scores_map.keys() source_lods.sort() for source_lod in source_lods: for revnum, score in self._scores_map[source_lod]: if score > best_score: best_source_lod = source_lod best_score = score best_revnum = revnum return best_source_lod, best_revnum, best_score cvs2svn-2.4.0/cvs2svn_lib/changeset_graph_link.py0000664000076500007650000001145311244044540023150 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. 
# ==================================================================== """Keep track of counts of different types of changeset links.""" # A cvs_item doesn't depend on any cvs_items in either pred or succ: LINK_NONE = 0 # A cvs_item depends on one or more cvs_items in pred but none in succ: LINK_PRED = 1 # A cvs_item depends on one or more cvs_items in succ but none in pred: LINK_SUCC = 2 # A cvs_item depends on one or more cvs_items in both pred and succ: LINK_PASSTHRU = LINK_PRED | LINK_SUCC class ChangesetGraphLink(object): def __init__(self, pred, changeset, succ): """Represent a link in a loop in a changeset graph. This is the link that goes from PRED -> CHANGESET -> SUCC. We are mainly concerned with how many CVSItems have LINK_PRED, LINK_SUCC, and LINK_PASSTHRU type links to the neighboring commitsets. If necessary, this class can also break up CHANGESET into multiple changesets.""" self.pred = pred self.pred_ids = set(pred.cvs_item_ids) self.changeset = changeset self.succ_ids = set(succ.cvs_item_ids) self.succ = succ # A count of each type of link for cvs_items in changeset # (indexed by LINK_* constants): link_counts = [0] * 4 for cvs_item in list(changeset.iter_cvs_items()): link_counts[self.get_link_type(cvs_item)] += 1 [self.pred_links, self.succ_links, self.passthru_links] = link_counts[1:] def get_link_type(self, cvs_item): """Return the type of links from CVS_ITEM to self.PRED and self.SUCC. The return value is one of LINK_NONE, LINK_PRED, LINK_SUCC, or LINK_PASSTHRU.""" retval = LINK_NONE if cvs_item.get_pred_ids() & self.pred_ids: retval |= LINK_PRED if cvs_item.get_succ_ids() & self.succ_ids: retval |= LINK_SUCC return retval def get_links_to_move(self): """Return the number of items that would be moved to split changeset.""" return min(self.pred_links, self.succ_links) \ or max(self.pred_links, self.succ_links) def is_breakable(self): """Return True iff breaking the changeset will do any good.""" return self.pred_links != 0 or self.succ_links != 0 def __cmp__(self, other): """Compare SELF with OTHER in terms of which would be better to break. The one that is better to break is considered the lesser.""" return ( - cmp(int(self.is_breakable()), int(other.is_breakable())) or cmp(self.passthru_links, other.passthru_links) or cmp(self.get_links_to_move(), other.get_links_to_move()) ) def break_changeset(self, changeset_key_generator): """Break up self.changeset and return the fragments. Break it up in such a way that the link is weakened as efficiently as possible.""" if not self.is_breakable(): raise ValueError('Changeset is not breakable: %r' % self.changeset) pred_items = [] succ_items = [] # For each link type, should such CVSItems be moved to the # changeset containing the predecessor items or the one containing # the successor items? 
destination = { LINK_PRED : pred_items, LINK_SUCC : succ_items, } if self.pred_links == 0: destination[LINK_NONE] = pred_items destination[LINK_PASSTHRU] = pred_items elif self.succ_links == 0: destination[LINK_NONE] = succ_items destination[LINK_PASSTHRU] = succ_items elif self.pred_links < self.succ_links: destination[LINK_NONE] = succ_items destination[LINK_PASSTHRU] = succ_items else: destination[LINK_NONE] = pred_items destination[LINK_PASSTHRU] = pred_items for cvs_item in self.changeset.iter_cvs_items(): link_type = self.get_link_type(cvs_item) destination[link_type].append(cvs_item.id) # Create new changesets of the same type as the old one: return [ self.changeset.create_split_changeset( changeset_key_generator.gen_id(), pred_items), self.changeset.create_split_changeset( changeset_key_generator.gen_id(), succ_items), ] def __str__(self): return 'Link<%x>(%d, %d, %d)' % ( self.changeset.id, self.pred_links, self.succ_links, self.passthru_links) cvs2svn-2.4.0/cvs2svn_lib/repository_walker.py0000664000076500007650000002632011710517256022603 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Walk through a CVS project, generating CVSPaths.""" import os import stat from cvs2svn_lib.common import path_join from cvs2svn_lib.common import FatalError from cvs2svn_lib.common import warning_prefix from cvs2svn_lib.common import IllegalSVNPathError from cvs2svn_lib.log import logger from cvs2svn_lib.context import Ctx from cvs2svn_lib.project import FileInAndOutOfAtticException from cvs2svn_lib.cvs_path import CVSDirectory from cvs2svn_lib.cvs_path import CVSFile class _RepositoryWalker(object): def __init__(self, file_key_generator, error_handler): self.file_key_generator = file_key_generator self.error_handler = error_handler def _get_cvs_file( self, parent_directory, basename, file_in_attic=False, leave_in_attic=False, ): """Return a CVSFile describing the file with name BASENAME. PARENT_DIRECTORY is the CVSDirectory instance describing the directory that physically holds this file in the filesystem. BASENAME must be the base name of a *,v file within PARENT_DIRECTORY. FILE_IN_ATTIC is a boolean telling whether the specified file is in an Attic subdirectory. If FILE_IN_ATTIC is True, then: - If LEAVE_IN_ATTIC is True, then leave the 'Attic' component in the filename. - Otherwise, raise FileInAndOutOfAtticException if a file with the same filename appears outside of Attic. The CVSFile is assigned a new unique id. All of the CVSFile information is filled in except mode (which can only be determined by parsing the file). 
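    For example (purely illustrative), a BASENAME of 'foo.c,v' yields a
    CVSFile whose name in the converted repository is 'foo.c'; the
    trailing ',v' is stripped.
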
Raise FatalError if the resulting filename would not be legal in SVN.""" filename = os.path.join(parent_directory.rcs_path, basename) try: Ctx().output_option.verify_filename_legal(basename[:-2]) except IllegalSVNPathError, e: raise FatalError( 'File %r would result in an illegal SVN filename: %s' % (filename, e,) ) if file_in_attic and not leave_in_attic: in_attic = True logical_parent_directory = parent_directory.parent_directory # If this file also exists outside of the attic, it's a fatal # error: non_attic_filename = os.path.join( logical_parent_directory.rcs_path, basename, ) if os.path.exists(non_attic_filename): raise FileInAndOutOfAtticException(non_attic_filename, filename) else: in_attic = False logical_parent_directory = parent_directory file_stat = os.stat(filename) # The size of the file in bytes: file_size = file_stat.st_size # Whether or not the executable bit is set: file_executable = bool(file_stat.st_mode & stat.S_IXUSR) # mode is not known, so we temporarily set it to None. return CVSFile( self.file_key_generator.gen_id(), parent_directory.project, logical_parent_directory, basename[:-2], in_attic, file_executable, file_size, None, None ) def _get_attic_file(self, parent_directory, basename): """Return a CVSFile object for the Attic file at BASENAME. PARENT_DIRECTORY is the CVSDirectory that physically contains the file on the filesystem (i.e., the Attic directory). It is not necessarily the parent_directory of the CVSFile that will be returned. Return CVSFile, whose parent directory is usually PARENT_DIRECTORY.parent_directory, but might be PARENT_DIRECTORY iff CVSFile will remain in the Attic directory.""" try: return self._get_cvs_file( parent_directory, basename, file_in_attic=True, ) except FileInAndOutOfAtticException, e: if Ctx().retain_conflicting_attic_files: logger.warn( "%s: %s;\n" " storing the latter into 'Attic' subdirectory.\n" % (warning_prefix, e) ) else: self.error_handler(str(e)) # Either way, return a CVSFile object so that the rest of the # file processing can proceed: return self._get_cvs_file( parent_directory, basename, file_in_attic=True, leave_in_attic=True, ) def _generate_attic_cvs_files(self, cvs_directory): """Generate CVSFiles for the files in Attic directory CVS_DIRECTORY. Also yield CVS_DIRECTORY if any files are being retained in the Attic. Silently ignore subdirectories named '.svn', but emit a warning if any other directories are found within the Attic directory.""" retained_attic_files = [] fnames = os.listdir(cvs_directory.rcs_path) fnames.sort() for fname in fnames: pathname = os.path.join(cvs_directory.rcs_path, fname) if os.path.isdir(pathname): if fname == '.svn': logger.debug( "Directory %s found within Attic; ignoring" % (pathname,) ) else: logger.warn( "Directory %s found within Attic; ignoring" % (pathname,) ) elif fname.endswith(',v'): cvs_file = self._get_attic_file(cvs_directory, fname) if cvs_file.parent_directory == cvs_directory: # This file will be retained in the Attic directory. retained_attic_files.append(cvs_file) else: # This is a normal Attic file, which is treated as if it # were located one directory up: yield cvs_file if retained_attic_files: # There was at least one file in the attic that will be retained # in the attic. First include the Attic directory itself in the # output, then the retained attic files: yield cvs_directory for cvs_file in retained_attic_files: yield cvs_file def generate_cvs_paths(self, cvs_directory, exclude_paths): """Generate the CVSPaths under non-Attic directory CVS_DIRECTORY. 
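    Entries in EXCLUDE_PATHS are compared against the CVS path of each
    directory entry by exact string match (illustrative examples:
    'subdir/oldfile.c,v' for a file, 'subdir/old-component' for a
    directory); a directory that matches is skipped without being
    recursed into.
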
Yield CVSDirectory and CVSFile instances as they are found. Process directories recursively, including Attic directories. Also look for conflicts between the filenames that will result from files, attic files, and subdirectories. Silently ignore subdirectories named '.svn', as these don't make much sense in a real conversion, but they are present in our test suite.""" yield cvs_directory # Map {cvs_file.rcs_basename : cvs_file.rcs_path} for files # directly in cvs_directory: rcsfiles = {} attic_dir = None # Non-Attic subdirectories of cvs_directory (to be recursed into): dirs = [] fnames = os.listdir(cvs_directory.rcs_path) fnames.sort() for fname in fnames: pathname = os.path.join(cvs_directory.rcs_path, fname) path_in_repository = path_join(cvs_directory.get_cvs_path(), fname) if path_in_repository in exclude_paths: logger.normal( "Excluding file from conversion: %s" % (path_in_repository,) ) pass elif os.path.isdir(pathname): if fname == 'Attic': attic_dir = fname elif fname == '.svn': logger.debug("Directory %s ignored" % (pathname,)) else: dirs.append(fname) elif fname.endswith(',v'): cvs_file = self._get_cvs_file(cvs_directory, fname) rcsfiles[cvs_file.rcs_basename] = cvs_file.rcs_path yield cvs_file else: # Silently ignore other files: pass # Map {cvs_file.rcs_basename : cvs_file.rcs_path} for files in an # Attic directory within cvs_directory: attic_rcsfiles = {} if attic_dir is not None: attic_directory = CVSDirectory( self.file_key_generator.gen_id(), cvs_directory.project, cvs_directory, 'Attic', ) for cvs_path in self._generate_attic_cvs_files(attic_directory): if isinstance(cvs_path, CVSFile) \ and cvs_path.parent_directory == cvs_directory: attic_rcsfiles[cvs_path.rcs_basename] = cvs_path.rcs_path yield cvs_path alldirs = dirs + [attic_dir] else: alldirs = dirs # Check for conflicts between directory names and the filenames # that will result from the rcs files (both in this directory and # in attic). (We recurse into the subdirectories nevertheless, to # try to detect more problems.) for fname in alldirs: for rcsfile_list in [rcsfiles, attic_rcsfiles]: if fname in rcsfile_list: self.error_handler( 'Directory name conflicts with filename. Please remove or ' 'rename one\n' 'of the following:\n' ' "%s"\n' ' "%s"' % ( os.path.join(cvs_directory.rcs_path, fname), rcsfile_list[fname], ) ) # Now recurse into the other subdirectories: for fname in dirs: dirname = os.path.join(cvs_directory.rcs_path, fname) # Verify that the directory name does not contain any illegal # characters: try: Ctx().output_option.verify_filename_legal(fname) except IllegalSVNPathError, e: raise FatalError( 'Directory %r would result in an illegal SVN path name: %s' % (dirname, e,) ) sub_directory = CVSDirectory( self.file_key_generator.gen_id(), cvs_directory.project, cvs_directory, fname, ) for cvs_path in self.generate_cvs_paths(sub_directory, exclude_paths): yield cvs_path def walk_repository(project, file_key_generator, error_handler): """Generate CVSDirectories and CVSFiles within PROJECT. Use FILE_KEY_GENERATOR to generate the IDs used for files. If there is a fatal error, register it by calling ERROR_HANDLER with a string argument describing the problem. (The error will be logged but processing will continue through the end of the pass.) Also: * Set PROJECT.root_cvs_directory_id. * Handle files in the Attic by generating CVSFile instances with the _in_attic member set. * Check for naming conflicts that will result from files in and out of the Attic. 
If Ctx().retain_conflicting_attic_files is set, fix the conflicts by leaving the Attic file in the attic. Otherwise, register a fatal error. * Check for naming conflicts between files (in or out of the Attic) and directories. * Check for filenames that contain characters not allowed by Subversion. """ root_cvs_directory = CVSDirectory( file_key_generator.gen_id(), project, None, '' ) project.root_cvs_directory_id = root_cvs_directory.id repository_walker = _RepositoryWalker(file_key_generator, error_handler) for cvs_path in repository_walker.generate_cvs_paths( root_cvs_directory, project.exclude_paths ): yield cvs_path cvs2svn-2.4.0/cvs2svn_lib/changeset_graph.py0000664000076500007650000003517711434364604022153 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """The changeset dependency graph.""" import heapq from cvs2svn_lib.log import logger from cvs2svn_lib.changeset import RevisionChangeset from cvs2svn_lib.changeset import OrderedChangeset from cvs2svn_lib.changeset import BranchChangeset from cvs2svn_lib.changeset import TagChangeset class CycleInGraphException(Exception): def __init__(self, cycle): Exception.__init__( self, 'Cycle found in graph: %s' % ' -> '.join(map(str, cycle + [cycle[0]]))) class NoPredNodeInGraphException(Exception): def __init__(self, node): Exception.__init__(self, 'Node %s has no predecessors' % (node,)) class _NoPredNodes(object): """Manage changesets that are ready to be processed. Output the changesets in order by time and changeset type.""" def __init__(self, changeset_db, initial_nodes): """Initialize. INITIAL_NODES is an iterable over node to add to this object on initialization.""" self.changeset_db = changeset_db # A heapified list of (node.time_range, changeset, node) tuples # that have no predecessors. These tuples sort in the desired # commit order: self._nodes = [ (node.time_range, self.changeset_db[node.id], node) for node in initial_nodes ] heapq.heapify(self._nodes) def __len__(self): return len(self._nodes) def add(self, node): node = (node.time_range, self.changeset_db[node.id], node) heapq.heappush(self._nodes, node) def get(self): """Return (node, changeset,) of the next node to be committed. 
'Smallest' is defined by the ordering of the tuples in self._nodes; namely, the changeset with the earliest time_range, with ties broken by comparing the changesets themselves.""" (time_range, changeset, node) = heapq.heappop(self._nodes) return (node, changeset) class ChangesetGraph(object): """A graph of changesets and their dependencies.""" def __init__(self, changeset_db, cvs_item_to_changeset_id): self._changeset_db = changeset_db self._cvs_item_to_changeset_id = cvs_item_to_changeset_id # A map { id : ChangesetGraphNode } self.nodes = {} def close(self): self._cvs_item_to_changeset_id.close() self._cvs_item_to_changeset_id = None self._changeset_db.close() self._changeset_db = None def add_changeset(self, changeset): """Add CHANGESET to this graph. Determine and record any dependencies to changesets that are already in the graph. This method does not affect the databases.""" node = changeset.create_graph_node(self._cvs_item_to_changeset_id) # Now tie the node into our graph. If a changeset referenced by # node is already in our graph, then add the backwards connection # from the other node to the new one. If not, then delete the # changeset from node. for pred_id in list(node.pred_ids): pred_node = self.nodes.get(pred_id) if pred_node is not None: pred_node.succ_ids.add(node.id) else: node.pred_ids.remove(pred_id) for succ_id in list(node.succ_ids): succ_node = self.nodes.get(succ_id) if succ_node is not None: succ_node.pred_ids.add(node.id) else: node.succ_ids.remove(succ_id) self.nodes[node.id] = node def store_changeset(self, changeset): for cvs_item_id in changeset.cvs_item_ids: self._cvs_item_to_changeset_id[cvs_item_id] = changeset.id self._changeset_db.store(changeset) def add_new_changeset(self, changeset): """Add the new CHANGESET to the graph and also to the databases.""" if logger.is_on(logger.DEBUG): logger.debug('Adding changeset %r' % (changeset,)) self.add_changeset(changeset) self.store_changeset(changeset) def delete_changeset(self, changeset): """Remove CHANGESET from the graph and also from the databases. In fact, we don't remove CHANGESET from self._cvs_item_to_changeset_id, because in practice the CVSItems in CHANGESET are always added again as part of a new CHANGESET, which will cause the old values to be overwritten.""" if logger.is_on(logger.DEBUG): logger.debug('Removing changeset %r' % (changeset,)) del self[changeset.id] del self._changeset_db[changeset.id] def __nonzero__(self): """Instances are considered True iff they contain any nodes.""" return bool(self.nodes) def __contains__(self, id): """Return True if the specified ID is contained in this graph.""" return id in self.nodes def __getitem__(self, id): return self.nodes[id] def get(self, id): return self.nodes.get(id) def __delitem__(self, id): """Remove the node corresponding to ID. Also remove references to it from other nodes. This method does not change pred_ids or succ_ids of the node being deleted, nor does it affect the databases.""" node = self[id] for succ_id in node.succ_ids: succ = self[succ_id] succ.pred_ids.remove(node.id) for pred_id in node.pred_ids: pred = self[pred_id] pred.succ_ids.remove(node.id) del self.nodes[node.id] def keys(self): return self.nodes.keys() def __iter__(self): return self.nodes.itervalues() def _get_path(self, reachable_changesets, starting_node_id, ending_node_id): """Return the shortest path from ENDING_NODE_ID to STARTING_NODE_ID. 
Find a path from ENDING_NODE_ID to STARTING_NODE_ID in REACHABLE_CHANGESETS, where STARTING_NODE_ID is the id of a changeset that depends on the changeset with ENDING_NODE_ID. (See the comment in search_for_path() for a description of the format of REACHABLE_CHANGESETS.) Return a list of changesets, where the 0th one has ENDING_NODE_ID and the last one has STARTING_NODE_ID. If there is no such path described in in REACHABLE_CHANGESETS, return None.""" if ending_node_id not in reachable_changesets: return None path = [self._changeset_db[ending_node_id]] id = reachable_changesets[ending_node_id][1] while id != starting_node_id: path.append(self._changeset_db[id]) id = reachable_changesets[id][1] path.append(self._changeset_db[starting_node_id]) return path def search_for_path(self, starting_node_id, stop_set): """Search for paths to prerequisites of STARTING_NODE_ID. Try to find the shortest dependency path that causes the changeset with STARTING_NODE_ID to depend (directly or indirectly) on one of the changesets whose ids are contained in STOP_SET. We consider direct and indirect dependencies in the sense that the changeset can be reached by following a chain of predecessor nodes. When one of the changeset_ids in STOP_SET is found, terminate the search and return the path from that changeset_id to STARTING_NODE_ID. If no path is found to a node in STOP_SET, return None.""" # A map {node_id : (steps, next_node_id)} where NODE_ID can be # reached from STARTING_NODE_ID in STEPS steps, and NEXT_NODE_ID # is the id of the previous node in the path. STARTING_NODE_ID is # only included as a key if there is a loop leading back to it. reachable_changesets = {} # A list of (node_id, steps) that still have to be investigated, # and STEPS is the number of steps to get to NODE_ID. open_nodes = [(starting_node_id, 0)] # A breadth-first search: while open_nodes: (id, steps) = open_nodes.pop(0) steps += 1 node = self[id] for pred_id in node.pred_ids: # Since the search is breadth-first, we only have to set steps # that don't already exist. if pred_id not in reachable_changesets: reachable_changesets[pred_id] = (steps, id) open_nodes.append((pred_id, steps)) # See if we can stop now: if pred_id in stop_set: return self._get_path( reachable_changesets, starting_node_id, pred_id ) return None def consume_nopred_nodes(self): """Remove and yield changesets in dependency order. Each iteration, this generator yields a (changeset, time_range) tuple for the oldest changeset in the graph that doesn't have any predecessor nodes (i.e., it is ready to be committed). This is continued until there are no more nodes without predecessors (either because the graph has been emptied, or because of cycles in the graph). Among the changesets that are ready to be processed, the earliest one (according to the sorting of the TimeRange class) is yielded each time. (This is the order in which the changesets should be committed.) 
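    An illustrative sketch of how a caller drives this generator (the
    real caller is consume_graph() below):

      for (changeset, time_range) in graph.consume_nopred_nodes():
        # ...commit or otherwise process `changeset` here...
        pass
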
The graph should not be otherwise altered while this generator is running.""" # Find a list of (node,changeset,) where the node has no # predecessors: nopred_nodes = _NoPredNodes( self._changeset_db, ( node for node in self.nodes.itervalues() if not node.pred_ids ), ) while nopred_nodes: (node, changeset,) = nopred_nodes.get() del self[node.id] # See if any successors are now ready for extraction: for succ_id in node.succ_ids: succ = self[succ_id] if not succ.pred_ids: nopred_nodes.add(succ) yield (changeset, node.time_range) def find_cycle(self, starting_node_id): """Find a cycle in the dependency graph and return it. Use STARTING_NODE_ID as the place to start looking. This routine must only be called after all nopred_nodes have been removed. Return the list of changesets that are involved in the cycle (ordered such that cycle[n-1] is a predecessor of cycle[n] and cycle[-1] is a predecessor of cycle[0]).""" # Since there are no nopred nodes in the graph, all nodes in the # graph must either be involved in a cycle or depend (directly or # indirectly) on nodes that are in a cycle. # Pick an arbitrary node: node = self[starting_node_id] seen_nodes = [node] # Follow it backwards until a node is seen a second time; then we # have our cycle. while True: # Pick an arbitrary predecessor of node. It must exist, because # there are no nopred nodes: try: node_id = node.pred_ids.__iter__().next() except StopIteration: raise NoPredNodeInGraphException(node) node = self[node_id] try: i = seen_nodes.index(node) except ValueError: seen_nodes.append(node) else: seen_nodes = seen_nodes[i:] seen_nodes.reverse() return [self._changeset_db[node.id] for node in seen_nodes] def consume_graph(self, cycle_breaker=None): """Remove and yield changesets from this graph in dependency order. Each iteration, this generator yields a (changeset, time_range) tuple for the oldest changeset in the graph that doesn't have any predecessor nodes. If CYCLE_BREAKER is specified, then call CYCLE_BREAKER(cycle) whenever a cycle is encountered, where cycle is the list of changesets that are involved in the cycle (ordered such that cycle[n-1] is a predecessor of cycle[n] and cycle[-1] is a predecessor of cycle[0]). CYCLE_BREAKER should break the cycle in place then return. If a cycle is found and CYCLE_BREAKER was not specified, raise CycleInGraphException.""" while True: for (changeset, time_range) in self.consume_nopred_nodes(): yield (changeset, time_range) # If there are any nodes left in the graph, then there must be # at least one cycle. Find a cycle and process it. # This might raise StopIteration, but that indicates that the # graph has been fully consumed, so we just let the exception # escape. start_node_id = self.nodes.iterkeys().next() cycle = self.find_cycle(start_node_id) if cycle_breaker is not None: cycle_breaker(cycle) else: raise CycleInGraphException(cycle) def __repr__(self): """For convenience only. The format is subject to change at any time.""" if self.nodes: return 'ChangesetGraph:\n%s' \ % ''.join([' %r\n' % node for node in self]) else: return 'ChangesetGraph:\n EMPTY\n' node_colors = { RevisionChangeset : 'lightgreen', OrderedChangeset : 'cyan', BranchChangeset : 'orange', TagChangeset : 'yellow', } def output_coarse_dot(self, f): """Output the graph in DOT format to file-like object f. Such a file can be rendered into a visual representation of the graph using tools like graphviz. 
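    For example (assuming graphviz's 'dot' command is installed and the
    output was written to a file named 'changesets.dot'), a command
    along the lines of

      dot -Tsvg changesets.dot -o changesets.svg

    would produce an SVG rendering of the graph.
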
Include only changesets in the graph, and the dependencies between changesets.""" f.write('digraph G {\n') for node in self: f.write( ' C%x [style=filled, fillcolor=%s];\n' % ( node.id, self.node_colors[self._changeset_db[node.id].__class__], ) ) f.write('\n') for node in self: for succ_id in node.succ_ids: f.write(' C%x -> C%x\n' % (node.id, succ_id,)) f.write('\n') f.write('}\n') def output_fine_dot(self, f): """Output the graph in DOT format to file-like object f. Such a file can be rendered into a visual representation of the graph using tools like graphviz. Include all CVSItems and the CVSItem-CVSItem dependencies in the graph. Group the CVSItems into clusters by changeset.""" f.write('digraph G {\n') for node in self: f.write(' subgraph cluster_%x {\n' % (node.id,)) f.write(' label = "C%x";\n' % (node.id,)) changeset = self._changeset_db[node.id] for item_id in changeset.cvs_item_ids: f.write(' I%x;\n' % (item_id,)) f.write(' style=filled;\n') f.write( ' fillcolor=%s;\n' % (self.node_colors[self._changeset_db[node.id].__class__],)) f.write(' }\n\n') for node in self: changeset = self._changeset_db[node.id] for cvs_item in changeset.iter_cvs_items(): for succ_id in cvs_item.get_succ_ids(): f.write(' I%x -> I%x;\n' % (cvs_item.id, succ_id,)) f.write('\n') f.write('}\n') cvs2svn-2.4.0/cvs2bzr-example.options0000664000076500007650000006325111710517257020661 0ustar mhaggermhagger00000000000000# (Be in -*- mode: python; coding: utf-8 -*- mode.) # # ==================================================================== # Copyright (c) 2006-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== # ##################### # ## PLEASE READ ME! ## # ##################### # # This is a template for an options file that can be used to configure # cvs2svn to convert to Bazaar rather than to Subversion. See # www/cvs2bzr.html and www/cvs2svn.html for general information, and # see the comments in this file for information about what options are # available and how they can be set. # # The program that is run to convert from CVS to Bazaar is called # cvs2bzr. Run it with the --options option, passing it this file # like this: # # cvs2bzr --options=cvs2bzr-example.options # # The output of cvs2bzr is a dump file that can be loaded into Bazaar # using the "bzr fast-import" command. Please read www/cvs2bzr.html # for more information. # # Many options do not have defaults, so it is easier to copy this file # and modify what you need rather than creating a new options file # from scratch. This file is in Python syntax, but you don't need to # know Python to modify it. But if you *do* know Python, then you # will be happy to know that you can use arbitary Python constructs to # do fancy configuration tricks. # # But please be aware of the following: # # * In many places, leading whitespace is significant in Python (it is # used instead of curly braces to group statements together). 
# Therefore, if you don't know what you are doing, it is best to # leave the whitespace as it is. # # * In normal strings, Python treats a backslash ("\") as an escape # character. Therefore, if you want to specify a string that # contains a backslash, you need either to escape the backslash with # another backslash ("\\"), or use a "raw string", as in one if the # following equivalent examples: # # cvs_executable = 'c:\\windows\\system32\\cvs.exe' # cvs_executable = r'c:\windows\system32\cvs.exe' # # See http://docs.python.org/tutorial/introduction.html#strings for # more information. # # Two identifiers will have been defined before this file is executed, # and can be used freely within this file: # # ctx -- a Ctx object (see cvs2svn_lib/context.py), which holds # many configuration options # # run_options -- an instance of the BzrRunOptions class (see # cvs2svn_lib/bzr_run_options.py), which holds some variables # governing how cvs2bzr is run # Import some modules that are used in setting the options: import os from cvs2svn_lib import config from cvs2svn_lib import changeset_database from cvs2svn_lib.common import CVSTextDecoder from cvs2svn_lib.log import logger from cvs2svn_lib.git_output_option import GitRevisionInlineWriter from cvs2svn_lib.bzr_output_option import BzrOutputOption from cvs2svn_lib.dvcs_common import KeywordHandlingPropertySetter from cvs2svn_lib.revision_manager import NullRevisionCollector from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader from cvs2svn_lib.symbol_strategy import AllBranchRule from cvs2svn_lib.symbol_strategy import AllTagRule from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule from cvs2svn_lib.symbol_strategy import ExcludeVendorBranchRule from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform from cvs2svn_lib.symbol_transform import RegexpSymbolTransform from cvs2svn_lib.symbol_transform import IgnoreSymbolTransform from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform from cvs2svn_lib.property_setters import AutoPropsPropertySetter from cvs2svn_lib.property_setters import ConditionalPropertySetter from cvs2svn_lib.property_setters import cvs_file_is_binary from cvs2svn_lib.property_setters import CVSBinaryFileDefaultMimeTypeSetter from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter from cvs2svn_lib.property_setters import DefaultEOLStyleSetter from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter from cvs2svn_lib.property_setters import ExecutablePropertySetter from cvs2svn_lib.property_setters import KeywordsPropertySetter from cvs2svn_lib.property_setters import MimeMapper from cvs2svn_lib.property_setters import SVNBinaryFileKeywordsPropertySetter # To choose the level of logging output, uncomment one of the # following lines: #logger.log_level = logger.WARN #logger.log_level = logger.QUIET logger.log_level = logger.NORMAL #logger.log_level = logger.VERBOSE #logger.log_level = 
logger.DEBUG # The directory to use for temporary files: ctx.tmpdir = r'cvs2svn-tmp' # cvs2bzr does not need to keep track of what revisions will be # excluded, so leave this option unchanged: ctx.revision_collector = NullRevisionCollector() # cvs2bzr's revision reader is set via the BzrOutputOption constructor, # so leave this option set to None. ctx.revision_reader = None # Change the following line to True if the conversion should only # include the trunk of the repository (i.e., all branches and tags # should be omitted from the conversion): ctx.trunk_only = False # How to convert CVS author names, log messages, and filenames to # Unicode. The first argument to CVSTextDecoder is a list of encoders # that are tried in order in 'strict' mode until one of them succeeds. # If none of those succeeds, then fallback_encoder (if it is # specified) is used in lossy 'replace' mode. Setting a fallback # encoder ensures that the encoder always succeeds, but it can cause # information loss. ctx.cvs_author_decoder = CVSTextDecoder( [ #'utf8', #'latin1', 'ascii', ], #fallback_encoding='ascii' ) ctx.cvs_log_decoder = CVSTextDecoder( [ #'utf8', #'latin1', 'ascii', ], #fallback_encoding='ascii', eol_fix='\n', ) # You might want to be especially strict when converting filenames to # Unicode (e.g., maybe not specify a fallback_encoding). ctx.cvs_filename_decoder = CVSTextDecoder( [ #'utf8', #'latin1', 'ascii', ], #fallback_encoding='ascii' ) # Template for the commit message to be used for initial project # commits. ctx.initial_project_commit_message = ( 'Standard project directories initialized by cvs2svn.' ) # Template for the commit message to be used for post commits, in # which modifications to a vendor branch are copied back to trunk. # This message can use '%(revnum)d' to include the SVN revision number # of the revision that included the change to the vendor branch # (admittedly rather pointless in a cvs2bzr conversion). ctx.post_commit_message = ( 'This commit was generated by cvs2svn to track changes on a CVS ' 'vendor branch.' ) # Template for the commit message to be used for commits in which # symbols are created. This message can use '%(symbol_type)s' to # include the type of the symbol ('branch' or 'tag') or # '%(symbol_name)s' to include the name of the symbol. ctx.symbol_commit_message = ( "This commit was manufactured by cvs2svn to create %(symbol_type)s " "'%(symbol_name)s'." ) # Template for the commit message to be used for commits in which # tags are pseudo-merged back to their source branch. This message can # use '%(symbol_name)s' to include the name of the symbol. # (Not used by default unless you enable tie_tag_fixup_branches on # GitOutputOption.) ctx.tie_tag_ancestry_message = ( "This commit was manufactured by cvs2svn to tie ancestry for " "tag '%(symbol_name)s' back to the source branch." ) # Some CVS clients for MacOS store resource fork data into CVS along # with the file contents itself by wrapping it all up in a container # format called "AppleSingle". Subversion currently does not support # MacOS resource forks. Nevertheless, sometimes the resource fork # information is not necessary and can be discarded. Set the # following option to True if you would like cvs2svn to identify files # whose contents are encoded in AppleSingle format, and discard all # but the data fork for such files before committing them to # Subversion. (Please note that AppleSingle contents are identified # by the AppleSingle magic number as the first four bytes of the file. 
# This check is not failproof, so only set this option if you think # you need it.) ctx.decode_apple_single = False # This option can be set to the name of a filename to which are stored # statistics and conversion decisions about the CVS symbols. ctx.symbol_info_filename = None #ctx.symbol_info_filename = 'symbol-info.txt' # cvs2svn uses "symbol strategy rules" to help decide how to handle # CVS symbols. The rules in a project's symbol_strategy_rules are # applied in order, and each rule is allowed to modify the symbol. # The result (after each of the rules has been applied) is used for # the conversion. # # 1. A CVS symbol might be used as a tag in one file and as a branch # in another file. cvs2svn has to decide whether to convert such a # symbol as a tag or as a branch. cvs2svn uses a series of # heuristic rules to decide how to convert a symbol. The user can # override the default rules for specific symbols or symbols # matching regular expressions. # # 2. cvs2svn is also capable of excluding symbols from the conversion # (provided no other symbols depend on them. # # 3. CVS does not record unambiguously the line of development from # which a symbol sprouted. cvs2svn uses a heuristic to choose a # symbol's "preferred parents". # # The standard branch/tag/exclude StrategyRules do not change a symbol # that has already been processed by an earlier rule, so in effect the # first matching rule is the one that is used. global_symbol_strategy_rules = [ # It is possible to specify manually exactly how symbols should be # converted and what line of development should be used as the # preferred parent. To do so, create a file containing the symbol # hints and enable the following option. # # The format of the hints file is described in the documentation # for the --symbol-hints command-line option. The file output by # the --write-symbol-info (i.e., ctx.symbol_info_filename) option # is in the same format. The simplest way to use this option is # to run the conversion through CollateSymbolsPass with # --write-symbol-info option, copy the symbol info and edit it to # create a hints file, then re-start the conversion at # CollateSymbolsPass with this option enabled. #SymbolHintsFileRule('symbol-hints.txt'), # To force all symbols matching a regular expression to be # converted as branches, add rules like the following: #ForceBranchRegexpStrategyRule(r'branch.*'), # To force all symbols matching a regular expression to be # converted as tags, add rules like the following: #ForceTagRegexpStrategyRule(r'tag.*'), # To force all symbols matching a regular expression to be # excluded from the conversion, add rules like the following: #ExcludeRegexpStrategyRule(r'unknown-.*'), # Sometimes people use "cvs import" to get their own source code # into CVS. This practice creates a vendor branch 1.1.1 and # imports the code onto the vendor branch as 1.1.1.1, then copies # the same content to the trunk as version 1.1. Normally, such # vendor branches are useless and they complicate the SVN history # unnecessarily. The following rule excludes any branches that # only existed as a vendor branch with a single import (leaving # only the 1.1 revision). If you want to retain such branches, # comment out the following line. (Please note that this rule # does not exclude vendor *tags*, as they are not so easy to # identify.) 
ExcludeTrivialImportBranchRule(), # To exclude all vendor branches (branches that had "cvs import"s # on them but no other kinds of commits), uncomment the following # line: #ExcludeVendorBranchRule(), # Usually you want this rule, to convert unambiguous symbols # (symbols that were only ever used as tags or only ever used as # branches in CVS) the same way they were used in CVS: UnambiguousUsageRule(), # If there was ever a commit on a symbol, then it cannot be # converted as a tag. This rule causes all such symbols to be # converted as branches. If you would like to resolve such # ambiguities manually, comment out the following line: BranchIfCommitsRule(), # Last in the list can be a catch-all rule that is used for # symbols that were not matched by any of the more specific rules # above. (Assuming that BranchIfCommitsRule() was included above, # then the symbols that are still indeterminate at this point can # sensibly be converted as branches or tags.) Include at most one # of these lines. If none of these catch-all rules are included, # then the presence of any ambiguous symbols (that haven't been # disambiguated above) is an error: # Convert ambiguous symbols based on whether they were used more # often as branches or as tags: HeuristicStrategyRule(), # Convert all ambiguous symbols as branches: #AllBranchRule(), # Convert all ambiguous symbols as tags: #AllTagRule(), # The last rule is here to choose the preferred parent of branches # and tags, that is, the line of development from which the symbol # sprouts. HeuristicPreferredParentRule(), ] # Specify a username to be used for commits for which CVS doesn't # record the original author (for example, the creation of a branch). # This should be a simple (unix-style) username, but it can be # translated into a Bazaar-style name by the author_transforms map. ctx.username = 'cvs2svn' # ctx.file_property_setters and ctx.revision_property_setters contain # rules used to set the svn properties on files in the converted # archive. For each file, the rules are tried one by one. Any rule # can add or suppress one or more svn properties. Typically the rules # will not overwrite properties set by a previous rule (though they # are free to do so). ctx.file_property_setters should be used for # properties that remain the same for the life of the file; these # should implement FilePropertySetter. ctx.revision_property_setters # should be used for properties that are allowed to vary from revision # to revision; these should implement RevisionPropertySetter. # # Obviously, SVN properties per se are not interesting for a cvs2bzr # conversion, but some of these properties have side-effects that do # affect the Bazaar output. FIXME: Document this in more detail. ctx.file_property_setters.extend([ # To read auto-props rules from a file, uncomment the following line # and specify a filename. 
The boolean argument specifies whether # case should be ignored when matching filenames to the filename # patterns found in the auto-props file: #AutoPropsPropertySetter( # r'/home/username/.subversion/config', # ignore_case=True, # ), # To read mime types from a file and use them to set svn:mime-type # based on the filename extensions, uncomment the following line # and specify a filename (see # http://en.wikipedia.org/wiki/Mime.types for information about # mime.types files): #MimeMapper(r'/etc/mime.types', ignore_case=False), # Omit the svn:eol-style property from any files that are listed # as binary (i.e., mode '-kb') in CVS: CVSBinaryFileEOLStyleSetter(), # If the file is binary and its svn:mime-type property is not yet # set, set svn:mime-type to 'application/octet-stream'. CVSBinaryFileDefaultMimeTypeSetter(), # To try to determine the eol-style from the mime type, uncomment # the following line: #EOLStyleFromMimeTypeSetter(), # Choose one of the following lines to set the default # svn:eol-style if none of the above rules applied. The argument # is the svn:eol-style that should be applied, or None if no # svn:eol-style should be set (i.e., the file should be treated as # binary). # # The default is to treat all files as binary unless one of the # previous rules has determined otherwise, because this is the # safest approach. However, if you have been diligent about # marking binary files with -kb in CVS and/or you have used the # above rules to definitely mark binary files as binary, then you # might prefer to use 'native' as the default, as it is usually # the most convenient setting for text files. Other possible # options: 'CRLF', 'CR', 'LF'. DefaultEOLStyleSetter(None), #DefaultEOLStyleSetter('native'), # Prevent svn:keywords from being set on files that have # svn:eol-style unset. SVNBinaryFileKeywordsPropertySetter(), # If svn:keywords has not been set yet, set it based on the file's # CVS mode: KeywordsPropertySetter(config.SVN_KEYWORDS_VALUE), # Set the svn:executable flag on any files that are marked in CVS as # being executable: ExecutablePropertySetter(), # The following causes keywords to be untouched in binary files and # collapsed in all text to be committed: ConditionalPropertySetter( cvs_file_is_binary, KeywordHandlingPropertySetter('untouched'), ), KeywordHandlingPropertySetter('collapsed'), ]) ctx.revision_property_setters.extend([ ]) # To skip the cleanup of temporary files, uncomment the following # option: #ctx.skip_cleanup = True # In CVS, it is perfectly possible to make a single commit that # affects more than one project or more than one branch of a single # project. Subversion also allows such commits. Therefore, by # default, when cvs2svn sees what looks like a cross-project or # cross-branch CVS commit, it converts it into a # cross-project/cross-branch Subversion commit. # # However, other tools and SCMs have trouble representing # cross-project or cross-branch commits. (For example, Trac's Revtree # plugin, http://www.trac-hacks.org/wiki/RevtreePlugin is confused by # such commits.) Therefore, we provide the following two options to # allow cross-project/cross-branch commits to be suppressed. # cvs2bzr only supports single-project conversions (multiple-project # conversions wouldn't really make sense for Bazaar anyway). 
So this # option must be set to False: ctx.cross_project_commits = False # Bazaar itself doesn't allow commits that affect more than one branch, # so this option must be set to False: ctx.cross_branch_commits = False # cvs2bzr does not yet handle translating .cvsignore files into # .bzrignore content, so by default, the .cvsignore files are included # in the conversion output. If you would like to omit the .cvsignore # files from the output, set this option to False: ctx.keep_cvsignore = True # By default, it is a fatal error for a CVS ",v" file to appear both # inside and outside of an "Attic" subdirectory (this should never # happen, but frequently occurs due to botched repository # administration). If you would like to retain both versions of such # files, change the following option to True, and the attic version of # the file will be written to a subdirectory called "Attic" in the # output repository: ctx.retain_conflicting_attic_files = False # CVS uses unix login names as author names whereas Bazaar requires # author names to be of the form "foo ". The default is to set # the Bazaar author to "cvsauthor ". author_transforms can # be used to map cvsauthor names (e.g., "jrandom") to a true name and # email address (e.g., "J. Random " for the # example shown). All strings should be either Unicode strings (i.e., # with "u" as a prefix) or 8-bit strings in the utf-8 encoding. The # values can either be strings in the form "name " or tuples # (name, email). Please substitute your own project's usernames here # to use with the author_transforms option of BzrOutputOption below. author_transforms={ 'jrandom' : ('J. Random', 'jrandom@example.com'), 'mhagger' : 'Michael Haggerty ', 'brane' : (u'Branko Čibej', 'brane@xbc.nu'), 'ringstrom' : 'Tobias Ringström ', 'dionisos' : (u'Erik Hülsmann', 'e.huelsmann@gmx.net'), # This one will be used for commits for which CVS doesn't record # the original author, as explained above. 'cvs2svn' : 'cvs2svn ', } # This is the main option that causes cvs2svn to output to a # "fastimport"-format dumpfile rather than to Subversion: ctx.output_option = BzrOutputOption( # The file in which to write the "fastimport" stream: os.path.join(ctx.tmpdir, 'dumpfile.fi'), # Write the file contents inline in the "fastimport" stream, # rather than using a separate blobs file (which "bzr fastimport" # can't handle as easily). revision_writer=GitRevisionInlineWriter( # cvs2bzr uses either RCS's "co" command or CVS's "cvs co -p" to # extract the content of file revisions. Here you can choose # whether to use RCS (faster, but fails in some rare # circumstances) or CVS (much slower, but more reliable). #RCSRevisionReader(co_executable=r'co') CVSRevisionReader(cvs_executable=r'cvs') ), # Optional map from CVS author names to Bazaar author names: author_transforms=author_transforms, ) # Change this option to True to turn on profiling of cvs2svn (for # debugging purposes): run_options.profiling = False # Should CVSItem -> Changeset database files be memory mapped? In # some tests, using memory mapping speeded up the overall conversion # by about 5%. But this option can cause the conversion to fail with # an out of memory error if the conversion computer runs out of # virtual address space (e.g., when running a very large conversion on # a 32-bit operating system). Therefore it is disabled by default. # Uncomment the following line to allow these database files to be # memory mapped. 
#changeset_database.use_mmap_for_cvs_item_to_changeset_table = True # Now set the project to be converted to Bazaar. cvs2bzr only supports # single-project conversions, so this method must only be called # once: run_options.set_project( # The filesystem path to the part of the CVS repository (*not* a # CVS working copy) that should be converted. This may be a # subdirectory (i.e., a module) within a larger CVS repository. r'test-data/main-cvsrepos', # A list of symbol transformations that can be used to rename # symbols in this project. symbol_transforms=[ # Use IgnoreSymbolTransforms like the following to completely # ignore symbols matching a regular expression when parsing # the CVS repository, for example to avoid warnings about # branches with two names and to choose the preferred name. # It is *not* recommended to use this instead of # ExcludeRegexpStrategyRule; though more efficient, # IgnoreSymbolTransforms are less flexible and don't exclude # branches correctly. The argument is a Python-style regular # expression that has to match the *whole* CVS symbol name: #IgnoreSymbolTransform(r'nightly-build-tag-.*') # RegexpSymbolTransforms transform symbols textually using a # regular expression. The first argument is a Python regular # expression pattern and the second is a replacement pattern. # The pattern is matched against each symbol name. If it # matches the whole symbol name, then the symbol name is # replaced with the corresponding replacement text. The # replacement can include substitution patterns (e.g., r'\1' # or r'\g'). Typically you will want to use raw strings # (strings with a preceding 'r', like shown in the examples) # for the regexp and its replacement to avoid backslash # substitution within those strings. #RegexpSymbolTransform(r'release-(\d+)_(\d+)', # r'release-\1.\2'), #RegexpSymbolTransform(r'release-(\d+)_(\d+)_(\d+)', # r'release-\1.\2.\3'), # Simple 1:1 character replacements can also be done. The # following transform, which converts backslashes into forward # slashes, should usually be included: ReplaceSubstringsSymbolTransform('\\','/'), # This last rule eliminates leading, trailing, and repeated # slashes within the output symbol names: NormalizePathsSymbolTransform(), ], # See the definition of global_symbol_strategy_rules above for a # description of this option: symbol_strategy_rules=global_symbol_strategy_rules, # Exclude paths from the conversion. Should be relative to # repository path and use forward slashes: #exclude_paths=['file-to-exclude.txt,v', 'dir/to/exclude'], ) cvs2svn-2.4.0/COPYING0000664000076500007650000000524510646512653015254 0ustar mhaggermhagger00000000000000This license applies to all portions of cvs2svn which are not externally-maintained libraries (e.g. rcsparse). Such libraries have their own licenses; we recommend you read them, as their terms may differ from the terms below. This is version 1 of this license. It is also available online at http://subversion.tigris.org/license-1.html. If newer versions of this license are posted there (the same URL, but with the version number incremented: .../license-2.html, .../license-3.html, and so on), you may use a newer version instead, at your option. ==================================================================== Copyright (c) 2000-2007 CollabNet. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. 
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. The end-user documentation included with the redistribution, if any, must include the following acknowledgment: "This product includes software developed by CollabNet (http://www.Collab.Net/)." Alternately, this acknowledgment may appear in the software itself, if and wherever such third-party acknowledgments normally appear. 4. The hosted project names must not be used to endorse or promote products derived from this software without prior written permission. For written permission, please contact info@collab.net. 5. Products derived from this software may not use the "Tigris" name nor may "Tigris" appear in their names without prior written permission of CollabNet. THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL COLLABNET OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ==================================================================== This software consists of voluntary contributions made by many individuals on behalf of CollabNet. cvs2svn-2.4.0/setup.py0000775000076500007650000000576511435427214015740 0ustar mhaggermhagger00000000000000#!/usr/bin/env python import sys from distutils.core import setup assert 0x02040000 <= sys.hexversion < 0x03000000, \ "Install Python 2, version 2.4 or greater" def get_version(): "Return the version number of cvs2svn." from cvs2svn_lib.version import VERSION return VERSION setup( # Metadata. name = "cvs2svn", version = get_version(), description = "CVS to Subversion/git/Bazaar/Mercurial repository converter", author = "The cvs2svn team", author_email = "dev@cvs2svn.tigris.org", url = "http://cvs2svn.tigris.org/", download_url = "http://cvs2svn.tigris.org/servlets/ProjectDocumentList?folderID=2976", license = "Apache-style", long_description = """\ cvs2svn_ is a tool for migrating a CVS repository to Subversion_, git_, Bazaar_, or Mercurial_. The main design goals are robustness and 100% data preservation. cvs2svn can convert just about any CVS repository we've ever seen, including gcc, Mozilla, FreeBSD, KDE, GNOME... .. _cvs2svn: http://cvs2svn.tigris.org/ .. _Subversion: http://svn.tigris.org/ .. _git: http://git-scm.com/ .. _Bazaar: http://bazaar-vcs.org/ .. _Mercurial: http://mercurial.selenic.com/ cvs2svn infers what happened in the history of your CVS repository and replicates that history as accurately as possible in the target SCM. All revisions, branches, tags, log messages, author names, and commit dates are converted. cvs2svn deduces what CVS modifications were made at the same time, and outputs these modifications grouped together as changesets in the target SCM. 
cvs2svn also deals with many CVS quirks and is highly configurable. See the comprehensive `feature list`_. .. _feature list: http://cvs2svn.tigris.org/features.html .. _documentation: http://cvs2svn.tigris.org/cvs2svn.html Please read the documentation_ carefully before using cvs2svn. Latest development version -------------------------- For general use, the most recent released version of cvs2svn is usually the best choice. However, if you want to use the newest cvs2svn features or if you're debugging or patching cvs2svn, you might want to use the trunk version (which is usually quite stable). To do so, use Subversion to check out a working copy from http://cvs2svn.tigris.org/svn/cvs2svn/trunk/ using a command like:: svn co --username=guest --password="" http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk """, classifiers = [ 'Development Status :: 5 - Production/Stable', 'Environment :: Console', 'Intended Audience :: Developers', 'Intended Audience :: System Administrators', 'License :: OSI Approved', 'Natural Language :: English', 'Operating System :: OS Independent', 'Programming Language :: Python :: 2', 'Topic :: Software Development :: Version Control', 'Topic :: Software Development :: Version Control :: CVS', 'Topic :: Utilities', ], # Data. packages = ["cvs2svn_lib", "cvs2svn_rcsparse"], scripts = ["cvs2svn", "cvs2git", "cvs2bzr"], ) cvs2svn-2.4.0/cvs2svn_rcsparse/0000775000076500007650000000000012027373500017510 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/cvs2svn_rcsparse/texttools.py0000664000076500007650000002437011160222265022133 0ustar mhaggermhagger00000000000000# -*-python-*- # # Copyright (C) 1999-2008 The ViewCVS Group. All Rights Reserved. # # By using this file, you agree to the terms and conditions set forth in # the LICENSE.html file which can be found at the top level of the ViewVC # distribution or at http://viewvc.org/license-1.html. # # For more information, visit http://viewvc.org/ # # ----------------------------------------------------------------------- import string # note: this will raise an ImportError if it isn't available. the rcsparse # package will recognize this and switch over to the default parser. from mx import TextTools import common # for convenience _tt = TextTools _idchar_list = map(chr, range(33, 127)) + map(chr, range(160, 256)) _idchar_list.remove('$') _idchar_list.remove(',') #_idchar_list.remove('.') # leave as part of 'num' symbol _idchar_list.remove(':') _idchar_list.remove(';') _idchar_list.remove('@') _idchar = string.join(_idchar_list, '') _idchar_set = _tt.set(_idchar) _onechar_token_set = _tt.set(':;') _not_at_set = _tt.invset('@') _T_TOKEN = 30 _T_STRING_START = 40 _T_STRING_SPAN = 60 _T_STRING_END = 70 _E_COMPLETE = 100 # ended on a complete token _E_TOKEN = 110 # ended mid-token _E_STRING_SPAN = 130 # ended within a string _E_STRING_END = 140 # ended with string-end ('@') (could be mid-@@) _SUCCESS = +100 _EOF = 'EOF' _CONTINUE = 'CONTINUE' _UNUSED = 'UNUSED' # continuation of a token over a chunk boundary _c_token_table = ( (_T_TOKEN, _tt.AllInSet, _idchar_set), ) class _mxTokenStream: # the algorithm is about the same speed for any CHUNK_SIZE chosen. # grab a good-sized chunk, but not too large to overwhelm memory. # note: we use a multiple of a standard block size CHUNK_SIZE = 192 * 512 # about 100k # CHUNK_SIZE = 5 # for debugging, make the function grind... 
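  # (For orientation: 192 * 512 = 98304 bytes, i.e. 96 KiB -- the
  # "about 100k" referred to above.)
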
def __init__(self, file): self.rcsfile = file self.tokens = [ ] self.partial = None self.string_end = None def _parse_chunk(self, buf, start=0): "Get the next token from the RCS file." buflen = len(buf) assert start < buflen # construct a tag table which refers to the buffer we need to parse. table = ( #1: ignore whitespace. with or without whitespace, move to the next rule. (None, _tt.AllInSet, _tt.whitespace_set, +1), #2 (_E_COMPLETE, _tt.EOF + _tt.AppendTagobj, _tt.Here, +1, _SUCCESS), #3: accumulate token text and exit, or move to the next rule. (_UNUSED, _tt.AllInSet + _tt.AppendMatch, _idchar_set, +2), #4 (_E_TOKEN, _tt.EOF + _tt.AppendTagobj, _tt.Here, -3, _SUCCESS), #5: single character tokens exit immediately, or move to the next rule (_UNUSED, _tt.IsInSet + _tt.AppendMatch, _onechar_token_set, +2), #6 (_E_COMPLETE, _tt.EOF + _tt.AppendTagobj, _tt.Here, -5, _SUCCESS), #7: if this isn't an '@' symbol, then we have a syntax error (go to a # negative index to indicate that condition). otherwise, suck it up # and move to the next rule. (_T_STRING_START, _tt.Is + _tt.AppendTagobj, '@'), #8 (None, _tt.Is, '@', +4, +1), #9 (buf, _tt.Is, '@', +1, -1), #10 (_T_STRING_END, _tt.Skip + _tt.AppendTagobj, 0, 0, +1), #11 (_E_STRING_END, _tt.EOF + _tt.AppendTagobj, _tt.Here, -10, _SUCCESS), #12 (_E_STRING_SPAN, _tt.EOF + _tt.AppendTagobj, _tt.Here, +1, _SUCCESS), #13: suck up everything that isn't an AT. go to next rule to look for EOF (buf, _tt.AllInSet, _not_at_set, 0, +1), #14: go back to look for double AT if we aren't at the end of the string (_E_STRING_SPAN, _tt.EOF + _tt.AppendTagobj, _tt.Here, -6, _SUCCESS), ) # Fast, texttools may be, but it's somewhat lacking in clarity. # Here's an attempt to document the logic encoded in the table above: # # Flowchart: # _____ # / /\ # 1 -> 2 -> 3 -> 5 -> 7 -> 8 -> 9 -> 10 -> 11 # | \/ \/ \/ /\ \/ # \ 4 6 12 14 / # \_______/_____/ \ / / # \ 13 / # \__________________________________________/ # # #1: Skip over any whitespace. # #2: If now EOF, exit with code _E_COMPLETE. # #3: If we have a series of characters in _idchar_set, then: # #4: Output them as a token, and go back to #1. # #5: If we have a character in _onechar_token_set, then: # #6: Output it as a token, and go back to #1. # #7: If we do not have an '@', then error. # If we do, then log a _T_STRING_START and continue. # #8: If we have another '@', continue on to #9. Otherwise: # #12: If now EOF, exit with code _E_STRING_SPAN. # #13: Record the slice up to the next '@' (or EOF). # #14: If now EOF, exit with code _E_STRING_SPAN. # Otherwise, go back to #8. # #9: If we have another '@', then we've just seen an escaped # (by doubling) '@' within an @-string. Record a slice including # just one '@' character, and jump back to #8. # Otherwise, we've *either* seen the terminating '@' of an @-string, # *or* we've seen one half of an escaped @@ sequence that just # happened to be split over a chunk boundary - in either case, # we continue on to #10. # #10: Log a _T_STRING_END. # #11: If now EOF, exit with _E_STRING_END. Otherwise, go back to #1. 
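    # _tt.tag() applies the tag table above to BUF starting at START and
    # returns a (success, taglist, next-index) triple; the assertion below
    # checks that the whole buffer was consumed.  The taglist is then
    # post-processed: slices between _T_STRING_START and _T_STRING_END are
    # joined back into single string tokens, and a token or @-string that
    # was cut off at the chunk boundary is stashed in self.partial so that
    # _handle_partial() can finish it with the next chunk.  Finally the
    # taglist is reversed and merged into self.tokens so that get() can
    # pop() tokens off the end in order.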
success, taglist, idx = _tt.tag(buf, table, start) if not success: ### need a better way to report this error raise common.RCSIllegalCharacter() assert idx == buflen # pop off the last item last_which = taglist.pop() i = 0 tlen = len(taglist) while i < tlen: if taglist[i] == _T_STRING_START: j = i + 1 while j < tlen: if taglist[j] == _T_STRING_END: s = _tt.join(taglist, '', i+1, j) del taglist[i:j] tlen = len(taglist) taglist[i] = s break j = j + 1 else: assert last_which == _E_STRING_SPAN s = _tt.join(taglist, '', i+1) del taglist[i:] self.partial = (_T_STRING_SPAN, [ s ]) break i = i + 1 # figure out whether we have a partial last-token if last_which == _E_TOKEN: self.partial = (_T_TOKEN, [ taglist.pop() ]) elif last_which == _E_COMPLETE: pass elif last_which == _E_STRING_SPAN: assert self.partial else: assert last_which == _E_STRING_END self.partial = (_T_STRING_END, [ taglist.pop() ]) taglist.reverse() taglist.extend(self.tokens) self.tokens = taglist def _set_end(self, taglist, text, l, r, subtags): self.string_end = l def _handle_partial(self, buf): which, chunks = self.partial if which == _T_TOKEN: success, taglist, idx = _tt.tag(buf, _c_token_table) if not success: # The start of this buffer was not a token. So the end of the # prior buffer was a complete token. self.tokens.insert(0, string.join(chunks, '')) else: assert len(taglist) == 1 and taglist[0][0] == _T_TOKEN \ and taglist[0][1] == 0 and taglist[0][2] == idx if idx == len(buf): # # The whole buffer was one huge token, so we may have a # partial token again. # # Note: this modifies the list of chunks in self.partial # chunks.append(buf) # consumed the whole buffer return len(buf) # got the rest of the token. chunks.append(buf[:idx]) self.tokens.insert(0, string.join(chunks, '')) # no more partial token self.partial = None return idx if which == _T_STRING_END: if buf[0] != '@': self.tokens.insert(0, string.join(chunks, '')) return 0 chunks.append('@') start = 1 else: start = 0 self.string_end = None string_table = ( (None, _tt.Is, '@', +3, +1), (_UNUSED, _tt.Is + _tt.AppendMatch, '@', +1, -1), (self._set_end, _tt.Skip + _tt.CallTag, 0, 0, _SUCCESS), (None, _tt.EOF, _tt.Here, +1, _SUCCESS), # suck up everything that isn't an AT. 
move to next rule to look # for EOF (_UNUSED, _tt.AllInSet + _tt.AppendMatch, _not_at_set, 0, +1), # go back to look for double AT if we aren't at the end of the string (None, _tt.EOF, _tt.Here, -5, _SUCCESS), ) success, unused, idx = _tt.tag(buf, string_table, start, len(buf), chunks) # must have matched at least one item assert success if self.string_end is None: assert idx == len(buf) self.partial = (_T_STRING_SPAN, chunks) elif self.string_end < len(buf): self.partial = None self.tokens.insert(0, string.join(chunks, '')) else: self.partial = (_T_STRING_END, chunks) return idx def _parse_more(self): buf = self.rcsfile.read(self.CHUNK_SIZE) if not buf: return _EOF if self.partial: idx = self._handle_partial(buf) if idx is None: return _CONTINUE if idx < len(buf): self._parse_chunk(buf, idx) else: self._parse_chunk(buf) return _CONTINUE def get(self): try: return self.tokens.pop() except IndexError: pass while not self.tokens: action = self._parse_more() if action == _EOF: return None return self.tokens.pop() # _get = get # def get(self): token = self._get() print 'T:', `token` return token def match(self, match): if self.tokens: token = self.tokens.pop() else: token = self.get() if token != match: raise common.RCSExpected(token, match) def unget(self, token): self.tokens.append(token) def mget(self, count): "Return multiple tokens. 'next' is at the end." while len(self.tokens) < count: action = self._parse_more() if action == _EOF: ### fix this raise RuntimeError, 'EOF hit while expecting tokens' result = self.tokens[-count:] del self.tokens[-count:] return result class Parser(common._Parser): stream_class = _mxTokenStream cvs2svn-2.4.0/cvs2svn_rcsparse/__init__.py0000664000076500007650000000030111710517257021622 0ustar mhaggermhagger00000000000000# This file causes Python to treat its containing directory it as a # package. Please note that the contents of this file differ from # those of the __init__.py file distributed with ViewVC. cvs2svn-2.4.0/cvs2svn_rcsparse/parse_rcs_file.py0000775000076500007650000000430610720141157023047 0ustar mhaggermhagger00000000000000#! /usr/bin/python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2006-2007 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== """Parse an RCS file, showing the rcsparse callbacks that are called. This program is useful to see whether an RCS file has a problem (in the sense of not being parseable by rcsparse) and also to illuminate the correspondence between RCS file contents and rcsparse callbacks. 
The output of this program can also be considered to be a kind of 'canonical' format for RCS files, at least in so far as rcsparse returns all relevant information in the file and provided that the order of callbacks is always the same.""" import sys import os class Logger: def __init__(self, f, name): self.f = f self.name = name def __call__(self, *args): self.f.write( '%s(%s)\n' % (self.name, ', '.join(['%r' % arg for arg in args]),) ) class LoggingSink: def __init__(self, f): self.f = f def __getattr__(self, name): return Logger(self.f, name) if __name__ == '__main__': # Since there is nontrivial logic in __init__.py, we have to import # parse() via that file. First make sure that the directory # containing this script is in the path: sys.path.insert(0, os.path.dirname(sys.argv[0])) from __init__ import parse if sys.argv[1:]: for path in sys.argv[1:]: if os.path.isfile(path) and path.endswith(',v'): parse( open(path, 'rb'), LoggingSink(sys.stdout) ) else: sys.stderr.write('%r is being ignored.\n' % path) else: parse(sys.stdin, LoggingSink(sys.stdout)) cvs2svn-2.4.0/cvs2svn_rcsparse/default.py0000664000076500007650000001122511434364605021516 0ustar mhaggermhagger00000000000000# -*-python-*- # # Copyright (C) 1999-2008 The ViewCVS Group. All Rights Reserved. # # By using this file, you agree to the terms and conditions set forth in # the LICENSE.html file which can be found at the top level of the ViewVC # distribution or at http://viewvc.org/license-1.html. # # For more information, visit http://viewvc.org/ # # ----------------------------------------------------------------------- # # This file was originally based on portions of the blame.py script by # Curt Hagenlocher. # # ----------------------------------------------------------------------- import string import common class _TokenStream: token_term = string.whitespace + ";:" try: token_term = frozenset(token_term) except NameError: pass # the algorithm is about the same speed for any CHUNK_SIZE chosen. # grab a good-sized chunk, but not too large to overwhelm memory. # note: we use a multiple of a standard block size CHUNK_SIZE = 192 * 512 # about 100k # CHUNK_SIZE = 5 # for debugging, make the function grind... def __init__(self, file): self.rcsfile = file self.idx = 0 self.buf = self.rcsfile.read(self.CHUNK_SIZE) if self.buf == '': raise RuntimeError, 'EOF' def get(self): "Get the next token from the RCS file." # Note: we can afford to loop within Python, examining individual # characters. For the whitespace and tokens, the number of iterations # is typically quite small. Thus, a simple iterative loop will beat # out more complex solutions. buf = self.buf lbuf = len(buf) idx = self.idx while 1: if idx == lbuf: buf = self.rcsfile.read(self.CHUNK_SIZE) if buf == '': # signal EOF by returning None as the token del self.buf # so we fail if get() is called again return None lbuf = len(buf) idx = 0 if buf[idx] not in string.whitespace: break idx = idx + 1 if buf[idx] in ';:': self.buf = buf self.idx = idx + 1 return buf[idx] if buf[idx] != '@': end = idx + 1 token = '' while 1: # find token characters in the current buffer while end < lbuf and buf[end] not in self.token_term: end = end + 1 token = token + buf[idx:end] if end < lbuf: # we stopped before the end, so we have a full token idx = end break # we stopped at the end of the buffer, so we may have a partial token buf = self.rcsfile.read(self.CHUNK_SIZE) lbuf = len(buf) idx = end = 0 self.buf = buf self.idx = idx return token # a "string" which starts with the "@" character. 
we'll skip it when we # search for content. idx = idx + 1 chunks = [ ] while 1: if idx == lbuf: idx = 0 buf = self.rcsfile.read(self.CHUNK_SIZE) if buf == '': raise RuntimeError, 'EOF' lbuf = len(buf) i = string.find(buf, '@', idx) if i == -1: chunks.append(buf[idx:]) idx = lbuf continue if i == lbuf - 1: chunks.append(buf[idx:i]) idx = 0 buf = '@' + self.rcsfile.read(self.CHUNK_SIZE) if buf == '@': raise RuntimeError, 'EOF' lbuf = len(buf) continue if buf[i + 1] == '@': chunks.append(buf[idx:i+1]) idx = i + 2 continue chunks.append(buf[idx:i]) self.buf = buf self.idx = i + 1 return string.join(chunks, '') # _get = get # def get(self): token = self._get() print 'T:', `token` return token def match(self, match): "Try to match the next token from the input buffer." token = self.get() if token != match: raise common.RCSExpected(token, match) def unget(self, token): "Put this token back, for the next get() to return." # Override the class' .get method with a function which clears the # overridden method then returns the pushed token. Since this function # will not be looked up via the class mechanism, it should be a "normal" # function, meaning it won't have "self" automatically inserted. # Therefore, we need to pass both self and the token thru via defaults. # note: we don't put this into the input buffer because it may have been # @-unescaped already. def give_it_back(self=self, token=token): del self.get return token self.get = give_it_back def mget(self, count): "Return multiple tokens. 'next' is at the end." result = [ ] for i in range(count): result.append(self.get()) result.reverse() return result class Parser(common._Parser): stream_class = _TokenStream cvs2svn-2.4.0/cvs2svn_rcsparse/run-tests.py0000775000076500007650000000425010720141675022036 0ustar mhaggermhagger00000000000000#! /usr/bin/python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2007 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://viewvc.tigris.org/. # ==================================================================== """Run tests of rcsparse code.""" import sys import os import glob from cStringIO import StringIO from difflib import Differ # Since there is nontrivial logic in __init__.py, we have to import # parse() via that file. 
First make sure that the directory # containing this script is in the path: script_dir = os.path.dirname(sys.argv[0]) sys.path.insert(0, script_dir) from __init__ import parse from parse_rcs_file import LoggingSink test_dir = os.path.join(script_dir, 'test-data') filelist = glob.glob(os.path.join(test_dir, '*,v')) filelist.sort() all_tests_ok = 1 for filename in filelist: sys.stderr.write('%s: ' % (filename,)) f = StringIO() try: parse(open(filename, 'rb'), LoggingSink(f)) except Exception, e: sys.stderr.write('Error parsing file: %s!\n' % (e,)) all_tests_ok = 0 else: output = f.getvalue() expected_output_filename = filename[:-2] + '.out' expected_output = open(expected_output_filename, 'rb').read() if output == expected_output: sys.stderr.write('OK\n') else: sys.stderr.write('Output does not match expected output!\n') differ = Differ() for diffline in differ.compare( expected_output.splitlines(1), output.splitlines(1) ): sys.stderr.write(diffline) all_tests_ok = 0 if all_tests_ok: sys.exit(0) else: sys.exit(1) cvs2svn-2.4.0/cvs2svn_rcsparse/debug.py0000664000076500007650000000631110406665550021161 0ustar mhaggermhagger00000000000000# -*-python-*- # # Copyright (C) 1999-2006 The ViewCVS Group. All Rights Reserved. # # By using this file, you agree to the terms and conditions set forth in # the LICENSE.html file which can be found at the top level of the ViewVC # distribution or at http://viewvc.org/license-1.html. # # For more information, visit http://viewvc.org/ # # ----------------------------------------------------------------------- """debug.py: various debugging tools for the rcsparse package.""" import time from __init__ import parse import common class DebugSink(common.Sink): def set_head_revision(self, revision): print 'head:', revision def set_principal_branch(self, branch_name): print 'branch:', branch_name def define_tag(self, name, revision): print 'tag:', name, '=', revision def set_comment(self, comment): print 'comment:', comment def set_description(self, description): print 'description:', description def define_revision(self, revision, timestamp, author, state, branches, next): print 'revision:', revision print ' timestamp:', timestamp print ' author:', author print ' state:', state print ' branches:', branches print ' next:', next def set_revision_info(self, revision, log, text): print 'revision:', revision print ' log:', log print ' text:', text[:100], '...' class DumpSink(common.Sink): """Dump all the parse information directly to stdout. The output is relatively unformatted and untagged. It is intended as a raw dump of the data in the RCS file. A copy can be saved, then changes made to the parsing engine, then a comparison of the new output against the old output. 
""" def __init__(self): global sha import sha def set_head_revision(self, revision): print revision def set_principal_branch(self, branch_name): print branch_name def define_tag(self, name, revision): print name, revision def set_comment(self, comment): print comment def set_description(self, description): print description def define_revision(self, revision, timestamp, author, state, branches, next): print revision, timestamp, author, state, branches, next def set_revision_info(self, revision, log, text): print revision, sha.new(log).hexdigest(), sha.new(text).hexdigest() def tree_completed(self): print 'tree_completed' def parse_completed(self): print 'parse_completed' def dump_file(fname): parse(open(fname, 'rb'), DumpSink()) def time_file(fname): f = open(fname, 'rb') s = common.Sink() t = time.time() parse(f, s) t = time.time() - t print t def _usage(): print 'This is normally a module for importing, but it has a couple' print 'features for testing as an executable script.' print 'USAGE: %s COMMAND filename,v' % sys.argv[0] print ' where COMMAND is one of:' print ' dump: filename is "dumped" to stdout' print ' time: filename is parsed with the time written to stdout' sys.exit(1) if __name__ == '__main__': import sys if len(sys.argv) != 3: _usage() if sys.argv[1] == 'dump': dump_file(sys.argv[2]) elif sys.argv[1] == 'time': time_file(sys.argv[2]) else: _usage() cvs2svn-2.4.0/cvs2svn_rcsparse/common.py0000664000076500007650000003612511710517257021370 0ustar mhaggermhagger00000000000000# -*-python-*- # # Copyright (C) 1999-2011 The ViewCVS Group. All Rights Reserved. # # By using this file, you agree to the terms and conditions set forth in # the LICENSE.html file which can be found at the top level of the ViewVC # distribution or at http://viewvc.org/license-1.html. # # For more information, visit http://viewvc.org/ # # ----------------------------------------------------------------------- """common.py: common classes and functions for the RCS parsing tools.""" import calendar import string class Sink: """Interface to be implemented by clients. The RCS parser calls this as it parses the RCS file. All these methods have stub implementations that do nothing, so you only have to override the callbacks that you care about. """ def set_head_revision(self, revision): """Reports the head revision for this RCS file. This is the value of the 'head' header in the admin section of the RCS file. This function can only be called before admin_completed(). Parameter: REVISION is a string containing a revision number. This is an actual revision number, not a branch number. """ pass def set_principal_branch(self, branch_name): """Reports the principal branch for this RCS file. This is only called if the principal branch is not trunk. This is the value of the 'branch' header in the admin section of the RCS file. This function can only be called before admin_completed(). Parameter: BRANCH_NAME is a string containing a branch number. If this function is called, the parameter is typically "1.1.1", indicating the vendor branch. """ pass def set_access(self, accessors): """Reports the access control list for this RCS file. This function is only called if the ACL is set. If this function is not called then there is no ACL and all users are allowed access. This is the value of the 'access' header in the admin section of the RCS file. This function can only be called before admin_completed(). Parameter: ACCESSORS is a list of strings. Each string is a username. 
The user is allowed access if and only if their username is in the list, OR the user owns the RCS file on disk, OR the user is root. Note that CVS typically doesn't use this field. """ pass def define_tag(self, name, revision): """Reports a tag or branch definition. This function will be called once for each tag or branch. This is taken from the 'symbols' header in the admin section of the RCS file. This function can only be called before admin_completed(). Parameters: NAME is a string containing the tag or branch name. REVISION is a string containing a revision number. This may be an actual revision number (for a tag) or a branch number. The revision number consists of a number of decimal components separated by dots. There are three common forms. If there are an odd number of components, it's a branch. Otherwise, if the next-to-last component is zero, it's a branch (and the next-to-last component is an artifact of CVS and should not be shown to the user). Otherwise, it's a tag. This function is called in the order that the tags appear in the RCS file header. For CVS, this appears to be in reverse chronological order of tag/branch creation. """ pass def set_locker(self, revision, locker): """Reports a lock on this RCS file. This function will be called once for each lock. This is taken from the 'locks' header in the admin section of the RCS file. This function can only be called before admin_completed(). Parameters: REVISION is a string containing a revision number. This is an actual revision number, not a branch number. LOCKER is a string containing a username. """ pass def set_locking(self, mode): """Signals strict locking mode. This function will be called if and only if the RCS file is in strict locking mode. This is taken from the 'strict' header in the admin section of the RCS file. This function can only be called before admin_completed(). Parameters: MODE is always the string 'strict'. """ pass def set_comment(self, comment): """Reports the comment for this RCS file. This is the value of the 'comment' header in the admin section of the RCS file. This function can only be called before admin_completed(). Parameter: COMMENT is a string containing the comment. This may be multi-line. This field does not seem to be used by CVS. """ pass def set_expansion(self, mode): """Reports the keyword expansion mode for this RCS file. This is the value of the 'expand' header in the admin section of the RCS file. This function can only be called before admin_completed(). Parameter: MODE is a string containing the keyword expansion mode. Possible values include 'o' and 'b', amongst others. """ pass def admin_completed(self): """Reports that the initial RCS header has been parsed. This function is called exactly once. """ pass def define_revision(self, revision, timestamp, author, state, branches, next): """Reports metadata about a single revision. This function is called for each revision. It is called later than admin_completed() and earlier than tree_completed(). Parameter: REVISION is a revision number, as a string. This is an actual revision number, not a branch number. TIMESTAMP is the date and time that the revision was created, as an integer number of seconds since the epoch. (I.e. "UNIX time" format). AUTHOR is the author name, as a string. STATE is the state of the revision, as a string. Common values are "Exp" and "dead". BRANCHES is a list of strings, with each string being an actual revision number (not a branch number). 
For each branch which is based on this revision and has commits, the revision number of the first branch commit is listed here. NEXT is either None or a string representing an actual revision number (not a branch number). When on trunk, NEXT points to what humans might consider to be the 'previous' revision number. For example, 1.3's NEXT is 1.2. However, on a branch, NEXT really does point to what humans would consider to be the 'next' revision number. For example, 1.1.2.1's NEXT would be 1.1.2.2. In other words, NEXT always means "where to find the next deltatext that you need this revision to retrieve". """ pass def tree_completed(self): """Reports that the RCS revision tree has been parsed. This function is called exactly once. This function will be called later than admin_completed(). """ pass def set_description(self, description): """Reports the description from the RCS file. This is set using the "-m" flag to "cvs add". However, many CVS users don't use that option, so this is often empty. This function is called once, after tree_completed(). Parameter: DESCRIPTION is a string containing the description. This may be multi-line. """ pass def set_revision_info(self, revision, log, text): """Reports the log message and contents of a CVS revision. This function is called for each revision. It is called later than set_description(). Parameters: REVISION is a string containing the actual revision number. LOG is a string containing the log message. This may be multi-line. TEXT is the contents of the file in this revision, either as full-text or as a diff. This is usually multi-line, and often quite large and/or binary. """ pass def parse_completed(self): """Reports that parsing an RCS file is complete. This function is called once. After it is called, no more calls will be made via this interface. """ pass # -------------------------------------------------------------------------- # # EXCEPTIONS USED BY RCSPARSE # class RCSParseError(Exception): pass class RCSIllegalCharacter(RCSParseError): pass class RCSExpected(RCSParseError): def __init__(self, got, wanted): RCSParseError.__init__( self, 'Unexpected parsing error in RCS file.\n' 'Expected token: %s, but saw: %s' % (wanted, got) ) class RCSStopParser(Exception): pass # -------------------------------------------------------------------------- # # STANDARD TOKEN STREAM-BASED PARSER # class _Parser: stream_class = None # subclasses need to define this def _read_until_semicolon(self): """Read all tokens up to and including the next semicolon token. Return the tokens (not including the semicolon) as a list.""" tokens = [] while 1: token = self.ts.get() if token == ';': break tokens.append(token) return tokens def _parse_admin_head(self, token): rev = self.ts.get() if rev == ';': # The head revision is not specified. Just drop the semicolon # on the floor. 
pass else: self.sink.set_head_revision(rev) self.ts.match(';') def _parse_admin_branch(self, token): branch = self.ts.get() if branch != ';': self.sink.set_principal_branch(branch) self.ts.match(';') def _parse_admin_access(self, token): accessors = self._read_until_semicolon() if accessors: self.sink.set_access(accessors) def _parse_admin_symbols(self, token): while 1: tag_name = self.ts.get() if tag_name == ';': break self.ts.match(':') tag_rev = self.ts.get() self.sink.define_tag(tag_name, tag_rev) def _parse_admin_locks(self, token): while 1: locker = self.ts.get() if locker == ';': break self.ts.match(':') rev = self.ts.get() self.sink.set_locker(rev, locker) def _parse_admin_strict(self, token): self.sink.set_locking("strict") self.ts.match(';') def _parse_admin_comment(self, token): self.sink.set_comment(self.ts.get()) self.ts.match(';') def _parse_admin_expand(self, token): expand_mode = self.ts.get() self.sink.set_expansion(expand_mode) self.ts.match(';') admin_token_map = { 'head' : _parse_admin_head, 'branch' : _parse_admin_branch, 'access' : _parse_admin_access, 'symbols' : _parse_admin_symbols, 'locks' : _parse_admin_locks, 'strict' : _parse_admin_strict, 'comment' : _parse_admin_comment, 'expand' : _parse_admin_expand, 'desc' : None, } def parse_rcs_admin(self): while 1: # Read initial token at beginning of line token = self.ts.get() try: f = self.admin_token_map[token] except KeyError: # We're done once we reach the description of the RCS tree if token[0] in string.digits: self.ts.unget(token) return else: # Chew up "newphrase" # warn("Unexpected RCS token: $token\n") while self.ts.get() != ';': pass else: if f is None: self.ts.unget(token) return else: f(self, token) def _parse_rcs_tree_entry(self, revision): # Parse date self.ts.match('date') date = self.ts.get() self.ts.match(';') # Convert date into standard UNIX time format (seconds since epoch) date_fields = string.split(date, '.') # According to rcsfile(5): the year "contains just the last two # digits of the year for years from 1900 through 1999, and all the # digits of years thereafter". if len(date_fields[0]) == 2: date_fields[0] = '19' + date_fields[0] date_fields = map(string.atoi, date_fields) EPOCH = 1970 if date_fields[0] < EPOCH: raise ValueError, 'invalid year for revision %s' % (revision,) try: timestamp = calendar.timegm(tuple(date_fields) + (0, 0, 0,)) except ValueError, e: raise ValueError, 'invalid date for revision %s: %s' % (revision, e,) # Parse author ### NOTE: authors containing whitespace are violations of the ### RCS specification. We are making an allowance here because ### CVSNT is known to produce these sorts of authors. self.ts.match('author') author = ' '.join(self._read_until_semicolon()) # Parse state self.ts.match('state') state = '' while 1: token = self.ts.get() if token == ';': break state = state + token + ' ' state = state[:-1] # toss the trailing space # Parse branches self.ts.match('branches') branches = self._read_until_semicolon() # Parse revision of next delta in chain self.ts.match('next') next = self.ts.get() if next == ';': next = None else: self.ts.match(';') # there are some files with extra tags in them. for example: # owner 640; # group 15; # permissions 644; # hardlinks @configure.in@; # commitid mLiHw3bulRjnTDGr; # this is "newphrase" in RCSFILE(5). we just want to skip over these. 
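    # The loop below peeks at each token: 'desc' or a revision number means
    # the newphrase section is over, so the token is pushed back and the
    # loop exits; anything else starts a newphrase whose remaining tokens,
    # up to the terminating semicolon, are read and thrown away.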
while 1: token = self.ts.get() if token == 'desc' or token[0] in string.digits: self.ts.unget(token) break # consume everything up to the semicolon self._read_until_semicolon() self.sink.define_revision(revision, timestamp, author, state, branches, next) def parse_rcs_tree(self): while 1: revision = self.ts.get() # End of RCS tree description ? if revision == 'desc': self.ts.unget(revision) return self._parse_rcs_tree_entry(revision) def parse_rcs_description(self): self.ts.match('desc') self.sink.set_description(self.ts.get()) def parse_rcs_deltatext(self): while 1: revision = self.ts.get() if revision is None: # EOF break text, sym2, log, sym1 = self.ts.mget(4) if sym1 != 'log': print `text[:100], sym2[:100], log[:100], sym1[:100]` raise RCSExpected(sym1, 'log') if sym2 != 'text': raise RCSExpected(sym2, 'text') ### need to add code to chew up "newphrase" self.sink.set_revision_info(revision, log, text) def parse(self, file, sink): """Parse an RCS file. Parameters: FILE is the file object to parse. (I.e. an object of the built-in Python type "file", usually created using Python's built-in "open()" function). SINK is an instance of (some subclass of) Sink. It's methods will be called as the file is parsed; see the definition of Sink for the details. """ self.ts = self.stream_class(file) self.sink = sink self.parse_rcs_admin() # let sink know when the admin section has been completed self.sink.admin_completed() self.parse_rcs_tree() # many sinks want to know when the tree has been completed so they can # do some work to prep for the arrival of the deltatext self.sink.tree_completed() self.parse_rcs_description() self.parse_rcs_deltatext() # easiest for us to tell the sink it is done, rather than worry about # higher level software doing it. self.sink.parse_completed() self.ts = self.sink = None # -------------------------------------------------------------------------- cvs2svn-2.4.0/Makefile0000664000076500007650000000241211244063205015637 0ustar mhaggermhagger00000000000000# Makefile for packaging and installing cvs2svn. # The python interpreter to be used can be overridden here or via # something like "make ... PYTHON=/path/to/python2.5". Please note # that this option only affects the "install" and "check" targets: PYTHON=python all: @echo "Supported make targets:" @echo " man -- Create manpages for the main programs" @echo " install -- Install software using distutils" @echo " dist -- Create an installation package" @echo " check -- Run cvs2svn tests" @echo " pycheck -- Use pychecker to check cvs2svn Python code" @echo " clean -- Clean up source tree and temporary directory" man: cvs2svn.1 cvs2git.1 cvs2bzr.1 cvs2svn.1: ./cvs2svn --man >$@ cvs2git.1: ./cvs2git --man >$@ cvs2bzr.1: ./cvs2bzr --man >$@ dist: ./dist.sh install: @case "${DESTDIR}" in \ "") \ echo ${PYTHON} ./setup.py install ; \ ${PYTHON} ./setup.py install ; \ ;; \ *) \ echo ${PYTHON} ./setup.py install --root=${DESTDIR} ; \ ${PYTHON} ./setup.py install --root=${DESTDIR} ; \ ;; \ esac check: clean ${PYTHON} ./run-tests.py pycheck: pychecker cvs2svn_lib/*.py clean: -rm -rf cvs2svn-*.tar.gz build cvs2svn-tmp cvs2*.1 -for d in . 
cvs2svn_lib cvs2svn_rcsparse svntest contrib ; \ do \ rm -f $$d/*.pyc $$d/*.pyo; \ done cvs2svn-2.4.0/PKG-INFO0000664000076500007650000000543312027373500015304 0ustar mhaggermhagger00000000000000Metadata-Version: 1.1 Name: cvs2svn Version: 2.4.0 Summary: CVS to Subversion/git/Bazaar/Mercurial repository converter Home-page: http://cvs2svn.tigris.org/ Author: The cvs2svn team Author-email: dev@cvs2svn.tigris.org License: Apache-style Download-URL: http://cvs2svn.tigris.org/servlets/ProjectDocumentList?folderID=2976 Description: cvs2svn_ is a tool for migrating a CVS repository to Subversion_, git_, Bazaar_, or Mercurial_. The main design goals are robustness and 100% data preservation. cvs2svn can convert just about any CVS repository we've ever seen, including gcc, Mozilla, FreeBSD, KDE, GNOME... .. _cvs2svn: http://cvs2svn.tigris.org/ .. _Subversion: http://svn.tigris.org/ .. _git: http://git-scm.com/ .. _Bazaar: http://bazaar-vcs.org/ .. _Mercurial: http://mercurial.selenic.com/ cvs2svn infers what happened in the history of your CVS repository and replicates that history as accurately as possible in the target SCM. All revisions, branches, tags, log messages, author names, and commit dates are converted. cvs2svn deduces what CVS modifications were made at the same time, and outputs these modifications grouped together as changesets in the target SCM. cvs2svn also deals with many CVS quirks and is highly configurable. See the comprehensive `feature list`_. .. _feature list: http://cvs2svn.tigris.org/features.html .. _documentation: http://cvs2svn.tigris.org/cvs2svn.html Please read the documentation_ carefully before using cvs2svn. Latest development version -------------------------- For general use, the most recent released version of cvs2svn is usually the best choice. However, if you want to use the newest cvs2svn features or if you're debugging or patching cvs2svn, you might want to use the trunk version (which is usually quite stable). To do so, use Subversion to check out a working copy from http://cvs2svn.tigris.org/svn/cvs2svn/trunk/ using a command like:: svn co --username=guest --password="" http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk Platform: UNKNOWN Classifier: Development Status :: 5 - Production/Stable Classifier: Environment :: Console Classifier: Intended Audience :: Developers Classifier: Intended Audience :: System Administrators Classifier: License :: OSI Approved Classifier: Natural Language :: English Classifier: Operating System :: OS Independent Classifier: Programming Language :: Python :: 2 Classifier: Topic :: Software Development :: Version Control Classifier: Topic :: Software Development :: Version Control :: CVS Classifier: Topic :: Utilities cvs2svn-2.4.0/run-tests.py0000775000076500007650000040163212027257624016541 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # # run_tests.py: test suite for cvs2svn # # Usage: run_tests.py [-v | --verbose] [list | ] # # Options: # -v, --verbose # enable verbose output # # Arguments (at most one argument is allowed): # list # If the word "list" is passed as an argument, the list of # available tests is printed (but no tests are run). # # # If a number is passed as an argument, then only the test # with that number is run. # # If no argument is specified, then all tests are run. # # Subversion is a tool for revision control. # See http://subversion.tigris.org for more information. # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. 
All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # ###################################################################### # General modules import sys import shutil import stat import re import os import time import os.path import locale import textwrap import calendar import types try: from hashlib import md5 except ImportError: from md5 import md5 from difflib import Differ # Make sure that a supported version of Python is being used: if not (0x02040000 <= sys.hexversion < 0x03000000): sys.stderr.write( 'error: Python 2, version 2.4 or higher required.\n' ) sys.exit(1) # This script needs to run in the correct directory. Make sure we're there. if not (os.path.exists('cvs2svn') and os.path.exists('test-data')): sys.stderr.write("error: I need to be run in the directory containing " "'cvs2svn' and 'test-data'.\n") sys.exit(1) # Load the Subversion test framework. import svntest from svntest import Failure from svntest.main import safe_rmtree from svntest.testcase import TestCase from svntest.testcase import XFail # Test if Mercurial >= 1.1 is available. try: from mercurial import context context.memctx have_hg = True except (ImportError, AttributeError): have_hg = False cvs2svn = os.path.abspath('cvs2svn') cvs2git = os.path.abspath('cvs2git') cvs2hg = os.path.abspath('cvs2hg') # We use the installed svn and svnlook binaries, instead of using # svntest.main.run_svn() and svntest.main.run_svnlook(), because the # behavior -- or even existence -- of local builds shouldn't affect # the cvs2svn test suite. svn_binary = 'svn' svnlook_binary = 'svnlook' svnadmin_binary = 'svnadmin' svnversion_binary = 'svnversion' test_data_dir = 'test-data' tmp_dir = 'cvs2svn-tmp' #---------------------------------------------------------------------- # Helpers. #---------------------------------------------------------------------- # The value to expect for svn:keywords if it is set: KEYWORDS = 'Author Date Id Revision' class RunProgramException(Failure): pass class MissingErrorException(Failure): def __init__(self, error_re): Failure.__init__( self, "Test failed because no error matched '%s'" % (error_re,) ) def run_program(program, error_re, *varargs): """Run PROGRAM with VARARGS, return stdout as a list of lines. If there is any stderr and ERROR_RE is None, raise RunProgramException, and print the stderr lines if svntest.main.options.verbose is true. If ERROR_RE is not None, it is a string regular expression that must match some line of stderr. If it fails to match, raise MissingErrorExpection.""" # FIXME: exit_code is currently ignored. exit_code, out, err = svntest.main.run_command(program, 1, 0, *varargs) if error_re: # Specified error expected on stderr. if not err: raise MissingErrorException(error_re) else: for line in err: if re.match(error_re, line): return out raise MissingErrorException(error_re) else: # No stderr allowed. if err: if svntest.main.options.verbose: print '\n%s said:\n' % program for line in err: print ' ' + line, print raise RunProgramException() return out def run_script(script, error_re, *varargs): """Run Python script SCRIPT with VARARGS, returning stdout as a list of lines. 
If there is any stderr and ERROR_RE is None, raise RunProgramException, and print the stderr lines if svntest.main.options.verbose is true. If ERROR_RE is not None, it is a string regular expression that must match some line of stderr. If it fails to match, raise MissingErrorException.""" # Use the same python that is running this script return run_program(sys.executable, error_re, script, *varargs) # On Windows, for an unknown reason, the cmd.exe process invoked by # os.system('sort ...') in cvs2svn receives invalid stdio handles, if # cvs2svn is started as "cvs2svn ...". "python cvs2svn ..." avoids # this. Therefore, the redirection of the output to the .s-revs file fails. # We no longer use the problematic invocation on any system, but this # comment remains to warn about this problem. def run_svn(*varargs): """Run svn with VARARGS; return stdout as a list of lines. If there is any stderr, raise RunProgramException, and print the stderr lines if svntest.main.options.verbose is true.""" return run_program(svn_binary, None, *varargs) def repos_to_url(path_to_svn_repos): """This does what you think it does.""" rpath = os.path.abspath(path_to_svn_repos) if rpath[0] != '/': rpath = '/' + rpath return 'file://%s' % rpath.replace(os.sep, '/') def svn_strptime(timestr): return time.strptime(timestr, '%Y-%m-%d %H:%M:%S') class Log: def __init__(self, revision, author, date, symbols): self.revision = revision self.author = author # Internally, we represent the date as seconds since epoch (UTC). # Since standard subversion log output shows dates in localtime # # "1993-06-18 00:46:07 -0500 (Fri, 18 Jun 1993)" # # and time.mktime() converts from localtime, it all works out very # happily. self.date = time.mktime(svn_strptime(date[0:19])) # The following symbols are used for string interpolation when # checking paths: self.symbols = symbols # The changed paths will be accumulated later, as log data is read. # Keys here are paths such as '/trunk/foo/bar', values are letter # codes such as 'M', 'A', and 'D'. self.changed_paths = { } # The msg will be accumulated later, as log data is read. self.msg = '' def absorb_changed_paths(self, out): 'Read changed paths from OUT into self, until no more.' while 1: line = out.readline() if len(line) == 1: return line = line[:-1] op_portion = line[3:4] path_portion = line[5:] # If we're running on Windows we get backslashes instead of # forward slashes. path_portion = path_portion.replace('\\', '/') # # We could parse out history information, but currently we # # just leave it in the path portion because that's how some # # tests expect it. # # m = re.match("(.*) \(from /.*:[0-9]+\)", path_portion) # if m: # path_portion = m.group(1) self.changed_paths[path_portion] = op_portion def __cmp__(self, other): return cmp(self.revision, other.revision) or \ cmp(self.author, other.author) or cmp(self.date, other.date) or \ cmp(self.changed_paths, other.changed_paths) or \ cmp(self.msg, other.msg) def get_path_op(self, path): """Return the operator for the change involving PATH. PATH is allowed to include string interpolation directives (e.g., '%(trunk)s'), which are interpolated against self.symbols. 
Return None if there is no record for PATH.""" return self.changed_paths.get(path % self.symbols) def check_msg(self, msg): """Verify that this Log's message starts with the specified MSG.""" if self.msg.find(msg) != 0: raise Failure( "Revision %d log message was:\n%s\n\n" "It should have begun with:\n%s\n\n" % (self.revision, self.msg, msg,) ) def check_change(self, path, op): """Verify that this Log includes a change for PATH with operator OP. PATH is allowed to include string interpolation directives (e.g., '%(trunk)s'), which are interpolated against self.symbols.""" path = path % self.symbols found_op = self.changed_paths.get(path, None) if found_op is None: raise Failure( "Revision %d does not include change for path %s " "(it should have been %s).\n" % (self.revision, path, op,) ) if found_op != op: raise Failure( "Revision %d path %s had op %s (it should have been %s)\n" % (self.revision, path, found_op, op,) ) def check_changes(self, changed_paths): """Verify that this Log has precisely the CHANGED_PATHS specified. CHANGED_PATHS is a sequence of tuples (path, op), where the paths strings are allowed to include string interpolation directives (e.g., '%(trunk)s'), which are interpolated against self.symbols.""" cp = {} for (path, op) in changed_paths: cp[path % self.symbols] = op if self.changed_paths != cp: raise Failure( "Revision %d changed paths list was:\n%s\n\n" "It should have been:\n%s\n\n" % (self.revision, self.changed_paths, cp,) ) def check(self, msg, changed_paths): """Verify that this Log has the MSG and CHANGED_PATHS specified. Convenience function to check two things at once. MSG is passed to check_msg(); CHANGED_PATHS is passed to check_changes().""" self.check_msg(msg) self.check_changes(changed_paths) def parse_log(svn_repos, symbols): """Return a dictionary of Logs, keyed on revision number, for SVN_REPOS. Initialize the Logs' symbols with SYMBOLS.""" class LineFeeder: 'Make a list of lines behave like an open file handle.' def __init__(self, lines): self.lines = lines def readline(self): if len(self.lines) > 0: return self.lines.pop(0) else: return None def absorb_message_body(out, num_lines, log): """Read NUM_LINES of log message body from OUT into Log item LOG.""" for i in range(num_lines): log.msg += out.readline() log_start_re = re.compile('^r(?P[0-9]+) \| ' '(?P[^\|]+) \| ' '(?P[^\|]+) ' '\| (?P[0-9]+) (line|lines)$') log_separator = '-' * 72 logs = { } out = LineFeeder(run_svn('log', '-v', repos_to_url(svn_repos))) while 1: this_log = None line = out.readline() if not line: break line = line[:-1] if line.find(log_separator) == 0: line = out.readline() if not line: break line = line[:-1] m = log_start_re.match(line) if m: this_log = Log( int(m.group('rev')), m.group('author'), m.group('date'), symbols) line = out.readline() if not line.find('Changed paths:') == 0: print 'unexpected log output (missing changed paths)' print "Line: '%s'" % line sys.exit(1) this_log.absorb_changed_paths(out) absorb_message_body(out, int(m.group('lines')), this_log) logs[this_log.revision] = this_log elif len(line) == 0: break # We've reached the end of the log output. else: print 'unexpected log output (missing revision line)' print "Line: '%s'" % line sys.exit(1) else: print 'unexpected log output (missing log separator)' print "Line: '%s'" % line sys.exit(1) return logs def erase(path): """Unconditionally remove PATH and its subtree, if any. 
PATH may be non-existent, a file or symlink, or a directory.""" if os.path.isdir(path): safe_rmtree(path) elif os.path.exists(path): os.remove(path) log_msg_text_wrapper = textwrap.TextWrapper(width=76, break_long_words=False) def sym_log_msg(symbolic_name, is_tag=None): """Return the expected log message for a cvs2svn-synthesized revision creating branch or tag SYMBOLIC_NAME.""" # This reproduces the logic in SVNSymbolCommit.get_log_msg(). if is_tag: type = 'tag' else: type = 'branch' return log_msg_text_wrapper.fill( "This commit was manufactured by cvs2svn to create %s '%s'." % (type, symbolic_name) ) def make_conversion_id( name, args, passbypass, options_file=None, symbol_hints_file=None ): """Create an identifying tag for a conversion. The return value can also be used as part of a filesystem path. NAME is the name of the CVS repository. ARGS are the extra arguments to be passed to cvs2svn. PASSBYPASS is a boolean indicating whether the conversion is to be run one pass at a time. If OPTIONS_FILE is specified, it is an options file that will be used for the conversion. If SYMBOL_HINTS_FILE is specified, it is a symbol hints file that will be used for the conversion. The 1-to-1 mapping between cvs2svn command parameters and conversion_ids allows us to avoid running the same conversion more than once, when multiple tests use exactly the same conversion.""" conv_id = name args = args[:] if passbypass: args.append('--passbypass') if symbol_hints_file is not None: args.append('--symbol-hints=%s' % (symbol_hints_file,)) # There are some characters that are forbidden in filenames, and # there is a limit on the total length of a path to a file. So use # a hash of the parameters rather than concatenating the parameters # into a string. if args: conv_id += "-" + md5('\0'.join(args)).hexdigest() # Some options-file based tests rely on knowing the paths to which # the repository should be written, so we handle that option as a # predictable string: if options_file is not None: conv_id += '--options=%s' % (options_file,) return conv_id class Conversion: """A record of a cvs2svn conversion. Fields: conv_id -- the conversion id for this Conversion. name -- a one-word name indicating the involved repositories. dumpfile -- the name of the SVN dumpfile created by the conversion (if the DUMPFILE constructor argument was used); otherwise, None. repos -- the path to the svn repository. Unset if DUMPFILE was specified. logs -- a dictionary of Log instances, as returned by parse_log(). Unset if DUMPFILE was specified. symbols -- a dictionary of symbols used for string interpolation in path names. stdout -- a list of lines written by cvs2svn to stdout _wc -- the basename of the svn working copy (within tmp_dir). Unset if DUMPFILE was specified. _wc_path -- the path to the svn working copy, if it has already been created; otherwise, None. (The working copy is created lazily when get_wc() is called.) Unset if DUMPFILE was specified. _wc_tree -- the tree built from the svn working copy, if it has already been created; otherwise, None. The tree is created lazily when get_wc_tree() is called.) Unset if DUMPFILE was specified. _svnrepos -- the basename of the svn repository (within tmp_dir). Unset if DUMPFILE was specified.""" # The number of the last cvs2svn pass (determined lazily by # get_last_pass()). 
last_pass = None @classmethod def get_last_pass(cls): """Return the number of cvs2svn's last pass.""" if cls.last_pass is None: out = run_script(cvs2svn, None, '--help-passes') cls.last_pass = int(out[-1].split()[0]) return cls.last_pass def __init__( self, conv_id, name, error_re, passbypass, symbols, args, options_file=None, symbol_hints_file=None, dumpfile=None, ): self.conv_id = conv_id self.name = name self.symbols = symbols if not os.path.isdir(tmp_dir): os.mkdir(tmp_dir) cvsrepos = os.path.join(test_data_dir, '%s-cvsrepos' % self.name) if dumpfile: self.dumpfile = os.path.join(tmp_dir, dumpfile) # Clean up from any previous invocations of this script. erase(self.dumpfile) else: self.dumpfile = None self.repos = os.path.join(tmp_dir, '%s-svnrepos' % self.conv_id) self._wc = os.path.join(tmp_dir, '%s-wc' % self.conv_id) self._wc_path = None self._wc_tree = None # Clean up from any previous invocations of this script. erase(self.repos) erase(self._wc) args = list(args) args.extend([ '--svnadmin=%s' % (svntest.main.svnadmin_binary,), ]) if options_file: self.options_file = os.path.join(cvsrepos, options_file) args.extend([ '--options=%s' % self.options_file, ]) assert not symbol_hints_file else: self.options_file = None if tmp_dir != 'cvs2svn-tmp': # Only include this argument if it differs from cvs2svn's default: args.extend([ '--tmpdir=%s' % tmp_dir, ]) if symbol_hints_file: self.symbol_hints_file = os.path.join(cvsrepos, symbol_hints_file) args.extend([ '--symbol-hints=%s' % self.symbol_hints_file, ]) if self.dumpfile: args.extend(['--dumpfile=%s' % (self.dumpfile,)]) else: args.extend(['-s', self.repos]) args.extend([cvsrepos]) if passbypass: self.stdout = [] for p in range(1, self.get_last_pass() + 1): self.stdout += run_script(cvs2svn, error_re, '-p', str(p), *args) else: self.stdout = run_script(cvs2svn, error_re, *args) if self.dumpfile: if not os.path.isfile(self.dumpfile): raise Failure( "Dumpfile not created: '%s'" % os.path.join(os.getcwd(), self.dumpfile) ) else: if os.path.isdir(self.repos): self.logs = parse_log(self.repos, self.symbols) elif error_re is None: raise Failure( "Repository not created: '%s'" % os.path.join(os.getcwd(), self.repos) ) def output_found(self, pattern): """Return True if PATTERN matches any line in self.stdout. PATTERN is a regular expression pattern as a string. """ pattern_re = re.compile(pattern) for line in self.stdout: if pattern_re.match(line): # We found the pattern that we were looking for. return 1 else: return 0 def find_tag_log(self, tagname): """Search LOGS for a log message containing 'TAGNAME' and return the log in which it was found.""" for i in xrange(len(self.logs), 0, -1): if self.logs[i].msg.find("'"+tagname+"'") != -1: return self.logs[i] raise ValueError("Tag %s not found in logs" % tagname) def get_wc(self, *args): """Return the path to the svn working copy, or a path within the WC. If a working copy has not been created yet, create it now. If ARGS are specified, then they should be strings that form fragments of a path within the WC. They are joined using os.path.join() and appended to the WC path.""" if self._wc_path is None: run_svn('co', repos_to_url(self.repos), self._wc) self._wc_path = self._wc return os.path.join(self._wc_path, *args) def get_wc_tree(self): if self._wc_tree is None: self._wc_tree = svntest.tree.build_tree_from_wc(self.get_wc(), 1) return self._wc_tree def path_exists(self, *args): """Return True if the specified path exists within the repository. 
(The strings in ARGS are first joined into a path using os.path.join().)""" return os.path.exists(self.get_wc(*args)) def check_props(self, keys, checks): """Helper function for checking lots of properties. For a list of files in the conversion, check that the values of the properties listed in KEYS agree with those listed in CHECKS. CHECKS is a list of tuples: [ (filename, [value, value, ...]), ...], where the values are listed in the same order as the key names are listed in KEYS.""" for (file, values) in checks: assert len(values) == len(keys) props = props_for_path(self.get_wc_tree(), file) for i in range(len(keys)): if props.get(keys[i]) != values[i]: raise Failure( "File %s has property %s set to \"%s\" " "(it should have been \"%s\").\n" % (file, keys[i], props.get(keys[i]), values[i],) ) class GitConversion: """A record of a cvs2svn conversion. Fields: name -- a one-word name indicating the CVS repository to be converted. stdout -- a list of lines written by cvs2svn to stdout.""" def __init__(self, name, error_re, args, options_file=None): self.name = name if not os.path.isdir(tmp_dir): os.mkdir(tmp_dir) cvsrepos = os.path.join(test_data_dir, '%s-cvsrepos' % self.name) args = list(args) if options_file: self.options_file = os.path.join(cvsrepos, options_file) args.extend([ '--options=%s' % self.options_file, ]) else: self.options_file = None self.stdout = run_script(cvs2git, error_re, *args) # Cache of conversions that have already been done. Keys are conv_id; # values are Conversion instances. already_converted = { } def ensure_conversion( name, error_re=None, passbypass=None, trunk=None, branches=None, tags=None, args=None, options_file=None, symbol_hints_file=None, dumpfile=None, ): """Convert CVS repository NAME to Subversion, but only if it has not been converted before by this invocation of this script. If it has been converted before, return the Conversion object from the previous invocation. If no error, return a Conversion instance. If ERROR_RE is a string, it is a regular expression expected to match some line of stderr printed by the conversion. If there is an error and ERROR_RE is not set, then raise Failure. If PASSBYPASS is set, then cvs2svn is run multiple times, each time with a -p option starting at 1 and increasing to a (hardcoded) maximum. NAME is just one word. For example, 'main' would mean to convert './test-data/main-cvsrepos', and after the conversion, the resulting Subversion repository would be in './cvs2svn-tmp/main-svnrepos', and a checked out head working copy in './cvs2svn-tmp/main-wc'. Any other options to pass to cvs2svn should be in ARGS, each element being one option, e.g., '--trunk-only'. If the option takes an argument, include it directly, e.g., '--mime-types=PATH'. Arguments are passed to cvs2svn in the order that they appear in ARGS. If OPTIONS_FILE is specified, then it should be the name of a file within the main directory of the cvs repository associated with this test. It is passed to cvs2svn using the --options option (which suppresses some other options that are incompatible with --options). If SYMBOL_HINTS_FILE is specified, then it should be the name of a file within the main directory of the cvs repository associated with this test. It is passed to cvs2svn using the --symbol-hints option. 
If DUMPFILE is specified, then it is the name of a dumpfile within the temporary directory to which the conversion output should be written.""" if args is None: args = [] else: args = list(args) if trunk is None: trunk = 'trunk' else: args.append('--trunk=%s' % (trunk,)) if branches is None: branches = 'branches' else: args.append('--branches=%s' % (branches,)) if tags is None: tags = 'tags' else: args.append('--tags=%s' % (tags,)) conv_id = make_conversion_id( name, args, passbypass, options_file, symbol_hints_file ) if conv_id not in already_converted: try: # Run the conversion and store the result for the rest of this # session: already_converted[conv_id] = Conversion( conv_id, name, error_re, passbypass, {'trunk' : trunk, 'branches' : branches, 'tags' : tags}, args, options_file, symbol_hints_file, dumpfile, ) except Failure: # Remember the failure so that a future attempt to run this conversion # does not bother to retry, but fails immediately. already_converted[conv_id] = None raise conv = already_converted[conv_id] if conv is None: raise Failure() return conv class Cvs2SvnTestFunction(TestCase): """A TestCase based on a naked Python function object. FUNC should be a function that returns None on success and throws an svntest.Failure exception on failure. It should have a brief docstring describing what it does (and fulfilling certain conditions). FUNC must take no arguments. This class is almost identical to svntest.testcase.FunctionTestCase, except that the test function does not require a sandbox and does not accept any parameter (not even sandbox=None). This class can be used as an annotation on a Python function. """ def __init__(self, func): # it better be a function that accepts no parameters and has a # docstring on it. assert isinstance(func, types.FunctionType) name = func.func_name assert func.func_code.co_argcount == 0, \ '%s must not take any arguments' % name doc = func.__doc__.strip() assert doc, '%s must have a docstring' % name # enforce stylistic guidelines for the function docstrings: # - no longer than 50 characters # - should not end in a period # - should not be capitalized assert len(doc) <= 50, \ "%s's docstring must be 50 characters or less" % name assert doc[-1] != '.', \ "%s's docstring should not end in a period" % name assert doc[0].lower() == doc[0], \ "%s's docstring should not be capitalized" % name TestCase.__init__(self, doc=doc) self.func = func def get_function_name(self): return self.func.func_name def get_sandbox_name(self): return None def run(self, sandbox): return self.func() class Cvs2HgTestFunction(Cvs2SvnTestFunction): """Same as Cvs2SvnTestFunction, but for test cases that should be skipped if Mercurial is not available. """ def run(self, sandbox): if not have_hg: raise svntest.Skip() else: return self.func() class Cvs2SvnTestCase(TestCase): def __init__( self, name, doc=None, variant=None, error_re=None, passbypass=None, trunk=None, branches=None, tags=None, args=None, options_file=None, symbol_hints_file=None, dumpfile=None, ): self.name = name if doc is None: # By default, use the first line of the class docstring as the # doc: doc = self.__doc__.splitlines()[0] if variant is not None: # Modify doc to show the variant. Trim doc first if necessary # to stay within the 50-character limit. 
suffix = '...variant %s' % (variant,) doc = doc[:50 - len(suffix)] + suffix TestCase.__init__(self, doc=doc) self.error_re = error_re self.passbypass = passbypass self.trunk = trunk self.branches = branches self.tags = tags self.args = args self.options_file = options_file self.symbol_hints_file = symbol_hints_file self.dumpfile = dumpfile def ensure_conversion(self): return ensure_conversion( self.name, error_re=self.error_re, passbypass=self.passbypass, trunk=self.trunk, branches=self.branches, tags=self.tags, args=self.args, options_file=self.options_file, symbol_hints_file=self.symbol_hints_file, dumpfile=self.dumpfile, ) def get_sandbox_name(self): return None class Cvs2SvnPropertiesTestCase(Cvs2SvnTestCase): """Test properties resulting from a conversion.""" def __init__(self, name, props_to_test, expected_props, **kw): """Initialize an instance of Cvs2SvnPropertiesTestCase. NAME is the name of the test, passed to Cvs2SvnTestCase. PROPS_TO_TEST is a list of the names of svn properties that should be tested. EXPECTED_PROPS is a list of tuples [(filename, [value,...])], where the second item in each tuple is a list of values expected for the properties listed in PROPS_TO_TEST for the specified filename. If a property must *not* be set, then its value should be listed as None.""" Cvs2SvnTestCase.__init__(self, name, **kw) self.props_to_test = props_to_test self.expected_props = expected_props def run(self, sbox): conv = self.ensure_conversion() conv.check_props(self.props_to_test, self.expected_props) #---------------------------------------------------------------------- # Tests. #---------------------------------------------------------------------- @Cvs2SvnTestFunction def show_usage(): "cvs2svn with no arguments shows usage" out = run_script(cvs2svn, None) if (len(out) > 2 and out[0].find('ERROR:') == 0 and out[1].find('DBM module')): print 'cvs2svn cannot execute due to lack of proper DBM module.' print 'Exiting without running any further tests.' 
sys.exit(1) if out[0].find('Usage:') < 0: raise Failure('Basic cvs2svn invocation failed.') @Cvs2SvnTestFunction def cvs2svn_manpage(): "generate a manpage for cvs2svn" out = run_script(cvs2svn, None, '--man') @Cvs2SvnTestFunction def cvs2git_manpage(): "generate a manpage for cvs2git" out = run_script(cvs2git, None, '--man') @Cvs2HgTestFunction def cvs2hg_manpage(): "generate a manpage for cvs2hg" out = run_script(cvs2hg, None, '--man') @Cvs2SvnTestFunction def show_help_passes(): "cvs2svn --help-passes shows pass information" out = run_script(cvs2svn, None, '--help-passes') if out[0].find('PASSES') < 0: raise Failure('cvs2svn --help-passes failed.') @Cvs2SvnTestFunction def attr_exec(): "detection of the executable flag" if sys.platform == 'win32': raise svntest.Skip() conv = ensure_conversion('main') st = os.stat(conv.get_wc('trunk', 'single-files', 'attr-exec')) if not st.st_mode & stat.S_IXUSR: raise Failure() @Cvs2SvnTestFunction def space_fname(): "conversion of filename with a space" conv = ensure_conversion('main') if not conv.path_exists('trunk', 'single-files', 'space fname'): raise Failure() @Cvs2SvnTestFunction def two_quick(): "two commits in quick succession" conv = ensure_conversion('main') logs = parse_log( os.path.join(conv.repos, 'trunk', 'single-files', 'twoquick'), {}) if len(logs) != 2: raise Failure() class PruneWithCare(Cvs2SvnTestCase): "prune, but never too much" def __init__(self, **kw): Cvs2SvnTestCase.__init__(self, 'main', **kw) def run(self, sbox): # Robert Pluim encountered this lovely one while converting the # directory src/gnu/usr.bin/cvs/contrib/pcl-cvs/ in FreeBSD's CVS # repository (see issue #1302). Step 4 is the doozy: # # revision 1: adds trunk/blah/, adds trunk/blah/first # revision 2: adds trunk/blah/second # revision 3: deletes trunk/blah/first # revision 4: deletes blah [re-deleting trunk/blah/first pruned blah!] # revision 5: does nothing # # After fixing cvs2svn, the sequence (correctly) looks like this: # # revision 1: adds trunk/blah/, adds trunk/blah/first # revision 2: adds trunk/blah/second # revision 3: deletes trunk/blah/first # revision 4: does nothing [because trunk/blah/first already deleted] # revision 5: deletes blah # # The difference is in 4 and 5. In revision 4, it's not correct # to prune blah/, because second is still in there, so revision 4 # does nothing now. But when we delete second in 5, that should # bubble up and prune blah/ instead. # # ### Note that empty revisions like 4 are probably going to become # ### at least optional, if not banished entirely from cvs2svn's # ### output. Hmmm, or they may stick around, with an extra # ### revision property explaining what happened. Need to think # ### about that. In some sense, it's a bug in Subversion itself, # ### that such revisions don't show up in 'svn log' output. conv = self.ensure_conversion() # Confirm that revision 4 removes '/trunk/full-prune/first', # and that revision 6 removes '/trunk/full-prune'. # # Also confirm similar things about '/full-prune-reappear/...', # which is similar, except that later on it reappears, restored # from pruneland, because a file gets added to it. # # And finally, a similar thing for '/partial-prune/...', except that # in its case, a permanent file on the top level prevents the # pruning from going farther than the subdirectory containing first # and second. 
for path in ('full-prune/first', 'full-prune-reappear/sub/first', 'partial-prune/sub/first'): conv.logs[5].check_change('/%(trunk)s/' + path, 'D') for path in ('full-prune', 'full-prune-reappear', 'partial-prune/sub'): conv.logs[7].check_change('/%(trunk)s/' + path, 'D') for path in ('full-prune-reappear', 'full-prune-reappear/appears-later'): conv.logs[33].check_change('/%(trunk)s/' + path, 'A') @Cvs2SvnTestFunction def interleaved_commits(): "two interleaved trunk commits, different log msgs" # See test-data/main-cvsrepos/proj/README. conv = ensure_conversion('main') # The initial import. rev = 26 conv.logs[rev].check('Initial import.', ( ('/%(trunk)s/interleaved', 'A'), ('/%(trunk)s/interleaved/1', 'A'), ('/%(trunk)s/interleaved/2', 'A'), ('/%(trunk)s/interleaved/3', 'A'), ('/%(trunk)s/interleaved/4', 'A'), ('/%(trunk)s/interleaved/5', 'A'), ('/%(trunk)s/interleaved/a', 'A'), ('/%(trunk)s/interleaved/b', 'A'), ('/%(trunk)s/interleaved/c', 'A'), ('/%(trunk)s/interleaved/d', 'A'), ('/%(trunk)s/interleaved/e', 'A'), )) def check_letters(rev): """Check if REV is the rev where only letters were committed.""" conv.logs[rev].check('Committing letters only.', ( ('/%(trunk)s/interleaved/a', 'M'), ('/%(trunk)s/interleaved/b', 'M'), ('/%(trunk)s/interleaved/c', 'M'), ('/%(trunk)s/interleaved/d', 'M'), ('/%(trunk)s/interleaved/e', 'M'), )) def check_numbers(rev): """Check if REV is the rev where only numbers were committed.""" conv.logs[rev].check('Committing numbers only.', ( ('/%(trunk)s/interleaved/1', 'M'), ('/%(trunk)s/interleaved/2', 'M'), ('/%(trunk)s/interleaved/3', 'M'), ('/%(trunk)s/interleaved/4', 'M'), ('/%(trunk)s/interleaved/5', 'M'), )) # One of the commits was letters only, the other was numbers only. # But they happened "simultaneously", so we don't assume anything # about which commit appeared first, so we just try both ways. rev += 1 try: check_letters(rev) check_numbers(rev + 1) except Failure: check_numbers(rev) check_letters(rev + 1) @Cvs2SvnTestFunction def simple_commits(): "simple trunk commits" # See test-data/main-cvsrepos/proj/README. conv = ensure_conversion('main') # The initial import. conv.logs[13].check('Initial import.', ( ('/%(trunk)s/proj', 'A'), ('/%(trunk)s/proj/default', 'A'), ('/%(trunk)s/proj/sub1', 'A'), ('/%(trunk)s/proj/sub1/default', 'A'), ('/%(trunk)s/proj/sub1/subsubA', 'A'), ('/%(trunk)s/proj/sub1/subsubA/default', 'A'), ('/%(trunk)s/proj/sub1/subsubB', 'A'), ('/%(trunk)s/proj/sub1/subsubB/default', 'A'), ('/%(trunk)s/proj/sub2', 'A'), ('/%(trunk)s/proj/sub2/default', 'A'), ('/%(trunk)s/proj/sub2/subsubA', 'A'), ('/%(trunk)s/proj/sub2/subsubA/default', 'A'), ('/%(trunk)s/proj/sub3', 'A'), ('/%(trunk)s/proj/sub3/default', 'A'), )) # The first commit. conv.logs[18].check('First commit to proj, affecting two files.', ( ('/%(trunk)s/proj/sub1/subsubA/default', 'M'), ('/%(trunk)s/proj/sub3/default', 'M'), )) # The second commit. conv.logs[19].check('Second commit to proj, affecting all 7 files.', ( ('/%(trunk)s/proj/default', 'M'), ('/%(trunk)s/proj/sub1/default', 'M'), ('/%(trunk)s/proj/sub1/subsubA/default', 'M'), ('/%(trunk)s/proj/sub1/subsubB/default', 'M'), ('/%(trunk)s/proj/sub2/default', 'M'), ('/%(trunk)s/proj/sub2/subsubA/default', 'M'), ('/%(trunk)s/proj/sub3/default', 'M') )) class SimpleTags(Cvs2SvnTestCase): "simple tags and branches, no commits" def __init__(self, **kw): # See test-data/main-cvsrepos/proj/README. 
Cvs2SvnTestCase.__init__(self, 'main', **kw) def run(self, sbox): conv = self.ensure_conversion() # Verify the copy source for the tags we are about to check # No need to verify the copyfrom revision, as simple_commits did that conv.logs[13].check('Initial import.', ( ('/%(trunk)s/proj', 'A'), ('/%(trunk)s/proj/default', 'A'), ('/%(trunk)s/proj/sub1', 'A'), ('/%(trunk)s/proj/sub1/default', 'A'), ('/%(trunk)s/proj/sub1/subsubA', 'A'), ('/%(trunk)s/proj/sub1/subsubA/default', 'A'), ('/%(trunk)s/proj/sub1/subsubB', 'A'), ('/%(trunk)s/proj/sub1/subsubB/default', 'A'), ('/%(trunk)s/proj/sub2', 'A'), ('/%(trunk)s/proj/sub2/default', 'A'), ('/%(trunk)s/proj/sub2/subsubA', 'A'), ('/%(trunk)s/proj/sub2/subsubA/default', 'A'), ('/%(trunk)s/proj/sub3', 'A'), ('/%(trunk)s/proj/sub3/default', 'A'), )) # Tag on rev 1.1.1.1 of all files in proj conv.logs[16].check(sym_log_msg('B_FROM_INITIALS'), ( ('/%(branches)s/B_FROM_INITIALS (from /%(trunk)s:13)', 'A'), ('/%(branches)s/B_FROM_INITIALS/single-files', 'D'), ('/%(branches)s/B_FROM_INITIALS/partial-prune', 'D'), )) # The same, as a tag log = conv.find_tag_log('T_ALL_INITIAL_FILES') log.check(sym_log_msg('T_ALL_INITIAL_FILES',1), ( ('/%(tags)s/T_ALL_INITIAL_FILES (from /%(trunk)s:13)', 'A'), ('/%(tags)s/T_ALL_INITIAL_FILES/single-files', 'D'), ('/%(tags)s/T_ALL_INITIAL_FILES/partial-prune', 'D'), )) # Tag on rev 1.1.1.1 of all files in proj, except one log = conv.find_tag_log('T_ALL_INITIAL_FILES_BUT_ONE') log.check(sym_log_msg('T_ALL_INITIAL_FILES_BUT_ONE',1), ( ('/%(tags)s/T_ALL_INITIAL_FILES_BUT_ONE (from /%(trunk)s:13)', 'A'), ('/%(tags)s/T_ALL_INITIAL_FILES_BUT_ONE/single-files', 'D'), ('/%(tags)s/T_ALL_INITIAL_FILES_BUT_ONE/partial-prune', 'D'), ('/%(tags)s/T_ALL_INITIAL_FILES_BUT_ONE/proj/sub1/subsubB', 'D'), )) # The same, as a branch conv.logs[17].check(sym_log_msg('B_FROM_INITIALS_BUT_ONE'), ( ('/%(branches)s/B_FROM_INITIALS_BUT_ONE (from /%(trunk)s:13)', 'A'), ('/%(branches)s/B_FROM_INITIALS_BUT_ONE/proj/sub1/subsubB', 'D'), ('/%(branches)s/B_FROM_INITIALS_BUT_ONE/single-files', 'D'), ('/%(branches)s/B_FROM_INITIALS_BUT_ONE/partial-prune', 'D'), )) @Cvs2SvnTestFunction def simple_branch_commits(): "simple branch commits" # See test-data/main-cvsrepos/proj/README. conv = ensure_conversion('main') conv.logs[23].check('Modify three files, on branch B_MIXED.', ( ('/%(branches)s/B_MIXED/proj/default', 'M'), ('/%(branches)s/B_MIXED/proj/sub1/default', 'M'), ('/%(branches)s/B_MIXED/proj/sub2/subsubA/default', 'M'), )) @Cvs2SvnTestFunction def mixed_time_tag(): "mixed-time tag" # See test-data/main-cvsrepos/proj/README. conv = ensure_conversion('main') log = conv.find_tag_log('T_MIXED') log.check_changes(( ('/%(tags)s/T_MIXED (from /%(trunk)s:19)', 'A'), ('/%(tags)s/T_MIXED/single-files', 'D'), ('/%(tags)s/T_MIXED/partial-prune', 'D'), ('/%(tags)s/T_MIXED/proj/sub2/subsubA ' '(from /%(trunk)s/proj/sub2/subsubA:13)', 'R'), ('/%(tags)s/T_MIXED/proj/sub3 (from /%(trunk)s/proj/sub3:18)', 'R'), )) @Cvs2SvnTestFunction def mixed_time_branch_with_added_file(): "mixed-time branch, and a file added to the branch" # See test-data/main-cvsrepos/proj/README. 
conv = ensure_conversion('main') # A branch from the same place as T_MIXED in the previous test, # plus a file added directly to the branch conv.logs[21].check(sym_log_msg('B_MIXED'), ( ('/%(branches)s/B_MIXED (from /%(trunk)s:19)', 'A'), ('/%(branches)s/B_MIXED/partial-prune', 'D'), ('/%(branches)s/B_MIXED/single-files', 'D'), ('/%(branches)s/B_MIXED/proj/sub2/subsubA ' '(from /%(trunk)s/proj/sub2/subsubA:13)', 'R'), ('/%(branches)s/B_MIXED/proj/sub3 (from /%(trunk)s/proj/sub3:18)', 'R'), )) conv.logs[22].check('Add a file on branch B_MIXED.', ( ('/%(branches)s/B_MIXED/proj/sub2/branch_B_MIXED_only', 'A'), )) @Cvs2SvnTestFunction def mixed_commit(): "a commit affecting both trunk and a branch" # See test-data/main-cvsrepos/proj/README. conv = ensure_conversion('main') conv.logs[24].check( 'A single commit affecting one file on branch B_MIXED ' 'and one on trunk.', ( ('/%(trunk)s/proj/sub2/default', 'M'), ('/%(branches)s/B_MIXED/proj/sub2/branch_B_MIXED_only', 'M'), )) @Cvs2SvnTestFunction def split_time_branch(): "branch some trunk files, and later branch the rest" # See test-data/main-cvsrepos/proj/README. conv = ensure_conversion('main') # First change on the branch, creating it conv.logs[25].check(sym_log_msg('B_SPLIT'), ( ('/%(branches)s/B_SPLIT (from /%(trunk)s:24)', 'A'), ('/%(branches)s/B_SPLIT/partial-prune', 'D'), ('/%(branches)s/B_SPLIT/single-files', 'D'), ('/%(branches)s/B_SPLIT/proj/sub1/subsubB', 'D'), )) conv.logs[29].check('First change on branch B_SPLIT.', ( ('/%(branches)s/B_SPLIT/proj/default', 'M'), ('/%(branches)s/B_SPLIT/proj/sub1/default', 'M'), ('/%(branches)s/B_SPLIT/proj/sub1/subsubA/default', 'M'), ('/%(branches)s/B_SPLIT/proj/sub2/default', 'M'), ('/%(branches)s/B_SPLIT/proj/sub2/subsubA/default', 'M'), )) # A trunk commit for the file which was not branched conv.logs[30].check('A trunk change to sub1/subsubB/default. ' 'This was committed about an', ( ('/%(trunk)s/proj/sub1/subsubB/default', 'M'), )) # Add the file not already branched to the branch, with modification:w conv.logs[31].check(sym_log_msg('B_SPLIT'), ( ('/%(branches)s/B_SPLIT/proj/sub1/subsubB ' '(from /%(trunk)s/proj/sub1/subsubB:30)', 'A'), )) conv.logs[32].check('This change affects sub3/default and ' 'sub1/subsubB/default, on branch', ( ('/%(branches)s/B_SPLIT/proj/sub1/subsubB/default', 'M'), ('/%(branches)s/B_SPLIT/proj/sub3/default', 'M'), )) @Cvs2SvnTestFunction def multiple_tags(): "multiple tags referring to same revision" conv = ensure_conversion('main') if not conv.path_exists('tags', 'T_ALL_INITIAL_FILES', 'proj', 'default'): raise Failure() if not conv.path_exists( 'tags', 'T_ALL_INITIAL_FILES_BUT_ONE', 'proj', 'default'): raise Failure() @Cvs2SvnTestFunction def multiply_defined_symbols(): "multiple definitions of symbol names" # We can only check one line of the error output at a time, so test # twice. (The conversion only have to be done once because the # results are cached.) 
conv = ensure_conversion( 'multiply-defined-symbols', error_re=( r"ERROR\: Multiple definitions of the symbol \'BRANCH\' .*\: " r"1\.2\.4 1\.2\.2" ), ) conv = ensure_conversion( 'multiply-defined-symbols', error_re=( r"ERROR\: Multiple definitions of the symbol \'TAG\' .*\: " r"1\.2 1\.1" ), ) @Cvs2SvnTestFunction def multiply_defined_symbols_renamed(): "rename multiply defined symbols" conv = ensure_conversion( 'multiply-defined-symbols', options_file='cvs2svn-rename.options', ) @Cvs2SvnTestFunction def multiply_defined_symbols_ignored(): "ignore multiply defined symbols" conv = ensure_conversion( 'multiply-defined-symbols', options_file='cvs2svn-ignore.options', ) @Cvs2SvnTestFunction def repeatedly_defined_symbols(): "multiple identical definitions of symbol names" # If a symbol is defined multiple times but has the same value each # time, that should not be an error. conv = ensure_conversion('repeatedly-defined-symbols') @Cvs2SvnTestFunction def bogus_tag(): "conversion of invalid symbolic names" conv = ensure_conversion('bogus-tag') @Cvs2SvnTestFunction def overlapping_branch(): "ignore a file with a branch with two names" conv = ensure_conversion('overlapping-branch') if not conv.output_found('.*cannot also have name \'vendorB\''): raise Failure() conv.logs[2].check('imported', ( ('/%(trunk)s/nonoverlapping-branch', 'A'), ('/%(trunk)s/overlapping-branch', 'A'), )) if len(conv.logs) != 2: raise Failure() class PhoenixBranch(Cvs2SvnTestCase): "convert a branch file rooted in a 'dead' revision" def __init__(self, **kw): Cvs2SvnTestCase.__init__(self, 'phoenix', **kw) def run(self, sbox): conv = self.ensure_conversion() conv.logs[8].check('This file was supplied by Jack Moffitt', ( ('/%(branches)s/volsung_20010721', 'A'), ('/%(branches)s/volsung_20010721/phoenix', 'A'), )) conv.logs[9].check('This file was supplied by Jack Moffitt', ( ('/%(branches)s/volsung_20010721/phoenix', 'M'), )) ###TODO: We check for 4 changed paths here to accomodate creating tags ###and branches in rev 1, but that will change, so this will ###eventually change back. @Cvs2SvnTestFunction def ctrl_char_in_log(): "handle a control char in a log message" # This was issue #1106. rev = 2 conv = ensure_conversion('ctrl-char-in-log') conv.logs[rev].check_changes(( ('/%(trunk)s/ctrl-char-in-log', 'A'), )) if conv.logs[rev].msg.find('\x04') < 0: raise Failure( "Log message of 'ctrl-char-in-log,v' (rev 2) is wrong.") @Cvs2SvnTestFunction def overdead(): "handle tags rooted in a redeleted revision" conv = ensure_conversion('overdead') class NoTrunkPrune(Cvs2SvnTestCase): "ensure that trunk doesn't get pruned" def __init__(self, **kw): Cvs2SvnTestCase.__init__(self, 'overdead', **kw) def run(self, sbox): conv = self.ensure_conversion() for rev in conv.logs.keys(): rev_logs = conv.logs[rev] if rev_logs.get_path_op('/%(trunk)s') == 'D': raise Failure() @Cvs2SvnTestFunction def double_delete(): "file deleted twice, in the root of the repository" # This really tests several things: how we handle a file that's # removed (state 'dead') in two successive revisions; how we # handle a file in the root of the repository (there were some # bugs in cvs2svn's svn path construction for top-level files); and # the --no-prune option. 
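  # (As described in ensure_conversion()'s docstring, each entry in args
  # is handed to cvs2svn as one command-line option; an option that takes
  # a value is written as a single '--option=value' string.)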
conv = ensure_conversion( 'double-delete', args=['--trunk-only', '--no-prune']) path = '/%(trunk)s/twice-removed' rev = 2 conv.logs[rev].check('Updated CVS', ( (path, 'A'), )) conv.logs[rev + 1].check('Remove this file for the first time.', ( (path, 'D'), )) conv.logs[rev + 2].check('Remove this file for the second time,', ( )) @Cvs2SvnTestFunction def split_branch(): "branch created from both trunk and another branch" # See test-data/split-branch-cvsrepos/README. # # The conversion will fail if the bug is present, and # ensure_conversion will raise Failure. conv = ensure_conversion('split-branch') @Cvs2SvnTestFunction def resync_misgroups(): "resyncing should not misorder commit groups" # See test-data/resync-misgroups-cvsrepos/README. # # The conversion will fail if the bug is present, and # ensure_conversion will raise Failure. conv = ensure_conversion('resync-misgroups') class TaggedBranchAndTrunk(Cvs2SvnTestCase): "allow tags with mixed trunk and branch sources" def __init__(self, **kw): Cvs2SvnTestCase.__init__(self, 'tagged-branch-n-trunk', **kw) def run(self, sbox): conv = self.ensure_conversion() tags = conv.symbols.get('tags', 'tags') a_path = conv.get_wc(tags, 'some-tag', 'a.txt') b_path = conv.get_wc(tags, 'some-tag', 'b.txt') if not (os.path.exists(a_path) and os.path.exists(b_path)): raise Failure() if (open(a_path, 'r').read().find('1.24') == -1) \ or (open(b_path, 'r').read().find('1.5') == -1): raise Failure() @Cvs2SvnTestFunction def enroot_race(): "never use the rev-in-progress as a copy source" # See issue #1427 and r8544. conv = ensure_conversion('enroot-race') rev = 6 conv.logs[rev].check_changes(( ('/%(branches)s/mybranch (from /%(trunk)s:5)', 'A'), ('/%(branches)s/mybranch/proj/a.txt', 'D'), ('/%(branches)s/mybranch/proj/b.txt', 'D'), )) conv.logs[rev + 1].check_changes(( ('/%(branches)s/mybranch/proj/c.txt', 'M'), ('/%(trunk)s/proj/a.txt', 'M'), ('/%(trunk)s/proj/b.txt', 'M'), )) @Cvs2SvnTestFunction def enroot_race_obo(): "do use the last completed rev as a copy source" conv = ensure_conversion('enroot-race-obo') conv.logs[3].check_change('/%(branches)s/BRANCH (from /%(trunk)s:2)', 'A') if not len(conv.logs) == 3: raise Failure() class BranchDeleteFirst(Cvs2SvnTestCase): "correctly handle deletion as initial branch action" def __init__(self, **kw): Cvs2SvnTestCase.__init__(self, 'branch-delete-first', **kw) def run(self, sbox): # See test-data/branch-delete-first-cvsrepos/README. # # The conversion will fail if the bug is present, and # ensure_conversion would raise Failure. conv = self.ensure_conversion() branches = conv.symbols.get('branches', 'branches') # 'file' was deleted from branch-1 and branch-2, but not branch-3 if conv.path_exists(branches, 'branch-1', 'file'): raise Failure() if conv.path_exists(branches, 'branch-2', 'file'): raise Failure() if not conv.path_exists(branches, 'branch-3', 'file'): raise Failure() @Cvs2SvnTestFunction def nonascii_filenames(): "non ascii files converted incorrectly" # see issue #1255 # on a en_US.iso-8859-1 machine this test fails with # svn: Can't recode ... # # as described in the issue # on a en_US.UTF-8 machine this test fails with # svn: Malformed XML ... # # which means at least it fails. Unfortunately it won't fail # with the same error... 
# mangle current locale settings so we know we're not running # a UTF-8 locale (which does not exhibit this problem) current_locale = locale.getlocale() new_locale = 'en_US.ISO8859-1' locale_changed = None # From http://docs.python.org/lib/module-sys.html # # getfilesystemencoding(): # # Return the name of the encoding used to convert Unicode filenames # into system file names, or None if the system default encoding is # used. The result value depends on the operating system: # # - On Windows 9x, the encoding is ``mbcs''. # - On Mac OS X, the encoding is ``utf-8''. # - On Unix, the encoding is the user's preference according to the # result of nl_langinfo(CODESET), or None if the # nl_langinfo(CODESET) failed. # - On Windows NT+, file names are Unicode natively, so no conversion is # performed. # So we're going to skip this test on Mac OS X for now. if sys.platform == "darwin": raise svntest.Skip() try: # change locale to non-UTF-8 locale to generate latin1 names locale.setlocale(locale.LC_ALL, # this might be too broad? new_locale) locale_changed = 1 except locale.Error: raise svntest.Skip() try: srcrepos_path = os.path.join(test_data_dir,'main-cvsrepos') dstrepos_path = os.path.join(test_data_dir,'non-ascii-cvsrepos') if not os.path.exists(dstrepos_path): # create repos from existing main repos shutil.copytree(srcrepos_path, dstrepos_path) base_path = os.path.join(dstrepos_path, 'single-files') shutil.copyfile(os.path.join(base_path, 'twoquick,v'), os.path.join(base_path, 'two\366uick,v')) new_path = os.path.join(dstrepos_path, 'single\366files') os.rename(base_path, new_path) conv = ensure_conversion('non-ascii', args=['--encoding=latin1']) finally: if locale_changed: locale.setlocale(locale.LC_ALL, current_locale) safe_rmtree(dstrepos_path) class UnicodeTest(Cvs2SvnTestCase): "metadata contains Unicode" warning_pattern = r'ERROR\: There were warnings converting .* messages' def __init__(self, name, warning_expected, **kw): if warning_expected: error_re = self.warning_pattern else: error_re = None Cvs2SvnTestCase.__init__(self, name, error_re=error_re, **kw) self.warning_expected = warning_expected def run(self, sbox): try: # ensure the availability of the "utf_8" encoding: u'a'.encode('utf_8').decode('utf_8') except LookupError: raise svntest.Skip() self.ensure_conversion() class UnicodeAuthor(UnicodeTest): "author name contains Unicode" def __init__(self, warning_expected, **kw): UnicodeTest.__init__(self, 'unicode-author', warning_expected, **kw) class UnicodeLog(UnicodeTest): "log message contains Unicode" def __init__(self, warning_expected, **kw): UnicodeTest.__init__(self, 'unicode-log', warning_expected, **kw) @Cvs2SvnTestFunction def vendor_branch_sameness(): "avoid spurious changes for initial revs" conv = ensure_conversion( 'vendor-branch-sameness', args=['--keep-trivial-imports'] ) # The following files are in this repository: # # a.txt: Imported in the traditional way; 1.1 and 1.1.1.1 have # the same contents, the file's default branch is 1.1.1, # and both revisions are in state 'Exp'. # # b.txt: Like a.txt, except that 1.1.1.1 has a real change from # 1.1 (the addition of a line of text). # # c.txt: Like a.txt, except that 1.1.1.1 is in state 'dead'. # # d.txt: This file was created by 'cvs add' instead of import, so # it has only 1.1 -- no 1.1.1.1, and no default branch. # The timestamp on the add is exactly the same as for the # imports of the other files. # # e.txt: Like a.txt, except that the log message for revision 1.1 # is not the standard import log message. 
# # (Aside from e.txt, the log messages for the same revisions are the # same in all files.) # # We expect that only a.txt is recognized as an import whose 1.1 # revision can be omitted. The other files should be added on trunk # then filled to vbranchA, whereas a.txt should be added to vbranchA # then copied to trunk. In the copy of 1.1.1.1 back to trunk, a.txt # and e.txt should be copied untouched; b.txt should be 'M'odified, # and c.txt should be 'D'eleted. rev = 2 conv.logs[rev].check('Initial revision', ( ('/%(trunk)s/proj', 'A'), ('/%(trunk)s/proj/b.txt', 'A'), ('/%(trunk)s/proj/c.txt', 'A'), ('/%(trunk)s/proj/d.txt', 'A'), )) conv.logs[rev + 1].check(sym_log_msg('vbranchA'), ( ('/%(branches)s/vbranchA (from /%(trunk)s:2)', 'A'), ('/%(branches)s/vbranchA/proj/d.txt', 'D'), )) conv.logs[rev + 2].check('First vendor branch revision.', ( ('/%(branches)s/vbranchA/proj/a.txt', 'A'), ('/%(branches)s/vbranchA/proj/b.txt', 'M'), ('/%(branches)s/vbranchA/proj/c.txt', 'D'), )) conv.logs[rev + 3].check('This commit was generated by cvs2svn ' 'to compensate for changes in r4,', ( ('/%(trunk)s/proj/a.txt (from /%(branches)s/vbranchA/proj/a.txt:4)', 'A'), ('/%(trunk)s/proj/b.txt (from /%(branches)s/vbranchA/proj/b.txt:4)', 'R'), ('/%(trunk)s/proj/c.txt', 'D'), )) rev = 7 conv.logs[rev].check('This log message is not the standard', ( ('/%(trunk)s/proj/e.txt', 'A'), )) conv.logs[rev + 2].check('First vendor branch revision', ( ('/%(branches)s/vbranchB/proj/e.txt', 'M'), )) conv.logs[rev + 3].check('This commit was generated by cvs2svn ' 'to compensate for changes in r9,', ( ('/%(trunk)s/proj/e.txt (from /%(branches)s/vbranchB/proj/e.txt:9)', 'R'), )) @Cvs2SvnTestFunction def vendor_branch_trunk_only(): "handle vendor branches with --trunk-only" conv = ensure_conversion('vendor-branch-sameness', args=['--trunk-only']) rev = 2 conv.logs[rev].check('Initial revision', ( ('/%(trunk)s/proj', 'A'), ('/%(trunk)s/proj/b.txt', 'A'), ('/%(trunk)s/proj/c.txt', 'A'), ('/%(trunk)s/proj/d.txt', 'A'), )) conv.logs[rev + 1].check('First vendor branch revision', ( ('/%(trunk)s/proj/a.txt', 'A'), ('/%(trunk)s/proj/b.txt', 'M'), ('/%(trunk)s/proj/c.txt', 'D'), )) conv.logs[rev + 2].check('This log message is not the standard', ( ('/%(trunk)s/proj/e.txt', 'A'), )) conv.logs[rev + 3].check('First vendor branch revision', ( ('/%(trunk)s/proj/e.txt', 'M'), )) @Cvs2SvnTestFunction def default_branches(): "handle default branches correctly" conv = ensure_conversion('default-branches') # There are seven files in the repository: # # a.txt: # Imported in the traditional way, so 1.1 and 1.1.1.1 are the # same. Then 1.1.1.2 and 1.1.1.3 were imported, then 1.2 # committed (thus losing the default branch "1.1.1"), then # 1.1.1.4 was imported. All vendor import release tags are # still present. # # b.txt: # Like a.txt, but without rev 1.2. # # c.txt: # Exactly like b.txt, just s/b.txt/c.txt/ in content. # # d.txt: # Same as the previous two, but 1.1.1 branch is unlabeled. # # e.txt: # Same, but missing 1.1.1 label and all tags but 1.1.1.3. # # deleted-on-vendor-branch.txt,v: # Like b.txt and c.txt, except that 1.1.1.3 is state 'dead'. # # added-then-imported.txt,v: # Added with 'cvs add' to create 1.1, then imported with # completely different contents to create 1.1.1.1, therefore # never had a default branch. 
# conv.logs[2].check("Import (vbranchA, vtag-1).", ( ('/%(branches)s/unlabeled-1.1.1', 'A'), ('/%(branches)s/unlabeled-1.1.1/proj', 'A'), ('/%(branches)s/unlabeled-1.1.1/proj/d.txt', 'A'), ('/%(branches)s/unlabeled-1.1.1/proj/e.txt', 'A'), ('/%(branches)s/vbranchA', 'A'), ('/%(branches)s/vbranchA/proj', 'A'), ('/%(branches)s/vbranchA/proj/a.txt', 'A'), ('/%(branches)s/vbranchA/proj/b.txt', 'A'), ('/%(branches)s/vbranchA/proj/c.txt', 'A'), ('/%(branches)s/vbranchA/proj/deleted-on-vendor-branch.txt', 'A'), )) conv.logs[3].check("This commit was generated by cvs2svn " "to compensate for changes in r2,", ( ('/%(trunk)s/proj', 'A'), ('/%(trunk)s/proj/a.txt (from /%(branches)s/vbranchA/proj/a.txt:2)', 'A'), ('/%(trunk)s/proj/b.txt (from /%(branches)s/vbranchA/proj/b.txt:2)', 'A'), ('/%(trunk)s/proj/c.txt (from /%(branches)s/vbranchA/proj/c.txt:2)', 'A'), ('/%(trunk)s/proj/d.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/d.txt:2)', 'A'), ('/%(trunk)s/proj/deleted-on-vendor-branch.txt ' '(from /%(branches)s/vbranchA/proj/deleted-on-vendor-branch.txt:2)', 'A'), ('/%(trunk)s/proj/e.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/e.txt:2)', 'A'), )) conv.logs[4].check(sym_log_msg('vtag-1',1), ( ('/%(tags)s/vtag-1 (from /%(branches)s/vbranchA:2)', 'A'), ('/%(tags)s/vtag-1/proj/d.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/d.txt:2)', 'A'), )) conv.logs[5].check("Import (vbranchA, vtag-2).", ( ('/%(branches)s/unlabeled-1.1.1/proj/d.txt', 'M'), ('/%(branches)s/unlabeled-1.1.1/proj/e.txt', 'M'), ('/%(branches)s/vbranchA/proj/a.txt', 'M'), ('/%(branches)s/vbranchA/proj/b.txt', 'M'), ('/%(branches)s/vbranchA/proj/c.txt', 'M'), ('/%(branches)s/vbranchA/proj/deleted-on-vendor-branch.txt', 'M'), )) conv.logs[6].check("This commit was generated by cvs2svn " "to compensate for changes in r5,", ( ('/%(trunk)s/proj/a.txt ' '(from /%(branches)s/vbranchA/proj/a.txt:5)', 'R'), ('/%(trunk)s/proj/b.txt ' '(from /%(branches)s/vbranchA/proj/b.txt:5)', 'R'), ('/%(trunk)s/proj/c.txt ' '(from /%(branches)s/vbranchA/proj/c.txt:5)', 'R'), ('/%(trunk)s/proj/d.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/d.txt:5)', 'R'), ('/%(trunk)s/proj/deleted-on-vendor-branch.txt ' '(from /%(branches)s/vbranchA/proj/deleted-on-vendor-branch.txt:5)', 'R'), ('/%(trunk)s/proj/e.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/e.txt:5)', 'R'), )) conv.logs[7].check(sym_log_msg('vtag-2',1), ( ('/%(tags)s/vtag-2 (from /%(branches)s/vbranchA:5)', 'A'), ('/%(tags)s/vtag-2/proj/d.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/d.txt:5)', 'A'), )) conv.logs[8].check("Import (vbranchA, vtag-3).", ( ('/%(branches)s/unlabeled-1.1.1/proj/d.txt', 'M'), ('/%(branches)s/unlabeled-1.1.1/proj/e.txt', 'M'), ('/%(branches)s/vbranchA/proj/a.txt', 'M'), ('/%(branches)s/vbranchA/proj/b.txt', 'M'), ('/%(branches)s/vbranchA/proj/c.txt', 'M'), ('/%(branches)s/vbranchA/proj/deleted-on-vendor-branch.txt', 'D'), )) conv.logs[9].check("This commit was generated by cvs2svn " "to compensate for changes in r8,", ( ('/%(trunk)s/proj/a.txt ' '(from /%(branches)s/vbranchA/proj/a.txt:8)', 'R'), ('/%(trunk)s/proj/b.txt ' '(from /%(branches)s/vbranchA/proj/b.txt:8)', 'R'), ('/%(trunk)s/proj/c.txt ' '(from /%(branches)s/vbranchA/proj/c.txt:8)', 'R'), ('/%(trunk)s/proj/d.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/d.txt:8)', 'R'), ('/%(trunk)s/proj/deleted-on-vendor-branch.txt', 'D'), ('/%(trunk)s/proj/e.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/e.txt:8)', 'R'), )) conv.logs[10].check(sym_log_msg('vtag-3',1), ( ('/%(tags)s/vtag-3 (from /%(branches)s/vbranchA:8)', 
'A'), ('/%(tags)s/vtag-3/proj/d.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/d.txt:8)', 'A'), ('/%(tags)s/vtag-3/proj/e.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/e.txt:8)', 'A'), )) conv.logs[11].check("First regular commit, to a.txt, on vtag-3.", ( ('/%(trunk)s/proj/a.txt', 'M'), )) conv.logs[12].check("Add a file to the working copy.", ( ('/%(trunk)s/proj/added-then-imported.txt', 'A'), )) conv.logs[13].check(sym_log_msg('vbranchA'), ( ('/%(branches)s/vbranchA/proj/added-then-imported.txt ' '(from /%(trunk)s/proj/added-then-imported.txt:12)', 'A'), )) conv.logs[14].check("Import (vbranchA, vtag-4).", ( ('/%(branches)s/unlabeled-1.1.1/proj/d.txt', 'M'), ('/%(branches)s/unlabeled-1.1.1/proj/e.txt', 'M'), ('/%(branches)s/vbranchA/proj/a.txt', 'M'), ('/%(branches)s/vbranchA/proj/added-then-imported.txt', 'M'), # CHECK!!! ('/%(branches)s/vbranchA/proj/b.txt', 'M'), ('/%(branches)s/vbranchA/proj/c.txt', 'M'), ('/%(branches)s/vbranchA/proj/deleted-on-vendor-branch.txt', 'A'), )) conv.logs[15].check("This commit was generated by cvs2svn " "to compensate for changes in r14,", ( ('/%(trunk)s/proj/b.txt ' '(from /%(branches)s/vbranchA/proj/b.txt:14)', 'R'), ('/%(trunk)s/proj/c.txt ' '(from /%(branches)s/vbranchA/proj/c.txt:14)', 'R'), ('/%(trunk)s/proj/d.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/d.txt:14)', 'R'), ('/%(trunk)s/proj/deleted-on-vendor-branch.txt ' '(from /%(branches)s/vbranchA/proj/deleted-on-vendor-branch.txt:14)', 'A'), ('/%(trunk)s/proj/e.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/e.txt:14)', 'R'), )) conv.logs[16].check(sym_log_msg('vtag-4',1), ( ('/%(tags)s/vtag-4 (from /%(branches)s/vbranchA:14)', 'A'), ('/%(tags)s/vtag-4/proj/d.txt ' '(from /%(branches)s/unlabeled-1.1.1/proj/d.txt:14)', 'A'), )) @Cvs2SvnTestFunction def default_branches_trunk_only(): "handle default branches with --trunk-only" conv = ensure_conversion('default-branches', args=['--trunk-only']) conv.logs[2].check("Import (vbranchA, vtag-1).", ( ('/%(trunk)s/proj', 'A'), ('/%(trunk)s/proj/a.txt', 'A'), ('/%(trunk)s/proj/b.txt', 'A'), ('/%(trunk)s/proj/c.txt', 'A'), ('/%(trunk)s/proj/d.txt', 'A'), ('/%(trunk)s/proj/e.txt', 'A'), ('/%(trunk)s/proj/deleted-on-vendor-branch.txt', 'A'), )) conv.logs[3].check("Import (vbranchA, vtag-2).", ( ('/%(trunk)s/proj/a.txt', 'M'), ('/%(trunk)s/proj/b.txt', 'M'), ('/%(trunk)s/proj/c.txt', 'M'), ('/%(trunk)s/proj/d.txt', 'M'), ('/%(trunk)s/proj/e.txt', 'M'), ('/%(trunk)s/proj/deleted-on-vendor-branch.txt', 'M'), )) conv.logs[4].check("Import (vbranchA, vtag-3).", ( ('/%(trunk)s/proj/a.txt', 'M'), ('/%(trunk)s/proj/b.txt', 'M'), ('/%(trunk)s/proj/c.txt', 'M'), ('/%(trunk)s/proj/d.txt', 'M'), ('/%(trunk)s/proj/e.txt', 'M'), ('/%(trunk)s/proj/deleted-on-vendor-branch.txt', 'D'), )) conv.logs[5].check("First regular commit, to a.txt, on vtag-3.", ( ('/%(trunk)s/proj/a.txt', 'M'), )) conv.logs[6].check("Add a file to the working copy.", ( ('/%(trunk)s/proj/added-then-imported.txt', 'A'), )) conv.logs[7].check("Import (vbranchA, vtag-4).", ( ('/%(trunk)s/proj/b.txt', 'M'), ('/%(trunk)s/proj/c.txt', 'M'), ('/%(trunk)s/proj/d.txt', 'M'), ('/%(trunk)s/proj/e.txt', 'M'), ('/%(trunk)s/proj/deleted-on-vendor-branch.txt', 'A'), )) @Cvs2SvnTestFunction def default_branch_and_1_2(): "do not allow 1.2 revision with default branch" conv = ensure_conversion( 'default-branch-and-1-2', error_re=( r'.*File \'.*\' has default branch=1\.1\.1 but also a revision 1\.2' ), ) @Cvs2SvnTestFunction def compose_tag_three_sources(): "compose a tag from three sources" conv = 
ensure_conversion('compose-tag-three-sources') conv.logs[2].check("Add on trunk", ( ('/%(trunk)s/tagged-on-trunk-1.1', 'A'), ('/%(trunk)s/tagged-on-trunk-1.2-a', 'A'), ('/%(trunk)s/tagged-on-trunk-1.2-b', 'A'), ('/%(trunk)s/tagged-on-b1', 'A'), ('/%(trunk)s/tagged-on-b2', 'A'), )) conv.logs[3].check(sym_log_msg('b1'), ( ('/%(branches)s/b1 (from /%(trunk)s:2)', 'A'), )) conv.logs[4].check(sym_log_msg('b2'), ( ('/%(branches)s/b2 (from /%(trunk)s:2)', 'A'), )) conv.logs[5].check("Commit on branch b1", ( ('/%(branches)s/b1/tagged-on-trunk-1.1', 'M'), ('/%(branches)s/b1/tagged-on-trunk-1.2-a', 'M'), ('/%(branches)s/b1/tagged-on-trunk-1.2-b', 'M'), ('/%(branches)s/b1/tagged-on-b1', 'M'), ('/%(branches)s/b1/tagged-on-b2', 'M'), )) conv.logs[6].check("Commit on branch b2", ( ('/%(branches)s/b2/tagged-on-trunk-1.1', 'M'), ('/%(branches)s/b2/tagged-on-trunk-1.2-a', 'M'), ('/%(branches)s/b2/tagged-on-trunk-1.2-b', 'M'), ('/%(branches)s/b2/tagged-on-b1', 'M'), ('/%(branches)s/b2/tagged-on-b2', 'M'), )) conv.logs[7].check("Commit again on trunk", ( ('/%(trunk)s/tagged-on-trunk-1.2-a', 'M'), ('/%(trunk)s/tagged-on-trunk-1.2-b', 'M'), ('/%(trunk)s/tagged-on-trunk-1.1', 'M'), ('/%(trunk)s/tagged-on-b1', 'M'), ('/%(trunk)s/tagged-on-b2', 'M'), )) conv.logs[8].check(sym_log_msg('T',1), ( ('/%(tags)s/T (from /%(trunk)s:7)', 'A'), ('/%(tags)s/T/tagged-on-trunk-1.1 ' '(from /%(trunk)s/tagged-on-trunk-1.1:2)', 'R'), ('/%(tags)s/T/tagged-on-b1 (from /%(branches)s/b1/tagged-on-b1:5)', 'R'), ('/%(tags)s/T/tagged-on-b2 (from /%(branches)s/b2/tagged-on-b2:6)', 'R'), )) @Cvs2SvnTestFunction def pass5_when_to_fill(): "reserve a svn revnum for a fill only when required" # The conversion will fail if the bug is present, and # ensure_conversion would raise Failure. conv = ensure_conversion('pass5-when-to-fill') class EmptyTrunk(Cvs2SvnTestCase): "don't break when the trunk is empty" def __init__(self, **kw): Cvs2SvnTestCase.__init__(self, 'empty-trunk', **kw) def run(self, sbox): # The conversion will fail if the bug is present, and # ensure_conversion would raise Failure. conv = self.ensure_conversion() @Cvs2SvnTestFunction def no_spurious_svn_commits(): "ensure that we don't create any spurious commits" conv = ensure_conversion('phoenix') # Check spurious commit that could be created in # SVNCommitCreator._pre_commit() # # (When you add a file on a branch, CVS creates a trunk revision # in state 'dead'. If the log message of that commit is equal to # the one that CVS generates, we do not ever create a 'fill' # SVNCommit for it.) # # and spurious commit that could be created in # SVNCommitCreator._commit() # # (When you add a file on a branch, CVS creates a trunk revision # in state 'dead'. If the log message of that commit is equal to # the one that CVS generates, we do not create a primary SVNCommit # for it.) conv.logs[17].check('File added on branch xiphophorus', ( ('/%(branches)s/xiphophorus/added-on-branch.txt', 'A'), )) # Check to make sure that a commit *is* generated: # (When you add a file on a branch, CVS creates a trunk revision # in state 'dead'. If the log message of that commit is NOT equal # to the one that CVS generates, we create a primary SVNCommit to # serve as a home for the log message in question. conv.logs[18].check('file added-on-branch2.txt was initially added on ' + 'branch xiphophorus,\nand this log message was tweaked', ()) # Check spurious commit that could be created in # SVNCommitCreator._commit_symbols(). 
conv.logs[19].check('This file was also added on branch xiphophorus,', ( ('/%(branches)s/xiphophorus/added-on-branch2.txt', 'A'), )) class PeerPathPruning(Cvs2SvnTestCase): "make sure that filling prunes paths correctly" def __init__(self, **kw): Cvs2SvnTestCase.__init__(self, 'peer-path-pruning', **kw) def run(self, sbox): conv = self.ensure_conversion() conv.logs[6].check(sym_log_msg('BRANCH'), ( ('/%(branches)s/BRANCH (from /%(trunk)s:4)', 'A'), ('/%(branches)s/BRANCH/bar', 'D'), ('/%(branches)s/BRANCH/foo (from /%(trunk)s/foo:5)', 'R'), )) @Cvs2SvnTestFunction def invalid_closings_on_trunk(): "verify correct revs are copied to default branches" # The conversion will fail if the bug is present, and # ensure_conversion would raise Failure. conv = ensure_conversion('invalid-closings-on-trunk') @Cvs2SvnTestFunction def individual_passes(): "run each pass individually" conv = ensure_conversion('main') conv2 = ensure_conversion('main', passbypass=1) if conv.logs != conv2.logs: raise Failure() @Cvs2SvnTestFunction def resync_bug(): "reveal a big bug in our resync algorithm" # This will fail if the bug is present conv = ensure_conversion('resync-bug') @Cvs2SvnTestFunction def branch_from_default_branch(): "reveal a bug in our default branch detection code" conv = ensure_conversion('branch-from-default-branch') # This revision will be a default branch synchronization only # if cvs2svn is correctly determining default branch revisions. # # The bug was that cvs2svn was treating revisions on branches off of # default branches as default branch revisions, resulting in # incorrectly regarding the branch off of the default branch as a # non-trunk default branch. Crystal clear? I thought so. See # issue #42 for more incoherent blathering. conv.logs[5].check("This commit was generated by cvs2svn", ( ('/%(trunk)s/proj/file.txt ' '(from /%(branches)s/upstream/proj/file.txt:4)', 'R'), )) @Cvs2SvnTestFunction def file_in_attic_too(): "die if a file exists in and out of the attic" ensure_conversion( 'file-in-attic-too', error_re=( r'.*A CVS repository cannot contain both ' r'(.*)' + re.escape(os.sep) + r'(.*) ' + r'and ' r'\1' + re.escape(os.sep) + r'Attic' + re.escape(os.sep) + r'\2' ) ) @Cvs2SvnTestFunction def retain_file_in_attic_too(): "test --retain-conflicting-attic-files option" conv = ensure_conversion( 'file-in-attic-too', args=['--retain-conflicting-attic-files']) if not conv.path_exists('trunk', 'file.txt'): raise Failure() if not conv.path_exists('trunk', 'Attic', 'file.txt'): raise Failure() @Cvs2SvnTestFunction def symbolic_name_filling_guide(): "reveal a big bug in our SymbolFillingGuide" # This will fail if the bug is present conv = ensure_conversion('symbolic-name-overfill') # Helpers for tests involving file contents and properties. class NodeTreeWalkException: "Exception class for node tree traversals." pass def node_for_path(node, path): "In the tree rooted under SVNTree NODE, return the node at PATH." if node.name != '__SVN_ROOT_NODE': raise NodeTreeWalkException() path = path.strip('/') components = path.split('/') for component in components: node = svntest.tree.get_child(node, component) return node # Helper for tests involving properties. def props_for_path(node, path): "In the tree rooted under SVNTree NODE, return the prop dict for PATH." return node_for_path(node, path).props class EOLMime(Cvs2SvnPropertiesTestCase): """eol settings and mime types together The files are as follows: trunk/foo.txt: no -kb, mime file says nothing. trunk/foo.xml: no -kb, mime file says text. 
trunk/foo.zip: no -kb, mime file says non-text. trunk/foo.bin: has -kb, mime file says nothing. trunk/foo.csv: has -kb, mime file says text. trunk/foo.dbf: has -kb, mime file says non-text. """ def __init__(self, args, **kw): # TODO: It's a bit klugey to construct this path here. But so far # there's only one test with a mime.types file. If we have more, # we should abstract this into some helper, which would be located # near ensure_conversion(). Note that it is a convention of this # test suite for a mime.types file to be located in the top level # of the CVS repository to which it applies. self.mime_path = os.path.join( test_data_dir, 'eol-mime-cvsrepos', 'mime.types') Cvs2SvnPropertiesTestCase.__init__( self, 'eol-mime', props_to_test=['svn:eol-style', 'svn:mime-type', 'svn:keywords'], args=['--mime-types=%s' % self.mime_path] + args, **kw) # We do four conversions. Each time, we pass --mime-types=FILE with # the same FILE, but vary --default-eol and --eol-from-mime-type. # Thus there's one conversion with neither flag, one with just the # former, one with just the latter, and one with both. # Neither --no-default-eol nor --eol-from-mime-type: eol_mime1 = EOLMime( variant=1, args=[], expected_props=[ ('trunk/foo.txt', [None, None, None]), ('trunk/foo.xml', [None, 'text/xml', None]), ('trunk/foo.zip', [None, 'application/zip', None]), ('trunk/foo.bin', [None, 'application/octet-stream', None]), ('trunk/foo.csv', [None, 'text/csv', None]), ('trunk/foo.dbf', [None, 'application/what-is-dbf', None]), ]) # Just --no-default-eol, not --eol-from-mime-type: eol_mime2 = EOLMime( variant=2, args=['--default-eol=native'], expected_props=[ ('trunk/foo.txt', ['native', None, KEYWORDS]), ('trunk/foo.xml', ['native', 'text/xml', KEYWORDS]), ('trunk/foo.zip', ['native', 'application/zip', KEYWORDS]), ('trunk/foo.bin', [None, 'application/octet-stream', None]), ('trunk/foo.csv', [None, 'text/csv', None]), ('trunk/foo.dbf', [None, 'application/what-is-dbf', None]), ]) # Just --eol-from-mime-type, not --no-default-eol: eol_mime3 = EOLMime( variant=3, args=['--eol-from-mime-type'], expected_props=[ ('trunk/foo.txt', [None, None, None]), ('trunk/foo.xml', ['native', 'text/xml', KEYWORDS]), ('trunk/foo.zip', [None, 'application/zip', None]), ('trunk/foo.bin', [None, 'application/octet-stream', None]), ('trunk/foo.csv', [None, 'text/csv', None]), ('trunk/foo.dbf', [None, 'application/what-is-dbf', None]), ]) # Both --no-default-eol and --eol-from-mime-type: eol_mime4 = EOLMime( variant=4, args=['--eol-from-mime-type', '--default-eol=native'], expected_props=[ ('trunk/foo.txt', ['native', None, KEYWORDS]), ('trunk/foo.xml', ['native', 'text/xml', KEYWORDS]), ('trunk/foo.zip', [None, 'application/zip', None]), ('trunk/foo.bin', [None, 'application/octet-stream', None]), ('trunk/foo.csv', [None, 'text/csv', None]), ('trunk/foo.dbf', [None, 'application/what-is-dbf', None]), ]) cvs_revnums_off = Cvs2SvnPropertiesTestCase( 'eol-mime', doc='test non-setting of cvs2svn:cvs-rev property', args=[], props_to_test=['cvs2svn:cvs-rev'], expected_props=[ ('trunk/foo.txt', [None]), ('trunk/foo.xml', [None]), ('trunk/foo.zip', [None]), ('trunk/foo.bin', [None]), ('trunk/foo.csv', [None]), ('trunk/foo.dbf', [None]), ]) cvs_revnums_on = Cvs2SvnPropertiesTestCase( 'eol-mime', doc='test setting of cvs2svn:cvs-rev property', args=['--cvs-revnums'], props_to_test=['cvs2svn:cvs-rev'], expected_props=[ ('trunk/foo.txt', ['1.2']), ('trunk/foo.xml', ['1.2']), ('trunk/foo.zip', ['1.2']), ('trunk/foo.bin', ['1.2']), ('trunk/foo.csv', 
['1.2']), ('trunk/foo.dbf', ['1.2']), ]) keywords = Cvs2SvnPropertiesTestCase( 'keywords', doc='test setting of svn:keywords property among others', args=['--default-eol=native'], props_to_test=['svn:keywords', 'svn:eol-style', 'svn:mime-type'], expected_props=[ ('trunk/foo.default', [KEYWORDS, 'native', None]), ('trunk/foo.kkvl', [KEYWORDS, 'native', None]), ('trunk/foo.kkv', [KEYWORDS, 'native', None]), ('trunk/foo.kb', [None, None, 'application/octet-stream']), ('trunk/foo.kk', [None, 'native', None]), ('trunk/foo.ko', [None, 'native', None]), ('trunk/foo.kv', [None, 'native', None]), ]) @Cvs2SvnTestFunction def ignore(): "test setting of svn:ignore property" conv = ensure_conversion('cvsignore') wc_tree = conv.get_wc_tree() topdir_props = props_for_path(wc_tree, 'trunk/proj') subdir_props = props_for_path(wc_tree, '/trunk/proj/subdir') if topdir_props['svn:ignore'] != \ '*.idx\n*.aux\n*.dvi\n*.log\nfoo\nbar\nbaz\nqux\n': raise Failure() if subdir_props['svn:ignore'] != \ '*.idx\n*.aux\n*.dvi\n*.log\nfoo\nbar\nbaz\nqux\n': raise Failure() @Cvs2SvnTestFunction def requires_cvs(): "test that CVS can still do what RCS can't" # See issues 4, 11, 29 for the bugs whose regression we're testing for. conv = ensure_conversion( 'requires-cvs', args=['--use-cvs', '--default-eol=native'], ) atsign_contents = file(conv.get_wc("trunk", "atsign-add")).read() cl_contents = file(conv.get_wc("trunk", "client_lock.idl")).read() if atsign_contents[-1:] == "@": raise Failure() if cl_contents.find("gregh\n//\n//Integration for locks") < 0: raise Failure() if not (conv.logs[6].author == "William Lyon Phelps III" and conv.logs[5].author == "j random"): raise Failure() @Cvs2SvnTestFunction def questionable_branch_names(): "test that we can handle weird branch names" conv = ensure_conversion('questionable-symbols') # If the conversion succeeds, then we're okay. We could check the # actual branch paths, too, but the main thing is to know that the # conversion doesn't fail. @Cvs2SvnTestFunction def questionable_tag_names(): "test that we can handle weird tag names" conv = ensure_conversion('questionable-symbols') conv.find_tag_log('Tag_A').check(sym_log_msg('Tag_A', 1), ( ('/%(tags)s/Tag_A (from /trunk:8)', 'A'), )) conv.find_tag_log('TagWith/Backslash_E').check( sym_log_msg('TagWith/Backslash_E',1), ( ('/%(tags)s/TagWith', 'A'), ('/%(tags)s/TagWith/Backslash_E (from /trunk:8)', 'A'), ) ) conv.find_tag_log('TagWith/Slash_Z').check( sym_log_msg('TagWith/Slash_Z',1), ( ('/%(tags)s/TagWith/Slash_Z (from /trunk:8)', 'A'), ) ) @Cvs2SvnTestFunction def revision_reorder_bug(): "reveal a bug that reorders file revisions" conv = ensure_conversion('revision-reorder-bug') # If the conversion succeeds, then we're okay. We could check the # actual revisions, too, but the main thing is to know that the # conversion doesn't fail. @Cvs2SvnTestFunction def exclude(): "test that exclude really excludes everything" conv = ensure_conversion('main', args=['--exclude=.*']) for log in conv.logs.values(): for item in log.changed_paths.keys(): if item.startswith('/branches/') or item.startswith('/tags/'): raise Failure() @Cvs2SvnTestFunction def vendor_branch_delete_add(): "add trunk file that was deleted on vendor branch" # This will error if the bug is present conv = ensure_conversion('vendor-branch-delete-add') @Cvs2SvnTestFunction def resync_pass2_pull_forward(): "ensure pass2 doesn't pull rev too far forward" conv = ensure_conversion('resync-pass2-pull-forward') # If the conversion succeeds, then we're okay. 
We could check the # actual revisions, too, but the main thing is to know that the # conversion doesn't fail. @Cvs2SvnTestFunction def native_eol(): "only LFs for svn:eol-style=native files" conv = ensure_conversion('native-eol', args=['--default-eol=native']) lines = run_program(svntest.main.svnadmin_binary, None, 'dump', '-q', conv.repos) # Verify that all files in the dump have LF EOLs. We're actually # testing the whole dump file, but the dump file itself only uses # LF EOLs, so we're safe. for line in lines: if line[-1] != '\n' or line[:-1].find('\r') != -1: raise Failure() @Cvs2SvnTestFunction def double_fill(): "reveal a bug that created a branch twice" conv = ensure_conversion('double-fill') # If the conversion succeeds, then we're okay. We could check the # actual revisions, too, but the main thing is to know that the # conversion doesn't fail. @Cvs2SvnTestFunction def double_fill2(): "reveal a second bug that created a branch twice" conv = ensure_conversion('double-fill2') conv.logs[6].check_msg(sym_log_msg('BRANCH1')) conv.logs[7].check_msg(sym_log_msg('BRANCH2')) try: # This check should fail: conv.logs[8].check_msg(sym_log_msg('BRANCH2')) except Failure: pass else: raise Failure('Symbol filled twice in a row') @Cvs2SvnTestFunction def resync_pass2_push_backward(): "ensure pass2 doesn't push rev too far backward" conv = ensure_conversion('resync-pass2-push-backward') # If the conversion succeeds, then we're okay. We could check the # actual revisions, too, but the main thing is to know that the # conversion doesn't fail. @Cvs2SvnTestFunction def double_add(): "reveal a bug that added a branch file twice" conv = ensure_conversion('double-add') # If the conversion succeeds, then we're okay. We could check the # actual revisions, too, but the main thing is to know that the # conversion doesn't fail. @Cvs2SvnTestFunction def bogus_branch_copy(): "reveal a bug that copies a branch file wrongly" conv = ensure_conversion('bogus-branch-copy') # If the conversion succeeds, then we're okay. We could check the # actual revisions, too, but the main thing is to know that the # conversion doesn't fail. @Cvs2SvnTestFunction def nested_ttb_directories(): "require error if ttb directories are not disjoint" opts_list = [ {'trunk' : 'a', 'branches' : 'a',}, {'trunk' : 'a', 'tags' : 'a',}, {'branches' : 'a', 'tags' : 'a',}, # This option conflicts with the default trunk path: {'branches' : 'trunk',}, # Try some nested directories: {'trunk' : 'a', 'branches' : 'a/b',}, {'trunk' : 'a/b', 'tags' : 'a/b/c/d',}, {'branches' : 'a', 'tags' : 'a/b',}, ] for opts in opts_list: ensure_conversion( 'main', error_re=r'The following paths are not disjoint\:', **opts ) class AutoProps(Cvs2SvnPropertiesTestCase): """Test auto-props. The files are as follows: trunk/foo.txt: no -kb, mime auto-prop says nothing. trunk/foo.xml: no -kb, mime auto-prop says text and eol-style=CRLF. trunk/foo.zip: no -kb, mime auto-prop says non-text. trunk/foo.asc: no -kb, mime auto-prop says text and eol-style=. trunk/foo.bin: has -kb, mime auto-prop says nothing. trunk/foo.csv: has -kb, mime auto-prop says text and eol-style=CRLF. trunk/foo.dbf: has -kb, mime auto-prop says non-text. trunk/foo.UPCASE1: no -kb, no mime type. trunk/foo.UPCASE2: no -kb, no mime type. """ def __init__(self, args, **kw): ### TODO: It's a bit klugey to construct this path here. See also ### the comment in eol_mime(). 
auto_props_path = os.path.join( test_data_dir, 'eol-mime-cvsrepos', 'auto-props') Cvs2SvnPropertiesTestCase.__init__( self, 'eol-mime', props_to_test=[ 'myprop', 'svn:eol-style', 'svn:mime-type', 'svn:keywords', 'svn:executable', ], args=[ '--auto-props=%s' % auto_props_path, '--eol-from-mime-type' ] + args, **kw) auto_props_ignore_case = AutoProps( doc="test auto-props", args=['--default-eol=native'], expected_props=[ ('trunk/foo.txt', ['txt', 'native', None, KEYWORDS, None]), ('trunk/foo.xml', ['xml', 'CRLF', 'text/xml', KEYWORDS, None]), ('trunk/foo.zip', ['zip', None, 'application/zip', None, None]), ('trunk/foo.asc', ['asc', None, 'text/plain', None, None]), ('trunk/foo.bin', ['bin', None, 'application/octet-stream', None, '']), ('trunk/foo.csv', ['csv', 'CRLF', 'text/csv', None, None]), ('trunk/foo.dbf', ['dbf', None, 'application/what-is-dbf', None, None]), ('trunk/foo.UPCASE1', ['UPCASE1', 'native', None, KEYWORDS, None]), ('trunk/foo.UPCASE2', ['UPCASE2', 'native', None, KEYWORDS, None]), ]) @Cvs2SvnTestFunction def ctrl_char_in_filename(): "do not allow control characters in filenames" try: srcrepos_path = os.path.join(test_data_dir,'main-cvsrepos') dstrepos_path = os.path.join(test_data_dir,'ctrl-char-filename-cvsrepos') if os.path.exists(dstrepos_path): safe_rmtree(dstrepos_path) # create repos from existing main repos shutil.copytree(srcrepos_path, dstrepos_path) base_path = os.path.join(dstrepos_path, 'single-files') try: shutil.copyfile(os.path.join(base_path, 'twoquick,v'), os.path.join(base_path, 'two\rquick,v')) except: # Operating systems that don't allow control characters in # filenames will hopefully have thrown an exception; in that # case, just skip this test. raise svntest.Skip() conv = ensure_conversion( 'ctrl-char-filename', error_re=(r'.*Subversion does not allow character .*.'), ) finally: safe_rmtree(dstrepos_path) @Cvs2SvnTestFunction def commit_dependencies(): "interleaved and multi-branch commits to same files" conv = ensure_conversion("commit-dependencies") conv.logs[2].check('adding', ( ('/%(trunk)s/interleaved', 'A'), ('/%(trunk)s/interleaved/file1', 'A'), ('/%(trunk)s/interleaved/file2', 'A'), )) conv.logs[3].check('big commit', ( ('/%(trunk)s/interleaved/file1', 'M'), ('/%(trunk)s/interleaved/file2', 'M'), )) conv.logs[4].check('dependant small commit', ( ('/%(trunk)s/interleaved/file1', 'M'), )) conv.logs[5].check('adding', ( ('/%(trunk)s/multi-branch', 'A'), ('/%(trunk)s/multi-branch/file1', 'A'), ('/%(trunk)s/multi-branch/file2', 'A'), )) conv.logs[6].check(sym_log_msg("branch"), ( ('/%(branches)s/branch (from /%(trunk)s:5)', 'A'), ('/%(branches)s/branch/interleaved', 'D'), )) conv.logs[7].check('multi-branch-commit', ( ('/%(trunk)s/multi-branch/file1', 'M'), ('/%(trunk)s/multi-branch/file2', 'M'), ('/%(branches)s/branch/multi-branch/file1', 'M'), ('/%(branches)s/branch/multi-branch/file2', 'M'), )) @Cvs2SvnTestFunction def double_branch_delete(): "fill branches before modifying files on them" conv = ensure_conversion('double-branch-delete') # Test for issue #102. The file IMarshalledValue.java is branched, # deleted, readded on the branch, and then deleted again. If the # fill for the file on the branch is postponed until after the # modification, the file will end up live on the branch instead of # dead! Make sure it happens at the right time. 
conv.logs[6].check('JBAS-2436 - Adding LGPL Header2', ( ('/%(branches)s/Branch_4_0/IMarshalledValue.java', 'A'), )); conv.logs[7].check('JBAS-3025 - Removing dependency', ( ('/%(branches)s/Branch_4_0/IMarshalledValue.java', 'D'), )); @Cvs2SvnTestFunction def symbol_mismatches(): "error for conflicting tag/branch" ensure_conversion( 'symbol-mess', args=['--symbol-default=strict'], error_re=r'.*Problems determining how symbols should be converted', ) @Cvs2SvnTestFunction def overlook_symbol_mismatches(): "overlook conflicting tag/branch when --trunk-only" # This is a test for issue #85. ensure_conversion('symbol-mess', args=['--trunk-only']) @Cvs2SvnTestFunction def force_symbols(): "force symbols to be tags/branches" conv = ensure_conversion( 'symbol-mess', args=['--force-branch=MOSTLY_BRANCH', '--force-tag=MOSTLY_TAG']) if conv.path_exists('tags', 'BRANCH') \ or not conv.path_exists('branches', 'BRANCH'): raise Failure() if not conv.path_exists('tags', 'TAG') \ or conv.path_exists('branches', 'TAG'): raise Failure() if conv.path_exists('tags', 'MOSTLY_BRANCH') \ or not conv.path_exists('branches', 'MOSTLY_BRANCH'): raise Failure() if not conv.path_exists('tags', 'MOSTLY_TAG') \ or conv.path_exists('branches', 'MOSTLY_TAG'): raise Failure() @Cvs2SvnTestFunction def commit_blocks_tags(): "commit prevents forced tag" basic_args = ['--force-branch=MOSTLY_BRANCH', '--force-tag=MOSTLY_TAG'] ensure_conversion( 'symbol-mess', args=(basic_args + ['--force-tag=BRANCH_WITH_COMMIT']), error_re=( r'.*The following branches cannot be forced to be tags ' r'because they have commits' ) ) @Cvs2SvnTestFunction def blocked_excludes(): "error for blocked excludes" basic_args = ['--force-branch=MOSTLY_BRANCH', '--force-tag=MOSTLY_TAG'] for blocker in ['BRANCH', 'COMMIT', 'UNNAMED']: try: ensure_conversion( 'symbol-mess', args=(basic_args + ['--exclude=BLOCKED_BY_%s' % blocker])) raise MissingErrorException() except Failure: pass @Cvs2SvnTestFunction def unblock_blocked_excludes(): "excluding blocker removes blockage" basic_args = ['--force-branch=MOSTLY_BRANCH', '--force-tag=MOSTLY_TAG'] for blocker in ['BRANCH', 'COMMIT']: ensure_conversion( 'symbol-mess', args=(basic_args + ['--exclude=BLOCKED_BY_%s' % blocker, '--exclude=BLOCKING_%s' % blocker])) @Cvs2SvnTestFunction def regexp_force_symbols(): "force symbols via regular expressions" conv = ensure_conversion( 'symbol-mess', args=['--force-branch=MOST.*_BRANCH', '--force-tag=MOST.*_TAG']) if conv.path_exists('tags', 'MOSTLY_BRANCH') \ or not conv.path_exists('branches', 'MOSTLY_BRANCH'): raise Failure() if not conv.path_exists('tags', 'MOSTLY_TAG') \ or conv.path_exists('branches', 'MOSTLY_TAG'): raise Failure() @Cvs2SvnTestFunction def heuristic_symbol_default(): "test 'heuristic' symbol default" conv = ensure_conversion( 'symbol-mess', args=['--symbol-default=heuristic']) if conv.path_exists('tags', 'MOSTLY_BRANCH') \ or not conv.path_exists('branches', 'MOSTLY_BRANCH'): raise Failure() if not conv.path_exists('tags', 'MOSTLY_TAG') \ or conv.path_exists('branches', 'MOSTLY_TAG'): raise Failure() @Cvs2SvnTestFunction def branch_symbol_default(): "test 'branch' symbol default" conv = ensure_conversion( 'symbol-mess', args=['--symbol-default=branch']) if conv.path_exists('tags', 'MOSTLY_BRANCH') \ or not conv.path_exists('branches', 'MOSTLY_BRANCH'): raise Failure() if conv.path_exists('tags', 'MOSTLY_TAG') \ or not conv.path_exists('branches', 'MOSTLY_TAG'): raise Failure() @Cvs2SvnTestFunction def tag_symbol_default(): "test 'tag' symbol default" conv = 
ensure_conversion( 'symbol-mess', args=['--symbol-default=tag']) if not conv.path_exists('tags', 'MOSTLY_BRANCH') \ or conv.path_exists('branches', 'MOSTLY_BRANCH'): raise Failure() if not conv.path_exists('tags', 'MOSTLY_TAG') \ or conv.path_exists('branches', 'MOSTLY_TAG'): raise Failure() @Cvs2SvnTestFunction def symbol_transform(): "test --symbol-transform" conv = ensure_conversion( 'symbol-mess', args=[ '--symbol-default=heuristic', '--symbol-transform=BRANCH:branch', '--symbol-transform=TAG:tag', '--symbol-transform=MOSTLY_(BRANCH|TAG):MOSTLY.\\1', ]) if not conv.path_exists('branches', 'branch'): raise Failure() if not conv.path_exists('tags', 'tag'): raise Failure() if not conv.path_exists('branches', 'MOSTLY.BRANCH'): raise Failure() if not conv.path_exists('tags', 'MOSTLY.TAG'): raise Failure() @Cvs2SvnTestFunction def write_symbol_info(): "test --write-symbol-info" expected_lines = [ ['0', '.trunk.', 'trunk', 'trunk', '.'], ['0', 'BLOCKED_BY_UNNAMED', 'branch', 'branches/BLOCKED_BY_UNNAMED', '.trunk.'], ['0', 'BLOCKING_COMMIT', 'branch', 'branches/BLOCKING_COMMIT', 'BLOCKED_BY_COMMIT'], ['0', 'BLOCKED_BY_COMMIT', 'branch', 'branches/BLOCKED_BY_COMMIT', '.trunk.'], ['0', 'BLOCKING_BRANCH', 'branch', 'branches/BLOCKING_BRANCH', 'BLOCKED_BY_BRANCH'], ['0', 'BLOCKED_BY_BRANCH', 'branch', 'branches/BLOCKED_BY_BRANCH', '.trunk.'], ['0', 'MOSTLY_BRANCH', '.', '.', '.'], ['0', 'MOSTLY_TAG', '.', '.', '.'], ['0', 'BRANCH_WITH_COMMIT', 'branch', 'branches/BRANCH_WITH_COMMIT', '.trunk.'], ['0', 'BRANCH', 'branch', 'branches/BRANCH', '.trunk.'], ['0', 'TAG', 'tag', 'tags/TAG', '.trunk.'], ['0', 'unlabeled-1.1.12.1.2', 'branch', 'branches/unlabeled-1.1.12.1.2', 'BLOCKED_BY_UNNAMED'], ] expected_lines.sort() symbol_info_file = os.path.join(tmp_dir, 'symbol-mess-symbol-info.txt') try: ensure_conversion( 'symbol-mess', args=[ '--symbol-default=strict', '--write-symbol-info=%s' % (symbol_info_file,), '--passes=:CollateSymbolsPass', ], ) raise MissingErrorException() except Failure: pass lines = [] comment_re = re.compile(r'^\s*\#') for l in open(symbol_info_file, 'r'): if comment_re.match(l): continue lines.append(l.strip().split()) lines.sort() if lines != expected_lines: s = ['Symbol info incorrect\n'] differ = Differ() for diffline in differ.compare( [' '.join(line) + '\n' for line in expected_lines], [' '.join(line) + '\n' for line in lines], ): s.append(diffline) raise Failure(''.join(s)) @Cvs2SvnTestFunction def symbol_hints(): "test --symbol-hints for setting branch/tag" conv = ensure_conversion( 'symbol-mess', symbol_hints_file='symbol-mess-symbol-hints.txt', ) if not conv.path_exists('branches', 'MOSTLY_BRANCH'): raise Failure() if not conv.path_exists('tags', 'MOSTLY_TAG'): raise Failure() conv.logs[3].check(sym_log_msg('MOSTLY_TAG', 1), ( ('/tags/MOSTLY_TAG (from /trunk:2)', 'A'), )) conv.logs[9].check(sym_log_msg('BRANCH_WITH_COMMIT'), ( ('/branches/BRANCH_WITH_COMMIT (from /trunk:2)', 'A'), )) conv.logs[10].check(sym_log_msg('MOSTLY_BRANCH'), ( ('/branches/MOSTLY_BRANCH (from /trunk:2)', 'A'), )) @Cvs2SvnTestFunction def parent_hints(): "test --symbol-hints for setting parent" conv = ensure_conversion( 'symbol-mess', symbol_hints_file='symbol-mess-parent-hints.txt', ) conv.logs[9].check(sym_log_msg('BRANCH_WITH_COMMIT'), ( ('/%(branches)s/BRANCH_WITH_COMMIT (from /branches/BRANCH:8)', 'A'), )) @Cvs2SvnTestFunction def parent_hints_invalid(): "test --symbol-hints with an invalid parent" # BRANCH_WITH_COMMIT is usually determined to branch from .trunk.; # this symbol hints file sets 
the preferred parent to BRANCH # instead: conv = ensure_conversion( 'symbol-mess', symbol_hints_file='symbol-mess-parent-hints-invalid.txt', error_re=( r"BLOCKED_BY_BRANCH is not a valid parent for BRANCH_WITH_COMMIT" ), ) @Cvs2SvnTestFunction def parent_hints_wildcards(): "test --symbol-hints wildcards" # BRANCH_WITH_COMMIT is usually determined to branch from .trunk.; # this symbol hints file sets the preferred parent to BRANCH # instead: conv = ensure_conversion( 'symbol-mess', symbol_hints_file='symbol-mess-parent-hints-wildcards.txt', ) conv.logs[9].check(sym_log_msg('BRANCH_WITH_COMMIT'), ( ('/%(branches)s/BRANCH_WITH_COMMIT (from /branches/BRANCH:8)', 'A'), )) @Cvs2SvnTestFunction def path_hints(): "test --symbol-hints for setting svn paths" conv = ensure_conversion( 'symbol-mess', symbol_hints_file='symbol-mess-path-hints.txt', ) conv.logs[1].check('Standard project directories initialized by cvs2svn.', ( ('/trunk', 'A'), ('/a', 'A'), ('/a/strange', 'A'), ('/a/strange/trunk', 'A'), ('/a/strange/trunk/path', 'A'), ('/branches', 'A'), ('/tags', 'A'), )) conv.logs[3].check(sym_log_msg('MOSTLY_TAG', 1), ( ('/special', 'A'), ('/special/tag', 'A'), ('/special/tag/path (from /a/strange/trunk/path:2)', 'A'), )) conv.logs[9].check(sym_log_msg('BRANCH_WITH_COMMIT'), ( ('/special/other', 'A'), ('/special/other/branch', 'A'), ('/special/other/branch/path (from /a/strange/trunk/path:2)', 'A'), )) conv.logs[10].check(sym_log_msg('MOSTLY_BRANCH'), ( ('/special/branch', 'A'), ('/special/branch/path (from /a/strange/trunk/path:2)', 'A'), )) @Cvs2SvnTestFunction def issue_99(): "test problem from issue 99" conv = ensure_conversion('issue-99') @Cvs2SvnTestFunction def issue_100(): "test problem from issue 100" conv = ensure_conversion('issue-100') file1 = conv.get_wc('trunk', 'file1.txt') if file(file1).read() != 'file1.txt<1.2>\n': raise Failure() @Cvs2SvnTestFunction def issue_106(): "test problem from issue 106" conv = ensure_conversion('issue-106') @Cvs2SvnTestFunction def options_option(): "use of the --options option" conv = ensure_conversion('main', options_file='cvs2svn.options') @Cvs2SvnTestFunction def multiproject(): "multiproject conversion" conv = ensure_conversion( 'main', options_file='cvs2svn-multiproject.options' ) conv.logs[1].check('Standard project directories initialized by cvs2svn.', ( ('/partial-prune', 'A'), ('/partial-prune/trunk', 'A'), ('/partial-prune/branches', 'A'), ('/partial-prune/tags', 'A'), ('/partial-prune/releases', 'A'), )) @Cvs2SvnTestFunction def crossproject(): "multiproject conversion with cross-project commits" conv = ensure_conversion( 'main', options_file='cvs2svn-crossproject.options' ) @Cvs2SvnTestFunction def tag_with_no_revision(): "tag defined but revision is deleted" conv = ensure_conversion('tag-with-no-revision') @Cvs2SvnTestFunction def delete_cvsignore(): "svn:ignore should vanish when .cvsignore does" # This is issue #81. 
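  # cvs2svn translates each .cvsignore file into an svn:ignore property on
  # the directory that contains it; for example (illustrative), the
  # patterns listed in trunk/proj/.cvsignore become the value of
  # svn:ignore on trunk/proj.  This test checks that the property is
  # removed again in the commit that deletes the .cvsignore file.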
conv = ensure_conversion('delete-cvsignore') wc_tree = conv.get_wc_tree() props = props_for_path(wc_tree, 'trunk/proj') if props.has_key('svn:ignore'): raise Failure() @Cvs2SvnTestFunction def repeated_deltatext(): "ignore repeated deltatext blocks with warning" conv = ensure_conversion('repeated-deltatext') warning_re = r'.*Deltatext block for revision 1.1 appeared twice' if not conv.output_found(warning_re): raise Failure() @Cvs2SvnTestFunction def nasty_graphs(): "process some nasty dependency graphs" # It's not how well the bear can dance, but that the bear can dance # at all: conv = ensure_conversion('nasty-graphs') @Cvs2SvnTestFunction def tagging_after_delete(): "optimal tag after deleting files" conv = ensure_conversion('tagging-after-delete') # tag should be 'clean', no deletes log = conv.find_tag_log('tag1') expected = ( ('/%(tags)s/tag1 (from /%(trunk)s:3)', 'A'), ) log.check_changes(expected) @Cvs2SvnTestFunction def crossed_branches(): "branches created in inconsistent orders" conv = ensure_conversion('crossed-branches') @Cvs2SvnTestFunction def file_directory_conflict(): "error when filename conflicts with directory name" conv = ensure_conversion( 'file-directory-conflict', error_re=r'.*Directory name conflicts with filename', ) @Cvs2SvnTestFunction def attic_directory_conflict(): "error when attic filename conflicts with dirname" # This tests the problem reported in issue #105. conv = ensure_conversion( 'attic-directory-conflict', error_re=r'.*Directory name conflicts with filename', ) @Cvs2SvnTestFunction def use_rcs(): "verify that --use-rcs and --use-internal-co agree" rcs_conv = ensure_conversion( 'main', args=['--use-rcs', '--default-eol=native'], dumpfile='use-rcs-rcs.dump', ) conv = ensure_conversion( 'main', args=['--default-eol=native'], dumpfile='use-rcs-int.dump', ) if conv.output_found(r'WARNING\: internal problem\: leftover revisions'): raise Failure() rcs_lines = list(open(rcs_conv.dumpfile, 'rb')) lines = list(open(conv.dumpfile, 'rb')) # Compare all lines following the repository UUID: if lines[3:] != rcs_lines[3:]: raise Failure() @Cvs2SvnTestFunction def internal_co_exclude(): "verify that --use-internal-co --exclude=... works" rcs_conv = ensure_conversion( 'internal-co', args=['--use-rcs', '--exclude=BRANCH', '--default-eol=native'], dumpfile='internal-co-exclude-rcs.dump', ) conv = ensure_conversion( 'internal-co', args=['--exclude=BRANCH', '--default-eol=native'], dumpfile='internal-co-exclude-int.dump', ) if conv.output_found(r'WARNING\: internal problem\: leftover revisions'): raise Failure() rcs_lines = list(open(rcs_conv.dumpfile, 'rb')) lines = list(open(conv.dumpfile, 'rb')) # Compare all lines following the repository UUID: if lines[3:] != rcs_lines[3:]: raise Failure() @Cvs2SvnTestFunction def internal_co_trunk_only(): "verify that --use-internal-co --trunk-only works" conv = ensure_conversion( 'internal-co', args=['--trunk-only', '--default-eol=native'], ) if conv.output_found(r'WARNING\: internal problem\: leftover revisions'): raise Failure() @Cvs2SvnTestFunction def leftover_revs(): "check for leftover checked-out revisions" conv = ensure_conversion( 'leftover-revs', args=['--exclude=BRANCH', '--default-eol=native'], ) if conv.output_found(r'WARNING\: internal problem\: leftover revisions'): raise Failure() @Cvs2SvnTestFunction def requires_internal_co(): "test that internal co can do more than RCS" # See issues 4, 11 for the bugs whose regression we're testing for. # Unlike in requires_cvs above, issue 29 is not covered. 
conv = ensure_conversion('requires-cvs') atsign_contents = file(conv.get_wc("trunk", "atsign-add")).read() if atsign_contents[-1:] == "@": raise Failure() if not (conv.logs[6].author == "William Lyon Phelps III" and conv.logs[5].author == "j random"): raise Failure() @Cvs2SvnTestFunction def internal_co_keywords(): "test that internal co handles keywords correctly" conv_ic = ensure_conversion('internal-co-keywords', args=["--keywords-off"]) conv_cvs = ensure_conversion('internal-co-keywords', args=["--use-cvs", "--keywords-off"]) ko_ic = file(conv_ic.get_wc('trunk', 'dir', 'ko.txt')).read() ko_cvs = file(conv_cvs.get_wc('trunk', 'dir', 'ko.txt')).read() kk_ic = file(conv_ic.get_wc('trunk', 'dir', 'kk.txt')).read() kk_cvs = file(conv_cvs.get_wc('trunk', 'dir', 'kk.txt')).read() kv_ic = file(conv_ic.get_wc('trunk', 'dir', 'kv.txt')).read() kv_cvs = file(conv_cvs.get_wc('trunk', 'dir', 'kv.txt')).read() # Ensure proper "/Attic" expansion of $Source$ keyword in files # which are in a deleted state in trunk del_ic = file(conv_ic.get_wc('branches/b', 'dir', 'kv-deleted.txt')).read() del_cvs = file(conv_cvs.get_wc('branches/b', 'dir', 'kv-deleted.txt')).read() if ko_ic != ko_cvs: raise Failure() if kk_ic != kk_cvs: raise Failure() if del_ic != del_cvs: raise Failure() # The date format changed between cvs and co ('/' instead of '-'). # Accept either one: date_substitution_re = re.compile(r' ([0-9]*)-([0-9]*)-([0-9]*) ') if kv_ic != kv_cvs \ and date_substitution_re.sub(r' \1/\2/\3 ', kv_ic) != kv_cvs: raise Failure() @Cvs2SvnTestFunction def timestamp_chaos(): "test timestamp adjustments" conv = ensure_conversion('timestamp-chaos', args=["-v"]) # The times are expressed here in UTC: times = [ '2007-01-01 21:00:00', # Initial commit '2007-01-01 21:00:00', # revision 1.1 of both files '2007-01-01 21:00:01', # revision 1.2 of file1.txt, adjusted forwards '2007-01-01 21:00:02', # revision 1.2 of file2.txt, adjusted backwards '2007-01-01 22:00:00', # revision 1.3 of both files ] # Convert the times to seconds since the epoch, in UTC: times = [calendar.timegm(svn_strptime(t)) for t in times] for i in range(len(times)): if abs(conv.logs[i + 1].date - times[i]) > 0.1: raise Failure() @Cvs2SvnTestFunction def symlinks(): "convert a repository that contains symlinks" # This is a test for issue #97. proj = os.path.join(test_data_dir, 'symlinks-cvsrepos', 'proj') links = [ ( os.path.join('..', 'file.txt,v'), os.path.join(proj, 'dir1', 'file.txt,v'), ), ( 'dir1', os.path.join(proj, 'dir2'), ), ] try: os.symlink except AttributeError: # Apparently this OS doesn't support symlinks, so skip test. raise svntest.Skip() try: for (src,dst) in links: os.symlink(src, dst) conv = ensure_conversion('symlinks') conv.logs[2].check('', ( ('/%(trunk)s/proj', 'A'), ('/%(trunk)s/proj/file.txt', 'A'), ('/%(trunk)s/proj/dir1', 'A'), ('/%(trunk)s/proj/dir1/file.txt', 'A'), ('/%(trunk)s/proj/dir2', 'A'), ('/%(trunk)s/proj/dir2/file.txt', 'A'), )) finally: for (src,dst) in links: os.remove(dst) @Cvs2SvnTestFunction def empty_trunk_path(): "allow --trunk to be empty if --trunk-only" # This is a test for issue #53. 
conv = ensure_conversion( 'main', args=['--trunk-only', '--trunk='], ) @Cvs2SvnTestFunction def preferred_parent_cycle(): "handle a cycle in branch parent preferences" conv = ensure_conversion('preferred-parent-cycle') @Cvs2SvnTestFunction def branch_from_empty_dir(): "branch from an empty directory" conv = ensure_conversion('branch-from-empty-dir') @Cvs2SvnTestFunction def trunk_readd(): "add a file on a branch then on trunk" conv = ensure_conversion('trunk-readd') @Cvs2SvnTestFunction def branch_from_deleted_1_1(): "branch from a 1.1 revision that will be deleted" conv = ensure_conversion('branch-from-deleted-1-1') conv.logs[5].check('Adding b.txt:1.1.2.1', ( ('/%(branches)s/BRANCH1/proj/b.txt', 'A'), )) conv.logs[6].check('Adding b.txt:1.1.4.1', ( ('/%(branches)s/BRANCH2/proj/b.txt', 'A'), )) conv.logs[7].check('Adding b.txt:1.2', ( ('/%(trunk)s/proj/b.txt', 'A'), )) conv.logs[8].check('Adding c.txt:1.1.2.1', ( ('/%(branches)s/BRANCH1/proj/c.txt', 'A'), )) conv.logs[9].check('Adding c.txt:1.1.4.1', ( ('/%(branches)s/BRANCH2/proj/c.txt', 'A'), )) @Cvs2SvnTestFunction def add_on_branch(): "add a file on a branch using newer CVS" conv = ensure_conversion('add-on-branch') conv.logs[6].check('Adding b.txt:1.1', ( ('/%(trunk)s/proj/b.txt', 'A'), )) conv.logs[7].check('Adding b.txt:1.1.2.2', ( ('/%(branches)s/BRANCH1/proj/b.txt', 'A'), )) conv.logs[8].check('Adding c.txt:1.1', ( ('/%(trunk)s/proj/c.txt', 'A'), )) conv.logs[9].check('Removing c.txt:1.2', ( ('/%(trunk)s/proj/c.txt', 'D'), )) conv.logs[10].check('Adding c.txt:1.2.2.2', ( ('/%(branches)s/BRANCH2/proj/c.txt', 'A'), )) conv.logs[11].check('Adding d.txt:1.1', ( ('/%(trunk)s/proj/d.txt', 'A'), )) conv.logs[12].check('Adding d.txt:1.1.2.2', ( ('/%(branches)s/BRANCH3/proj/d.txt', 'A'), )) @Cvs2SvnTestFunction def main_git(): "test output in git-fast-import format" # Note: To test importing into git, do # # ./run-tests # rm -rf cvs2svn-tmp/main.git # git init --bare cvs2svn-tmp/main.git # cd cvs2svn-tmp/main.git # cat ../git-{blob,dump}.dat | git fast-import # # Or, to load the dumpfiles separately: # # cat ../git-blob.dat | git fast-import --export-marks=../git-marks.dat # cat ../git-dump.dat | git fast-import --import-marks=../git-marks.dat # # Then use "gitk --all", "git log", etc. to test the contents of the # repository or "git clone" to make a non-bare clone. # We don't have the infrastructure to check that the resulting git # repository is correct, so we just check that the conversion runs # to completion: conv = GitConversion('main', None, [ '--blobfile=cvs2svn-tmp/git-blob.dat', '--dumpfile=cvs2svn-tmp/git-dump.dat', '--username=cvs2git', 'test-data/main-cvsrepos', ]) @Cvs2SvnTestFunction def main_git2(): "test cvs2git --use-external-blob-generator option" # See comment in main_git() for more information. conv = GitConversion('main', None, [ '--use-external-blob-generator', '--blobfile=cvs2svn-tmp/blobfile.out', '--dumpfile=cvs2svn-tmp/dumpfile.out', '--username=cvs2git', 'test-data/main-cvsrepos', ]) @Cvs2SvnTestFunction def git_options(): "test cvs2git using options file" conv = GitConversion('main', None, [], options_file='cvs2git.options') @Cvs2SvnTestFunction def main_hg(): "output in git-fast-import format with inline data" # The output should be suitable for import by Mercurial. 
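  # To try the import by hand one could use Mercurial's (third-party)
  # "fastimport" extension on the generated dump, roughly along the lines
  # of the git recipe in main_git() above, e.g.:
  #
  #   hg init cvs2svn-tmp/main.hg
  #   cd cvs2svn-tmp/main.hg
  #   hg fastimport <dumpfile configured by cvs2hg.options>
  #
  # (The dump file name is whatever cvs2hg.options specifies, and the
  # fastimport invocation depends on the extension version; this is only
  # a sketch, not part of the automated test.)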
# We don't have the infrastructure to check that the resulting # Mercurial repository is correct, so we just check that the # conversion runs to completion: conv = GitConversion('main', None, [], options_file='cvs2hg.options') @Cvs2SvnTestFunction def invalid_symbol(): "a symbol with the incorrect format" conv = ensure_conversion('invalid-symbol') if not conv.output_found( r".*branch 'SYMBOL' references invalid revision 1$" ): raise Failure() @Cvs2SvnTestFunction def invalid_symbol_ignore(): "ignore a symbol using a SymbolMapper" conv = ensure_conversion( 'invalid-symbol', options_file='cvs2svn-ignore.options' ) @Cvs2SvnTestFunction def invalid_symbol_ignore2(): "ignore a symbol using an IgnoreSymbolTransform" conv = ensure_conversion( 'invalid-symbol', options_file='cvs2svn-ignore2.options' ) class EOLVariants(Cvs2SvnTestCase): "handle various --eol-style options" eol_style_strings = { 'LF' : '\n', 'CR' : '\r', 'CRLF' : '\r\n', 'native' : '\n', } def __init__(self, eol_style): self.eol_style = eol_style self.dumpfile = 'eol-variants-%s.dump' % (self.eol_style,) Cvs2SvnTestCase.__init__( self, 'eol-variants', variant=self.eol_style, dumpfile=self.dumpfile, args=[ '--default-eol=%s' % (self.eol_style,), ], ) def run(self, sbox): conv = self.ensure_conversion() dump_contents = open(conv.dumpfile, 'rb').read() expected_text = self.eol_style_strings[self.eol_style].join( ['line 1', 'line 2', '\n\n'] ) if not dump_contents.endswith(expected_text): raise Failure() @Cvs2SvnTestFunction def no_revs_file(): "handle a file with no revisions (issue #80)" conv = ensure_conversion('no-revs-file') @Cvs2SvnTestFunction def mirror_keyerror_test(): "a case that gave KeyError in SVNRepositoryMirror" conv = ensure_conversion('mirror-keyerror') @Cvs2SvnTestFunction def exclude_ntdb_test(): "exclude a non-trunk default branch" symbol_info_file = os.path.join(tmp_dir, 'exclude-ntdb-symbol-info.txt') conv = ensure_conversion( 'exclude-ntdb', args=[ '--write-symbol-info=%s' % (symbol_info_file,), '--exclude=branch3', '--exclude=tag3', '--exclude=vendortag3', '--exclude=vendorbranch', ], ) @Cvs2SvnTestFunction def mirror_keyerror2_test(): "a case that gave KeyError in RepositoryMirror" conv = ensure_conversion('mirror-keyerror2') @Cvs2SvnTestFunction def mirror_keyerror3_test(): "a case that gave KeyError in RepositoryMirror" conv = ensure_conversion('mirror-keyerror3') @Cvs2SvnTestFunction def add_cvsignore_to_branch_test(): "check adding .cvsignore to an existing branch" # This a test for issue #122. conv = ensure_conversion('add-cvsignore-to-branch') wc_tree = conv.get_wc_tree() trunk_props = props_for_path(wc_tree, 'trunk/dir') if trunk_props['svn:ignore'] != '*.o\n\n': raise Failure() branch_props = props_for_path(wc_tree, 'branches/BRANCH/dir') if branch_props['svn:ignore'] != '*.o\n\n': raise Failure() @Cvs2SvnTestFunction def missing_deltatext(): "a revision's deltatext is missing" # This is a type of RCS file corruption that has been observed. 
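  # For reference, a well-formed RCS file contains one deltatext section
  # per revision, of the form (compare the *,v files under test-data/):
  #
  #   1.1.4.4
  #   log
  #   @the log message
  #   @
  #   text
  #   @the file contents, or a delta against the neighboring revision
  #   @
  #
  # "Missing deltatext" means the revision is declared in the header but
  # no such log/text block follows, so its contents cannot be checked out.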
conv = ensure_conversion( 'missing-deltatext', error_re=( r"ERROR\: .* has no deltatext section for revision 1\.1\.4\.4" ), ) @Cvs2SvnTestFunction def transform_unlabeled_branch_name(): "transform name of unlabeled branch" conv = ensure_conversion( 'unlabeled-branch', args=[ '--symbol-transform=unlabeled-1.1.4:BRANCH2', ], ) if conv.path_exists('branches', 'unlabeled-1.1.4'): raise Failure('Branch unlabeled-1.1.4 not excluded') if not conv.path_exists('branches', 'BRANCH2'): raise Failure('Branch BRANCH2 not found') @Cvs2SvnTestFunction def ignore_unlabeled_branch(): "ignoring an unlabeled branch is not allowed" conv = ensure_conversion( 'unlabeled-branch', options_file='cvs2svn-ignore.options', error_re=( r"ERROR\: The unlabeled branch \'unlabeled\-1\.1\.4\' " r"in \'.*\' contains commits" ), ) @Cvs2SvnTestFunction def exclude_unlabeled_branch(): "exclude unlabeled branch" conv = ensure_conversion( 'unlabeled-branch', args=['--exclude=unlabeled-.*'], ) if conv.path_exists('branches', 'unlabeled-1.1.4'): raise Failure('Branch unlabeled-1.1.4 not excluded') @Cvs2SvnTestFunction def unlabeled_branch_name_collision(): "transform unlabeled branch to same name as branch" conv = ensure_conversion( 'unlabeled-branch', args=[ '--symbol-transform=unlabeled-1.1.4:BRANCH', ], error_re=( r"ERROR\: Symbol name \'BRANCH\' is already used" ), ) @Cvs2SvnTestFunction def collision_with_unlabeled_branch_name(): "transform branch to same name as unlabeled branch" conv = ensure_conversion( 'unlabeled-branch', args=[ '--symbol-transform=BRANCH:unlabeled-1.1.4', ], error_re=( r"ERROR\: Symbol name \'unlabeled\-1\.1\.4\' is already used" ), ) @Cvs2SvnTestFunction def many_deletes(): "a repo with many removable dead revisions" conv = ensure_conversion('many-deletes') conv.logs[5].check('Add files on BRANCH', ( ('/%(branches)s/BRANCH/proj/b.txt', 'A'), )) conv.logs[6].check('Add files on BRANCH2', ( ('/%(branches)s/BRANCH2/proj/b.txt', 'A'), ('/%(branches)s/BRANCH2/proj/c.txt', 'A'), ('/%(branches)s/BRANCH2/proj/d.txt', 'A'), )) cvs_description = Cvs2SvnPropertiesTestCase( 'main', doc='test handling of CVS file descriptions', props_to_test=['cvs:description'], expected_props=[ ('trunk/proj/default', ['This is an example file description.']), ('trunk/proj/sub1/default', [None]), ]) @Cvs2SvnTestFunction def include_empty_directories(): "test --include-empty-directories option" conv = ensure_conversion( 'empty-directories', args=['--include-empty-directories'], ) conv.logs[1].check('Standard project directories', ( ('/%(trunk)s', 'A'), ('/%(branches)s', 'A'), ('/%(tags)s', 'A'), ('/%(trunk)s/root-empty-directory', 'A'), ('/%(trunk)s/root-empty-directory/empty-subdirectory', 'A'), )) conv.logs[3].check('Add b.txt.', ( ('/%(trunk)s/direct', 'A'), ('/%(trunk)s/direct/b.txt', 'A'), ('/%(trunk)s/direct/empty-directory', 'A'), ('/%(trunk)s/direct/empty-directory/empty-subdirectory', 'A'), )) conv.logs[4].check('Add c.txt.', ( ('/%(trunk)s/indirect', 'A'), ('/%(trunk)s/indirect/subdirectory', 'A'), ('/%(trunk)s/indirect/subdirectory/c.txt', 'A'), ('/%(trunk)s/indirect/empty-directory', 'A'), ('/%(trunk)s/indirect/empty-directory/empty-subdirectory', 'A'), )) conv.logs[5].check('Remove b.txt', ( ('/%(trunk)s/direct', 'D'), )) conv.logs[6].check('Remove c.txt', ( ('/%(trunk)s/indirect', 'D'), )) conv.logs[7].check('Re-add b.txt.', ( ('/%(trunk)s/direct', 'A'), ('/%(trunk)s/direct/b.txt', 'A'), ('/%(trunk)s/direct/empty-directory', 'A'), ('/%(trunk)s/direct/empty-directory/empty-subdirectory', 'A'), )) 
conv.logs[8].check('Re-add c.txt.', ( ('/%(trunk)s/indirect', 'A'), ('/%(trunk)s/indirect/subdirectory', 'A'), ('/%(trunk)s/indirect/subdirectory/c.txt', 'A'), ('/%(trunk)s/indirect/empty-directory', 'A'), ('/%(trunk)s/indirect/empty-directory/empty-subdirectory', 'A'), )) conv.logs[9].check('This commit was manufactured', ( ('/%(tags)s/TAG (from /%(trunk)s:8)', 'A'), )) conv.logs[10].check('This commit was manufactured', ( ('/%(branches)s/BRANCH (from /%(trunk)s:8)', 'A'), )) conv.logs[11].check('Import d.txt.', ( ('/%(branches)s/VENDORBRANCH', 'A'), ('/%(branches)s/VENDORBRANCH/import', 'A'), ('/%(branches)s/VENDORBRANCH/import/d.txt', 'A'), ('/%(branches)s/VENDORBRANCH/root-empty-directory', 'A'), ('/%(branches)s/VENDORBRANCH/root-empty-directory/empty-subdirectory', 'A'), ('/%(branches)s/VENDORBRANCH/import/empty-directory', 'A'), ('/%(branches)s/VENDORBRANCH/import/empty-directory/empty-subdirectory', 'A'), )) conv.logs[12].check('This commit was generated', ( ('/%(trunk)s/import', 'A'), ('/%(trunk)s/import/d.txt ' '(from /%(branches)s/VENDORBRANCH/import/d.txt:11)', 'A'), ('/%(trunk)s/import/empty-directory', 'A'), ('/%(trunk)s/import/empty-directory/empty-subdirectory', 'A'), )) @Cvs2SvnTestFunction def include_empty_directories_no_prune(): "test --include-empty-directories with --no-prune" conv = ensure_conversion( 'empty-directories', args=['--include-empty-directories', '--no-prune'], ) conv.logs[1].check('Standard project directories', ( ('/%(trunk)s', 'A'), ('/%(branches)s', 'A'), ('/%(tags)s', 'A'), ('/%(trunk)s/root-empty-directory', 'A'), ('/%(trunk)s/root-empty-directory/empty-subdirectory', 'A'), )) conv.logs[3].check('Add b.txt.', ( ('/%(trunk)s/direct', 'A'), ('/%(trunk)s/direct/b.txt', 'A'), ('/%(trunk)s/direct/empty-directory', 'A'), ('/%(trunk)s/direct/empty-directory/empty-subdirectory', 'A'), )) conv.logs[4].check('Add c.txt.', ( ('/%(trunk)s/indirect', 'A'), ('/%(trunk)s/indirect/subdirectory', 'A'), ('/%(trunk)s/indirect/subdirectory/c.txt', 'A'), ('/%(trunk)s/indirect/empty-directory', 'A'), ('/%(trunk)s/indirect/empty-directory/empty-subdirectory', 'A'), )) conv.logs[5].check('Remove b.txt', ( ('/%(trunk)s/direct/b.txt', 'D'), )) conv.logs[6].check('Remove c.txt', ( ('/%(trunk)s/indirect/subdirectory/c.txt', 'D'), )) conv.logs[7].check('Re-add b.txt.', ( ('/%(trunk)s/direct/b.txt', 'A'), )) conv.logs[8].check('Re-add c.txt.', ( ('/%(trunk)s/indirect/subdirectory/c.txt', 'A'), )) conv.logs[9].check('This commit was manufactured', ( ('/%(tags)s/TAG (from /%(trunk)s:8)', 'A'), )) conv.logs[10].check('This commit was manufactured', ( ('/%(branches)s/BRANCH (from /%(trunk)s:8)', 'A'), )) @Cvs2SvnTestFunction def exclude_symbol_default(): "test 'exclude' symbol default" conv = ensure_conversion( 'symbol-mess', args=['--symbol-default=exclude']) if conv.path_exists('tags', 'MOSTLY_BRANCH') \ or conv.path_exists('branches', 'MOSTLY_BRANCH'): raise Failure() if conv.path_exists('tags', 'MOSTLY_TAG') \ or conv.path_exists('branches', 'MOSTLY_TAG'): raise Failure() @Cvs2SvnTestFunction def add_on_branch2(): "another add-on-branch test case" conv = ensure_conversion('add-on-branch2') if len(conv.logs) != 2: raise Failure() conv.logs[2].check('add file on branch', ( ('/%(branches)s/BRANCH', 'A'), ('/%(branches)s/BRANCH/file1', 'A'), )) @Cvs2SvnTestFunction def branch_from_vendor_branch(): "branch from vendor branch" ensure_conversion( 'branch-from-vendor-branch', symbol_hints_file='branch-from-vendor-branch-symbol-hints.txt', ) @Cvs2SvnTestFunction def 
strange_default_branch(): "default branch too deep in the hierarchy" ensure_conversion( 'strange-default-branch', error_re=( r'ERROR\: The default branch 1\.2\.4\.3\.2\.1\.2 ' r'in file .* is not a top-level branch' ), ) @Cvs2SvnTestFunction def move_parent(): "graft onto preferred parent that was itself moved" conv = ensure_conversion( 'move-parent', ) conv.logs[2].check('first', ( ('/%(trunk)s/file1', 'A'), ('/%(trunk)s/file2', 'A'), )) conv.logs[3].check('This commit was manufactured', ( ('/%(branches)s/b2 (from /%(trunk)s:2)', 'A'), )) conv.logs[4].check('second', ( ('/%(branches)s/b2/file1', 'M'), )) conv.logs[5].check('This commit was manufactured', ( ('/%(branches)s/b1 (from /%(branches)s/b2:4)', 'A'), )) # b2 and b1 are equally good parents for b3, so accept either one. # (Currently, cvs2svn chooses b1 as the preferred parent because it # comes earlier than b2 in alphabetical order.) try: conv.logs[6].check('This commit was manufactured', ( ('/%(branches)s/b3 (from /%(branches)s/b1:5)', 'A'), )) except Failure: conv.logs[6].check('This commit was manufactured', ( ('/%(branches)s/b3 (from /%(branches)s/b2:4)', 'A'), )) @Cvs2SvnTestFunction def log_message_eols(): "nonstandard EOLs in log messages" conv = ensure_conversion( 'log-message-eols', ) conv.logs[2].check('The CRLF at the end of this line\nshould', ( ('/%(trunk)s/lottalogs', 'A'), )) conv.logs[3].check('The CR at the end of this line\nshould', ( ('/%(trunk)s/lottalogs', 'M'), )) @Cvs2SvnTestFunction def missing_vendor_branch(): "default branch not present in RCS file" conv = ensure_conversion( 'missing-vendor-branch', ) if not conv.output_found( r'.*vendor branch \'1\.1\.1\' is not present in file and will be ignored' ): raise Failure() @Cvs2SvnTestFunction def newphrases(): "newphrases in RCS files" ensure_conversion( 'newphrases', ) ######################################################################## # Run the tests # list all tests here, starting with None: test_list = [ None, # 1: show_usage, cvs2svn_manpage, cvs2git_manpage, XFail(cvs2hg_manpage), attr_exec, space_fname, two_quick, PruneWithCare(), PruneWithCare(variant=1, trunk='a', branches='b', tags='c'), # 10: PruneWithCare(variant=2, trunk='a/1', branches='b/1', tags='c/1'), PruneWithCare(variant=3, trunk='a/1', branches='a/2', tags='a/3'), interleaved_commits, simple_commits, SimpleTags(), SimpleTags(variant=1, trunk='a', branches='b', tags='c'), SimpleTags(variant=2, trunk='a/1', branches='b/1', tags='c/1'), SimpleTags(variant=3, trunk='a/1', branches='a/2', tags='a/3'), simple_branch_commits, mixed_time_tag, # 20: mixed_time_branch_with_added_file, mixed_commit, split_time_branch, bogus_tag, overlapping_branch, PhoenixBranch(), PhoenixBranch(variant=1, trunk='a/1', branches='b/1', tags='c/1'), ctrl_char_in_log, overdead, NoTrunkPrune(), # 30: NoTrunkPrune(variant=1, trunk='a', branches='b', tags='c'), NoTrunkPrune(variant=2, trunk='a/1', branches='b/1', tags='c/1'), NoTrunkPrune(variant=3, trunk='a/1', branches='a/2', tags='a/3'), double_delete, split_branch, resync_misgroups, TaggedBranchAndTrunk(), TaggedBranchAndTrunk(variant=1, trunk='a/1', branches='a/2', tags='a/3'), enroot_race, enroot_race_obo, # 40: BranchDeleteFirst(), BranchDeleteFirst(variant=1, trunk='a/1', branches='a/2', tags='a/3'), nonascii_filenames, UnicodeAuthor( warning_expected=1), UnicodeAuthor( warning_expected=0, variant='encoding', args=['--encoding=utf_8']), UnicodeAuthor( warning_expected=0, variant='fallback-encoding', args=['--fallback-encoding=utf_8']), UnicodeLog( 
warning_expected=1), UnicodeLog( warning_expected=0, variant='encoding', args=['--encoding=utf_8']), UnicodeLog( warning_expected=0, variant='fallback-encoding', args=['--fallback-encoding=utf_8']), vendor_branch_sameness, # 50: vendor_branch_trunk_only, default_branches, default_branches_trunk_only, default_branch_and_1_2, compose_tag_three_sources, pass5_when_to_fill, PeerPathPruning(), PeerPathPruning(variant=1, trunk='a/1', branches='a/2', tags='a/3'), EmptyTrunk(), EmptyTrunk(variant=1, trunk='a', branches='b', tags='c'), # 60: EmptyTrunk(variant=2, trunk='a/1', branches='a/2', tags='a/3'), no_spurious_svn_commits, invalid_closings_on_trunk, individual_passes, resync_bug, branch_from_default_branch, file_in_attic_too, retain_file_in_attic_too, symbolic_name_filling_guide, eol_mime1, # 70: eol_mime2, eol_mime3, eol_mime4, cvs_revnums_off, cvs_revnums_on, keywords, ignore, requires_cvs, questionable_branch_names, questionable_tag_names, # 80: revision_reorder_bug, exclude, vendor_branch_delete_add, resync_pass2_pull_forward, native_eol, double_fill, XFail(double_fill2), resync_pass2_push_backward, double_add, bogus_branch_copy, # 90: nested_ttb_directories, auto_props_ignore_case, ctrl_char_in_filename, commit_dependencies, show_help_passes, multiple_tags, multiply_defined_symbols, multiply_defined_symbols_renamed, multiply_defined_symbols_ignored, repeatedly_defined_symbols, # 100: double_branch_delete, symbol_mismatches, overlook_symbol_mismatches, force_symbols, commit_blocks_tags, blocked_excludes, unblock_blocked_excludes, regexp_force_symbols, heuristic_symbol_default, branch_symbol_default, # 110: tag_symbol_default, symbol_transform, write_symbol_info, symbol_hints, parent_hints, parent_hints_invalid, parent_hints_wildcards, path_hints, issue_99, issue_100, # 120: issue_106, options_option, multiproject, crossproject, tag_with_no_revision, delete_cvsignore, repeated_deltatext, nasty_graphs, XFail(tagging_after_delete), crossed_branches, # 130: file_directory_conflict, attic_directory_conflict, use_rcs, internal_co_exclude, internal_co_trunk_only, internal_co_keywords, leftover_revs, requires_internal_co, timestamp_chaos, symlinks, # 140: empty_trunk_path, preferred_parent_cycle, branch_from_empty_dir, trunk_readd, branch_from_deleted_1_1, add_on_branch, main_git, main_git2, git_options, main_hg, # 150: invalid_symbol, invalid_symbol_ignore, invalid_symbol_ignore2, EOLVariants('LF'), EOLVariants('CR'), EOLVariants('CRLF'), EOLVariants('native'), no_revs_file, mirror_keyerror_test, exclude_ntdb_test, # 160: mirror_keyerror2_test, mirror_keyerror3_test, XFail(add_cvsignore_to_branch_test), missing_deltatext, transform_unlabeled_branch_name, ignore_unlabeled_branch, exclude_unlabeled_branch, unlabeled_branch_name_collision, collision_with_unlabeled_branch_name, many_deletes, # 170: cvs_description, include_empty_directories, include_empty_directories_no_prune, exclude_symbol_default, add_on_branch2, branch_from_vendor_branch, strange_default_branch, move_parent, log_message_eols, missing_vendor_branch, # 180: newphrases, ] if __name__ == '__main__': # Configure the environment for reproducable output from svn, etc. os.environ["LC_ALL"] = "C" # Unfortunately, there is no way under Windows to make Subversion # think that the local time zone is UTC, so we just work in the # local time zone. # The Subversion test suite code assumes it's being invoked from # within a working copy of the Subversion sources, and tries to use # the binaries in that tree. 
Since the cvs2svn tree never contains # a Subversion build, we just use the system's installed binaries. svntest.main.svn_binary = svn_binary svntest.main.svnlook_binary = svnlook_binary svntest.main.svnadmin_binary = svnadmin_binary svntest.main.svnversion_binary = svnversion_binary svntest.main.run_tests(test_list) # NOTREACHED ### End of file. cvs2svn-2.4.0/cvs2git0000775000076500007650000000464211134563221015516 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== import sys # Make sure that a supported version of Python is being used. Do this # as early as possible, using only code compatible with Python 1.5.2 # and Python 3.x before the check. Remember: # # Python 1.5.2 doesn't have sys.version_info or ''.join(). # Python 3.0 doesn't have string.join(). # There are plans to start deprecating the string formatting '%' # operator in Python 3.1 (but we use it here anyway). version_error = """\ ERROR: cvs2git requires Python 2, version 2.4 or later; it does not work with Python 3. You are currently using""" version_advice = """\ Please restart cvs2git using a different version of the Python interpreter. Visit http://www.python.org or consult your local system administrator if you need help. HINT: If you already have a usable Python version installed, it might be possible to invoke cvs2git with the correct Python interpreter by typing something like 'python2.5 """ + sys.argv[0] + """ [...]'. """ try: version = sys.version_info except AttributeError: # This is probably a pre-2.0 version of Python. sys.stderr.write(version_error + '\n') sys.stderr.write('-'*70 + '\n') sys.stderr.write(sys.version + '\n') sys.stderr.write('-'*70 + '\n') sys.stderr.write(version_advice) sys.exit(1) if not ((2,4) <= version < (3,0)): sys.stderr.write( version_error + ' version %d.%d.%d.\n' % (version[0], version[1], version[2],) ) sys.stderr.write(version_advice) sys.exit(1) import os from cvs2svn_lib.common import FatalException from cvs2svn_lib.main import git_main try: git_main(os.path.basename(sys.argv[0]), sys.argv[1:]) except FatalException, e: sys.stderr.write(str(e) + '\n') sys.exit(1) cvs2svn-2.4.0/CHANGES0000664000076500007650000003347412027264603015213 0ustar mhaggermhagger00000000000000Version 2.4.0 (22 September 2012) --------------------------------- New features: * Store CVS file descriptions in a Subversion property "cvs:description". * SVN: Optionally include empty directories from the CVS repository. * Much faster cvs2git conversions possible via --use-external-blob-generator. * Use file properties for more flexibility over keyword and EOL handling. * Add a ConditionalPropertySetter. * Allow CVS repository paths to be excluded from the conversion. * Normalize EOLs in CVS log messages to LF. * Ignore vendor branch declarations that refer to non-existent branches. 
Bugs fixed: * Issue #31: cvs2svn does not convert empty directories. * Issue #127: Dead "file X added on branch Y" revisions not always dropped. * Fix --dry-run for cvs2git and cvs2bzr. Improvements and output changes: * More aggressively omit unnecessary dead revisions. * Consider it a failure if "cvs" or "co" writes something to stderr. * Add concept of "file properties", which are only computed once per file. * Refuse to accept a default branch that is not a top-level branch. * Make check of illegal filename characters dependent on the target VCS. * Improve error reporting for invalid date strings in CVS. * Many documentation improvements. * Allow grafting a branch onto a parent that has itself been grafted. * Slightly improve choice of parent branch for vendor branches. * Only import "database" if used, to avoid error if no DB module installed. * Ignore newphrases in RCS files more robustly. * Fix the expansion of the $Source$ keyword for Attic files. * Various other minor improvements and fixes. Miscellaneous: * Sort large files using Python to avoid dependency on GNU sort. Version 2.3.0 (22 August 2009) ------------------------------ New features: * Add a "cvs2git" script for starting conversions to git (or Mercurial). * Add a "cvs2bzr" script for starting conversions to Bazaar. * Generate manual pages automatically via new --man option. * Allow --mime-types and --auto-props options to be specified more than once. * Support author transforms when converting to Subversion. * Allow unlabeled branches to be renamed using SymbolTransforms. Bugs fixed: * cvs2git with non-inline blobs: a revision after a delete could be empty. * Fix timezone handling under Windows (which does not respect TZ variable). * Do path comparisions platform-independently in symbol transform classes. * Fix https://bugs.launchpad.net/pld-linux/+bug/385920 Improvements and output changes: * Output error message if a revision's deltatext is missing. * Improve contrib/verify-cvs2svn.py (used for testing conversion accuracy). Miscellaneous: * Add an IgnoreSymbolTransform class, for ignoring symbols matching a regexp. * Remove some DeprecationWarnings when running under newer Python versions. Version 2.2.0 (23 November 2008) -------------------------------- New features: * cvs2git: Omit fixup branch if a tag can be copied from an existing revision. * cvs2git: Add option to set the maximum number of merge sources per commit. * Allow arbitrary SVN directories to be created when a project is created. * Allow vendor branches to be excluded, grafting child symbols to trunk. * By default, omit trivial import branches from conversion. - Add --keep-trivial-imports option to get old behavior. * By default, don't include .cvsignore files in output (except as svn:ignore). - Add option --keep-cvsignore to get the old behavior. * Allow the user to specify the form of cvs2svn-generated log messages. * Allow file contents to be written inline in git-fast-import streams. * --create-option: allow arbitrary options to be passed to "svnadmin create". * Improve handling of auto-props file: - Discard extraneous spaces where they don't make sense. - Warn if parts of the file might be commented out unintentionally. - Warn if the user appears to be trying to quote a property value. Bugs fixed: * Fix issue #81: Remove svn:ignore property when .cvsignore is deleted. * Fix svn dumpfile conformance: - Don't include a leading '/' for Node-path. - Include the Node-kind field when copying nodes. 
* Make symlink test create symlinks explicitly, to avoid packaging problems. * Accept symbol references to revision numbers that end with ".0". Improvements and output changes: * When -v, log reasons for symbol conversion choices (tag/branch/exclude). * Log preferred parent determinations at verbose (rather than debug) level. * Log symbol transformations at verbose (rather than warn) level. * Log statistics about all symbol transformations at normal level. * cvs2git: Generate lightweight rather than annotated tags. * contrib/destroy_repository.py: - Allow symbols, files, and directories to be renamed. - Allow CVSROOT directory contents to be erased. - Specify what aspects of a repo to destroy via command-line options. Miscellaneous: * cvs2svn now requires Python version 2.4 or later. Version 2.1.1 (15 April 2008) ----------------------------- Bugs fixed: * Make files that are to be sorted more text-like to fix problem on Windows. * Fix comment in header of --write-symbol-info output file. Miscellaneous * Adjust test suite for upstream changes in the svntest code. Version 2.1.0 (19 February 2008) -------------------------------- New features: * Allow conversion of a CVS repository to git (experimental). - Support mapping from cvs author names to git "Author " form. * Enhance symbol transform capabilities: - Add SymbolMapper, for transforming specific symbols in specific files. - Allow SymbolTransforms to cause a symbol to be discarded. * Enhance symbol strategy capabilities: - Write each CVS branch/tag to be written to an arbitrary SVN path. - Choose which trunk/branch should serve as the parent of each branch/tag. - --symbol-hints: manually specify how symbols should be converted. - Make symbol strategy rules project-specific. * --write-symbol-info: output info about CVS symbols. * Add option ctx.decode_apple_single for handling AppleSingle-encoded files. * Add a new, restartable pass that converts author and log_msg to Unicode. * Allow properties to be left unset via auto-props using a leading '!'. Bugs fixed: * Fix issue #80: Empty CVS,v file not accepted. * Fix issue #108: Create project TTB directories "just-in-time". * Fix issue #112: Random test failures with Python 2.5. * Fix issue #115: crash - Bad file descriptor. * Fix the translation of line-end characters for eol-styles CR and CRLF. Improvements and output changes: * Create trunk/tags/branches directories for project when project is created. * Improved conversion speed significantly, especially for large repositories. * Ignore (with a warning) symbols defined to malformed revision numbers. * Tolerate multiple definitions of a symbol to the same revision number. * Handle RCS files that superfluously set the default branch to trunk. * Allow '/' characters in CVS symbol names (creating multilevel SVN paths). * Allow symbols to be transformed to contain '/' (allowing multilevel paths). * Convert '\' characters to '/' (rather than '--') in symbol names. * Make encoding problems fatal; to resolve, restart at CleanMetadataPass. Miscellaneous: * Change the default symbol handling option to --symbol-default=heuristic. Version 2.0.1 (04 October 2007) ------------------------------- Bugs fixed: * Fix problem with keyword expansion when using --use-internal-co. Version 2.0.0 (15 August 2007) ------------------------------ New features: * Add --use-internal-co to speed conversions, and make it the default. * Add --retain-conflicting-attic-files option. * Add --no-cross-branch-commits option. 
* Add --default-eol option and deprecate --no-default-eol. * RevisionRecorder hook allows file text/deltas to be recorded in pass 1. * RevisionReader hook allow file text to be retrieved from RevisionRecorder. * Slightly changed the order that properties are set, for more flexibility. * Don't set svn:keywords on files for which svn:eol-style is not set. * Implement issue #53: Allow --trunk='' for --trunk-only conversions. Bugs fixed: * Fix issue #97: Follow symlinks within CVS repository. * Fix issue #99: cvs2svn tries to create a file twice. * Fix issue #100: cvs2svn doesn't retrieve the right version. * Fix issue #105: Conflict between directory and Attic file causes crash. * Fix issue #106: SVNRepositoryMirrorParentMissingError. * Fix missing command-line handling of --fallback-encoding option. * Fix issue #85: Disable symbol sanity checks when in --trunk-only mode. Improvements and output changes: * Analyze CVS revision dependency graph, giving a more robust conversion. * Improve choice of symbol parents when CVS history is ambiguous. * In the case of clock skew to the past, resync forwards, not backwards. * Treat timestamps that lie in the future as bogus, and adjust backwards. * Gracefully handle tags that refer to nonexistent revisions. * Check and fail if revision header appears multiple times. * Gracefully handle multiple deltatext blocks for same revision. * Be more careful about only processing reasonable *,v files. * Improve checks for illegal filenames. * Check if a directory name conflicts with a filename. * When file is imported, omit the empty revision 1.1. * If a non-trunk default branch is excluded, graft its contents to trunk. * Omit the initial 'dead' revision when a file is added on a branch. * Require --symbol-transform pattern to match entire symbol name. * Treat files as binary by default instead of as text, because it is safer. * Treat auto-props case-insensitively; deprecate --auto-props-ignore-case. Miscellaneous: * Add a simple (nonportable) script to log cvs2svn memory usage. * Allow contrib/shrink_test_case.py script to try deleting tags and branches. * Add --skip-initial-test option to contrib/shrink_test_case.py script. Version 1.5.1 (28 January 2007) ------------------------------- Bugs fixed: * Add missing import in cvs2svn_lib/process.py. * Fix omission of parsing of the --fallback-encoding option. Version 1.5.0 (03 October 2006) ------------------------------- New features: * Support multiproject conversions (each gets its own trunk, tags, branches). * New --options option to allow run-time options to be defined via a file. * --co, --cvs, and --sort options to specify the paths to executables. * Add new --fallback-encoding option. Bugs fixed: * Fix issue #86: Support multiple project roots per repository. * Fix issue #104: Allow path to "sort" executable to be specified. * Fix issue #8: Allow multiple --encoding options. * Fix issue #109: Improve handling of fallback encodings. Improvements and output changes: * Further reduce conversion time and temporary space requirements. Miscellaneous: * Deprecate the --dump-only option (it is now implied by --dumpfile). * Add scripts to help isolate conversion problems and shrink test cases. * Add a script to search for illegal filenames in a CVS repository. Version 1.4.0 (27 August 2006) ------------------------------ New features: * Support multicomponent --trunk, --tags, and --branches paths (issue #7). * New --auto-props option allows file properties to be set via file. 
* --force-branch and --force-tag options now accept regular expressions. * Add --symbol-default option. * Support multiple, ordered --encoding options. Bugs fixed: * Fix issue #93: Tags with forbidden characters converted to branches. * Fix issue #102: Branch file, deleted in CVS, is present in SVN. Improvements and output changes: * Print informative warning message if a required program is missing. * Output an error if any CVS filenames contain control characters. * Clean up temporary files even for pass-by-pass conversions. * Improve handling of commit dependencies and multibranch commits. * Implemented issue #50 (performance change). * Reduced the amount of temporary disk space needed during the conversion. Miscellaneous: * cvs2svn now requires Python version 2.2 or later. * cvs2svn has been broken up into many smaller python modules for clarity. Version 1.3.1 (24 May 2006) --------------------------- Bugs fixed: * Fix issue #67: malfunction caused by RCS branches rooted at revision 1.0. Version 1.3.0 (18 August 2005) ------------------------------ Bugs fixed: * Fix cvs2svn's dumpfile output to work after Subversion's r12645. * Fix issue #71: Avoid resyncing two consecutive CVS revs to same time. * Fix issue #88: Don't allow resyncing to throw off CVS revision order. * Fix issue #89: Handle initially dead branch revisions acceptably. * Fix some branch-filling bugs (r1429, r1444). Improvements and output changes: * Better --encoding support when iconv_codec module is available. * Speedups to pass8 (r1421) * Use standard "rNNN" syntax when printing Subversion revisions. Version 1.2.1 (14 February 2005) -------------------------------- Bugs fixed: * Fix cvs2svn's dumpfile output to work after Subversion's r12645. Version 1.2.0 (11 January 2005) ------------------------------- New features: * --fs-type=TYPE: make it possible to specify the filesystem type. Bugs fixed: * Convert files with svn:eol-style to have LF end of lines only. * Fix hang in pass 8 for files that ended with a CR. * Import unexpanded keywords into the repository. * Fix the handling of the $Revision$ keyword. * Fix bug in branch/tag creation edge case. Version 1.1.0 (15 September 2004) --------------------------------- New features: * --symbol-transform: change tag and branch names using regular expressions. * Flush log after writing, for better feedback when using 'tee'. Bugs fixed: * Issue 74: No longer attempt to change non-existent files. * Allow the Subversion repository created to have spaces in its name. * Avoid erroring when using a svnadmin that uses FSFS by default. Version 1.0.0 (25 August 2004) ------------------------------ * The cvs2svn project comes of age. 
cvs2svn-2.4.0/test-data/0000775000076500007650000000000012027373500016070 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/no-revs-file-cvsrepos/0000775000076500007650000000000012027373500022240 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/no-revs-file-cvsrepos/proj/0000775000076500007650000000000012027373500023212 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/no-revs-file-cvsrepos/proj/one-rev.txt,v0000664000076500007650000000031510720143340025562 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2007.11.18.22.28.09; author mhagger; state Exp; branches; next ; commitid cLpur0bMXRgJK6Gs; desc @@ 1.1 log @Add empty file @ text @@ cvs2svn-2.4.0/test-data/no-revs-file-cvsrepos/proj/no-revs.txt,v0000664000076500007650000000010010720143340025570 0ustar mhaggermhagger00000000000000head ; access; symbols; locks; strict; comment @# @; desc @@ cvs2svn-2.4.0/test-data/trunk-readd-cvsrepos/0000775000076500007650000000000012027373500022152 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/trunk-readd-cvsrepos/root/0000775000076500007650000000000012027373500023135 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/trunk-readd-cvsrepos/root/b_file,v0000664000076500007650000000077310702477016024556 0ustar mhaggermhagger00000000000000head 1.2; access; symbols mytag:1.1.2.1 mybranch:1.1.0.2; locks; strict; comment @# @; 1.2 date 2007.06.17.18.48.11; author mhagger; state Exp; branches; next 1.1; 1.1 date 2004.06.05.14.14.45; author max; state dead; branches 1.1.2.1; next ; 1.1.2.1 date 2004.06.05.14.14.45; author max; state Exp; branches; next ; desc @@ 1.2 log @Re-adding b_file on trunk @ text @b_file 1.2 @ 1.1 log @file b_file was initially added on branch mybranch. 
@ text @d1 1 @ 1.1.2.1 log @Add b_file @ text @@ cvs2svn-2.4.0/test-data/issue-100-cvsrepos/0000775000076500007650000000000012027373500021360 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/issue-100-cvsrepos/file2.txt,v0000775000076500007650000000142710702477015023377 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendor:1.1.1; locks; strict; comment @# @; 1.2 date 2005.02.17.01.59.23; author ianr; state Exp; branches; next 1.1; 1.1 date 2003.02.04.22.27.56; author ianr; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.02.04.22.27.56; author ianr; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.11.28.11.21.29; author ianr; state Exp; branches; next 1.1.1.3; 1.1.1.3 date 2003.02.04.22.27.56; author ianr; state Exp; branches; next 1.1.1.4; 1.1.1.4 date 2005.02.28.05.08.33; author ianr; state Exp; branches; next ; desc @@ 1.2 log @log 1 @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @import 1 @ text @@ 1.1.1.2 log @duplicated log message @ text @@ 1.1.1.3 log @revert @ text @@ 1.1.1.4 log @duplicated log message @ text @@ cvs2svn-2.4.0/test-data/issue-100-cvsrepos/file1.txt,v0000775000076500007650000000142610702477015023375 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendor:1.1.1; locks; strict; comment @# @; 1.2 date 2004.11.28.11.21.29; author ianr; state Exp; branches; next 1.1; 1.1 date 2003.02.04.22.27.56; author ianr; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.02.04.22.27.56; author ianr; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.10.11.02.26.01; author ianr; state Exp; branches; next 1.1.1.3; 1.1.1.3 date 2003.02.04.22.27.56; author ianr; state Exp; branches; next ; desc @@ 1.2 log @duplicated log message @ text @file1.txt<1.2> @ 1.1 log @Initial revision @ text @d1 1 a1 1 file1.txt<1.1> @ 1.1.1.1 log @import 1 @ text @d1 1 a1 1 file1.txt<1.1.1.1> @ 1.1.1.2 log @import 2 @ text @d1 1 a1 1 file1.txt<1.1.1.2> @ 1.1.1.3 log @revert @ text @d1 1 a1 1 file1.txt<1.1.1.3> @ cvs2svn-2.4.0/test-data/mirror-keyerror2-cvsrepos/0000775000076500007650000000000012027373500023166 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror2-cvsrepos/proj/0000775000076500007650000000000012027373500024140 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror2-cvsrepos/proj/dir1/0000775000076500007650000000000012027373500024777 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror2-cvsrepos/proj/dir1/file1.txt,v0000664000076500007650000000052311011572255027002 0ustar mhaggermhagger00000000000000head 1.1; access; symbols TAG1:1.1.2.1 BRANCH1:1.1.2.1.0.2 BRANCH2:1.1.0.2; locks; strict; comment @ * @; 1.1 date 2002.06.24.05.47.35; author author2; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2002.07.03.17.27.18; author author3; state Exp; branches; next ; desc @@ 1.1 log @log 3@ text @@ 1.1.2.1 log @log 4@ text @@ cvs2svn-2.4.0/test-data/mirror-keyerror2-cvsrepos/proj/dir2/0000775000076500007650000000000012027373500025000 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror2-cvsrepos/proj/dir2/file3.txt,v0000664000076500007650000000025111011572256027004 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @// @; 1.1 date 2001.02.16.00.44.44; author author1; state Exp; branches; next ; desc @@ 1.1 log @log 1@ text @@ cvs2svn-2.4.0/test-data/mirror-keyerror2-cvsrepos/proj/dir2/file2.txt,v0000664000076500007650000000027211011572256027006 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH1:1.1.0.2; 
locks; strict; comment @// @; 1.1 date 2002.01.04.21.27.35; author author1; state Exp; branches; next ; desc @@ 1.1 log @log 2@ text @@ cvs2svn-2.4.0/test-data/keywords-cvsrepos/0000775000076500007650000000000012027373500021601 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/keywords-cvsrepos/foo.ko,v0000664000076500007650000000100110702477021023152 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; expand @o@; 1.2 date 2004.07.28.10.42.27; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision, appending text to the first revision. @ text @This is the first revision in this file. It has three keywords: $Author$ $Date$ $Id$ This second revision appends some text to the first revision. @ 1.1 log @Add a file. @ text @d10 1 @ cvs2svn-2.4.0/test-data/keywords-cvsrepos/foo.kkv,v0000664000076500007650000000117110702477021023344 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.28.10.42.27; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision, appending text to the first revision. @ text @This is the first revision in this file. It has three keywords: $Author: jrandom $ $Date: 2004/07/19 20:57:24 $ $Id: foo.kkv,v 1.1 2004/07/19 20:57:24 jrandom Exp $ This second revision appends some text to the first revision. @ 1.1 log @Add a file. @ text @d5 1 a5 1 $Author$ d7 1 a7 1 $Date$ d9 2 a10 1 $Id$ @ cvs2svn-2.4.0/test-data/keywords-cvsrepos/foo.kv,v0000664000076500007650000000115110702477021023167 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; expand @v@; 1.2 date 2004.07.28.10.42.27; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision, appending text to the first revision. @ text @This is the first revision in this file. It has three keywords: jrandom 2004/07/19 20:57:24 foo.kv,v 1.1 2004/07/19 20:57:24 jrandom Exp This second revision appends some text to the first revision. @ 1.1 log @Add a file. @ text @d5 1 a5 1 $Author$ d7 1 a7 1 $Date$ d9 2 a10 1 $Id$ @ cvs2svn-2.4.0/test-data/keywords-cvsrepos/foo.default,v0000664000076500007650000000117510702477021024201 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.28.10.42.27; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision, appending text to the first revision. @ text @This is the first revision in this file. It has three keywords: $Author: jrandom $ $Date: 2004/07/19 20:57:24 $ $Id: foo.default,v 1.1 2004/07/19 20:57:24 jrandom Exp $ This second revision appends some text to the first revision. @ 1.1 log @Add a file. @ text @d5 1 a5 1 $Author$ d7 1 a7 1 $Date$ d9 2 a10 1 $Id$ @ cvs2svn-2.4.0/test-data/keywords-cvsrepos/foo.kk,v0000664000076500007650000000100110702477021023146 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; expand @k@; 1.2 date 2004.07.28.10.42.27; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision, appending text to the first revision. 
@ text @This is the first revision in this file. It has three keywords: $Author$ $Date$ $Id$ This second revision appends some text to the first revision. @ 1.1 log @Add a file. @ text @d10 1 @ cvs2svn-2.4.0/test-data/keywords-cvsrepos/foo.kb,v0000664000076500007650000000100110702477021023135 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; expand @b@; 1.2 date 2004.07.28.10.42.27; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision, appending text to the first revision. @ text @This is the first revision in this file. It has three keywords: $Author$ $Date$ $Id$ This second revision appends some text to the first revision. @ 1.1 log @Add a file. @ text @d10 1 @ cvs2svn-2.4.0/test-data/keywords-cvsrepos/foo.kkvl,v0000664000076500007650000000121010702477021023512 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; expand @kvl@; 1.2 date 2004.07.28.10.42.27; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision, appending text to the first revision. @ text @This is the first revision in this file. It has three keywords: $Author: jrandom $ $Date: 2004/07/19 20:57:24 $ $Id: foo.kkvl,v 1.1 2004/07/19 20:57:24 jrandom Exp $ This second revision appends some text to the first revision. @ 1.1 log @Add a file. @ text @d5 1 a5 1 $Author$ d7 1 a7 1 $Date$ d9 2 a10 1 $Id$ @ cvs2svn-2.4.0/test-data/double-fill-cvsrepos/0000775000076500007650000000000012027373500022130 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/double-fill-cvsrepos/Attic/0000775000076500007650000000000012027373500023174 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/double-fill-cvsrepos/Attic/oldfile.txt,v0000775000076500007650000000060310702477015025623 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH-boom:1.1.0.2; locks; strict; comment @# @; 1.1 date 2005.04.04.16.13.48; author sally; state dead; branches 1.1.2.1; next ; 1.1.2.1 date 2005.04.04.16.13.48; author boom; state dead; branches; next ; desc @@ 1.1 log @log message @ text @@ 1.1.2.1 log @file oldfile.txt was added on branch BRANCH-boom on 2005-04-06 20:15:29 +0000 @ text @@ cvs2svn-2.4.0/test-data/double-fill-cvsrepos/file.txt,v0000775000076500007650000000075410702477015024067 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH-boom:1.2.0.2; locks; strict; comment @ * @; 1.2 date 2005.04.22.14.13.03; author harry; state Exp; branches 1.2.2.1; next 1.1; 1.1 date 2005.04.21.19.38.46; author sally; state dead; branches; next ; 1.2.2.1 date 2005.04.22.14.13.03; author harry; state dead; branches; next ; desc @@ 1.2 log @log message @ text @@ 1.2.2.1 log @file file.txt was added on branch BRANCH-boom on 2005-04-22 17:58:42 +0000 @ text @@ 1.1 log @log message @ text @@ cvs2svn-2.4.0/test-data/double-fill-cvsrepos/otherfile.txt,v0000775000076500007650000000116510702477015025126 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH-boom:1.2.0.2; locks; strict; comment @# @; 1.2 date 2005.03.29.22.31.23; author harry; state Exp; branches 1.2.2.1; next 1.1; 1.1 date 2005.03.29.13.21.35; author sally; state dead; branches; next ; 1.2.2.1 date 2005.03.29.22.31.23; author harry; state dead; branches; next 1.2.2.2; 1.2.2.2 date 2005.03.29.23.36.06; author harry; state Exp; branches; next ; desc @@ 1.2 log @log message @ 
text @@ 1.2.2.1 log @file otherfile.txt was added on branch BRANCH-boom on 2005-03-29 23:36:06 +0000 @ text @@ 1.2.2.2 log @merged trunk to boom @ text @@ 1.1 log @log message @ text @@ cvs2svn-2.4.0/test-data/cvsignore-cvsrepos/0000775000076500007650000000000012027373500021731 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/cvsignore-cvsrepos/proj/0000775000076500007650000000000012027373500022703 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/cvsignore-cvsrepos/proj/.cvsignore,v0000664000076500007650000000055610702477013025155 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.22.16.03.32; author fitz; state Exp; branches; next 1.1; 1.1 date 2004.07.22.16.02.36; author fitz; state Exp; branches; next ; desc @@ 1.2 log @Add a few more ignore lines, plus a blank line. @ text @*.idx *.aux *.dvi *.log foo bar baz qux @ 1.1 log @initial import @ text @d5 5 @ cvs2svn-2.4.0/test-data/cvsignore-cvsrepos/proj/file.txt,v0000664000076500007650000000035310702477013024630 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.1 log @Add a file. @ text @This is the only revision in this file. It's made of meat. @ cvs2svn-2.4.0/test-data/cvsignore-cvsrepos/proj/subdir/0000775000076500007650000000000012027373500024173 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/cvsignore-cvsrepos/proj/subdir/.cvsignore,v0000664000076500007650000000055610702477013026445 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.22.16.03.32; author fitz; state Exp; branches; next 1.1; 1.1 date 2004.07.22.16.02.36; author fitz; state Exp; branches; next ; desc @@ 1.2 log @Add a few more ignore lines, plus a blank line. @ text @*.idx *.aux *.dvi *.log foo bar baz qux @ 1.1 log @initial import @ text @d5 5 @ cvs2svn-2.4.0/test-data/phoenix-cvsrepos/0000775000076500007650000000000012027373500021404 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/phoenix-cvsrepos/Attic/0000775000076500007650000000000012027373500022450 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/phoenix-cvsrepos/Attic/added-on-branch.txt,v0000664000076500007650000000067510702477015026375 0ustar mhaggermhagger00000000000000head 1.1; access; symbols xiphophorus:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.06.15.15.33.01; author fitz; state dead; branches 1.1.2.1; next ; 1.1.2.1 date 2004.06.15.15.33.01; author fitz; state Exp; branches; next ; desc @@ 1.1 log @file added-on-branch.txt was initially added on branch xiphophorus. @ text @@ 1.1.2.1 log @File added on branch xiphophorus @ text @a0 1 This file was added on the branch xiphophorus.@ cvs2svn-2.4.0/test-data/phoenix-cvsrepos/Attic/added-on-branch2.txt,v0000664000076500007650000000124610702477015026452 0ustar mhaggermhagger00000000000000head 1.1; access; symbols xiphophorus:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.06.15.15.39.58; author fitz; state dead; branches 1.1.2.1; next ; 1.1.2.1 date 2004.06.15.15.39.58; author fitz; state Exp; branches; next ; desc @@ 1.1 log @file added-on-branch2.txt was initially added on branch xiphophorus, and this log message was tweaked so that it's not the standard log message generated by CVS for the 'dead' trunk revision of a file added on a branch. 
@ text @@ 1.1.2.1 log @This file was also added on branch xiphophorus, but slightly after the other file was added on this branch. @ text @a0 1 This file was also added on the branch xiphophorus. @ cvs2svn-2.4.0/test-data/phoenix-cvsrepos/file.txt,v0000664000076500007650000000052210702477015023331 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2000.09.01.09.56.05; author fitz; state Exp; branches; next ; desc @@ 1.1 log @initial import @ text @This is a file on trunk that has a few revisions. It's here only to make sure that it doesn't accidentally get copied over to any branches when they get created. @ cvs2svn-2.4.0/test-data/phoenix-cvsrepos/phoenix,v0000664000076500007650000001650110702477015023252 0ustar mhaggermhagger00000000000000head 1.4; access; symbols libogg2-zerocopy:1.4.0.4 vorbis1_0_public_release:1.4 volsung_flush:1.4.0.2 release_0_8_2:1.4 vorbis1_0_public_release_candidate_2:1.4 volsung_20010721:1.2.0.2 start:1.1.1.1 xiphophorus:1.1.1; locks; strict; comment @# @; 1.4 date 2001.08.05.02.35.30; author volsung; state Exp; branches; next 1.3; 1.3 date 2001.08.04.02.56.08; author volsung; state Exp; branches; next 1.2; 1.2 date 2000.10.31.07.08.41; author jack; state dead; branches 1.2.2.1; next 1.1; 1.1 date 2000.09.03.09.56.05; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2000.09.03.09.56.05; author jack; state Exp; branches; next ; 1.2.2.1 date 2001.07.22.03.35.41; author volsung; state Exp; branches; next 1.2.2.2; 1.2.2.2 date 2001.07.24.17.51.09; author volsung; state Exp; branches; next ; desc @@ 1.4 log @This file was supplied by Jack Moffitt to help us reproduce a bug in which cvs2svn.py would throw an exception when it tried to create a file branch rooted in a 'dead' RCS revision. See http://subversion.tigris.org/issues/show_bug.cgi?id=1417 for more details. The original file was ao/doc/index.html,v, and the log message for this revision was: "Documentation update." @ text @ libao - Documentation

libao documentation

libao version 0.8.0 - 20010804

libao Documentation

Libao is a cross-platform library that allows programs to output PCM audio data to the native audio devices on a wide variety of platforms. It currently supports:

  • OSS (Open Sound System)
  • ESD (ESounD)
  • ALSA (Advanced Linux Sound Architecture)
  • Sun audio system (used in Solaris, OpenBSD, and NetBSD)
  • aRts (Analog Realtime Synthesizer)

libao api overview
drivers
example code
configuration files
libao api reference
plugin writer's overview
plugin api reference



copyright © 2001 Stan Seibert

xiph.org
indigo@@aztec.asu.edu

libao documentation

libao version 0.8.0 - 20010804

@ 1.3 log @This file was supplied by Jack Moffitt to help us reproduce a bug in which cvs2svn.py would throw an exception when it tried to create a file branch rooted in a 'dead' RCS revision. See http://subversion.tigris.org/issues/show_bug.cgi?id=1417 for more details. The original file was ao/doc/index.html,v, and the log message for this revision was: "Merger of new API branch (volsung_20010721) with head." @ text @d12 1 a12 1

libao version 0.90 - 20010528

d35 1 a35 1 plugin overview
d46 1 a46 1

libao version 0.90 - 20010528

@ 1.2 log @This file was supplied by Jack Moffitt to help us reproduce a bug in which cvs2svn.py would throw an exception when it tried to create a file branch rooted in a 'dead' RCS revision. See http://subversion.tigris.org/issues/show_bug.cgi?id=1417 for more details. The original file was ao/doc/index.html,v, and the log message for this revision was: "documetation???? WHAT!?!" jack. @ text @d1 52 a52 1 There is no documentation yet. @ 1.2.2.1 log @This file was supplied by Jack Moffitt to help us reproduce a bug in which cvs2svn.py would throw an exception when it tried to create a file branch rooted in a 'dead' RCS revision. See http://subversion.tigris.org/issues/show_bug.cgi?id=1417 for more details. The original file was ao/doc/index.html,v, and the log message for this revision was: "Initial branch of libao to new API. Great fear and trembling shall sweep the land..." @ text @d1 1 a1 52 libao - Documentation

libao documentation

libao version 0.90 - 20010528

libao Documentation

Libao is a cross-platform library that allows programs to output PCM audio data to the native audio devices on a wide variety of platforms. It currently supports:

  • OSS (Open Sound System)
  • ESD (ESounD)
  • ALSA (Advanced Linux Sound Architecture)
  • Sun audio system (used in Solaris, OpenBSD, and NetBSD)
  • aRts (Analog Realtime Synthesizer)

libao api overview
driver list
example code
configuration files
libao api reference
plugin overview
plugin api reference



copyright © 2001 Stan Seibert

xiph.org
indigo@@aztec.asu.edu

libao documentation

libao version 0.90 - 20010528

@ 1.2.2.2 log @This file was supplied by Jack Moffitt to help us reproduce a bug in which cvs2svn.py would throw an exception when it tried to create a file branch rooted in a 'dead' RCS revision. See http://subversion.tigris.org/issues/show_bug.cgi?id=1417 for more details. The original file was ao/doc/index.html,v, and the log message for this revision was: "We now use a ranking system to select a defaut driver from the range of possible drivers. Note that the null device is no longer a possible default device to prevent user confusion that we have had in the past." @ text @d31 1 a31 1 drivers
@ 1.1 log @This file was supplied by Jack Moffitt to help us reproduce a bug in which cvs2svn.py would throw an exception when it tried to create a file branch rooted in a 'dead' RCS revision. See http://subversion.tigris.org/issues/show_bug.cgi?id=1417 for more details. The original file was ao/doc/index.html,v, and the log message for this revision was: "Initial revision" @ text @@ 1.1.1.1 log @This file was supplied by Jack Moffitt to help us reproduce a bug in which cvs2svn.py would throw an exception when it tried to create a file branch rooted in a 'dead' RCS revision. See http://subversion.tigris.org/issues/show_bug.cgi?id=1417 for more details. The original file was ao/doc/index.html,v, and the log message for this revision was: "The first sample..." @ text @@ cvs2svn-2.4.0/test-data/native-eol-cvsrepos/0000775000076500007650000000000012027373500021775 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/native-eol-cvsrepos/foo.txt,v0000664000076500007650000000111210702477013023560 0ustar mhaggermhagger00000000000000head 1.4; access; symbols; locks; strict; comment @# @; 1.4 date 2005.01.04.19.59.01; author tori; state Exp; branches; next 1.3; 1.3 date 2005.01.04.19.58.15; author tori; state Exp; branches; next 1.2; 1.2 date 2005.01.04.19.57.04; author tori; state Exp; branches; next 1.1; 1.1 date 2005.01.04.19.55.50; author tori; state Exp; branches; next ; desc @@ 1.4 log @Mixed EOLs @ text @And then mixed EOLs @ 1.3 log @CR EOLs @ text @d1 2 a2 1 And then CR EOLs @ 1.2 log @CRLF EOLs @ text @d1 1 a1 3 Then CRLF EOLs @ 1.1 log @LF EOLs @ text @d1 3 a3 3 First LF EOLs @ cvs2svn-2.4.0/test-data/add-on-branch-cvsrepos/0000775000076500007650000000000012027373500022327 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/add-on-branch-cvsrepos/proj/0000775000076500007650000000000012027373500023301 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/add-on-branch-cvsrepos/proj/d.txt,v0000664000076500007650000000123310702477013024530 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH3:1.1.0.2; locks; strict; comment @# @; 1.2 date 2007.07.05.20.37.52; author mhagger; state Exp; branches; next 1.1; 1.1 date 2007.07.05.18.27.32; author mhagger; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2007.07.05.18.27.32; author mhagger; state dead; branches; next 1.1.2.2; 1.1.2.2 date 2007.07.05.18.27.35; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @Changing d.txt:1.2 @ text @1.2 @ 1.1 log @Adding d.txt:1.1 @ text @d1 1 a1 1 1.1 @ 1.1.2.1 log @file d.txt was added on branch BRANCH3 on 2007-07-05 18:27:35 +0000 @ text @d1 1 @ 1.1.2.2 log @Adding d.txt:1.1.2.2 @ text @a0 1 1.1.2.2 @ cvs2svn-2.4.0/test-data/add-on-branch-cvsrepos/proj/Attic/0000775000076500007650000000000012027373500024345 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/add-on-branch-cvsrepos/proj/Attic/c.txt,v0000664000076500007650000000121510702477013025573 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH2:1.2.0.2; locks; strict; comment @# @; 1.2 date 2007.07.05.18.27.28; author mhagger; state dead; branches 1.2.2.1; next 1.1; 1.1 date 2007.07.05.18.27.27; author mhagger; state Exp; branches; next ; 1.2.2.1 date 2007.07.05.18.27.28; author mhagger; state dead; branches; next 1.2.2.2; 1.2.2.2 date 2007.07.05.18.27.30; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @Removing c.txt:1.2 @ text @1.1 @ 1.2.2.1 log @file c.txt was added on branch BRANCH2 on 2007-07-05 18:27:30 +0000 @ text @d1 1 @ 1.2.2.2 log @Adding c.txt:1.2.2.2 @ text @a0 1 
1.2.2.2 @ 1.1 log @Adding c.txt:1.1 @ text @@ cvs2svn-2.4.0/test-data/add-on-branch-cvsrepos/proj/a.txt,v0000664000076500007650000000035310702477013024527 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH3:1.1.0.6 BRANCH2:1.1.0.4 BRANCH1:1.1.0.2; locks; strict; comment @# @; 1.1 date 2007.07.05.18.27.21; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding a.txt:1.1 @ text @1.1 @ cvs2svn-2.4.0/test-data/add-on-branch-cvsrepos/proj/b.txt,v0000664000076500007650000000102710702477013024527 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH1:1.1.0.2; locks; strict; comment @# @; 1.1 date 2007.07.05.18.27.22; author mhagger; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2007.07.05.18.27.22; author mhagger; state dead; branches; next 1.1.2.2; 1.1.2.2 date 2007.07.05.18.27.25; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding b.txt:1.1 @ text @1.1 @ 1.1.2.1 log @file b.txt was added on branch BRANCH1 on 2007-07-05 18:27:25 +0000 @ text @d1 1 @ 1.1.2.2 log @Adding b.txt:1.1.2.2 @ text @a0 1 1.1.2.2 @ cvs2svn-2.4.0/test-data/add-on-branch-cvsrepos/makerepo.sh0000775000076500007650000000457510702477013024506 0ustar mhaggermhagger00000000000000#! /bin/sh # This is the script used to create the add-on-branch CVS repository. # (The repository is checked into svn; this script is only here for # its documentation value.) The script should be started from the # main cvs2svn directory. # The output of this script depends on the CVS version. Newer CVS # versions add dead revisions (b.txt:1.1.2.1 and c.txt:1.2.2.1) on the # branch, presumably to indicate that the file didn't exist on the # branch during the period of time between the branching point and # when the 1.x.2.2 revisions were committed. Older versions of CVS do # not add these extra revisions. The point of this test is to handle # the new CVS behavior, so set this variable to point at a newish CVS # executable: cvs=$HOME/download/cvs-1.11.21/src/cvs name=add-on-branch repo=`pwd`/test-data/$name-cvsrepos wc=`pwd`/cvs2svn-tmp/$name-wc [ -e $repo/CVSROOT ] && rm -rf $repo/CVSROOT [ -e $repo/proj ] && rm -rf $repo/proj [ -e $wc ] && rm -rf $wc $cvs -d $repo init $cvs -d $repo co -d $wc . cd $wc mkdir proj $cvs add proj cd $wc/proj echo "Create a file a.txt on trunk:" echo '1.1' >a.txt $cvs add a.txt $cvs commit -m 'Adding a.txt:1.1' . 
echo "Create BRANCH1 on file a.txt:" $cvs tag -b BRANCH1 echo "Create BRANCH2 on file a.txt:" $cvs tag -b BRANCH2 echo "Create BRANCH3 on file a.txt:" $cvs tag -b BRANCH3 f=b.txt b=BRANCH1 echo "Add file $f on trunk:" $cvs up -A echo "1.1" >$f $cvs add $f $cvs commit -m "Adding $f:1.1" echo "Add file $f on $b:" $cvs up -r $b # Ensure that times are distinct: sleep 2 echo "1.1.2.2" >$f $cvs add $f $cvs commit -m "Adding $f:1.1.2.2" f=c.txt b=BRANCH2 echo "Add file $f on trunk:" $cvs up -A echo "1.1" >$f $cvs add $f $cvs commit -m "Adding $f:1.1" echo "Delete file $f on trunk:" rm $f $cvs remove $f $cvs commit -m "Removing $f:1.2" echo "Add file $f on $b:" $cvs up -r $b # Ensure that times are distinct: sleep 2 echo "1.2.2.2" >$f $cvs add $f $cvs commit -m "Adding $f:1.2.2.2" f=d.txt b=BRANCH3 echo "Add file $f on trunk:" $cvs up -A echo "1.1" >$f $cvs add $f $cvs commit -m "Adding $f:1.1" echo "Add file $f on $b:" $cvs up -r $b # Ensure that times are distinct: sleep 2 echo "1.1.2.2" >$f $cvs add $f $cvs commit -m "Adding $f:1.1.2.2" echo "Modify file $f on trunk:" $cvs up -A echo "1.2" >$f $cvs commit -m "Changing $f:1.2" # Erase the unneeded stuff out of CVSROOT: rm -rf $repo/CVSROOT mkdir $repo/CVSROOT cvs2svn-2.4.0/test-data/resync-bug-cvsrepos/0000775000076500007650000000000012027373500022010 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/resync-bug-cvsrepos/b,v0000664000076500007650000000062110702477014022420 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2003.07.03.19.35.55; author max; state Exp; branches; next 1.1; 1.1 date 2004.07.03.19.35.06; author max; state Exp; branches; next ; desc @@ 1.2 log @Modify b. This is the commit with the bad timestamp. (Suppose the server's clock was temporarily set for too early.) @ text @modify @ 1.1 log @Modify a. Add b. @ text @d1 1 @ cvs2svn-2.4.0/test-data/resync-bug-cvsrepos/a,v0000664000076500007650000000044310702477014022421 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.03.19.35.06; author max; state Exp; branches; next 1.1; 1.1 date 2004.07.03.19.33.42; author max; state Exp; branches; next ; desc @@ 1.2 log @Modify a. Add b. @ text @modify @ 1.1 log @Add a. @ text @d1 1 @ cvs2svn-2.4.0/test-data/missing-deltatext-cvsrepos/0000775000076500007650000000000012027373500023377 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/missing-deltatext-cvsrepos/README0000664000076500007650000000030711134563221024256 0ustar mhaggermhagger00000000000000Please note that revision 1.1.4.4 of the RCS file has a "delta" but no "deltatext". cvs2svn should test for this problem and emit a suitable error. This test file was submitted by Sebastian Marek. 
cvs2svn-2.4.0/test-data/missing-deltatext-cvsrepos/file001,v0000664000076500007650000000132511134563221024724 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2003.02.14.12.21.04; author author1; state dead; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2003.02.14.12.21.04; author author1; state Exp; branches; next ; 1.1.4.1 date 2003.02.14.18.28.37; author author1; state Exp; branches; next 1.1.4.2; 1.1.4.2 date 2003.02.28.13.31.47; author author1; state dead; branches; next 1.1.4.3; 1.1.4.3 date 2003.02.28.14.25.24; author author2; state Exp; branches; next 1.1.4.4; 1.1.4.4 date 2003.05.02.21.55.01; author author3; state dead; branches; next ; desc @@ 1.1 log @log 1@ text @@ 1.1.4.1 log @log 2@ text @@ 1.1.4.2 log @log 3@ text @@ 1.1.4.3 log @log 4@ text @@ 1.1.2.1 log @log 5@ text @@ cvs2svn-2.4.0/test-data/vendor-branch-delete-add-cvsrepos/0000775000076500007650000000000012027373500024450 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/vendor-branch-delete-add-cvsrepos/proj/0000775000076500007650000000000012027373500025422 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/vendor-branch-delete-add-cvsrepos/proj/file.txt,v0000664000076500007650000000103110702477015027343 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendorbranch:1.1.1; locks; strict; comment @# @; expand @o@; 1.2 date 2003.03.03.03.03.03; author cc; state Exp; branches; next 1.1; 1.1 date 2001.01.01.01.01.01; author aa; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.01.01.01.01.01; author aa; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2002.02.02.02.02.02; author bb; state dead; branches; next ; desc @@ 1.2 log @1.2 @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @1.1.1.1 @ text @@ 1.1.1.2 log @1.1.1.2 @ text @@ cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/0000775000076500007650000000000012027373500021436 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/foo.UPCASE1,v0000664000076500007650000000062410702477021023451 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. @ 1.1 log @Add a file. @ text @d2 1 @ cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/foo.asc,v0000664000076500007650000000062410756006223023157 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. @ 1.1 log @Add a file. @ text @d2 1 @ cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/auto-props0000664000076500007650000000070510761607530023502 0ustar mhaggermhagger00000000000000# auto-props file for auto_props() test. 
[auto-props] *.txt = myprop=txt *.xml = myprop=xml;svn:mime-type=text/xml;svn:eol-style=CRLF *.zip = myprop=zip;svn:mime-type=application/zip *.asc = myprop=asc;svn:mime-type=text/plain;!svn:eol-style *.bin = myprop=bin;svn:executable *.csv = myprop=csv;svn:mime-type=text/csv;svn:eol-style=CRLF *.dbf = myprop=dbf;svn:mime-type=application/what-is-dbf *.UPCASE1 = myprop=UPCASE1 *.upcase2 = myprop=UPCASE2 cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/foo.xml,v0000664000076500007650000000062410702477021023210 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. @ 1.1 log @Add a file. @ text @d2 1 @ cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/foo.zip,v0000664000076500007650000000062410702477021023212 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. @ 1.1 log @Add a file. @ text @d2 1 @ cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/README0000664000076500007650000000061010702477021022314 0ustar mhaggermhagger00000000000000This repository is for testing that setting of svn:eol-style works as expected, that the mime mapper also works, and that mime types and eol styles interact with each other correctly. Issue #39 has more about how these interactions work. The 'mime-mappings.txt' file for this repository is stored right here, in the repository itself. It won't be converted because it doesn't end with ,v. cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/foo.csv,v0000664000076500007650000000064010702477021023201 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; expand @b@; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. @ 1.1 log @Add a file. @ text @d2 1 @ cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/foo.bin,v0000664000076500007650000000060310760647473023173 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; expand @b@; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @#!/bin/sh echo 'Hello world!' echo 'Hello back at you!' @ 1.1 log @Add a file. @ text @d3 1 @ cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/foo.txt,v0000664000076500007650000000062410702477021023227 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. 
@ 1.1 log @Add a file. @ text @d2 1 @ cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/foo.UPCASE2,v0000664000076500007650000000062410702477021023452 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. @ 1.1 log @Add a file. @ text @d2 1 @ cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/mime.types0000664000076500007650000000020410702477021023450 0ustar mhaggermhagger00000000000000text/xml xml application/zip zip text/csv csv application/what-is-dbf dbf cvs2svn-2.4.0/test-data/eol-mime-cvsrepos/foo.dbf,v0000664000076500007650000000064010702477021023141 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; expand @b@; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. @ 1.1 log @Add a file. @ text @d2 1 @ cvs2svn-2.4.0/test-data/questionable-symbols-cvsrepos/0000775000076500007650000000000012027373500024113 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/questionable-symbols-cvsrepos/foo.txt,v0000664000076500007650000001171210702477016025710 0ustar mhaggermhagger00000000000000head 1.2; access; symbols TagWith/Slash_Z:1.2 TagWith\Backslash_E:1.2 TagWith///ThreeSlashes_D:1.2 Tag_A:1.2 BranchWith/Slash_Z:1.1.0.10 /BranchStartsWithSlash_Y:1.1.0.8 #BranchStartsWithHash_X:1.1.0.6 BranchWith.Dot_W:1.1.0.4 3BranchStartsWithNumber_V:1.1.0.2 BranchWith\Backslash_E:1.2.0.10 BranchWith///ThreeSlashes_D:1.2.0.8 BranchWith.Various/Prohibited\Symbols_C:1.2.0.6 \BranchStartsWithBackslash_B:1.2.0.4 Branch_A:1.2.0.2; locks; strict; comment @# @; 1.2 date 2004.07.26.23.38.17; author kfogel; state Exp; branches 1.2.2.1 1.2.4.1 1.2.6.1 1.2.8.1 1.2.10.1; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches 1.1.2.1 1.1.4.1 1.1.6.1 1.1.8.1 1.1.10.1; next ; 1.1.2.1 date 2004.08.11.18.27.06; author kfogel; state Exp; branches; next 1.1.2.2; 1.1.2.2 date 2004.08.11.18.28.12; author kfogel; state Exp; branches; next ; 1.1.4.1 date 2004.08.11.18.27.09; author kfogel; state Exp; branches; next 1.1.4.2; 1.1.4.2 date 2004.08.11.18.28.15; author kfogel; state Exp; branches; next ; 1.1.6.1 date 2004.08.11.18.27.12; author kfogel; state Exp; branches; next 1.1.6.2; 1.1.6.2 date 2004.08.11.18.28.18; author kfogel; state Exp; branches; next ; 1.1.8.1 date 2004.08.11.18.27.15; author kfogel; state Exp; branches; next 1.1.8.2; 1.1.8.2 date 2004.08.11.18.28.21; author kfogel; state Exp; branches; next ; 1.1.10.1 date 2004.08.11.18.27.18; author kfogel; state Exp; branches; next 1.1.10.2; 1.1.10.2 date 2004.08.11.18.28.24; author kfogel; state Exp; branches; next ; 1.2.2.1 date 2004.08.11.18.26.55; author kfogel; state Exp; branches; next 1.2.2.2; 1.2.2.2 date 2004.08.11.18.27.57; author kfogel; state Exp; branches; next ; 1.2.4.1 date 2004.08.11.18.26.57; author kfogel; state Exp; branches; next 1.2.4.2; 1.2.4.2 date 2004.08.11.18.28.00; author kfogel; state Exp; branches; next ; 1.2.6.1 date 2004.08.11.18.26.59; author kfogel; state Exp; branches; next 1.2.6.2; 1.2.6.2 date 2004.08.11.18.28.03; 
author kfogel; state Exp; branches; next ; 1.2.8.1 date 2004.08.11.18.27.01; author kfogel; state Exp; branches; next 1.2.8.2; 1.2.8.2 date 2004.08.11.18.28.06; author kfogel; state Exp; branches; next ; 1.2.10.1 date 2004.08.11.18.27.03; author kfogel; state Exp; branches; next 1.2.10.2; 1.2.10.2 date 2004.08.11.18.28.09; author kfogel; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. @ 1.2.10.1 log @Committing branch change (unique code 'E'). @ text @a2 1 First commit on this branch (unique code 'E'). @ 1.2.10.2 log @Committing another branch change (unique code 'E'). @ text @a3 1 Second commit on this branch (unique code 'E'). @ 1.2.8.1 log @Committing branch change (unique code 'D'). @ text @a2 1 First commit on this branch (unique code 'D'). @ 1.2.8.2 log @Committing another branch change (unique code 'D'). @ text @a3 1 Second commit on this branch (unique code 'D'). @ 1.2.6.1 log @Committing branch change (unique code 'C'). @ text @a2 1 First commit on this branch (unique code 'C'). @ 1.2.6.2 log @Committing another branch change (unique code 'C'). @ text @a3 1 Second commit on this branch (unique code 'C'). @ 1.2.4.1 log @Committing branch change (unique code 'B'). @ text @a2 1 First commit on this branch (unique code 'B'). @ 1.2.4.2 log @Committing another branch change (unique code 'B'). @ text @a3 1 Second commit on this branch (unique code 'B'). @ 1.2.2.1 log @Committing branch change (unique code 'A'). @ text @a2 1 First commit on this branch (unique code 'A'). @ 1.2.2.2 log @Committing another branch change (unique code 'A'). @ text @a3 1 Second commit on this branch (unique code 'A'). @ 1.1 log @Add a file. @ text @d2 1 @ 1.1.10.1 log @Committing branch change (unique code 'Z'). @ text @a1 1 First commit on this branch (unique code 'Z'). @ 1.1.10.2 log @Committing another branch change (unique code 'Z'). @ text @a2 1 Second commit on this branch (unique code 'Z'). @ 1.1.8.1 log @Committing branch change (unique code 'Y'). @ text @a1 1 First commit on this branch (unique code 'Y'). @ 1.1.8.2 log @Committing another branch change (unique code 'Y'). @ text @a2 1 Second commit on this branch (unique code 'Y'). @ 1.1.6.1 log @Committing branch change (unique code 'X'). @ text @a1 1 First commit on this branch (unique code 'X'). @ 1.1.6.2 log @Committing another branch change (unique code 'X'). @ text @a2 1 Second commit on this branch (unique code 'X'). @ 1.1.4.1 log @Committing branch change (unique code 'W'). @ text @a1 1 First commit on this branch (unique code 'W'). @ 1.1.4.2 log @Committing another branch change (unique code 'W'). @ text @a2 1 Second commit on this branch (unique code 'W'). @ 1.1.2.1 log @Committing branch change (unique code 'V'). @ text @a1 1 First commit on this branch (unique code 'V'). @ 1.1.2.2 log @Committing another branch change (unique code 'V'). @ text @a2 1 Second commit on this branch (unique code 'V'). 
@ cvs2svn-2.4.0/test-data/main-cvsrepos/0000775000076500007650000000000012027373500020656 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune/0000775000076500007650000000000012027373500022747 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune/Attic/0000775000076500007650000000000012027373500024013 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune/Attic/second,v0000664000076500007650000000112710762064151025457 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 96.08.20.23.53.47; author jrandom; state dead; branches; next 1.1; 1.1 date 95.03.31.07.44.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 95.03.31.07.44.02; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Remove the file 'second'. Since 'first' was already removed, removing 'second' empties the directory, so the directory itself gets pruned. @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Original sources from CVS-1.4A2 munged to fit our directory structure. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune/Attic/first,v0000664000076500007650000000121210762064151025326 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @; @; 1.3 date 95.12.30.18.37.22; author jrandom; state dead; branches; next 1.2; 1.2 date 95.12.11.00.27.53; author jrandom; state dead; branches; next 1.1; 1.1 date 93.06.18.05.46.07; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 93.06.18.05.46.08; author jrandom; state Exp; branches; next ; desc @@ 1.3 log @Remove the file 'first' again, which should have no effect. @ text @@ 1.2 log @Remove the file 'first', for the first time. (Note that its sibling 'second' still exists.) @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Updated CVS @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/0000775000076500007650000000000012027373500023160 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/b,v0000664000076500007650000000110010702477017023564 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing letters only. @ text @This is file b, always committed with other letters, never with numbers. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/a,v0000664000076500007650000000110010702477017023563 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing letters only. @ text @This is file a, always committed with other letters, never with numbers. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. 
@ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/c,v0000664000076500007650000000110010702477017023565 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing letters only. @ text @This is file c, always committed with other letters, never with numbers. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/e,v0000664000076500007650000000110010702477017023567 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing letters only. @ text @This is file e, always committed with other letters, never with numbers. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/d,v0000664000076500007650000000110010702477017023566 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing letters only. @ text @This is file d, always committed with other letters, never with numbers. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/4,v0000664000076500007650000000110010702477017023506 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing numbers only. @ text @This is file 4, always committed with other numbers, never with letters. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/1,v0000664000076500007650000000110010702477017023503 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing numbers only. @ text @This is file 1, always committed with other numbers, never with letters. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. 
@ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/2,v0000664000076500007650000000110010702477017023504 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing numbers only. @ text @This is file 2, always committed with other numbers, never with letters. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/5,v0000664000076500007650000000110010702477017023507 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing numbers only. @ text @This is file 5, always committed with other numbers, never with letters. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/interleaved/3,v0000664000076500007650000000110010702477017023505 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.06.03.00.20.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Committing numbers only. @ text @This is file 3, always committed with other numbers, never with letters. A random change on trunk. @ 1.1 log @Initial revision @ text @d2 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/cvs2git.options0000664000076500007650000000035011121220731023641 0ustar mhaggermhagger00000000000000# (Be in -*- mode: python; coding: utf-8 -*- mode.) # An options file to test converting to git. # Actually, the example file cvs2git-example.options is fully # functional, so we just load it: execfile('cvs2git-example.options') cvs2svn-2.4.0/test-data/main-cvsrepos/cvs2svn.options0000664000076500007650000000064310737603204023705 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # As a partial check that the example options file is functional, we # use it as the basis for this test. We only need to overwrite the # output option to get the output repository in the location expected # by the test infrastructure. 
execfile('cvs2svn-example.options') ctx.output_option = NewRepositoryOutputOption( 'cvs2svn-tmp/main--options=cvs2svn.options-svnrepos', ) cvs2svn-2.4.0/test-data/main-cvsrepos/partial-prune/0000775000076500007650000000000012027373500023441 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/partial-prune/sub/0000775000076500007650000000000012027373500024232 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/partial-prune/sub/Attic/0000775000076500007650000000000012027373500025276 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/partial-prune/sub/Attic/second,v0000664000076500007650000000112710762064150026741 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 96.08.20.23.53.47; author jrandom; state dead; branches; next 1.1; 1.1 date 95.03.31.07.44.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 95.03.31.07.44.02; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Remove the file 'second'. Since 'first' was already removed, removing 'second' empties the directory, so the directory itself gets pruned. @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Original sources from CVS-1.4A2 munged to fit our directory structure. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/partial-prune/sub/Attic/first,v0000664000076500007650000000121210762064150026610 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @; @; 1.3 date 95.12.30.18.37.22; author jrandom; state dead; branches; next 1.2; 1.2 date 95.12.11.00.27.53; author jrandom; state dead; branches; next 1.1; 1.1 date 93.06.18.05.46.07; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 93.06.18.05.46.08; author jrandom; state Exp; branches; next ; desc @@ 1.3 log @Remove the file 'first' again, which should have no effect. @ text @@ 1.2 log @Remove the file 'first', for the first time. (Note that its sibling 'second' still exists.) @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Updated CVS @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/partial-prune/permanent,v0000664000076500007650000000075610702477017025635 0ustar mhaggermhagger00000000000000head 1.1; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.1 date 94.06.18.05.46.08; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 94.06.18.05.46.08; author jrandom; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @This file was added in between the addition of sub/first and sub/second, to demonstrate that when those two are both removed, the pruning stops with sub/. @ 1.1.1.1 log @Updated CVS @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/cvs2svn-multiproject.options0000664000076500007650000000201310760115614026414 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # As a partial check that the example options file is functional, we # use it as the basis for this test. We only need to overwrite the # output option to get the output repository in the location expected # by the test infrastructure. 
import os execfile('cvs2svn-example.options') ctx.output_option = NewRepositoryOutputOption( 'cvs2svn-tmp/main--options=cvs2svn-multiproject.options-svnrepos', ) ctx.cross_project_commits = False ctx.cross_branch_commits = False run_options.clear_projects() for project in [ 'full-prune', 'full-prune-reappear', 'interleaved', 'partial-prune', 'proj', 'single-files', ]: run_options.add_project( os.path.join(r'test-data/main-cvsrepos', project), trunk_path='%s/trunk' % (project,), branches_path='%s/branches' % (project,), tags_path='%s/tags' % (project,), initial_directories=['%s/releases' % (project,)], symbol_strategy_rules=global_symbol_strategy_rules, ) cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune-reappear/0000775000076500007650000000000012027373500024544 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune-reappear/sub/0000775000076500007650000000000012027373500025335 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune-reappear/sub/Attic/0000775000076500007650000000000012027373500026401 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune-reappear/sub/Attic/second,v0000664000076500007650000000112710762064150030044 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 96.08.20.23.53.47; author jrandom; state dead; branches; next 1.1; 1.1 date 95.03.31.07.44.01; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 95.03.31.07.44.02; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Remove the file 'second'. Since 'first' was already removed, removing 'second' empties the directory, so the directory itself gets pruned. @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Original sources from CVS-1.4A2 munged to fit our directory structure. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune-reappear/sub/Attic/first,v0000664000076500007650000000121210762064150027713 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @; @; 1.3 date 95.12.30.18.37.22; author jrandom; state dead; branches; next 1.2; 1.2 date 95.12.11.00.27.53; author jrandom; state dead; branches; next 1.1; 1.1 date 93.06.18.05.46.07; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 93.06.18.05.46.08; author jrandom; state Exp; branches; next ; desc @@ 1.3 log @Remove the file 'first' again, which should have no effect. @ text @@ 1.2 log @Remove the file 'first', for the first time. (Note that its sibling 'second' still exists.) @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Updated CVS @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/full-prune-reappear/appears-later,v0000664000076500007650000000070410702477017027500 0ustar mhaggermhagger00000000000000head 1.1; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.1 date 2003.06.10.20.19.48; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.06.10.20.19.48; author jrandom; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @This file is added to a directory that had earlier been pruned, to demonstrate that the directory reappears. 
@ 1.1.1.1 log @Updated CVS @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/single-files/0000775000076500007650000000000012027373500023237 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/single-files/attr-exec,v0000775000076500007650000000065010702477017025332 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks ; strict; comment @# @; 1.1 date 2003.01.25.13.43.57; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.01.25.13.43.57; author jrandom; state Exp; branches ; next ; desc @@ 1.1 log @Initial revision @ text @#!/bin/sh echo Hello world! @ 1.1.1.1 log @initial checkin @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/single-files/can't-avoid-quotes,v0000664000076500007650000000057011710517254027044 0ustar mhaggermhagger00000000000000head 1.2; access; symbols after:1.2; locks maxb:1.2; strict; comment @ * @; 1.2 date 2002.09.29.00.00.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2002.09.29.00.00.00; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @*** empty log message *** @ text @hello modified after checked in @ 1.1 log @*** empty log message *** @ text @d2 2 @ cvs2svn-2.4.0/test-data/main-cvsrepos/single-files/single-double-quote",v0000664000076500007650000000057011710517254027360 0ustar mhaggermhagger00000000000000head 1.2; access; symbols after:1.2; locks maxb:1.2; strict; comment @ * @; 1.2 date 2002.09.29.00.00.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2002.09.29.00.00.00; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @*** empty log message *** @ text @hello modified after checked in @ 1.1 log @*** empty log message *** @ text @d2 2 @ cvs2svn-2.4.0/test-data/main-cvsrepos/single-files/"double-double-quotes",v0000664000076500007650000000057011710517254027576 0ustar mhaggermhagger00000000000000head 1.2; access; symbols after:1.2; locks maxb:1.2; strict; comment @ * @; 1.2 date 2002.09.29.00.00.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2002.09.29.00.00.00; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @*** empty log message *** @ text @hello modified after checked in @ 1.1 log @*** empty log message *** @ text @d2 2 @ cvs2svn-2.4.0/test-data/main-cvsrepos/single-files/quotin'-in-dirname/0000775000076500007650000000000012027373500026646 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/single-files/quotin'-in-dirname/foo,v0000664000076500007650000000057011710517254027624 0ustar mhaggermhagger00000000000000head 1.2; access; symbols after:1.2; locks maxb:1.2; strict; comment @ * @; 1.2 date 2002.09.29.00.00.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2002.09.29.00.00.00; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @*** empty log message *** @ text @hello modified after checked in @ 1.1 log @*** empty log message *** @ text @d2 2 @ cvs2svn-2.4.0/test-data/main-cvsrepos/single-files/space fname,v0000664000076500007650000000065610702477017025603 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks ; strict; comment @# @; 1.1 date 2002.11.30.19.27.42; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2002.11.30.19.27.42; author jrandom; state Exp; branches ; next ; desc @@ 1.1 log @Initial revision @ text @Just a test for spaces in the file name. 
@ 1.1.1.1 log @imported @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/single-files/twoquick,v0000664000076500007650000000057010702477017025302 0ustar mhaggermhagger00000000000000head 1.2; access; symbols after:1.2; locks maxb:1.2; strict; comment @ * @; 1.2 date 2002.09.29.00.00.01; author jrandom; state Exp; branches; next 1.1; 1.1 date 2002.09.29.00.00.00; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @*** empty log message *** @ text @hello modified after checked in @ 1.1 log @*** empty log message *** @ text @d2 2 @ cvs2svn-2.4.0/test-data/main-cvsrepos/proj/0000775000076500007650000000000012027373500021630 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/proj/README0000664000076500007650000000452110702477017022520 0ustar mhaggermhagger00000000000000(This README file is not an RCS ,v file, so cvs2svn won't notice it.) This directory, the `proj' project, is for testing cvs2svn's ability to group revisions correctly along tags and branches. Here is its history: 1. The initial import (revision 1.1 of everybody) created a directory structure with a file named `default' in each dir: ./ default sub1/default subsubA/default subsubB/default sub2/default subsubA/default sub3/default 2. Then tagged everyone with T_ALL_INITIAL_FILES. 3. Then tagged everyone except sub1/subsubB/default with T_ALL_INITIAL_FILES_BUT_ONE. 4. Then created branch B_FROM_INITIALS on everyone. 5. Then created branch B_FROM_INITIALS_BUT_ONE on everyone except /sub1/subsubB/default. 6. Then committed modifications to two files: sub3/default, and sub1/subsubA/default. 7. Then committed a modification to all 7 files. 8. Then backdated sub3/default to revision 1.2, and sub2/subsubA/default to revision 1.1, and tagged with T_MIXED. 9. Same as 8, but tagged with -b to create branch B_MIXED. 10. Switched the working copy to B_MIXED, and added sub2/branch_B_MIXED_only. (That's why the RCS file is in sub2/Attic/ -- it never existed on trunk.) 11. In one commit, modified default, sub1/default, and sub2/subsubA/default, on branch B_MIXED. 12. Did "cvs up -A" on sub2/default, then in one commit, made a change to sub2/default and sub2/branch_B_MIXED_only. So this commit should be spread between the branch and the trunk. 13. Do "cvs up -A" to get everyone back to trunk, then make a new branch B_SPLIT on everyone except sub1/subsubB/default,v. 14. Switch to branch B_SPLIT (see sub1/subsubB/default disappear) and commit a change that affects everyone except sub3/default. 15. An hour or so later, "cvs up -A" to get sub1/subsubB/default back, then commit a change on that file, on trunk. (It's important that this change happened after the previous commits on B_SPLIT.) 16. Branch sub1/subsubB/default to B_SPLIT, then "cvs up -r B_SPLIT" to switch the whole working copy to the branch. 17. Commit a change on B_SPLIT, to sub1/subsubB/default and sub3/default. 
cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub1/0000775000076500007650000000000012027373500022502 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub1/default,v0000664000076500007650000000254110702477017024323 0ustar mhaggermhagger00000000000000head 1.2; access; symbols B_SPLIT:1.2.0.4 B_MIXED:1.2.0.2 T_MIXED:1.2 B_FROM_INITIALS_BUT_ONE:1.1.1.1.0.4 B_FROM_INITIALS:1.1.1.1.0.2 T_ALL_INITIAL_FILES_BUT_ONE:1.1.1.1 T_ALL_INITIAL_FILES:1.1.1.1 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.05.23.00.17.53; author jrandom; state Exp; branches 1.2.2.1 1.2.4.1; next 1.1; 1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches; next ; 1.2.2.1 date 2003.05.23.00.31.36; author jrandom; state Exp; branches; next ; 1.2.4.1 date 2003.06.03.03.20.31; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Second commit to proj, affecting all 7 files. @ text @This is sub1/default. Every directory in the `proj' project has a file named `default'. This line was added in the second commit (affecting all 7 files). @ 1.2.4.1 log @First change on branch B_SPLIT. This change excludes sub3/default, because it was not part of this commit, and sub1/subsubB/default, which is not even on the branch yet. @ text @a5 2 First change on branch B_SPLIT. @ 1.2.2.1 log @Modify three files, on branch B_MIXED. @ text @a5 2 This line was added on branch B_MIXED only (affecting 3 files). @ 1.1 log @Initial revision @ text @d4 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub1/subsubA/0000775000076500007650000000000012027373500024106 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub1/subsubA/default,v0000664000076500007650000000253610702477017025733 0ustar mhaggermhagger00000000000000head 1.3; access; symbols B_SPLIT:1.3.0.4 B_MIXED:1.3.0.2 T_MIXED:1.3 B_FROM_INITIALS_BUT_ONE:1.1.1.1.0.4 B_FROM_INITIALS:1.1.1.1.0.2 T_ALL_INITIAL_FILES_BUT_ONE:1.1.1.1 T_ALL_INITIAL_FILES:1.1.1.1 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2003.05.23.00.17.53; author jrandom; state Exp; branches 1.3.4.1; next 1.2; 1.2 date 2003.05.23.00.15.26; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches; next ; 1.3.4.1 date 2003.06.03.03.20.31; author jrandom; state Exp; branches; next ; desc @@ 1.3 log @Second commit to proj, affecting all 7 files. @ text @This is sub1/subsubA/default. Every directory in the `proj' project has a file named `default'. This line was added by the first commit (affecting two files). This line was added in the second commit (affecting all 7 files). @ 1.3.4.1 log @First change on branch B_SPLIT. This change excludes sub3/default, because it was not part of this commit, and sub1/subsubB/default, which is not even on the branch yet. @ text @a7 2 First change on branch B_SPLIT. @ 1.2 log @First commit to proj, affecting two files. @ text @d6 2 @ 1.1 log @Initial revision @ text @d4 2 @ 1.1.1.1 log @Initial import. 
@ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub1/subsubB/0000775000076500007650000000000012027373500024107 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub1/subsubB/default,v0000664000076500007650000000367110702477017025735 0ustar mhaggermhagger00000000000000head 1.3; access; symbols B_SPLIT:1.3.0.2 B_MIXED:1.2.0.2 T_MIXED:1.2 B_FROM_INITIALS:1.1.1.1.0.2 T_ALL_INITIAL_FILES:1.1.1.1 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2003.06.03.04.29.14; author jrandom; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2003.05.23.00.17.53; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches; next ; 1.3.2.1 date 2003.06.03.04.33.13; author jrandom; state Exp; branches; next ; desc @@ 1.3 log @A trunk change to sub1/subsubB/default. This was committed about an hour after an earlier change that affected most files on branch B_SPLIT. This file is not on that branch yet, but after this commit, we'll branch to B_SPLIT, albeit rooted in a revision that didn't exist at the time the rest of B_SPLIT was created. @ text @This is sub1/subsubB/default. Every directory in the `proj' project has a file named `default'. This line was added in the second commit (affecting all 7 files). This bit was committed on trunk about an hour after an earlier change to everyone else on branch B_SPLIT. Afterwards, we'll finally branch this file to B_SPLIT, but rooted in a revision that didn't exist at the time the rest of B_SPLIT was created. @ 1.3.2.1 log @This change affects sub3/default and sub1/subsubB/default, on branch B_SPLIT. Note that the latter file did not even exist on this branch until after some other files had had revisions committed on B_SPLIT. @ text @a10 4 This change affects sub3/default and sub1/subsubB/default, on branch B_SPLIT. Note that the latter file did not even exist on this branch until after some other files had had revisions committed on B_SPLIT. @ 1.2 log @Second commit to proj, affecting all 7 files. @ text @d6 5 @ 1.1 log @Initial revision @ text @d4 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/proj/default,v0000664000076500007650000000265311317026316023450 0ustar mhaggermhagger00000000000000head 1.2; access; symbols B_SPLIT:1.2.0.4 B_MIXED:1.2.0.2 T_MIXED:1.2 B_FROM_INITIALS_BUT_ONE:1.1.1.1.0.4 B_FROM_INITIALS:1.1.1.1.0.2 T_ALL_INITIAL_FILES_BUT_ONE:1.1.1.1 T_ALL_INITIAL_FILES:1.1.1.1 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.05.23.00.17.53; author jrandom; state Exp; branches 1.2.2.1 1.2.4.1; next 1.1; 1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches; next ; 1.2.2.1 date 2003.05.23.00.31.36; author jrandom; state Exp; branches; next ; 1.2.4.1 date 2003.06.03.03.20.31; author jrandom; state Exp; branches; next ; desc @This is an example file description.@ 1.2 log @Second commit to proj, affecting all 7 files. @ text @This is the file `default' in the top level of the project. Every directory in the `proj' project has a file named `default'. This line was added in the second commit (affecting all 7 files). @ 1.2.4.1 log @First change on branch B_SPLIT. This change excludes sub3/default, because it was not part of this commit, and sub1/subsubB/default, which is not even on the branch yet. 
@ text @a5 2 First change on branch B_SPLIT. @ 1.2.2.1 log @Modify three files, on branch B_MIXED. @ text @a5 2 This line was added on branch B_MIXED only (affecting 3 files). @ 1.1 log @Initial revision @ text @d4 2 @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub2/0000775000076500007650000000000012027373500022503 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub2/Attic/0000775000076500007650000000000012027373500023547 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub2/Attic/branch_B_MIXED_only,v0000664000076500007650000000136510702477017027434 0ustar mhaggermhagger00000000000000head 1.1; access; symbols B_MIXED:1.1.0.2; locks; strict; comment @# @; 1.1 date 2003.05.23.00.25.26; author jrandom; state dead; branches 1.1.2.1; next ; 1.1.2.1 date 2003.05.23.00.25.26; author jrandom; state Exp; branches; next 1.1.2.2; 1.1.2.2 date 2003.05.23.00.48.51; author jrandom; state Exp; branches; next ; desc @@ 1.1 log @file branch_B_MIXED_only was initially added on branch B_MIXED. @ text @@ 1.1.2.1 log @Add a file on branch B_MIXED. @ text @a0 1 This file was added on branch B_MIXED. It never existed on trunk. @ 1.1.2.2 log @A single commit affecting one file on branch B_MIXED and one on trunk. @ text @a1 3 The same commit added these two lines here on branch B_MIXED, and two similar lines to ./default on trunk. @ cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub2/default,v0000664000076500007650000000265210702477017024327 0ustar mhaggermhagger00000000000000head 1.3; access; symbols B_SPLIT:1.3.0.2 B_MIXED:1.2.0.2 T_MIXED:1.2 B_FROM_INITIALS_BUT_ONE:1.1.1.1.0.4 B_FROM_INITIALS:1.1.1.1.0.2 T_ALL_INITIAL_FILES_BUT_ONE:1.1.1.1 T_ALL_INITIAL_FILES:1.1.1.1 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2003.05.23.00.48.51; author jrandom; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2003.05.23.00.17.53; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches; next ; 1.3.2.1 date 2003.06.03.03.20.31; author jrandom; state Exp; branches; next ; desc @@ 1.3 log @A single commit affecting one file on branch B_MIXED and one on trunk. @ text @This is sub2/default. Every directory in the `proj' project has a file named `default'. This line was added in the second commit (affecting all 7 files). The same commit added these two lines here on trunk, and two similar lines to ./branch_B_MIXED_only on branch B_MIXED. @ 1.3.2.1 log @First change on branch B_SPLIT. This change excludes sub3/default, because it was not part of this commit, and sub1/subsubB/default, which is not even on the branch yet. @ text @a8 2 First change on branch B_SPLIT. @ 1.2 log @Second commit to proj, affecting all 7 files. @ text @d6 3 @ 1.1 log @Initial revision @ text @d4 2 @ 1.1.1.1 log @Initial import. 
@ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub2/subsubA/0000775000076500007650000000000012027373500024107 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub2/subsubA/default,v0000664000076500007650000000255110702477017025731 0ustar mhaggermhagger00000000000000head 1.2; access; symbols B_SPLIT:1.2.0.2 B_MIXED:1.1.0.2 T_MIXED:1.1 B_FROM_INITIALS_BUT_ONE:1.1.1.1.0.4 B_FROM_INITIALS:1.1.1.1.0.2 T_ALL_INITIAL_FILES_BUT_ONE:1.1.1.1 T_ALL_INITIAL_FILES:1.1.1.1 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2003.05.23.00.17.53; author jrandom; state Exp; branches 1.2.2.1; next 1.1; 1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches 1.1.1.1 1.1.2.1; next ; 1.1.1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches; next ; 1.1.2.1 date 2003.05.23.00.31.36; author jrandom; state Exp; branches; next ; 1.2.2.1 date 2003.06.03.03.20.31; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Second commit to proj, affecting all 7 files. @ text @This is sub2/subsub2/default. Every directory in the `proj' project has a file named `default'. This line was added in the second commit (affecting all 7 files). @ 1.2.2.1 log @First change on branch B_SPLIT. This change excludes sub3/default, because it was not part of this commit, and sub1/subsubB/default, which is not even on the branch yet. @ text @a5 2 First change on branch B_SPLIT. @ 1.1 log @Initial revision @ text @d4 2 @ 1.1.2.1 log @Modify three files, on branch B_MIXED. @ text @a3 2 This line was added on branch B_MIXED only (affecting 3 files). @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub3/0000775000076500007650000000000012027373500022504 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/main-cvsrepos/proj/sub3/default,v0000664000076500007650000000305310702477017024324 0ustar mhaggermhagger00000000000000head 1.3; access; symbols B_SPLIT:1.3.0.2 B_MIXED:1.2.0.2 T_MIXED:1.2 B_FROM_INITIALS_BUT_ONE:1.1.1.1.0.4 B_FROM_INITIALS:1.1.1.1.0.2 T_ALL_INITIAL_FILES_BUT_ONE:1.1.1.1 T_ALL_INITIAL_FILES:1.1.1.1 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2003.05.23.00.17.53; author jrandom; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2003.05.23.00.15.26; author jrandom; state Exp; branches; next 1.1; 1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches; next ; 1.3.2.1 date 2003.06.03.04.33.13; author jrandom; state Exp; branches; next ; desc @@ 1.3 log @Second commit to proj, affecting all 7 files. @ text @This is sub3/default. Every directory in the `proj' project has a file named `default'. This line was added by the first commit (affecting two files). This line was added in the second commit (affecting all 7 files). @ 1.3.2.1 log @This change affects sub3/default and sub1/subsubB/default, on branch B_SPLIT. Note that the latter file did not even exist on this branch until after some other files had had revisions committed on B_SPLIT. @ text @a7 4 This change affects sub3/default and sub1/subsubB/default, on branch B_SPLIT. Note that the latter file did not even exist on this branch until after some other files had had revisions committed on B_SPLIT. @ 1.2 log @First commit to proj, affecting two files. @ text @d6 2 @ 1.1 log @Initial revision @ text @d4 2 @ 1.1.1.1 log @Initial import. 
@ text @@ cvs2svn-2.4.0/test-data/main-cvsrepos/cvs2svn-crossproject.options0000664000076500007650000000172010760115614026417 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # As a partial check that the example options file is functional, we # use it as the basis for this test. We only need to overwrite the # output option to get the output repository in the location expected # by the test infrastructure. import os execfile('cvs2svn-example.options') ctx.output_option = NewRepositoryOutputOption( 'cvs2svn-tmp/main--options=cvs2svn-crossproject.options-svnrepos', ) ctx.cross_project_commits = True ctx.cross_branch_commits = False run_options.clear_projects() for project in [ 'full-prune', 'full-prune-reappear', 'interleaved', 'partial-prune', 'proj', 'single-files', ]: run_options.add_project( os.path.join(r'test-data/main-cvsrepos', project), trunk_path='%s/trunk' % (project,), branches_path='%s/branches' % (project,), tags_path='%s/tags' % (project,), symbol_strategy_rules=global_symbol_strategy_rules, ) cvs2svn-2.4.0/test-data/main-cvsrepos/cvs2hg.options0000664000076500007650000000035411121220731023460 0ustar mhaggermhagger00000000000000# (Be in -*- mode: python; coding: utf-8 -*- mode.) # An options file to test converting to Mercurial. # Actually, the example file cvs2hg-example.options is fully # functional, so we just load it: execfile('cvs2hg-example.options') cvs2svn-2.4.0/test-data/split-branch-cvsrepos/0000775000076500007650000000000012027373500022320 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/split-branch-cvsrepos/module/0000775000076500007650000000000012027373500023605 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/split-branch-cvsrepos/module/branched-from-branch,v0000664000076500007650000000076410702477020027743 0ustar mhaggermhagger00000000000000head 1.1; access; symbols demo-node-0:1.1.1.1.0.2 first_working:1.1.1; locks; strict; comment @;; @; 1.1 date 98.02.08.01.48.39; author russel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 98.02.08.01.48.39; author russel; state Exp; branches 1.1.1.1.2.1; next ; 1.1.1.1.2.1 date 98.02.25.18.56.53; author adam; state dead; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Create a branch @ text @@ 1.1.1.1.2.1 log @This revision will go wrong @ text @@ cvs2svn-2.4.0/test-data/split-branch-cvsrepos/module/branched-from-trunk,v0000664000076500007650000000052710702477020027646 0ustar mhaggermhagger00000000000000head 1.1; access; symbols demo-node-0:1.1.0.2; locks; strict; comment @;; @; 1.1 date 98.02.19.21.28.43; author russel; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 98.02.25.18.56.53; author adam; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ 1.1.2.1 log @This revision will go wrong @ text @@ cvs2svn-2.4.0/test-data/split-branch-cvsrepos/README0000664000076500007650000000076110702477020023204 0ustar mhaggermhagger00000000000000This tree is for reproducing http://subversion.tigris.org/issues/show_bug.cgi?id=1421 It is based on issue-1421-small-cvsrepos.tgz from Blair Zajac. Previously, this directory contained a different set of test data, which began to pass after a partial fix in r6534. However, similar errors still occurred with other data sets. The issue was reopened, and the test data changed to more reliably test the bug. It continued to fail until the final fix in r7006. See the issue for more details. 
cvs2svn-2.4.0/test-data/double-add-cvsrepos/0000775000076500007650000000000012027373500021732 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/double-add-cvsrepos/Attic/0000775000076500007650000000000012027373500022776 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/double-add-cvsrepos/Attic/file2.txt,v0000775000076500007650000000127010702477016025012 0ustar mhaggermhagger00000000000000head 1.2; access; symbols boom-branch:1.2.0.2; locks; strict; comment @# @; 1.2 date 2005.03.04.19.57.41; author harry; state Exp; branches 1.2.2.1; next 1.1; 1.1 date 2005.03.04.18.57.55; author sally; state dead; branches; next ; 1.2.2.1 date 2005.03.04.19.57.41; author harry; state dead; branches; next 1.2.2.2; 1.2.2.2 date 2005.03.04.21.02.34; author harry; state Exp; branches; next ; desc @@ 1.2 log @merged some-branch to trunk @ text @@ 1.2.2.1 log @file file2.txt was added on branch boom-branch on 2005-03-04 21:02:34 +0000 @ text @@ 1.2.2.2 log @merged trunk to another-branch @ text @@ 1.1 log @file file2.txt was initially added on branch some-branch. @ text @@ cvs2svn-2.4.0/test-data/double-add-cvsrepos/file.txt,v0000775000076500007650000000143610702477016023670 0ustar mhaggermhagger00000000000000head 1.2; access; symbols my-branch:1.2.0.2 boom-branch:1.2.0.4; locks; strict; comment @# @; 1.2 date 2005.02.10.19.45.43; author harry; state Exp; branches 1.2.2.1 1.2.4.1; next 1.1; 1.1 date 2005.01.27.22.11.00; author sally; state dead; branches; next ; 1.2.2.1 date 2005.03.04.21.06.12; author harry; state Exp; branches; next 1.2.2.2; 1.2.2.2 date 2005.04.01.23.09.43; author harry; state Exp; branches; next ; 1.2.4.1 date 2005.03.04.21.02.34; author harry; state Exp; branches; next ; desc @@ 1.2 log @merged some-branch to trunk @ text @@ 1.2.2.1 log @merged trunk to my-branch @ text @@ 1.2.2.2 log @merged trunk to my-branch @ text @@ 1.2.4.1 log @merged trunk to another-branch @ text @@ 1.1 log @file file.txt was initially added on branch BRANCH-sally. @ text @@ cvs2svn-2.4.0/test-data/double-add-cvsrepos/seemingly-irrelevant-file.txt,v0000775000076500007650000000060210702477016030025 0ustar mhaggermhagger00000000000000head 1.2; access; symbols boom-branch:1.2.0.2; locks; strict; comment @# @; 1.2 date 2005.05.04.15.08.53; author harry; state Exp; branches; next 1.1; 1.1 date 2005.05.02.19.09.31; author sally; state dead; branches; next ; desc @@ 1.2 log @merged some-branch to trunk @ text @@ 1.1 log @file seemingly-irrelevant-file.txt was initially added on branch BRANCH-sally. 
@ text @@ cvs2svn-2.4.0/test-data/timestamp-chaos-cvsrepos/0000775000076500007650000000000012027373500023030 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/timestamp-chaos-cvsrepos/proj/0000775000076500007650000000000012027373500024002 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/timestamp-chaos-cvsrepos/proj/file2.txt,v0000664000076500007650000000071410702477015026014 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2007.01.01.22.00.00; author mhagger; state Exp; branches; next 1.2; 1.2 date 2030.01.01.00.00.00; author mhagger; state Exp; branches; next 1.1; 1.1 date 2007.01.01.21.00.00; author mhagger; state Exp; branches; next ; desc @@ 1.3 log @Revision 1.3 @ text @Revision 1.3 @ 1.2 log @Revision 1.2 @ text @d1 1 a1 1 Revision 1.2 @ 1.1 log @Revision 1.1 @ text @d1 1 a1 1 Revision 1.1 @ cvs2svn-2.4.0/test-data/timestamp-chaos-cvsrepos/proj/file1.txt,v0000664000076500007650000000071410702477015026013 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2007.01.01.22.00.00; author mhagger; state Exp; branches; next 1.2; 1.2 date 2000.01.01.00.00.00; author mhagger; state Exp; branches; next 1.1; 1.1 date 2007.01.01.21.00.00; author mhagger; state Exp; branches; next ; desc @@ 1.3 log @Revision 1.3 @ text @Revision 1.3 @ 1.2 log @Revision 1.2 @ text @d1 1 a1 1 Revision 1.2 @ 1.1 log @Revision 1.1 @ text @d1 1 a1 1 Revision 1.1 @ cvs2svn-2.4.0/test-data/vendor-branch-sameness-cvsrepos/0000775000076500007650000000000012027373500024276 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/vendor-branch-sameness-cvsrepos/proj/0000775000076500007650000000000012027373500025250 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/vendor-branch-sameness-cvsrepos/proj/d.txt,v0000664000076500007650000000036610702477013026505 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2004.02.12.22.01.44; author kfogel; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @Added d.txt via 'cvs add', but with same timestamp as the imports. @ cvs2svn-2.4.0/test-data/vendor-branch-sameness-cvsrepos/proj/a.txt,v0000664000076500007650000000065410702477013026502 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols vtag-1:1.1.1.1 vbranchA:1.1.1; locks ; strict; comment @# @; 1.1 date 2004.02.12.22.01.44; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.12.22.01.44; author kfogel; state Exp; branches ; next ; desc @@ 1.1 log @Initial revision @ text @Import vtag-1 on vbranchA. @ 1.1.1.1 log @First vendor branch revision. @ text @@ cvs2svn-2.4.0/test-data/vendor-branch-sameness-cvsrepos/proj/e.txt,v0000664000076500007650000000075710702477013026512 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols vtag-2:1.1.1.1 vbranchB:1.1.1; locks ; strict; comment @# @; 1.1 date 2005.02.12.22.01.44; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2005.02.12.22.01.44; author kfogel; state Exp; branches ; next ; desc @@ 1.1 log @This log message is not the standard 'Initial revision\n' that indicates an import. @ text @Import vtag-2 on vbranchB. @ 1.1.1.1 log @First vendor branch revision. 
@ text @@ cvs2svn-2.4.0/test-data/vendor-branch-sameness-cvsrepos/proj/b.txt,v0000664000076500007650000000073510702477013026503 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols vtag-1:1.1.1.1 vbranchA:1.1.1; locks ; strict; comment @# @; 1.1 date 2004.02.12.22.01.44; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.12.22.01.44; author kfogel; state Exp; branches ; next ; desc @@ 1.1 log @Initial revision @ text @Import vtag-1 on vbranchA. @ 1.1.1.1 log @First vendor branch revision. @ text @a1 1 The text on the vendor branch is different. @ cvs2svn-2.4.0/test-data/vendor-branch-sameness-cvsrepos/proj/c.txt,v0000664000076500007650000000065510702477013026505 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols vtag-1:1.1.1.1 vbranchA:1.1.1; locks ; strict; comment @# @; 1.1 date 2004.02.12.22.01.44; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.12.22.01.44; author kfogel; state dead; branches ; next ; desc @@ 1.1 log @Initial revision @ text @Import vtag-1 on vbranchA. @ 1.1.1.1 log @First vendor branch revision. @ text @@ cvs2svn-2.4.0/test-data/branch-from-default-branch-cvsrepos/0000775000076500007650000000000012027373500025005 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-from-default-branch-cvsrepos/proj/0000775000076500007650000000000012027373500025757 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-from-default-branch-cvsrepos/proj/file.txt,v0000664000076500007650000000171210702477021027703 0ustar mhaggermhagger00000000000000head 1.2; access; symbols branch-off-of-default-branch:1.1.1.2.0.2 upstream:1.1.1; locks; strict; comment @# @; 1.2 date 2003.03.25.12.51.44; author fitz; state Exp; branches; next 1.1; 1.1 date 2000.10.20.07.15.19; author fitz; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2000.10.20.07.15.19; author fitz; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2002.01.10.11.03.58; author fitz; state Exp; branches 1.1.1.2.2.1; next ; 1.1.1.2.2.1 date 2003.03.18.01.36.42; author fitz; state Exp; branches; next ; desc @@ 1.2 log @This is a log message. @ text @This is revision 1.2 of file.txt @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the initial import of file.txt @ 1.1.1.1 log @Initial Import. @ text @@ 1.1.1.2 log @This is a log message. @ text @d1 1 a1 1 This is the first commit on the default branch. @ 1.1.1.2.2.1 log @This is a log message. @ text @d1 1 a1 1 This is a commit on a branch off of the default branch. 
@ cvs2svn-2.4.0/test-data/move-parent-cvsrepos/0000775000076500007650000000000012027373500022167 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/move-parent-cvsrepos/file1,v0000664000076500007650000000062711710517254023365 0ustar mhaggermhagger00000000000000head 1.1; access; symbols b3:1.1.2.1.0.4 b1:1.1.2.1.0.2 b2:1.1.0.2; locks; strict; comment @# @; 1.1 date 2011.01.03.14.26.48; author pilegand; state Exp; branches 1.1.2.1; next ; commitid 657b4d21dca84567; 1.1.2.1 date 2011.01.03.14.26.51; author pilegand; state Exp; branches; next ; commitid 657f4d21dcab4567; desc @@ 1.1 log @first @ text @file1 @ 1.1.2.1 log @second @ text @a1 1 On b2 @ cvs2svn-2.4.0/test-data/move-parent-cvsrepos/file2,v0000664000076500007650000000035711710517254023366 0ustar mhaggermhagger00000000000000head 1.1; access; symbols b3:1.1.0.6 b1:1.1.0.4 b2:1.1.0.2; locks; strict; comment @# @; 1.1 date 2011.01.03.14.26.48; author pilegand; state Exp; branches; next ; commitid 657b4d21dca84567; desc @@ 1.1 log @first @ text @file2 @ cvs2svn-2.4.0/test-data/log-message-eols-cvsrepos/0000775000076500007650000000000012027373500023075 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/log-message-eols-cvsrepos/lottalogs,v0000664000076500007650000000067011710517254025301 0ustar mhaggermhagger00000000000000head 1.2; access ; symbols ; locks ; strict; comment @# @; 1.2 date 2003.08.28.15.00.00; author jrandom; state Exp; branches ; next 1.1; 1.1 date 2003.07.28.15.00.00; author jrandom; state Exp; branches ; next ; desc @@ 1.2 log @The CR at the end of this line should be turned into a LF. @ text @Nothing to see here. @ 1.1 log @The CRLF at the end of this line should be turned into a LF. @ text @@ cvs2svn-2.4.0/test-data/symlinks-cvsrepos/0000775000076500007650000000000012027373500021603 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/symlinks-cvsrepos/proj/0000775000076500007650000000000012027373500022555 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/symlinks-cvsrepos/proj/file.txt,v0000664000076500007650000000024310702477016024503 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2007.04.08.08.10.10; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @@ text @@ cvs2svn-2.4.0/test-data/symlinks-cvsrepos/proj/dir1/0000775000076500007650000000000012027373500023414 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/symlinks-cvsrepos/proj/dir1/README.txt0000664000076500007650000000030511024312621025101 0ustar mhaggermhagger00000000000000This directory is used by the symlinks() test. The test creates a symlink in this directory before starting cvs2svn. This README file itself is ignored by cvs2svn because it doesn't end in ",v". cvs2svn-2.4.0/test-data/symlinks-cvsrepos/proj/README.txt0000664000076500007650000000030511024313135024243 0ustar mhaggermhagger00000000000000This directory is used by the symlinks() test. The test creates a symlink in this directory before starting cvs2svn. This README file itself is ignored by cvs2svn because it doesn't end in ",v". 
cvs2svn-2.4.0/test-data/tagged-branch-n-trunk-cvsrepos/0000775000076500007650000000000012027373500024014 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/tagged-branch-n-trunk-cvsrepos/a.txt,v0000664000076500007650000000700010702477015025240 0ustar mhaggermhagger00000000000000head 1.27; access ; symbols some-branch:1.24.0.22 some-tag:1.24.22.2 ; locks ; strict; comment @# @; 1.27 date 2002.11.05.12.31.01; author Mats; state Dev; branches ; next 1.26; 1.26 date 2002.10.11.14.28.44; author Mats; state Dev; branches ; next 1.25; 1.25 date 2002.06.25.13.54.10; author Mats; state Dev; branches ; next 1.24; 1.24 date 2000.12.15.13.22.31; author Mats; state Dev; branches 1.24.22.1; next 1.23; 1.23 date 2000.12.15.13.14.45; author Mats; state Dev; branches ; next 1.22; 1.22 date 2000.11.07.10.36.13; author Mats; state Dev; branches ; next 1.21; 1.21 date 2000.10.26.13.11.15; author Mats; state Dev; branches ; next 1.20; 1.20 date 2000.09.26.07.07.19; author mats; state Dev; branches ; next 1.19; 1.19 date 2000.08.09.10.36.51; author mats; state Dev; branches ; next 1.18; 1.18 date 2000.07.06.13.25.36; author mats; state Dev; branches ; next 1.17; 1.17 date 2000.07.06.12.22.47; author mats; state Dev; branches ; next 1.16; 1.16 date 2000.05.18.16.46.37; author mats; state Dev; branches ; next 1.15; 1.15 date 2000.04.25.11.48.55; author mats; state Dev; branches ; next 1.14; 1.14 date 2000.04.25.07.11.38; author mats; state Dev; branches ; next 1.13; 1.13 date 2000.04.05.13.19.20; author mats; state Dev; branches ; next 1.12; 1.12 date 2000.03.31.14.47.59; author mats; state Dev; branches ; next 1.11; 1.11 date 2000.03.23.17.10.45; author mats; state Dev; branches ; next 1.10; 1.10 date 2000.03.22.14.20.50; author mats; state Dev; branches ; next 1.9; 1.9 date 2000.03.15.14.15.52; author mats; state Dev; branches ; next 1.8; 1.8 date 2000.03.08.19.46.55; author mats; state Dev; branches ; next 1.7; 1.7 date 2000.03.08.18.53.46; author mats; state Dev; branches ; next 1.6; 1.6 date 2000.03.08.14.45.46; author mats; state Dev; branches ; next 1.5; 1.5 date 2000.03.06.21.31.28; author mats; state Dev; branches ; next 1.4; 1.4 date 2000.03.06.19.27.30; author mats; state Dev; branches ; next 1.3; 1.3 date 2000.03.06.17.23.03; author mats; state Dev; branches ; next 1.2; 1.2 date 2000.03.03.16.23.52; author mats; state Dev; branches ; next 1.1; 1.1 date 2000.03.02.08.39.26; author mats; state Dev; branches ; next ; 1.24.22.1 date 2000.12.15.13.22.31; author Mats; state Dev; branches ; next 1.24.22.2; 1.24.22.2 date 2002.03.22.15.29.53; author Mats; state Dev; branches ; next ; desc @@ 1.27 log @Log socket level errors@ text @ 1.27 @ 1.26 log @log@ text @d1 1 a1 1 1.26 @ 1.25 log @log@ text @d1 1 a1 1 1.25 @ 1.24 log @log@ text @d1 1 a1 1 1.24 @ 1.24.22.1 log @Duplicate revision @ text @@ 1.24.22.2 log @ff@ text @d1 1 a1 1 1.24 @ 1.23 log @log@ text @d1 1 a1 1 1.23 @ 1.22 log @log@ text @d1 1 a1 1 1.22 @ 1.21 log @log@ text @d1 1 a1 1 1.21 @ 1.20 log @log@ text @d1 1 a1 1 1.20 @ 1.19 log @log@ text @d1 1 a1 1 1.19 @ 1.18 log @log@ text @d1 1 a1 1 1.18 @ 1.17 log @log@ text @d1 1 a1 1 1.17 @ 1.16 log @log@ text @d1 1 a1 1 1.16 @ 1.15 log @log@ text @d1 1 a1 1 1.15 @ 1.14 log @log@ text @d1 1 a1 1 1.14 @ 1.13 log @log@ text @d1 1 a1 1 1.13 @ 1.12 log @log@ text @d1 1 a1 1 1.12 @ 1.11 log @log@ text @d1 1 a1 1 1.11 @ 1.10 log @log@ text @d1 1 a1 1 1.10 @ 1.9 log @log@ text @d1 1 a1 1 1.9 @ 1.8 log @log@ text @d1 1 a1 1 1.8 @ 1.7 log @log@ text @d1 1 a1 1 1.7 @ 1.6 log @log@ text @d1 1 a1 1 1.6 @ 
1.5 log @log@ text @d1 1 a1 1 1.5 @ 1.4 log @log@ text @d1 1 a1 1 1.4 @ 1.3 log @log@ text @d1 1 a1 1 1.3 @ 1.2 log @log@ text @d1 1 a1 1 1.2 @ 1.1 log @Initial Revision@ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/tagged-branch-n-trunk-cvsrepos/b.txt,v0000664000076500007650000000144110702477015025244 0ustar mhaggermhagger00000000000000head 1.6; access ; symbols some-branch:1.5.0.22 some-tag:1.5 ; locks ; strict; comment @# @; 1.6 date 2002.10.11.14.28.44; author Mats; state Dev; branches ; next 1.5; 1.5 date 2000.03.31.14.48.32; author mats; state Dev; branches ; next 1.4; 1.4 date 2000.03.08.20.59.57; author mats; state Dev; branches ; next 1.3; 1.3 date 2000.03.06.17.27.00; author mats; state Dev; branches ; next 1.2; 1.2 date 2000.03.02.13.11.20; author mats; state Dev; branches ; next 1.1; 1.1 date 2000.03.02.08.39.24; author mats; state Dev; branches ; next ; desc @@ 1.6 log @log@ text @ 1.6 @ 1.5 log @log@ text @d1 1 a1 1 1.5 @ 1.4 log @log@ text @d1 1 a1 1 1.4 @ 1.3 log @ @ text @d1 1 a1 1 1.3 @ 1.2 log @log@ text @d1 1 a1 1 1.2 @ 1.1 log @Initial Revision@ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/file-directory-conflict-cvsrepos/0000775000076500007650000000000012027373500024452 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/file-directory-conflict-cvsrepos/proj/0000775000076500007650000000000012027373500025424 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/file-directory-conflict-cvsrepos/proj/name/0000775000076500007650000000000012027373500026344 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/file-directory-conflict-cvsrepos/proj/name/name2,v0000664000076500007650000000027010702477015027536 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2007.03.26.12.00.00; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding file "name2". @ text @@ cvs2svn-2.4.0/test-data/file-directory-conflict-cvsrepos/proj/name,v0000664000076500007650000000026710702477015026542 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2007.03.26.13.00.00; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding file "name". 
@ text @@ cvs2svn-2.4.0/test-data/commit-dependencies-cvsrepos/0000775000076500007650000000000012027373500023646 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/commit-dependencies-cvsrepos/multi-branch/0000775000076500007650000000000012027373500026233 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/commit-dependencies-cvsrepos/multi-branch/file1,v0000664000076500007650000000076010702477021027424 0ustar mhaggermhagger00000000000000head 1.2; access; symbols branch:1.1.0.2; locks; strict; comment @# @; 1.2 date 2005.12.27.10.45.27; author ossi; state tst; branches; next 1.1; 1.1 date 2005.12.27.10.44.37; author ossi; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2005.12.27.10.45.53; author ossi; state Exp; branches; next ; desc @@ 1.2 log @multi-branch-commit @ text @file1 rev2 head @ 1.1 log @adding @ text @d1 1 a1 1 file1 rev1 @ 1.1.2.1 log @multi-branch-commit @ text @d1 1 a1 1 file1 rev2 branch @ cvs2svn-2.4.0/test-data/commit-dependencies-cvsrepos/multi-branch/file2,v0000664000076500007650000000076010702477021027425 0ustar mhaggermhagger00000000000000head 1.2; access; symbols branch:1.1.0.2; locks; strict; comment @# @; 1.2 date 2005.12.27.10.46.17; author ossi; state tst; branches; next 1.1; 1.1 date 2005.12.27.10.44.37; author ossi; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2005.12.27.10.46.41; author ossi; state Exp; branches; next ; desc @@ 1.2 log @multi-branch-commit @ text @file2 rev2 head @ 1.1 log @adding @ text @d1 1 a1 1 file2 rev1 @ 1.1.2.1 log @multi-branch-commit @ text @d1 1 a1 1 file2 rev2 branch @ cvs2svn-2.4.0/test-data/commit-dependencies-cvsrepos/interleaved/0000775000076500007650000000000012027373500026150 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/commit-dependencies-cvsrepos/interleaved/file1,v0000664000076500007650000000067710702477021027350 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2005.11.27.10.43.34; author ossi; state tst; branches; next 1.2; 1.2 date 2005.11.27.10.43.12; author ossi; state Exp; branches; next 1.1; 1.1 date 2005.11.27.10.42.18; author ossi; state Exp; branches; next ; desc @@ 1.3 log @dependant small commit @ text @file1 rev3 @ 1.2 log @big commit @ text @d1 1 a1 1 file1 rev2 @ 1.1 log @adding @ text @d1 1 a1 1 file1 rev1 @ cvs2svn-2.4.0/test-data/commit-dependencies-cvsrepos/interleaved/file2,v0000664000076500007650000000046310702477021027342 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2005.11.27.10.43.45; author ossi; state tst; branches; next 1.1; 1.1 date 2005.11.27.10.42.18; author ossi; state Exp; branches; next ; desc @@ 1.2 log @big commit @ text @file2 rev2 @ 1.1 log @adding @ text @d1 1 a1 1 file2 rev1 @ cvs2svn-2.4.0/test-data/CVSROOT/0000775000076500007650000000000012027373500017227 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/CVSROOT/README0000664000076500007650000000113011317026313020100 0ustar mhaggermhagger00000000000000This CVSROOT/ directory is only here to convince CVS to treat the neighboring directories as CVS repository modules. Without it, CVS operations fail with an error like: cvs [checkout aborted]: .../main-cvsrepos/CVSROOT: No such file or directory Of course, CVS doesn't seem to require that there actually be any files in CVSROOT/, which kind of makes one wonder why it cares about the directory at all. 
Although this directly is only strictly needed when the --use-cvs option is used, cvs2svn checks that every project has an associated CVSROOT directory to avoid complicating its bookkeeping. cvs2svn-2.4.0/test-data/exclude-ntdb-cvsrepos/0000775000076500007650000000000012027373500022310 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/exclude-ntdb-cvsrepos/proj/0000775000076500007650000000000012027373500023262 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/exclude-ntdb-cvsrepos/proj/file.txt,v0000664000076500007650000000317510771742370025223 0ustar mhaggermhagger00000000000000head 1.2; access; symbols branch3:1.1.1.3.0.2 tag3:1.1.1.3 vendortag3:1.1.1.3 branch2:1.1.1.2.0.2 tag2:1.1.1.2 vendortag2:1.1.1.2 branch1:1.1.1.1.0.2 tag1:1.1.1.1 vendortag1:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.2 date 2008.03.23.21.09.25; author mhagger; state Exp; branches; next 1.1; commitid eP23GQr5EN3CgiWs; 1.1 date 2008.03.23.21.09.12; author mhagger; state Exp; branches 1.1.1.1; next ; commitid DP3br0gPAbExgiWs; 1.1.1.1 date 2008.03.23.21.09.12; author mhagger; state Exp; branches 1.1.1.1.2.1; next 1.1.1.2; commitid DP3br0gPAbExgiWs; 1.1.1.2 date 2008.03.23.21.09.18; author mhagger; state Exp; branches 1.1.1.2.2.1; next 1.1.1.3; commitid jW6XTZM1bdCzgiWs; 1.1.1.3 date 2008.03.23.21.09.28; author mhagger; state Exp; branches 1.1.1.3.2.1; next ; commitid 9MGIzCxv7EgDgiWs; 1.1.1.1.2.1 date 2008.03.23.21.09.15; author mhagger; state Exp; branches; next ; commitid 8JhDtGnHHSyygiWs; 1.1.1.2.2.1 date 2008.03.23.21.09.21; author mhagger; state Exp; branches; next ; commitid t6VniT76SxUAgiWs; 1.1.1.3.2.1 date 2008.03.23.21.09.31; author mhagger; state Exp; branches; next ; commitid gM9gAG7c2jgEgiWs; desc @@ 1.2 log @First explicit commit on trunk @ text @1.2 @ 1.1 log @Initial revision @ text @d1 1 a1 1 1.1.1.1 @ 1.1.1.1 log @First import @ text @@ 1.1.1.2 log @Second import @ text @d1 1 a1 1 1.1.1.2 @ 1.1.1.3 log @Third import @ text @d1 1 a1 1 1.1.1.3 @ 1.1.1.3.2.1 log @Commit on branch branch3 @ text @d1 1 a1 1 1.1.1.3.2.1 @ 1.1.1.2.2.1 log @Commit on branch branch2 @ text @d1 1 a1 1 1.1.1.2.2.1 @ 1.1.1.1.2.1 log @Commit on branch branch1 @ text @d1 1 a1 1 1.1.1.1.2.1 @ cvs2svn-2.4.0/test-data/exclude-ntdb-cvsrepos/makerepo.sh0000775000076500007650000000355710771742370024475 0ustar mhaggermhagger00000000000000#! /bin/sh # Run script from the main cvs2svn directory to create the # exclude-ntdb cvs repository. CVSROOT=`pwd`/test-data/exclude-ntdb-cvsrepos TMP=cvs2svn-tmp rm -rf $TMP mkdir $TMP cd $TMP #cvs -d $CVSROOT init rm -rf $CVSROOT/proj mkdir proj echo 'Import proj/file.txt:' cd proj echo '1.1.1.1' >file.txt cvs -d $CVSROOT import -m "First import" proj vendorbranch vendortag1 sleep 2 cd .. echo 'Check out the repository:' cvs -d $CVSROOT co -d wc . echo 'Add a tag and a branch to trunk (these appear on revision 1.1.1.1)' echo 'and commit a revision on the branch:' cd wc/proj cvs tag tag1 cvs tag -b branch1 cvs up -r branch1 echo 1.1.1.1.2.1 >file.txt cvs ci -m 'Commit on branch branch1' sleep 2 cd ../.. echo 'Import proj/file.txt a second time:' cd proj echo '1.1.1.2' >file.txt cvs -d $CVSROOT import -m "Second import" proj vendorbranch vendortag2 sleep 2 cd .. echo 'Add a second tag and branch to trunk (these appear on revision' echo '1.1.1.2) and commit a revision on the branch:' cd wc/proj cvs up -A cvs tag tag2 cvs tag -b branch2 cvs up -r branch2 echo 1.1.1.2.2.1 >file.txt cvs ci -m 'Commit on branch branch2' sleep 2 cd ../.. echo 'Commit directly to trunk. 
This creates a revision 1.2 and' echo 'changes the default branch back to trunk:' cd wc/proj cvs up -A echo '1.2' >file.txt cvs ci -m 'First explicit commit on trunk' sleep 2 cd ../.. echo 'Import again. This import is no longer on the non-trunk vendor' echo 'branch, so it does not have any effect on trunk:' cd proj echo '1.1.1.3' >file.txt cvs -d $CVSROOT import -m "Third import" proj vendorbranch vendortag3 sleep 2 cd .. echo 'Create a tag and a branch explicitly from the vendor branch, and' echo 'commit a revision on the branch:' cd wc/proj cvs up -r vendorbranch cvs tag tag3 cvs tag -b branch3 cvs up -r branch3 echo 1.1.1.3.2.1 >file.txt cvs ci -m 'Commit on branch branch3' sleep 2 cd ../.. cvs2svn-2.4.0/test-data/double-delete-cvsrepos/0000775000076500007650000000000012027373500022444 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/double-delete-cvsrepos/twice-removed,v0000664000076500007650000000144710702477016025416 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @; @; 1.3 date 95.12.30.18.37.22; author jrandom; state dead; branches; next 1.2; 1.2 date 95.12.11.00.27.53; author jrandom; state dead; branches; next 1.1; 1.1 date 93.06.18.05.46.07; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 93.06.18.05.46.08; author jrandom; state Exp; branches; next ; desc @@ 1.3 log @Remove this file for the second time, which should have no effect. @ text @The original text of this file was much longer, but we didn't need it for the regression test, so we removed it. It was src/gnu/usr.bin/cvs/contrib/pcl-cvs/Attic/cookie.el,v in the FreeBSD CVS repository. @ 1.2 log @Remove this file for the first time. @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Updated CVS @ text @@ cvs2svn-2.4.0/test-data/enroot-race-obo-cvsrepos/0000775000076500007650000000000012027373500022725 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/enroot-race-obo-cvsrepos/file,v0000664000076500007650000000030410702477015024032 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.01.01.12.00.00; author jrandom; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ cvs2svn-2.4.0/test-data/overdead-cvsrepos/0000775000076500007650000000000012027373500021523 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/overdead-cvsrepos/overdead,v0000664000076500007650000000535710702477013023515 0ustar mhaggermhagger00000000000000head 1.16; access; symbols jujubean-2_1_0:1.16 jujubean-0_3_4:1.1.1.1 jujubean-1_2_4:1.1.1.1 jujubean-1_2_3:1.1.1.1 jujubean-1_2_2:1.1.1.1 jujubean-0_3_3:1.1.1.1 jujubean-1_2_1:1.1.1.1 jujubean-1_2_0:1.1.1.1.0.10 jujubean-1_1_0:1.1.1.1.0.6 jujubean-1_0_2:1.1.1.1 jujubean-1_0_1:1.1.1.1 jujubean-0_3_2:1.1.1.1 jujubean-1_0_0:1.1.1.1.0.4 jujubean-0_3_1:1.1.1.1 jujubean-0_3_0:1.1.1.1.0.2 jujubean-0_0_0:1.1.1.1 jujubean:1.1.1; locks; strict; comment @// @; 1.16 date 2002.05.24.23.49.01; author bloomy; state dead; branches; next 1.15; 1.15 date 2002.05.24.23.48.55; author bloomy; state dead; branches; next 1.14; 1.14 date 2002.05.24.23.48.52; author bloomy; state dead; branches; next 1.13; 1.13 date 2002.05.24.23.48.50; author bloomy; state dead; branches; next 1.12; 1.12 date 2002.05.24.23.48.47; author bloomy; state dead; branches; next 1.11; 1.11 date 2002.05.24.23.48.44; author bloomy; state dead; branches; next 1.10; 1.10 date 2002.05.24.23.48.41; author bloomy; state dead; branches; next 1.9; 1.9 date 2002.05.24.23.48.39; author bloomy; 
state dead; branches; next 1.8; 1.8 date 2002.05.24.23.48.36; author bloomy; state dead; branches; next 1.7; 1.7 date 2002.05.24.23.48.30; author bloomy; state dead; branches; next 1.6; 1.6 date 2002.05.24.23.48.27; author bloomy; state dead; branches; next 1.5; 1.5 date 2002.05.24.23.48.24; author bloomy; state dead; branches; next 1.4; 1.4 date 2002.05.24.23.48.21; author bloomy; state dead; branches; next 1.3; 1.3 date 2002.05.24.23.48.18; author bloomy; state dead; branches; next 1.2; 1.2 date 2002.05.24.23.48.16; author bloomy; state dead; branches; next 1.1; 1.1 date 2002.04.11.03.20.38; author bloomy; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2002.04.11.03.20.38; author bloomy; state Exp; branches; next ; desc @@ 1.16 log @Flipping the fragbottom busters. @ text @This is a file that's been marked as 'dead' in a crazy number of revisions. @ 1.15 log @Flipping the fragbottom busters. @ text @@ 1.14 log @Flipping the fragbottom busters. @ text @@ 1.13 log @Flipping the fragbottom busters. @ text @@ 1.12 log @Flipping the fragbottom busters. @ text @@ 1.11 log @Flipping the fragbottom busters. @ text @@ 1.10 log @Flipping the fragbottom busters. @ text @@ 1.9 log @Flipping the fragbottom busters. @ text @@ 1.8 log @Flipping the fragbottom busters. @ text @@ 1.7 log @Flipping the fragbottom busters. @ text @@ 1.6 log @Flipping the fragbottom busters. @ text @@ 1.5 log @Flipping the fragbottom busters. @ text @@ 1.4 log @Flipping the fragbottom busters. @ text @@ 1.3 log @Flipping the fragbottom busters. @ text @@ 1.2 log @Flipping the fragbottom busters. @ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/missing-vendor-branch-cvsrepos/0000775000076500007650000000000012027373500024131 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/missing-vendor-branch-cvsrepos/file,v0000664000076500007650000000030211710517254025234 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2006.09.06.19.14.41; author author3; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ cvs2svn-2.4.0/test-data/mirror-keyerror-cvsrepos/0000775000076500007650000000000012027373500023104 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror-cvsrepos/powerpc/0000775000076500007650000000000012027373500024563 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror-cvsrepos/powerpc/bits/0000775000076500007650000000000012027373500025524 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror-cvsrepos/powerpc/bits/Attic/0000775000076500007650000000000012027373500026570 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror-cvsrepos/powerpc/bits/Attic/file2.txt,v0000664000076500007650000000043210765222713030601 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @ * @; 1.2 date 2002.06.24.05.48.16; author author49; state dead; branches; next 1.1; 1.1 date 2000.04.26.02.51.11; author author19; state Exp; branches; next ; desc @@ 1.2 log @log 2492@ text @@ 1.1 log @log 4326@ text @@ cvs2svn-2.4.0/test-data/mirror-keyerror-cvsrepos/powerpc/file1.txt,v0000664000076500007650000000060610765223031026570 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @ * @; 1.3 date 2002.06.24.05.48.14; author author49; state Exp; branches; next 1.2; 1.2 date 2000.04.26.02.46.59; author author19; state dead; branches; next 1.1; 1.1 date 
2000.04.21.20.33.29; author author19; state Exp; branches; next ; desc @@ 1.3 log @log 2492@ text @@ 1.2 log @log 4326@ text @@ 1.1 log @log 4129@ text @@ cvs2svn-2.4.0/test-data/attic-directory-conflict-cvsrepos/0000775000076500007650000000000012027373500024637 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/attic-directory-conflict-cvsrepos/proj/0000775000076500007650000000000012027373500025611 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/attic-directory-conflict-cvsrepos/proj/Attic/0000775000076500007650000000000012027373500026655 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/attic-directory-conflict-cvsrepos/proj/Attic/file1,v0000664000076500007650000000047410702477016030054 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2007.04.08.10.09.19; author mhagger; state dead; branches; next 1.1; 1.1 date 2007.04.08.08.09.02; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @*** empty log message *** @ text @@ 1.1 log @*** empty log message *** @ text @@ cvs2svn-2.4.0/test-data/attic-directory-conflict-cvsrepos/proj/file1/0000775000076500007650000000000012027373500026611 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/attic-directory-conflict-cvsrepos/proj/file1/file2.txt,v0000664000076500007650000000027510702477016030626 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2007.04.08.08.10.10; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @*** empty log message *** @ text @@ cvs2svn-2.4.0/test-data/crossed-branches-cvsrepos/0000775000076500007650000000000012027373500023157 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/crossed-branches-cvsrepos/proj/0000775000076500007650000000000012027373500024131 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/crossed-branches-cvsrepos/proj/file2.txt,v0000664000076500007650000000126310702477013026141 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH3:1.2.0.4 BRANCH4:1.2.0.2 BRANCH1:1.1.0.4 BRANCH2:1.1.0.2; locks; strict; comment @# @; 1.2 date 2007.01.20.14.54.27; author mhagger; state Exp; branches 1.2.2.1 1.2.4.1; next 1.1; 1.1 date 2007.01.20.14.45.02; author mhagger; state Exp; branches; next ; 1.2.2.1 date 2007.01.20.14.57.11; author mhagger; state Exp; branches; next ; 1.2.4.1 date 2007.01.20.14.56.46; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @Revisions 1.2 @ text @1.2 @ 1.2.2.1 log @Shared commit message @ text @d1 1 a1 1 BRANCH4 commit @ 1.2.4.1 log @Shared commit message @ text @d1 1 a1 1 BRANCH3 commit @ 1.1 log @Adding two files @ text @d1 1 @ cvs2svn-2.4.0/test-data/crossed-branches-cvsrepos/proj/file1.txt,v0000664000076500007650000000126310702477013026140 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH4:1.2.0.4 BRANCH3:1.2.0.2 BRANCH2:1.1.0.4 BRANCH1:1.1.0.2; locks; strict; comment @# @; 1.2 date 2007.01.20.14.54.27; author mhagger; state Exp; branches 1.2.2.1 1.2.4.1; next 1.1; 1.1 date 2007.01.20.14.45.02; author mhagger; state Exp; branches; next ; 1.2.2.1 date 2007.01.20.14.56.46; author mhagger; state Exp; branches; next ; 1.2.4.1 date 2007.01.20.14.57.11; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @Revisions 1.2 @ text @1.2 @ 1.2.4.1 log @Shared commit message @ text @d1 1 a1 1 BRANCH4 commit @ 1.2.2.1 log @Shared commit message @ text @d1 1 a1 1 BRANCH3 commit @ 1.1 log @Adding two files @ text @d1 1 @ 
cvs2svn-2.4.0/test-data/issue-106-cvsrepos/0000775000076500007650000000000012027373500021366 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/issue-106-cvsrepos/d/0000775000076500007650000000000012027373500021611 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/issue-106-cvsrepos/d/b.txt,v0000664000076500007650000000053110702477013023036 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.15; access; symbols branch:1.1.15; locks; strict; comment @# @; 1.1 date 2002.12.13.08.29.37; author joeschmo; state Exp; branches 1.1.15.1; next ; 1.1.15.1 date 2002.12.13.08.29.37; author joeschmo; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ 1.1.15.1 log @b.txt 1.1.15.1 @ text @@ cvs2svn-2.4.0/test-data/issue-106-cvsrepos/a.txt,v0000664000076500007650000000053110702477013022612 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.15; access; symbols branch:1.1.15; locks; strict; comment @# @; 1.1 date 2002.12.13.08.30.00; author joeschmo; state Exp; branches 1.1.15.1; next ; 1.1.15.1 date 2002.12.13.08.30.00; author joeschmo; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ 1.1.15.1 log @a.txt 1.1.15.1 @ text @@ cvs2svn-2.4.0/test-data/empty-trunk-cvsrepos/0000775000076500007650000000000012027373500022231 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-trunk-cvsrepos/root/0000775000076500007650000000000012027373500023214 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-trunk-cvsrepos/root/Attic/0000775000076500007650000000000012027373500024260 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-trunk-cvsrepos/root/Attic/a_file,v0000664000076500007650000000055610702477020025672 0ustar mhaggermhagger00000000000000head 1.1; access; symbols mytag:1.1.2.1 mybranch:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.06.05.14.14.45; author max; state dead; branches 1.1.2.1; next ; 1.1.2.1 date 2004.06.05.14.14.45; author max; state Exp; branches; next ; desc @@ 1.1 log @file a_file was initially added on branch mybranch. 
@ text @@ 1.1.2.1 log @Add a_file @ text @@ cvs2svn-2.4.0/test-data/repeated-deltatext-cvsrepos/0000775000076500007650000000000012027373500023517 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/repeated-deltatext-cvsrepos/file.txt,v0000664000076500007650000000131610702477014025445 0ustar mhaggermhagger00000000000000head 1.3; access; symbols uves_2_0_0:1.3 UVES-1_3_0-beta1a:1.1; locks; strict; comment @# @; 1.3 date 2003.06.30.12.59.08; author amodigli; state Exp; branches; next 1.2; 1.2 date 2003.06.30.12.54.52; author amodigli; state Exp; branches; next 1.1; 1.1 date 2002.02.11.17.59.51; author amodigli; state Exp; branches; next ; desc @@ 1.3 log @uves-2.0.0-rep @ text @ COMMON /QC_LOG/MID_S_N_CENT,OBJ_POS_CENT, + FWHM,N_CURR_ORD !to not pass a parameter to G_PROF @ 1.2 log @uves-2.0.0 @ text @@ 1.1 log @1st release @ text @@ 1.1 log @Created @ text @ COMMON /QC_LOG/MID_S_N_CENT,OBJ_POS_CENT, + FWHM,N_CURR_ORD !to not pass a parameter to G_PROF @ cvs2svn-2.4.0/test-data/requires-cvs-cvsrepos/0000775000076500007650000000000012027373500022362 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/requires-cvs-cvsrepos/space-in-authorname,v0000664000076500007650000000064510702477020026414 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2004.07.26.23.38.17; author William Lyon Phelps III; state Exp; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author j random; state Exp; branches; next ; desc @@ 1.2 log @Commit a second revision of every file. @ text @This is the first revision in this file. This line was added in the second revision. @ 1.1 log @Add a file. @ text @d2 1 @ cvs2svn-2.4.0/test-data/requires-cvs-cvsrepos/atsign-add,v0000664000076500007650000000030210702477020024555 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2000.01.01.01.01.01; author someuser; state Exp; branches; next ; desc @@ 1.1 log @Somelogmsg @ text @Sometext /* $Id: */@ cvs2svn-2.4.0/test-data/requires-cvs-cvsrepos/README0000664000076500007650000000023210702477020023237 0ustar mhaggermhagger00000000000000This repository tests that the --use-cvs flag allows cvs2svn to correctly convert some files that would be troublesome for RCS. See issues 4, 11, and 29. cvs2svn-2.4.0/test-data/requires-cvs-cvsrepos/client_lock.idl,v0000664000076500007650000000346711500107341025607 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2001.10.09.07.30.31; author gregh; state Exp; branches; next 1.1; 1.1 date 2001.10.03.00.59.06; author gregh; state Exp; branches; next ; desc @@ 1.2 log @ Integration for locks @ text @//================================================================== -*- C++ -*- // // File client_lock.idl // // Description // Lock for interface objects. // //$Id: client_lock.idl,v 1.1 2001/10/03 00:59:06 gregh Exp $ // //$Log: client_lock.idl,v $ //Revision 1.1 2001/10/03 00:59:06 gregh // //Added graph points to track, and added advisory locks to track, marker //look, sensor, and ownship interfaces. 
// // //============================================================================== #ifndef _CLIENT_LOCK_IDL_ #define _CLIENT_LOCK_IDL_ #include "tdms.idl" #include "client.idl" #include "exception.idl" module Orb { //============================================================================== interface I_Client_Lock { // Record Locking (used as base interface for all leaf object interfaces) // Note that locks are advisory, so clients need not acquire or honour. void acquire_read_lock(in I_Client objref) raises (Lock_Failed); void acquire_write_lock(in I_Client objref) raises (Lock_Failed); // promotes a read lock void release_lock(in I_Client objref); }; //============================================================================== }; #endif @ 1.1 log @ Added graph points to track, and added advisory locks to track, marker look, sensor, and ownship interfaces. @ text @d8 7 a14 1 //$Id: tdms.idl,v 1.28 2001/09/11 08:10:55 daveb Exp $ a15 1 //$Log: tdms.idl,v $ d24 1 d35 2 a36 2 void acquire_read_lock(in I_Client objref); void acquire_write_lock(in I_Client objref); // this will promote a read lock @ cvs2svn-2.4.0/test-data/invalid-symbol-cvsrepos/0000775000076500007650000000000012027373500022663 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/invalid-symbol-cvsrepos/cvs2svn-ignore2.options0000664000076500007650000000125611233672717027265 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # Fix a problem with an invalid symbol by ignoring it using an # IgnoreSymbolTransform. from cvs2svn_lib.symbol_transform import IgnoreSymbolTransform execfile('cvs2svn-example.options') name = 'invalid-symbol' ctx.output_option = NewRepositoryOutputOption( 'cvs2svn-tmp/%s--options=cvs2svn-ignore2.options-svnrepos' % (name,), ) run_options.clear_projects() run_options.add_project( r'test-data/%s-cvsrepos' % (name,), trunk_path='trunk', branches_path='branches', tags_path='tags', symbol_transforms=[ IgnoreSymbolTransform(r'SYMBOL'), ], symbol_strategy_rules=global_symbol_strategy_rules, ) cvs2svn-2.4.0/test-data/invalid-symbol-cvsrepos/file.txt,v0000664000076500007650000000026210703734217024612 0ustar mhaggermhagger00000000000000head 1.1; access; symbols SYMBOL:1; locks; strict; comment @ * @; 1.1 date 95.11.09.02.31.53; author author2; state Exp; branches; next ; desc @@ 1.1 log @log 43@ text @@ cvs2svn-2.4.0/test-data/invalid-symbol-cvsrepos/cvs2svn-ignore.options0000664000076500007650000000141311233672717027176 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # Fix a problem with an invalid symbol by ignoring it using a # SymbolMapper. 
from cvs2svn_lib.symbol_transform import SymbolMapper execfile('cvs2svn-example.options') name = 'invalid-symbol' ctx.output_option = NewRepositoryOutputOption( 'cvs2svn-tmp/%s--options=cvs2svn-ignore.options-svnrepos' % (name,), ) run_options.clear_projects() filename = 'test-data/%s-cvsrepos/file.txt,v' % (name,) symbol_mapper = SymbolMapper([ (filename, 'SYMBOL', '1', None), ]) run_options.add_project( r'test-data/%s-cvsrepos' % (name,), trunk_path='trunk', branches_path='branches', tags_path='tags', symbol_transforms=[ symbol_mapper, ], symbol_strategy_rules=global_symbol_strategy_rules, ) cvs2svn-2.4.0/test-data/default-branch-and-1-2-cvsrepos/0000775000076500007650000000000012027373500023646 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/default-branch-and-1-2-cvsrepos/proj/0000775000076500007650000000000012027373500024620 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/default-branch-and-1-2-cvsrepos/proj/a.txt,v0000664000076500007650000000231410702477014026046 0ustar mhaggermhagger00000000000000head 1.2; branch 1.1.1; access; symbols vtag-4:1.1.1.4 vtag-3:1.1.1.3 vtag-2:1.1.1.2 vtag-1:1.1.1.1 vbranchA:1.1.1; locks; strict; comment @# @; 1.2 date 2004.02.09.15.43.14; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.3; 1.1.1.3 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.4; 1.1.1.4 date 2004.02.09.15.43.16; author kfogel; state Exp; branches; next ; desc @@ 1.2 log @First regular commit, to a.txt, on vtag-3. @ text @This is vtag-3 (on vbranchA) of a.txt. A regular change to a.txt. @ 1.1 log @Initial revision @ text @d1 2 a2 1 This is vtag-1 (on vbranchA) of a.txt. @ 1.1.1.1 log @Import (vbranchA, vtag-1). @ text @@ 1.1.1.2 log @Import (vbranchA, vtag-2). @ text @d1 1 a1 1 This is vtag-2 (on vbranchA) of a.txt. @ 1.1.1.3 log @Import (vbranchA, vtag-3). @ text @d1 1 a1 1 This is vtag-3 (on vbranchA) of a.txt. @ 1.1.1.4 log @Import (vbranchA, vtag-4). @ text @d1 1 a1 1 This is vtag-4 (on vbranchA) of a.txt. @ cvs2svn-2.4.0/test-data/ctrl-char-in-log-cvsrepos/0000775000076500007650000000000012027373500022774 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/ctrl-char-in-log-cvsrepos/ctrl-char-in-log,v0000664000076500007650000000104310702477013026223 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols vendorbranch:1.1.1 vendortag:1.1.1.1; locks ; strict; comment @# @; 1.1 date 2003.07.28.15.00.00; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.07.28.15.00.00; author jrandom; state Exp; branches ; next ; desc @@ 1.1 log @The content of this revision is unimportant, what matters is that this log message contains a Ctrl-D right here, "", and cvs2svn.py should handle this. @ text @Nothing to see here. 
@ 1.1.1.1 log @imported @ text @@ cvs2svn-2.4.0/test-data/branch-delete-first-cvsrepos/0000775000076500007650000000000012027373500023554 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-delete-first-cvsrepos/file,v0000664000076500007650000000134110702477015024663 0ustar mhaggermhagger00000000000000head 1.3; access; symbols branch-3:1.1.0.6 branch-2:1.1.0.4 branch-1:1.1.0.2 import:1.1; locks; strict; comment @# @; 1.3 date 2003.01.17.06.56.24; author joeuser; state Exp; branches; next 1.2; 1.2 date 2003.01.17.06.48.54; author joeuser; state Exp; branches; next 1.1; 1.1 date 2003.01.15.06.21.59; author joeuser; state Exp; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2003.01.17.00.34.20; author joeuser; state dead; branches; next ; 1.1.4.1 date 2003.01.16.00.16.39; author joeuser; state dead; branches; next ; desc @@ 1.3 log @1.3 @ text @ 1.3 @ 1.2 log @1.2 @ text @d1 1 a1 1 1.2 @ 1.1 log @1.1 @ text @d1 1 a1 1 1.1 @ 1.1.2.1 log @1.1.2.1 @ text @@ 1.1.4.1 log @1.1.4.1 @ text @@ cvs2svn-2.4.0/test-data/branch-delete-first-cvsrepos/README.txt0000664000076500007650000000053311317026315025253 0ustar mhaggermhagger00000000000000This repository exhibits has interesting characteristic that the very first thing that happen on a branch is that its sole file is deleted. A bug in cvs2svn caused this to delay branch creation until the end of the program (where we're finished off branches and tags), which resulted in the file's deletion from the branch never really happening. cvs2svn-2.4.0/test-data/add-cvsignore-to-branch-cvsrepos/0000775000076500007650000000000012027373500024332 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/add-cvsignore-to-branch-cvsrepos/dir/0000775000076500007650000000000012027373500025110 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/add-cvsignore-to-branch-cvsrepos/dir/.cvsignore,v0000664000076500007650000000027011112147612027346 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.09.30.09.26.41; author author11; state Exp; branches; next ; desc @@ 1.1 log @@ text @*.o@ cvs2svn-2.4.0/test-data/add-cvsignore-to-branch-cvsrepos/dir/file.txt,v0000664000076500007650000000044611076444704027046 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.01.28.12.14.36; author author8; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2004.05.03.15.31.02; author author1; state Exp; branches; next ; desc @@ 1.1 log @@ text @@ 1.1.2.1 log @@ text @@ cvs2svn-2.4.0/test-data/resync-pass2-pull-forward-cvsrepos/0000775000076500007650000000000012027373500024677 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/resync-pass2-pull-forward-cvsrepos/file2.txt,v0000664000076500007650000000041710702477015026711 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; 1.2 date 90.04.19.15.10.41; author user1; state Exp; branches; next 1.1; 1.1 date 90.04.19.15.10.41; author user1; state Exp; branches; next ; desc @@ 1.2 log @Summary: foo @ text @@ 1.1 log @Initial revision @ text @@ cvs2svn-2.4.0/test-data/resync-pass2-pull-forward-cvsrepos/file1.txt,v0000664000076500007650000000041710702477015026710 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; 1.2 date 90.04.19.15.10.30; author user1; state Exp; branches; next 1.1; 1.1 date 90.04.19.15.10.29; author user1; state Exp; branches; next ; desc @@ 1.2 log @Summary: foo @ text @@ 1.1 log @Initial revision @ text @@ 
cvs2svn-2.4.0/test-data/internal-co-cvsrepos/0000775000076500007650000000000012027373500022145 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/internal-co-cvsrepos/branched/0000775000076500007650000000000012027373500023713 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/internal-co-cvsrepos/branched/Attic/0000775000076500007650000000000012027373500024757 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/internal-co-cvsrepos/branched/Attic/somefile.txt,v0000664000076500007650000000443710702477021027576 0ustar mhaggermhagger00000000000000head 1.5; access; symbols BRANCH_FROM_DEAD:1.5.0.2 BRANCH:1.1.0.2; locks; strict; comment @# @; 1.5 date 2007.04.05.15.32.44; author ossi; state dead; branches 1.5.2.1; next 1.4; commitid ksTEPgcwRGzBKTcs; 1.4 date 2007.04.05.15.32.23; author ossi; state Exp; branches; next 1.3; commitid g00K7nhwfWduKTcs; 1.3 date 2007.04.05.15.13.23; author ossi; state dead; branches; next 1.2; commitid 6vDsezBCNsbYDTcs; 1.2 date 2007.04.05.15.13.08; author ossi; state Exp; branches; next 1.1; commitid XPB93k3c4jJSDTcs; 1.1 date 2007.04.05.15.07.41; author ossi; state Exp; branches 1.1.2.1; next ; commitid jgg4E7IfvqX0CTcs; 1.5.2.1 date 2007.04.05.15.32.44; author ossi; state dead; branches; next 1.5.2.2; commitid bGYbKyNPiicdLTcs; 1.5.2.2 date 2007.04.05.15.34.30; author ossi; state Exp; branches; next ; commitid bGYbKyNPiicdLTcs; 1.1.2.1 date 2007.04.05.15.30.02; author ossi; state Exp; branches; next 1.1.2.2; commitid 6MkfqH2xRPLDJTcs; 1.1.2.2 date 2007.04.05.15.30.44; author ossi; state Exp; branches; next 1.1.2.3; commitid BSF6Cvx3cHWUJTcs; 1.1.2.3 date 2007.04.05.15.30.55; author ossi; state dead; branches; next ; commitid UTPOIUtFBh3ZJTcs; desc @@ 1.5 log @file re-deleted on trunk @ text @resurrected file content. @ 1.5.2.1 log @file somefile.txt was added on branch BRANCH_FROM_DEAD on 2007-04-05 15:34:30 +0000 @ text @d1 1 @ 1.5.2.2 log @file revived on branch @ text @a0 1 text on branch spawning from dead revision.@ 1.4 log @file resurrected on trunk @ text @@ 1.3 log @file deleted @ text @d1 1 a1 2 keyword: $Id: somefile.txt,v 1.2 2007-04-05 15:13:08 ossi Exp $ now done this is modified file content. @ 1.2 log @file modified @ text @d1 1 a1 1 keyword: $Id: somefile.txt,v 1.1 2007-04-05 15:07:41 ossi Exp $ now done @ 1.1 log @file added @ text @d1 2 a2 2 this is file content. keyword: $Id: fake expaded keyword$ now done @ 1.1.2.1 log @modified on branch @ text @d2 1 a2 2 text added on branch. keyword: $Id: somefile.txt,v 1.1 2007-04-05 15:07:41 ossi Exp $ now done @ 1.1.2.2 log @file modified on branch, take 2 @ text @d3 1 a3 2 keyword: $Id: somefile.txt,v 1.1.2.1 2007-04-05 15:30:02 ossi Exp $ now done more text added on branch. 
@ 1.1.2.3 log @file deleted on branch @ text @d3 1 a3 1 keyword: $Id: somefile.txt,v 1.1.2.2 2007-04-05 15:30:44 ossi Exp $ now done @ cvs2svn-2.4.0/test-data/repeatedly-defined-symbols-cvsrepos/0000775000076500007650000000000012027373500025152 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/repeatedly-defined-symbols-cvsrepos/proj/0000775000076500007650000000000012027373500026124 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/repeatedly-defined-symbols-cvsrepos/proj/default,v0000664000076500007650000000034610740005425027736 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH:1.1.0.2 BRANCH:1.1.0.2 TAG:1.1 TAG:1.1; locks; strict; comment @# @; 1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ cvs2svn-2.4.0/test-data/tagging-after-delete-cvsrepos/0000775000076500007650000000000012027373500023711 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/tagging-after-delete-cvsrepos/test/0000775000076500007650000000000012027373500024670 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/tagging-after-delete-cvsrepos/test/Attic/0000775000076500007650000000000012027373500025734 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/tagging-after-delete-cvsrepos/test/Attic/b,v0000664000076500007650000000044410702477014026347 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2006.11.28.18.18.39; author mark; state dead; branches; next 1.1; 1.1 date 2006.11.28.18.18.24; author mark; state Exp; branches; next ; desc @@ 1.2 log @removed file b @ text @file b @ 1.1 log @added files @ text @@ cvs2svn-2.4.0/test-data/tagging-after-delete-cvsrepos/test/a,v0000664000076500007650000000027510702477014025304 0ustar mhaggermhagger00000000000000head 1.1; access; symbols tag1:1.1; locks; strict; comment @# @; 1.1 date 2006.11.28.18.18.23; author mark; state Exp; branches; next ; desc @@ 1.1 log @added files @ text @file a @ cvs2svn-2.4.0/test-data/symbolic-name-overfill-cvsrepos/0000775000076500007650000000000012027373500024311 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/symbolic-name-overfill-cvsrepos/proj/0000775000076500007650000000000012027373500025263 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/symbolic-name-overfill-cvsrepos/proj/Attic/0000775000076500007650000000000012027373500026327 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/symbolic-name-overfill-cvsrepos/proj/Attic/file-added-on-branch.txt,v0000664000076500007650000000062110702477012033155 0ustar mhaggermhagger00000000000000head 1.1; access; symbols branch1:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.07.23.19.28.17; author fitz; state dead; branches 1.1.2.1; next ; 1.1.2.1 date 2004.07.23.19.28.17; author fitz; state Exp; branches; next ; desc @@ 1.1 log @file branched-file.txt was initially added on branch branch1. @ text @@ 1.1.2.1 log @initial import @ text @a0 2 This file added on a branch. @ cvs2svn-2.4.0/test-data/symbolic-name-overfill-cvsrepos/proj/file.txt,v0000664000076500007650000000064410702477012027212 0ustar mhaggermhagger00000000000000head 1.1; access; symbols branch1:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.07.23.19.16.28; author fitz; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2004.07.23.19.17.22; author fitz; state Exp; branches; next ; desc @@ 1.1 log @initial import @ text @This is a file. That's all there is to it.@ 1.1.2.1 log @Commit to file on branch. @ text @d1 1 a1 1 This is a file. 
It's now on a branch.@ cvs2svn-2.4.0/test-data/double-branch-delete-cvsrepos/0000775000076500007650000000000012027373500023677 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/double-branch-delete-cvsrepos/SoftSet.java,v0000664000076500007650000000027410702477016026403 0ustar mhaggermhagger00000000000000head 1.1; access; symbols Branch_4_0:1.1.0.2; locks; strict; comment @# @; 1.1 date 2006.05.11.02.23.22; author starksm; state Exp; branches; next ; desc @@ 1.1 log @log1 @ text @@ cvs2svn-2.4.0/test-data/double-branch-delete-cvsrepos/Streams.java,v0000664000076500007650000000076110702477016026433 0ustar mhaggermhagger00000000000000head 1.4; access; symbols Branch_4_0:1.4.0.18; locks; strict; comment @# @; 1.4 date 2002.05.31.03.09.03; author user57; state Exp; branches 1.4.18.1; next ; 1.4.18.1 date 2005.10.29.05.07.47; author starksm; state Exp; branches; next ; desc @@ 1.4 log @ o turned commented System.out's into gaurded log.trace's o flush output buffer on copyb, as if it was not a buffer already the last bit might be (and usually is) lost. @ text @@ 1.4.18.1 log @Update the LGPL header @ text @@ cvs2svn-2.4.0/test-data/double-branch-delete-cvsrepos/IMarshalledValue.java,v0000664000076500007650000000145010702477016030173 0ustar mhaggermhagger00000000000000head 1.1; access; symbols Branch_4_0:1.1.0.2; locks; strict; comment @# @; 1.1 date 2005.12.15.20.18.27; author csuconic; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2005.12.15.20.18.27; author csuconic; state dead; branches; next 1.1.2.3; 1.1.2.3 date 2005.12.15.20.33.41; author csuconic; state Exp; branches; next 1.1.2.4; 1.1.2.4 date 2006.03.29.04.16.05; author csuconic; state dead; branches; next ; desc @@ 1.1 log @JBAS-2436 - Adding IMarshalledValue to common - required piece for integrating Pluggable Serialization @ text @@ 1.1.2.1 log @file IMarshalledValue.java was added on branch Branch_4_0 on 2005-12-15 20:18:49 +0000 @ text @@ 1.1.2.3 log @JBAS-2436 - Adding LGPL Header2 @ text @@ 1.1.2.4 log @JBAS-3025 - Removing dependency on commons/IMarshalledValue @ text @@ cvs2svn-2.4.0/test-data/pass5-when-to-fill-cvsrepos/0000775000076500007650000000000012027373500023270 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/pass5-when-to-fill-cvsrepos/root/0000775000076500007650000000000012027373500024253 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/pass5-when-to-fill-cvsrepos/root/b,v0000664000076500007650000000050710702477020024663 0ustar mhaggermhagger00000000000000head 1.1; access; symbols mybranch:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.06.05.13.30.31; author max; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2004.06.05.13.31.12; author max; state Exp; branches; next ; desc @@ 1.1 log @Add a and b to trunk @ text @@ 1.1.2.1 log @Branch commit of b @ text @@ cvs2svn-2.4.0/test-data/pass5-when-to-fill-cvsrepos/root/a,v0000664000076500007650000000050710702477020024662 0ustar mhaggermhagger00000000000000head 1.1; access; symbols mybranch:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.06.05.13.30.31; author max; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2004.06.05.13.31.07; author max; state Exp; branches; next ; desc @@ 1.1 log @Add a and b to trunk @ text @@ 1.1.2.1 log @Branch commit of a @ text @@ cvs2svn-2.4.0/test-data/pass5-when-to-fill-cvsrepos/root/c,v0000664000076500007650000000050110702477020024656 0ustar mhaggermhagger00000000000000head 1.1; access; symbols mybranch:1.1.0.2; locks; strict; comment @# @; 1.1 date 2004.06.05.13.31.17; author max; state Exp; 
branches 1.1.2.1; next ; 1.1.2.1 date 2004.06.05.13.31.18; author max; state Exp; branches; next ; desc @@ 1.1 log @Add c to trunk @ text @@ 1.1.2.1 log @Branch commit of c @ text @@ cvs2svn-2.4.0/test-data/pass5-when-to-fill-cvsrepos/README0000664000076500007650000000034110702477020024146 0ustar mhaggermhagger00000000000000This test checks a case where pass5 generates a fill which isn't actually needed, causing cvs2svn to die when it composes an empty revision in pass8. The bug was originally detected converting the apache 1.3 cvs repository. cvs2svn-2.4.0/test-data/file-in-attic-too-cvsrepos/0000775000076500007650000000000012027373500023156 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/file-in-attic-too-cvsrepos/Attic/0000775000076500007650000000000012027373500024222 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/file-in-attic-too-cvsrepos/Attic/file.txt,v0000664000076500007650000000031310702477015026145 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2004.07.12.20.01.23; author fitz; state Exp; branches; next ; desc @@ 1.1 log @initial import @ text @This is a file. So there. @ cvs2svn-2.4.0/test-data/file-in-attic-too-cvsrepos/file.txt,v0000664000076500007650000000031310702477015025101 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2004.07.12.20.01.23; author fitz; state Exp; branches; next ; desc @@ 1.1 log @initial import @ text @This is a file. So there. @ cvs2svn-2.4.0/test-data/default-branches-cvsrepos/0000775000076500007650000000000012027373500023141 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/default-branches-cvsrepos/proj/0000775000076500007650000000000012027373500024113 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/default-branches-cvsrepos/proj/d.txt,v0000664000076500007650000000174310702477015025352 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols vtag-4:1.1.1.4 vtag-3:1.1.1.3 vtag-2:1.1.1.2 vtag-1:1.1.1.1; locks; strict; comment @# @; 1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.3; 1.1.1.3 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.4; 1.1.1.4 date 2004.02.09.15.43.16; author kfogel; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @This is vtag-1 (on vbranchA) of d.txt. @ 1.1.1.1 log @Import (vbranchA, vtag-1). @ text @@ 1.1.1.2 log @Import (vbranchA, vtag-2). @ text @d1 1 a1 1 This is vtag-2 (on vbranchA) of d.txt. @ 1.1.1.3 log @Import (vbranchA, vtag-3). @ text @d1 1 a1 1 This is vtag-3 (on vbranchA) of d.txt. @ 1.1.1.4 log @Import (vbranchA, vtag-4). @ text @d1 1 a1 1 This is vtag-4 (on vbranchA) of d.txt. 
@ cvs2svn-2.4.0/test-data/default-branches-cvsrepos/proj/deleted-on-vendor-branch.txt,v0000664000076500007650000000212010702477015031663 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols vtag-4:1.1.1.4 vtag-3:1.1.1.3 vtag-2:1.1.1.2 vtag-1:1.1.1.1 vbranchA:1.1.1; locks; strict; comment @# @; 1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.3; 1.1.1.3 date 2004.02.09.15.43.13; author kfogel; state dead; branches; next 1.1.1.4; 1.1.1.4 date 2004.02.09.15.43.16; author kfogel; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @This is vtag-1 (on vbranchA) of deleted-on-vendor-branch.txt. @ 1.1.1.1 log @Import (vbranchA, vtag-1). @ text @@ 1.1.1.2 log @Import (vbranchA, vtag-2). @ text @d1 1 a1 1 This is vtag-2 (on vbranchA) of deleted-on-vendor-branch.txt. @ 1.1.1.3 log @Import (vbranchA, vtag-3). @ text @d1 1 a1 1 This is vtag-3 (on vbranchA) of deleted-on-vendor-branch.txt. @ 1.1.1.4 log @Import (vbranchA, vtag-4). @ text @d1 1 a1 1 This is vtag-4 (on vbranchA) of deleted-on-vendor-branch.txt. @ cvs2svn-2.4.0/test-data/default-branches-cvsrepos/proj/a.txt,v0000664000076500007650000000227610702477015025351 0ustar mhaggermhagger00000000000000head 1.2; access; symbols vtag-4:1.1.1.4 vtag-3:1.1.1.3 vtag-2:1.1.1.2 vtag-1:1.1.1.1 vbranchA:1.1.1; locks; strict; comment @# @; 1.2 date 2004.02.09.15.43.14; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.3; 1.1.1.3 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.4; 1.1.1.4 date 2004.02.09.15.43.16; author kfogel; state Exp; branches; next ; desc @@ 1.2 log @First regular commit, to a.txt, on vtag-3. @ text @This is vtag-3 (on vbranchA) of a.txt. A regular change to a.txt. @ 1.1 log @Initial revision @ text @d1 2 a2 1 This is vtag-1 (on vbranchA) of a.txt. @ 1.1.1.1 log @Import (vbranchA, vtag-1). @ text @@ 1.1.1.2 log @Import (vbranchA, vtag-2). @ text @d1 1 a1 1 This is vtag-2 (on vbranchA) of a.txt. @ 1.1.1.3 log @Import (vbranchA, vtag-3). @ text @d1 1 a1 1 This is vtag-3 (on vbranchA) of a.txt. @ 1.1.1.4 log @Import (vbranchA, vtag-4). @ text @d1 1 a1 1 This is vtag-4 (on vbranchA) of a.txt. @ cvs2svn-2.4.0/test-data/default-branches-cvsrepos/proj/e.txt,v0000664000076500007650000000166310702477015025354 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols vtag-3:1.1.1.3; locks; strict; comment @# @; 1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.3; 1.1.1.3 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.4; 1.1.1.4 date 2004.02.09.15.43.16; author kfogel; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @This is vtag-1 (on vbranchA) of e.txt. @ 1.1.1.1 log @Import (vbranchA, vtag-1). @ text @@ 1.1.1.2 log @Import (vbranchA, vtag-2). @ text @d1 1 a1 1 This is vtag-2 (on vbranchA) of e.txt. @ 1.1.1.3 log @Import (vbranchA, vtag-3). @ text @d1 1 a1 1 This is vtag-3 (on vbranchA) of e.txt. 
@ 1.1.1.4 log @Import (vbranchA, vtag-4). @ text @d1 1 a1 1 This is vtag-4 (on vbranchA) of e.txt. @ cvs2svn-2.4.0/test-data/default-branches-cvsrepos/proj/b.txt,v0000664000076500007650000000176310702477015025352 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols vtag-4:1.1.1.4 vtag-3:1.1.1.3 vtag-2:1.1.1.2 vtag-1:1.1.1.1 vbranchA:1.1.1; locks; strict; comment @# @; 1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.3; 1.1.1.3 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.4; 1.1.1.4 date 2004.02.09.15.43.16; author kfogel; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @This is vtag-1 (on vbranchA) of b.txt. @ 1.1.1.1 log @Import (vbranchA, vtag-1). @ text @@ 1.1.1.2 log @Import (vbranchA, vtag-2). @ text @d1 1 a1 1 This is vtag-2 (on vbranchA) of b.txt. @ 1.1.1.3 log @Import (vbranchA, vtag-3). @ text @d1 1 a1 1 This is vtag-3 (on vbranchA) of b.txt. @ 1.1.1.4 log @Import (vbranchA, vtag-4). @ text @d1 1 a1 1 This is vtag-4 (on vbranchA) of b.txt. @ cvs2svn-2.4.0/test-data/default-branches-cvsrepos/proj/added-then-imported.txt,v0000664000076500007650000000076010702477015030743 0ustar mhaggermhagger00000000000000head 1.1; access; symbols vtag-4:1.1.1.1 vbranchA:1.1.1; locks; strict; comment @# @; 1.1 date 2004.02.09.15.43.15; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.16; author kfogel; state Exp; branches; next ; desc @@ 1.1 log @Add a file to the working copy. @ text @Adding this file, before importing it with different contents. @ 1.1.1.1 log @Import (vbranchA, vtag-4). @ text @d1 1 a1 1 This is vtag-4 (on vbranchA) of added-then-imported.txt. @ cvs2svn-2.4.0/test-data/default-branches-cvsrepos/proj/c.txt,v0000664000076500007650000000176310702477015025353 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols vtag-4:1.1.1.4 vtag-3:1.1.1.3 vtag-2:1.1.1.2 vtag-1:1.1.1.1 vbranchA:1.1.1; locks; strict; comment @# @; 1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.3; 1.1.1.3 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.4; 1.1.1.4 date 2004.02.09.15.43.16; author kfogel; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @This is vtag-1 (on vbranchA) of c.txt. @ 1.1.1.1 log @Import (vbranchA, vtag-1). @ text @@ 1.1.1.2 log @Import (vbranchA, vtag-2). @ text @d1 1 a1 1 This is vtag-2 (on vbranchA) of c.txt. @ 1.1.1.3 log @Import (vbranchA, vtag-3). @ text @d1 1 a1 1 This is vtag-3 (on vbranchA) of c.txt. @ 1.1.1.4 log @Import (vbranchA, vtag-4). @ text @d1 1 a1 1 This is vtag-4 (on vbranchA) of c.txt. 
@ cvs2svn-2.4.0/test-data/enroot-race-cvsrepos/0000775000076500007650000000000012027373500022150 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/enroot-race-cvsrepos/proj/0000775000076500007650000000000012027373500023122 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/enroot-race-cvsrepos/proj/a.txt,v0000664000076500007650000000116010702477016024350 0ustar mhaggermhagger00000000000000head 1.3; access; symbols mybranch:1.3.0.2; locks; strict; comment @# @; 1.3 date 2004.02.04.17.02.08; author kfogel; state Exp; branches; next 1.2; 1.2 date 2004.02.04.17.02.05; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.02.04.17.02.04; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.04.17.02.04; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit next change. @ text @Next change. @ 1.2 log @Commit first change. @ text @d1 1 a1 1 First change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is a.txt. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/enroot-race-cvsrepos/proj/b.txt,v0000664000076500007650000000116010702477016024351 0ustar mhaggermhagger00000000000000head 1.3; access; symbols mybranch:1.3.0.2; locks; strict; comment @# @; 1.3 date 2004.02.04.17.02.08; author kfogel; state Exp; branches; next 1.2; 1.2 date 2004.02.04.17.02.05; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.02.04.17.02.04; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.04.17.02.04; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit next change. @ text @Next change. @ 1.2 log @Commit first change. @ text @d1 1 a1 1 First change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is b.txt. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/enroot-race-cvsrepos/proj/c.txt,v0000664000076500007650000000172010702477016024354 0ustar mhaggermhagger00000000000000head 1.4; access; symbols mybranch:1.4.0.2; locks; strict; comment @# @; 1.4 date 2004.02.04.17.02.07; author kfogel; state Exp; branches 1.4.2.1; next 1.3; 1.3 date 2004.02.04.17.02.06; author kfogel; state Exp; branches; next 1.2; 1.2 date 2004.02.04.17.02.05; author kfogel; state Exp; branches; next 1.1; 1.1 date 2004.02.04.17.02.04; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.04.17.02.04; author kfogel; state Exp; branches; next ; 1.4.2.1 date 2004.02.04.17.02.08; author kfogel; state Exp; branches; next ; desc @@ 1.4 log @Commit yet another change to c.txt. @ text @Yet another commit on c.txt. @ 1.4.2.1 log @Commit next change. @ text @d1 1 a1 1 Next change. @ 1.3 log @Commit another change to c.txt. @ text @d1 1 a1 1 Another commit on c.txt. @ 1.2 log @Commit first change. @ text @d1 1 a1 1 First change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is c.txt. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/0000775000076500007650000000000012027373500023263 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/README0000664000076500007650000000224710702477015024154 0ustar mhaggermhagger00000000000000This data is for testing the resolution of: http://subversion.tigris.org/issues/show_bug.cgi?id=1427 "cvs2svn: fails on GtkRadiant repository" But the data here is not the GtkRadiant data. Instead, it comes from Jack Moffitt at xiph.org, who was able to narrow down the same bug to a much smaller repro set. It might be possible to narrow it down even further, I don't know -- too lazy to try right now. 
The important thing is that this data won't convert if revision 6567 is subtracted from cvs2svn.py. The error message can look either like this ----- pass 3 ----- ----- pass 4 ----- committing: Sun Sep 9 21:26:32 2001, over 3 seconds No origin records for branch 'xiph'. or like this File "./cvs2svn.py", line 960, in copy_path entries) File "./cvs2svn.py", line 661, in change_path for ent in new_val.keys(): AttributeError: 'None' object has no attribute 'keys' the former if no part of the 'xiph' vendor import branch has been created in the Subversion repository by the time we get to the problem file, the latter if /branches/xiph/ already exists. It could go either way, depending on how Python iterates over dictionary keys. cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/thread/0000775000076500007650000000000012027373500024532 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/thread/.cvsignore,v0000664000076500007650000000074010702477015027001 0ustar mhaggermhagger00000000000000head 1.2; access; symbols libshout-2_0:1.2 libshout-2_0b3:1.2 libshout-2_0b2:1.2 libshout_2_0b1:1.2 libogg2-zerocopy:1.2.0.4 branch-beta2-rewrite:1.2.0.2; locks; strict; comment @# @; 1.2 date 2001.09.10.03.04.11; author jack; state Exp; branches; next 1.1; 1.1 date 2001.09.10.03.00.41; author jack; state Exp; branches; next ; desc @@ 1.2 log @.cvsignore is fun! @ text @Makefile Makefile.in .deps .libs *.la *.lo @ 1.1 log @Still more .cvsignore @ text @d4 3 @ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/thread/BUILDING,v0000664000076500007650000000161010702477015026116 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols libshout-2_0:1.1.1.1 libshout-2_0b3:1.1.1.1 libshout-2_0b2:1.1.1.1 libshout_2_0b1:1.1.1.1 libogg2-zerocopy:1.1.1.1.0.4 branch-beta2-rewrite:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.1 date 2001.09.10.02.26.33; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.26.33; author jack; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @defines that affect compilation _WIN32 this should be defined for Win32 platforms DEBUG_MUTEXES define this to turn on mutex debugging. this will log locks/unlocks. CHECK_MUTEXES (DEBUG_MUTEXES must also be defined) checks to make sure mutex operations make sense. ie, multi_mutex is locked when locking multiple mutexes, etc. THREAD_DEBUG (define to 1-4) turns on the thread.log logging @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/thread/README,v0000664000076500007650000000145410702477015025664 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols libshout-2_0:1.1.1.1 libshout-2_0b3:1.1.1.1 libshout-2_0b2:1.1.1.1 libshout_2_0b1:1.1.1.1 libogg2-zerocopy:1.1.1.1.0.4 branch-beta2-rewrite:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.1 date 2001.09.10.02.26.32; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.26.32; author jack; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @This is the cross platform thread and syncronization library. It depends on the avl library. This is a massively cleaned and picked through version of the code from the icecast 1.3.x base. It has not been heavily tested *YET*. But since it's just cleanups, it really shouldn't have that many problems. jack. 
@ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/thread/Makefile.am,v0000664000076500007650000000317510702477015027042 0ustar mhaggermhagger00000000000000head 1.4; access; symbols libshout-2_0:1.4 libshout-2_0b3:1.4 libshout-2_0b2:1.3 libshout_2_0b1:1.3 libogg2-zerocopy:1.1.1.1.0.4 branch-beta2-rewrite:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.4 date 2003.07.03.12.59.06; author brendan; state Exp; branches; next 1.3; 1.3 date 2003.03.09.22.56.46; author karl; state Exp; branches; next 1.2; 1.2 date 2003.03.08.00.46.58; author karl; state Exp; branches; next 1.1; 1.1 date 2001.09.10.02.26.32; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.26.32; author jack; state Exp; branches; next ; desc @@ 1.4 log @Get everthing in make dist, and remove bitkeeper cruft @ text @## Process this with automake to create Makefile.in AUTOMAKE_OPTIONS = foreign EXTRA_DIST = BUILDING COPYING README TODO noinst_LTLIBRARIES = libicethread.la noinst_HEADERS = thread.h libicethread_la_SOURCES = thread.c libicethread_la_CFLAGS = @@XIPH_CFLAGS@@ INCLUDES = -I$(srcdir)/.. debug: $(MAKE) all CFLAGS="@@DEBUG@@" profile: $(MAKE) all CFLAGS="@@PROFILE@@" @ 1.3 log @reduce include file namespace clutter for libshout and the associated smaller libs. @ text @d5 2 a13 3 # SCCS stuff (for BitKeeper) GET = true @ 1.2 log @more on the XIPH_CFLAGS. For the smaller libs like thread etc put any passed flags into the compiling rules. Also configure in libshout now sets up the XIPH_CFLAGS @ text @d11 1 a11 1 INCLUDES = -I$(srcdir)/../avl -I$(srcdir)/../log @ 1.1 log @Initial revision @ text @d9 1 d17 1 a17 1 $(MAKE) all CFLAGS="@@DEBUG@@" d20 1 a20 1 $(MAKE) all CFLAGS="@@PROFILE@@" @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/thread/thread.c,v0000664000076500007650000013154110702477015026420 0ustar mhaggermhagger00000000000000head 1.25; access; symbols libshout-2_0:1.24 libshout-2_0b3:1.24 libshout-2_0b2:1.24 libshout_2_0b1:1.24 libogg2-zerocopy:1.17.0.2 branch-beta2-rewrite:1.5.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @ * @; 1.25 date 2003.07.14.02.17.52; author brendan; state Exp; branches; next 1.24; 1.24 date 2003.03.15.02.10.18; author msmith; state Exp; branches; next 1.23; 1.23 date 2003.03.12.03.59.55; author karl; state Exp; branches; next 1.22; 1.22 date 2003.03.09.22.56.46; author karl; state Exp; branches; next 1.21; 1.21 date 2003.03.08.16.05.38; author karl; state Exp; branches; next 1.20; 1.20 date 2003.03.04.15.31.34; author msmith; state Exp; branches; next 1.19; 1.19 date 2003.01.17.09.01.04; author msmith; state Exp; branches; next 1.18; 1.18 date 2002.12.29.09.55.50; author msmith; state Exp; branches; next 1.17; 1.17 date 2002.11.22.13.00.44; author msmith; state Exp; branches; next 1.16; 1.16 date 2002.09.24.07.09.08; author msmith; state Exp; branches; next 1.15; 1.15 date 2002.08.16.14.23.17; author msmith; state Exp; branches; next 1.14; 1.14 date 2002.08.13.01.08.15; author msmith; state Exp; branches; next 1.13; 1.13 date 2002.08.10.03.22.44; author msmith; state Exp; branches; next 1.12; 1.12 date 2002.08.09.06.52.07; author msmith; state Exp; branches; next 1.11; 1.11 date 2002.08.05.14.48.03; author msmith; state Exp; branches; next 1.10; 1.10 date 2002.08.03.08.14.56; author msmith; state Exp; branches; next 1.9; 1.9 date 2002.04.30.06.50.47; author msmith; state Exp; branches; next 1.8; 1.8 date 2002.03.05.23.59.38; author jack; state Exp; branches; next 
1.7; 1.7 date 2002.02.08.03.51.19; author jack; state Exp; branches; next 1.6; 1.6 date 2002.02.07.01.04.09; author jack; state Exp; branches; next 1.5; 1.5 date 2001.10.21.02.04.27; author jack; state Exp; branches; next 1.4; 1.4 date 2001.10.20.22.27.52; author jack; state Exp; branches; next 1.3; 1.3 date 2001.10.20.05.35.30; author jack; state Exp; branches; next 1.2; 1.2 date 2001.10.20.03.39.10; author jack; state Exp; branches; next 1.1; 1.1 date 2001.09.10.02.26.33; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.26.33; author jack; state Exp; branches; next ; desc @@ 1.25 log @Assign LGP to thread module @ text @/* threads.c: Thread Abstraction Functions * * Copyright (c) 1999, 2000 the icecast team * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Library General Public * License as published by the Free Software Foundation; either * version 2 of the License, or (at your option) any later version. * * This library is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * Library General Public License for more details. * * You should have received a copy of the GNU Library General Public * License along with this library; if not, write to the Free * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ #ifdef HAVE_CONFIG_H #include #endif #include #include #include #include #include #include #include #ifndef _WIN32 #include #include #else #include #include #include #endif #include #include #include #ifdef THREAD_DEBUG #include #endif #ifdef _WIN32 #define __FUNCTION__ __FILE__ #endif #ifdef THREAD_DEBUG #define CATMODULE "thread" #define LOG_ERROR(y) log_write(_logid, 1, CATMODULE "/", __FUNCTION__, y) #define LOG_ERROR3(y, z1, z2, z3) log_write(_logid, 1, CATMODULE "/", __FUNCTION__, y, z1, z2, z3) #define LOG_ERROR7(y, z1, z2, z3, z4, z5, z6, z7) log_write(_logid, 1, CATMODULE "/", __FUNCTION__, y, z1, z2, z3, z4, z5, z6, z7) #define LOG_WARN(y) log_write(_logid, 2, CATMODULE "/", __FUNCTION__, y) #define LOG_WARN3(y, z1, z2, z3) log_write(_logid, 2, CATMODULE "/", __FUNCTION__, y, z1, z2, z3) #define LOG_WARN5(y, z1, z2, z3, z4, z5) log_write(_logid, 2, CATMODULE "/", __FUNCTION__, y, z1, z2, z3, z4, z5) #define LOG_WARN7(y, z1, z2, z3, z4, z5, z6, z7) log_write(_logid, 2, CATMODULE "/", __FUNCTION__, y, z1, z2, z3, z4, z5, z6, z7) #define LOG_INFO(y) log_write(_logid, 3, CATMODULE "/", __FUNCTION__, y) #define LOG_INFO4(y, z1, z2, z3, z4) log_write(_logid, 3, CATMODULE "/", __FUNCTION__, y, z1, z2, z3, z4) #define LOG_INFO5(y, z1, z2, z3, z4, z5) log_write(_logid, 3, CATMODULE "/", __FUNCTION__, y, z1, z2, z3, z4, z5) #define LOG_DEBUG(y) log_write(_logid, 4, CATMODULE "/", __FUNCTION__, y) #define LOG_DEBUG2(y, z1, z2) log_write(_logid, 4, CATMODULE "/", __FUNCTION__, y, z1, z2) #define LOG_DEBUG5(y, z1, z2, z3, z4, z5) log_write(_logid, 4, CATMODULE "/", __FUNCTION__, y, z1, z2, z3, z4, z5) #endif /* thread starting structure */ typedef struct thread_start_tag { /* the real start routine and arg */ void *(*start_routine)(void *); void *arg; /* whether to create the threaded in detached state */ int detached; /* the other stuff we need to make sure this thread is inserted into ** the thread tree */ thread_type *thread; pthread_t sys_thread; } thread_start_t; static long _next_thread_id = 0; static int _initialized = 0; static avl_tree *_threadtree 
= NULL; #ifdef DEBUG_MUTEXES static mutex_t _threadtree_mutex = { -1, NULL, MUTEX_STATE_UNINIT, NULL, -1, PTHREAD_MUTEX_INITIALIZER}; #else static mutex_t _threadtree_mutex = { PTHREAD_MUTEX_INITIALIZER }; #endif #ifdef DEBUG_MUTEXES static int _logid = -1; static long _next_mutex_id = 0; static avl_tree *_mutextree = NULL; static mutex_t _mutextree_mutex = { -1, NULL, MUTEX_STATE_UNINIT, NULL, -1, PTHREAD_MUTEX_INITIALIZER}; #endif #ifdef DEBUG_MUTEXES static mutex_t _library_mutex = { -1, NULL, MUTEX_STATE_UNINIT, NULL, -1, PTHREAD_MUTEX_INITIALIZER}; #else static mutex_t _library_mutex = { PTHREAD_MUTEX_INITIALIZER }; #endif /* INTERNAL FUNCTIONS */ /* avl tree functions */ #ifdef DEBUG_MUTEXES static int _compare_mutexes(void *compare_arg, void *a, void *b); static int _free_mutex(void *key); #endif static int _compare_threads(void *compare_arg, void *a, void *b); static int _free_thread(void *key); static int _free_thread_if_detached(void *key); /* mutex fuctions */ static void _mutex_create(mutex_t *mutex); static void _mutex_lock(mutex_t *mutex); static void _mutex_unlock(mutex_t *mutex); /* misc thread stuff */ static void *_start_routine(void *arg); static void _catch_signals(void); static void _block_signals(void); /* LIBRARY INITIALIZATION */ void thread_initialize(void) { thread_type *thread; /* set up logging */ #ifdef THREAD_DEBUG log_initialize(); _logid = log_open("thread.log"); log_set_level(_logid, THREAD_DEBUG); #endif #ifdef DEBUG_MUTEXES /* create all the internal mutexes, and initialize the mutex tree */ _mutextree = avl_tree_new(_compare_mutexes, NULL); /* we have to create this one by hand, because there's no ** mutextree_mutex to lock yet! */ _mutex_create(&_mutextree_mutex); _mutextree_mutex.mutex_id = _next_mutex_id++; avl_insert(_mutextree, (void *)&_mutextree_mutex); #endif thread_mutex_create(&_threadtree_mutex); thread_mutex_create(&_library_mutex); /* initialize the thread tree and insert the main thread */ _threadtree = avl_tree_new(_compare_threads, NULL); thread = (thread_type *)malloc(sizeof(thread_type)); thread->thread_id = _next_thread_id++; thread->line = 0; thread->file = strdup("main.c"); thread->sys_thread = pthread_self(); thread->create_time = time(NULL); thread->name = strdup("Main Thread"); avl_insert(_threadtree, (void *)thread); _catch_signals(); _initialized = 1; } void thread_shutdown(void) { if (_initialized == 1) { thread_mutex_destroy(&_library_mutex); thread_mutex_destroy(&_threadtree_mutex); #ifdef THREAD_DEBUG thread_mutex_destroy(&_mutextree_mutex); avl_tree_free(_mutextree, _free_mutex); #endif avl_tree_free(_threadtree, _free_thread); } #ifdef THREAD_DEBUG log_close(_logid); log_shutdown(); #endif } /* * Signals should be handled by the main thread, nowhere else. 
* I'm using POSIX signal interface here, until someone tells me * that I should use signal/sigset instead * * This function only valid for non-Win32 */ static void _block_signals(void) { #ifndef _WIN32 sigset_t ss; sigfillset(&ss); /* These ones we want */ sigdelset(&ss, SIGKILL); sigdelset(&ss, SIGSTOP); sigdelset(&ss, SIGTERM); sigdelset(&ss, SIGSEGV); sigdelset(&ss, SIGBUS); if (pthread_sigmask(SIG_BLOCK, &ss, NULL) != 0) { #ifdef THREAD_DEBUG LOG_ERROR("Pthread_sigmask() failed for blocking signals"); #endif } #endif } /* * Let the calling thread catch all the relevant signals * * This function only valid for non-Win32 */ static void _catch_signals(void) { #ifndef _WIN32 sigset_t ss; sigemptyset(&ss); /* These ones should only be accepted by the signal handling thread (main thread) */ sigaddset(&ss, SIGHUP); sigaddset(&ss, SIGCHLD); sigaddset(&ss, SIGINT); sigaddset(&ss, SIGPIPE); if (pthread_sigmask(SIG_UNBLOCK, &ss, NULL) != 0) { #ifdef THREAD_DEBUG LOG_ERROR("pthread_sigmask() failed for catching signals!"); #endif } #endif } thread_type *thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int detached, int line, char *file) { int created; thread_type *thread; thread_start_t *start; thread = (thread_type *)malloc(sizeof(thread_type)); start = (thread_start_t *)malloc(sizeof(thread_start_t)); thread->line = line; thread->file = strdup(file); _mutex_lock(&_threadtree_mutex); thread->thread_id = _next_thread_id++; _mutex_unlock(&_threadtree_mutex); thread->name = strdup(name); thread->create_time = time(NULL); thread->detached = 0; start->start_routine = start_routine; start->arg = arg; start->thread = thread; start->detached = detached; created = 0; if (pthread_create(&thread->sys_thread, NULL, _start_routine, start) == 0) created = 1; #ifdef THREAD_DEBUG else LOG_ERROR("Could not create new thread"); #endif if (created == 0) { #ifdef THREAD_DEBUG LOG_ERROR("System won't let me create more threads, giving up"); #endif return NULL; } return thread; } /* _mutex_create ** ** creates a mutex */ static void _mutex_create(mutex_t *mutex) { #ifdef DEBUG_MUTEXES mutex->thread_id = MUTEX_STATE_NEVERLOCKED; mutex->line = -1; #endif pthread_mutex_init(&mutex->sys_mutex, NULL); } void thread_mutex_create_c(mutex_t *mutex, int line, char *file) { _mutex_create(mutex); #ifdef DEBUG_MUTEXES _mutex_lock(&_mutextree_mutex); mutex->mutex_id = _next_mutex_id++; avl_insert(_mutextree, (void *)mutex); _mutex_unlock(&_mutextree_mutex); #endif } void thread_mutex_destroy (mutex_t *mutex) { pthread_mutex_destroy(&mutex->sys_mutex); #ifdef DEBUG_MUTEXES _mutex_lock(&_mutextree_mutex); avl_delete(_mutextree, mutex, _free_mutex); _mutex_unlock(&_mutextree_mutex); #endif } void thread_mutex_lock_c(mutex_t *mutex, int line, char *file) { #ifdef DEBUG_MUTEXES thread_type *th = thread_self(); if (!th) LOG_WARN("No mt record for %u in lock [%s:%d]", thread_self(), file, line); LOG_DEBUG5("Locking %p (%s) on line %d in file %s by thread %d", mutex, mutex->name, line, file, th ? 
th->thread_id : -1); # ifdef CHECK_MUTEXES /* Just a little sanity checking to make sure that we're locking ** mutexes correctly */ if (th) { int locks = 0; avl_node *node; mutex_t *tmutex; _mutex_lock(&_mutextree_mutex); node = avl_get_first (_mutextree); while (node) { tmutex = (mutex_t *)node->key; if (tmutex->mutex_id == mutex->mutex_id) { if (tmutex->thread_id == th->thread_id) { /* Deadlock, same thread can't lock the same mutex twice */ LOG_ERROR7("DEADLOCK AVOIDED (%d == %d) on mutex [%s] in file %s line %d by thread %d [%s]", tmutex->thread_id, th->thread_id, mutex->name ? mutex->name : "undefined", file, line, th->thread_id, th->name); _mutex_unlock(&_mutextree_mutex); return; } } else if (tmutex->thread_id == th->thread_id) { /* Mutex locked by this thread (not this mutex) */ locks++; } node = avl_get_next(node); } if (locks > 0) { /* Has already got a mutex locked */ if (_multi_mutex.thread_id != th->thread_id) { /* Tries to lock two mutexes, but has not got the double mutex, norty boy! */ LOG_WARN("(%d != %d) Thread %d [%s] tries to lock a second mutex [%s] in file %s line %d, without locking double mutex!", _multi_mutex.thread_id, th->thread_id, th->thread_id, th->name, mutex->name ? mutex->name : "undefined", file, line); } } _mutex_unlock(&_mutextree_mutex); } # endif /* CHECK_MUTEXES */ _mutex_lock(mutex); _mutex_lock(&_mutextree_mutex); LOG_DEBUG2("Locked %p by thread %d", mutex, th ? th->thread_id : -1); mutex->line = line; if (th) { mutex->thread_id = th->thread_id; } _mutex_unlock(&_mutextree_mutex); #else _mutex_lock(mutex); #endif /* DEBUG_MUTEXES */ } void thread_mutex_unlock_c(mutex_t *mutex, int line, char *file) { #ifdef DEBUG_MUTEXES thread_type *th = thread_self(); if (!th) { LOG_ERROR3("No record for %u in unlock [%s:%d]", thread_self(), file, line); } LOG_DEBUG5("Unlocking %p (%s) on line %d in file %s by thread %d", mutex, mutex->name, line, file, th ? th->thread_id : -1); mutex->line = line; # ifdef CHECK_MUTEXES if (th) { int locks = 0; avl_node *node; mutex_t *tmutex; _mutex_lock(&_mutextree_mutex); while (node) { tmutex = (mutex_t *)node->key; if (tmutex->mutex_id == mutex->mutex_id) { if (tmutex->thread_id != th->thread_id) { LOG_ERROR7("ILLEGAL UNLOCK (%d != %d) on mutex [%s] in file %s line %d by thread %d [%s]", tmutex->thread_id, th->thread_id, mutex->name ? mutex->name : "undefined", file, line, th->thread_id, th->name); _mutex_unlock(&_mutextree_mutex); return; } } else if (tmutex->thread_id == th->thread_id) { locks++; } node = avl_get_next (node); } if ((locks > 0) && (_multi_mutex.thread_id != th->thread_id)) { /* Don't have double mutex, has more than this mutex left */ LOG_WARN("(%d != %d) Thread %d [%s] tries to unlock a mutex [%s] in file %s line %d, without owning double mutex!", _multi_mutex.thread_id, th->thread_id, th->thread_id, th->name, mutex->name ? mutex->name : "undefined", file, line); } _mutex_unlock(&_mutextree_mutex); } # endif /* CHECK_MUTEXES */ _mutex_unlock(mutex); _mutex_lock(&_mutextree_mutex); LOG_DEBUG2("Unlocked %p by thread %d", mutex, th ? 
th->thread_id : -1); mutex->line = -1; if (mutex->thread_id == th->thread_id) { mutex->thread_id = MUTEX_STATE_NOTLOCKED; } _mutex_unlock(&_mutextree_mutex); #else _mutex_unlock(mutex); #endif /* DEBUG_MUTEXES */ } void thread_cond_create_c(cond_t *cond, int line, char *file) { pthread_cond_init(&cond->sys_cond, NULL); pthread_mutex_init(&cond->cond_mutex, NULL); } void thread_cond_destroy(cond_t *cond) { pthread_mutex_destroy(&cond->cond_mutex); pthread_cond_destroy(&cond->sys_cond); } void thread_cond_signal_c(cond_t *cond, int line, char *file) { pthread_cond_signal(&cond->sys_cond); } void thread_cond_broadcast_c(cond_t *cond, int line, char *file) { pthread_cond_broadcast(&cond->sys_cond); } void thread_cond_timedwait_c(cond_t *cond, int millis, int line, char *file) { struct timespec time; time.tv_sec = millis/1000; time.tv_nsec = (millis - time.tv_sec*1000)*1000000; pthread_mutex_lock(&cond->cond_mutex); pthread_cond_timedwait(&cond->sys_cond, &cond->cond_mutex, &time); pthread_mutex_unlock(&cond->cond_mutex); } void thread_cond_wait_c(cond_t *cond, int line, char *file) { pthread_mutex_lock(&cond->cond_mutex); pthread_cond_wait(&cond->sys_cond, &cond->cond_mutex); pthread_mutex_unlock(&cond->cond_mutex); } void thread_rwlock_create_c(rwlock_t *rwlock, int line, char *file) { pthread_rwlock_init(&rwlock->sys_rwlock, NULL); } void thread_rwlock_destroy(rwlock_t *rwlock) { pthread_rwlock_destroy(&rwlock->sys_rwlock); } void thread_rwlock_rlock_c(rwlock_t *rwlock, int line, char *file) { pthread_rwlock_rdlock(&rwlock->sys_rwlock); } void thread_rwlock_wlock_c(rwlock_t *rwlock, int line, char *file) { pthread_rwlock_wrlock(&rwlock->sys_rwlock); } void thread_rwlock_unlock_c(rwlock_t *rwlock, int line, char *file) { pthread_rwlock_unlock(&rwlock->sys_rwlock); } void thread_exit_c(int val, int line, char *file) { thread_type *th = thread_self(); #if defined(DEBUG_MUTEXES) && defined(CHECK_MUTEXES) if (th) { avl_node *node; mutex_t *tmutex; char name[40]; _mutex_lock(&_mutextree_mutex); while (node) { tmutex = (mutex_t *)node->key; if (tmutex->thread_id == th->thread_id) { LOG_WARN("Thread %d [%s] exiting in file %s line %d, without unlocking mutex [%s]", th->thread_id, th->name, file, line, mutex_to_string(tmutex, name)); } node = avl_get_next (node); } _mutex_unlock(&_mutextree_mutex); } #endif if (th) { #ifdef THREAD_DEBUG LOG_INFO4("Removing thread %d [%s] started at [%s:%d], reason: 'Thread Exited'", th->thread_id, th->name, th->file, th->line); #endif _mutex_lock(&_threadtree_mutex); avl_delete(_threadtree, th, _free_thread_if_detached); _mutex_unlock(&_threadtree_mutex); } pthread_exit((void *)val); } /* sleep for a number of microseconds */ void thread_sleep(unsigned long len) { #ifdef _WIN32 Sleep(len / 1000); #else # ifdef HAVE_NANOSLEEP struct timespec time_sleep; struct timespec time_remaining; int ret; time_sleep.tv_sec = len / 1000000; time_sleep.tv_nsec = (len % 1000000) * 1000; ret = nanosleep(&time_sleep, &time_remaining); while (ret != 0 && errno == EINTR) { time_sleep.tv_sec = time_remaining.tv_sec; time_sleep.tv_nsec = time_remaining.tv_nsec; ret = nanosleep(&time_sleep, &time_remaining); } # else struct timeval tv; tv.tv_sec = len / 1000000; tv.tv_usec = (len % 1000000); select(0, NULL, NULL, NULL, &tv); # endif #endif } static void *_start_routine(void *arg) { thread_start_t *start = (thread_start_t *)arg; void *(*start_routine)(void *) = start->start_routine; void *real_arg = start->arg; thread_type *thread = start->thread; int detach = start->detached; 
_block_signals(); free(start); /* insert thread into thread tree here */ _mutex_lock(&_threadtree_mutex); thread->sys_thread = pthread_self(); avl_insert(_threadtree, (void *)thread); _mutex_unlock(&_threadtree_mutex); #ifdef THREAD_DEBUG LOG_INFO4("Added thread %d [%s] started at [%s:%d]", thread->thread_id, thread->name, thread->file, thread->line); #endif if (detach) { pthread_detach(thread->sys_thread); thread->detached = 1; } pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL); /* call the real start_routine and start the thread ** this should never exit! */ (start_routine)(real_arg); #ifdef THREAD_DEBUG LOG_WARN("Thread x should never exit from here!!!"); #endif return NULL; } thread_type *thread_self(void) { avl_node *node; thread_type *th; pthread_t sys_thread = pthread_self(); _mutex_lock(&_threadtree_mutex); if (_threadtree == NULL) { #ifdef THREAD_DEBUG LOG_WARN("Thread tree is empty, this must be wrong!"); #endif _mutex_unlock(&_threadtree_mutex); return NULL; } node = avl_get_first(_threadtree); while (node) { th = (thread_type *)node->key; if (th && pthread_equal(sys_thread, th->sys_thread)) { _mutex_unlock(&_threadtree_mutex); return th; } node = avl_get_next(node); } _mutex_unlock(&_threadtree_mutex); #ifdef THREAD_DEBUG LOG_ERROR("Nonexistant thread alive..."); #endif return NULL; } void thread_rename(const char *name) { thread_type *th; th = thread_self(); if (th->name) free(th->name); th->name = strdup(name); } static void _mutex_lock(mutex_t *mutex) { pthread_mutex_lock(&mutex->sys_mutex); } static void _mutex_unlock(mutex_t *mutex) { pthread_mutex_unlock(&mutex->sys_mutex); } void thread_library_lock(void) { _mutex_lock(&_library_mutex); } void thread_library_unlock(void) { _mutex_unlock(&_library_mutex); } void thread_join(thread_type *thread) { void *ret; int i; i = pthread_join(thread->sys_thread, &ret); _mutex_lock(&_threadtree_mutex); avl_delete(_threadtree, thread, _free_thread); _mutex_unlock(&_threadtree_mutex); } /* AVL tree functions */ #ifdef DEBUG_MUTEXES static int _compare_mutexes(void *compare_arg, void *a, void *b) { mutex_t *m1, *m2; m1 = (mutex_t *)a; m2 = (mutex_t *)b; if (m1->mutex_id > m2->mutex_id) return 1; if (m1->mutex_id < m2->mutex_id) return -1; return 0; } #endif static int _compare_threads(void *compare_arg, void *a, void *b) { thread_type *t1, *t2; t1 = (thread_type *)a; t2 = (thread_type *)b; if (t1->thread_id > t2->thread_id) return 1; if (t1->thread_id < t2->thread_id) return -1; return 0; } #ifdef DEBUG_MUTEXES static int _free_mutex(void *key) { mutex_t *m; m = (mutex_t *)key; if (m && m->file) { free(m->file); m->file = NULL; } /* all mutexes are static. don't need to free them */ return 1; } #endif static int _free_thread(void *key) { thread_type *t; t = (thread_type *)key; if (t->file) free(t->file); if (t->name) free(t->name); free(t); return 1; } static int _free_thread_if_detached(void *key) { thread_type *t = key; if(t->detached) return _free_thread(key); return 1; } @ 1.24 log @Brendan was getting pissed off about inconsistent indentation styles. Convert all tabs to 4 spaces. All code must now use 4 space indents. @ text @d1 18 a18 19 /* threads.c ** - Thread Abstraction Functions ** ** Copyright (c) 1999, 2000 the icecast team ** ** This program is free software; you can redistribute it and/or ** modify it under the terms of the GNU General Public License ** as published by the Free Software Foundation; either version 2 ** of the License, or (at your option) any latfer version. 
** ** This program is distributed in the hope that it will be useful, ** but WITHOUT ANY WARRANTY; without even the implied warranty of ** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ** GNU General Public License for more details. ** ** You should have received a copy of the GNU General Public License ** along with this program; if not, write to the Free Software ** Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */ @ 1.23 log @avoid freeing a thread structure a second time. @ text @d77 12 a88 12 /* the real start routine and arg */ void *(*start_routine)(void *); void *arg; /* whether to create the threaded in detached state */ int detached; /* the other stuff we need to make sure this thread is inserted into ** the thread tree */ thread_type *thread; pthread_t sys_thread; d146 1 a146 1 thread_type *thread; d148 1 a148 1 /* set up logging */ d151 3 a153 3 log_initialize(); _logid = log_open("thread.log"); log_set_level(_logid, THREAD_DEBUG); d157 1 a157 1 /* create all the internal mutexes, and initialize the mutex tree */ d159 1 a159 1 _mutextree = avl_tree_new(_compare_mutexes, NULL); d161 4 a164 4 /* we have to create this one by hand, because there's no ** mutextree_mutex to lock yet! */ _mutex_create(&_mutextree_mutex); d166 2 a167 2 _mutextree_mutex.mutex_id = _next_mutex_id++; avl_insert(_mutextree, (void *)&_mutextree_mutex); d170 2 a171 2 thread_mutex_create(&_threadtree_mutex); thread_mutex_create(&_library_mutex); d173 1 a173 1 /* initialize the thread tree and insert the main thread */ d175 1 a175 1 _threadtree = avl_tree_new(_compare_threads, NULL); d177 1 a177 1 thread = (thread_type *)malloc(sizeof(thread_type)); d179 6 a184 6 thread->thread_id = _next_thread_id++; thread->line = 0; thread->file = strdup("main.c"); thread->sys_thread = pthread_self(); thread->create_time = time(NULL); thread->name = strdup("Main Thread"); d186 1 a186 1 avl_insert(_threadtree, (void *)thread); d188 1 a188 1 _catch_signals(); d190 1 a190 1 _initialized = 1; d195 3 a197 3 if (_initialized == 1) { thread_mutex_destroy(&_library_mutex); thread_mutex_destroy(&_threadtree_mutex); d199 3 a201 3 thread_mutex_destroy(&_mutextree_mutex); avl_tree_free(_mutextree, _free_mutex); d203 2 a204 2 avl_tree_free(_threadtree, _free_thread); } d207 2 a208 2 log_close(_logid); log_shutdown(); d271 8 a278 12 int created; thread_type *thread; thread_start_t *start; thread = (thread_type *)malloc(sizeof(thread_type)); start = (thread_start_t *)malloc(sizeof(thread_start_t)); thread->line = line; thread->file = strdup(file); _mutex_lock(&_threadtree_mutex); thread->thread_id = _next_thread_id++; _mutex_unlock(&_threadtree_mutex); d280 6 a285 2 thread->name = strdup(name); thread->create_time = time(NULL); d288 4 a291 4 start->start_routine = start_routine; start->arg = arg; start->thread = thread; start->detached = detached; d293 3 a295 3 created = 0; if (pthread_create(&thread->sys_thread, NULL, _start_routine, start) == 0) created = 1; d297 2 a298 2 else LOG_ERROR("Could not create new thread"); d301 1 a301 1 if (created == 0) { d303 1 a303 1 LOG_ERROR("System won't let me create more threads, giving up"); d305 2 a306 2 return NULL; } d308 1 a308 1 return thread; d318 2 a319 2 mutex->thread_id = MUTEX_STATE_NEVERLOCKED; mutex->line = -1; d322 1 a322 1 pthread_mutex_init(&mutex->sys_mutex, NULL); d327 1 a327 1 _mutex_create(mutex); d330 4 a333 4 _mutex_lock(&_mutextree_mutex); mutex->mutex_id = _next_mutex_id++; avl_insert(_mutextree, (void *)mutex); 
_mutex_unlock(&_mutextree_mutex); d339 1 a339 1 pthread_mutex_destroy(&mutex->sys_mutex); d342 3 a344 3 _mutex_lock(&_mutextree_mutex); avl_delete(_mutextree, mutex, _free_mutex); _mutex_unlock(&_mutextree_mutex); d351 1 a351 1 thread_type *th = thread_self(); d353 1 a353 1 if (!th) LOG_WARN("No mt record for %u in lock [%s:%d]", thread_self(), file, line); d355 1 a355 1 LOG_DEBUG5("Locking %p (%s) on line %d in file %s by thread %d", mutex, mutex->name, line, file, th ? th->thread_id : -1); d358 44 a401 44 /* Just a little sanity checking to make sure that we're locking ** mutexes correctly */ if (th) { int locks = 0; avl_node *node; mutex_t *tmutex; _mutex_lock(&_mutextree_mutex); node = avl_get_first (_mutextree); while (node) { tmutex = (mutex_t *)node->key; if (tmutex->mutex_id == mutex->mutex_id) { if (tmutex->thread_id == th->thread_id) { /* Deadlock, same thread can't lock the same mutex twice */ LOG_ERROR7("DEADLOCK AVOIDED (%d == %d) on mutex [%s] in file %s line %d by thread %d [%s]", tmutex->thread_id, th->thread_id, mutex->name ? mutex->name : "undefined", file, line, th->thread_id, th->name); _mutex_unlock(&_mutextree_mutex); return; } } else if (tmutex->thread_id == th->thread_id) { /* Mutex locked by this thread (not this mutex) */ locks++; } node = avl_get_next(node); } if (locks > 0) { /* Has already got a mutex locked */ if (_multi_mutex.thread_id != th->thread_id) { /* Tries to lock two mutexes, but has not got the double mutex, norty boy! */ LOG_WARN("(%d != %d) Thread %d [%s] tries to lock a second mutex [%s] in file %s line %d, without locking double mutex!", _multi_mutex.thread_id, th->thread_id, th->thread_id, th->name, mutex->name ? mutex->name : "undefined", file, line); } } _mutex_unlock(&_mutextree_mutex); } d403 10 a412 10 _mutex_lock(mutex); _mutex_lock(&_mutextree_mutex); LOG_DEBUG2("Locked %p by thread %d", mutex, th ? th->thread_id : -1); mutex->line = line; if (th) { mutex->thread_id = th->thread_id; } d414 1 a414 1 _mutex_unlock(&_mutextree_mutex); d416 1 a416 1 _mutex_lock(mutex); d423 1 a423 1 thread_type *th = thread_self(); d425 3 a427 3 if (!th) { LOG_ERROR3("No record for %u in unlock [%s:%d]", thread_self(), file, line); } d429 1 a429 1 LOG_DEBUG5("Unlocking %p (%s) on line %d in file %s by thread %d", mutex, mutex->name, line, file, th ? th->thread_id : -1); d431 1 a431 1 mutex->line = line; d434 20 a453 30 if (th) { int locks = 0; avl_node *node; mutex_t *tmutex; _mutex_lock(&_mutextree_mutex); while (node) { tmutex = (mutex_t *)node->key; if (tmutex->mutex_id == mutex->mutex_id) { if (tmutex->thread_id != th->thread_id) { LOG_ERROR7("ILLEGAL UNLOCK (%d != %d) on mutex [%s] in file %s line %d by thread %d [%s]", tmutex->thread_id, th->thread_id, mutex->name ? mutex->name : "undefined", file, line, th->thread_id, th->name); _mutex_unlock(&_mutextree_mutex); return; } } else if (tmutex->thread_id == th->thread_id) { locks++; } node = avl_get_next (node); } if ((locks > 0) && (_multi_mutex.thread_id != th->thread_id)) { /* Don't have double mutex, has more than this mutex left */ LOG_WARN("(%d != %d) Thread %d [%s] tries to unlock a mutex [%s] in file %s line %d, without owning double mutex!", _multi_mutex.thread_id, th->thread_id, th->thread_id, th->name, mutex->name ? mutex->name : "undefined", file, line); } d455 12 a466 2 _mutex_unlock(&_mutextree_mutex); } d469 1 a469 1 _mutex_unlock(mutex); d471 1 a471 1 _mutex_lock(&_mutextree_mutex); d473 5 a477 5 LOG_DEBUG2("Unlocked %p by thread %d", mutex, th ? 
th->thread_id : -1); mutex->line = -1; if (mutex->thread_id == th->thread_id) { mutex->thread_id = MUTEX_STATE_NOTLOCKED; } d479 1 a479 1 _mutex_unlock(&_mutextree_mutex); d481 1 a481 1 _mutex_unlock(mutex); d487 2 a488 2 pthread_cond_init(&cond->sys_cond, NULL); pthread_mutex_init(&cond->cond_mutex, NULL); d493 2 a494 2 pthread_mutex_destroy(&cond->cond_mutex); pthread_cond_destroy(&cond->sys_cond); d499 1 a499 1 pthread_cond_signal(&cond->sys_cond); d504 1 a504 1 pthread_cond_broadcast(&cond->sys_cond); d521 3 a523 3 pthread_mutex_lock(&cond->cond_mutex); pthread_cond_wait(&cond->sys_cond, &cond->cond_mutex); pthread_mutex_unlock(&cond->cond_mutex); d528 1 a528 1 pthread_rwlock_init(&rwlock->sys_rwlock, NULL); d533 1 a533 1 pthread_rwlock_destroy(&rwlock->sys_rwlock); d538 1 a538 1 pthread_rwlock_rdlock(&rwlock->sys_rwlock); d543 1 a543 1 pthread_rwlock_wrlock(&rwlock->sys_rwlock); d548 1 a548 1 pthread_rwlock_unlock(&rwlock->sys_rwlock); d553 1 a553 1 thread_type *th = thread_self(); d556 14 a569 4 if (th) { avl_node *node; mutex_t *tmutex; char name[40]; d571 2 a572 4 _mutex_lock(&_mutextree_mutex); while (node) { tmutex = (mutex_t *)node->key; d574 2 a575 10 if (tmutex->thread_id == th->thread_id) { LOG_WARN("Thread %d [%s] exiting in file %s line %d, without unlocking mutex [%s]", th->thread_id, th->name, file, line, mutex_to_string(tmutex, name)); } node = avl_get_next (node); } _mutex_unlock(&_mutextree_mutex); } d577 2 a578 2 if (th) { d580 1 a580 1 LOG_INFO4("Removing thread %d [%s] started at [%s:%d], reason: 'Thread Exited'", th->thread_id, th->name, th->file, th->line); d583 6 a588 6 _mutex_lock(&_threadtree_mutex); avl_delete(_threadtree, th, _free_thread_if_detached); _mutex_unlock(&_threadtree_mutex); } pthread_exit((void *)val); d595 1 a595 1 Sleep(len / 1000); d598 14 a611 14 struct timespec time_sleep; struct timespec time_remaining; int ret; time_sleep.tv_sec = len / 1000000; time_sleep.tv_nsec = (len % 1000000) * 1000; ret = nanosleep(&time_sleep, &time_remaining); while (ret != 0 && errno == EINTR) { time_sleep.tv_sec = time_remaining.tv_sec; time_sleep.tv_nsec = time_remaining.tv_nsec; ret = nanosleep(&time_sleep, &time_remaining); } d613 1 a613 1 struct timeval tv; d615 2 a616 2 tv.tv_sec = len / 1000000; tv.tv_usec = (len % 1000000); d618 1 a618 1 select(0, NULL, NULL, NULL, &tv); d625 4 a628 4 thread_start_t *start = (thread_start_t *)arg; void *(*start_routine)(void *) = start->start_routine; void *real_arg = start->arg; thread_type *thread = start->thread; d631 1 a631 1 _block_signals(); d633 1 a633 1 free(start); d635 5 a639 5 /* insert thread into thread tree here */ _mutex_lock(&_threadtree_mutex); thread->sys_thread = pthread_self(); avl_insert(_threadtree, (void *)thread); _mutex_unlock(&_threadtree_mutex); d642 1 a642 1 LOG_INFO4("Added thread %d [%s] started at [%s:%d]", thread->thread_id, thread->name, thread->file, thread->line); d645 2 a646 2 if (detach) { pthread_detach(thread->sys_thread); d648 2 a649 2 } pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL); d651 4 a654 4 /* call the real start_routine and start the thread ** this should never exit! 
*/ (start_routine)(real_arg); d657 1 a657 1 LOG_WARN("Thread x should never exit from here!!!"); d660 1 a660 1 return NULL; d665 3 a667 3 avl_node *node; thread_type *th; pthread_t sys_thread = pthread_self(); d669 1 a669 1 _mutex_lock(&_threadtree_mutex); d671 1 a671 1 if (_threadtree == NULL) { d673 1 a673 1 LOG_WARN("Thread tree is empty, this must be wrong!"); d675 17 a691 17 _mutex_unlock(&_threadtree_mutex); return NULL; } node = avl_get_first(_threadtree); while (node) { th = (thread_type *)node->key; if (th && pthread_equal(sys_thread, th->sys_thread)) { _mutex_unlock(&_threadtree_mutex); return th; } node = avl_get_next(node); } _mutex_unlock(&_threadtree_mutex); d695 1 a695 1 LOG_ERROR("Nonexistant thread alive..."); d697 2 a698 2 return NULL; d703 1 a703 1 thread_type *th; d705 2 a706 2 th = thread_self(); if (th->name) free(th->name); d708 1 a708 1 th->name = strdup(name); d713 1 a713 1 pthread_mutex_lock(&mutex->sys_mutex); d718 1 a718 1 pthread_mutex_unlock(&mutex->sys_mutex); d724 1 a724 1 _mutex_lock(&_library_mutex); d729 1 a729 1 _mutex_unlock(&_library_mutex); d734 2 a735 2 void *ret; int i; d737 1 a737 1 i = pthread_join(thread->sys_thread, &ret); d748 1 a748 1 mutex_t *m1, *m2; d750 2 a751 2 m1 = (mutex_t *)a; m2 = (mutex_t *)b; d753 5 a757 5 if (m1->mutex_id > m2->mutex_id) return 1; if (m1->mutex_id < m2->mutex_id) return -1; return 0; d763 1 a763 1 thread_type *t1, *t2; d765 2 a766 2 t1 = (thread_type *)a; t2 = (thread_type *)b; d768 5 a772 5 if (t1->thread_id > t2->thread_id) return 1; if (t1->thread_id < t2->thread_id) return -1; return 0; d778 1 a778 1 mutex_t *m; d780 1 a780 1 m = (mutex_t *)key; d782 4 a785 4 if (m && m->file) { free(m->file); m->file = NULL; } d787 1 a787 1 /* all mutexes are static. don't need to free them */ d789 1 a789 1 return 1; d795 1 a795 1 thread_type *t; d797 1 a797 1 t = (thread_type *)key; d799 4 a802 4 if (t->file) free(t->file); if (t->name) free(t->name); d804 1 a804 1 free(t); d806 1 a806 1 return 1; @ 1.22 log @reduce include file namespace clutter for libshout and the associated smaller libs. @ text @a740 1 _free_thread(thread); @ 1.21 log @include the automake config.h file if the application defines one @ text @d45 2 a46 2 #include "thread.h" #include "avl.h" d48 1 a48 1 #include "log.h" @ 1.20 log @Make various thread structures omit the bits only used in debug mode. Some of these are pretty heavily used, so saving 10-20 bytes each can be quite significant. No functional differences. @ text @d21 4 @ 1.19 log @Fix some warnings, fix cflags. @ text @a86 1 static int _logid = -1; d90 2 d94 4 d99 3 d103 1 d107 3 d112 3 d119 1 d121 3 a124 1 static int _free_mutex(void *key); d152 1 a161 1 #ifdef DEBUG_MUTEXES d194 1 d198 1 d313 1 d316 1 a521 2 static int rwlocknum = 0; d742 1 d756 1 d772 1 d788 1 @ 1.18 log @Rename thread_t to avoid problems on OS X @ text @d91 2 a92 1 static mutex_t _threadtree_mutex = { -1, NULL, MUTEX_STATE_UNINIT, NULL, -1 }; d96 4 a99 2 static mutex_t _mutextree_mutex = { -1, NULL, MUTEX_STATE_UNINIT, NULL, -1 }; static mutex_t _library_mutex = { -1, NULL, MUTEX_STATE_UNINIT, NULL, -1 }; @ 1.17 log @Lots of bugfixes contributed by Karl Heyes. 
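[Editor's note] The runs of "dN M" and "aN M" tokens that fill this ,v file are RCS reverse-delta edit scripts: each trunk delta is stored against the next newer revision, where "dN M" means "delete M lines starting at line N" and "aN M" means "append the M lines that follow after line N". The sketch below is a minimal, hypothetical illustration (in C) of applying one such command to an in-memory list of lines; the names rcs_apply, MAX_LINES and the sample data are invented for the example, and full ,v parsing plus the offset bookkeeping needed to apply a whole script are deliberately omitted.

    #include <stdio.h>

    #define MAX_LINES 1024

    /* Apply a single reverse-delta command to an array of line pointers.
     * cmd 'd': delete 'count' lines starting at 1-based line 'at'.
     * cmd 'a': insert 'count' lines from 'added' after 1-based line 'at'.
     * Returns the new number of lines. */
    static int rcs_apply(char *lines[], int nlines,
                         char cmd, int at, int count, char *added[])
    {
        int i;

        if (cmd == 'd') {
            for (i = at - 1; i + count < nlines; i++)
                lines[i] = lines[i + count];
            return nlines - count;
        }

        /* cmd == 'a' */
        for (i = nlines - 1; i >= at; i--)
            lines[i + count] = lines[i];
        for (i = 0; i < count; i++)
            lines[at + i] = added[i];
        return nlines + count;
    }

    int main(void)
    {
        char *lines[MAX_LINES] = { "alpha", "beta", "gamma" };
        char *patch[] = { "beta-rewritten" };
        int i, n = 3;

        n = rcs_apply(lines, n, 'd', 2, 1, NULL);   /* "d2 1": drop old line 2 */
        n = rcs_apply(lines, n, 'a', 1, 1, patch);  /* "a1 1": add one line after line 1 */

        for (i = 0; i < n; i++)
            printf("%s\n", lines[i]);
        return 0;
    }

cvs2svn (or the RCS/CVS tools it can be told to invoke) performs this reconstruction when it reads these test repositories; the sketch is only meant to make the delta notation in this fixture readable.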
@ text @d83 1 a83 1 thread_t *thread; d121 1 a121 1 thread_t *thread; d152 1 a152 1 thread = (thread_t *)malloc(sizeof(thread_t)); d241 2 a242 1 thread_t *thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int detached, int line, char *file) d245 1 a245 1 thread_t *thread; d248 1 a248 1 thread = (thread_t *)malloc(sizeof(thread_t)); d322 1 a322 1 thread_t *th = thread_self(); d394 1 a394 1 thread_t *th = thread_self(); d526 1 a526 1 thread_t *th = thread_self(); d601 1 a601 1 thread_t *thread = start->thread; d636 1 a636 1 thread_t *thread_self(void) d639 1 a639 1 thread_t *th; d655 1 a655 1 th = (thread_t *)node->key; d676 1 a676 1 thread_t *th; d705 1 a705 1 void thread_join(thread_t *thread) d735 1 a735 1 thread_t *t1, *t2; d737 2 a738 2 t1 = (thread_t *)a; t2 = (thread_t *)b; d765 1 a765 1 thread_t *t; d767 1 a767 1 t = (thread_t *)key; d781 1 a781 1 thread_t *t = key; @ 1.16 log @Bugfix: thread_join is often called after a thread has already exited, which it does using thread_exit(). thread_exit() was freeing the thread structure, so thread_join was using freed memory. Rearrange things so that if the thread is detached, the freeing happens in thread_join instead. @ text @d710 3 @ 1.15 log @Liberally sprinkle #ifdef THREAD_DEBUG around so libshout doesn't need to link with it. @ text @d105 1 d258 1 d556 1 a556 1 avl_delete(_threadtree, th, _free_thread); d619 1 d710 1 d775 7 @ 1.14 log @Timing fixes @ text @a40 1 #include "log.h" d43 3 d51 1 d69 1 d124 1 a125 2 #ifdef THREAD_DEBUG d180 1 a182 1 log_shutdown(); d205 2 a206 1 if (pthread_sigmask(SIG_BLOCK, &ss, NULL) != 0) d209 2 d231 2 a232 1 if (pthread_sigmask(SIG_UNBLOCK, &ss, NULL) != 0) d235 2 d266 1 d269 1 d272 1 d274 1 d549 1 d551 1 d611 1 d613 1 d625 1 d627 1 d641 1 d643 1 d663 1 d665 1 @ 1.13 log @Various cleanups @ text @d571 1 a571 1 tv.tv_usec = (len % 1000000) / 1000; @ 1.12 log @oddsock's xslt stats support, slightly cleaned up @ text @d231 1 a231 1 long thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int detached, int line, char *file) d262 1 a262 1 return -1; d265 1 a265 2 // return thread->thread_id; return thread->sys_thread; d678 1 a678 1 void thread_join(long thread) d683 1 a683 1 i = pthread_join(thread, &ret); @ 1.11 log @Cleaned up version of Ciaran Anscomb's relaying patch. @ text @a117 5 /* this must be called to init pthreads-win32 */ #ifdef _WIN32 ptw32_processInitialize(); #endif d127 1 a127 1 /* create all the interal mutexes, and initialize the mutex tree */ @ 1.10 log @Lots of patches committable now that my sound card works properly again. logging API changed slightly (I got sick of gcc warnings about deprecated features). resampling (for live input, not yet for reencoding) is in there. several patches from Karl Heyes have been incorporated. @ text @d468 12 d486 2 @ 1.9 log @Don't use start after freeing it in thread startup code. 
@ text @d50 16 a65 16 #define LOG_ERROR(y) log_write(_logid, 1, CATMODULE "/" __FUNCTION__, y) #define LOG_ERROR3(y, z1, z2, z3) log_write(_logid, 1, CATMODULE "/" __FUNCTION__, y, z1, z2, z3) #define LOG_ERROR7(y, z1, z2, z3, z4, z5, z6, z7) log_write(_logid, 1, CATMODULE "/" __FUNCTION__, y, z1, z2, z3, z4, z5, z6, z7) #define LOG_WARN(y) log_write(_logid, 2, CATMODULE "/" __FUNCTION__, y) #define LOG_WARN3(y, z1, z2, z3) log_write(_logid, 2, CATMODULE "/" __FUNCTION__, y, z1, z2, z3) #define LOG_WARN5(y, z1, z2, z3, z4, z5) log_write(_logid, 2, CATMODULE "/" __FUNCTION__, y, z1, z2, z3, z4, z5) #define LOG_WARN7(y, z1, z2, z3, z4, z5, z6, z7) log_write(_logid, 2, CATMODULE "/" __FUNCTION__, y, z1, z2, z3, z4, z5, z6, z7) #define LOG_INFO(y) log_write(_logid, 3, CATMODULE "/" __FUNCTION__, y) #define LOG_INFO4(y, z1, z2, z3, z4) log_write(_logid, 3, CATMODULE "/" __FUNCTION__, y, z1, z2, z3, z4) #define LOG_INFO5(y, z1, z2, z3, z4, z5) log_write(_logid, 3, CATMODULE "/" __FUNCTION__, y, z1, z2, z3, z4, z5) #define LOG_DEBUG(y) log_write(_logid, 4, CATMODULE "/" __FUNCTION__, y) #define LOG_DEBUG2(y, z1, z2) log_write(_logid, 4, CATMODULE "/" __FUNCTION__, y, z1, z2) #define LOG_DEBUG5(y, z1, z2, z3, z4, z5) log_write(_logid, 4, CATMODULE "/" __FUNCTION__, y, z1, z2, z3, z4, z5) a257 1 @ 1.8 log @win32 patches from Ed @ text @d577 1 d591 1 a591 1 if (start->detached) { @ 1.7 log @More win32 fixes. @ text @d28 2 d36 1 a38 2 #include d46 1 a46 1 #define __FUNCTION__ __FILE__ ":" __LINE__ @ 1.6 log @minor build fixes for win32 courtesy of Oddsock @ text @d117 5 @ 1.5 log @Revert the stacksize work. It's stupid. The original patch from Ben Laurie some years ago was needed because FreeBSD's default stack size was < 8k and this wasn't acceptable. Both Linux and Solaris had reasonable defaults for stacksize, or grew the stack as needed to a reasonable size. Testing today and consulting documentation shows that the default stack sizes on FreeBSD, Linux, and Solaris are all acceptable. Linux can grow to 2MB, 32bit Solaris defaults to 1MB, 64bit Solaris defaults to 2MB, and FreeBSD defaults to 64k. In my opinion FreeBSD needs to get with the program and provide a reasonable default. 64k is enough for us, but might not be for others. @ text @d31 3 d43 4 @ 1.4 log @Stack size per thread needs to be configurable. Setting it on a global bases is not enough. ices and icecast need this to be different, and if one is interested in tuning memory usage, one will want to alter this per thread. @ text @d223 1 a223 1 long thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int stacksize, int detached, int line, char *file) a224 1 pthread_attr_t attr; a245 2 pthread_attr_init(&attr); pthread_attr_setstacksize(&attr, stacksize); d248 1 a248 1 if (pthread_create(&thread->sys_thread, &attr, _start_routine, start) == 0) a251 2 pthread_attr_destroy(&attr); @ 1.3 log @Win32 fixes. Specifically a header change and not using the gcc extensions for vararg macros. It's not as pretty, but it works. @ text @a58 3 /* INTERNAL DATA */ #define STACKSIZE 8192 d223 1 a223 1 long thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int detached, int line, char *file) d248 1 a248 1 pthread_attr_setstacksize(&attr, STACKSIZE); @ 1.2 log @Oddsock found this bug when working with icecast2 on freebsd. Nanoseconds were off by a few orders of magnitude. @ text @a26 1 #include d30 1 d42 16 a57 4 #define LOG_ERROR(y, z...) 
log_write(_logid, 1, CATMODULE "/" __FUNCTION__, y, ##z) #define LOG_WARN(y, z...) log_write(_logid, 2, CATMODULE "/" __FUNCTION__, y, ##z) #define LOG_INFO(y, z...) log_write(_logid, 3, CATMODULE "/" __FUNCTION__, y, ##z) #define LOG_DEBUG(y, z...) log_write(_logid, 4, CATMODULE "/" __FUNCTION__, y, ##z) d312 1 a312 1 LOG_DEBUG("Locking %p (%s) on line %d in file %s by thread %d", mutex, mutex->name, line, file, th ? th->thread_id : -1); d334 1 a334 1 LOG_ERROR("DEADLOCK AVOIDED (%d == %d) on mutex [%s] in file %s line %d by thread %d [%s]", d365 1 a365 1 LOG_DEBUG("Locked %p by thread %d", mutex, th ? th->thread_id : -1); d383 1 a383 1 LOG_ERROR("No record for %u in unlock [%s:%d]", thread_self(), file, line); d386 1 a386 1 LOG_DEBUG("Unlocking %p (%s) on line %d in file %s by thread %d", mutex, mutex->name, line, file, th ? th->thread_id : -1); d403 1 a403 1 LOG_ERROR("ILLEGAL UNLOCK (%d != %d) on mutex [%s] in file %s line %d by thread %d [%s]", tmutex->thread_id, th->thread_id, d430 1 a430 1 LOG_DEBUG("Unlocked %p by thread %d", mutex, th ? th->thread_id : -1); d524 1 a524 1 LOG_INFO("Removing thread %d [%s] started at [%s:%d], reason: 'Thread Exited'", th->thread_id, th->name, th->file, th->line); d583 1 a583 1 LOG_INFO("Added thread %d [%s] started at [%s:%d]", thread->thread_id, thread->name, thread->file, thread->line); @ 1.1 log @Initial revision @ text @d534 1 a534 1 time_sleep.tv_nsec = len % 1000000; @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/thread/TODO,v0000664000076500007650000000123510702477015025471 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols libshout-2_0:1.1.1.1 libshout-2_0b3:1.1.1.1 libshout-2_0b2:1.1.1.1 libshout_2_0b1:1.1.1.1 libogg2-zerocopy:1.1.1.1.0.4 branch-beta2-rewrite:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.1 date 2001.09.10.02.26.33; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.26.33; author jack; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @- make DEBUG_MUTEXES and CHECK_MUTEXES work - recursive locking/unlocking (easy) - reader/writer locking (easy) - make a mode were _log is disabled (normal mode) (easy) @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/thread/thread.h,v0000664000076500007650000003167210702477015026431 0ustar mhaggermhagger00000000000000head 1.13; access; symbols libshout-2_0:1.12 libshout-2_0b3:1.12 libshout-2_0b2:1.11 libshout_2_0b1:1.11 libogg2-zerocopy:1.7.0.2 branch-beta2-rewrite:1.4.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @ * @; 1.13 date 2003.07.14.02.17.52; author brendan; state Exp; branches; next 1.12; 1.12 date 2003.07.07.20.38.34; author brendan; state Exp; branches; next 1.11; 1.11 date 2003.03.15.02.10.18; author msmith; state Exp; branches; next 1.10; 1.10 date 2003.03.05.19.52.10; author brendan; state Exp; branches; next 1.9; 1.9 date 2003.03.04.15.31.34; author msmith; state Exp; branches; next 1.8; 1.8 date 2002.12.29.09.55.50; author msmith; state Exp; branches; next 1.7; 1.7 date 2002.09.24.07.09.08; author msmith; state Exp; branches; next 1.6; 1.6 date 2002.08.10.03.22.44; author msmith; state Exp; branches; next 1.5; 1.5 date 2002.08.05.14.48.04; author msmith; state Exp; branches; next 1.4; 1.4 date 2001.10.21.02.04.27; author jack; state Exp; branches; next 1.3; 1.3 date 2001.10.20.22.40.28; author jack; state Exp; branches; next 1.2; 1.2 date 2001.10.20.22.27.52; author jack; state Exp; branches; next 1.1; 
1.1 date 2001.09.10.02.26.33; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.26.33; author jack; state Exp; branches; next ; desc @@ 1.13 log @Assign LGP to thread module @ text @/* thread.h * - Thread Abstraction Function Headers * * Copyright (c) 1999, 2000 the icecast team * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Library General Public * License as published by the Free Software Foundation; either * version 2 of the License, or (at your option) any later version. * * This library is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * Library General Public License for more details. * * You should have received a copy of the GNU Library General Public * License along with this library; if not, write to the Free * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ #ifndef __THREAD_H__ #define __THREAD_H__ #include /* renamed from thread_t due to conflict on OS X */ typedef struct { /* the local id for the thread, and it's name */ long thread_id; char *name; /* the time the thread was created */ time_t create_time; /* the file and line which created this thread */ char *file; int line; /* is the thread running detached? */ int detached; /* the system specific thread */ pthread_t sys_thread; } thread_type; typedef struct { #ifdef DEBUG_MUTEXES /* the local id and name of the mutex */ long mutex_id; char *name; /* the thread which is currently locking this mutex */ long thread_id; /* the file and line where the mutex was locked */ char *file; int line; #endif /* the system specific mutex */ pthread_mutex_t sys_mutex; } mutex_t; typedef struct { #ifdef THREAD_DEBUG long cond_id; char *name; #endif pthread_mutex_t cond_mutex; pthread_cond_t sys_cond; } cond_t; typedef struct { #ifdef THREAD_DEBUG long rwlock_id; char *name; /* information on which thread and where in the code ** this rwlock was write locked */ long thread_id; char *file; int line; #endif pthread_rwlock_t sys_rwlock; } rwlock_t; #define thread_create(n,x,y,z) thread_create_c(n,x,y,z,__LINE__,__FILE__) #define thread_mutex_create(x) thread_mutex_create_c(x,__LINE__,__FILE__) #define thread_mutex_lock(x) thread_mutex_lock_c(x,__LINE__,__FILE__) #define thread_mutex_unlock(x) thread_mutex_unlock_c(x,__LINE__,__FILE__) #define thread_cond_create(x) thread_cond_create_c(x,__LINE__,__FILE__) #define thread_cond_signal(x) thread_cond_signal_c(x,__LINE__,__FILE__) #define thread_cond_broadcast(x) thread_cond_broadcast_c(x,__LINE__,__FILE__) #define thread_cond_wait(x) thread_cond_wait_c(x,__LINE__,__FILE__) #define thread_cond_timedwait(x,t) thread_cond_wait_c(x,t,__LINE__,__FILE__) #define thread_rwlock_create(x) thread_rwlock_create_c(x,__LINE__,__FILE__) #define thread_rwlock_rlock(x) thread_rwlock_rlock_c(x,__LINE__,__FILE__) #define thread_rwlock_wlock(x) thread_rwlock_wlock_c(x,__LINE__,__FILE__) #define thread_rwlock_unlock(x) thread_rwlock_unlock_c(x,__LINE__,__FILE__) #define thread_exit(x) thread_exit_c(x,__LINE__,__FILE__) #define MUTEX_STATE_NOTLOCKED -1 #define MUTEX_STATE_NEVERLOCKED -2 #define MUTEX_STATE_UNINIT -3 #define THREAD_DETACHED 1 #define THREAD_ATTACHED 0 #ifdef _mangle # define thread_initialize _mangle(thread_initialize) # define thread_initialize_with_log_id _mangle(thread_initialize_with_log_id) # define thread_shutdown _mangle(thread_shutdown) # define 
thread_create_c _mangle(thread_create_c) # define thread_mutex_create_c _mangle(thread_mutex_create) # define thread_mutex_lock_c _mangle(thread_mutex_lock_c) # define thread_mutex_unlock_c _mangle(thread_mutex_unlock_c) # define thread_mutex_destroy _mangle(thread_mutex_destroy) # define thread_cond_create_c _mangle(thread_cond_create_c) # define thread_cond_signal_c _mangle(thread_cond_signal_c) # define thread_cond_broadcast_c _mangle(thread_cond_broadcast_c) # define thread_cond_wait_c _mangle(thread_cond_wait_c) # define thread_cond_timedwait_c _mangle(thread_cond_timedwait_c) # define thread_cond_destroy _mangle(thread_cond_destroy) # define thread_rwlock_create_c _mangle(thread_rwlock_create_c) # define thread_rwlock_rlock_c _mangle(thread_rwlock_rlock_c) # define thread_rwlock_wlock_c _mangle(thread_rwlock_wlock_c) # define thread_rwlock_unlock_c _mangle(thread_rwlock_unlock_c) # define thread_rwlock_destroy _mangle(thread_rwlock_destroy) # define thread_exit_c _mangle(thread_exit_c) # define thread_sleep _mangle(thread_sleep) # define thread_library_lock _mangle(thread_library_lock) # define thread_library_unlock _mangle(thread_library_unlock) # define thread_self _mangle(thread_self) # define thread_rename _mangle(thread_rename) # define thread_join _mangle(thread_join) #endif /* init/shutdown of the library */ void thread_initialize(void); void thread_initialize_with_log_id(int log_id); void thread_shutdown(void); /* creation, destruction, locking, unlocking, signalling and waiting */ thread_type *thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int detached, int line, char *file); void thread_mutex_create_c(mutex_t *mutex, int line, char *file); void thread_mutex_lock_c(mutex_t *mutex, int line, char *file); void thread_mutex_unlock_c(mutex_t *mutex, int line, char *file); void thread_mutex_destroy(mutex_t *mutex); void thread_cond_create_c(cond_t *cond, int line, char *file); void thread_cond_signal_c(cond_t *cond, int line, char *file); void thread_cond_broadcast_c(cond_t *cond, int line, char *file); void thread_cond_wait_c(cond_t *cond, int line, char *file); void thread_cond_timedwait_c(cond_t *cond, int millis, int line, char *file); void thread_cond_destroy(cond_t *cond); void thread_rwlock_create_c(rwlock_t *rwlock, int line, char *file); void thread_rwlock_rlock_c(rwlock_t *rwlock, int line, char *file); void thread_rwlock_wlock_c(rwlock_t *rwlock, int line, char *file); void thread_rwlock_unlock_c(rwlock_t *rwlock, int line, char *file); void thread_rwlock_destroy(rwlock_t *rwlock); void thread_exit_c(int val, int line, char *file); /* sleeping */ void thread_sleep(unsigned long len); /* for using library functions which aren't threadsafe */ void thread_library_lock(void); void thread_library_unlock(void); #define PROTECT_CODE(code) { thread_library_lock(); code; thread_library_unlock(); } /* thread information functions */ thread_type *thread_self(void); /* renames current thread */ void thread_rename(const char *name); /* waits until thread_exit is called for another thread */ void thread_join(thread_type *thread); #endif /* __THREAD_H__ */ @ 1.12 log @The last of the convenience lib cleanups. A little forethought in designing a keyboard macro made this one a lot easier. 
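[Editor's note] The thread.h interface above is normally used through its wrapper macros, which splice the caller's __LINE__ and __FILE__ into the *_c functions so that the debug paths in threads.c can report where a thread was created or a mutex locked. Below is a minimal usage sketch, assuming thread.h and threads.c above are built together with their avl/log support code; the worker() routine, the shared counter, and the program itself are hypothetical and exist only to show the calling convention.

    #include <stdio.h>
    #include "thread.h"

    static mutex_t counter_lock;
    static int counter = 0;

    /* Hypothetical worker: bump a shared counter under the mutex, then exit. */
    static void *worker(void *arg)
    {
        (void)arg;
        thread_mutex_lock(&counter_lock);   /* expands to thread_mutex_lock_c(..., __LINE__, __FILE__) */
        counter++;
        thread_mutex_unlock(&counter_lock);
        thread_exit(0);
        return NULL;
    }

    int main(void)
    {
        thread_type *t;

        thread_initialize();
        thread_mutex_create(&counter_lock);

        t = thread_create("worker", worker, NULL, THREAD_ATTACHED);
        thread_join(t);
        printf("counter = %d\n", counter);

        thread_mutex_destroy(&counter_lock);
        thread_shutdown();
        return 0;
    }

Calls into non-thread-safe library routines would be wrapped the same way with PROTECT_CODE(...), which takes the library-wide lock around the enclosed statement.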
@ text @d4 1 a4 1 * Copyright (c) 1999, 2000 the icecast team d6 4 a9 13 * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version 2 * of the License, or (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. d11 8 @ 1.11 log @Brendan was getting pissed off about inconsistent indentation styles. Convert all tabs to 4 spaces. All code must now use 4 space indents. @ text @d114 29 a185 3 @ 1.10 log @Remove some namespace pollution @ text @d30 10 a39 10 /* the local id for the thread, and it's name */ long thread_id; char *name; /* the time the thread was created */ time_t create_time; /* the file and line which created this thread */ char *file; int line; d41 2 a42 2 /* is the thread running detached? */ int detached; d44 2 a45 2 /* the system specific thread */ pthread_t sys_thread; d50 10 a59 10 /* the local id and name of the mutex */ long mutex_id; char *name; /* the thread which is currently locking this mutex */ long thread_id; /* the file and line where the mutex was locked */ char *file; int line; d63 2 a64 2 /* the system specific mutex */ pthread_mutex_t sys_mutex; d69 2 a70 2 long cond_id; char *name; d73 2 a74 2 pthread_mutex_t cond_mutex; pthread_cond_t sys_cond; d79 2 a80 2 long rwlock_id; char *name; d82 6 a87 6 /* information on which thread and where in the code ** this rwlock was write locked */ long thread_id; char *file; int line; d90 1 a90 1 pthread_rwlock_t sys_rwlock; @ 1.9 log @Make various thread structures omit the bits only used in debug mode. Some of these are pretty heavily used, so saving 10-20 bytes each can be quite significant. No functional differences. @ text @d29 1 a29 1 typedef struct thread_tag { d41 2 a42 2 /* is the thread running detached? */ int detached; d48 1 a48 1 typedef struct mutex_tag { d67 1 a67 1 typedef struct cond_tag { d77 1 a77 1 typedef struct rwlock_tag { @ 1.8 log @Rename thread_t to avoid problems on OS X @ text @d49 1 d61 2 d68 1 d71 1 d78 1 d88 1 @ 1.7 log @Bugfix: thread_join is often called after a thread has already exited, which it does using thread_exit(). thread_exit() was freeing the thread structure, so thread_join was using freed memory. Rearrange things so that if the thread is detached, the freeing happens in thread_join instead. @ text @d27 2 d46 1 a46 1 } thread_t; d113 2 a114 1 thread_t *thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int detached, int line, char *file); d141 1 a141 1 thread_t *thread_self(void); d147 1 a147 1 void thread_join(thread_t *thread); @ 1.6 log @Various cleanups @ text @d39 3 @ 1.5 log @Cleaned up version of Ciaran Anscomb's relaying patch. @ text @d108 1 a108 1 long thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int detached, int line, char *file); d141 1 a141 1 void thread_join(long thread); @ 1.4 log @Revert the stacksize work. It's stupid. The original patch from Ben Laurie some years ago was needed because FreeBSD's default stack size was < 8k and this wasn't acceptable. 
Both Linux and Solaris had reasonable defaults for stacksize, or grew the stack as needed to a reasonable size. Testing today and consulting documentation shows that the default stack sizes on FreeBSD, Linux, and Solaris are all acceptable. Linux can grow to 2MB, 32bit Solaris defaults to 1MB, 64bit Solaris defaults to 2MB, and FreeBSD defaults to 64k. In my opinion FreeBSD needs to get with the program and provide a reasonable default. 64k is enough for us, but might not be for others. @ text @d89 1 d117 1 @ 1.3 log @Fix header definition. @ text @a26 2 #define THREAD_DEFAULT_STACKSIZE 8192 d81 1 a81 1 #define thread_create(n,w,x,y,z) thread_create_c(n,w,x,y,z,__LINE__,__FILE__) d107 1 a107 1 long thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int stacksize, int detached, int line, char *file); @ 1.2 log @Stack size per thread needs to be configurable. Setting it on a global bases is not enough. ices and icecast need this to be different, and if one is interested in tuning memory usage, one will want to alter this per thread. @ text @d109 1 a109 1 long thread_create_c(char *name, void *(*start_routine)(void *), void *arg, int detached, int line, char *file); @ 1.1 log @Initial revision @ text @d27 2 d83 1 a83 1 #define thread_create(n,x,y,z) thread_create_c(n,x,y,z,__LINE__,__FILE__) @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/thread/COPYING,v0000664000076500007650000006225610702477015026046 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols libshout-2_0:1.1.1.1 libshout-2_0b3:1.1.1.1 libshout-2_0b2:1.1.1.1 libshout_2_0b1:1.1.1.1 libogg2-zerocopy:1.1.1.1.0.4 branch-beta2-rewrite:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.1 date 2001.09.10.02.26.35; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.26.35; author jack; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @ GNU LIBRARY GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1991 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. [This is the first released version of the library GPL. It is numbered 2 because it goes with version 2 of the ordinary GPL.] Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This license, the Library General Public License, applies to some specially designated Free Software Foundation software, and to any other libraries whose authors decide to use it. You can use it for your libraries, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library, or if you modify it. 
For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link a program with the library, you must provide complete object files to the recipients so that they can relink them with the library, after making changes to the library and recompiling it. And you must show them these terms so they know their rights. Our method of protecting your rights has two steps: (1) copyright the library, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the library. Also, for each distributor's protection, we want to make certain that everyone understands that there is no warranty for this free library. If the library is modified by someone else and passed on, we want its recipients to know that what they have is not the original version, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that companies distributing free software will individually obtain patent licenses, thus in effect transforming the program into proprietary software. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. Most GNU software, including some libraries, is covered by the ordinary GNU General Public License, which was designed for utility programs. This license, the GNU Library General Public License, applies to certain designated libraries. This license is quite different from the ordinary one; be sure to read it in full, and don't assume that anything in it is the same as in the ordinary license. The reason we have a separate public license for some libraries is that they blur the distinction we usually make between modifying or adding to a program and simply using it. Linking a program with a library, without changing the library, is in some sense simply using the library, and is analogous to running a utility program or application program. However, in a textual and legal sense, the linked executable is a combined work, a derivative of the original library, and the ordinary General Public License treats it as such. Because of this blurred distinction, using the ordinary General Public License for libraries did not effectively promote software sharing, because most developers did not use the libraries. We concluded that weaker conditions might promote sharing better. However, unrestricted linking of non-free programs would deprive the users of those programs of all benefit from the free status of the libraries themselves. This Library General Public License is intended to permit developers of non-free programs to use free libraries, while preserving your freedom as a user of such programs to change the free libraries that are incorporated in them. (We have not seen how to achieve this as regards changes in header files, but we have achieved it as regards changes in the actual functions of the Library.) The hope is that this will lead to faster development of free libraries. The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, while the latter only works together with the library. 
Note that it is possible for a library to be covered by the ordinary General Public License rather than by this special one. GNU LIBRARY GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any software library which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Library General Public License (also called "this License"). Each licensee is addressed as "you". A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables. The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".) "Source code" for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library. Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does. 1. You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) The modified work must itself be a software library. b) You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change. c) You must cause the whole of the work to be licensed at no charge to all third parties under the terms of this License. d) If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful. 
(For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library. In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices. Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy. This option is useful when you wish to copy part of the code of the Library into a program that is not a library. 4. You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange. If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code. 5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License. However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. 
Section 6 states terms for distribution of such executables. When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law. If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.) Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself. 6. As an exception to the Sections above, you may also compile or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications. You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things: a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. (It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.) b) Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution. c) If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place. d) Verify that the user has already received a copy of these materials or that you have already sent this user a copy. For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. 
It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute. 7. You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted, and provided that you do these two things: a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above. b) Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. 8. You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 9. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it. 10. Each time you redistribute the Library (or any work based on the Library), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances. 
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 12. If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 13. The Free Software Foundation may publish revised and/or new versions of the Library General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation. 14. If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Libraries If you develop a new library, and you want it to be of the greatest possible use to the public, we recommend making it free software that everyone can redistribute and change. You can do so by permitting redistribution under these terms (or, alternatively, under the terms of the ordinary General Public License). To apply these terms, attach the following notices to the library. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This library is free software; you can redistribute it and/or modify it under the terms of the GNU Library General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public License for more details. You should have received a copy of the GNU Library General Public License along with this library; if not, write to the Free Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Also add information on how to contact you by electronic and paper mail. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the library, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the library `Frob' (a library for tweaking knobs) written by James Random Hacker. , 1 April 1990 Ty Coon, President of Vice That's all there is to it! @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/0000775000076500007650000000000012027373500024422 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/test.c,v0000664000076500007650000000572510702477014026023 0ustar mhaggermhagger00000000000000head 1.2; access; symbols libshout-2_0:1.2 libshout-2_0b3:1.2 libshout-2_0b2:1.2 libshout_2_0b1:1.2 libogg2-zerocopy:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @ * @; 1.2 date 2003.03.15.02.10.18; author msmith; state Exp; branches; next 1.1; 1.1 date 2001.09.10.02.28.49; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.28.49; author jack; state Exp; branches; next ; desc @@ 1.2 log @Brendan was getting pissed off about inconsistent indentation styles. Convert all tabs to 4 spaces. All code must now use 4 space indents. 
@ text @#include #include #include "httpp.h" int main(int argc, char **argv) { char buff[8192]; int readed; http_parser_t parser; avl_node *node; http_var_t *var; httpp_initialize(&parser, NULL); readed = fread(buff, 1, 8192, stdin); if (httpp_parse(&parser, buff, readed)) { printf("Parse succeeded...\n\n"); printf("Request was "); switch (parser.req_type) { case httpp_req_none: printf(" none\n"); break; case httpp_req_unknown: printf(" unknown\n"); break; case httpp_req_get: printf(" get\n"); break; case httpp_req_post: printf(" post\n"); break; case httpp_req_head: printf(" head\n"); break; } printf("Version was 1.%d\n", parser.version); node = avl_get_first(parser.vars); while (node) { var = (http_var_t *)node->key; if (var) printf("Iterating variable(s): %s = %s\n", var->name, var->value); node = avl_get_next(node); } } else { printf("Parse failed...\n"); } printf("Destroying parser...\n"); httpp_destroy(&parser); return 0; } @ 1.1 log @Initial revision @ text @d9 5 a13 5 char buff[8192]; int readed; http_parser_t parser; avl_node *node; http_var_t *var; d15 1 a15 1 httpp_initialize(&parser, NULL); d17 35 a51 35 readed = fread(buff, 1, 8192, stdin); if (httpp_parse(&parser, buff, readed)) { printf("Parse succeeded...\n\n"); printf("Request was "); switch (parser.req_type) { case httpp_req_none: printf(" none\n"); break; case httpp_req_unknown: printf(" unknown\n"); break; case httpp_req_get: printf(" get\n"); break; case httpp_req_post: printf(" post\n"); break; case httpp_req_head: printf(" head\n"); break; } printf("Version was 1.%d\n", parser.version); node = avl_get_first(parser.vars); while (node) { var = (http_var_t *)node->key; if (var) printf("Iterating variable(s): %s = %s\n", var->name, var->value); node = avl_get_next(node); } } else { printf("Parse failed...\n"); } d53 2 a54 2 printf("Destroying parser...\n"); httpp_destroy(&parser); d56 1 a56 1 return 0; @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/.cvsignore,v0000664000076500007650000000070210702477014026666 0ustar mhaggermhagger00000000000000head 1.2; access; symbols libshout-2_0:1.2 libshout-2_0b3:1.2 libshout-2_0b2:1.2 libshout_2_0b1:1.2 libogg2-zerocopy:1.2.0.2; locks; strict; comment @# @; 1.2 date 2001.09.10.03.04.10; author jack; state Exp; branches; next 1.1; 1.1 date 2001.09.10.03.00.40; author jack; state Exp; branches; next ; desc @@ 1.2 log @.cvsignore is fun! 
@ text @Makefile Makefile.in .deps .libs *.la *.lo @ 1.1 log @Still more .cvsignore @ text @d4 3 @ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/BUILDING,v0000664000076500007650000000102710702477014026007 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols libshout-2_0:1.1.1.1 libshout-2_0b3:1.1.1.1 libshout-2_0b2:1.1.1.1 libshout_2_0b1:1.1.1.1 libogg2-zerocopy:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.1 date 2001.09.10.02.28.49; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.28.49; author jack; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @defines that affect compilation none library dependencies uses avl @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/README,v0000664000076500007650000000106510702477014025551 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols libshout-2_0:1.1.1.1 libshout-2_0b3:1.1.1.1 libshout-2_0b2:1.1.1.1 libshout_2_0b1:1.1.1.1 libogg2-zerocopy:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @httpp is a simple http parser licensed under the lgpl created by jack moffitt @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/Makefile.am,v0000664000076500007650000000261110702477014026723 0ustar mhaggermhagger00000000000000head 1.3; access; symbols libshout-2_0:1.3 libshout-2_0b3:1.3 libshout-2_0b2:1.3 libshout_2_0b1:1.3 libogg2-zerocopy:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.3 date 2003.03.09.22.56.46; author karl; state Exp; branches; next 1.2; 1.2 date 2003.03.08.00.46.58; author karl; state Exp; branches; next 1.1; 1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches; next ; desc @@ 1.3 log @reduce include file namespace clutter for libshout and the associated smaller libs. @ text @## Process this with automake to create Makefile.in AUTOMAKE_OPTIONS = foreign noinst_LTLIBRARIES = libicehttpp.la noinst_HEADERS = httpp.h libicehttpp_la_SOURCES = httpp.c libicehttpp_la_CFLAGS = @@XIPH_CFLAGS@@ INCLUDES = -I$(srcdir)/.. # SCCS stuff (for BitKeeper) GET = true debug: $(MAKE) all CFLAGS="@@DEBUG@@" profile: $(MAKE) all CFLAGS="@@PROFILE@@" @ 1.2 log @more on the XIPH_CFLAGS. For the smaller libs like thread etc put any passed flags into the compiling rules. 
Also configure in libshout now sets up the XIPH_CFLAGS @ text @d11 1 a11 1 INCLUDES = -I$(srcdir)/../avl -I$(srcdir)/../thread @ 1.1 log @Initial revision @ text @d9 1 d17 1 a17 1 $(MAKE) all CFLAGS="@@DEBUG@@" d20 1 a20 1 $(MAKE) all CFLAGS="@@PROFILE@@" @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/TODO,v0000664000076500007650000000075210702477014025363 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols libshout-2_0:1.1.1.1 libshout-2_0b3:1.1.1.1 libshout-2_0b2:1.1.1.1 libshout_2_0b1:1.1.1.1 libogg2-zerocopy:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @- nothing i can think of @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/httpp.h,v0000664000076500007650000001110410702477014026174 0ustar mhaggermhagger00000000000000head 1.10; access; symbols libshout-2_0:1.10 libshout-2_0b3:1.10 libshout-2_0b2:1.9 libshout_2_0b1:1.9 libogg2-zerocopy:1.4.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @ * @; 1.10 date 2003.07.07.01.49.27; author brendan; state Exp; branches; next 1.9; 1.9 date 2003.03.15.02.10.18; author msmith; state Exp; branches; next 1.8; 1.8 date 2003.03.09.22.56.46; author karl; state Exp; branches; next 1.7; 1.7 date 2003.03.08.04.57.02; author msmith; state Exp; branches; next 1.6; 1.6 date 2003.01.16.05.48.31; author brendan; state Exp; branches; next 1.5; 1.5 date 2002.12.31.06.28.39; author msmith; state Exp; branches; next 1.4; 1.4 date 2002.08.16.14.22.44; author msmith; state Exp; branches; next 1.3; 1.3 date 2002.08.05.14.48.03; author msmith; state Exp; branches; next 1.2; 1.2 date 2002.05.03.15.04.56; author msmith; state Exp; branches; next 1.1; 1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches; next ; desc @@ 1.10 log @httpp goes through the rinse cycle @ text @/* httpp.h ** ** http parsing library */ #ifndef __HTTPP_H #define __HTTPP_H #include #define HTTPP_VAR_PROTOCOL "__protocol" #define HTTPP_VAR_VERSION "__version" #define HTTPP_VAR_URI "__uri" #define HTTPP_VAR_REQ_TYPE "__req_type" #define HTTPP_VAR_ERROR_MESSAGE "__errormessage" #define HTTPP_VAR_ERROR_CODE "__errorcode" #define HTTPP_VAR_ICYPASSWORD "__icy_password" typedef enum httpp_request_type_tag { httpp_req_none, httpp_req_get, httpp_req_post, httpp_req_head, httpp_req_source, httpp_req_play, httpp_req_stats, httpp_req_unknown } httpp_request_type_e; typedef struct http_var_tag { char *name; char *value; } http_var_t; typedef struct http_varlist_tag { http_var_t var; struct http_varlist_tag *next; } http_varlist_t; typedef struct http_parser_tag { httpp_request_type_e req_type; char *uri; avl_tree *vars; avl_tree *queryvars; } http_parser_t; #ifdef _mangle # define httpp_create_parser _mangle(httpp_create_parser) # define httpp_initialize _mangle(httpp_initialize) # define httpp_parse _mangle(httpp_parse) # define httpp_parse_icy _mangle(httpp_parse_icy) # define httpp_parse_response _mangle(httpp_parse_response) # define httpp_setvar _mangle(httpp_setvar) # define httpp_getvar _mangle(httpp_getvar) # define httpp_set_query_param _mangle(httpp_set_query_param) # define httpp_get_query_param _mangle(httpp_get_query_param) # define httpp_destroy _mangle(httpp_destroy) # define 
httpp_clear _mangle(httpp_clear) #endif http_parser_t *httpp_create_parser(void); void httpp_initialize(http_parser_t *parser, http_varlist_t *defaults); int httpp_parse(http_parser_t *parser, char *http_data, unsigned long len); int httpp_parse_icy(http_parser_t *parser, char *http_data, unsigned long len); int httpp_parse_response(http_parser_t *parser, char *http_data, unsigned long len, char *uri); void httpp_setvar(http_parser_t *parser, char *name, char *value); char *httpp_getvar(http_parser_t *parser, char *name); void httpp_set_query_param(http_parser_t *parser, char *name, char *value); char *httpp_get_query_param(http_parser_t *parser, char *name); void httpp_destroy(http_parser_t *parser); void httpp_clear(http_parser_t *parser); #endif @ 1.9 log @Brendan was getting pissed off about inconsistent indentation styles. Convert all tabs to 4 spaces. All code must now use 4 space indents. @ text @d41 14 @ 1.8 log @reduce include file namespace clutter for libshout and the associated smaller libs. @ text @d20 2 a21 2 httpp_req_none, httpp_req_get, httpp_req_post, httpp_req_head, httpp_req_source, httpp_req_play, httpp_req_stats, httpp_req_unknown d25 2 a26 2 char *name; char *value; d30 2 a31 2 http_var_t var; struct http_varlist_tag *next; d35 4 a38 4 httpp_request_type_e req_type; char *uri; avl_tree *vars; avl_tree *queryvars; @ 1.7 log @Added support for shoutcast login protocol (ewww...) @ text @d9 1 a9 1 #include "avl.h" @ 1.6 log @Indentation again, don't mind me @ text @d17 1 d44 1 @ 1.5 log @mp3 metadata complete. Still untested. @ text @d19 2 a20 1 httpp_req_none, httpp_req_get, httpp_req_post, httpp_req_head, httpp_req_source, httpp_req_play, httpp_req_stats, httpp_req_unknown d37 1 a37 1 avl_tree *queryvars; a51 3 @ 1.4 log @bugfixes for httpp_parse_response @ text @d36 1 d45 2 @ 1.3 log @Cleaned up version of Ciaran Anscomb's relaying patch. @ text @d16 1 @ 1.2 log @Memory leaks. Lots of little ones. 
@ text @d15 1 a33 1 d40 1 @ 1.1 log @Initial revision @ text @d43 1 @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/httpp.c,v0000664000076500007650000010430410702477014026174 0ustar mhaggermhagger00000000000000head 1.23; access; symbols libshout-2_0:1.23 libshout-2_0b3:1.23 libshout-2_0b2:1.22 libshout_2_0b1:1.22 libogg2-zerocopy:1.8.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @ * @; 1.23 date 2003.07.07.01.49.27; author brendan; state Exp; branches; next 1.22; 1.22 date 2003.06.18.15.52.25; author karl; state Exp; branches; next 1.21; 1.21 date 2003.06.18.11.13.11; author karl; state Exp; branches; next 1.20; 1.20 date 2003.06.09.22.30.09; author brendan; state Exp; branches; next 1.19; 1.19 date 2003.06.05.17.09.12; author brendan; state Exp; branches; next 1.18; 1.18 date 2003.03.15.02.10.18; author msmith; state Exp; branches; next 1.17; 1.17 date 2003.03.09.22.56.46; author karl; state Exp; branches; next 1.16; 1.16 date 2003.03.08.16.05.38; author karl; state Exp; branches; next 1.15; 1.15 date 2003.03.08.05.27.17; author msmith; state Exp; branches; next 1.14; 1.14 date 2003.03.08.04.57.02; author msmith; state Exp; branches; next 1.13; 1.13 date 2003.03.06.01.55.20; author brendan; state Exp; branches; next 1.12; 1.12 date 2003.01.17.09.01.04; author msmith; state Exp; branches; next 1.11; 1.11 date 2003.01.16.05.48.31; author brendan; state Exp; branches; next 1.10; 1.10 date 2003.01.15.23.46.56; author brendan; state Exp; branches; next 1.9; 1.9 date 2002.12.31.06.28.39; author msmith; state Exp; branches; next 1.8; 1.8 date 2002.08.16.14.22.44; author msmith; state Exp; branches; next 1.7; 1.7 date 2002.08.05.14.48.03; author msmith; state Exp; branches; next 1.6; 1.6 date 2002.05.03.15.04.56; author msmith; state Exp; branches; next 1.5; 1.5 date 2002.04.05.09.28.25; author msmith; state Exp; branches; next 1.4; 1.4 date 2002.02.11.09.11.18; author msmith; state Exp; branches; next 1.3; 1.3 date 2001.10.20.07.40.09; author jack; state Exp; branches; next 1.2; 1.2 date 2001.10.20.04.41.54; author jack; state Exp; branches; next 1.1; 1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.28.47; author jack; state Exp; branches; next ; desc @@ 1.23 log @httpp goes through the rinse cycle @ text @/* Httpp.c ** ** http parsing engine */ #ifdef HAVE_CONFIG_H #include #endif #include #include #include #include #ifdef HAVE_STRINGS_H #include #endif #include #include "httpp.h" #ifdef _WIN32 #define strcasecmp stricmp #endif #define MAX_HEADERS 32 /* internal functions */ /* misc */ static char *_lowercase(char *str); /* for avl trees */ static int _compare_vars(void *compare_arg, void *a, void *b); static int _free_vars(void *key); http_parser_t *httpp_create_parser(void) { return (http_parser_t *)malloc(sizeof(http_parser_t)); } void httpp_initialize(http_parser_t *parser, http_varlist_t *defaults) { http_varlist_t *list; parser->req_type = httpp_req_none; parser->uri = NULL; parser->vars = avl_tree_new(_compare_vars, NULL); parser->queryvars = avl_tree_new(_compare_vars, NULL); /* now insert the default variables */ list = defaults; while (list != NULL) { httpp_setvar(parser, list->var.name, list->var.value); list = list->next; } } static int split_headers(char *data, unsigned long len, char **line) { /* first we count how many lines there are ** and set up the line[] array */ int lines = 0; unsigned long i; line[lines] = data; for (i = 0; i < len && lines < MAX_HEADERS; i++) { if 
(data[i] == '\r') data[i] = '\0'; if (data[i] == '\n') { lines++; data[i] = '\0'; if (i + 1 < len) { if (data[i + 1] == '\n' || data[i + 1] == '\r') break; line[lines] = &data[i + 1]; } } } i++; while (data[i] == '\n') i++; return lines; } static void parse_headers(http_parser_t *parser, char **line, int lines) { int i,l; int whitespace, where, slen; char *name = NULL; char *value = NULL; /* parse the name: value lines. */ for (l = 1; l < lines; l++) { where = 0; whitespace = 0; name = line[l]; value = NULL; slen = strlen(line[l]); for (i = 0; i < slen; i++) { if (line[l][i] == ':') { whitespace = 1; line[l][i] = '\0'; } else { if (whitespace) { whitespace = 0; while (i < slen && line[l][i] == ' ') i++; if (i < slen) value = &line[l][i]; break; } } } if (name != NULL && value != NULL) { httpp_setvar(parser, _lowercase(name), value); name = NULL; value = NULL; } } } int httpp_parse_response(http_parser_t *parser, char *http_data, unsigned long len, char *uri) { char *data; char *line[MAX_HEADERS]; int lines, slen,i, whitespace=0, where=0,code; char *version=NULL, *resp_code=NULL, *message=NULL; if(http_data == NULL) return 0; /* make a local copy of the data, including 0 terminator */ data = (char *)malloc(len+1); if (data == NULL) return 0; memcpy(data, http_data, len); data[len] = 0; lines = split_headers(data, len, line); /* In this case, the first line contains: * VERSION RESPONSE_CODE MESSAGE, such as HTTP/1.0 200 OK */ slen = strlen(line[0]); version = line[0]; for(i=0; i < slen; i++) { if(line[0][i] == ' ') { line[0][i] = 0; whitespace = 1; } else if(whitespace) { whitespace = 0; where++; if(where == 1) resp_code = &line[0][i]; else { message = &line[0][i]; break; } } } if(version == NULL || resp_code == NULL || message == NULL) { free(data); return 0; } httpp_setvar(parser, HTTPP_VAR_ERROR_CODE, resp_code); code = atoi(resp_code); if(code < 200 || code >= 300) { httpp_setvar(parser, HTTPP_VAR_ERROR_MESSAGE, message); } httpp_setvar(parser, HTTPP_VAR_URI, uri); httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "NONE"); parse_headers(parser, line, lines); free(data); return 1; } static int hex(char c) { if(c >= '0' && c <= '9') return c - '0'; else if(c >= 'A' && c <= 'F') return c - 'A' + 10; else if(c >= 'a' && c <= 'f') return c - 'a' + 10; else return -1; } static char *url_escape(char *src) { int len = strlen(src); unsigned char *decoded; int i; char *dst; int done = 0; decoded = calloc(1, len + 1); dst = decoded; for(i=0; i < len; i++) { switch(src[i]) { case '%': if(i+2 >= len) { free(decoded); return NULL; } if(hex(src[i+1]) == -1 || hex(src[i+2]) == -1 ) { free(decoded); return NULL; } *dst++ = hex(src[i+1]) * 16 + hex(src[i+2]); i+= 2; break; case '#': done = 1; break; case 0: free(decoded); return NULL; break; default: *dst++ = src[i]; break; } if(done) break; } *dst = 0; /* null terminator */ return decoded; } /** TODO: This is almost certainly buggy in some cases */ static void parse_query(http_parser_t *parser, char *query) { int len; int i=0; char *key = query; char *val=NULL; if(!query || !*query) return; len = strlen(query); while(ireq_type = httpp_req_source; httpp_setvar(parser, HTTPP_VAR_URI, "/"); httpp_setvar(parser, HTTPP_VAR_ICYPASSWORD, line[0]); httpp_setvar(parser, HTTPP_VAR_PROTOCOL, "ICY"); httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "SOURCE"); /* This protocol is evil */ httpp_setvar(parser, HTTPP_VAR_VERSION, "666"); parse_headers(parser, line, lines); free(data); return 1; } int httpp_parse(http_parser_t *parser, char *http_data, unsigned long len) { char *data, 
*tmp; char *line[MAX_HEADERS]; /* limited to 32 lines, should be more than enough */ int i; int lines; char *req_type = NULL; char *uri = NULL; char *version = NULL; int whitespace, where, slen; if (http_data == NULL) return 0; /* make a local copy of the data, including 0 terminator */ data = (char *)malloc(len+1); if (data == NULL) return 0; memcpy(data, http_data, len); data[len] = 0; lines = split_headers(data, len, line); /* parse the first line special ** the format is: ** REQ_TYPE URI VERSION ** eg: ** GET /index.html HTTP/1.0 */ where = 0; whitespace = 0; slen = strlen(line[0]); req_type = line[0]; for (i = 0; i < slen; i++) { if (line[0][i] == ' ') { whitespace = 1; line[0][i] = '\0'; } else { /* we're just past the whitespace boundry */ if (whitespace) { whitespace = 0; where++; switch (where) { case 1: uri = &line[0][i]; break; case 2: version = &line[0][i]; break; } } } } if (strcasecmp("GET", req_type) == 0) { parser->req_type = httpp_req_get; } else if (strcasecmp("POST", req_type) == 0) { parser->req_type = httpp_req_post; } else if (strcasecmp("HEAD", req_type) == 0) { parser->req_type = httpp_req_head; } else if (strcasecmp("SOURCE", req_type) == 0) { parser->req_type = httpp_req_source; } else if (strcasecmp("PLAY", req_type) == 0) { parser->req_type = httpp_req_play; } else if (strcasecmp("STATS", req_type) == 0) { parser->req_type = httpp_req_stats; } else { parser->req_type = httpp_req_unknown; } if (uri != NULL && strlen(uri) > 0) { char *query; if((query = strchr(uri, '?')) != NULL) { *query = 0; query++; parse_query(parser, query); } parser->uri = strdup(uri); } else { free(data); return 0; } if ((version != NULL) && ((tmp = strchr(version, '/')) != NULL)) { tmp[0] = '\0'; if ((strlen(version) > 0) && (strlen(&tmp[1]) > 0)) { httpp_setvar(parser, HTTPP_VAR_PROTOCOL, version); httpp_setvar(parser, HTTPP_VAR_VERSION, &tmp[1]); } else { free(data); return 0; } } else { free(data); return 0; } if (parser->req_type != httpp_req_none && parser->req_type != httpp_req_unknown) { switch (parser->req_type) { case httpp_req_get: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "GET"); break; case httpp_req_post: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "POST"); break; case httpp_req_head: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "HEAD"); break; case httpp_req_source: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "SOURCE"); break; case httpp_req_play: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "PLAY"); break; case httpp_req_stats: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "STATS"); break; default: break; } } else { free(data); return 0; } if (parser->uri != NULL) { httpp_setvar(parser, HTTPP_VAR_URI, parser->uri); } else { free(data); return 0; } parse_headers(parser, line, lines); free(data); return 1; } void httpp_setvar(http_parser_t *parser, char *name, char *value) { http_var_t *var; if (name == NULL || value == NULL) return; var = (http_var_t *)malloc(sizeof(http_var_t)); if (var == NULL) return; var->name = strdup(name); var->value = strdup(value); if (httpp_getvar(parser, name) == NULL) { avl_insert(parser->vars, (void *)var); } else { avl_delete(parser->vars, (void *)var, _free_vars); avl_insert(parser->vars, (void *)var); } } char *httpp_getvar(http_parser_t *parser, char *name) { http_var_t var; http_var_t *found; void *fp; fp = &found; var.name = name; var.value = NULL; if (avl_get_by_key(parser->vars, &var, fp) == 0) return found->value; else return NULL; } void httpp_set_query_param(http_parser_t *parser, char *name, char *value) { http_var_t *var; if (name == NULL || value == 
NULL) return; var = (http_var_t *)malloc(sizeof(http_var_t)); if (var == NULL) return; var->name = strdup(name); var->value = url_escape(value); if (httpp_get_query_param(parser, name) == NULL) { avl_insert(parser->queryvars, (void *)var); } else { avl_delete(parser->queryvars, (void *)var, _free_vars); avl_insert(parser->queryvars, (void *)var); } } char *httpp_get_query_param(http_parser_t *parser, char *name) { http_var_t var; http_var_t *found; void *fp; fp = &found; var.name = name; var.value = NULL; if (avl_get_by_key(parser->queryvars, (void *)&var, fp) == 0) return found->value; else return NULL; } void httpp_clear(http_parser_t *parser) { parser->req_type = httpp_req_none; if (parser->uri) free(parser->uri); parser->uri = NULL; avl_tree_free(parser->vars, _free_vars); avl_tree_free(parser->queryvars, _free_vars); parser->vars = NULL; } void httpp_destroy(http_parser_t *parser) { httpp_clear(parser); free(parser); } static char *_lowercase(char *str) { char *p = str; for (; *p != '\0'; p++) *p = tolower(*p); return str; } static int _compare_vars(void *compare_arg, void *a, void *b) { http_var_t *vara, *varb; vara = (http_var_t *)a; varb = (http_var_t *)b; return strcmp(vara->name, varb->name); } static int _free_vars(void *key) { http_var_t *var; var = (http_var_t *)key; if (var->name) free(var->name); if (var->value) free(var->value); free(var); return 1; } @ 1.22 log @ermmm, let's use the right operator. @ text @d34 2 a35 2 int _compare_vars(void *compare_arg, void *a, void *b); int _free_vars(void *key); d556 1 a556 1 int _compare_vars(void *compare_arg, void *a, void *b) d566 1 a566 1 int _free_vars(void *key) @ 1.21 log @minor cleanup, removes compiler warning, makes it static, and doesn't re-evaluate string length each time. @ text @d550 1 a550 1 for (; *p |= '\0'; p++) @ 1.20 log @gcc 3.3 warns: dereferencing type-punned pointer will break strict-aliasing rules @ text @d31 1 a31 1 char *_lowercase(char *str); d547 1 a547 1 char *_lowercase(char *str) d549 3 a551 3 long i; for (i = 0; i < strlen(str); i++) str[i] = tolower(str[i]); @ 1.19 log @Karl's patch for freebsd, minus the sys/select.h test which breaks on OS X. Also enables IPV6 in libshout! @ text @d481 1 d483 1 d487 1 a487 1 if (avl_get_by_key(parser->vars, (void *)&var, (void **)&found) == 0) d518 1 d520 1 d524 1 a524 1 if (avl_get_by_key(parser->queryvars, (void *)&var, (void **)&found) == 0) @ 1.18 log @Brendan was getting pissed off about inconsistent indentation styles. Convert all tabs to 4 spaces. All code must now use 4 space indents. @ text @d15 3 @ 1.17 log @reduce include file namespace clutter for libshout and the associated smaller libs. 
@ text @d36 1 a36 1 return (http_parser_t *)malloc(sizeof(http_parser_t)); d41 1 a41 1 http_varlist_t *list; d43 11 a53 11 parser->req_type = httpp_req_none; parser->uri = NULL; parser->vars = avl_tree_new(_compare_vars, NULL); parser->queryvars = avl_tree_new(_compare_vars, NULL); /* now insert the default variables */ list = defaults; while (list != NULL) { httpp_setvar(parser, list->var.name, list->var.value); list = list->next; } d58 4 a61 4 /* first we count how many lines there are ** and set up the line[] array */ int lines = 0; d63 11 a73 11 line[lines] = data; for (i = 0; i < len && lines < MAX_HEADERS; i++) { if (data[i] == '\r') data[i] = '\0'; if (data[i] == '\n') { lines++; data[i] = '\0'; if (i + 1 < len) { if (data[i + 1] == '\n' || data[i + 1] == '\r') break; line[lines] = &data[i + 1]; d75 2 a76 2 } } d78 2 a79 2 i++; while (data[i] == '\n') i++; d87 35 a121 35 int whitespace, where, slen; char *name = NULL; char *value = NULL; /* parse the name: value lines. */ for (l = 1; l < lines; l++) { where = 0; whitespace = 0; name = line[l]; value = NULL; slen = strlen(line[l]); for (i = 0; i < slen; i++) { if (line[l][i] == ':') { whitespace = 1; line[l][i] = '\0'; } else { if (whitespace) { whitespace = 0; while (i < slen && line[l][i] == ' ') i++; if (i < slen) value = &line[l][i]; break; } } } if (name != NULL && value != NULL) { httpp_setvar(parser, _lowercase(name), value); name = NULL; value = NULL; } } d126 4 a129 4 char *data; char *line[MAX_HEADERS]; int lines, slen,i, whitespace=0, where=0,code; char *version=NULL, *resp_code=NULL, *message=NULL; d131 2 a132 2 if(http_data == NULL) return 0; d134 5 a138 39 /* make a local copy of the data, including 0 terminator */ data = (char *)malloc(len+1); if (data == NULL) return 0; memcpy(data, http_data, len); data[len] = 0; lines = split_headers(data, len, line); /* In this case, the first line contains: * VERSION RESPONSE_CODE MESSAGE, such as HTTP/1.0 200 OK */ slen = strlen(line[0]); version = line[0]; for(i=0; i < slen; i++) { if(line[0][i] == ' ') { line[0][i] = 0; whitespace = 1; } else if(whitespace) { whitespace = 0; where++; if(where == 1) resp_code = &line[0][i]; else { message = &line[0][i]; break; } } } if(version == NULL || resp_code == NULL || message == NULL) { free(data); return 0; } httpp_setvar(parser, HTTPP_VAR_ERROR_CODE, resp_code); code = atoi(resp_code); if(code < 200 || code >= 300) { httpp_setvar(parser, HTTPP_VAR_ERROR_MESSAGE, message); } d140 1 a140 2 httpp_setvar(parser, HTTPP_VAR_URI, uri); httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "NONE"); d142 20 a161 1 parse_headers(parser, line, lines); d163 4 a166 1 free(data); d168 14 a181 1 return 1; d186 8 a193 8 if(c >= '0' && c <= '9') return c - '0'; else if(c >= 'A' && c <= 'F') return c - 'A' + 10; else if(c >= 'a' && c <= 'f') return c - 'a' + 10; else return -1; d198 39 a236 39 int len = strlen(src); unsigned char *decoded; int i; char *dst; int done = 0; decoded = calloc(1, len + 1); dst = decoded; for(i=0; i < len; i++) { switch(src[i]) { case '%': if(i+2 >= len) { free(decoded); return NULL; } if(hex(src[i+1]) == -1 || hex(src[i+2]) == -1 ) { free(decoded); return NULL; } *dst++ = hex(src[i+1]) * 16 + hex(src[i+2]); i+= 2; break; case '#': done = 1; break; case 0: free(decoded); return NULL; break; default: *dst++ = src[i]; break; } if(done) break; } d238 1 a238 1 *dst = 0; /* null terminator */ d240 1 a240 1 return decoded; d246 29 a274 29 int len; int i=0; char *key = query; char *val=NULL; if(!query || !*query) return; len = strlen(query); 
while(ireq_type = httpp_req_get; } else if (strcasecmp("POST", req_type) == 0) { parser->req_type = httpp_req_post; } else if (strcasecmp("HEAD", req_type) == 0) { parser->req_type = httpp_req_head; } else if (strcasecmp("SOURCE", req_type) == 0) { parser->req_type = httpp_req_source; } else if (strcasecmp("PLAY", req_type) == 0) { parser->req_type = httpp_req_play; } else if (strcasecmp("STATS", req_type) == 0) { parser->req_type = httpp_req_stats; } else { parser->req_type = httpp_req_unknown; } if (uri != NULL && strlen(uri) > 0) { char *query; if((query = strchr(uri, '?')) != NULL) { *query = 0; query++; parse_query(parser, query); } d397 10 a406 2 parser->uri = strdup(uri); } else { d411 27 a437 48 if ((version != NULL) && ((tmp = strchr(version, '/')) != NULL)) { tmp[0] = '\0'; if ((strlen(version) > 0) && (strlen(&tmp[1]) > 0)) { httpp_setvar(parser, HTTPP_VAR_PROTOCOL, version); httpp_setvar(parser, HTTPP_VAR_VERSION, &tmp[1]); } else { free(data); return 0; } } else { free(data); return 0; } if (parser->req_type != httpp_req_none && parser->req_type != httpp_req_unknown) { switch (parser->req_type) { case httpp_req_get: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "GET"); break; case httpp_req_post: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "POST"); break; case httpp_req_head: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "HEAD"); break; case httpp_req_source: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "SOURCE"); break; case httpp_req_play: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "PLAY"); break; case httpp_req_stats: httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "STATS"); break; default: break; } } else { free(data); return 0; } if (parser->uri != NULL) { httpp_setvar(parser, HTTPP_VAR_URI, parser->uri); } else { free(data); return 0; } d439 6 a444 1 parse_headers(parser, line, lines); d446 3 a448 1 free(data); d450 1 a450 1 return 1; d455 1 a455 1 http_var_t *var; d457 2 a458 2 if (name == NULL || value == NULL) return; d460 2 a461 2 var = (http_var_t *)malloc(sizeof(http_var_t)); if (var == NULL) return; d463 9 a471 9 var->name = strdup(name); var->value = strdup(value); if (httpp_getvar(parser, name) == NULL) { avl_insert(parser->vars, (void *)var); } else { avl_delete(parser->vars, (void *)var, _free_vars); avl_insert(parser->vars, (void *)var); } d476 2 a477 2 http_var_t var; http_var_t *found; d479 2 a480 2 var.name = name; var.value = NULL; d482 4 a485 4 if (avl_get_by_key(parser->vars, (void *)&var, (void **)&found) == 0) return found->value; else return NULL; d490 1 a490 1 http_var_t *var; d492 2 a493 2 if (name == NULL || value == NULL) return; d495 2 a496 2 var = (http_var_t *)malloc(sizeof(http_var_t)); if (var == NULL) return; d498 9 a506 9 var->name = strdup(name); var->value = url_escape(value); if (httpp_get_query_param(parser, name) == NULL) { avl_insert(parser->queryvars, (void *)var); } else { avl_delete(parser->queryvars, (void *)var, _free_vars); avl_insert(parser->queryvars, (void *)var); } d511 2 a512 2 http_var_t var; http_var_t *found; d514 2 a515 2 var.name = name; var.value = NULL; d517 4 a520 4 if (avl_get_by_key(parser->queryvars, (void *)&var, (void **)&found) == 0) return found->value; else return NULL; d525 7 a531 7 parser->req_type = httpp_req_none; if (parser->uri) free(parser->uri); parser->uri = NULL; avl_tree_free(parser->vars, _free_vars); avl_tree_free(parser->queryvars, _free_vars); parser->vars = NULL; d536 2 a537 2 httpp_clear(parser); free(parser); d542 3 a544 3 long i; for (i = 0; i < strlen(str); i++) str[i] = tolower(str[i]); d546 1 a546 1 return str; 
d551 1 a551 1 http_var_t *vara, *varb; d553 2 a554 2 vara = (http_var_t *)a; varb = (http_var_t *)b; d556 1 a556 1 return strcmp(vara->name, varb->name); d561 1 a561 1 http_var_t *var; d563 1 a563 1 var = (http_var_t *)key; d565 5 a569 5 if (var->name) free(var->name); if (var->value) free(var->value); free(var); d571 1 a571 1 return 1; @ 1.16 log @include the automake config.h file if the application defines one @ text @d16 1 a16 1 #include "avl.h" @ 1.15 log @Set another parameter in the icy protocol parse that logging expects @ text @d6 4 @ 1.14 log @Added support for shoutcast login protocol (ewww...) @ text @d299 1 @ 1.13 log @Use gnu archive ACX_PTHREAD macro to figure out how to compile thread support. Also make it possible to build libshout without threads, albeit without locking in the resolver or avl trees. New option --disable-pthread too. @ text @d273 36 d387 4 a390 2 } else parser->uri = NULL; @ 1.12 log @Fix some warnings, fix cflags. @ text @a11 1 #include "thread.h" @ 1.11 log @Indentation again, don't mind me @ text @d58 2 a59 1 int i, lines = 0; @ 1.10 log @Make indentation consistent before doing other work @ text @d472 1 a472 1 var.value = NULL; @ 1.9 log @mp3 metadata complete. Still untested. @ text @d43 1 a43 1 parser->queryvars = avl_tree_new(_compare_vars, NULL); d122 4 a125 4 char *data; char *line[MAX_HEADERS]; int lines, slen,i, whitespace=0, where=0,code; char *version=NULL, *resp_code=NULL, *message=NULL; d127 2 a128 2 if(http_data == NULL) return 0; d134 1 a134 1 data[len] = 0; d136 1 a136 1 lines = split_headers(data, len, line); d138 20 a157 22 /* In this case, the first line contains: * VERSION RESPONSE_CODE MESSAGE, such as * HTTP/1.0 200 OK */ slen = strlen(line[0]); version = line[0]; for(i=0; i < slen; i++) { if(line[0][i] == ' ') { line[0][i] = 0; whitespace = 1; } else if(whitespace) { whitespace = 0; where++; if(where == 1) resp_code = &line[0][i]; else { message = &line[0][i]; break; } } } d159 4 a162 10 if(version == NULL || resp_code == NULL || message == NULL) { free(data); return 0; } httpp_setvar(parser, HTTPP_VAR_ERROR_CODE, resp_code); code = atoi(resp_code); if(code < 200 || code >= 300) { httpp_setvar(parser, HTTPP_VAR_ERROR_MESSAGE, message); } d164 7 a170 1 httpp_setvar(parser, HTTPP_VAR_URI, uri); d173 1 a173 1 parse_headers(parser, line, lines); d182 8 a189 8 if(c >= '0' && c <= '9') return c - '0'; else if(c >= 'A' && c <= 'F') return c - 'A' + 10; else if(c >= 'a' && c <= 'f') return c - 'a' + 10; else return -1; d194 39 a232 39 int len = strlen(src); unsigned char *decoded; int i; char *dst; int done = 0; decoded = calloc(1, len + 1); dst = decoded; for(i=0; i < len; i++) { switch(src[i]) { case '%': if(i+2 >= len) { free(decoded); return NULL; } if(hex(src[i+1]) == -1 || hex(src[i+2]) == -1 ) { free(decoded); return NULL; } *dst++ = hex(src[i+1]) * 16 + hex(src[i+2]); i+= 2; break; case '#': done = 1; break; case 0: free(decoded); return NULL; break; default: *dst++ = src[i]; break; } if(done) break; } d234 1 a234 1 *dst = 0; /* null terminator */ d236 1 a236 1 return decoded; d242 29 a270 30 int len; int i=0; char *key = query; char *val=NULL; if(!query || !*query) return; len = strlen(query); while(iuri != NULL) { d403 1 a403 1 parse_headers(parser, line, lines); d437 1 a437 1 var.value = NULL; d493 2 a494 2 httpp_clear(parser); free(parser); @ 1.8 log @bugfixes for httpp_parse_response @ text @d43 1 d182 94 d345 9 a353 1 if (uri != NULL && strlen(uri) > 0) d355 1 d442 1 a442 1 var.value = NULL; d450 35 d492 1 @ 1.7 log @Cleaned 
up version of Ciaran Anscomb's relaying patch. @ text @d165 1 a168 2 free(data); return 0; d172 1 a172 1 httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "RELAY"); @ 1.6 log @Memory leaks. Lots of little ones. @ text @d52 1 a52 1 int httpp_parse(http_parser_t *parser, char *http_data, unsigned long len) a53 21 char *data, *tmp; char *line[MAX_HEADERS]; /* limited to 32 lines, should be more than enough */ int i, l, retlen; int lines; char *req_type = NULL; char *uri = NULL; char *version = NULL; char *name = NULL; char *value = NULL; int whitespace, where; int slen; if (http_data == NULL) return 0; /* make a local copy of the data, including 0 terminator */ data = (char *)malloc(len+1); if (data == NULL) return 0; memcpy(data, http_data, len); data[len] = 0; d57 1 a57 1 lines = 0; d75 128 a202 1 retlen = i; d298 1 a298 1 if (parser->uri != NULL) { d305 1 a305 31 /* parse the name: value lines. */ for (l = 1; l < lines; l++) { where = 0; whitespace = 0; name = line[l]; value = NULL; slen = strlen(line[l]); for (i = 0; i < slen; i++) { if (line[l][i] == ':') { whitespace = 1; line[l][i] = '\0'; } else { if (whitespace) { whitespace = 0; while (i < slen && line[l][i] == ' ') i++; if (i < slen) value = &line[l][i]; break; } } } if (name != NULL && value != NULL) { httpp_setvar(parser, _lowercase(name), value); name = NULL; value = NULL; } } d309 1 a309 1 return retlen; @ 1.5 log @Buffer overflows. Requires a change to the format plugin interface - jack: if you want this done differently, feel free to change it (or ask me to). @ text @d271 1 a271 1 void httpp_destroy(http_parser_t *parser) d279 6 @ 1.4 log @Bunch of fixes: - connections are now matched to format plugins based on content-type headers, and are rejected if there isn't a format handler for that content-type, or there is no content-type at all. - format_vorbis now handles pages with granulepos of -1 in the headers correctly (this happens if the headers are fairly large, because of many comments, for example). - various #include fixes. - buffer overflow in httpp.c fixed. @ text @d6 2 d20 2 d55 1 a55 1 char *line[32]; /* limited to 32 lines, should be more than enough */ d80 1 a80 1 for (i = 0; i < len; i++) { @ 1.3 log @Thanks to Akos Maroy for this. These variables need to be uppercase always in order to comply with the HTTP specification. While not a problem internal to icecast, they were slipping into the log files and breaking some less-than-robust parsers. @ text @d65 2 a66 2 /* make a local copy of the data */ data = (char *)malloc(len); d69 1 d81 3 a83 3 if (i + 1 < len) if (data[i + 1] == '\n' || data[i + 1] == '\r') { data[i] = '\0'; a84 3 } data[i] = '\0'; if (i < len - 1) d86 1 @ 1.2 log @Win32 compatibility courtesy of Oddsock. 
@ text @d150 1 a150 1 httpp_setvar(parser, HTTPP_VAR_PROTOCOL, _lowercase(version)); d164 1 a164 1 httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "get"); d167 1 a167 1 httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "post"); d170 1 a170 1 httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "head"); d173 1 a173 1 httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "source"); d176 1 a176 1 httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "play"); d179 1 a179 1 httpp_setvar(parser, HTTPP_VAR_REQ_TYPE, "stats"); @ 1.1 log @Initial revision @ text @d14 4 d54 5 a58 4 char *req_type; char *uri; char *version; char *name, *value; @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/resync-misgroups-cvsrepos/httpp/COPYING,v0000664000076500007650000006221410702477014025727 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols libshout-2_0:1.1.1.1 libshout-2_0b3:1.1.1.1 libshout-2_0b2:1.1.1.1 libshout_2_0b1:1.1.1.1 libogg2-zerocopy:1.1.1.1.0.2 start:1.1.1.1 xiph:1.1.1; locks; strict; comment @# @; 1.1 date 2001.09.10.02.28.49; author jack; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.09.10.02.28.49; author jack; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @ GNU LIBRARY GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1991 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. [This is the first released version of the library GPL. It is numbered 2 because it goes with version 2 of the ordinary GPL.] Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This license, the Library General Public License, applies to some specially designated Free Software Foundation software, and to any other libraries whose authors decide to use it. You can use it for your libraries, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library, or if you modify it. For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link a program with the library, you must provide complete object files to the recipients so that they can relink them with the library, after making changes to the library and recompiling it. And you must show them these terms so they know their rights. Our method of protecting your rights has two steps: (1) copyright the library, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the library. 
Also, for each distributor's protection, we want to make certain that everyone understands that there is no warranty for this free library. If the library is modified by someone else and passed on, we want its recipients to know that what they have is not the original version, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that companies distributing free software will individually obtain patent licenses, thus in effect transforming the program into proprietary software. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. Most GNU software, including some libraries, is covered by the ordinary GNU General Public License, which was designed for utility programs. This license, the GNU Library General Public License, applies to certain designated libraries. This license is quite different from the ordinary one; be sure to read it in full, and don't assume that anything in it is the same as in the ordinary license. The reason we have a separate public license for some libraries is that they blur the distinction we usually make between modifying or adding to a program and simply using it. Linking a program with a library, without changing the library, is in some sense simply using the library, and is analogous to running a utility program or application program. However, in a textual and legal sense, the linked executable is a combined work, a derivative of the original library, and the ordinary General Public License treats it as such. Because of this blurred distinction, using the ordinary General Public License for libraries did not effectively promote software sharing, because most developers did not use the libraries. We concluded that weaker conditions might promote sharing better. However, unrestricted linking of non-free programs would deprive the users of those programs of all benefit from the free status of the libraries themselves. This Library General Public License is intended to permit developers of non-free programs to use free libraries, while preserving your freedom as a user of such programs to change the free libraries that are incorporated in them. (We have not seen how to achieve this as regards changes in header files, but we have achieved it as regards changes in the actual functions of the Library.) The hope is that this will lead to faster development of free libraries. The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, while the latter only works together with the library. Note that it is possible for a library to be covered by the ordinary General Public License rather than by this special one. GNU LIBRARY GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any software library which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Library General Public License (also called "this License"). Each licensee is addressed as "you". 
A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables. The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".) "Source code" for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library. Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does. 1. You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) The modified work must itself be a software library. b) You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change. c) You must cause the whole of the work to be licensed at no charge to all third parties under the terms of this License. d) If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful. (For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.) These requirements apply to the modified work as a whole. 
If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library. In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices. Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy. This option is useful when you wish to copy part of the code of the Library into a program that is not a library. 4. You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange. If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code. 5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License. However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. Section 6 states terms for distribution of such executables. When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. 
The threshold for this to be true is not precisely defined by law. If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.) Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself. 6. As an exception to the Sections above, you may also compile or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications. You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things: a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. (It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.) b) Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution. c) If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place. d) Verify that the user has already received a copy of these materials or that you have already sent this user a copy. For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute. 7. 
You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted, and provided that you do these two things: a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above. b) Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. 8. You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 9. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it. 10. Each time you redistribute the Library (or any work based on the Library), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. 
Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 12. If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 13. The Free Software Foundation may publish revised and/or new versions of the Library General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation. 14. If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Libraries If you develop a new library, and you want it to be of the greatest possible use to the public, we recommend making it free software that everyone can redistribute and change. You can do so by permitting redistribution under these terms (or, alternatively, under the terms of the ordinary General Public License). To apply these terms, attach the following notices to the library. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This library is free software; you can redistribute it and/or modify it under the terms of the GNU Library General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public License for more details. You should have received a copy of the GNU Library General Public License along with this library; if not, write to the Free Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Also add information on how to contact you by electronic and paper mail. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the library, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the library `Frob' (a library for tweaking knobs) written by James Random Hacker. , 1 April 1990 Ty Coon, President of Vice That's all there is to it! @ 1.1.1.1 log @move to cvs @ text @@ cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/0000775000076500007650000000000012027373500023277 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/README0000664000076500007650000002517310702477016024174 0ustar mhaggermhagger00000000000000This repository is for testing cvs2svn's pruning choices when filling symbolic names. The layout is: /one.txt two.txt /foo/three.txt four.txt /bar/five.txt six.txt Every file was imported in a standard way, then revisions 1.2 and 1.3 were committed on every file. Then this branch was made on four files (/one.txt, /two.txt, /foo/three.txt, /foo/four.txt): BRANCH: sprouts from 1.3 (so branch number 1.3.0.2) Then a revision was committed on that branch, creating revision 1.3.2.1 on those files. No branch was made on /bar/five.txt nor /bar/six.txt. Thus, when BRANCH is created, subdir /bar should be deleted entirely. But if you revert r1125 of cvs2svn.py, /bar won't be deleted. Search below for the word "Suppose" for more details. (Note that we still don't have a test for the proposed 'if not prune_ok:' conditional, though.) --------------------8-<-------cut-here---------8-<----------------------- From nobody Wed Jun 2 18:24:01 2004 Sender: kfogel@newton.ch.collab.net To: fitz@tigris.org Cc: commits@cvs2svn.tigris.org Subject: Re: cvs2svn commit: r1035 - branches/may-04-redesign References: <200405290539.i4T5dmM10064@morbius.ch.collab.net> From: kfogel@collab.net Reply-To: kfogel@collab.net X-Windows: putting new limits on productivity. 
Date: 02 Jun 2004 18:24:00 -0500 In-Reply-To: <200405290539.i4T5dmM10064@morbius.ch.collab.net> Message-ID: <85vfi9czn3.fsf@newton.ch.collab.net> Lines: 200 User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1.50 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii fitz@tigris.org writes: > Modified: branches/may-04-redesign/cvs2svn.py > ============================================================================== > --- branches/may-04-redesign/cvs2svn.py (original) > +++ branches/may-04-redesign/cvs2svn.py Sat May 29 00:39:43 2004 > @@ -4802,15 +4802,16 @@ > return dest > > def _fill(self, symbol_fill, key, name, > - parent_path_so_far=None, preferred_revnum=None): > + parent_path_so_far=None, preferred_revnum=None, prune_ok=None): > """Descends through all nodes in SYMBOL_FILL.node_tree that are > rooted at KEY, which is a string key into SYMBOL_FILL.node_tree. > Generates copy (and delete) commands for all destination nodes > that don't exist in NAME. > > - PARENT_PATH_SO_FAR is the parent directory of the path(s) that may > - be copied in this invocation of the method. If None, that means > - that our source path starts from the root of the repository. > + PARENT_PATH_SO_FAR is the parent directory of the source path(s) > + that may be copied in this invocation of the method. If None, > + that means that our source path starts from the root of the > + repository. > > PREFERRED_REVNUM is an int which is the source revision number > that the caller (who may have copied KEY's parent) used to > @@ -4818,9 +4819,15 @@ > is preferable to any other (which probably means that no copies > have happened yet). > > - PARENT_PATH_SO_FAR and PREFERRED_REVNUM should only be passed in > - by recursive calls.""" > + PRUNE_OK means that a copy has been made in this recursion, and > + it's safe to prune directories that are not in SYMBOL_FILL.node_tree. This documentation of PRUNE_OK might want to be a little more detailed, and make clear the cause-effect relationship between the first and second clauses. I.e., PRUNE_OK doesn't mean two unrelated things, it means one thing that happens to have an implication. Also, the pruning behavior applies to files as well as directories, no? (Maybe you were tempted to say "directories" because that's what the unrelated --prune flag to cvs2svn affects... Which is one reason not to call this new param 'prune_ok'; see later for another reason.) Anyway, I'm thinking something like this for the doc: PRUNE_OK means that a copy has already been made higher in this recursion, and that therefore it's safe to prune directories and files that do not appear in the part of SYMBOL_FILL.node_tree() governing this recursion. If that were the doc string, would it be accurate? > @@ -4839,7 +4846,7 @@ > src_revnum = None > # if our destination path doesn't already exist, then we may > # have to make a copy. > - if (dest_path is not None and not self.path_exists(dest_path)): > + if not self.path_exists(dest_path): > src_revnum = symbol_fill.get_best_revnum(key, preferred_revnum) > > # If the revnum of our parent's copy (src_revnum) is the same > @@ -4849,6 +4856,7 @@ > if src_revnum != preferred_revnum: > # Do the copy > new_entries = self.copy_path(src_path_so_far, dest_path, src_revnum) > + prune_ok = peer_path_unsafe_for_pruning = 1 > # Delete invalid entries that got swept in by the copy. 
> valid_entries = symbol_fill.node_tree[key] > bad_entries = self._get_invalid_entries(valid_entries, new_entries) > @@ -4857,8 +4865,25 @@ > del_path = dest_path + '/' + entry > self.delete_path(del_path) > > - self._fill(symbol_fill, key, name, src_path_so_far, src_revnum) > + self._fill(symbol_fill, key, name, src_path_so_far, src_revnum, prune_ok) > + > + if peer_path_unsafe_for_pruning: > + return > + # Any entries still present in the dest directory that we've just > + # created by copying, but not in > + # symbol_fill.node_tree[parent_key], don't belong and should be > + # deleted as well. If we haven't actually made a copy, do > + # nothing. > + if parent_path_so_far and prune_ok: > + this_path = self._dest_path_for_source_path(name, parent_path_so_far) > + ign, this_contents = self._node_for_path(this_path, self.youngest) > + expected_contents = symbol_fill.node_tree[parent_key] > + bad_entries = self._get_invalid_entries(expected_contents, this_contents) > + for entry in bad_entries: > + del_path = this_path + '/' + entry > > + print "FITZ: deleting path", del_path > + self.delete_path(del_path) Hmmm. Okay, I get the general idea here, and like it. There's something weird about 'peer_path_unsafe_for_pruning', though... Let me see if I can put my finger on it. Suppose we first make this copy: cp /trunk/foo/ @ r4 --> /branches/MYBRANCH/foo/ That sets 'peer_path_unsafe_for_pruning' to 1 in the stack frame for "/" (that is, the frame in which "/foo" is a child entry). Great. Suppose that foo@4 had two subdirs, "/foo/bar/" and "/foo/baz/", only the first of which is on this branch at all, albeit from r6 not r4. So, now we recurse down and copy "bar/" from r6: del /branches/MYBRANCH/foo/bar/ cp /trunk/foo/bar/ @ r6 --> /branches/MYBRANCH/foo/bar/ That sets 'peer_path_unsafe_for_pruning' to 1 in the stack frame for "/foo", of course, since "/foo/bar" is a child entry of that. But remember, earlier when we copied "/foo", we accidentally got "/foo/baz/" as well, which needs to be pruned out. We used to have code to prune that, but it's commented out now, see the comment that begins "###TODO OPTIMIZE: If we keep a list COPIED_PATHS of...". So the question is, who *will* prune "/foo/baz/"? All of the stack frames we've created in this fill have 'peer_path_unsafe_for_pruning' set to 1, so all of them will return early, without entering the blob of code at the end that is supposed to clean up these bad entries. Thus, I don't think anyone can prune "/foo/baz/", even though it clearly must be pruned. Another way of saying it is, that commented out loop that's supposed to be just an optimization is actually not an optimization -- it's correctness code. But I'm not saying the answer here is just to uncomment that loop. I think also 'peer_path_unsafe_for_pruning' should not get set *if* 'prune_ok' is already set! In other words prune_ok = peer_path_unsafe_for_pruning = 1 should be changed to if not prune_ok: prune_ok = peer_path_unsafe_for_pruning = 1 Because if 'prune_ok' is already set, then we're already in a recursive call after a copy, so peer paths are, in fact, safe for pruning, even though they are peer paths! (One way to think of it is that the name 'prune_ok' should be 'we_are_now_under_a_copy', or 'copy_happened', or something like that. What the param really indicates is that all dest paths being generated in this leg of the recursion are underneath a copy -- the fact that pruning is okay is merely one consequence of that situation.) Is this making any sense, or am I off my rocker? 
The comment right before the final blob of code confused me for a minute... # Any entries still present in the dest directory that we've just # created by copying, but not in # symbol_fill.node_tree[parent_key], don't belong and should be # deleted as well. If we haven't actually made a copy, do # nothing. ... because it talks about "any entries still present in the dest directory that we've just copied", yet 'parent_path_so_far' is *not* a directory we just copied, rather it's the parent of things we may or may not have copied. IOW, this code (the last bit of code in the function, right below the above comment)... if parent_path_so_far and prune_ok: this_path = self._dest_path_for_source_path(name, parent_path_so_far) ign, this_contents = self._node_for_path(this_path, self.youngest) expected_contents = symbol_fill.node_tree[parent_key] bad_entries = self._get_invalid_entries(expected_contents, this_contents) for entry in bad_entries: del_path = this_path + '/' + entry self.delete_path(del_path) ... cannot be talking about a directory we've just copied in this stack frame, but rather about a directory copied in some previous frame, for which we're now in a recursive call (that is, an old parent_path_so_far got extended by one or more entry components, following a copy). Perhaps this is what is meant by "the dest directory that we've just copied", and all we need to do is clarify things by saying "the dest directory at or under a copy made in an call higher up in this recursion". Well that's clumsy, but you know what I mean. Anyway, those are my thoughts for now. Are they anything like yours, or have I fanned off into my own private Oort cloud of insanity? Btw, can't say I share the feeling that _fill() has gotten overly complex. I mean, apparently it has a bug or two right now, but in general it seems to be about as complex as the problem it solves -- I wouldn't want to break it up any further. -K cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/bar/0000775000076500007650000000000012027373500024043 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/bar/six.txt,v0000664000076500007650000000125010702477016025654 0ustar mhaggermhagger00000000000000head 1.3; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/bar/five.txt,v0000664000076500007650000000125010702477016026002 0ustar mhaggermhagger00000000000000head 1.3; access; symbols vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.2 log @Commit 1.2 of every file. 
@ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/one.txt,v0000664000076500007650000000161310702477016025071 0ustar mhaggermhagger00000000000000head 1.3; access; symbols BRANCH:1.2.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches 1.2.2.1; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.2.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.2.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.2.2.1). @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/two.txt,v0000664000076500007650000000161310702477016025121 0ustar mhaggermhagger00000000000000head 1.3; access; symbols BRANCH:1.2.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches 1.2.2.1; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.2.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.2.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.2.2.1). @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/foo/0000775000076500007650000000000012027373500024062 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/foo/three.txt,v0000664000076500007650000000161310702477016026202 0ustar mhaggermhagger00000000000000head 1.3; access; symbols BRANCH:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. 
@ text @@ cvs2svn-2.4.0/test-data/peer-path-pruning-cvsrepos/foo/four.txt,v0000664000076500007650000000161310702477016026046 0ustar mhaggermhagger00000000000000head 1.3; access; symbols BRANCH:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/eol-variants-cvsrepos/0000775000076500007650000000000012027373500022336 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/eol-variants-cvsrepos/proj/0000775000076500007650000000000012027373500023310 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/eol-variants-cvsrepos/proj/file.txt,v0000444000076500007650000000033010705507702025226 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2007.10.17.21.00.40; author mhagger; state Exp; branches; next ; commitid nkHx4xZOOGvuiZBs; desc @@ 1.1 log @Adding file @ text @line 1 line 2 @ cvs2svn-2.4.0/test-data/strange-default-branch-cvsrepos/0000775000076500007650000000000012027373500024252 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/strange-default-branch-cvsrepos/file5347,v0000664000076500007650000000175711500107340025703 0ustar mhaggermhagger00000000000000head 1.2; branch 1.2.4.3.2.1.2; access; symbols symbol1:1.2.0.4 symbol2:1.2.4.3.0.2 symbol3:1.2.4.3.2.1.0.2; locks; strict; comment @# @; 1.2 date 2003.08.18.09.31.37; author author9; state Exp; branches 1.2.4.1; next 1.1; 1.1 date 2003.07.03.13.58.23; author author9; state Exp; branches; next ; 1.2.4.1 date 2003.09.29.07.36.35; author author8; state Exp; branches; next 1.2.4.2; 1.2.4.2 date 2003.10.01.11.47.47; author author8; state Exp; branches; next 1.2.4.3; 1.2.4.3 date 2003.10.02.10.09.19; author author8; state Exp; branches 1.2.4.3.2.1; next ; 1.2.4.3.2.1 date 2003.11.18.17.40.18; author author9; state Exp; branches 1.2.4.3.2.1.2.1; next ; 1.2.4.3.2.1.2.1 date 2003.12.29.13.02.08; author author9; state Exp; branches; next ; desc @@ 1.2 log @log 12594@ text @@ 1.2.4.1 log @log 12595@ text @@ 1.2.4.2 log @log 12596@ text @@ 1.2.4.3 log @log 12597@ text @@ 1.2.4.3.2.1 log @log 12598@ text @@ 1.2.4.3.2.1.2.1 log @log 12560@ text @@ 1.1 log @log 12506@ text @@ cvs2svn-2.4.0/test-data/unicode-log-cvsrepos/0000775000076500007650000000000012027373500022137 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/unicode-log-cvsrepos/testunicode,v0000664000076500007650000000033110702477014024652 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2007.02.13.21.13.21; author kylo; state Exp; branches ; next ; desc @@ 1.1 log @This is a test message with unicode: å @ text @@ cvs2svn-2.4.0/test-data/leftover-revs-cvsrepos/0000775000076500007650000000000012027373500022535 5ustar 
mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/leftover-revs-cvsrepos/file.txt,v0000664000076500007650000000103110702477013024454 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH:1.2.0.2; locks; strict; comment @// @; 1.2 date 2003.07.04.16.36.26; author author4; state dead; branches 1.2.2.1; next 1.1; 1.1 date 2003.07.04.16.13.47; author author4; state Exp; branches; next ; 1.2.2.1 date 2003.07.13.07.12.30; author author15; state Exp; branches; next 1.2.2.2; 1.2.2.2 date 2004.04.09.01.57.02; author author15; state dead; branches; next ; desc @@ 1.2 log @log 1670@ text @@ 1.2.2.1 log @log 1950@ text @@ 1.2.2.2 log @log 11@ text @@ 1.1 log @log 1671@ text @@ cvs2svn-2.4.0/test-data/compose-tag-three-sources-cvsrepos/0000775000076500007650000000000012027373500024736 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/compose-tag-three-sources-cvsrepos/tagged-on-b2,v0000664000076500007650000000115110702477015027273 0ustar mhaggermhagger00000000000000head 1.2; access; symbols T:1.1.4.1 b2:1.1.0.4 b1:1.1.0.2; locks; strict; comment @# @; 1.2 date 2004.03.16.17.41.24; author max; state Exp; branches; next 1.1; 1.1 date 2004.03.11.01.46.11; author max; state Exp; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2004.03.11.01.49.00; author max; state Exp; branches; next ; 1.1.4.1 date 2004.03.11.01.49.16; author max; state Exp; branches; next ; desc @@ 1.2 log @Commit again on trunk @ text @trunk-again @ 1.1 log @Add on trunk @ text @d1 1 @ 1.1.4.1 log @Commit on branch b2 @ text @a0 1 b2 @ 1.1.2.1 log @Commit on branch b1 @ text @a0 1 b1 @ cvs2svn-2.4.0/test-data/compose-tag-three-sources-cvsrepos/tagged-on-trunk-1.2-b,v0000664000076500007650000000114510702477015030653 0ustar mhaggermhagger00000000000000head 1.2; access; symbols T:1.2 b2:1.1.0.4 b1:1.1.0.2; locks; strict; comment @# @; 1.2 date 2004.03.16.17.41.23; author max; state Exp; branches; next 1.1; 1.1 date 2004.03.11.01.46.11; author max; state Exp; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2004.03.11.01.49.00; author max; state Exp; branches; next ; 1.1.4.1 date 2004.03.11.01.49.16; author max; state Exp; branches; next ; desc @@ 1.2 log @Commit again on trunk @ text @trunk-again @ 1.1 log @Add on trunk @ text @d1 1 @ 1.1.4.1 log @Commit on branch b2 @ text @a0 1 b2 @ 1.1.2.1 log @Commit on branch b1 @ text @a0 1 b1 @ cvs2svn-2.4.0/test-data/compose-tag-three-sources-cvsrepos/tagged-on-trunk-1.2-a,v0000664000076500007650000000114510702477015030652 0ustar mhaggermhagger00000000000000head 1.2; access; symbols T:1.2 b2:1.1.0.4 b1:1.1.0.2; locks; strict; comment @# @; 1.2 date 2004.03.16.17.41.23; author max; state Exp; branches; next 1.1; 1.1 date 2004.03.11.01.46.11; author max; state Exp; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2004.03.11.01.49.00; author max; state Exp; branches; next ; 1.1.4.1 date 2004.03.11.01.49.16; author max; state Exp; branches; next ; desc @@ 1.2 log @Commit again on trunk @ text @trunk-again @ 1.1 log @Add on trunk @ text @d1 1 @ 1.1.4.1 log @Commit on branch b2 @ text @a0 1 b2 @ 1.1.2.1 log @Commit on branch b1 @ text @a0 1 b1 @ cvs2svn-2.4.0/test-data/compose-tag-three-sources-cvsrepos/tagged-on-b1,v0000664000076500007650000000115110702477015027272 0ustar mhaggermhagger00000000000000head 1.2; access; symbols T:1.1.2.1 b2:1.1.0.4 b1:1.1.0.2; locks; strict; comment @# @; 1.2 date 2004.03.16.17.41.23; author max; state Exp; branches; next 1.1; 1.1 date 2004.03.11.01.46.11; author max; state Exp; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2004.03.11.01.49.00; author max; 
state Exp; branches; next ; 1.1.4.1 date 2004.03.11.01.49.16; author max; state Exp; branches; next ; desc @@ 1.2 log @Commit again on trunk @ text @trunk-again @ 1.1 log @Add on trunk @ text @d1 1 @ 1.1.4.1 log @Commit on branch b2 @ text @a0 1 b2 @ 1.1.2.1 log @Commit on branch b1 @ text @a0 1 b1 @ cvs2svn-2.4.0/test-data/compose-tag-three-sources-cvsrepos/tagged-on-trunk-1.1,v0000664000076500007650000000114510702477015030433 0ustar mhaggermhagger00000000000000head 1.2; access; symbols T:1.1 b2:1.1.0.4 b1:1.1.0.2; locks; strict; comment @# @; 1.2 date 2004.03.16.17.41.23; author max; state Exp; branches; next 1.1; 1.1 date 2004.03.11.01.46.11; author max; state Exp; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2004.03.11.01.49.00; author max; state Exp; branches; next ; 1.1.4.1 date 2004.03.11.01.49.16; author max; state Exp; branches; next ; desc @@ 1.2 log @Commit again on trunk @ text @trunk-again @ 1.1 log @Add on trunk @ text @d1 1 @ 1.1.4.1 log @Commit on branch b2 @ text @a0 1 b2 @ 1.1.2.1 log @Commit on branch b1 @ text @a0 1 b1 @ cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/0000775000076500007650000000000012027373500022204 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/symbol-mess-parent-hints.txt0000664000076500007650000000014510720403001027615 0ustar mhaggermhagger000000000000000 MOSTLY_BRANCH branch . . 0 MOSTLY_TAG tag . . 0 BRANCH_WITH_COMMIT branch . BRANCH cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/symbol-mess-path-hints.txt0000664000076500007650000000032010723371337027276 0ustar mhaggermhagger00000000000000. .trunk. trunk /a/strange/trunk/path . . MOSTLY_BRANCH branch /special/branch/path . . MOSTLY_TAG tag /special/tag/path . . BRANCH_WITH_COMMIT . /special/other/branch/path . cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/dir/0000775000076500007650000000000012027373500022762 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/dir/file1,v0000664000076500007650000000262610702477014024160 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BLOCKED_BY_UNNAMED:1.1.0.12 BLOCKING_COMMIT:1.1.10.1.0.2 BLOCKED_BY_COMMIT:1.1.0.10 BLOCKING_BRANCH:1.1.8.1.0.2 BLOCKED_BY_BRANCH:1.1.0.8 MOSTLY_BRANCH:1.1.0.6 MOSTLY_TAG:1.1 BRANCH_WITH_COMMIT:1.1.0.4 BRANCH:1.1.0.2 TAG:1.1; locks; strict; comment @# @; 1.1 date 2006.06.24.23.20.28; author mhagger; state Exp; branches 1.1.4.1 1.1.8.1 1.1.10.1 1.1.12.1; next ; 1.1.4.1 date 2006.06.24.23.20.29; author mhagger; state Exp; branches; next ; 1.1.8.1 date 2006.06.24.23.20.31; author mhagger; state Exp; branches; next ; 1.1.10.1 date 2006.06.24.23.20.33; author mhagger; state Exp; branches 1.1.10.1.2.1; next ; 1.1.10.1.2.1 date 2006.06.24.23.20.34; author mhagger; state Exp; branches; next ; 1.1.12.1 date 2006.06.24.23.20.36; author mhagger; state Exp; branches 1.1.12.1.2.1; next ; 1.1.12.1.2.1 date 2006.06.24.23.20.37; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding files on trunk @ text @line1 @ 1.1.12.1 log @Establish branch BLOCKED_BY_UNNAMED @ text @a1 1 line2 @ 1.1.12.1.2.1 log @Committing blocking commit on TEMP @ text @a2 1 line3 @ 1.1.10.1 log @Establish branch BLOCKED_BY_COMMIT @ text @a1 1 line2 @ 1.1.10.1.2.1 log @Committing blocking commit on BLOCKING_COMMIT @ text @a2 1 line3 @ 1.1.8.1 log @Establish branch BLOCKED_BY_BRANCH @ text @a1 1 line2 @ 1.1.4.1 log @Commit on branch BRANCH_WITH_COMMIT @ text @a1 1 line2 @ cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/dir/file3,v0000664000076500007650000000064310702477014024157 0ustar 
mhaggermhagger00000000000000head 1.1; access; symbols BLOCKED_BY_UNNAMED:1.1.0.16 BLOCKING_COMMIT:1.1.0.14 BLOCKED_BY_COMMIT:1.1.0.12 BLOCKING_BRANCH:1.1.0.10 BLOCKED_BY_BRANCH:1.1.0.8 MOSTLY_BRANCH:1.1 MOSTLY_TAG:1.1.0.6 BRANCH_WITH_COMMIT:1.1.0.4 BRANCH:1.1.0.2 TAG:1.1; locks; strict; comment @# @; 1.1 date 2006.06.24.23.20.28; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding files on trunk @ text @line1 @ cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/dir/file2,v0000664000076500007650000000064310702477014024156 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BLOCKED_BY_UNNAMED:1.1.0.16 BLOCKING_COMMIT:1.1.0.14 BLOCKED_BY_COMMIT:1.1.0.12 BLOCKING_BRANCH:1.1.0.10 BLOCKED_BY_BRANCH:1.1.0.8 MOSTLY_BRANCH:1.1.0.6 MOSTLY_TAG:1.1 BRANCH_WITH_COMMIT:1.1.0.4 BRANCH:1.1.0.2 TAG:1.1; locks; strict; comment @# @; 1.1 date 2006.06.24.23.20.28; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding files on trunk @ text @line1 @ cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/makerepo.sh0000775000076500007650000000413110702477014024350 0ustar mhaggermhagger00000000000000#! /bin/sh # This is the script used to create the symbol-mess CVS repository. # (The repository is checked into svn; this script is only here for # its documentation value.) # The script should be started from the main cvs2svn directory. repo=`pwd`/test-data/symbol-mess-cvsrepos wc=`pwd`/cvs2svn-tmp/symbol-mess-wc [ -e $repo/CVSROOT ] && rm -rf $repo/CVSROOT [ -e $repo/dir ] && rm -rf $repo/dir [ -e $wc ] && rm -rf $wc cvs -d $repo init cvs -d $repo co -d $wc . cd $wc mkdir dir cvs add dir echo 'line1' >dir/file1 echo 'line1' >dir/file2 echo 'line1' >dir/file3 cvs add dir/file1 dir/file2 dir/file3 cvs commit -m 'Adding files on trunk' dir cd dir # One plain old garden-variety tag and one branch: cvs tag TAG cvs tag -b BRANCH # A branch with a commit on it: cvs tag -b BRANCH_WITH_COMMIT cvs up -r BRANCH_WITH_COMMIT echo 'line2' >>file1 cvs commit -m 'Commit on branch BRANCH_WITH_COMMIT' file1 cvs up -A # Make some symbols for testing majority rule strategy: cvs tag MOSTLY_TAG file1 file2 cvs tag -b MOSTLY_TAG file3 cvs tag -b MOSTLY_BRANCH file1 file2 cvs tag MOSTLY_BRANCH file3 # A branch that is blocked by another branch (but with no commits): cvs tag -b BLOCKED_BY_BRANCH file1 file2 file3 cvs up -r BLOCKED_BY_BRANCH echo 'line2' >>file1 cvs commit -m 'Establish branch BLOCKED_BY_BRANCH' file1 cvs tag -b BLOCKING_BRANCH cvs up -A # A branch that is blocked by another branch with a commit: cvs tag -b BLOCKED_BY_COMMIT file1 file2 file3 cvs up -r BLOCKED_BY_COMMIT echo 'line2' >>file1 cvs commit -m 'Establish branch BLOCKED_BY_COMMIT' file1 cvs tag -b BLOCKING_COMMIT cvs up -r BLOCKING_COMMIT echo 'line3' >>file1 cvs commit -m 'Committing blocking commit on BLOCKING_COMMIT' file1 cvs up -A # A branch that is blocked by an unnamed branch with a commit: cvs tag -b BLOCKED_BY_UNNAMED file1 file2 file3 cvs up -r BLOCKED_BY_UNNAMED echo 'line2' >>file1 cvs commit -m 'Establish branch BLOCKED_BY_UNNAMED' file1 cvs tag -b TEMP cvs up -r TEMP echo 'line3' >>file1 cvs commit -m 'Committing blocking commit on TEMP' file1 cvs up -A # Now delete the name from the blocking branch. cvs tag -d -B TEMP cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/symbol-mess-parent-hints-wildcards.txt0000664000076500007650000000014510720403001031567 0ustar mhaggermhagger00000000000000. MOSTLY_BRANCH branch . . . MOSTLY_TAG tag . . . BRANCH_WITH_COMMIT . . 
BRANCH cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/symbol-mess-parent-hints-invalid.txt0000664000076500007650000000016010720403001031236 0ustar mhaggermhagger000000000000000 MOSTLY_BRANCH branch . . 0 MOSTLY_TAG tag . . 0 BRANCH_WITH_COMMIT branch . BLOCKED_BY_BRANCH cvs2svn-2.4.0/test-data/symbol-mess-cvsrepos/symbol-mess-symbol-hints.txt0000664000076500007650000000006610720403001027633 0ustar mhaggermhagger000000000000000 MOSTLY_BRANCH branch . . 0 MOSTLY_TAG tag . . cvs2svn-2.4.0/test-data/double-fill2-cvsrepos/0000775000076500007650000000000012027373500022212 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/double-fill2-cvsrepos/proj/0000775000076500007650000000000012027373500023164 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/double-fill2-cvsrepos/proj/d.txt,v0000664000076500007650000000065710702477014024425 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH2:1.1.2.1.0.2 BRANCH1:1.1.0.2; locks; strict; comment @// @; 1.2 date 2005.08.17.02.27.39; author author1; state Exp; branches; next 1.1; 1.1 date 2005.02.25.18.17.06; author author8; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2005.02.25.23.10.11; author author8; state Exp; branches; next ; desc @@ 1.2 log @log 1@ text @@ 1.1 log @log 16@ text @@ 1.1.2.1 log @log 16@ text @@ cvs2svn-2.4.0/test-data/double-fill2-cvsrepos/proj/a.txt,v0000664000076500007650000000120710702477014024412 0ustar mhaggermhagger00000000000000head 1.3; access; symbols BRANCH2:1.2.0.4 BRANCH1:1.2.0.2; locks; strict; comment @# @; 1.3 date 2005.03.23.16.45.45; author author2; state Exp; branches; next 1.2; 1.2 date 2004.02.27.22.01.32; author author3; state Exp; branches 1.2.2.1 1.2.4.1; next 1.1; 1.1 date 2003.12.22.20.09.23; author author4; state Exp; branches; next ; 1.2.2.1 date 2005.05.05.04.07.24; author author2; state Exp; branches; next ; 1.2.4.1 date 2005.05.05.20.56.37; author author6; state Exp; branches; next ; desc @@ 1.3 log @log 5@ text @@ 1.2 log @log 6@ text @@ 1.2.4.1 log @log 7@ text @@ 1.2.2.1 log @log 9@ text @@ 1.1 log @log 10@ text @@ cvs2svn-2.4.0/test-data/double-fill2-cvsrepos/proj/b.txt,v0000664000076500007650000000104310702477014024411 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH2:1.1.0.4 BRANCH1:1.1.0.2; locks; strict; comment @// @; 1.2 date 2005.04.05.11.08.53; author author7; state Exp; branches; next 1.1; 1.1 date 2005.02.22.00.10.19; author author7; state Exp; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2005.04.05.11.09.38; author author7; state Exp; branches; next ; 1.1.4.1 date 2005.04.05.22.20.39; author author6; state Exp; branches; next ; desc @@ 1.2 log @log 13@ text @@ 1.1 log @log 14@ text @@ 1.1.4.1 log @log 15@ text @@ 1.1.2.1 log @log 13@ text @@ cvs2svn-2.4.0/test-data/double-fill2-cvsrepos/proj/c.txt,v0000664000076500007650000000067210702477014024421 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH2:1.1.0.8 BRANCH1:1.1.0.2; locks; strict; comment @// @; 1.1 date 2005.10.03.17.35.55; author author8; state Exp; branches 1.1.2.1 1.1.8.1; next ; 1.1.2.1 date 2005.10.04.10.54.11; author author8; state Exp; branches; next ; 1.1.8.1 date 2005.10.05.20.46.30; author author6; state Exp; branches; next ; desc @@ 1.1 log @log 17@ text @@ 1.1.8.1 log @log 18@ text @@ 1.1.2.1 log @log 19@ text @@ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/0000775000076500007650000000000012027373500022352 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/AB-loop/0000775000076500007650000000000012027373500023603 5ustar 
mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/AB-loop/a.txt,v0000664000076500007650000000045510702477014025035 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2006.10.30.22.11.10; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.09; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @AB-loop-B @ text @1.2 @ 1.1 log @AB-loop-A @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/AB-loop/b.txt,v0000664000076500007650000000045510702477014025036 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2006.10.30.22.11.12; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.11; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @AB-loop-A @ text @1.2 @ 1.1 log @AB-loop-B @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/make-nasty-graphs.sh0000775000076500007650000000734710702477014026262 0ustar mhaggermhagger00000000000000#! /bin/sh # This script can be moved to the test-data directory and executed # there to recreate nasty-graphs-cvsrepos. (Well, approximately. It # doesn't clean up CVSROOT or add CVSROOT/README.) CVSROOT=`pwd`/nasty-graphs-cvsrepos export CVSROOT rm -rf $CVSROOT WC=`pwd`/cvs2svn-tmp rm -rf $WC cvs init cvs co -d $WC . # +-> A -> B --+ # | | # +------------+ # # A: a.txt<1.1> b.txt<1.2> # B: a.txt<1.2> b.txt<1.1> TEST=AB-loop D=$WC/$TEST mkdir $D cvs add $D echo "1.1" >$D/a.txt cvs add $D/a.txt cvs commit -m "$TEST-A" $D/a.txt echo "1.2" >$D/a.txt cvs commit -m "$TEST-B" $D/a.txt echo "1.1" >$D/b.txt cvs add $D/b.txt cvs commit -m "$TEST-B" $D/b.txt echo "1.2" >$D/b.txt cvs commit -m "$TEST-A" $D/b.txt # +-> A -> B -> C --+ # | | # +-----------------+ # # A: a.txt<1.1> c.txt<1.2> # B: a.txt<1.2> b.txt<1.1> # C: b.txt<1.2> c.txt<1.1> TEST=ABC-loop D=$WC/$TEST mkdir $D cvs add $D echo "1.1" >$D/a.txt cvs add $D/a.txt cvs commit -m "$TEST-A" $D/a.txt echo "1.2" >$D/a.txt cvs commit -m "$TEST-B" $D/a.txt echo "1.1" >$D/b.txt cvs add $D/b.txt cvs commit -m "$TEST-B" $D/b.txt echo "1.2" >$D/b.txt cvs commit -m "$TEST-C" $D/b.txt echo "1.1" >$D/c.txt cvs add $D/c.txt cvs commit -m "$TEST-C" $D/c.txt echo "1.2" >$D/c.txt cvs commit -m "$TEST-A" $D/c.txt # A: a.txt<1.1> b.txt<1.3> c.txt<1.2> # B: a.txt<1.2> b.txt<1.1> c.txt<1.3> # C: a.txt<1.3> b.txt<1.2> c.txt<1.1> TEST=ABC-passthru-loop D=$WC/$TEST mkdir $D cvs add $D echo "1.1" >$D/a.txt cvs add $D/a.txt cvs commit -m "$TEST-A" $D/a.txt echo "1.2" >$D/a.txt cvs commit -m "$TEST-B" $D/a.txt echo "1.3" >$D/a.txt cvs commit -m "$TEST-C" $D/a.txt echo "1.1" >$D/b.txt cvs add $D/b.txt cvs commit -m "$TEST-B" $D/b.txt echo "1.2" >$D/b.txt cvs commit -m "$TEST-C" $D/b.txt echo "1.3" >$D/b.txt cvs commit -m "$TEST-A" $D/b.txt echo "1.1" >$D/c.txt cvs add $D/c.txt cvs commit -m "$TEST-C" $D/c.txt echo "1.2" >$D/c.txt cvs commit -m "$TEST-A" $D/c.txt echo "1.3" >$D/c.txt cvs commit -m "$TEST-B" $D/c.txt # A: a.txt<1.1> c.txt<1.3> d.txt<1.2> # B: a.txt<1.2> b.txt<1.1> d.txt<1.3> # C: a.txt<1.3> b.txt<1.2> c.txt<1.1> # D: b.txt<1.3> c.txt<1.2> d.txt<1.1> TEST=ABCD-passthru-loop D=$WC/$TEST mkdir $D cvs add $D echo "1.1" >$D/a.txt cvs add $D/a.txt cvs commit -m "$TEST-A" $D/a.txt echo "1.2" >$D/a.txt cvs commit -m "$TEST-B" $D/a.txt echo "1.3" >$D/a.txt cvs commit -m "$TEST-C" $D/a.txt echo "1.1" >$D/b.txt cvs add $D/b.txt cvs commit -m "$TEST-B" $D/b.txt echo "1.2" >$D/b.txt cvs commit -m "$TEST-C" $D/b.txt echo "1.3" 
>$D/b.txt cvs commit -m "$TEST-D" $D/b.txt echo "1.1" >$D/c.txt cvs add $D/c.txt cvs commit -m "$TEST-C" $D/c.txt echo "1.2" >$D/c.txt cvs commit -m "$TEST-D" $D/c.txt echo "1.3" >$D/c.txt cvs commit -m "$TEST-A" $D/c.txt echo "1.1" >$D/d.txt cvs add $D/d.txt cvs commit -m "$TEST-D" $D/d.txt echo "1.2" >$D/d.txt cvs commit -m "$TEST-A" $D/d.txt echo "1.3" >$D/d.txt cvs commit -m "$TEST-B" $D/d.txt # The following test has the nasty property that each changeset has # either one LINK_PREV or LINK_SUCC and also one LINK_PASSTHRU. # # A: a.txt<1.1> b.txt<1.3> # B: a.txt<1.2> b.txt<1.4> # C: a.txt<1.3> b.txt<1.1> # D: a.txt<1.4> b.txt<1.2> TEST=AB-double-passthru-loop D=$WC/$TEST mkdir $D cvs add $D echo "1.1" >$D/a.txt cvs add $D/a.txt cvs commit -m "$TEST-A" $D/a.txt echo "1.2" >$D/a.txt cvs commit -m "$TEST-B" $D/a.txt echo "1.3" >$D/a.txt cvs commit -m "$TEST-C" $D/a.txt echo "1.4" >$D/a.txt cvs commit -m "$TEST-D" $D/a.txt echo "1.1" >$D/b.txt cvs add $D/b.txt cvs commit -m "$TEST-C" $D/b.txt echo "1.2" >$D/b.txt cvs commit -m "$TEST-D" $D/b.txt echo "1.3" >$D/b.txt cvs commit -m "$TEST-A" $D/b.txt echo "1.4" >$D/b.txt cvs commit -m "$TEST-B" $D/b.txt cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABC-loop/0000775000076500007650000000000012027373500023706 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABC-loop/a.txt,v0000664000076500007650000000045710702477014025142 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2006.10.30.22.11.14; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.13; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @ABC-loop-B @ text @1.2 @ 1.1 log @ABC-loop-A @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABC-loop/b.txt,v0000664000076500007650000000045710702477014025143 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2006.10.30.22.11.16; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.15; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @ABC-loop-C @ text @1.2 @ 1.1 log @ABC-loop-B @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABC-loop/c.txt,v0000664000076500007650000000045710702477014025144 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2006.10.30.22.11.18; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.17; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @ABC-loop-A @ text @1.2 @ 1.1 log @ABC-loop-C @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/AB-double-passthru-loop/0000775000076500007650000000000012027373500026722 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/AB-double-passthru-loop/a.txt,v0000664000076500007650000000114310702477014030147 0ustar mhaggermhagger00000000000000head 1.4; access; symbols; locks; strict; comment @# @; 1.4 date 2006.10.30.22.11.43; author mhagger; state Exp; branches; next 1.3; 1.3 date 2006.10.30.22.11.42; author mhagger; state Exp; branches; next 1.2; 1.2 date 2006.10.30.22.11.41; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.40; author mhagger; state Exp; branches; next ; desc @@ 1.4 log @AB-double-passthru-loop-D @ text @1.4 @ 1.3 log @AB-double-passthru-loop-C @ text @d1 1 a1 1 1.3 @ 1.2 log @AB-double-passthru-loop-B @ text @d1 1 a1 1 1.2 @ 1.1 log @AB-double-passthru-loop-A @ text @d1 1 a1 1 1.1 @ 
cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/AB-double-passthru-loop/b.txt,v0000664000076500007650000000114310702477014030150 0ustar mhaggermhagger00000000000000head 1.4; access; symbols; locks; strict; comment @# @; 1.4 date 2006.10.30.22.11.47; author mhagger; state Exp; branches; next 1.3; 1.3 date 2006.10.30.22.11.46; author mhagger; state Exp; branches; next 1.2; 1.2 date 2006.10.30.22.11.45; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.44; author mhagger; state Exp; branches; next ; desc @@ 1.4 log @AB-double-passthru-loop-B @ text @1.4 @ 1.3 log @AB-double-passthru-loop-A @ text @d1 1 a1 1 1.3 @ 1.2 log @AB-double-passthru-loop-D @ text @d1 1 a1 1 1.2 @ 1.1 log @AB-double-passthru-loop-C @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABC-passthru-loop/0000775000076500007650000000000012027373500025555 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABC-passthru-loop/a.txt,v0000664000076500007650000000070610702477014027006 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2006.10.30.22.11.21; author mhagger; state Exp; branches; next 1.2; 1.2 date 2006.10.30.22.11.20; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.19; author mhagger; state Exp; branches; next ; desc @@ 1.3 log @ABC-passthru-loop-C @ text @1.3 @ 1.2 log @ABC-passthru-loop-B @ text @d1 1 a1 1 1.2 @ 1.1 log @ABC-passthru-loop-A @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABC-passthru-loop/b.txt,v0000664000076500007650000000070610702477014027007 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2006.10.30.22.11.24; author mhagger; state Exp; branches; next 1.2; 1.2 date 2006.10.30.22.11.23; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.22; author mhagger; state Exp; branches; next ; desc @@ 1.3 log @ABC-passthru-loop-A @ text @1.3 @ 1.2 log @ABC-passthru-loop-C @ text @d1 1 a1 1 1.2 @ 1.1 log @ABC-passthru-loop-B @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABC-passthru-loop/c.txt,v0000664000076500007650000000070610702477014027010 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2006.10.30.22.11.27; author mhagger; state Exp; branches; next 1.2; 1.2 date 2006.10.30.22.11.26; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.25; author mhagger; state Exp; branches; next ; desc @@ 1.3 log @ABC-passthru-loop-B @ text @1.3 @ 1.2 log @ABC-passthru-loop-A @ text @d1 1 a1 1 1.2 @ 1.1 log @ABC-passthru-loop-C @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABCD-passthru-loop/0000775000076500007650000000000012027373500025661 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABCD-passthru-loop/d.txt,v0000664000076500007650000000071110702477014027111 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2006.10.30.22.11.39; author mhagger; state Exp; branches; next 1.2; 1.2 date 2006.10.30.22.11.38; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.37; author mhagger; state Exp; branches; next ; desc @@ 1.3 log @ABCD-passthru-loop-B @ text @1.3 @ 1.2 log @ABCD-passthru-loop-A @ text @d1 1 a1 1 1.2 @ 1.1 log @ABCD-passthru-loop-D @ text @d1 1 a1 1 1.1 @ 
cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABCD-passthru-loop/a.txt,v0000664000076500007650000000071110702477014027106 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2006.10.30.22.11.30; author mhagger; state Exp; branches; next 1.2; 1.2 date 2006.10.30.22.11.29; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.28; author mhagger; state Exp; branches; next ; desc @@ 1.3 log @ABCD-passthru-loop-C @ text @1.3 @ 1.2 log @ABCD-passthru-loop-B @ text @d1 1 a1 1 1.2 @ 1.1 log @ABCD-passthru-loop-A @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABCD-passthru-loop/b.txt,v0000664000076500007650000000071110702477014027107 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2006.10.30.22.11.33; author mhagger; state Exp; branches; next 1.2; 1.2 date 2006.10.30.22.11.32; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.31; author mhagger; state Exp; branches; next ; desc @@ 1.3 log @ABCD-passthru-loop-D @ text @1.3 @ 1.2 log @ABCD-passthru-loop-C @ text @d1 1 a1 1 1.2 @ 1.1 log @ABCD-passthru-loop-B @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/nasty-graphs-cvsrepos/ABCD-passthru-loop/c.txt,v0000664000076500007650000000071110702477014027110 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2006.10.30.22.11.36; author mhagger; state Exp; branches; next 1.2; 1.2 date 2006.10.30.22.11.35; author mhagger; state Exp; branches; next 1.1; 1.1 date 2006.10.30.22.11.34; author mhagger; state Exp; branches; next ; desc @@ 1.3 log @ABCD-passthru-loop-A @ text @1.3 @ 1.2 log @ABCD-passthru-loop-D @ text @d1 1 a1 1 1.2 @ 1.1 log @ABCD-passthru-loop-C @ text @d1 1 a1 1 1.1 @ cvs2svn-2.4.0/test-data/unicode-author-cvsrepos/0000775000076500007650000000000012027373500022660 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/unicode-author-cvsrepos/testunicode,v0000664000076500007650000000145310751442437025406 0ustar mhaggermhagger00000000000000head 1.6; access; symbols; locks; strict; comment @# @; 1.6 date 2008.02.03.22.15.23; author hülsmann; state Exp; branches; next 1.5; 1.5 date 2008.02.03.22.15.22; author hülsmann; state Exp; branches; next 1.4; 1.4 date 2008.02.03.22.15.21; author ringström; state Exp; branches; next 1.3; 1.3 date 2008.02.03.22.15.20; author ringström; state Exp; branches; next 1.2; 1.2 date 2008.02.03.22.15.18; author @čibej@; state Exp; branches; next 1.1; 1.1 date 2007.02.13.21.13.21; author @čibej@; state Exp; branches; next ; desc @@ 1.6 log @Revision 1.6 @ text @6 @ 1.5 log @Revision 1.5 @ text @d1 1 a1 1 5 @ 1.4 log @Revision 1.4 @ text @d1 1 a1 1 4 @ 1.3 log @Revision 1.3 @ text @d1 1 a1 1 3 @ 1.2 log @Revision 1.2 @ text @d1 1 a1 1 2 @ 1.1 log @Revision 1.1 @ text @d1 1 a1 1 1 @ cvs2svn-2.4.0/test-data/branch-from-empty-dir-cvsrepos/0000775000076500007650000000000012027373500024040 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-from-empty-dir-cvsrepos/proj/0000775000076500007650000000000012027373500025012 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-from-empty-dir-cvsrepos/proj/a.txt,v0000664000076500007650000000057310702477021026243 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH2:1.1.2.1.0.2 BRANCH1:1.1.0.2; locks; strict; comment @# @; 1.1 date 2007.04.30.15.20.32; author mhagger; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2007.04.30.15.20.33; author mhagger; state Exp; branches; next ; 
desc @@ 1.1 log @Adding a.txt:1.1 @ text @1.1 @ 1.1.2.1 log @Committing a.txt:1.1.2.1 @ text @d1 1 a1 1 1.1.2.1 @ cvs2svn-2.4.0/test-data/branch-from-empty-dir-cvsrepos/proj/subdir/0000775000076500007650000000000012027373500026302 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-from-empty-dir-cvsrepos/proj/subdir/Attic/0000775000076500007650000000000012027373500027346 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-from-empty-dir-cvsrepos/proj/subdir/Attic/b.txt,v0000664000076500007650000000053410702477021030575 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH2:1.2.0.4 BRANCH1:1.2.0.2; locks; strict; comment @# @; 1.2 date 2007.04.30.15.20.32; author mhagger; state dead; branches; next 1.1; 1.1 date 2007.04.30.15.20.31; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @Removing subdir/b.txt @ text @1.1 @ 1.1 log @Adding subdir/b.txt:1.1 @ text @@ cvs2svn-2.4.0/test-data/branch-from-empty-dir-cvsrepos/makerepo.sh0000775000076500007650000000225610702477021026210 0ustar mhaggermhagger00000000000000#! /bin/sh # This is the script used to create the branch-from-empty-dir CVS # repository. (The repository is checked into svn; this script is # only here for its documentation value.) # # The script should be started from the main cvs2svn directory. # # The repository itself tickles a problem that I was having with an # uncommitted version of better-symbol-selection when BRANCH2 is # grafted onto BRANCH1. name=branch-from-empty-dir repo=`pwd`/test-data/$name-cvsrepos wc=`pwd`/cvs2svn-tmp/$name-wc [ -e $repo/CVSROOT ] && rm -rf $repo/CVSROOT [ -e $repo/proj ] && rm -rf $repo/proj [ -e $wc ] && rm -rf $wc cvs -d $repo init cvs -d $repo co -d $wc . cd $wc mkdir proj cvs add proj cd proj mkdir subdir cvs add subdir echo '1.1' >subdir/b.txt cvs add subdir/b.txt cvs commit -m 'Adding subdir/b.txt:1.1' . rm subdir/b.txt cvs rm subdir/b.txt cvs commit -m 'Removing subdir/b.txt' . cvs rtag -r 1.2 -b BRANCH1 proj/subdir/b.txt cvs rtag -r 1.2 -b BRANCH2 proj/subdir/b.txt echo '1.1' >a.txt cvs add a.txt cvs commit -m 'Adding a.txt:1.1' . cvs tag -b BRANCH1 a.txt cvs update -r BRANCH1 echo '1.1.2.1' >a.txt cvs commit -m 'Committing a.txt:1.1.2.1' a.txt cvs tag -b BRANCH2 a.txt cvs2svn-2.4.0/test-data/branch-from-deleted-1-1-cvsrepos/0000775000076500007650000000000012027373500024030 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-from-deleted-1-1-cvsrepos/proj/0000775000076500007650000000000012027373500025002 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-from-deleted-1-1-cvsrepos/proj/Attic/0000775000076500007650000000000012027373500026046 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/branch-from-deleted-1-1-cvsrepos/proj/Attic/c.txt,v0000664000076500007650000000107210702477020027273 0ustar mhaggermhagger00000000000000head 1.1; access; symbols TAG1:1.1 BRANCH3:1.1.0.6 BRANCH2:1.1.0.4 BRANCH1:1.1.0.2; locks; strict; comment @# @; 1.1 date 2007.06.25.22.20.19; author mhagger; state dead; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2007.06.25.22.20.19; author mhagger; state Exp; branches; next ; 1.1.4.1 date 2007.06.25.22.20.21; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @file c.txt was initially added on branch BRANCH1. 
@ text @@ 1.1.4.1 log @Adding c.txt:1.1.4.1 @ text @a0 1 1.1.4.1 @ 1.1.2.1 log @Adding c.txt:1.1.2.1 @ text @a0 1 1.1.2.1 @ cvs2svn-2.4.0/test-data/branch-from-deleted-1-1-cvsrepos/proj/a.txt,v0000664000076500007650000000033210702477020026223 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH2:1.1.0.4 BRANCH1:1.1.0.2; locks; strict; comment @# @; 1.1 date 2007.06.25.22.20.14; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding a.txt:1.1 @ text @1.1 @ cvs2svn-2.4.0/test-data/branch-from-deleted-1-1-cvsrepos/proj/b.txt,v0000664000076500007650000000126710702477020026234 0ustar mhaggermhagger00000000000000head 1.2; access; symbols TAG1:1.1 BRANCH3:1.1.0.6 BRANCH2:1.1.0.4 BRANCH1:1.1.0.2; locks; strict; comment @# @; 1.2 date 2007.06.25.22.20.17; author mhagger; state Exp; branches; next 1.1; 1.1 date 2007.06.25.22.20.15; author mhagger; state dead; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2007.06.25.22.20.15; author mhagger; state Exp; branches; next ; 1.1.4.1 date 2007.06.25.22.20.16; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @Adding b.txt:1.2 @ text @1.2 @ 1.1 log @file b.txt was initially added on branch BRANCH1. @ text @d1 1 @ 1.1.4.1 log @Adding b.txt:1.1.4.1 @ text @a0 1 1.1.4.1 @ 1.1.2.1 log @Adding b.txt:1.1.2.1 @ text @a0 1 1.1.2.1 @ cvs2svn-2.4.0/test-data/branch-from-deleted-1-1-cvsrepos/makerepo.sh0000775000076500007650000000305110702477020026171 0ustar mhaggermhagger00000000000000#! /bin/sh # This is the script used to create the branch-from-deleted-1-1 CVS # repository. (The repository is checked into svn; this script is # only here for its documentation value.) # # The script should be started from the main cvs2svn directory. name=branch-from-deleted-1-1 repo=`pwd`/test-data/$name-cvsrepos wc=`pwd`/cvs2svn-tmp/$name-wc [ -e $repo/CVSROOT ] && rm -rf $repo/CVSROOT [ -e $repo/proj ] && rm -rf $repo/proj [ -e $wc ] && rm -rf $wc cvs -d $repo init cvs -d $repo co -d $wc . cd $wc mkdir proj cvs add proj cd proj echo "Create a file a.txt on trunk:" echo '1.1' >a.txt cvs add a.txt cvs commit -m 'Adding a.txt:1.1' . 
echo "Create two branches on file a.txt:" cvs tag -b BRANCH1 cvs tag -b BRANCH2 echo "Add file b.txt on BRANCH1:" cvs up -r BRANCH1 echo '1.1.2.1' >b.txt cvs add b.txt cvs commit -m 'Adding b.txt:1.1.2.1' echo "Add file b.txt on BRANCH2:" cvs up -r BRANCH2 echo '1.1.4.1' >b.txt cvs add b.txt cvs commit -m 'Adding b.txt:1.1.4.1' echo "Add file b.txt on trunk:" cvs up -A echo '1.2' >b.txt cvs add b.txt cvs commit -m 'Adding b.txt:1.2' echo "Add file c.txt on BRANCH1:" cvs up -r BRANCH1 echo '1.1.2.1' >c.txt cvs add c.txt cvs commit -m 'Adding c.txt:1.1.2.1' echo "Add file c.txt on BRANCH2:" cvs up -r BRANCH2 echo '1.1.4.1' >c.txt cvs add c.txt cvs commit -m 'Adding c.txt:1.1.4.1' echo "Create branch BRANCH3 from 1.1 versions of b.txt and c.txt:" cvs rtag -r 1.1 -b BRANCH3 proj/b.txt proj/c.txt echo "Create tag TAG1 from 1.1 versions of b.txt and c.txt:" cvs rtag -r 1.1 TAG1 proj/b.txt proj/c.txt cvs2svn-2.4.0/test-data/newphrases-cvsrepos/0000775000076500007650000000000012027373500022111 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/newphrases-cvsrepos/file001,v0000664000076500007650000000301111710517254023435 0ustar mhaggermhagger00000000000000head 1.7; access; symbols symbol00000:1.7 symbol00001:1.7.0.8 symbol00002:1.7 symbol00003:1.7.0.6 symbol00004:1.7 symbol00005:1.7.0.4 symbol00006:1.7 symbol00007:1.7.0.2 symbol00008:1.7 symbol00009:1.3 symbol00010:1.3.0.2; locks; strict; comment @ * @; this-is-a-newphrase:1.3 ; 1.7 date 2003.04.23.12.15.16; author author1; state Exp; branches; next 1.6; 1.6 date 2003.04.10.08.39.59; author author1; state Exp; branches; next 1.5; 1.5 date 2003.01.27.16.35.38; author author1; state Exp; branches; next 1.4; 1.4 date 2002.11.28.11.50.18; author author1; state Exp; branches; next 1.3; 1.3 date 2002.09.19.08.30.17; author author2; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2002.08.06.12.15.33; author author2; state Exp; branches; next 1.1; 1.1 date 2002.03.07.09.51.58; author author1; state Exp; branches; next ; 1.3.2.1 date 2003.02.10.08.43.05; author author2; state Exp; branches; next ; desc @@ 1.7 log @log 1@ text @This text was last seen in HEAD (revision 1.7) @ 1.6 log @log 2@ text @d1 1 a1 1 This text was last seen in revision 1.6 @ 1.5 log @log 3@ text @d1 1 a1 1 This text was last seen in revision 1.5 @ 1.4 log @log 4@ text @d1 1 a1 1 This text was last seen in revision 1.4 @ 1.3 log @log 5@ text @d1 1 a1 1 This text was last seen in revision 1.3 @ 1.3.2.1 log @log 6@ text @d1 1 a1 1 This text was committed in revision 1.3.2.1 @ 1.2 log @log 7@ text @d1 1 a1 1 This text was last seen in revision 1.2 @ 1.1 log @log 8@ text @d1 1 a1 1 This text was last seen in revision 1.1 @ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/0000775000076500007650000000000012027373500022273 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/three.txt,v0000664000076500007650000000205710702477015024415 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3 BRANCH_B1:1.3.0.12 BRANCH_8:1.3.2.1.0.4 BRANCH_7:1.3.0.10 BRANCH_5:1.3.0.8 BRANCH_4:1.3.0.6 BRANCH_3:1.3.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 
2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/README0000664000076500007650000000660110702477015023162 0ustar mhaggermhagger00000000000000This repository is for testing cvs2svn's choices of which directories to copy when filling symbolic names. The layout is: /one.txt two.txt three.txt /sub1/four.txt five.txt six.txt sub2/seven.txt eight.txt nine.txt ten.txt Every file was imported in a standard way, then revisions 1.2 and 1.3 were committed on every file. Then a branch was made: BRANCH_1: sprouts from every file's 1.3 (so branch number 1.3.0.2) Then a revision was committed on that branch, creating revision 1.3.2.1 on every file. Next a branch was made from that revision, on every file: BRANCH_2: sprouts from every file's 1.3.2.1 (so branch number 1.3.2.1.0.2) BRANCH_3 to BRANCH_8 all sprout from either trunk, or from the first revision on BRANCH_1 (that is, from 1.3.2.1), in various combinations. Every branch below exists on every file, the only question is where the branch is rooted for each file. BRANCH_3: Sprouts from trunk everywhere except sub1/sub2/*, where it sprouts from BRANCH_1 for all four files. BRANCH_4: Sprouts from trunk everywhere except for sub1/sub2/ten.txt, where it sprouts from BRANCH_1. Note that this is a clear minority in sub1/sub2/, since it still sprouts from trunk on the other three files there ('seven.txt', 'eight.txt', and 'nine.txt'). BRANCH_5: Sprouts from trunk everywhere except for sub1/sub2/nine.txt and sub1/sub2/ten.txt, where it sprouts from BRANCH_1. This is an even division in sub1/sub2/, since it sprouts from trunk on two files ('seven.txt' and 'eight.txt') and from BRANCH_1 on the other two ('nine.txt' and 'ten.txt'). BRANCH_6: Sprouts from trunk everywhere except for sub1/sub2/eight.txt, sub1/sub2/nine.txt, and sub1/sub2/ten.txt, where it sprouts from BRANCH_1. This is a clear majority in favor of BRANCH_1, since BRANCH_6 sprouts from trunk on only one file ('seven.txt') and from BRANCH_1 on the other three ('eight.txt', 'nine.txt' and 'ten.txt'). BRANCH_7: Sprouts from trunk everywhere except sub1/five.txt and sub1/six.txt, where it sprouts from BRANCH_1. This is a majority in favor of BRANCH_1 there, as the only other file in that directory is 'four.txt', but note that both the parent directory and the sole subdirectory are majority from trunk. BRANCH_8: The reverse of BRANCH_7. Sprouts from BRANCH_1 everywhere except sub1/five.txt and sub1/six.txt, where it sprouts from trunk. This is a majority in favor of trunk there, as the only other file in that directory is 'four.txt', but note that both the parent directory and the sole subdirectory are majority from BRANCH_1. To test the filling of a tag set on a branch, a new branch is created. BRANCH_B1: sprouts from every file's 1.3 (so branch number 1.3.0.12) A single change to one.txt in BRANCH_B1 is committed so that 1.3.12.1 is created on that file, and TAG_B1 is set on the tip of that branch. TAG_B1: set on 1.3.12.1 on one.txt, and 1.3 on the rest TAG_B1 should be created as a single copy from BRANCH_B1. 
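A minimal sketch of how the fill-choices scenario described in the README above might be exercised by hand. This is not part of the automated test suite; the output path below is a placeholder, and cvs2svn is assumed to be installed and run from the top of the source tree:

    # Convert the test repository into a scratch Subversion repository.
    cvs2svn -s /tmp/fill-choices-svnrepos test-data/fill-choices-cvsrepos

    # The README predicts that TAG_B1 is filled with a single copy from
    # BRANCH_B1; the copy source appears in the verbose log of the tag.
    svn log -v --stop-on-copy file:///tmp/fill-choices-svnrepos/tags/TAG_B1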
cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/sub1/0000775000076500007650000000000012027373500023145 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/sub1/six.txt,v0000664000076500007650000000205710702477015024763 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3 BRANCH_B1:1.3.0.12 BRANCH_8:1.3.0.10 BRANCH_7:1.3.2.1.0.4 BRANCH_5:1.3.0.8 BRANCH_4:1.3.0.6 BRANCH_3:1.3.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/sub1/four.txt,v0000664000076500007650000000205710702477015025133 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3 BRANCH_B1:1.3.0.12 BRANCH_8:1.3.2.1.0.4 BRANCH_7:1.3.0.10 BRANCH_5:1.3.0.8 BRANCH_4:1.3.0.6 BRANCH_3:1.3.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/sub1/five.txt,v0000664000076500007650000000205710702477015025111 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3 BRANCH_B1:1.3.0.12 BRANCH_8:1.3.0.10 BRANCH_7:1.3.2.1.0.4 BRANCH_5:1.3.0.8 BRANCH_4:1.3.0.6 BRANCH_3:1.3.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. 
@ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/sub1/sub2/0000775000076500007650000000000012027373500024020 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/sub1/sub2/nine.txt,v0000664000076500007650000000211410702477015025756 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3 BRANCH_B1:1.3.0.8 BRANCH_8:1.3.2.1.0.10 BRANCH_7:1.3.0.6 BRANCH_6:1.3.2.1.0.8 BRANCH_5:1.3.2.1.0.6 BRANCH_4:1.3.0.4 BRANCH_3:1.3.2.1.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/sub1/sub2/eight.txt,v0000664000076500007650000000211010702477015026121 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3 BRANCH_B1:1.3.0.10 BRANCH_8:1.3.2.1.0.8 BRANCH_7:1.3.0.8 BRANCH_6:1.3.2.1.0.6 BRANCH_5:1.3.0.6 BRANCH_4:1.3.0.4 BRANCH_3:1.3.2.1.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. 
@ text @@ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/sub1/sub2/ten.txt,v0000664000076500007650000000212110702477015025611 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3 BRANCH_B1:1.3.0.6 BRANCH_8:1.3.2.1.0.12 BRANCH_7:1.3.0.4 BRANCH_6:1.3.2.1.0.10 BRANCH_5:1.3.2.1.0.8 BRANCH_4:1.3.2.1.0.6 BRANCH_3:1.3.2.1.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/sub1/sub2/seven.txt,v0000664000076500007650000000210510702477015026145 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3 BRANCH_B1:1.3.0.12 BRANCH_8:1.3.2.1.0.6 BRANCH_7:1.3.0.10 BRANCH_6:1.3.0.8 BRANCH_5:1.3.0.6 BRANCH_4:1.3.0.4 BRANCH_3:1.3.2.1.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/one.txt,v0000664000076500007650000000235210702477015024065 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3.12.1 BRANCH_B1:1.3.0.12 BRANCH_8:1.3.2.1.0.4 BRANCH_7:1.3.0.10 BRANCH_5:1.3.0.8 BRANCH_4:1.3.0.6 BRANCH_3:1.3.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1 1.3.12.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; 1.3.12.1 date 2004.06.28.22.03.12; author tori; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.12.1 log @First change on branch BRANCH_B1 @ text @d1 1 a1 1 Yet another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. 
@ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/fill-choices-cvsrepos/two.txt,v0000664000076500007650000000205710702477015024117 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG_B1:1.3 BRANCH_B1:1.3.0.12 BRANCH_8:1.3.2.1.0.4 BRANCH_7:1.3.0.10 BRANCH_5:1.3.0.8 BRANCH_4:1.3.0.6 BRANCH_3:1.3.0.4 BRANCH_2:1.3.2.1.0.2 BRANCH_1:1.3.0.2 vendortag:1.1.1.1 vendorbranch:1.1.1; locks; strict; comment @# @; 1.3 date 2004.06.07.18.42.06; author kfogel; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2004.06.07.18.38.34; author kfogel; state Exp; branches; next 1.1; 1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.05.24.01.02.01; author kfogel; state Exp; branches; next ; 1.3.2.1 date 2004.06.07.20.29.58; author kfogel; state Exp; branches; next ; desc @@ 1.3 log @Commit 1.3 of every file. @ text @Another small change. @ 1.3.2.1 log @Commit change to everyone on branch BRANCH_1. @ text @d1 1 a1 1 First change on branch BRANCH_1 (1.3.2.1 for everyone). @ 1.2 log @Commit 1.2 of every file. @ text @d1 1 a1 1 A small change. @ 1.1 log @Initial revision @ text @d1 1 a1 1 This is the first revision. @ 1.1.1.1 log @Initial import. @ text @@ cvs2svn-2.4.0/test-data/delete-cvsignore-cvsrepos/0000775000076500007650000000000012027373500023171 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/delete-cvsignore-cvsrepos/proj/0000775000076500007650000000000012027373500024143 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/delete-cvsignore-cvsrepos/proj/Attic/0000775000076500007650000000000012027373500025207 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/delete-cvsignore-cvsrepos/proj/Attic/.cvsignore,v0000664000076500007650000000047510702477016027464 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; comment @# @; 1.2 date 2006.10.15.13.13.20; author mhagger; state dead; branches; next 1.1; 1.1 date 2006.10.15.13.12.43; author mhagger; state Exp; branches; next ; desc @@ 1.2 log @Remove .cvsignore @ text @*.o @ 1.1 log @Add random file and .cvsignore @ text @@ cvs2svn-2.4.0/test-data/delete-cvsignore-cvsrepos/proj/file.txt,v0000664000076500007650000000030210702477016026065 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2006.10.15.13.12.43; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Add random file and .cvsignore @ text @@ cvs2svn-2.4.0/test-data/many-deletes-cvsrepos/0000775000076500007650000000000012027373500022321 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/many-deletes-cvsrepos/proj/0000775000076500007650000000000012027373500023273 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/many-deletes-cvsrepos/proj/Attic/0000775000076500007650000000000012027373500024337 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/many-deletes-cvsrepos/proj/Attic/d.txt,v0000664000076500007650000000067611317026312025573 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH2:1.1.0.4 BRANCH:1.1.0.2 TAG:1.1; locks; strict; comment @# @; 1.1 date 2009.09.04.09.45.32; author mhagger; state dead; branches 1.1.4.2; next ; commitid Jeo5F7Vn6nyLrl2u; 1.1.4.2 date 2009.09.04.09.46.18; author mhagger; state Exp; branches; next ; commitid l4kyUGsxvoE1sl2u; desc @@ 1.1 log @file 
d.txt was initially added on branch BRANCH. @ text @@ 1.1.4.2 log @Add files on BRANCH2. @ text @@ cvs2svn-2.4.0/test-data/many-deletes-cvsrepos/proj/Attic/f.txt,v0000664000076500007650000000043411317026312025565 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH2:1.1.0.4 BRANCH:1.1.0.2 TAG:1.1; locks; strict; comment @# @; 1.1 date 2009.09.04.09.45.32; author mhagger; state dead; branches; next ; commitid Jeo5F7Vn6nyLrl2u; desc @@ 1.1 log @file e.txt was initially added on branch BRANCH. @ text @@ cvs2svn-2.4.0/test-data/many-deletes-cvsrepos/proj/Attic/e.txt,v0000664000076500007650000000077311317026312025572 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH2:1.1.0.4 BRANCH:1.1.0.2 TAG:1.1 TAG2:1.1.4.1; locks; strict; comment @# @; 1.1 date 2009.09.04.09.45.32; author mhagger; state dead; branches 1.1.4.1; next ; commitid Jeo5F7Vn6nyLrl2u; 1.1.4.1 date 2009.09.04.09.45.32; author mhagger; state dead; branches; next ; commitid l4kyUGsxvoE1sl2u; desc @@ 1.1 log @file e.txt was initially added on branch BRANCH. @ text @@ 1.1.4.1 log @file e.txt was added on branch BRANCH2 on 2009-09-04 09:46:18 +0000 @ text @@ cvs2svn-2.4.0/test-data/many-deletes-cvsrepos/proj/Attic/b.txt,v0000664000076500007650000000147311317026312025565 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH2:1.1.0.4 BRANCH:1.1.0.2 TAG:1.1 TAG2:1.1.4.1; locks; strict; comment @# @; 1.1 date 2009.09.04.09.45.32; author mhagger; state dead; branches 1.1.2.1 1.1.4.1; next ; commitid Jeo5F7Vn6nyLrl2u; 1.1.2.1 date 2009.09.04.09.45.32; author mhagger; state Exp; branches; next ; commitid Jeo5F7Vn6nyLrl2u; 1.1.4.1 date 2009.09.04.09.45.32; author mhagger; state dead; branches; next 1.1.4.2; commitid l4kyUGsxvoE1sl2u; 1.1.4.2 date 2009.09.04.09.46.18; author mhagger; state Exp; branches; next ; commitid l4kyUGsxvoE1sl2u; desc @@ 1.1 log @file b.txt was initially added on branch BRANCH. @ text @@ 1.1.4.1 log @file b.txt was added on branch BRANCH2 on 2009-09-04 09:46:18 +0000 @ text @@ 1.1.4.2 log @Add files on BRANCH2. @ text @@ 1.1.2.1 log @Add files on BRANCH. @ text @@ cvs2svn-2.4.0/test-data/many-deletes-cvsrepos/proj/Attic/c.txt,v0000664000076500007650000000123311317026312025560 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH2:1.1.0.4 BRANCH:1.1.0.2 TAG:1.1 TAG2:1.1.4.1; locks; strict; comment @# @; 1.1 date 2009.09.04.09.45.32; author mhagger; state dead; branches 1.1.4.1; next ; commitid Jeo5F7Vn6nyLrl2u; 1.1.4.1 date 2009.09.04.09.45.32; author mhagger; state dead; branches; next 1.1.4.2; commitid l4kyUGsxvoE1sl2u; 1.1.4.2 date 2009.09.04.09.46.18; author mhagger; state Exp; branches; next ; commitid l4kyUGsxvoE1sl2u; desc @@ 1.1 log @file c.txt was initially added on branch BRANCH. @ text @@ 1.1.4.1 log @file c.txt was added on branch BRANCH2 on 2009-09-04 09:46:18 +0000 @ text @@ 1.1.4.2 log @Add files on BRANCH2. 
@ text @@ cvs2svn-2.4.0/test-data/many-deletes-cvsrepos/proj/a.txt,v0000664000076500007650000000035111317026312024512 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH2:1.1.0.4 BRANCH:1.1.0.2; locks; strict; comment @# @; 1.1 date 2009.09.04.09.44.53; author mhagger; state Exp; branches; next ; commitid FASyv2HRCxOxrl2u; desc @@ 1.1 log @Add a.txt @ text @@ cvs2svn-2.4.0/test-data/overlapping-branch-cvsrepos/0000775000076500007650000000000012027373500023513 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/overlapping-branch-cvsrepos/nonoverlapping-branch,v0000664000076500007650000000067710702477014030211 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols vendorA:1.1.1; locks ; strict; comment @# @; 1.1 date 2002.11.30.19.27.42; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2002.11.30.19.27.42; author jrandom; state Exp; branches ; next ; desc @@ 1.1 log @Initial revision @ text @The content of this file is unimportant, what matters is that it has one branch. @ 1.1.1.1 log @imported @ text @@ cvs2svn-2.4.0/test-data/overlapping-branch-cvsrepos/overlapping-branch,v0000664000076500007650000000103510702477014027463 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols vendorA:1.1.1 vendorB:1.1.1; locks ; strict; comment @# @; 1.1 date 2002.11.30.19.27.42; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2002.11.30.19.27.42; author jrandom; state Exp; branches ; next ; desc @@ 1.1 log @Initial revision @ text @The content of this file is unimportant, what matters is that the same branch has two different symbolic names, a condition which cvs2svn.py should warn about. @ 1.1.1.1 log @imported @ text @@ cvs2svn-2.4.0/test-data/add-on-branch2-cvsrepos/0000775000076500007650000000000012027373500022411 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/add-on-branch2-cvsrepos/file1,v0000664000076500007650000000053411357430075023606 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH:1.1.0.2; locks; strict; comment @# @; 1.1 date 2005.12.05.22.35.31; author gward; state dead; branches 1.1.2.1; next ; 1.1.2.1 date 2005.12.05.22.35.31; author gward; state Exp; branches; next ; desc @@ 1.1 log @file file1 was added on branch BRANCH @ text @@ 1.1.2.1 log @add file on branch @ text @@ cvs2svn-2.4.0/test-data/tag-with-no-revision-cvsrepos/0000775000076500007650000000000012027373500023724 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/tag-with-no-revision-cvsrepos/file.txt,v0000664000076500007650000000061710702477015025656 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG:1.1.2.1 SUBBRANCH:1.1.2.1.0.2; locks; strict; comment @// @; 1.3 date 2004.03.19.19.47.32; author joeschmo; state dead; branches; next 1.2; 1.2 date 98.06.22.21.46.37; author joeschmo; state Exp; branches; next 1.1; 1.1 date 98.06.05.10.40.26; author joeschmo; state dead; branches; next ; desc @@ 1.3 log @@ text @@ 1.2 log @@ text @@ 1.1 log @@ text @@ cvs2svn-2.4.0/test-data/tag-with-no-revision-cvsrepos/README0000664000076500007650000000026510702477015024613 0ustar mhaggermhagger00000000000000This repository is corrupt in a way that appears rather common among our users. It contains a tag and a branch that refer to revision 1.1.2.1, but revision 1.1.2.1 does not exist. 
cvs2svn-2.4.0/test-data/internal-co-keywords-cvsrepos/0000775000076500007650000000000012027373500024012 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/internal-co-keywords-cvsrepos/dir/0000775000076500007650000000000012027373500024570 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/internal-co-keywords-cvsrepos/dir/kk.txt,v0000664000076500007650000000036510702477016026211 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; expand @k@; 1.1 date 2007.09.13.14.34.25; author ossi; state Exp; branches; next ; commitid e7E4xRVK9dgJfAxs; desc @@ 1.1 log @add @ text @some text $Id: literal blunder$ more text @ cvs2svn-2.4.0/test-data/internal-co-keywords-cvsrepos/dir/kv.txt,v0000664000076500007650000000040310702477016026215 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; 1.1 date 2007.09.13.14.34.25; author ossi; state Exp; branches; next ; commitid e7E4xRVK9dgJfAxs; desc @@ 1.1 log @add @ text @$Author$ $Date$ $RCSfile$ $Source$ $State$ $Revision$ $Id$ $Header$ @ cvs2svn-2.4.0/test-data/internal-co-keywords-cvsrepos/dir/ko.txt,v0000664000076500007650000000036510702477016026215 0ustar mhaggermhagger00000000000000head 1.1; access; symbols; locks; strict; comment @# @; expand @o@; 1.1 date 2007.09.13.14.34.25; author ossi; state Exp; branches; next ; commitid e7E4xRVK9dgJfAxs; desc @@ 1.1 log @add @ text @some text $Id: literal blunder$ more text @ cvs2svn-2.4.0/test-data/internal-co-keywords-cvsrepos/dir/kv-deleted.txt,v0000664000076500007650000000101512027257623027623 0ustar mhaggermhagger00000000000000head 1.2; branch; access; symbols b:1.1.1; locks; strict; comment @# @; 1.2 date 2004.07.28.10.42.27; author kfogel; state dead; branches; next 1.1; 1.1 date 2004.07.19.20.57.24; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.07.19.20.57.25; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @Deleting this dreadful file @ text @This is a fine file. This file resides at $Source$ @ 1.1 log @Add a file. @ text @@ 1.1.1.1 log @Adding a line @ text @a1 1 branches are fun @ cvs2svn-2.4.0/test-data/unlabeled-branch-cvsrepos/0000775000076500007650000000000012027373500023120 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/unlabeled-branch-cvsrepos/proj/0000775000076500007650000000000012027373500024072 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/unlabeled-branch-cvsrepos/proj/a.txt,v0000664000076500007650000000077411244037276025334 0ustar mhaggermhagger00000000000000head 1.1; access; symbols BRANCH:1.1.0.2; locks; strict; comment @# @; 1.1 date 2009.08.21.23.07.52; author mhagger; state Exp; branches 1.1.2.1 1.1.4.1; next ; 1.1.2.1 date 2009.08.21.23.08.54; author mhagger; state Exp; branches; next ; 1.1.4.1 date 2009.08.21.23.10.11; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Initial commit. @ text @1.1 @ 1.1.4.1 log @Commit on unlabeled branch. @ text @d1 1 a1 1 1.1.4.1 @ 1.1.2.1 log @Commit on BRANCH. @ text @d1 1 a1 1 1.1.2.1 @ cvs2svn-2.4.0/test-data/unlabeled-branch-cvsrepos/cvs2svn-ignore.options0000664000076500007650000000143211244037276027431 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # Try ignoring an unlabeled branch using a SymbolMapper (it should # fail). 
from cvs2svn_lib.symbol_transform import SymbolMapper execfile('cvs2svn-example.options') name = 'unlabeled-branch' ctx.output_option = NewRepositoryOutputOption( 'cvs2svn-tmp/%s--options=cvs2svn-ignore.options-svnrepos' % (name,), ) run_options.clear_projects() filename = 'test-data/%s-cvsrepos/proj/a.txt,v' % (name,) symbol_mapper = SymbolMapper([ (filename, 'unlabeled-1.1.4', '1.1.4', None), ]) run_options.add_project( r'test-data/%s-cvsrepos' % (name,), trunk_path='trunk', branches_path='branches', tags_path='tags', symbol_transforms=[ symbol_mapper, ], symbol_strategy_rules=global_symbol_strategy_rules, ) cvs2svn-2.4.0/test-data/bogus-branch-copy-cvsrepos/0000775000076500007650000000000012027373500023254 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/bogus-branch-copy-cvsrepos/Attic/0000775000076500007650000000000012027373500024320 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/bogus-branch-copy-cvsrepos/Attic/file3.txt,v0000664000076500007650000000137010702477016026333 0ustar mhaggermhagger00000000000000head 1.1; access; symbols branch-boom:1.1.0.2; locks; strict; comment @# @; 1.1 date 2005.03.04.18.57.55; author ira; state dead; branches 1.1.2.1; next ; 1.1.2.1 date 2005.03.04.18.57.55; author ira; state Exp; branches; next 1.1.2.2; 1.1.2.2 date 2005.03.28.22.18.05; author ira; state Exp; branches; next 1.1.2.3; 1.1.2.3 date 2005.03.29.20.21.54; author ira; state Exp; branches; next 1.1.2.4; 1.1.2.4 date 2005.04.05.15.47.56; author ira; state dead; branches; next ; desc @@ 1.1 log @file file3.txt was initially added on branch branch-boom. @ text @@ 1.1.2.1 log @This is a log message. @ text @@ 1.1.2.2 log @This is another log message. @ text @@ 1.1.2.3 log @This is yet another log message. @ text @@ 1.1.2.4 log @ci @ text @@ cvs2svn-2.4.0/test-data/bogus-branch-copy-cvsrepos/file2.txt,v0000664000076500007650000000163110702477016025266 0ustar mhaggermhagger00000000000000head 1.4; access; symbols branch-boom:1.3.0.2; locks; strict; comment @# @; 1.4 date 2005.07.21.15.31.18; author harry; state Exp; branches; next 1.3; 1.3 date 2005.04.01.20.47.29; author harry; state Exp; branches 1.3.2.1; next 1.2; 1.2 date 2005.03.29.23.11.45; author harry; state Exp; branches; next 1.1; 1.1 date 2005.03.23.23.20.09; author sally; state dead; branches; next ; 1.3.2.1 date 2005.04.01.20.47.29; author harry; state dead; branches; next 1.3.2.2; 1.3.2.2 date 2005.04.01.23.13.40; author harry; state Exp; branches; next ; desc @@ 1.4 log @merged sally to trunk @ text @@ 1.3 log @merged sally to trunk @ text @@ 1.3.2.1 log @file file2.txt was added on branch branch-boom on 2005-04-01 23:13:40 +0000 @ text @@ 1.3.2.2 log @merged trunk to ira @ text @@ 1.2 log @merged sally to trunk @ text @@ 1.1 log @file file2.txt.py was initially added on branch branch-sally. 
@ text @@ cvs2svn-2.4.0/test-data/bogus-branch-copy-cvsrepos/file1.txt,v0000664000076500007650000000125510702477016025267 0ustar mhaggermhagger00000000000000head 1.2; access; symbols branch-boom:1.2.0.16; locks; strict; comment @# @; 1.2 date 2005.03.29.23.11.45; author harry; state Exp; branches 1.2.16.1; next 1.1; 1.1 date 2005.03.24.22.00.30; author sally; state dead; branches; next ; 1.2.16.1 date 2005.03.29.23.11.45; author harry; state dead; branches; next 1.2.16.2; 1.2.16.2 date 2005.04.01.23.13.40; author harry; state Exp; branches; next ; desc @@ 1.2 log @merged sally to trunk @ text @@ 1.2.16.1 log @file file1.txt was added on branch branch-boom on 2005-04-01 23:13:40 +0000 @ text @@ 1.2.16.2 log @merged trunk to ira @ text @@ 1.1 log @file file1.txt was initially added on branch branch-sally. @ text @@ cvs2svn-2.4.0/test-data/preferred-parent-cycle-cvsrepos/0000775000076500007650000000000012027373500024274 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/preferred-parent-cycle-cvsrepos/dir/0000775000076500007650000000000012027373500025052 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/preferred-parent-cycle-cvsrepos/dir/file1,v0000664000076500007650000000064110702477021026241 0ustar mhaggermhagger00000000000000head 1.1; access; symbols C:1.1.2.1.0.6 B:1.1.2.1.0.4 A:1.1.2.1.0.2 X:1.1.0.2; locks; strict; comment @# @; 1.1 date 2007.04.22.16.35.58; author mhagger; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2007.04.22.16.35.59; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding files on trunk @ text @1.1 @ 1.1.2.1 log @Adding revision on first-level branches @ text @d1 1 a1 1 1.1.2.1 @ cvs2svn-2.4.0/test-data/preferred-parent-cycle-cvsrepos/dir/file3,v0000664000076500007650000000064110702477021026243 0ustar mhaggermhagger00000000000000head 1.1; access; symbols B:1.1.2.1.0.6 A:1.1.2.1.0.4 C:1.1.2.1.0.2 Z:1.1.0.2; locks; strict; comment @# @; 1.1 date 2007.04.22.16.35.58; author mhagger; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2007.04.22.16.35.59; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding files on trunk @ text @1.1 @ 1.1.2.1 log @Adding revision on first-level branches @ text @d1 1 a1 1 1.1.2.1 @ cvs2svn-2.4.0/test-data/preferred-parent-cycle-cvsrepos/dir/file2,v0000664000076500007650000000064110702477021026242 0ustar mhaggermhagger00000000000000head 1.1; access; symbols A:1.1.2.1.0.6 C:1.1.2.1.0.4 B:1.1.2.1.0.2 Y:1.1.0.2; locks; strict; comment @# @; 1.1 date 2007.04.22.16.35.58; author mhagger; state Exp; branches 1.1.2.1; next ; 1.1.2.1 date 2007.04.22.16.35.59; author mhagger; state Exp; branches; next ; desc @@ 1.1 log @Adding files on trunk @ text @1.1 @ 1.1.2.1 log @Adding revision on first-level branches @ text @d1 1 a1 1 1.1.2.1 @ cvs2svn-2.4.0/test-data/preferred-parent-cycle-cvsrepos/makerepo.sh0000775000076500007650000000471710702477021026450 0ustar mhaggermhagger00000000000000#! /bin/sh # This is the script used to create the preferred-parent-cycle CVS # repository. (The repository is checked into svn; this script is # only here for its documentation value.) # # The script should be started from the main cvs2svn directory. # # The branching structure of the three files in this repository is # constructed to create a loop in the preferred parent of each branch # A, B, and C. 
The branches are as follows ('*' marks revisions, # which are used to prevent trunk from being a possible parent of # branches A, B, or C): # # file1: # --*--+-------------------- trunk # | # +--*--+-------------- branch X # | # +-------------- branch A # | # +-------------- branch B # | # +-------------- branch C # # file2: # --*--+-------------------- trunk # | # +--*--+-------------- branch Y # | # +-------------- branch B # | # +-------------- branch C # | # +-------------- branch A # # file3: # --*--+-------------------- trunk # | # +--*--+-------------- branch Z # | # +-------------- branch C # | # +-------------- branch A # | # +-------------- branch B # # Note that the possible parents of A are (X, Y, Z, C*2, B*1), those # of B are (X, Y, Z, A*2, C*1), and those of C are (X, Y, Z, B*2, # A*1). Therefore the preferred parents form a cycle A -> C -> B -> # A. repo=`pwd`/test-data/preferred-parent-cycle-cvsrepos wc=`pwd`/cvs2svn-tmp/preferred-parent-cycle-wc [ -e $repo/CVSROOT ] && rm -rf $repo/CVSROOT [ -e $repo/dir ] && rm -rf $repo/dir [ -e $wc ] && rm -rf $wc cvs -d $repo init cvs -d $repo co -d $wc . cd $wc mkdir dir cvs add dir cd dir echo '1.1' >file1 echo '1.1' >file2 echo '1.1' >file3 cvs add file1 file2 file3 cvs commit -m 'Adding files on trunk' . cvs tag -b X file1 cvs up -r X file1 cvs tag -b Y file2 cvs up -r Y file2 cvs tag -b Z file3 cvs up -r Z file3 echo '1.1.2.1' >file1 echo '1.1.2.1' >file2 echo '1.1.2.1' >file3 cvs commit -m 'Adding revision on first-level branches' . cvs tag -b A file1 cvs up -r A file1 cvs tag -b B file1 cvs up -r B file1 cvs tag -b C file1 cvs up -r C file1 cvs tag -b B file2 cvs up -r B file2 cvs tag -b C file2 cvs up -r C file2 cvs tag -b A file2 cvs up -r A file2 cvs tag -b C file3 cvs up -r C file3 cvs tag -b A file3 cvs up -r A file3 cvs tag -b B file3 cvs up -r B file3 cvs2svn-2.4.0/test-data/bogus-tag-cvsrepos/0000775000076500007650000000000012027373500021622 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/bogus-tag-cvsrepos/bogus-tag,v0000664000076500007650000000104210702477013023676 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access ; symbols ends_with_slash/:1.1; locks ; strict; comment @# @; 1.1 date 2002.11.30.19.27.42; author jrandom; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2002.11.30.19.27.42; author jrandom; state Exp; branches ; next ; desc @@ 1.1 log @The content of this revision is unimportant, what matters is that there's a tag named 'ends_with_slash/' on it, an illegal tag name that cvs2svn.py should detect early. @ text @Nothing to see here. 
@ 1.1.1.1 log @imported @ text @@ cvs2svn-2.4.0/test-data/revision-reorder-bug-cvsrepos/0000775000076500007650000000000012027373500024003 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/revision-reorder-bug-cvsrepos/file.txt,v0000664000076500007650000000060410702477021025726 0ustar mhaggermhagger00000000000000head 1.3; access; symbols; locks; strict; comment @# @; 1.3 date 2003.09.11.13.57.15; author qwerty; state Exp; branches; next 1.2; 1.2 date 2003.09.11.13.57.15; author qwerty; state dead; branches; next 1.1; 1.1 date 2003.09.09.15.47.53; author qwerty; state Exp; branches; next ; desc @@ 1.3 log @x @ text @x x x x x @ 1.2 log @x @ text @d1 1 a1 1 x @ 1.1 log @x @ text @@ cvs2svn-2.4.0/test-data/invalid-closings-on-trunk-cvsrepos/0000775000076500007650000000000012027373500024752 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/invalid-closings-on-trunk-cvsrepos/proj/0000775000076500007650000000000012027373500025724 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/invalid-closings-on-trunk-cvsrepos/proj/trunk-changed-later.txt,v0000664000076500007650000000164210702477015032575 0ustar mhaggermhagger00000000000000head 1.2; access; symbols TAG-ALL-FILES:1.1 vtag-1:1.1.1.1 vbranchA:1.1.1; locks; strict; comment @# @; 1.2 date 2004.02.19.15.43.13; author fitz; state Exp; branches; next 1.1; 1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next ; desc @@ 1.2 log @Commit to trunk-changed-later.txt later in its life @ text @This is a change to trunk-changed-later.txt intended to generate multiple closings (since it was on a default branch prior to this commit). @ 1.1 log @Initial revision @ text @d1 3 a3 1 This is vtag-1 (on vbranchA) of trunk-changed-later.txt. @ 1.1.1.1 log @Import (vbranchA, vtag-1). @ text @@ 1.1.1.2 log @Import (vbranchA, vtag-3). @ text @d1 1 a1 1 This is vtag-3 (on vbranchA) of trunk-changed-later.txt. @ cvs2svn-2.4.0/test-data/invalid-closings-on-trunk-cvsrepos/proj/deleted-on-vendor-branch.txt,v0000664000076500007650000000122110702477015033475 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols TAG-ALL-FILES:1.1 vtag-1:1.1.1.1 vbranchA:1.1.1; locks; strict; comment @# @; 1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2004.02.09.15.43.13; author kfogel; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2004.02.09.15.43.13; author kfogel; state dead; branches; next ; desc @@ 1.1 log @Initial revision @ text @This is vtag-1 (on vbranchA) of deleted-on-vendor-branch.txt. @ 1.1.1.1 log @Import (vbranchA, vtag-1). @ text @@ 1.1.1.2 log @Import (vbranchA, vtag-3). @ text @d1 1 a1 1 This is vtag-3 (on vbranchA) of deleted-on-vendor-branch.txt. @ cvs2svn-2.4.0/test-data/invalid-closings-on-trunk-cvsrepos/proj/a.txt,v0000664000076500007650000000033610702477015027155 0ustar mhaggermhagger00000000000000head 1.1; access; symbols TAG-ALL-FILES:1.1; locks; strict; comment @# @; 1.1 date 2004.02.19.15.43.13; author fitz; state Exp; branches; next ; desc @@ 1.1 log @Committing files a.txt and b.txt on trunk. 
@ text @@ cvs2svn-2.4.0/test-data/invalid-closings-on-trunk-cvsrepos/proj/b.txt,v0000664000076500007650000000033610702477015027156 0ustar mhaggermhagger00000000000000head 1.1; access; symbols TAG-ALL-FILES:1.1; locks; strict; comment @# @; 1.1 date 2004.02.19.15.43.13; author fitz; state Exp; branches; next ; desc @@ 1.1 log @Committing files a.txt and b.txt on trunk. @ text @@ cvs2svn-2.4.0/test-data/multiply-defined-symbols-cvsrepos/0000775000076500007650000000000012027373500024673 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/multiply-defined-symbols-cvsrepos/cvs2svn-rename.options0000664000076500007650000000151710760115614031167 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # Fix a problem with multiply-defined symbols by renaming one copy of # each symbol. from cvs2svn_lib.symbol_transform import SymbolMapper execfile('cvs2svn-example.options') name = 'multiply-defined-symbols' ctx.output_option = NewRepositoryOutputOption( 'cvs2svn-tmp/%s--options=cvs2svn-rename.options-svnrepos' % (name,), ) run_options.clear_projects() filename = 'test-data/%s-cvsrepos/proj/default,v' % (name,) symbol_mapper = SymbolMapper([ (filename, 'BRANCH', '1.2.4', 'BRANCH2'), (filename, 'TAG', '1.2', 'TAG2'), ]) run_options.add_project( r'test-data/%s-cvsrepos' % (name,), trunk_path='trunk', branches_path='branches', tags_path='tags', symbol_transforms=[ symbol_mapper, ], symbol_strategy_rules=global_symbol_strategy_rules, ) cvs2svn-2.4.0/test-data/multiply-defined-symbols-cvsrepos/proj/0000775000076500007650000000000012027373500025645 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/multiply-defined-symbols-cvsrepos/proj/default,v0000664000076500007650000000105310702477013027457 0ustar mhaggermhagger00000000000000head 1.2; access; symbols BRANCH:1.2.0.4 BRANCH:1.2.0.2 TAG:1.2 TAG:1.1; locks; strict; comment @# @; 1.2 date 2003.05.23.00.17.53; author jrandom; state Exp; branches 1.2.2.1 1.2.4.1; next 1.1; 1.1 date 2003.05.22.23.20.19; author jrandom; state Exp; branches; next ; 1.2.2.1 date 2003.05.23.00.31.36; author jrandom; state Exp; branches; next ; 1.2.4.1 date 2003.06.03.03.20.31; author jrandom; state Exp; branches; next ; desc @@ 1.2 log @@ text @@ 1.2.4.1 log @@ text @@ 1.2.2.1 log @@ text @@ 1.1 log @Initial revision @ text @@ cvs2svn-2.4.0/test-data/multiply-defined-symbols-cvsrepos/cvs2svn-ignore.options0000664000076500007650000000151010760115614031174 0ustar mhaggermhagger00000000000000# (Be in -*- python -*- mode.) # Fix a problem with multiply-defined symbols by ignoring one copy of # each symbol. 
from cvs2svn_lib.symbol_transform import SymbolMapper execfile('cvs2svn-example.options') name = 'multiply-defined-symbols' ctx.output_option = NewRepositoryOutputOption( 'cvs2svn-tmp/%s--options=cvs2svn-ignore.options-svnrepos' % (name,), ) run_options.clear_projects() filename = 'test-data/%s-cvsrepos/proj/default,v' % (name,) symbol_mapper = SymbolMapper([ (filename, 'BRANCH', '1.2.4', None), (filename, 'TAG', '1.2', None), ]) run_options.add_project( r'test-data/%s-cvsrepos' % (name,), trunk_path='trunk', branches_path='branches', tags_path='tags', symbol_transforms=[ symbol_mapper, ], symbol_strategy_rules=global_symbol_strategy_rules, ) cvs2svn-2.4.0/test-data/mirror-keyerror3-cvsrepos/0000775000076500007650000000000012027373500023167 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror3-cvsrepos/proj/0000775000076500007650000000000012027373500024141 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror3-cvsrepos/proj/subdir/0000775000076500007650000000000012027373500025431 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror3-cvsrepos/proj/subdir/file3.txt,v0000664000076500007650000000052111011572260027430 0ustar mhaggermhagger00000000000000head 1.1; access; symbols tree-serialize-branch:1.1.0.2 NET:1.1.1; locks; strict; comment @ * @; 1.1 date 99.05.04.19.30.27; author author2; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 99.05.04.19.30.27; author author2; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @log 5@ text @@ cvs2svn-2.4.0/test-data/mirror-keyerror3-cvsrepos/proj/subdir/Attic/0000775000076500007650000000000012027373500026475 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/mirror-keyerror3-cvsrepos/proj/subdir/Attic/file4.txt,v0000664000076500007650000000063011011572261030477 0ustar mhaggermhagger00000000000000head 1.2; access; symbols NET:1.1.1; locks; strict; comment @# @; 1.2 date 99.05.05.10.04.36; author author2; state dead; branches; next 1.1; 1.1 date 99.05.04.19.30.26; author author2; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 99.05.04.19.30.26; author author2; state Exp; branches; next ; desc @@ 1.2 log @log 6@ text @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @log 5@ text @@ cvs2svn-2.4.0/test-data/mirror-keyerror3-cvsrepos/proj/subdir/file2.txt,v0000664000076500007650000000056011011572260027432 0ustar mhaggermhagger00000000000000head 1.1; access; symbols tree-serialize-branch:1.1.0.2 tree-serialize-branchpoint:1.1 NET:1.1.1; locks; strict; comment @# @; 1.1 date 99.05.04.19.30.26; author author2; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 99.05.04.19.30.26; author author2; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @log 5@ text @@ cvs2svn-2.4.0/test-data/mirror-keyerror3-cvsrepos/proj/file1.txt,v0000664000076500007650000000034711011572260026144 0ustar mhaggermhagger00000000000000head 1.1; access; symbols tree-serialize-branch:1.1.0.2 tree-serialize-branchpoint:1.1; locks; strict; comment @# @; 1.1 date 2000.07.22.08.08.22; author author1; state Exp; branches; next ; desc @@ 1.1 log @log 1@ text @@ cvs2svn-2.4.0/test-data/issue-99-cvsrepos/0000775000076500007650000000000012027373500021321 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/issue-99-cvsrepos/file1,v0000664000076500007650000000053210702477020022506 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols David:1.1.1; locks; strict; comment @# @; expand @o@; 1.1 date 2003.07.08.20.57.54; author proski; 
state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.07.08.20.57.54; author proski; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Version 0.13e @ text @@ cvs2svn-2.4.0/test-data/issue-99-cvsrepos/file2,v0000664000076500007650000000100510702477020022503 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols orinoco_0_13d:1.1.1.2 orinoco_0_13c:1.1.1.1 David:1.1.1; locks; strict; comment @ * @; expand @o@; 1.1 date 2003.07.08.20.56.26; author proski; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2003.07.08.20.56.26; author proski; state Exp; branches; next 1.1.1.2; 1.1.1.2 date 2003.07.08.20.56.50; author proski; state Exp; branches; next ; desc @@ 1.1 log @Initial revision @ text @@ 1.1.1.1 log @Version 0.13c @ text @@ 1.1.1.2 log @Version 0.13d @ text @@ cvs2svn-2.4.0/test-data/branch-from-vendor-branch-cvsrepos/0000775000076500007650000000000012027373500024656 5ustar mhaggermhagger00000000000000././@LongLink0000000000000000000000000000014600000000000011216 Lustar 00000000000000cvs2svn-2.4.0/test-data/branch-from-vendor-branch-cvsrepos/branch-from-vendor-branch-symbol-hints.txtcvs2svn-2.4.0/test-data/branch-from-vendor-branch-cvsrepos/branch-from-vendor-branch-symbol-hints.tx0000664000076500007650000000052711435427214034615 0ustar mhaggermhagger00000000000000# Columns: #project_id symbol_name conversion symbol_path preferred_parent_name 0 .trunk. trunk trunk . 0 my-branch branch branches/my-branch .trunk. 0 vendor-tag exclude . . 0 vendor-branch exclude . . cvs2svn-2.4.0/test-data/branch-from-vendor-branch-cvsrepos/data,v0000664000076500007650000000115211435427214025757 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols my-branch:1.1.1.1.0.2 vendor-tag:1.1.1.1 vendor-branch:1.1.1; locks; strict; comment @# @; 1.1 date 2010.04.08.15.37.56; author fosterj; state Exp; branches 1.1.1.1; next ; commitid 2i5HeSdvL0B9s8uu; 1.1.1.1 date 2010.04.08.15.37.56; author fosterj; state Exp; branches 1.1.1.1.2.1; next ; commitid 2i5HeSdvL0B9s8uu; 1.1.1.1.2.1 date 2010.04.08.15.38.58; author fosterj; state Exp; branches; next ; commitid eDJ6tPpuBwVxs8uu; desc @@ 1.1 log @Initial revision @ text @x @ 1.1.1.1 log @Test import @ text @@ 1.1.1.1.2.1 log @Branch commit @ text @d1 1 a1 1 y @ cvs2svn-2.4.0/test-data/resync-pass2-push-backward-cvsrepos/0000775000076500007650000000000012027373500025014 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/resync-pass2-push-backward-cvsrepos/file2.txt,v0000664000076500007650000000041710702477013027024 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; 1.2 date 90.04.19.15.10.21; author user1; state Exp; branches; next 1.1; 1.1 date 90.04.19.15.10.21; author user1; state Exp; branches; next ; desc @@ 1.2 log @Initial revision @ text @@ 1.1 log @Summary: foo @ text @@ cvs2svn-2.4.0/test-data/resync-pass2-push-backward-cvsrepos/file1.txt,v0000664000076500007650000000041710702477013027023 0ustar mhaggermhagger00000000000000head 1.2; access; symbols; locks; strict; 1.2 date 90.04.19.15.10.30; author user1; state Exp; branches; next 1.1; 1.1 date 90.04.19.15.10.29; author user1; state Exp; branches; next ; desc @@ 1.2 log @Summary: foo @ text @@ 1.1 log @Initial revision @ text @@ cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/0000775000076500007650000000000012027373500023402 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/direct/0000775000076500007650000000000012027373500024654 5ustar 
mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/direct/b.txt,v0000664000076500007650000000101311324543262026076 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG:1.3 BRANCH:1.3.0.2; locks; strict; comment @# @; 1.3 date 2010.01.17.03.34.23; author mhagger; state Exp; branches; next 1.2; commitid 0WdTFEWnGdyo3Fju; 1.2 date 2010.01.17.03.31.15; author mhagger; state dead; branches; next 1.1; commitid Ok3DlSXoWUFj2Fju; 1.1 date 2010.01.16.06.19.51; author mhagger; state Exp; branches; next ; commitid T0xGHIyHXxz90yju; desc @@ 1.3 log @Re-add b.txt. @ text @b2 @ 1.2 log @Remove b.txt. @ text @d1 1 a1 1 b @ 1.1 log @Add b.txt. @ text @@ cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/direct/empty-directory/0000775000076500007650000000000012027373500030014 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/direct/empty-directory/empty-subdirectory/0000775000076500007650000000000012027373500033666 5ustar mhaggermhagger00000000000000././@LongLink0000000000000000000000000000015000000000000011211 Lustar 00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/direct/empty-directory/empty-subdirectory/README.txtcvs2svn-2.4.0/test-data/empty-directories-cvsrepos/direct/empty-directory/empty-subdirectory/README.0000664000076500007650000000014311324543262034625 0ustar mhaggermhagger00000000000000This directory should be created when its parent directory is created, namely when b.txt is added. cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/direct/empty-directory/README.txt0000664000076500007650000000007611324543262031520 0ustar mhaggermhagger00000000000000This empty directory should be created when b.txt is created. cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/root-empty-directory/0000775000076500007650000000000012027373500027523 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/root-empty-directory/empty-subdirectory/0000775000076500007650000000000012027373500033375 5ustar mhaggermhagger00000000000000././@LongLink0000000000000000000000000000014600000000000011216 Lustar 00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/root-empty-directory/empty-subdirectory/README.txtcvs2svn-2.4.0/test-data/empty-directories-cvsrepos/root-empty-directory/empty-subdirectory/README.tx0000664000076500007650000000007511324543262034714 0ustar mhaggermhagger00000000000000This directory should be created when its parent is created. cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/root-empty-directory/README.txt0000664000076500007650000000025011324543262031221 0ustar mhaggermhagger00000000000000This directory does not contain any RCS files, so if the --include-empty-directories option is used it should be created in the commit that initializes the repository. 
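A rough sketch of how the empty-directory handling described in the README above could be checked by hand. This is illustrative only (the output path is a placeholder, and cvs2svn is assumed to be run from the top of the source tree):

    # Convert with empty-directory support enabled (the option named in
    # the README above).
    cvs2svn --include-empty-directories -s /tmp/empty-dirs-svnrepos \
        test-data/empty-directories-cvsrepos

    # The empty directories should appear in the converted tree alongside
    # the directories that contain RCS files.
    svn list -R file:///tmp/empty-dirs-svnrepos/trunk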
cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/indirect/0000775000076500007650000000000012027373500025203 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/indirect/subdirectory/0000775000076500007650000000000012027373500027721 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/indirect/subdirectory/c.txt,v0000664000076500007650000000101311324543262031144 0ustar mhaggermhagger00000000000000head 1.3; access; symbols TAG:1.3 BRANCH:1.3.0.2; locks; strict; comment @# @; 1.3 date 2010.01.17.03.34.55; author mhagger; state Exp; branches; next 1.2; commitid 0ur5CblrKrQz3Fju; 1.2 date 2010.01.17.03.31.45; author mhagger; state dead; branches; next 1.1; commitid 7o2iSAs9MgMu2Fju; 1.1 date 2010.01.16.06.24.25; author mhagger; state Exp; branches; next ; commitid qH19tYzVmpCI1yju; desc @@ 1.3 log @Re-add c.txt. @ text @c2 @ 1.2 log @Remove c.txt. @ text @d1 1 a1 1 c @ 1.1 log @Add c.txt. @ text @@ cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/indirect/empty-directory/0000775000076500007650000000000012027373500030343 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/indirect/empty-directory/empty-subdirectory/0000775000076500007650000000000012027373500034215 5ustar mhaggermhagger00000000000000././@LongLink0000000000000000000000000000015200000000000011213 Lustar 00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/indirect/empty-directory/empty-subdirectory/README.txtcvs2svn-2.4.0/test-data/empty-directories-cvsrepos/indirect/empty-directory/empty-subdirectory/READM0000664000076500007650000000013111324543262034766 0ustar mhaggermhagger00000000000000This directory should be created when its parent is created, namely when c.txt is added. cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/indirect/empty-directory/README.txt0000664000076500007650000000014111324543262032040 0ustar mhaggermhagger00000000000000This directory should be created when the addition of c.txt triggers the creation of its parent. cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/a.txt,v0000664000076500007650000000034411324543262024631 0ustar mhaggermhagger00000000000000head 1.1; access; symbols TAG:1.1 BRANCH:1.1.0.2; locks; strict; comment @# @; 1.1 date 2010.01.16.06.17.56; author mhagger; state Exp; branches; next ; commitid 1nYTVRk8r2OuZxju; desc @@ 1.1 log @Add a.txt. @ text @a @ cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/import/0000775000076500007650000000000012027373500024714 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/import/d.txt,v0000664000076500007650000000113511325311442026137 0ustar mhaggermhagger00000000000000head 1.1; branch 1.1.1; access; symbols VENDORTAG2:1.1.1.2 VENDORTAG:1.1.1.1 VENDORBRANCH:1.1.1; locks; strict; comment @# @; 1.1 date 2010.01.17.04.15.38; author mhagger; state Exp; branches 1.1.1.1; next ; commitid 3LFcsqdQSLZxhFju; 1.1.1.1 date 2010.01.17.04.15.38; author mhagger; state Exp; branches; next 1.1.1.2; commitid 3LFcsqdQSLZxhFju; 1.1.1.2 date 2010.01.17.04.24.25; author mhagger; state Exp; branches; next ; commitid ziMoRPJIXZOykFju; desc @@ 1.1 log @Initial revision @ text @d @ 1.1.1.1 log @Import d.txt. @ text @@ 1.1.1.2 log @Re-import d.txt. 
@ text @d1 1 a1 1 d2 @ cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/import/empty-directory/0000775000076500007650000000000012027373500030054 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/import/empty-directory/empty-subdirectory/0000775000076500007650000000000012027373500033726 5ustar mhaggermhagger00000000000000././@LongLink0000000000000000000000000000015000000000000011211 Lustar 00000000000000cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/import/empty-directory/empty-subdirectory/README.txtcvs2svn-2.4.0/test-data/empty-directories-cvsrepos/import/empty-directory/empty-subdirectory/README.0000664000076500007650000000014311325311442034657 0ustar mhaggermhagger00000000000000This directory should be created when its parent directory is created, namely when d.txt is added. cvs2svn-2.4.0/test-data/empty-directories-cvsrepos/import/empty-directory/README.txt0000664000076500007650000000007611325311442031552 0ustar mhaggermhagger00000000000000This empty directory should be created when d.txt is created. cvs2svn-2.4.0/www/0000775000076500007650000000000012027373500015026 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/www/project_tools.html0000664000076500007650000000207511435427214020612 0ustar mhaggermhagger00000000000000
cvs2svn-2.4.0/www/faq.html0000664000076500007650000014224411710517254016476 0ustar mhaggermhagger00000000000000 cvs2svn FAQ

cvs2svn FAQ

General:

  1. Does cvs2svn support incremental repository conversion?

Compatibility:

  1. Does cvs2svn run under Psyco?

How-to:

  1. How can I convert a CVS repository to which I only have remote access?
  2. How can I convert my CVS repository one module at a time?
  3. How can I convert part of a CVS repository?
  4. How can I convert separate projects in my CVS repository into a single Subversion repository?
  5. I have hundreds of subprojects to convert and my options file is getting huge
  6. How can I convert project foo so that trunk/tags/branches are inside of foo?
  7. How do I fix up end-of-line translation problems?
  8. I want a single project but tag-rewriting rules that vary by subdirectory. Can this be done?
  9. How can I convert a CVSNT repository?
  10. How do I get cvs2svn to run on OS X 10.5.5?

Problems:

  1. I get an error "A CVS repository cannot contain both repo/path/file.txt,v and repo/path/Attic/file.txt,v". What can I do?
  2. I get an error "ERROR: filename,v is not a valid ,v file."
  3. gdbm.error: (45, 'Operation not supported')
  4. When converting a CVS repository that was used on a Macintosh, some files have incorrect contents in SVN.
  5. Using cvs2svn 1.3.x, I get an error "The command '['co', '-q', '-x,v', '-p1.1', '-kk', '/home/cvsroot/myfile,v']' failed" in pass 8.
  6. Vendor branches created with "cvs import -b <branch number>" are not correctly handled.

Getting help:

  1. How do I get help?
  2. What information should I include when requesting help?
  3. How do I subscribe to a mailing list?
  4. How do I report a bug?
  5. How can I produce a useful test case?
  6. Does anybody offer commercial support for cvs2svn/cvs2git conversions?

General:

Does cvs2svn support incremental repository conversion?

No.

Explanation: During the transition from CVS to Subversion, it would sometimes be useful to have the new Subversion repository track activity in the CVS repository for a period of time until the final switchover. This would require each conversion to determine what had changed in CVS since the last conversion, and add those commits on top of the Subversion repository.

Unfortunately, cvs2svn/cvs2git does not support incremental conversions. With some work it would be possible to add this feature, but it would be difficult to make it robust. The trickiest problem is that CVS allows changes to the repository that have retroactive effects (e.g., affecting parts of the history that have already been converted).

Some conversion tools claim to support incremental conversions from CVS, but as far as is known none of them are reliable.

Volunteers or sponsorship to add support for incremental conversions to cvs2svn/cvs2git would be welcome.


Compatibility:

Does cvs2svn run under Psyco?

No.

Explanation: Psyco is a python extension that can speed up the execution of Python code by compiling parts of it into i386 machine code. Unfortunately, Psyco is known not to run cvs2svn correctly (this was last tested with the Psyco pre-2.0 development branch). When cvs2svn is run under Psyco it crashes in OutputPass with an error message that looks something like this:

cvs2svn_lib.common.InternalError: ID changed from 2 -> 3 for Trunk, r2

The Psyco team has been informed about the problem.


How-to:

How can I convert a CVS repository to which I only have remote access?

cvs2svn requires direct, filesystem access to a copy of the CVS repository that you want to convert. The reason for this requirement is that cvs2svn directly parses the *,v files that make up the CVS repository.

Many remote hosting sites provide access to backups of your CVS repository, which could be used for a cvs2svn conversion. For example:

If your provider does not provide any way to download your CVS repository, there are two known tools that claim to be able to clone a CVS repository via the CVS protocol:

It should be possible to use one of these tools to fetch a copy of your CVS repository from your provider, then to use cvs2svn to convert the copy. However, the developers of cvs2svn do not have any experience with these tools, so you are on your own here. If you try one of them, please tell us about your experience on the users mailing list.

How can I convert my CVS repository one module at a time?

If you need to convert certain CVS modules (in one large repository) to Subversion now and other modules later, you may want to convert your repository one module at a time. This situation is typically encountered in large organizations where each project has a separate lifecycle and schedule, and a one-step conversion process is not practical.

First you have to decide whether you want to put your converted projects into a single Subversion repository or multiple ones. This decision mostly depends on the degree of coupling between the projects and is beyond the scope of this FAQ. See the Subversion book for a discussion of repository organization.

If you decide to convert your projects into separate Subversion repositories, then please follow the instructions in How can I convert part of a CVS repository? once for each repository.

If you decide to put more than one CVS project into a single Subversion repository, then please follow the instructions in How can I convert separate projects in my CVS repository into a single Subversion repository?.

How can I convert part of a CVS repository?

This is easy: simply run cvs2svn normally, passing it the path of the project subdirectory within the CVS repository. Since cvs2svn ignores any files outside of the path it is given, other projects within the CVS repository will be excluded from the conversion.

Example: You have a CVS repository at path /path/cvsrepo with projects in subdirectories /path/cvsrepo/foo and /path/cvsrepo/bar, and you want to create a new Subversion repository at /path/foo-svn that includes only the foo project:

    $ cvs2svn -s /path/foo-svn /path/cvsrepo/foo

How can I convert separate projects in my CVS repository into a single Subversion repository?

cvs2svn supports multiproject conversions, but you have to use the options file method to start the conversion. In your options file, you simply call run_options.add_project() once for each sub-project in your repository. For example, if your CVS repository has the layout:

  /project_a
  /project_b

and you want your Subversion repository to be laid out like this:

   project_a/
      trunk/
         ...
      branches/
         ...
      tags/
         ...
   project_b/
      trunk/
         ...
      branches/
         ...
      tags/
         ...

then you need to have a section like this in your options file:

run_options.add_project(
    'my/cvsrepo/project_a',
    trunk_path='project_a/trunk',
    branches_path='project_a/branches',
    tags_path='project_a/tags',
    symbol_transforms=[
        #...whatever...
        ],
    symbol_strategy_rules=[
        #...whatever...
        ],
    )
run_options.add_project(
    'my/cvsrepo/project_b',
    trunk_path='project_b/trunk',
    branches_path='project_b/branches',
    tags_path='project_b/tags',
    symbol_transforms=[
        #...whatever...
        ],
    symbol_strategy_rules=[
        #...whatever...
        ],
    )

I have hundreds of subprojects to convert and my options file is getting huge

The options file is Python code, executed by the Python interpreter. This makes it easy to automate parts of the configuration process. For example, to add many subprojects, you can write a Python loop:

projects = ['A', 'B', 'C', ...etc...]

cvs_repo_main_dir = r'test-data/main-cvsrepos'
for project in projects:
    run_options.add_project(
        cvs_repo_main_dir + '/' + project,
        trunk_path=(project + '/trunk'),
        branches_path=(project + '/branches'),
        tags_path=(project + '/tags'),
        # ...
        )

or you could even read the subprojects directly from the CVS repository:

import os
cvs_repo_main_dir = r'test-data/main-cvsrepos'
projects = os.listdir(cvs_repo_main_dir)

# Probably you don't want to convert CVSROOT:
projects.remove('CVSROOT')

for project in projects:
    # ...as above...

How can I convert project foo so that trunk/tags/branches are inside of foo?

If foo is the only project that you want to convert, then either run cvs2svn like this:

   $ cvs2svn --trunk=foo/trunk --branches=foo/branches --tags=foo/tags CVSREPO/foo

or use an options file that defines a project like this:

run_options.add_project(
    'my/cvsrepo/foo',
    trunk_path='foo/trunk',
    branches_path='foo/branches',
    tags_path='foo/tags',
    symbol_transforms=[
        #...whatever...
        ],
    symbol_strategy_rules=[
        #...whatever...
        ],
    )

If foo is not the only project that you want to convert, then you need to do a multiproject conversion; see How can I convert separate projects in my CVS repository into a single Subversion repository? for more information.

How do I fix up end-of-line translation problems?

Warning: cvs2svn's handling of end-of-line options changed between version 1.5.x and version 2.0.x. This documentation applies to version 2.0.x and later. The documentation applying to an earlier version can be found in the www directory of that release of cvs2svn.

Starting with version 2.0, the default behavior of cvs2svn is to treat all files as binary except those explicitly determined to be text. (Previous versions treated files as text unless they were determined to be binary.) This behavior was changed because, generally speaking, it is safer to treat a text file as binary than vice versa.

However, it is often preferred to set svn:eol-style=native for text files, so that their end-of-line format is converted to that of the client platform when the file is checked out. This section describes how to get the settings that you want.

If a file is marked as binary in CVS (with cvs admin -kb), then cvs2svn will always treat the file as binary. For other files, cvs2svn has a number of options that can help choose the correct end-of-line translation parameters during the conversion:

--auto-props=FILE

Set arbitrary Subversion properties on files based on the auto-props section of a file in svn config format. The auto-props file might have content like this:

[auto-props]
*.txt = svn:mime-type=text/plain;svn:eol-style=native
*.doc = svn:mime-type=application/msword;!svn:eol-style

This option can also be used in combination with --eol-from-mime-type.

To force end-of-line translation off, use a setting of the form !svn:eol-style (with a leading exclamation point).

--mime-types=FILE

Specifies an Apache-style mime.types file for setting files' svn:mime-type property based on the file extension. The mime-types file might have content like this:

text/plain              txt
application/msword      doc

This option only has an effect on svn:eol-style if it is used in combination with --eol-from-mime-type.

--eol-from-mime-type

Set svn:eol-style based on the file's mime type (if known). If the mime type starts with "text/", then the file is treated as a text file; otherwise, it is treated as binary. This option is useful in combination with --auto-props or --mime-types.

--default-eol=STYLE

Usually cvs2svn treats a file as binary unless one of the other rules determines that it is not binary and it is not marked as binary in CVS. But if this option is specified, then cvs2svn uses the specified style as the default. STYLE can be 'binary' (default), 'native', 'CRLF', 'LF', or 'CR'. If you have been diligent about annotating binary files in CVS, or if you are confident that the above options will catch all of your binary files, then --default-eol=native should give good results.

If you don't use any of these options, then cvs2svn will not arrange any line-end translation whatsoever. The file contents in the SVN repository should be the same as the contents you would get if checking out with CVS on the machine on which cvs2svn is run. This also means that the EOL characters of text files will be the same no matter where the SVN data are checked out (i.e., not translated to the checkout machine's EOL format).

To do a better job, you can use --auto-props, --mime-types, and --eol-from-mime-type to specify exactly which properties to set on each file based on its filename.
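
For example, a conversion that combines these options might be started with a command line roughly like the following (a sketch only; the auto-props and mime-types file names and the repository paths are placeholders):

    $ cvs2svn --auto-props=auto-props.cfg --mime-types=mime.types \
          --eol-from-mime-type -s /path/to/svnrepos /path/to/cvsrepos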

For total control over setting properties on files, you can use the options file method and write your own FilePropertySetter or RevisionPropertySetter in Python. For example,

from cvs2svn_lib.property_setters import FilePropertySetter

class MyPropertySetter(FilePropertySetter):
  def set_properties(self, cvs_file):
    if cvs_file.cvs_path.startswith('path/to/funny/files/'):
      cvs_file.properties['svn:mime-type'] = 'text/plain'
      cvs_file.properties['svn:eol-style'] = 'CRLF'

ctx.file_property_setters.append(MyPropertySetter())

See the file cvs2svn_lib/property_setters.py for more examples.

I want a single project but tag-rewriting rules that vary by subdirectory. Can this be done?

This is an example of how the cvs2svn conversion can be customized using Python.

Suppose you want to write symbol transform rules that are more complicated than "replace REGEXP with PATTERN". This can easily be done by adding just a little bit of Python code to your options file.

When a symbol is encountered, cvs2svn iterates through the list of SymbolTransform objects defined for the project. For each one, it calls symbol_transform.transform(cvs_file, symbol_name, revision). That method can return any legal symbol name, which will be used in the conversion instead of the original name.

To use this feature, you will have to use an options file to start the conversion. You then write a new SymbolTransform class that inherits from RegexpSymbolTransform but checks the path before deciding whether to transform the symbol. Add the following to your options file:

from cvs2svn_lib.symbol_transform import RegexpSymbolTransform

class MySymbolTransform(RegexpSymbolTransform):
    def __init__(self, path, pattern, replacement):
        """Transform only symbols that occur within the specified PATH."""

        self.path = path
        RegexpSymbolTransform.__init__(self, pattern, replacement)

    def transform(self, cvs_file, symbol_name, revision):
        # Is the file within the path we are interested in?
        if cvs_file.cvs_path.startswith(self.path + '/'):
            # Yes -> Allow RegexpSymbolTransform to transform the symbol:
            return RegexpSymbolTransform.transform(
                    self, cvs_file, symbol_name, revision)
        else:
            # No -> Return the symbol unchanged:
            return symbol_name

# Note that we use a Python loop to fill the list of symbol_transforms:
symbol_transforms = []
for subdir in ['project1', 'project2', 'project3']:
    symbol_transforms.append(
        MySymbolTransform(
            subdir,
            r'release-(\d+)_(\d+)',
            r'%s-release-\1.\2' % subdir))

# Now register the project, using our own symbol transforms:
run_options.add_project(
    'your_cvs_path',
    trunk_path='trunk',
    branches_path='branches',
    tags_path='tags',
    symbol_transforms=symbol_transforms,
    )

This example causes any symbol under "project1" that looks like "release-3_12" to be transformed into a symbol named "project1-release-3.12", whereas if the same symbol appears under "project2" it will be transformed into "project2-release-3.12".

How can I convert a CVSNT repository?

CVSNT is a version control system that started out by adding support for running CVS under Windows NT. Since then it has made numerous extensions to the RCS file format, to the point where CVS compatibility does not imply CVSNT compatibility with any degree of certainty.

cvs2svn might happen to successfully convert a CVSNT repository, especially if the repository has never had any CVSNT-only features used on it, but this use is not supported and should not be expected to work.

If you want to experiment with converting a CVSNT repository, then please consider the following suggestions:

  • Use cvs2svn's --use-cvs option (an example command is shown after this list).
  • Use CVSNT's version of the cvs executable (i.e., ensure that the first cvs program in your $PATH is the one that came with CVSNT).
  • Carefully check the result of the conversion before you rely on it, even if the conversion completed without any errors or warnings.
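
Putting the first two suggestions together, such an experimental conversion might be attempted with a command along these lines (only a sketch; the CVSNT installation path and repository paths are placeholders):

    $ PATH=/opt/cvsnt/bin:$PATH cvs2svn --use-cvs -s /path/to/svnrepos /path/to/cvsnt-repo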

Patches to support the conversion of CVSNT repositories would, of course, be welcome.

How do I get cvs2svn to run on OS X 10.5.5?

Attempting to run cvs2svn on a standard OS X 10.5.5 installation yields the following error:

ERROR: cvs2svn uses the anydbm package, which depends on lower level dbm libraries. Your system has dbm, with which cvs2svn is known to have problems. To use cvs2svn, you must install a Python dbm library other than dumbdbm or dbm. See http://python.org/doc/current/lib/module-anydbm.html for more information.

The problem is that the standard distribution of python on OS X 10.5.5 does not include any dbm libraries other than the standard dbm. In order for cvs2svn to work, we need to install the gdbm library, in addition to a new version of python that enables the python gdbm module.

The precompiled versions of python for OS X available from python.org or activestate.com (currently version 2.6.2) do not have gdbm support turned on. To check for gdbm support, check for the gdbm library module (libgdbmmodule.so) within the python installation.

Here is the procedure for a successful installation of cvs2svn and all supporting libs:

  1. Download the gdbm-1.8.3 (or greater) source, unarchive and change directory to gdbm-1.8.3. We need to install the gdbm libraries so python's gdbm module can use them.
    1. Type ./configure
    2. Edit "Makefile" so that the owner and group are not the non-existing "bin" owner and group by changing
      BINOWN = bin
      BINGRP = bin
      
      to
      BINOWN = root
      BINGRP = admin
      
    3. Type "make"
    4. Type "sudo make install"
  2. Download the Python2.6 (or greater) source, unarchive, and change directory to Python2.6. We need to enable python gdbm support which is not enabled in the default OS X 10.5.5 installation of python, as the gdbm libs are not included. However, we just installed the gdbm libs in step 1, so we can now compile python with gdbm support.
    1. Edit the file "Modules/Setup" by uncommenting the line which links against gdbm by changing
      #gdbm gdbmmodule.c -I/usr/local/include -L/usr/local/lib -lgdbm
      
      to
      gdbm gdbmmodule.c -I/usr/local/include -L/usr/local/lib -lgdbm
      
    2. Edit the file "Modules/Setup" by uncommenting the line to create shared libs by changing
      #*shared*
      
      to
      *shared*
      
    3. Type ./configure --enable-framework --enable-universalsdk in the top-level Python2.6 directory. This will configure the installation of python as a shared OS X framework, and usable with OS X GUI frameworks and SDKs. You may have problems building if you don't have the SDKs that support the PPC platform. If you do, just specify --disable-universalsdk. By default, python will be installed in "/Library/Frameworks/Python.framework", which is what we want.
    4. Type make
    5. Type sudo make install
    6. Type cd /usr/local/bin; sudo ln -s python2.6 python
    7. Make sure "/usr/local/bin" is at the front of your search path in ~/.profile or ~/.bashrc etc.
    8. Type source ~/.profile or source ~/.bashrc etc., or alternatively, just open a new shell window. When you type which python it should give you the new version in "/usr/local/bin", not the one in "/usr/bin".
  3. Download the cvs2svn-2.2.0 (or greater) source, unarchive and change directory to cvs2svn-2.2.0. Many people can't get cvs2svn to work except in the installation directory. The reason for this is that the installation places copies of cvs2svn, cvs2svn_libs, and cvs2svn_rcsparse in the /Library/Frameworks/Python.framework hierarchy. All we need to do is make a link in /usr/local/bin pointing to the location of cvs2svn in the python framework hierarchy. And for good measure we also make links to the lib and include directories:
    1. Type sudo make install
    2. Create the required links by typing the following:
      sudo ln -s /Library/Frameworks/Python.framework/Versions/2.6/bin/cvs2svn /usr/local/bin/cvs2svn
      sudo ln -s /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6 /usr/local/lib/python2.6
      sudo ln -s /Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 /usr/local/include/python2.6
      

The installation is complete. Change directory out of the cvs2svn-2.2.0 installation directory, and you should be able to run cvs2svn. Be careful *not* to copy the version of cvs2svn in the cvs2svn-2.2.0 installation directory to /usr/local/bin, as this has a different python environment setting at the top of the file than the one that was installed in the /Library/Frameworks/Python.framework hierarchy. Follow the instructions exactly, and it should work.


Problems:

I get an error "A CVS repository cannot contain both repo/path/file.txt,v and repo/path/Attic/file.txt,v". What can I do?

Background: Normally, if you have a file called path/file.txt in your project, CVS stores its history in a file called repo/path/file.txt,v. But if file.txt is deleted on the main line of development, CVS moves its history file to a special Attic subdirectory: repo/path/Attic/file.txt,v. (If the file is recreated, then it is moved back out of the Attic subdirectory.) Your repository should never contain both of these files at the same time.

This cvs2svn error message thus indicates a mild form of corruption in your CVS repository. The file has two conflicting histories, and even CVS does not know the correct history of path/file.txt. The corruption was probably created by using tools other than CVS to backup or manipulate the files in your repository. With a little work you can learn more about the two histories by viewing each of the file.txt,v files in a text editor.

There are four straightforward approaches to fixing the repository corruption, but each has potential disadvantages. Remember to make a backup before starting. Never run cvs2svn on a live CVS repository--always work on a copy of your repository.

  1. Restart the conversion with the --retain-conflicting-attic-files option. This causes the non-attic and attic versions of the file to be converted separately, with the Attic version stored to a new subdirectory as path/Attic/file.txt. This approach avoids losing any history, but by moving the Attic version of the file to a different subdirectory it might cause historical revisions to be broken. (An example command is shown after this list.)
  2. Remove the Attic version of the file and restart the conversion. Sometimes it represents an old version of the file that was deleted long ago, and it won't be missed. But this completely discards one of the file's histories, probably causing file.txt to be missing in older historical revisions. (For what it's worth, this is probably how CVS would behave in this situation.)
          # You did make a backup, right?
          $ rm repo/path/Attic/file.txt,v
        
  3. Remove the non-Attic version of the file and restart the conversion. This might be appropriate if the non-Attic version has less important content than the Attic version. But this completely discards one of the file's histories, probably causing file.txt to be missing in recent historical revisions.
          # You did make a backup, right?
          $ rm repo/path/file.txt,v
        
  4. Rename the non-Attic version of the file and restart the conversion. This avoids losing history, but it changes the name of the non-Attic version of the file to file-not-from-Attic.txt whenever it appeared, and might thereby cause revisions to be broken.
          # You did make a backup, right?
          $ mv repo/path/file.txt,v repo/path/file-not-from-Attic.txt,v
        
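For the first option, the restarted conversion might look something like this (a sketch; the repository paths are placeholders):

    $ cvs2svn --retain-conflicting-attic-files -s /path/to/svnrepos /path/to/copy/of/repo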

If you run cvs2svn on a case-insensitive operating system, it is possible to get this error even if the filename of the file in the Attic differs in case from the one outside the Attic. This could happen, for example, if the CVS repository was served from a case-sensitive operating system at some time. A workaround for this problem is to copy the CVS repository to a case-sensitive operating system and convert it there.

I get an error "ERROR: filename,v is not a valid ,v file."

The named file is corrupt in some way. (Corruption is surprisingly common in CVS repositories.) It is likely that even CVS has problems with this file; try checking out the head revision, revision 1.1, and the tip revision on each branch of this file; probably one or more of them don't work.

Here are some options:

  1. Omit this file from the conversion (by making a copy of your repository, deleting this file from the copy, then converting from the copy).
  2. Restore an older copy of this file from backups, if you have backups from before it was corrupted.
  3. Hand-fix the file as best you can by opening it in a binary editor and trying to put it back in RCS file format (documented in the rcsfile(5) manpage). Often it is older revisions that are affected by corruption; you might need to delete some old revisions to salvage newer ones.

gdbm.error: (45, 'Operation not supported')

This has been reported to be caused by trying to create gdbm databases on an NFS partition. Apparently gdbm does not support databases on NFS partitions. The workaround is to use the --tmpdir option to choose a local partition for cvs2svn to write its temporary files.
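
For example, the conversion might be rerun along these lines (the temporary directory and repository paths are placeholders; pick any directory on a local, non-NFS partition):

    $ cvs2svn --tmpdir=/var/tmp/cvs2svn-tmp -s /path/to/svnrepos /path/to/cvsrepos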

When converting a CVS repository that was used on a Macintosh, the contents of some files are incorrect in SVN.

Some Macintosh CVS clients use a nonstandard trick to store the resource fork of files in CVS: instead of storing the file contents directly, they store an AppleSingle data stream containing both the data fork and the resource fork. When checking the file out, the client unpacks the AppleSingle data and writes the two forks separately to disk. By default, cvs2svn treats the file contents literally, so when you check the file out of Subversion, the file contains the combined data in AppleSingle format rather than only the data fork of the file as expected.

Subversion does not have any special facilities for dealing with Macintosh resource forks, so there is nothing cvs2svn can do to preserve both forks of your data. However, sometimes the resource fork is not needed. If you would like to discard the resource fork and only record the data fork in Subversion, then start your conversion using the options file method and set the following option to True in your options file:

      ctx.decode_apple_single = True

There is more information about this option in the comments in cvs2svn-example.options.

Using cvs2svn 1.3.x, I get an error "The command '['co', '-q', '-x,v', '-p1.1', '-kk', '/home/cvsroot/myfile,v']' failed" in pass 8.

What are you using cvs2svn version 1.3.x for anyway? Upgrade!

But if you must, either install RCS, or ensure that CVS is installed and use cvs2svn's --use-cvs option.

Vendor branches created with "cvs import -b <branch number>" are not correctly handled.

Normally, people using "cvs import" don't specify the "-b" flag. cvs2svn handles this normal case fine.

If you have a file which has an active vendor branch, i.e. there have never been any trunk commits but only "cvs imports" onto the vendor branch, then cvs2svn will handle this fine. (Even if you've used the "-b" option to specify a non-standard branch number).

If you've used "cvs import -b <branch number>", you didn't specify the standard CVS vendor branch number of 1.1.1, and there has since been a commit on trunk (either a modification or delete), then your history has been damaged. This isn't cvs2svn's fault. CVS simply doesn't record the branch number of the old vendor branch; it assumes it was 1.1.1. You will even get the wrong results from "cvs checkout -D" with a date when the vendor branch was active.

Symptoms of this problem can include:

  • cvs2svn refusing to let you exclude the vendor branch, because some other branch depends on it
  • if you did more than one import onto the vendor branch, then your SVN history "missing" one of the changes on trunk (though the change will be on the vendor branch).

(Note: There are other possible causes for these symptoms; don't assume you have a non-standard vendor branch number just because you see these symptoms.)

The way to solve this problem is to renumber the vendor branch to the standard 1.1.1 branch number. This has to be done before you run cvs2svn. To help you do this, there is the "renumber_branch.py" script in the "contrib" directory of the cvs2svn distribution.

The typical usage, assuming you used "cvs import -b 1.1.2 ..." to create your vendor branch, is:

      contrib/renumber_branch.py 1.1.2 1.1.1 repos/dir/file,v

You should only run this on a copy of your CVS repository, as it edits the repository in-place. You can fix a single file or a whole directory tree at a time.

The script will check that the 1.1.1 branch doesn't already exist; if it does exist then it will fail with an error message.

Getting help:

How do I get help?

There are several sources of help for cvs2svn:

What information should I include when requesting help?

If you ask for help and/or report a bug on a mailing list, it is important that you include the following information. Failure to include important information is the best way to dissuade the volunteers of the cvs2svn project from trying to help you.

  1. Exactly what version of cvs2svn are you using? If you are not using an official release, please tell us what branch and revision number from the svn archive you are using. If you have modified cvs2svn, please tell us exactly what you have changed.
  2. What platform are you using (Linux, BSD, Windows, etc.)? What python version (e.g., type python --version)?
  3. What is the exact command line that you used to start the conversion? If you used the --options option, please attach a copy of the options file that you used.
  4. What happened when you ran the program? How did that differ from what you wanted/expected? Include transcripts and/or error output if available.
  5. If you think you have found a bug, try to submit a repository that we can use to reproduce the problem. See "How can I produce a useful test case?" for more information. In most cases, if we cannot reproduce the problem, there is nothing we can do to help you.

How do I subscribe to a mailing list?

It is not so obvious how to subscribe to the cvs2svn mailing lists. There are two ways:

  • If you have an account on tigris.org, then you can go to any cvs2svn project page, click on "Mailing lists" in the "Project tools" menu on the left-hand column, then click on "Manage my subscriptions" (above the list of mailing lists). On that page, tick the "Subscribed" checkbox next to the lists to which you would like to subscribe.
  • If you do not have a tigris account, then you can subscribe by sending an email to $LIST-subscribe@cvs2svn.tigris.org, where $LIST is one of "announce", "users", "dev", "issues", or "commits". Please be sure to send the email to $LIST-subscribe and not to the list itself! (To unsubscribe, send an email to $LIST-unsubscribe@cvs2svn.tigris.org.) More details can be found here.

How do I report a bug?

cvs2svn is an open source project that is largely developed and supported by volunteers in their free time. Therefore please try to help out by reporting bugs in a way that will enable us to help you efficiently.

The first question is whether the problem you are experiencing is caused by a cvs2svn bug at all. A large fraction of reported "bugs" are caused by problems with the user's CVS repository, especially mild forms of repository corruption or trying to convert a CVSNT repository with cvs2svn. Please also double-check the manual to be sure that you are using the command-line options correctly.

A good way to localize potential repository corruption is to use the shrink_test_case.py script (which is located in the contrib directory of the cvs2svn source tree). This script tries to find the minimum subset of files in your repository that still shows the same problem. Warning: Only apply this script to a backup copy of your repository, as it destroys the repository that it operates on! Often this script can narrow the problem down to a single file which, as often as not, is corrupt in some way. Even if the problem is not in your repository, the shrunk-down test case will be useful for reporting the bug. Please see "How can I produce a useful test case?" and the comments at the top of shrink_test_case.py for information about how to use this script.

Assuming that you still think you have found a bug, the next step is to investigate whether the bug is already known. Please look through the issue tracker for bugs that sound familiar. If the bug is already known, then there is no need to report it (though possibly you could contribute a useful test case or a workaround).

If your bug seems new, then the best thing to do is report it via email to the dev@cvs2svn.tigris.org mailing list. Be sure to include the information listed in "What information should I include when requesting help?"

How can I produce a useful test case?

If you need to report a bug, it is extremely helpful if you can include a test repository with your bug report. In most cases, if we cannot reproduce the problem, there is nothing we can do to help you. This section describes ways to overcome the most common problems that people have in producing a useful test case. When you have a reasonable-sized test case (say under 1 MB--the smaller the better), you can just tar it up and attach it to the email in which you report the bug.

If the repository is too big and/or contains proprietary information

You don't want to send us your proprietary information, and we don't want to receive it either. Short of open-sourcing your software, here is a way to strip out most of the proprietary information and simultaneously reduce the size of the archive tremendously.

The destroy_repository.py script tries to delete as much information as possible out of your repository while still preserving its basic structure (and therefore hopefully any cvs2svn bugs). Specifically, it tries to delete file descriptions, text content, all nontrivial log messages, and all author names. It also renames all files and directories to have generic names (e.g., dir015/file053,v). (It does not affect the number and dates of revisions to the files.)

  1. This procedure will destroy the repository that it is applied to, so be sure to make a backup copy of your repository and work with the backup!
  2. Make sure you have the destroy_repository.py script. If you don't already have it, you should download the source code for cvs2svn (there is no need to install it). The script is located in the contrib subdirectory.
  3. Run destroy_repository.py by typing
    # You did make a backup, right?
    /path/to/contrib/destroy_repository.py /path/to/copy/of/repo
    
  4. Verify that the "destroyed" archive does not include any information that you consider proprietary. Your data security is ultimately your responsibility, and we make no guarantees that the destroy_repository.py script works correctly. You can open the *,v files using a text editor to see what they contain.
  5. Try converting the "destroyed" repository using cvs2svn, and ensure that the bug still exists. Take a note of the exact cvs2svn command line that you used and include it along with a tarball of the "destroyed" repository with your bug report.

If running destroy_repository.py with its default options causes the bug to go away, consider using destroy_repository.py command-line options to leave part of the repository information intact. Run destroy_repository.py --help for more information.

The repository is still too large

This step is a tiny bit more work, so if your repository is already small enough to send you can skip this step. But this step helps narrow down the problem (maybe even point you to a corrupt file in your repository!) so it is still recommended.

The shrink_test_case.py script tries to delete as many files and directories from your repository as possible while preserving the cvs2svn bug. To use this command, you need to write a little test script that tries to convert your repository and checks whether the bug is still present. The script should exit successfully (e.g., "exit 0") if the bug is still present, and fail (e.g., "exit 1") if the bug has disappeared. The form of the test script depends on the bug that you saw, but it can be as simple as something like this:

#! /bin/sh

cvs2svn --dry-run /path/to/copy/of/repo 2>&1 | grep -q 'KeyError'

If the bug is more subtle, then the test script obviously needs to be more involved.

Once the test script is ready, you can shrink your repository via the following steps:

  1. This procedure will destroy the repository that it is applied to, so be sure to make a backup copy of your repository and work with the backup!
  2. Make sure you have the shrink_test_case.py script. If you don't already have it, you should download the source code for cvs2svn (there is no need to install it). The script is located in the contrib subdirectory.
  3. Run shrink_test_case.py by typing
    # You did make a backup, right?
    /path/to/contrib/shrink_test_case.py /path/to/copy/of/repo testscript.sh
    
    , where testscript.sh is the name of the test script described above. This script will execute testscript.sh many times, each time using a subset of the original repository.
  4. If the shrunken repository only consists of one or two files, look inside the files with a text editor to see whether they are corrupted in any obvious way. (Many so-called cvs2svn "bugs" are actually the result of a corrupt CVS repository.)
  5. Try converting the "shrunk" repository using cvs2svn, to make sure that the original bug still exists. Take a note of the exact cvs2svn command line that you used, and include it along with a tarball of the "shrunk" repository with your bug report.

Does anybody offer commercial support for cvs2svn/cvs2git conversions?

Disclaimer: The links in this section are provided as a service to cvs2svn/cvs2git users. Neither Tigris.org, CollabNet Inc., nor the cvs2svn team guarantee the correctness, validity or usefulness of these links. To add a link to this section, please submit it to the cvs2svn developers' mailing list.

Following is a list of known sources for commercial support for cvs2svn/cvs2git conversions:

  • Michael Haggerty, the maintainer of cvs2svn/cvs2git, offers individual help with conversions, including implementation of new cvs2svn/cvs2git features, on a consulting basis. Please contact Michael via email for more information.
cvs2svn-2.4.0/www/cvs2svn.html0000664000076500007650000013641611500107340017323 0ustar mhaggermhagger00000000000000 cvs2svn Documentation

cvs2svn Documentation

Index


Introduction

cvs2svn is a program that can be used to migrate a CVS repository to Subversion (otherwise known as "SVN") or git. Documentation:


Requirements

cvs2svn requires the following:

  • Direct (filesystem) access to a copy of the CVS repository that you want to convert. cvs2svn parses the files in the CVS repository directly, so it is not enough to have remote CVS access. See the FAQ for more information and a possible workaround.
  • Python 2, version 2.4 or later. See http://www.python.org/. (cvs2svn does not work with Python 3.x.)
  • A compatible database library, usually gdbm, and the corresponding Python bindings. Neither dumbdbm nor standard dbm is sufficient.
  • If you use the --use-rcs option, then RCS's `co' program is required. The RCS home page is http://www.cs.purdue.edu/homes/trinkle/RCS/. See the --use-rcs flag for more details.
  • If you use the --use-cvs option, then the `cvs' command is required. The CVS home page is http://ccvs.cvshome.org/. See the --use-cvs flag for more details.

CVSNT repositories

cvs2svn does not support conversion of CVSNT repositories. Some people have indicated success with such conversions, while others have had problems. In other words, such conversions, even if apparently successful, should be checked carefully before use. See the FAQ for more information.


Installation

  • As root, run 'make install'.
  • Or, if you do not wish to install cvs2svn on your system, you can simply run it out of this directory. As long as it can find the 'cvs2svn_rcsparse' library, it should be happy.
  • If you want to create Unix-style manpages for the main programs, run 'make man'.

Deciding how much to convert

If you're looking to switch an existing CVS repository to Subversion, you have a number of choices for migrating your existing CVS data to a Subversion repository, depending on your needs.

There are a few basic routes to choose when switching from CVS to Subversion, and the one you choose will depend on how much historical data you want in your Subversion repository. You may be content to refer to your existing (soon-to-be-converted-to-read-only) CVS repository for "pre-Subversion" data and start working with a new Subversion repository. Maybe you prefer to squeeze every last drop of data out of your CVS repository into your Subversion repository. Then again, perhaps you want a conversion somewhere in between these two. Based on these needs, we've come up with these different recommended paths for converting your CVS repository to a Subversion repository.

  • Top-skim (Doesn't require cvs2svn!)
  • Trunk only
  • Pick and choose
  • Full conversion
  • Smorgasbord
  • One project at a time

If you decide that top-skimming doesn't meet your needs and you're going to use cvs2svn (yay!), then be sure to read the section below on prepping your repository before you start your conversion.

Top-skimming

This is the quickest and easiest way to get started in your new repository. You're basically going to export the latest revision of your cvs repository, possibly do some rearranging, and then import the resulting files into your Subversion repository. Typically, if you top-skim, that means you'll either be keeping your old CVS repository around as a read-only reference for older data or just tossing that historical data outright (Note to you data packrats who have just stopped breathing, please take a deep breath and put down the letter opener. You don't have to do this yourself--it's just that some people don't feel the same way you do about historical data. They're really not bad people. Really.).

Pros: Quick, easy, convenient, results in a very compact and "neat" Subversion repository.

Cons: You've got no historical data, no branches, and no tags in your Subversion repository. If you want any of this data, you'll have to go back into the CVS Repository and get it.

Trunk only

If you decide that you'd like to have the main development line of your historical data in your Subversion repository but don't need to carry over the tags and branches, you may want to skip converting your CVS tags and branches entirely and only convert the "trunk" of your repository. To do this, you'll use the --trunk-only switch to cvs2svn.
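
For example, a trunk-only conversion might be started with a command like this (the repository paths are placeholders):

   $ cvs2svn --trunk-only -s /path/to/svnrepos /path/to/cvsrepos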

Pros: Saves disk space in your new Subversion repository. Attractive to neatniks.

Cons: You've got no branches and no tags in your Subversion repository.

Pick and choose

Let's say, for example, that you want to convert your CVS repository's historical data but you have no use for the myriad daily build tags that you've got in your CVS repository. In addition to that, you want some branches but would prefer to ignore others. In this case, you'll want to use the --exclude switch to instruct cvs2svn which branches and tags it should ignore.
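
For example, to leave out all of the daily-build tags and one unwanted branch, the conversion might be started like this (only a sketch; the symbol names and paths below are illustrative):

   $ cvs2svn --exclude='DAILY_BUILD_.*' --exclude=unwanted-branch -s /path/to/svnrepos /path/to/cvsrepos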

Pros: You only get what you want from your CVS repository. Saves some space.

Cons: If you forgot something, you'll have to go to your CVS repository.

Full conversion

If you want to convert your entire CVS repository, including all tags and branches, you want a full conversion. This is cvs2svn's default behavior.

Pros: Converts every last byte of your CVS repository.

Cons: Requires more disk space.

Smorgasbord

You can convert your repository (or repositories) piece by piece using a combination of the approaches above.

Pros: You get exactly what you want.

Cons: Importing converted repositories multiple times into a single Subversion repository will likely break date-based range commands (e.g. svn diff -r {2002-02-17}:{2002-03-18}) since Subversion does a binary search through the repository for dates. While this is not the end of the world, it can be a minor inconvenience.

One project at a time

If you have many diverse projects in your CVS repository and you don't want to move them all to Subversion at once, you may want to convert to Subversion one project at a time. This requires a few extra steps, but it can make the conversion of a large CVS repository much more manageable. See How can I convert my CVS repository one module at a time? on the cvs2svn FAQ for a detailed example on converting your CVS repository one project at a time.

Pros: Allows multiple projects in a single repository to convert to Subversion according to a schedule that works best for them.

Cons: Requires some extra steps to accomplish the conversion. Importing converted repositories multiple times into a single Subversion repository will likely break date-based range commands (e.g. svn diff -r {2002-02-17}:{2002-03-18}) since Subversion does a binary search through the repository for dates. While this is not the end of the world, it can be a minor inconvenience.


Prepping your repository

There are a number of reasons that you may need to prep your CVS Repository. If you decide that you need to change part of your CVS repository, we strongly recommend working on a copy of it instead of working on the real thing. cvs2svn itself does not make any changes to your CVS repository, but if you start moving things around and deleting things in a CVS repository, it's all too easy to shoot yourself in the foot.

End-of-line translation

One of the most important topics to consider when converting a repository is the distinction between binary and text files. If you accidentally treat a binary file as text your repository contents will be corrupted.

Text files are handled differently than binary files by both CVS and Subversion. When a text file is checked out, the character used to denote the end of line ("EOL") is converted to the local computer's format. This is usually the most convenient behavior for text files. Moreover, both CVS and Subversion allow "keywords" in text files (such as $Id$), which are expanded with version control information when the file is checked out. However, if line-end translation or keyword expansion is applied to a binary file, the file will usually be corrupted.

CVS treats a file as text unless you specifically tell it that the file is binary. You can tell CVS that a file is binary by using the command cvs admin -kb filename. But often CVS users forget to specify which files are binary, and as long as the repository is only used under Unix, they may never notice a problem, because the internal format of CVS is the same as the Unix format. But Subversion is not as forgiving as CVS if you tell it to treat a binary file as text.

If you have been conscientious about marking files as binary in CVS, then you should be able to use --default-eol=native. If you have been sloppy, then you have a few choices:

  • Convert your repository with cvs2svn's default options. Your text files will be treated as binary, but that usually isn't very harmful (at least no information will be lost).
  • Mend your slovenly ways by fixing your CVS repository before conversion: run cvs admin -kb filename for each binary file in the repository. Then you can use --default-eol=native along with the anal-retentive folks.
  • Use cvs2svn options to help cvs2svn deduce which files are binary during the conversion. The useful options are --eol-from-mime-type, --keywords-off, --auto-props, and --default-eol. See the FAQ for more information.
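
If you have been conscientious about marking binary files in CVS (or have mended your ways as described above), the conversion might then simply be run like this (the repository paths are placeholders):

   $ cvs2svn --default-eol=native -s /path/to/svnrepos /path/to/cvsrepos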

Converting part of repository

If you want to convert a subdirectory in your repository, you can just point cvs2svn at the subdirectory and go. There is no need to delete the unwanted directories from the CVS repository.

If the subdirectory that you are converting contains any files that you don't want converted into your new Subversion repository, you should delete them or move them aside. Such files can be deleted from HEAD after the conversion, but they will still be visible in the repository history.

Lastly, even though you can move and copy files and directories around in Subversion, you may want to do some rearranging of project directories before running your conversion to get the desired repository project organization.


Command line vs. options file

There are two ways to specify the options that define a conversion: via the cvs2svn command line, or via an options file. The command line is useful for simple conversions, but the options file method is recommended for nontrivial conversions as it gives the user more flexibility.

Command line method

A command-line conversion allows the use of all of the command line options listed below (except for --options). This method allows almost all of the built-in conversion options to be selected, with the primary limitation that it does not support multiproject conversions. However, it may require a long command line to specify all of the options for a complicated conversion.

Options file method

The options file method allows full control of the conversion process, including multiproject conversions. It also allows expert users to customize the conversion even more radically by writing Python code. Finally, the options file used in the conversion can be retained as a permanent record of the options used in a conversion.

To use the options file method, you need to create a file defining all of the options that are to be used for the conversion. A heavily-commented sample options file, cvs2svn-example.options, is included in the cvs2svn distribution. The easiest way to create your own options file is to make a copy of the sample file and modify it as directed by the comments in that file.

Note: The options file format changes frequently. Please be sure to base your options file on the cvs2svn-example.options file from the version of cvs2svn that you plan to use.

To start a conversion using an options file, invoke cvs2svn like this:

   $ cvs2svn --options=OPTIONSFILE

Only the following options are allowed in combination with --options: -h/--help, --help-passes, --version, -v/--verbose, -q/--quiet, -p/--pass/--passes, --dry-run, and --profile.


Symbol handling

cvs2svn converts CVS tags and branches into Subversion tags and branches. This section discusses issues related to symbol handling.

HINT: If there are problems with symbol usage in your repository, they are usually reported during CollateSymbolsPass of the conversion, causing the conversion to be interrupted. However, it is not necessary to restart the whole conversion to fix the problems. Usually it is adequate to adjust the symbol-handling options then re-start cvs2svn starting at CollateSymbolsPass, by adding the option "-p CollateSymbolsPass:". This trick can save a lot of time if you have a large repository, as it might take a few iterations before you find the best set of options to convert your repository.
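
For example, if the conversion was originally started with an options file, the adjusted conversion might be restarted like this (the options-file name is a placeholder):

   $ cvs2svn --options=my-conversion.options -p CollateSymbolsPass: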

Placement of trunk, branches, and tags directories

cvs2svn converts CVS branches and tags into Subversion branches and tags following the standard Subversion convention. For single-project conversions, the default is to put the trunk, branches, and tags directories at the top level of the repository tree, though this behavior can be changed by using the --trunk, --branches, and --tags options. For multiproject conversions, you must specify the location of each project's trunk, branches, and tags directory in the options file; repository layout strategies are discussed in the Subversion book. For even finer control over the conversion, you can use a --symbol-hints file to specify the SVN path to be used for each CVS tag and branch.
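
For a single-project command-line conversion, for instance, the default locations could be overridden roughly as follows (the project name and paths are placeholders):

   $ cvs2svn --trunk=proj/trunk --branches=proj/branches --tags=proj/tags \
         -s /path/to/svnrepos /path/to/cvsrepos/proj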

Excluding tags and branches

Often a CVS repository contains tags and branches that will not be needed after the conversion to Subversion. You can instruct cvs2svn to exclude such symbols from the conversion, in which case they will not be present in the resulting Subversion repository. Please be careful when doing this; excluding symbols causes information that was present in CVS to be omitted in Subversion, thereby discarding potentially useful historical information. Also be aware that if you exclude a branch, then all CVS revisions that were committed to that branch will also be excluded.

To exclude a tag or branch, use the option --exclude=SYMBOL. You can also exclude a whole group of symbols matching a specified regular expression; for example, --exclude='RELEASE_0_.*'. (The regular expression has to match the whole symbol name for the rule to apply.)

However, please note the following restriction. If a branch has a subbranch or a tag on it, then the branch cannot be excluded unless the dependent symbol is also excluded. cvs2svn checks for this situation; if it occurs then CollateSymbolsPass outputs an error message like the following:

   ERROR: The branch 'BRANCH' cannot be excluded because the following symbols depend on it:
       'TAG'
       'SUBBRANCH'

In such a case you can either exclude the dependent symbol(s) (in this case by using --exclude=TAG --exclude=SUBBRANCH) or not exclude 'BRANCH'.

Excluding vendor branches

There is one more special case related to branch handling. A vendor branch is a CVS branch that is used to track source code received from an outside source. A vendor branch typically has CVS branch number 1.1.1 and revision numbers 1.1.1.1, 1.1.1.2, etc. Vendor branches are created automatically whenever the cvs import command is used. Vendor branches have the strange property that, under certain circumstances, a file that appears on a vendor branch also implicitly exists on trunk. cvs2svn knows all about vendor branches and does its best to ensure that a file that appears on a vendor branch is also copied to trunk, to give Subversion behavior that is as close as possible to the CVS behavior.

However, often vendor branches exist for reasons unrelated to tracking outside sources. Indeed, some CVS documentation recommends using the cvs import command to import your own code into your CVS repository (which is arguably a misuse of the cvs import command). Vendor branches created by this practice are useless and would only serve to clutter up your Subversion repository. Therefore, cvs2svn allows vendor branches to be excluded, in which case the vendor branch revisions are grafted onto the history of trunk. This is allowed even if other branches or tags appear to sprout from the vendor branch, in which case the dependent tags are grafted to trunk as well. Such branches can be recognized in the --write-symbol-info output by looking for a symbol that is a "pure import" in the same number of files in which it appears as a branch. It is typically advantageous to exclude such branches.

Tag/branch inconsistencies

In CVS, the same symbol can appear as a tag in some files (e.g., cvs tag SYMBOL file1.txt) and a branch in others (e.g., cvs tag -b SYMBOL file2.txt). Subversion takes a more global view of your repository, and therefore works better when each symbol is used in a self-consistent way--either always as a branch or always as a tag. cvs2svn provides features to help you resolve these ambiguities.

If your repository contains inconsistently-used symbols, then CollateSymbolsPass, by default, uses heuristics to decide which symbols to convert as branches and which as tags. Often this behavior will be adequate, and you don't have to do anything special. You can use the --write-symbol-info=filename option to have cvs2svn write to filename a list of all of the symbols in your repository and how it chose to convert them.

However, if you want to take finer control over how symbols are converted, you can do so. The first step is probably to change the default symbol handling style from heuristic (the default value) to strict using the option --symbol-default=strict. With the strict setting, cvs2svn prints error messages and aborts the conversion if there are any ambiguous symbols. The error messages look like this:

   ERROR: It is not clear how the following symbols should be converted.
   Use --symbol-hints, --force-tag, --force-branch, --exclude, and/or
   --symbol-default to resolve the ambiguity.
       'SYMBOL1' is a tag in 1 files, a branch in 2 files and has commits in 0 files
       'SYMBOL2' is a tag in 2 files, a branch in 1 files and has commits in 0 files
       'SYMBOL3' is a tag in 1 files, a branch in 2 files and has commits in 1 files

You have to tell cvs2svn how to fix the inconsistencies and then restart the conversion at CollateSymbolsPass.

There are three ways to deal with an inconsistent symbol: treat it as a tag, treat it as a branch, or exclude it from the conversion altogether.

In the example above, the symbol 'SYMBOL1' was used as a branch in two files but used as a tag in only one file. Therefore, it might make sense to convert it as a branch, by using the option --force-branch=SYMBOL1. However, no revisions were committed on this branch, so it would also be possible to convert it as a tag, by using the option --force-tag=SYMBOL1. If the symbol is not needed at all, it can be excluded by using --exclude=SYMBOL1.

Similarly, 'SYMBOL2' was used more often as a tag, but can still be converted as a branch or a tag, or excluded.

'SYMBOL3', on the other hand, was sometimes used as a branch, and at least one revision was committed on the branch. It can be converted as a branch, using --force-branch=SYMBOL3. But it cannot be converted as a tag (because tags are not allowed to have revisions on them). If it is excluded, using --exclude=SYMBOL3, then both the branch and the revisions on the branch will be left out of the Subversion repository.
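
For instance, assuming the symbol names from the sample error message above (and placeholder repository paths), all three ambiguities could be resolved and the conversion restarted at CollateSymbolsPass with something like:

   $ cvs2svn --force-branch='SYMBOL[13]' --force-tag=SYMBOL2 \
         --passes=CollateSymbolsPass: -s SVNREPOS CVSREPOS

Here --force-branch is given a regular expression that matches both SYMBOL1 and SYMBOL3 in their entirety.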

If you are not so picky about which symbols are converted as tags and which as branches, you can ask cvs2svn to decide by itself. To do this, use the option --symbol-default=OPTION, where OPTION can be "heuristic" (the default; decide how to treat each ambiguous symbol based on whether it was used more often as a branch or as a tag in CVS), "branch" (treat every ambiguous symbol as a branch), or "tag" (treat every ambiguous symbol as a tag). You can use the --force-branch and --force-tag options to specify the treatment of particular symbols, in combination with --symbol-default to specify the default to be used for other ambiguous symbols.

Finally, you can have cvs2svn write a text file showing how each symbol was converted by using the --write-symbol-info option. If you disagree with any of cvs2svn's choices, you can make a copy of this file, edit it, then pass it to cvs2svn by using the --symbol-hints option. In this manner you can influence how each symbol is converted and also the parent line of development of each symbol (the line of development from which the symbol sprouts).


Command line reference

USAGE:
cvs2svn [OPTIONS]... [-s SVN-REPOS-PATH|--dumpfile=PATH|--dry-run] CVS-REPOS-PATH
cvs2svn [OPTIONS]... --options=PATH
CVS-REPOS-PATH The filesystem path of the part of the CVS repository that you want to convert. It is not possible to convert a CVS repository to which you only have remote access; see the FAQ for details. This doesn't have to be the top level directory of a CVS repository; it can point at a project within a repository, in which case only that project will be converted. This path or one of its parent directories has to contain a subdirectory called CVSROOT (though the CVSROOT directory can be empty).
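
For instance, with a hypothetical repository under /home/cvsroot, either of the following invocations would be reasonable; the second one converts only the project stored in the myproject subdirectory:

   $ cvs2svn -s NEW_SVNREPOS /home/cvsroot
   $ cvs2svn -s NEW_SVNREPOS /home/cvsroot/myproject
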
Configuration via options file
--options=PATH Read the conversion options from the specified file. See section options file method for more information.
Output options
-s PATH
--svnrepos PATH
Write the output of the conversion into a Subversion repository located at PATH. This option causes a new Subversion repository to be created at PATH unless the --existing-svnrepos option is also used.
--existing-svnrepos Load the converted CVS repository into an existing Subversion repository, instead of creating a new repository. (This option should be used in combination with -s/--svnrepos.) The repository must either be empty or contain no paths that overlap with those that will result from the conversion. Please note that you need write permission for the repository files.
--fs-type=TYPE Pass the --fs-type=TYPE option to "svnadmin create" if creating a new Subversion repository.
--bdb-txn-nosync Pass the --bdb-txn-nosync switch to "svnadmin create" if creating a new Subversion repository.
--create-option=OPT Pass OPT to "svnadmin create" if creating a new Subversion repository (can be specified multiple times to pass multiple options).
--dumpfile=PATH Output the converted CVS repository into a Subversion dumpfile instead of a Subversion repository (useful for importing a CVS repository into an existing Subversion repository). PATH is the filename in which to store the dumpfile.
--dry-run Do not create a repository or a dumpfile; just print the details of what cvs2svn would do if it were really converting your repository.
Conversion options
--trunk-only Convert only the main line of development from the CVS repository (commonly referred to in Subversion parlance as "trunk"), ignoring all tags and branches.
--trunk=PATH The top-level path to use for trunk in the Subversion repository. The default value is "trunk".
--branches=PATH The top-level path to use for branches in the Subversion repository. The default value is "branches".
--tags=PATH The top-level path to use for tags in the Subversion repository. The default value is "tags".
--include-empty-directories Treat empty subdirectories within the CVS repository as actual directories, creating them when the parent directory is created and removing them if and when the parent directory is pruned.
--no-prune When all files are deleted from a directory in the Subversion repository, don't delete the empty directory (the default is to delete any empty directories).
--encoding=ENC Use ENC as the encoding for filenames, log messages, and author names in the CVS repos. (By using an --options file, it is possible to specify one set of encodings to use for filenames and a second set for log messages and author names.) This option may be specified multiple times, in which case the encodings are tried in order until one succeeds. Default: ascii. Other possible values include the standard Python encodings.
--fallback-encoding=ENC If none of the encodings specified with --encoding succeed in decoding an author name or log message, then fall back to using ENC in lossy 'replace' mode. Use of this option may cause information to be lost, but at least it allows the conversion to run to completion. This option only affects the encoding of log messages and author names; there is no fallback encoding for filenames. (By using an --options file, it is possible to specify a fallback encoding for filenames.) Default: disabled.
--no-cross-branch-commits Prevent the creation of SVN commits that affect multiple branches or trunk and a branch. Instead, break such changesets into multiple commits, one per branch.
--retain-conflicting-attic-files If a file appears both inside and outside of the CVS attic, retain the attic version in an SVN subdirectory called `Attic'. (Normally this situation is treated as a fatal error.)
Symbol handling
--symbol-transform=PAT:SUB

Transform RCS/CVS symbol names before entering them into Subversion. PAT is a Python regular expression pattern that is matched against the entire symbol name. If it matches, the symbol is replaced with SUB, which is a replacement pattern using Python's regular expression backreference syntax. You may specify any number of these options; they will be applied in the order given on the command line.

This option can be useful if you're converting a repository in which the developer used directory-wide symbol names like 1-0, 1-1 and 2-0 as a kludgy form of release tagging (the C-x v s command in Emacs VC mode encourages this practice). A command like

cvs2svn --symbol-transform='([0-9])-(.*):release-\1.\2' -s SVN RCS

will transform a local CVS repository into a local SVN repository, performing the following sort of mappings of RCS symbolic names to SVN tags:

1-0 → release-1.0
1-1 → release-1.1
2-0 → release-2.0
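
Because any number of --symbol-transform options may be given, different groups of symbols can be renamed in a single run. For example, the following command (the patterns are hypothetical) renames beta tags and release tags independently:

cvs2svn --symbol-transform='BETA-(.*):beta-\1' --symbol-transform='([0-9])-([0-9]):release-\1.\2' -s SVN RCS
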
--symbol-hints=PATH

Read symbol conversion hints from PATH. The format of PATH is the same as the format output by --write-symbol-info, namely a text file with five whitespace-separated columns:

project-id symbol conversion svn-path parent-lod-name

project-id is the numerical ID of the project to which the symbol belongs, counting from 0. project-id can be set to '.' if project-specificity is not needed. symbol is the name of the symbol being specified. conversion specifies how the symbol should be converted, and can be one of the values 'branch', 'tag', or 'exclude'. If conversion is '.', then this rule does not affect how the symbol is converted. svn-path is the name of the SVN path to which this line of development should be written. If svn-path is omitted or '.', then this rule does not affect the SVN path of this symbol. parent-lod-name is the name of the symbol from which this symbol should sprout, or '.trunk.' if the symbol should sprout from trunk. If parent-lod-name is omitted or '.', then this rule does not affect the preferred parent of this symbol. The file may contain blank lines or comment lines (lines whose first non-whitespace character is '#').
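
For illustration, a short, hypothetical hints file might look like this:

# project-id  symbol        conversion  svn-path                parent-lod-name
.             RELEASE_1_0   tag         tags/release-1.0        .trunk.
.             FEATURE_WORK  branch      branches/feature-work   .trunk.
.             OLD_HACK      exclude     .                       .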

The simplest way to use this option is to run the conversion through CollateSymbolsPass with the --write-symbol-info option, copy the symbol info and edit it to create a hints file, then re-start the conversion at CollateSymbolsPass with this option enabled, as in the sketch below.
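
Under those assumptions, the whole round trip might look something like this (file names and repository paths are placeholders):

cvs2svn --passes=:CollateSymbolsPass --write-symbol-info=symbol-info.txt -s SVNREPOS CVSREPOS
cp symbol-info.txt symbol-hints.txt     # then edit symbol-hints.txt as desired
cvs2svn --passes=CollateSymbolsPass: --symbol-hints=symbol-hints.txt -s SVNREPOS CVSREPOS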

--symbol-default=OPT Specify how to convert ambiguous symbols (i.e., those that appear in the CVS archive as both branches and tags). OPT is one of the following:
  • "heuristic": Decide how to treat each ambiguous symbol based on whether it was used more often as a branch or tag in CVS. (This is the default behavior.)
  • "strict": No default; every ambiguous symbol has to be resolved manually using --symbol-hints, --force-branch, --force-tag, or --exclude.
  • "branch": Treat every ambiguous symbol as a branch.
  • "tag": Treat every ambiguous symbol as a tag.
--force-branch=REGEXP Force symbols whose names match REGEXP to be branches.
--force-tag=REGEXP Force symbols whose names match REGEXP to be tags. This will cause an error if such a symbol has commits on it.
--exclude=REGEXP Exclude branches and tags whose names match REGEXP from the conversion.
--keep-trivial-imports Do not exclude branches that were only used for a single import. (By default such branches are excluded because they are usually created by the inappropriate use of cvs import.)
Subversion properties
--username=NAME Use NAME as the author for cvs2svn-synthesized commits (the default value is no author at all).
--auto-props=FILE

Specify a file in the format of Subversion's config file, whose [auto-props] section can be used to set arbitrary properties on files in the Subversion repository based on their filenames. (The [auto-props] section header must be present; other sections of the config file, including the enable-auto-props setting, are ignored.) Filenames are matched to the filename patterns case-insensitively, consistent with Subversion's behavior. The auto-props file might have content like this:

[auto-props]
*.txt = svn:mime-type=text/plain;svn:eol-style=native
*.doc = svn:mime-type=application/msword;!svn:eol-style

Please note that cvs2svn allows properties to be explicitly unset: if cvs2svn sees a setting like !svn:eol-style (with a leading exclamation point), it forces the property to remain unset, even if later rules would otherwise set the property.
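
For example, a file like the one above could be applied during the conversion with a command along these lines (the auto-props file name is a placeholder):

cvs2svn --auto-props=auto-props.cfg -s SVNREPOS CVSREPOS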

--mime-types=FILE Specify an apache-style mime.types file for setting svn:mime-type properties on files in the Subversion repository.
--eol-from-mime-type For files that don't have the -kb expansion mode but have a known mime type, set the eol-style based on the mime type. For such files, set the svn:eol-style property to "native" if the mime type begins with "text/", and leave it unset (i.e., no EOL translation) otherwise. Files with unknown mime types are not affected by this option. This option has no effect unless the --mime-types option is also specified.
--default-eol=STYLE Set svn:eol-style to STYLE for files that don't have the -kb expansion mode and whose end-of-line translation mode hasn't been determined by one of the other options. STYLE can be "binary" (default), "native", "CRLF", "LF", or "CR".
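
For example, the following combination (with a hypothetical mime.types path and placeholder repository paths) assigns mime types from that file, derives svn:eol-style from the mime type where possible, and sets svn:eol-style to "native" for the remaining files whose end-of-line handling is still undetermined:

cvs2svn --mime-types=/etc/mime.types --eol-from-mime-type --default-eol=native -s SVNREPOS CVSREPOS
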
--keywords-off By default, cvs2svn sets svn:keywords on CVS files to "Author Date Id Revision" if the file's svn:eol-style property is set (see the --default-eol option). The --keywords-off switch prevents cvs2svn from setting svn:keywords for any file. (The result for files that do contain keyword strings is somewhat unexpected: the keywords will be left with the expansions that they had when committed to CVS, which is usually the expansion for the previous revision.)
--keep-cvsignore Include .cvsignore files in the output. (Normally they are unneeded because cvs2svn sets the corresponding svn:ignore properties.)
--cvs-revnums Record CVS revision numbers as file properties in the Subversion repository. (Note that unless it is removed explicitly, the last CVS revision number will remain associated with the file even after the file is changed within Subversion.)
Extraction options
--use-internal-co Use internal code to extract the contents of CVS revisions. This is the default extraction option. This is up to 50% faster than --use-rcs, but needs a lot of disk space: roughly the size of your CVS repository plus the peak size of a complete checkout of the repository with all branches that existed and still had commits pending at a given time. If this option is used, the $Log$ keyword is not handled.
--use-rcs Use RCS's co command to extract the contents of CVS revisions. RCS is much faster than CVS, but in certain rare cases it has problems with data that CVS can handle. If you are having trouble in OutputPass of a conversion when using the --use-rcs option, the first thing to try is using the --use-cvs option instead.
--use-cvs If RCS co is having trouble extracting CVS revisions, you may need to pass this flag, which causes cvs2svn to use CVS instead of RCS to read the repository. See --use-rcs for more information.
Environment options
--tmpdir=PATH Use the directory PATH for all of cvs2svn's temporary data (which can be a lot of data). The default value is cvs2svn-tmp in the current working directory.
--svnadmin=PATH If the svnadmin program is not in your $PATH you should specify its absolute path with this switch. (svnadmin is needed when the -s/--svnrepos output option is used.)
--co=PATH If the co program (a part of RCS) is not in your $PATH you should specify its absolute path with this switch. (co is needed if the --use-rcs extraction option is used.)
--cvs=PATH If the cvs program is not in your $PATH you should specify its absolute path with this switch. (cvs is needed if the --use-cvs extraction option is used.)
Partial conversions
-p PASS
--pass PASS
Execute only pass PASS of the conversion. PASS can be specified by name or by number (see --help-passes).
-p [START]:[END]
--passes [START]:[END]
Execute passes START through END of the conversion (inclusive). START and END can be specified by name or by number (see --help-passes). If START or END is missing, it defaults to the first or last pass, respectively.
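
For example, the following (hypothetical) invocations run a single named pass and an inclusive range of passes, respectively:

cvs2svn -p CollateSymbolsPass -s SVNREPOS CVSREPOS
cvs2svn --passes=CollateSymbolsPass:OutputPass -s SVNREPOS CVSREPOS
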
Information options
--version Print the version number.
--help, -h Print the usage message and exit with success.
--help-passes Print the numbers and names of the conversion passes and exit with success.
--man Write the manpage for this program to standard output.
--verbose, -v Tell cvs2svn to print to STDOUT lots of information about what it's doing. This option can be specified twice to get debug-level output.
--quiet, -q Tell cvs2svn to operate in quiet mode, printing little more than pass starts and stops to STDOUT. This option may be specified twice to suppress all non-error output.
--write-symbol-info=PATH Write symbol statistics and information about how symbols were converted to PATH during CollateSymbolsPass. See --symbol-hints for a description of the output format.
--skip-cleanup Prevent the deletion of the temporary files that cvs2svn creates in the process of conversion.
--profile Dump Python cProfile profiling data to the file cvs2svn.cProfile. In Python 2.4 and earlier, if cProfile is not installed, it will instead dump Hotshot profiling data to the file cvs2svn.hotshot.

A Few Examples

To create a new Subversion repository by converting an existing CVS repository, run the script like this:

   $ cvs2svn --svnrepos NEW_SVNREPOS CVSREPOS

To create a new Subversion repository containing only trunk commits, and omitting all branches and tags from the CVS repository, do

   $ cvs2svn --trunk-only --svnrepos NEW_SVNREPOS CVSREPOS

To create a Subversion dumpfile (suitable for 'svnadmin load') from a CVS repository, run it like this:

   $ cvs2svn --dumpfile DUMPFILE CVSREPOS

To use an options file to define all of the conversion parameters, specify --options:

   $ cvs2svn --options OPTIONSFILE

As it works, cvs2svn will create many temporary files in a temporary directory called "cvs2svn-tmp" (or the directory specified with --tmpdir). This is normal. If the entire conversion is successful, however, those tempfiles will be automatically removed. If the conversion is not successful, or if you specify the '--skip-cleanup' option, cvs2svn will leave the temporary files behind for possible debugging.
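
For example, to keep the temporary files in a specific location and preserve them afterwards for inspection, one might run something like:

   $ cvs2svn --tmpdir=/var/tmp/cvs2svn-work --skip-cleanup --svnrepos NEW_SVNREPOS CVSREPOS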

cvs2svn-2.4.0/www/project_license.html0000664000076500007650000000657710646512632021111 0ustar mhaggermhagger00000000000000 cvs2svn License
Project cvs2svn License

This license applies to all portions of cvs2svn which are not externally-maintained libraries (e.g. rcsparse). Such libraries have their own licenses; we recommend you read them, as their terms may differ from the terms below.

/* ================================================================
 * Copyright (c) 2000-2007 CollabNet.  All rights reserved.
 * 
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are
 * met:
 * 
 * 1. Redistributions of source code must retain the above copyright
 * notice, this list of conditions and the following disclaimer.
 * 
 * 2. Redistributions in binary form must reproduce the above copyright
 * notice, this list of conditions and the following disclaimer in the
 * documentation and/or other materials provided with the distribution.
 * 
 * 3. The end-user documentation included with the redistribution, if
 * any, must include the following acknowledgment: "This product includes
 * software developed by CollabNet (http://www.Collab.Net/)."
 * Alternately, this acknowledgment may appear in the software itself, if
 * and wherever such third-party acknowledgments normally appear.
 * 
 * 4. The hosted project names must not be used to endorse or promote
 * products derived from this software without prior written
 * permission. For written permission, please contact info@collab.net.
 * 
 * 5. Products derived from this software may not use the "Tigris" name
 * nor may "Tigris" appear in their names without prior written
 * permission of CollabNet.
 * 
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 * IN NO EVENT SHALL COLLABNET OR ITS CONTRIBUTORS BE LIABLE FOR ANY
 * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
 * GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
 * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
 * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 * ====================================================================
 * 
 * This software consists of voluntary contributions made by many
 * individuals on behalf of CollabNet.
 */

cvs2svn-2.4.0/www/features.html0000664000076500007650000002502311710517254017540 0ustar mhaggermhagger00000000000000 cvs2svn Features

cvs2svn Features

The primary goal of cvs2svn is to migrate as much information as possible from your old CVS repository to your new Subversion or git repository.

Unfortunately, CVS doesn't record complete information about your project's history. For example, CVS doesn't record what file modifications took place together in the same CVS commit. Therefore, cvs2svn attempts to infer from CVS's incomplete information what really happened in the history of your repository. So the second goal of cvs2svn is to reconstruct as much of your CVS repository's history as possible.

The third goal of cvs2svn is to allow you to customize the conversion process and the form of your output repository as flexibly as possible. cvs2svn has a great many conversion options that can be used from the command line and many more that can be configured via an options file, and it provides many hooks that allow even more extreme customization by writing Python code.

Feature summary

No information lost
cvs2svn works hard to avoid losing any information from your CVS repository (unless you specifically ask for a partial conversion using --trunk-only or --exclude).
Changesets
CVS records modifications file-by-file, and does not keep track of what files were modified at the same time. cvs2svn uses information like the file modification times, log messages, and dependency information to deduce the original changesets. cvs2svn allows changesets that affect multiple branches and/or multiple projects (as is allowed by CVS), or it can be configured to split such changesets up into separate commits (--no-cross-branch-commits; see also options file).
Multiproject conversions
cvs2svn can convert a CVS repository that contains multiple projects into a single Subversion repository with the conventional multiproject directory layout. See the FAQ for more information.
Branch vs. tag
CVS allows the same symbol name to be used sometimes as a branch, sometimes as a tag. cvs2svn has options and heuristics to decide how to convert such "mixed" symbols (--symbol-hints, --force-branch, --force-tag, --symbol-default).
Branch/tag exclusion
cvs2svn allows the user to specify branches and/or tags that should be excluded from the conversion altogether (--symbol-hints, --exclude). It checks that the exclusions are self-consistent (e.g., it doesn't allow a branch to be excluded if a branch that sprouts from it is not excluded).
Branch/tag renaming
cvs2svn can rename branches and tags during the conversion using regular-expression patterns (--symbol-transform).
Choosing SVN paths for branches/tags
You can choose what SVN paths to use as the trunk/branches/tags directories (--trunk, --branches, --tags), or set arbitrary paths for specific CVS branches/tags (--symbol-hints). For example, you might want to store some tags to the project/tags directory, but others to project/releases.
Branch and tag parents
In many cases, the CVS history is ambiguous about which branch served as the parent of another branch or tag. cvs2svn determines the most plausible parent for symbols using cross-file information. You can override cvs2svn's choices on a case-by-case basis by using the --symbol-hints option.
Branch and tag creation times
CVS does not record when branches and tags are created. cvs2svn creates branches and tags at a reasonable time, consistent with the file revisions that were tagged, and tries to create each one within a single Subversion commit if possible.
Mime types
CVS does not record files' mime types. cvs2svn provides several mechanisms for choosing reasonable file mime types (--mime-types, --auto-props).
Binary vs. text
Many CVS users do not systematically record which files are binary and which are text. (This is mostly important if the repository is used on non-Unix systems.) cvs2svn provides a number of ways to infer this information (--eol-from-mime-type, --default-eol, --keywords-off, --auto-props).
Subversion file properties
Subversion allows arbitrary text properties to be attached to files. cvs2svn provides a mechanism to set such properties when a file is first added to the repository (--auto-props) as well as a hook that users can use to set arbitrary file properties via Python code.
Handling of .cvsignore
.cvsignore files in the CVS repository are converted into the equivalent svn:ignore properties in the output. By default, the .cvsignore files themselves are not included in the output; this behavior can be changed by specifying the --keep-cvsignore option.
Subversion repository customization
cvs2svn provides many options that allow you to customize the structure of the resulting Subversion repository (--trunk, --branches, --tags, --include-empty-directories, --no-prune, --symbol-transform, etc.; see also the additional customization options available by using the --options-file method).
Support for multiple character encodings
CVS does not record which character encoding was used to store metainformation like file names, author names and log messages. cvs2svn provides options to help convert such text into UTF-8 (--encoding, --fallback-encoding).
Vendor branches
CVS supports "vendor branches", which (under some circumstances) affect the contents of the main line of development. cvs2svn detects vendor branches whenever possible and handles them intelligently. For example,
  • cvs2svn explicitly copies vendor branch revisions back to trunk so that a checkout of trunk gives the same results under SVN as under CVS.
  • If a vendor branch is excluded from the conversion, cvs2svn grafts the relevant vendor branch revisions onto trunk so that the contents of trunk are still the same as in CVS. If other tags or branches sprout from these revisions, they are grafted to trunk as well.
  • When a file is imported into CVS, CVS creates two revisions ("1.1" and "1.1.1.1") with the same contents. cvs2svn discards the redundant "1.1" revision in such cases (since revision "1.1.1.1" will be copied to trunk anyway).
  • Often users create vendor branches unnecessarily by using "cvs import" to import their own sources into the CVS repository. Such vendor branches do not contain any useful information, so by default cvs2svn excludes any vendor branch that was only used for a single import. You can change this default behavior by specifying the --keep-trivial-imports option.
CVS quirks
cvs2svn goes to great lengths to deal with CVS's many quirks. For example,
  • CVS introduces spurious "1.1" revisions when a file is added on a branch. cvs2svn discards these revisions.
  • If a file is added on a branch, CVS introduces a spurious "dead" revision at the beginning of the branch to indicate that the file did not exist when the branch was created. cvs2svn deletes these spurious revisions and adds the file on the branch at the correct time.
Robust against repository corruption
cvs2svn knows how to handle several types of CVS repository corruption that have been reported frequently, and gives informative error messages in other cases:
  • An RCS file that exists both in and out of the "Attic" directory.
  • Multiple deltatext blocks for a single CVS file revision.
  • Multiple revision headers for the same CVS file revision.
  • Tags and branches that refer to non-existent revisions or ill-formed revision numbers.
  • Repeated definitions of a symbol name to the same revision number.
  • Branches that have no associated labels.
  • A directory name that conflicts with a file name (in or out of the Attic).
  • Filenames that contain forbidden characters.
  • Log messages with variant end-of-line styles.
  • Vendor branch declarations that refer to non-existent branches.
Timestamp error correction
Many CVS repositories contain timestamp errors due to servers' clocks being set incorrectly during part of the repository's history. cvs2svn's history reconstruction is relatively robust against timestamp errors and it writes monotonic timestamps to the Subversion repository.
Scalable
cvs2svn stores most intermediate data to on-disk databases so that it can convert very large CVS repositories using a reasonable amount of RAM. Conversions are organized as multiple passes and can be restarted at an arbitrary pass in the case of problems.
Configurable/extensible using Python
Many aspects of the conversion can be customized using Python plugins that interact with cvs2svn through documented interfaces (--options).
cvs2svn-2.4.0/www/issue_tracker.html0000664000076500007650000001053110646512632020565 0ustar mhaggermhagger00000000000000 cvs2svn Issue Tracker

cvs2svn Issue Tracker

Issue Tracker Guidelines

We welcome bug reports and enhancement requests. However, to make the process of prioritizing cvs2svn tasks easier, we ask that you follow a few guidelines. Before filing an issue in the Issue Tracker, please:

  • Look through the existing issues to determine if your concern has already been noted by someone else.
  • Make sure that you've read the appropriate documentation (for example, the README) to verify that you are using the software appropriately, and to determine if any problems you are seeing are perhaps not real bugs.
  • Send email to the dev list (dev@cvs2svn.tigris.org) fully describing the enhancement or bug that brought you here today. That will give the maintainers a chance to ask you questions, confirm that it is a bug, explain the behavior, etc.

What the Fields Mean

When an issue is first filed, it automatically goes in the "---" milestone, meaning it is unscheduled. A developer will examine it and maybe talk to other developers, then estimate the bug's severity, the effort required to fix it, and schedule it in a numbered milestone, for example 1.0. (Or they may put it in the future or no milestone milestones, if they consider it tolerable for all currently planned releases.)

An issue filed in future might still get fixed soon, if some committer decides they want it done. Putting it in future merely means we're not planning to block any particular release on that issue.

Severity is represented in the Priority field. Here is how priority numbers map to severity:

  • P1: Prevents work from getting done, causes data loss, or BFI ("Bad First Impression" -- too embarrassing for a public release).
  • P2: Workaround required to get stuff done.
  • P3: Like P2, but rarely encountered in normal usage.
  • P4: Developer concern only, API stability or cleanliness issue.
  • P5: Nice to fix, but in a pinch we could live with it.

Effort Required is sometimes represented in the Status Whiteboard with an "e number", which is the average of the most optimistic and most pessimistic projections for number of engineer/days needed to fix the bug. The e number always comes first, so we can sort on the field, but we include the actual spread after it, so we know when we're dealing with a wide range. For example "e2.5 (2 / 3)" is not quite the same as "e2.5 (1 / 4)"!

Enter the Issue Tracker

And so, without further ado, we give you (drumroll…) the cvs2svn Issue Tracker.

cvs2svn-2.4.0/www/index.html0000664000076500007650000000551612027257623017041 0ustar mhaggermhagger00000000000000 cvs2svn - CVS to Subversion Repository Converter
Mail users@cvs2svn.tigris.org if you have any questions or encounter any problems. You can also ask questions on IRC at irc.freenode.net, channel #cvs2svn.

What Is cvs2svn?

cvs2svn is a tool for migrating a CVS repository to Subversion, git, or Bazaar. The main design goals are robustness and 100% data preservation. cvs2svn can convert just about any CVS repository we've ever seen. For example, it has been used to convert gcc, FreeBSD, KDE, GNOME, PostgreSQL...

cvs2svn infers what happened in the history of your CVS repository and replicates that history as accurately as possible in the target SCM. All revisions, branches, tags, log messages, author names, and commit dates are converted. cvs2svn deduces what CVS modifications were made at the same time, and outputs these modifications grouped together as changesets in the target SCM. cvs2svn also deals with many CVS quirks and is highly configurable. See the comprehensive feature list.

You can get the latest release from the Downloads Area. Please read the documentation carefully before using cvs2svn.

For general use, the most recent released version of cvs2svn is usually the best choice. However, if you want to use the newest cvs2svn features or if you're debugging or patching cvs2svn, you might want to use the trunk version (which is usually quite stable). To do so, use Subversion to check out a working copy from http://cvs2svn.tigris.org/svn/cvs2svn/trunk/ using a command like

svn co --username=guest --password="" http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk
cvs2svn-2.4.0/www/validate.sh0000775000076500007650000000454612027262773017200 0ustar mhaggermhagger00000000000000#!/bin/bash -e # If this script gives errors like # # warning: failed to load external entity "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" # # and you happen to be using a Debian-derived version of Linux, it # might help to run (as root) # # apt-get install w3c-dtd-xhtml WWWDIR="`dirname \"$0\"`" ensure () { LOCALFILE="$WWWDIR/$2/`basename \"$1\"`" test -f "$LOCALFILE" || wget -O "$LOCALFILE" "$1" } # Download files necessary to preview the web pages locally. if [ ! -r "$WWWDIR/tigris-branding/.download-complete" ]; then BRANDING_URL="http://cvs2svn.tigris.org/branding" mkdir -p "$WWWDIR/tigris-branding/"{css,scripts,images} for i in tigris inst print; do ensure "$BRANDING_URL/css/$i.css" "tigris-branding/css" done ensure "$BRANDING_URL/scripts/tigris.js" "tigris-branding/scripts" for f in `sed -n -e 's,.*url(\.\./images/\([^)]*\).*,\1,;tp' \ -etp -ed -e:p -ep $WWWDIR/tigris-branding/css/*.css`; do case $f in collapsed_big.gif|expanded_big.gif) ;; # 404! *) ensure "$BRANDING_URL/images/$f" "tigris-branding/images" ;; esac done touch "$WWWDIR/tigris-branding/.download-complete" fi # Check we have DTDs available LOCAL_CATALOG="$WWWDIR/xhtml1.catalog" if [ ! -r "$LOCAL_CATALOG" ]; then RESULT=`echo 'resolve "-//W3C//DTD XHTML 1.0 Strict//EN" ' \ '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"' | xmlcatalog --shell` case $RESULT in *file:///*) # It looks like the system already provides this ;; *) ensure "http://www.w3.org/TR/xhtml1/xhtml1.tgz" "." rm -rf "$WWWDIR/xhtml1-20020801" "$LOCAL_CATALOG" tar -zxvf "$WWWDIR/xhtml1.tgz" xmlcatalog --noout --create "$LOCAL_CATALOG" xmlcatalog --noout --add rewriteSystem "http://www.w3.org/TR/xhtml1/" \ "`cd \"$WWWDIR\" && pwd`/xhtml1-20020801/" "$LOCAL_CATALOG" ;; esac fi test -r "$LOCAL_CATALOG" && export XML_CATALOG_FILES="$LOCAL_CATALOG" if [ $# -eq 0 ]; then echo "Usage: ./validate.sh ..." >&2; exit 1; fi if [ "$1" = "all" ]; then set - "$WWWDIR"/*.html fi if [ $# -eq 1 ]; then xmllint --nonet --noout --valid "$1"; exit $?; fi for f in "$@"; do case $f in *project_tools.html) echo "$f: Skipped" ;; *.html) xmllint --nonet --noout --valid "$f" && echo -e \ "$f: "'\033[32mvalid\033[0m' || echo -e "$f: "'\033[31;1mINVALID\033[0m' ;; *) echo "$f: Not HTML" ;; esac done cvs2svn-2.4.0/www/cvs2git.html0000664000076500007650000003437512027257623017320 0ustar mhaggermhagger00000000000000 cvs2git Documentation

cvs2git

Index


Introduction

cvs2svn/cvs2git is a tool that can be used to migrate CVS repositories to newer version control tools, including git. git is a distributed version control system most famous for being used for Linux kernel development. The program used to convert to git, called cvs2git, is distributed as part of the cvs2svn project.

If you are reading this documentation on the cvs2svn website, then please be aware that it describes the current trunk version of cvs2svn, which may be different than the most recent released version. Please refer to the documentation that was included with your version of cvs2svn.

Conversion to git was added in release 2.1 of cvs2svn and has improved significantly since then. Please make sure you are using an up-to-date version of cvs2svn--perhaps even the development trunk version.

Requirements

cvs2git requires the following:

  • Direct (filesystem) access to a copy of the CVS repository that you want to convert. cvs2git parses the files in the CVS repository directly, so it is not enough to have remote CVS access. See the FAQ for more information and a possible workaround.
  • Python 2, version 2.4 or later. See http://www.python.org/. (cvs2git does not work with Python 3.x.)
  • If you use the --use-rcs option, then RCS's `co' program is required. The RCS home page is http://www.cs.purdue.edu/homes/trinkle/RCS/. See the --use-rcs flag for more details.
  • If you use the --use-cvs option, then the `cvs' command is required. The CVS home page is http://ccvs.cvshome.org/. See the --use-cvs flag for more details.
  • Git version 1.5.4.4 or later (earlier versions have a bug in "git fast-import" that prevent them from loading the files generated by cvs2git).

Development status

Most of the work of converting a repository from CVS to a more modern version control system is inferring the most likely history given the incomplete information that CVS records. cvs2svn has a long history of making sense of even the most convoluted CVS repositories, and cvs2git uses this same machinery. Therefore, cvs2git inherits the robustness and many of the features of cvs2svn. cvs2svn can convert just about every CVS repository we have ever seen, and includes a plethora of options for customizing your conversion.

The output of cvs2git is one or more dump files that can be imported into git using the excellent git fast-import tool.

Although cvs2git is considerably newer than cvs2svn, and much less well tested, it is believed that cvs2git can (cautiously) be used for production conversions. If you use cvs2git, please let us know how it worked for you!

cvs2git limitations

cvs2git still has many limitations compared to cvs2svn. The main cvs2svn developer has limited git experience and very limited time, so help would be much appreciated! Some of these missing features would be pretty easy to program, and I'd be happy to help you get started.

  • The cvs2git documentation is still not as complete as that for cvs2svn. See below for more references.
  • Differences between CVS and git branch/tag models: CVS allows a branch or tag to be created from arbitrary combinations of source revisions from multiple source branches. It even allows file revisions that were never contemporaneous to be added to a single branch/tag. Git, on the other hand, only allows the full source tree, as it existed at some instant in the history, to be branched or tagged as a unit. Moreover, the ancestry of a git revision makes implications about the contents of that revision. This difference means that it is fundamentally impossible to represent an arbitrary CVS history in a git repository 100% faithfully. cvs2git uses the following workarounds:
    • cvs2git tries to create a branch from a single source, but if it can't figure out how to, it creates the branch using a "merge" from multiple source branches. In pathological situations, the number of merge sources for a branch can be arbitrarily large. The resulting history implies that whenever any file was added to a branch, the entire source branch was merged into the destination branch, which is clearly incorrect. (The alternative, to omit the merge, would discard the information that some content was moved from one branch to the other.)
    • If cvs2git cannot determine that a CVS tag can be created from a single revision, then it creates a tag fixup branch named TAG.FIXUP, then tags this branch. (This is a necessary workaround for the fact that git only allows existing revisions to be tagged.) The TAG.FIXUP branch is created as a merge between all of the branches that contain file revisions included in the tag, which involves the same tradeoff described above for branches. The TAG.FIXUP branch is cleared at the end of the conversion, but (due to a technical limitation of the git fast-import file format) not deleted. There are some situations when a tag could be created from a single revision, but cvs2git does not realize it and creates a superfluous tag fixup branch. It is possible to delete superfluous tag fixup branches after the conversion by running the contrib/git-move-refs.py script within the resulting git repository.
  • There are no checks that CVS branch and tag names are legal git names. There are probably other git constraints that should also be checked.
  • The data that should be fed to git fast-import are written to two files, which have to be loaded into git fast-import manually. These files might grow to very large size. It would be nice to add an option to invoke git fast-import automatically and pipe the output directly into git fast-import; this should also speed up the conversion.
  • Only single projects can be converted at a time. Given the way git is typically used, I don't think that this is a significant limitation.
  • The cvs2svn test suite does not include meaningful tests of git output.
  • cvs2git makes no attempt to convert .cvsignore files into .gitignore files.
  • cvs2git, like cvs2svn, does not support incremental conversion (i.e., tracking a live CVS repository). However, at least one person has documented a possible workaround.

Documentation

There is some documentation specific to cvs2git, and much of the cvs2svn documentation also applies fairly straightforwardly to cvs2git. See the following sources:

  • This document.
  • The output of cvs2git --help.
  • The cvs2git man page. If the man page is not installed on your Unix-like system, you can view it by typing a command like cvs2git --man | groff -man -Tascii | less.
  • The cvs2svn documentation and the cvs2svn FAQ, which contain much general discussion and describe many features that can also be used for cvs2git.
  • cvs2git-example.options in the cvs2svn source tree, which is an example of an options file that can be used to configure a cvs2git conversion. The file is extensively documented.
  • The cvs2svn mailing lists, IRC channel, etc., as described in the cvs2svn FAQ.

Usage

This section outlines the steps needed to convert a CVS repository to git using cvs2git.

  1. Be sure that you have the requirements, including either RCS or CVS (used to read revision contents from the CVS repository).
  2. Obtain a copy of cvs2svn/cvs2git version 2.1 or newer. It is recommended that you use the most recent version available, or even the development version.
    • To install cvs2svn from a tarball, simply unpack the tarball into a directory on your conversion computer (cvs2git can be run directly from this directory).
    • To check out the current trunk version of cvs2svn, make sure that you have Subversion installed and then run:

      svn co --username=guest --password="" http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk
      cd cvs2svn-trunk
      make man # If you want to create manpages for the main programs
      make check # ...optional
      

      Please note that the test suite includes tests that are marked "XFAIL" (expected failure); these are known and are not considered serious problems.

  3. Configure cvs2git and run the conversion. This can be done via command-line options or via an options file:
    • The command-line options for running cvs2git are documented in the cvs2git man page and in the output of cvs2git --help. For example:

      cvs2git \
          --blobfile=cvs2svn-tmp/git-blob.dat \
          --dumpfile=cvs2svn-tmp/git-dump.dat \
          --username=cvs2git \
          /path/to/cvs/repo
      
    • The more flexible options-file method requires you to create an options file, then start cvs2git with

      cvs2git --options=OPTIONS-FILE
      

      Use cvs2git-example.options in the cvs2svn source tree as your starting point; the file contains lots of documentation.

    This creates two output files in git fast-import format. The names of these files are specified by your options file or command-line arguments. In the example, these files are named cvs2svn-tmp/git-blob.dat and cvs2svn-tmp/git-dump.dat.

  4. Initialize a git repository:

    mkdir myproject.git
    cd myproject.git
    git init --bare
    
  5. Load the dump files into the new git repository using git fast-import:

    git fast-import --export-marks=../cvs2svn-tmp/git-marks.dat < ../cvs2svn-tmp/git-blob.dat
    git fast-import --import-marks=../cvs2svn-tmp/git-marks.dat < ../cvs2svn-tmp/git-dump.dat
    

    On Linux/Unix this can be shortened to:

    cat ../cvs2svn-tmp/git-blob.dat ../cvs2svn-tmp/git-dump.dat | git fast-import
    
  6. (Optional) View the results of the conversion, for example:

    gitk --all
    
  7. (Recommended) To get rid of unnecessary tag fixup branches, run the contrib/git-move-refs.py script from within the git repository.
  8. The result of this procedure is a bare git repository (one that does not have a checked-out version of the source tree). This is the type of repository that you would put on your server. To work on your project, make a non-bare clone (one that includes a checked-out source tree):

    cd $HOME
    git clone /path/to/myproject.git
    cd myproject
    

    Now you are ready to start editing files and committing to git!

Converting to a non-bare repository

If you want to convert into a non-bare git repository (one including a working tree), then you need to make two changes to the above procedure:

  • Omit the --bare option in step 4; i.e., type

    mkdir myproject.git
    cd myproject.git
    git init
    
  • When the conversion is done, instead of cloning as described in step 8, you need to explicitly check out the "master" version of the files into your working tree:

    git checkout
    

Feedback would be much appreciated, including reports of success using cvs2git. Please send comments, bug reports, and patches to the cvs2svn mailing lists.

cvs2svn-2.4.0/www/cvs2bzr.html0000664000076500007650000002411311435427214017314 0ustar mhaggermhagger00000000000000 cvs2bzr Documentation

cvs2bzr

Index


Introduction

cvs2svn/cvs2bzr is a tool that can be used to migrate CVS repositories to newer version control tools, including Bazaar. Bazaar is an adaptive version control system that supports both centralised and distributed version control. It is most famous for being used for Ubuntu and MySQL development. The program used to convert to Bazaar, called cvs2bzr, is distributed as part of the cvs2svn project.

If you are reading this documentation on the cvs2svn website, then please be aware that it describes the current trunk version of cvs2svn, which may be different than the most recent released version. Please refer to the documentation that was included with your version of cvs2svn.

Conversion to Bazaar was added in release 2.3 of cvs2svn and may have improved significantly since then. Please make sure you are using an up-to-date version of cvs2svn--perhaps even the development trunk version.

Requirements

cvs2bzr requires the following:

  • Direct (filesystem) access to a copy of the CVS repository that you want to convert. cvs2bzr parses the files in the CVS repository directly, so it is not enough to have remote CVS access. See the FAQ for more information and a possible workaround.
  • Python 2, version 2.4 or later. See http://www.python.org/. (cvs2bzr does not work with Python 3.x.)
  • If you use the --use-rcs option, then RCS's `co' program is required. The RCS home page is http://www.cs.purdue.edu/homes/trinkle/RCS/. See the --use-rcs flag for more details.
  • If you use the --use-cvs option, then the `cvs' command is required. The CVS home page is http://ccvs.cvshome.org/. See the --use-cvs flag for more details.
  • Bazaar version 1.13 or later.
  • The bzr-fastimport plugin version 0.9 or later.

Development status

Most of the work of converting a repository from CVS to a more modern version control system is inferring the most likely history given the incomplete information that CVS records. cvs2svn has a long history of making sense of even the most convoluted CVS repositories, and cvs2bzr uses this same machinery. Therefore, cvs2bzr inherits the robustness and many of the features of cvs2svn. cvs2svn can convert just about every CVS repository we have ever seen, and includes a plethora of options for customizing your conversion.

The output of cvs2bzr is a "fastimport" dump file that can be imported into Bazaar using the bzr-fastimport plugin.

Although cvs2bzr is considerably newer than cvs2svn, and much less well tested, it is believed that cvs2bzr can (cautiously) be used for production conversions. If you use cvs2bzr, please let us know how it worked for you!

cvs2bzr limitations

cvs2bzr still has many limitations compared to cvs2svn. The main cvs2svn developer has limited Bazaar experience and very limited time, so help would be much appreciated! Some of these missing features would be pretty easy to program, and I'd be happy to help you get started.

  • The cvs2bzr documentation is still rather thin. See below for more references.
  • CVS allows a branch to be created from arbitrary combinations of source revisions and/or source branches. cvs2bzr tries to create a branch from a single source, but if it can't figure out how to, it creates the branch using "merge" from multiple sources. In pathological situations, the number of merge sources for a branch can be arbitrarily large.
  • There are no checks that CVS branch and tag names are legal names in Bazaar. This is unlikely to be a problem because Bazaar uses paths for branch names similar to CVS and Subversion. Tag naming in Bazaar is also more flexible than in git, say.
  • Only single projects can be converted at a time. Given the way Bazaar is typically used, I don't think that this is a significant limitation.
  • cvs2bzr is not especially fast. Among other things, it still uses RCS or CVS to extract the contents of the CVS revisions. Implementing the --use-internal-co option for cvs2bzr (using code that already exists in cvs2svn) might improve the conversion speed considerably.
  • The cvs2svn test suite does not include meaningful tests of Bazaar output.
  • cvs2bzr makes no attempt to convert .cvsignore files into .bzrignore files.
  • cvs2bzr, like cvs2svn, does not support incremental conversion (i.e., tracking a live CVS repository). However, this possible workaround for using cvs2git along those lines might provide some assistance for anyone wanting to try doing that using cvs2bzr.

Documentation

There is some documentation specific to cvs2bzr, and much of the cvs2svn documentation also applies fairly straightforwardly to cvs2bzr. See the following sources:

  • This document.
  • The cvs2bzr man page and the output of cvs2bzr --help.
  • The cvs2svn documentation and the cvs2svn FAQ, which contain much general discussion and describe many features that can also be used for cvs2bzr.
  • cvs2bzr-example.options in the cvs2svn source tree, which is an example of an options file that can be used to configure a cvs2bzr conversion. The file is extensively documented.
  • The cvs2svn mailing lists, IRC channel, etc., as described in the cvs2svn FAQ.
  • The Bazaar Data Migration Guide.

Usage

This section outlines the steps needed to convert a CVS repository to Bazaar using cvs2bzr.

  1. Be sure that you have the requirements, including either RCS or CVS (used to read revision contents from the CVS repository).
  2. Obtain a copy of cvs2svn/cvs2bzr version 2.3 or newer. It is recommended that you use the most recent version available, or even the development version.
    • To install cvs2svn from a tarball, simply unpack the tarball into a directory on your conversion computer (cvs2bzr can be run directly from this directory).
    • To check out the current trunk version of cvs2svn, make sure that you have Subversion installed and then run:

      svn co --username=guest --password="" http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk
      cd cvs2svn-trunk
      make man # If you want to create manpages for the main programs
      make check # ...optional
      

      Please note that the test suite includes tests that are marked "XFAIL" (expected failure); these are known and are not considered serious problems.

  3. Configure cvs2bzr for your conversion. This can be done via command-line options or via an options file:
    • The command-line options for running cvs2bzr are documented in the cvs2bzr man page and in the output of cvs2bzr --help.
    • The more flexible options-file method requires you to create an options file, then start cvs2bzr with

      cvs2bzr --options=OPTIONS-FILE
      

      Use cvs2bzr-example.options in the cvs2svn source tree as your starting point; the file contains lots of documentation.

  4. Run cvs2bzr. This creates an output file in fast-import format. The name of this file is specified by your options file or a command-line argument. In the example, the file is named cvs2svn-tmp/dumpfile.fi.

  5. Load the dump file using bzr fast-import:

    bzr fast-import cvs2svn-tmp/dumpfile.fi project.bzr
    

Feedback would be much appreciated, including reports of success using cvs2bzr. Please send comments, bug reports, and patches to the cvs2svn mailing lists.

cvs2svn-2.4.0/BUGS0000664000076500007650000000441710646512653014704 0ustar mhaggermhagger00000000000000 -*- text -*- REPORTING BUGS ============== This document tells how and where to report bugs in cvs2svn. It is not a list of all outstanding bugs -- we use an online issue tracker for that, see http://cvs2svn.tigris.org/issue_tracker.html Before reporting a bug: a) Verify that you are running the latest version of cvs2svn. b) Read the current frequently-asked-questions list at http://cvs2svn.tigris.org/faq.html to see if your problem has a known solution, and to help determine if your problem is caused by corruption in your CVS repository. c) Check to see if your bug is already filed in the issue tracker (see http://tinyurl.com/2uxwv for a list of all open bugs). Then, mail your bug report to dev@cvs2svn.tigris.org. To be useful, a bug report should include the following information: * The revision of cvs2svn you ran. Run 'cvs2svn --version' to determine this. * The version of Subversion you used it with. Run 'svnadmin --version' to determine this. * The exact cvs2svn command line you invoked, and the output it produced. * The contents of the configuration file that you used (if you used the --config option). * The data you ran it on. If your CVS repository is small (only a few kilobytes), then just provide the repository itself. If it's large, or if the data is confidential, then please try to come up with some smaller, releasable data set that still stimulates the bug. The cvs2svn project includes one script that can often help you narrow down the source of the bug to just a few *,v files, and another that helps strip proprietary information out of your repository. See the FAQ (http://cvs2svn.tigris.org/faq.html) for more information. The most important thing is that we be able to reproduce the bug :-). If we can reproduce it, we can usually fix it. If we can't reproduce it, we'll probably never fix it. So describing the bug conditions accurately is crucial. If in addition to that, you want to add some speculations as to the cause of the bug, or even include a patch to fix it, that's great! Thank you in advance, -The cvs2svn development team cvs2svn-2.4.0/cvs2bzr0000775000076500007650000000464211244044454015534 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2009 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== import sys # Make sure that a supported version of Python is being used. Do this # as early as possible, using only code compatible with Python 1.5.2 # and Python 3.x before the check. Remember: # # Python 1.5.2 doesn't have sys.version_info or ''.join(). # Python 3.0 doesn't have string.join(). # There are plans to start deprecating the string formatting '%' # operator in Python 3.1 (but we use it here anyway). 
version_error = """\ ERROR: cvs2bzr requires Python 2, version 2.4 or later; it does not work with Python 3. You are currently using""" version_advice = """\ Please restart cvs2bzr using a different version of the Python interpreter. Visit http://www.python.org or consult your local system administrator if you need help. HINT: If you already have a usable Python version installed, it might be possible to invoke cvs2bzr with the correct Python interpreter by typing something like 'python2.5 """ + sys.argv[0] + """ [...]'. """ try: version = sys.version_info except AttributeError: # This is probably a pre-2.0 version of Python. sys.stderr.write(version_error + '\n') sys.stderr.write('-'*70 + '\n') sys.stderr.write(sys.version + '\n') sys.stderr.write('-'*70 + '\n') sys.stderr.write(version_advice) sys.exit(1) if not ((2,4) <= version < (3,0)): sys.stderr.write( version_error + ' version %d.%d.%d.\n' % (version[0], version[1], version[2],) ) sys.stderr.write(version_advice) sys.exit(1) import os from cvs2svn_lib.common import FatalException from cvs2svn_lib.main import bzr_main try: bzr_main(os.path.basename(sys.argv[0]), sys.argv[1:]) except FatalException, e: sys.stderr.write(str(e) + '\n') sys.exit(1) cvs2svn-2.4.0/COMMITTERS0000664000076500007650000000201210754725354015623 0ustar mhaggermhagger00000000000000The following people have commit access to cvs2svn. This is not a full list of cvs2svn's authors, however -- for that, you'd need to look over the log messages to see all the patch contributors. If you have a question or comment, it's probably best to mail dev@cvs2svn.tigris.org, rather than mailing any of these people directly. cmpilato C. Michael Pilato gstein Greg Stein brane Branko Čibej blair Blair Zajac maxb Max Bowsher fitz Brian W. Fitzpatrick ringstrom Tobias Ringström kfogel Karl Fogel dionisos Erik Hülsmann david David Summers mhagger Michael Haggerty ossi Oswald Buddenhagen ## Local Variables: ## coding:utf-8 ## End: ## vim:encoding=utf8 cvs2svn-2.4.0/cvs2svn0000775000076500007650000000464211134563221015541 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # (Be in -*- python -*- mode.) # # ==================================================================== # Copyright (c) 2000-2008 CollabNet. All rights reserved. # # This software is licensed as described in the file COPYING, which # you should have received as part of this distribution. The terms # are also available at http://subversion.tigris.org/license-1.html. # If newer versions of this license are posted there, you may use a # newer version instead, at your option. # # This software consists of voluntary contributions made by many # individuals. For exact contribution history, see the revision # history and logs, available at http://cvs2svn.tigris.org/. # ==================================================================== import sys # Make sure that a supported version of Python is being used. Do this # as early as possible, using only code compatible with Python 1.5.2 # and Python 3.x before the check. Remember: # # Python 1.5.2 doesn't have sys.version_info or ''.join(). # Python 3.0 doesn't have string.join(). # There are plans to start deprecating the string formatting '%' # operator in Python 3.1 (but we use it here anyway). version_error = """\ ERROR: cvs2svn requires Python 2, version 2.4 or later; it does not work with Python 3. You are currently using""" version_advice = """\ Please restart cvs2svn using a different version of the Python interpreter. 
Visit http://www.python.org or consult your local system administrator if you need help. HINT: If you already have a usable Python version installed, it might be possible to invoke cvs2svn with the correct Python interpreter by typing something like 'python2.5 """ + sys.argv[0] + """ [...]'. """ try: version = sys.version_info except AttributeError: # This is probably a pre-2.0 version of Python. sys.stderr.write(version_error + '\n') sys.stderr.write('-'*70 + '\n') sys.stderr.write(sys.version + '\n') sys.stderr.write('-'*70 + '\n') sys.stderr.write(version_advice) sys.exit(1) if not ((2,4) <= version < (3,0)): sys.stderr.write( version_error + ' version %d.%d.%d.\n' % (version[0], version[1], version[2],) ) sys.stderr.write(version_advice) sys.exit(1) import os from cvs2svn_lib.common import FatalException from cvs2svn_lib.main import svn_main try: svn_main(os.path.basename(sys.argv[0]), sys.argv[1:]) except FatalException, e: sys.stderr.write(str(e) + '\n') sys.exit(1) cvs2svn-2.4.0/svntest/0000775000076500007650000000000012027373500015710 5ustar mhaggermhagger00000000000000cvs2svn-2.4.0/svntest/sandbox.py0000664000076500007650000002056011434364627017736 0ustar mhaggermhagger00000000000000# # sandbox.py : tools for manipulating a test's working area ("a sandbox") # # ==================================================================== # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. # ==================================================================== # import os import shutil import copy import svntest class Sandbox: """Manages a sandbox (one or more repository/working copy pairs) for a test to operate within.""" dependents = None def __init__(self, module, idx): self._set_name("%s-%d" % (module, idx)) # This flag is set to True by build() and returned by is_built() self._is_built = False def _set_name(self, name, read_only=False): """A convenience method for renaming a sandbox, useful when working with multiple repositories in the same unit test.""" if not name is None: self.name = name self.read_only = read_only self.wc_dir = os.path.join(svntest.main.general_wc_dir, self.name) if not read_only: self.repo_dir = os.path.join(svntest.main.general_repo_dir, self.name) self.repo_url = (svntest.main.options.test_area_url + '/' + svntest.main.pathname2url(self.repo_dir)) else: self.repo_dir = svntest.main.pristine_dir self.repo_url = svntest.main.pristine_url ### TODO: Move this into to the build() method # For dav tests we need a single authz file which must be present, # so we recreate it each time a sandbox is created with some default # contents. 
if self.repo_url.startswith("http"): # this dir doesn't exist out of the box, so we may have to make it if not os.path.exists(svntest.main.work_dir): os.makedirs(svntest.main.work_dir) self.authz_file = os.path.join(svntest.main.work_dir, "authz") open(self.authz_file, 'w').write("[/]\n* = rw\n") # For svnserve tests we have a per-repository authz file, and it # doesn't need to be there in order for things to work, so we don't # have any default contents. elif self.repo_url.startswith("svn"): self.authz_file = os.path.join(self.repo_dir, "conf", "authz") self.test_paths = [self.wc_dir, self.repo_dir] def clone_dependent(self, copy_wc=False): """A convenience method for creating a near-duplicate of this sandbox, useful when working with multiple repositories in the same unit test. If COPY_WC is true, make an exact copy of this sandbox's working copy at the new sandbox's working copy directory. Any necessary cleanup operations are triggered by cleanup of the original sandbox.""" if not self.dependents: self.dependents = [] clone = copy.deepcopy(self) self.dependents.append(clone) clone._set_name("%s-%d" % (self.name, len(self.dependents))) if copy_wc: self.add_test_path(clone.wc_dir) shutil.copytree(self.wc_dir, clone.wc_dir, symlinks=True) return clone def build(self, name=None, create_wc=True, read_only=False): """Make a 'Greek Tree' repo (or refer to the central one if READ_ONLY), and check out a WC from it (unless CREATE_WC is false). Change the sandbox's name to NAME. See actions.make_repo_and_wc() for details.""" self._set_name(name, read_only) if svntest.actions.make_repo_and_wc(self, create_wc, read_only): raise svntest.Failure("Could not build repository and sandbox '%s'" % self.name) else: self._is_built = True def add_test_path(self, path, remove=True): self.test_paths.append(path) if remove: svntest.main.safe_rmtree(path) def add_repo_path(self, suffix, remove=True): """Generate a path, under the general repositories directory, with a name that ends in SUFFIX, e.g. suffix="2" -> ".../basic_tests.2". If REMOVE is true, remove anything currently on disk at that path. Remember that path so that the automatic clean-up mechanism can delete it at the end of the test. Generate a repository URL to refer to a repository at that path. Do not create a repository. Return (REPOS-PATH, REPOS-URL).""" path = (os.path.join(svntest.main.general_repo_dir, self.name) + '.' + suffix) url = svntest.main.options.test_area_url + \ '/' + svntest.main.pathname2url(path) self.add_test_path(path, remove) return path, url def add_wc_path(self, suffix, remove=True): """Generate a path, under the general working copies directory, with a name that ends in SUFFIX, e.g. suffix="2" -> ".../basic_tests.2". If REMOVE is true, remove anything currently on disk at that path. Remember that path so that the automatic clean-up mechanism can delete it at the end of the test. Do not create a working copy. Return the generated WC-PATH.""" path = self.wc_dir + '.' + suffix self.add_test_path(path, remove) return path def cleanup_test_paths(self): "Clean up detritus from this sandbox, and any dependents." if self.dependents: # Recursively cleanup any dependent sandboxes. for sbox in self.dependents: sbox.cleanup_test_paths() # cleanup all test specific working copies and repositories for path in self.test_paths: if not path is svntest.main.pristine_dir: _cleanup_test_path(path) def is_built(self): "Returns True when build() has been called on this instance." 
return self._is_built def ospath(self, relpath, wc_dir=None): if wc_dir is None: wc_dir = self.wc_dir return os.path.join(wc_dir, svntest.wc.to_ospath(relpath)) def simple_commit(self, target=None): assert not self.read_only if target is None: target = self.wc_dir svntest.main.run_svn(False, 'commit', '-m', svntest.main.make_log_msg(), target) def simple_rm(self, *targets): assert len(targets) > 0 if len(targets) == 1 and is_url(targets[0]): assert not self.read_only targets = ('-m', svntests.main.make_log_msg(), targets[0]) svntest.main.run_svn(False, 'rm', *targets) def simple_mkdir(self, *targets): assert len(targets) > 0 if len(targets) == 1 and is_url(targets[0]): assert not self.read_only targets = ('-m', svntests.main.make_log_msg(), targets[0]) svntest.main.run_svn(False, 'mkdir', *targets) def simple_add(self, *targets): assert len(targets) > 0 svntest.main.run_svn(False, 'add', *targets) def simple_revert(self, *targets): assert len(targets) > 0 svntest.main.run_svn(False, 'revert', *targets) def simple_propset(self, name, value, *targets): assert len(targets) > 0 svntest.main.run_svn(False, 'propset', name, value, *targets) def simple_propdel(self, name, *targets): assert len(targets) > 0 svntest.main.run_svn(False, 'propdel', name, *targets) def is_url(target): return (target.startswith('^/') or target.startswith('file://') or target.startswith('http://') or target.startswith('https://') or target.startswith('svn://') or target.startswith('svn+ssh://')) _deferred_test_paths = [] def cleanup_deferred_test_paths(): global _deferred_test_paths test_paths = _deferred_test_paths _deferred_test_paths = [] for path in test_paths: _cleanup_test_path(path, True) def _cleanup_test_path(path, retrying=False): if svntest.main.options.verbose: if retrying: print("CLEANUP: RETRY: %s" % path) else: print("CLEANUP: %s" % path) try: svntest.main.safe_rmtree(path) except: if svntest.main.verbose_mode: print("WARNING: cleanup failed, will try again later") _deferred_test_paths.append(path) cvs2svn-2.4.0/svntest/testcase.py0000664000076500007650000002026711434364627020117 0ustar mhaggermhagger00000000000000# # testcase.py: Control of test case execution. # # Subversion is a tool for revision control. # See http://subversion.tigris.org for more information. # # ==================================================================== # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. 
###################################################################### import os, types, sys import svntest # if somebody does a "from testcase import *", they only get these names __all__ = ['XFail', 'Wimp', 'Skip', 'SkipUnless'] RESULT_OK = 'ok' RESULT_FAIL = 'fail' RESULT_SKIP = 'skip' class TextColors: '''Some ANSI terminal constants for output color''' ENDC = '\033[0;m' FAILURE = '\033[1;31m' SUCCESS = '\033[1;32m' @classmethod def disable(cls): cls.ENDC = '' cls.FAILURE = '' cls.SUCCESS = '' @classmethod def success(cls, str): return lambda: cls.SUCCESS + str + cls.ENDC @classmethod def failure(cls, str): return lambda: cls.FAILURE + str + cls.ENDC if not sys.stdout.isatty() or sys.platform == 'win32': TextColors.disable() class TestCase: """A thing that can be tested. This is an abstract class with several methods that need to be overridden.""" _result_map = { RESULT_OK: (0, TextColors.success('PASS: '), True), RESULT_FAIL: (1, TextColors.failure('FAIL: '), False), RESULT_SKIP: (2, TextColors.success('SKIP: '), True), } def __init__(self, delegate=None, cond_func=lambda: True, doc=None, wip=None): """Create a test case instance based on DELEGATE. COND_FUNC is a callable that is evaluated at test run time and should return a boolean value that determines how a pass or failure is interpreted: see the specialized kinds of test case such as XFail and Skip for details. The evaluation of COND_FUNC is deferred so that it can base its decision on useful bits of information that are not available at __init__ time (like the fact that we're running over a particular RA layer). DOC is ... WIP is ... """ assert hasattr(cond_func, '__call__') self._delegate = delegate self._cond_func = cond_func self.description = doc or delegate.description self.inprogress = wip def get_function_name(self): """Return the name of the python function implementing the test.""" return self._delegate.get_function_name() def get_sandbox_name(self): """Return the name that should be used for the sandbox. If a sandbox should not be constructed, this method returns None. """ return self._delegate.get_sandbox_name() def run(self, sandbox): """Run the test within the given sandbox.""" return self._delegate.run(sandbox) def list_mode(self): return '' def results(self, result): # if our condition applied, then use our result map. otherwise, delegate. if self._cond_func(): val = list(self._result_map[result]) val[1] = val[1]() return val return self._delegate.results(result) class FunctionTestCase(TestCase): """A TestCase based on a naked Python function object. FUNC should be a function that returns None on success and throws an svntest.Failure exception on failure. It should have a brief docstring describing what it does (and fulfilling certain conditions). FUNC must take one argument, an Sandbox instance. (The sandbox name is derived from the file name in which FUNC was defined) """ def __init__(self, func): # it better be a function that accepts an sbox parameter and has a # docstring on it. 
assert isinstance(func, types.FunctionType) name = func.func_name assert func.func_code.co_argcount == 1, \ '%s must take an sbox argument' % name doc = func.__doc__.strip() assert doc, '%s must have a docstring' % name # enforce stylistic guidelines for the function docstrings: # - no longer than 50 characters # - should not end in a period # - should not be capitalized assert len(doc) <= 50, \ "%s's docstring must be 50 characters or less" % name assert doc[-1] != '.', \ "%s's docstring should not end in a period" % name assert doc[0].lower() == doc[0], \ "%s's docstring should not be capitalized" % name TestCase.__init__(self, doc=doc) self.func = func def get_function_name(self): return self.func.func_name def get_sandbox_name(self): """Base the sandbox's name on the name of the file in which the function was defined.""" filename = self.func.func_code.co_filename return os.path.splitext(os.path.basename(filename))[0] def run(self, sandbox): return self.func(sandbox) class XFail(TestCase): """A test that is expected to fail, if its condition is true.""" _result_map = { RESULT_OK: (1, TextColors.failure('XPASS:'), False), RESULT_FAIL: (0, TextColors.success('XFAIL:'), True), RESULT_SKIP: (2, TextColors.success('SKIP: '), True), } def __init__(self, test_case, cond_func=lambda: True, wip=None): """Create an XFail instance based on TEST_CASE. COND_FUNC is a callable that is evaluated at test run time and should return a boolean value. If COND_FUNC returns true, then TEST_CASE is expected to fail (and a pass is considered an error); otherwise, TEST_CASE is run normally. The evaluation of COND_FUNC is deferred so that it can base its decision on useful bits of information that are not available at __init__ time (like the fact that we're running over a particular RA layer). WIP is ...""" TestCase.__init__(self, create_test_case(test_case), cond_func, wip=wip) def list_mode(self): # basically, the only possible delegate is a Skip test. favor that mode. return self._delegate.list_mode() or 'XFAIL' class Wimp(XFail): """Like XFail, but indicates a work-in-progress: an unexpected pass is not considered a test failure.""" _result_map = { RESULT_OK: (0, TextColors.success('XPASS:'), True), RESULT_FAIL: (0, TextColors.success('XFAIL:'), True), RESULT_SKIP: (2, TextColors.success('SKIP: '), True), } def __init__(self, wip, test_case, cond_func=lambda: True): XFail.__init__(self, test_case, cond_func, wip) class Skip(TestCase): """A test that will be skipped if its conditional is true.""" def __init__(self, test_case, cond_func=lambda: True): """Create a Skip instance based on TEST_CASE. COND_FUNC is a callable that is evaluated at test run time and should return a boolean value. If COND_FUNC returns true, then TEST_CASE is skipped; otherwise, TEST_CASE is run normally. 
The evaluation of COND_FUNC is deferred so that it can base its decision on useful bits of information that are not available at __init__ time (like the fact that we're running over a particular RA layer).""" TestCase.__init__(self, create_test_case(test_case), cond_func) def list_mode(self): if self._cond_func(): return 'SKIP' return self._delegate.list_mode() def get_sandbox_name(self): if self._cond_func(): return None return self._delegate.get_sandbox_name() def run(self, sandbox): if self._cond_func(): raise svntest.Skip return self._delegate.run(sandbox) class SkipUnless(Skip): """A test that will be skipped if its conditional is false.""" def __init__(self, test_case, cond_func): Skip.__init__(self, test_case, lambda c=cond_func: not c()) def create_test_case(func): if isinstance(func, TestCase): return func else: return FunctionTestCase(func) cvs2svn-2.4.0/svntest/tree.py0000664000076500007650000007141211434364627017241 0ustar mhaggermhagger00000000000000# # tree.py: tools for comparing directory trees # # Subversion is a tool for revision control. # See http://subversion.tigris.org for more information. # # ==================================================================== # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. ###################################################################### import re import os import sys if sys.version_info[0] >= 3: # Python >=3.0 from io import StringIO else: # Python <3.0 from StringIO import StringIO from xml.dom.minidom import parseString import base64 import svntest # Tree Exceptions. # All tree exceptions should inherit from SVNTreeError class SVNTreeError(svntest.Failure): "Exception raised if you screw up in the tree module." pass class SVNTreeUnequal(SVNTreeError): "Exception raised if two trees are unequal." pass class SVNTypeMismatch(SVNTreeError): "Exception raised if one node is file and other is dir" pass #======================================================================== # ===> Overview of our Datastructures <=== # The general idea here is that many, many things can be represented by # a tree structure: # - a working copy's structure and contents # - the output of 'svn status' # - the output of 'svn checkout/update' # - the output of 'svn commit' # The idea is that a test function creates a "expected" tree of some # kind, and is then able to compare it to an "actual" tree that comes # from running the Subversion client. This is what makes a test # automated; if an actual and expected tree match exactly, then the test # has passed. (See compare_trees() below.) # The SVNTreeNode class is the fundamental data type used to build tree # structures. The class contains a method for "dropping" a new node # into an ever-growing tree structure. (See also create_from_path()). 
# We have four parsers in this file for the four use cases listed above: # each parser examines some kind of input and returns a tree of # SVNTreeNode objects. (See build_tree_from_checkout(), # build_tree_from_commit(), build_tree_from_status(), and # build_tree_from_wc()). These trees are the "actual" trees that result # from running the Subversion client. # Also necessary, of course, is a convenient way for a test to create an # "expected" tree. The test *could* manually construct and link a bunch # of SVNTreeNodes, certainly. But instead, all the tests are using the # build_generic_tree() routine instead. # build_generic_tree() takes a specially-formatted list of lists as # input, and returns a tree of SVNTreeNodes. The list of lists has this # structure: # [ ['/full/path/to/item', 'text contents', {prop-hash}, {att-hash}], # [...], # [...], # ... ] # You can see that each item in the list essentially defines an # SVNTreeNode. build_generic_tree() instantiates a SVNTreeNode for each # item, and then drops it into a tree by parsing each item's full path. # So a typical test routine spends most of its time preparing lists of # this format and sending them to build_generic_tree(), rather than # building the "expected" trees directly. # ### Note: in the future, we'd like to remove this extra layer of # ### abstraction. We'd like the SVNTreeNode class to be more # ### directly programmer-friendly, providing a number of accessor # ### routines, so that tests can construct trees directly. # The first three fields of each list-item are self-explanatory. It's # the fourth field, the "attribute" hash, that needs some explanation. # The att-hash is used to place extra information about the node itself, # depending on the parsing context: # - in the 'svn co/up' use-case, each line of output starts with two # characters from the set of (A, D, G, U, C, _) or 'Restored'. The # status code is stored in a attribute named 'status'. In the case # of a restored file, the word 'Restored' is stored in an attribute # named 'verb'. # - in the 'svn ci/im' use-case, each line of output starts with one # of the words (Adding, Deleting, Sending). This verb is stored in # an attribute named 'verb'. # - in the 'svn status' use-case (which is always run with the -v # (--verbose) flag), each line of output contains a working revision # number and a two-letter status code similar to the 'svn co/up' # case. This information is stored in attributes named 'wc_rev' # and 'status'. The repository revision is also printed, but it # is ignored. # - in the working-copy use-case, the att-hash is ignored. # Finally, one last explanation: the file 'actions.py' contain a number # of helper routines named 'run_and_verify_FOO'. These routines take # one or more "expected" trees as input, then run some svn subcommand, # then push the output through an appropriate parser to derive an # "actual" tree. Then it runs compare_trees() and raises an exception # on failure. This is why most tests typically end with a call to # run_and_verify_FOO(). #======================================================================== # A node in a tree. # # If CHILDREN is None, then the node is a file. Otherwise, CHILDREN # is a list of the nodes making up that directory's children. # # NAME is simply the name of the file or directory. CONTENTS is a # string that contains the file's contents (if a file), PROPS are # properties attached to files or dirs, and ATTS is a dictionary of # other metadata attached to the node. 
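# For illustration only -- a hypothetical sketch, not part of the test
# suite: given the list-of-lists format described above, an "expected"
# tree for a tiny working copy might be built along these lines (the
# paths, contents, and status codes here are made up):
#
#   expected = build_generic_tree([
#     ['wc/iota', "This is the file 'iota'.\n", {}, {'status': 'A '}],
#     ['wc/A',    None,                         {}, {'status': 'A '}],
#     ['wc/A/mu', "This is the file 'mu'.\n",   {}, {'status': 'A '}],
#   ])
#
# Each inner list supplies a path, the file contents (None for a
# directory), a property hash, and an attribute hash;
# build_generic_tree() drops each item into a tree of SVNTreeNodes
# rooted at root_node_name.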
class SVNTreeNode: def __init__(self, name, children=None, contents=None, props={}, atts={}): self.name = name self.children = children self.contents = contents self.props = props self.atts = atts self.path = name # TODO: Check to make sure contents and children are mutually exclusive def add_child(self, newchild): child_already_exists = 0 if self.children is None: # if you're a file, self.children = [] # become an empty dir. else: for a in self.children: if a.name == newchild.name: child_already_exists = 1 break if child_already_exists: if newchild.children is None: # this is the 'end' of the chain, so copy any content here. a.contents = newchild.contents a.props = newchild.props a.atts = newchild.atts a.path = os.path.join(self.path, newchild.name) else: # try to add dangling children to your matching node for i in newchild.children: a.add_child(i) else: self.children.append(newchild) newchild.path = os.path.join(self.path, newchild.name) def pprint(self, stream = sys.stdout): "Pretty-print the meta data for this node to STREAM." stream.write(" * Node name: %s\n" % self.name) stream.write(" Path: %s\n" % self.path) mime_type = self.props.get("svn:mime-type") if not mime_type or mime_type.startswith("text/"): if self.children is not None: stream.write(" Contents: N/A (node is a directory)\n") else: stream.write(" Contents: %s\n" % self.contents) else: stream.write(" Contents: %d bytes (binary)\n" % len(self.contents)) stream.write(" Properties: %s\n" % self.props) stream.write(" Attributes: %s\n" % self.atts) ### FIXME: I'd like to be able to tell the difference between ### self.children is None (file) and self.children == [] (empty ### directory), but it seems that most places that construct ### SVNTreeNode objects don't even try to do that. --xbc ### ### See issue #1611 about this problem. -kfogel if self.children is not None: stream.write(" Children: %s\n" % len(self.children)) else: stream.write(" Children: None (node is probably a file)\n") stream.flush() def get_printable_path(self): """Remove some occurrences of root_node_name = "__SVN_ROOT_NODE", it is in the way when matching for a subtree, and looks bad.""" path = self.path if path.startswith(root_node_name + os.sep): path = path[len(root_node_name + os.sep):] return path def print_script(self, stream = sys.stdout, subtree = "", prepend="\n ", drop_empties = True): """Python-script-print the meta data for this node to STREAM. Print only those nodes whose path string starts with the string SUBTREE, and print only the part of the path string that remains after SUBTREE. PREPEND is a string prepended to each node printout (does the line feed if desired, don't include a comma in PREPEND). If DROP_EMPTIES is true, all dir nodes that have no data set in them (no props, no atts) and that have children (so they are included implicitly anyway) are not printed. Return 1 if this node was printed, 0 otherwise (added up by dump_tree_script())""" # figure out if this node would be obsolete to print. if drop_empties and len(self.props) < 1 and len(self.atts) < 1 and \ self.contents is None and self.children is not None: return 0 path = self.get_printable_path() # remove the subtree path, skip this node if necessary. 
if path.startswith(subtree): path = path[len(subtree):] else: return 0 if path.startswith(os.sep): path = path[1:] line = prepend line += "%-20s: Item(" % ("'%s'" % path) comma = False mime_type = self.props.get("svn:mime-type") if not mime_type or mime_type.startswith("text/"): if self.contents is not None: # Escape some characters for nicer script and readability. # (This is error output. I guess speed is no consideration here.) line += "contents=\"%s\"" % (self.contents .replace('\n','\\n') .replace('"','\\"') .replace('\r','\\r') .replace('\t','\\t')) comma = True else: line += 'content is binary data' comma = True if self.props: if comma: line += ", " line += "props={" comma = False for name in self.props: if comma: line += ", " line += "'%s':'%s'" % (name, self.props[name]) comma = True line += "}" comma = True for name in self.atts: if comma: line += ", " line += "%s='%s'" % (name, self.atts[name]) comma = True line += ")," stream.write("%s" % line) stream.flush() return 1 def __str__(self): s = StringIO() self.pprint(s) return s.getvalue() def __cmp__(self, other): """Define a simple ordering of two nodes without regard to their full path (i.e. position in the tree). This can be used for sorting the children within a directory.""" return cmp(self.name, other.name) def as_state(self, prefix=None): """Return an svntest.wc.State instance that is equivalent to this tree.""" root = self if self.path == root_node_name: assert prefix is None wc_dir = '' while True: if root is not self: # don't prepend ROOT_NODE_NAME wc_dir = os.path.join(wc_dir, root.name) if root.contents or root.props or root.atts: break if not root.children or len(root.children) != 1: break root = root.children[0] state = svntest.wc.State(wc_dir, { }) if root.contents or root.props or root.atts: state.add({'': root.as_item()}) prefix = wc_dir else: assert prefix is not None path = self.path if path.startswith(root_node_name): path = path[len(root_node_name)+1:] # prefix should only be set on a recursion, which means a child, # which means this path better not be the same as the prefix. 
assert path != prefix, 'not processing a child of the root' l = len(prefix) if l > 0: assert path[:l] == prefix, \ '"%s" is not a prefix of "%s"' % (prefix, path) # return the portion after the separator path = path[l+1:].replace(os.sep, '/') state = svntest.wc.State('', { path: self.as_item() }) if root.children: for child in root.children: state.add_state('', child.as_state(prefix)) return state def as_item(self): return svntest.wc.StateItem(self.contents, self.props, self.atts.get('status'), self.atts.get('verb'), self.atts.get('wc_rev'), self.atts.get('locked'), self.atts.get('copied'), self.atts.get('switched'), self.atts.get('writelocked'), self.atts.get('treeconflict')) def recurse(self, function): results = [] results += [ function(self) ] if self.children: for child in self.children: results += child.recurse(function) return results def find_node(self, path): if self.get_printable_path() == path: return self if self.children: for child in self.children: result = child.find_node(path) if result: return result return None # reserved name of the root of the tree root_node_name = "__SVN_ROOT_NODE" # helper func def add_elements_as_path(top_node, element_list): """Add the elements in ELEMENT_LIST as if they were a single path below TOP_NODE.""" # The idea of this function is to take a list like so: # ['A', 'B', 'C'] and a top node, say 'Z', and generate a tree # like this: # # Z -> A -> B -> C # # where 1 -> 2 means 2 is a child of 1. # prev_node = top_node for i in element_list: new_node = SVNTreeNode(i, None) prev_node.add_child(new_node) prev_node = new_node def compare_atts(a, b): """Compare two dictionaries of attributes, A (actual) and B (expected). If the attribute 'treeconflict' in B is missing or is 'None', ignore it. Return 0 if the same, 1 otherwise.""" a = a.copy() b = b.copy() # Remove any attributes to ignore. for att in ['treeconflict']: if (att not in b) or (b[att] is None): if att in a: del a[att] if att in b: del b[att] if a != b: return 1 return 0 # Helper for compare_trees def compare_file_nodes(a, b): """Compare two nodes, A (actual) and B (expected). Compare their names, contents, properties and attributes, ignoring children. Return 0 if the same, 1 otherwise.""" if a.name != b.name: return 1 if a.contents != b.contents: return 1 if a.props != b.props: return 1 return compare_atts(a.atts, b.atts) # Internal utility used by most build_tree_from_foo() routines. # # (Take the output and .add_child() it to a root node.) def create_from_path(path, contents=None, props={}, atts={}): """Create and return a linked list of treenodes, given a PATH representing a single entry into that tree. CONTENTS and PROPS are optional arguments that will be deposited in the tail node.""" # get a list of all the names in the path # each of these will be a child of the former if os.sep != "/": path = path.replace(os.sep, "/") elements = path.split("/") if len(elements) == 0: ### we should raise a less generic error here. which? raise SVNTreeError root_node = None # if this is Windows: if the path contains a drive name (X:), make it # the root node. if os.name == 'nt': m = re.match("([a-zA-Z]:)(.+)", elements[0]) if m: root_node = SVNTreeNode(m.group(1), None) elements[0] = m.group(2) add_elements_as_path(root_node, elements[0:]) if not root_node: root_node = SVNTreeNode(elements[0], None) add_elements_as_path(root_node, elements[1:]) # deposit contents in the very last node. 
node = root_node while True: if node.children is None: node.contents = contents node.props = props node.atts = atts break node = node.children[0] return root_node eol_re = re.compile(r'(\r\n|\r)') # helper for build_tree_from_wc() def get_props(paths): """Return a hash of hashes of props for PATHS, using the svn client. Convert each embedded end-of-line to a single LF character.""" # It's not kosher to look inside .svn/ and try to read the internal # property storage format. Instead, we use 'svn proplist'. After # all, this is the only way the user can retrieve them, so we're # respecting the black-box paradigm. files = {} exit_code, output, errput = svntest.main.run_svn(1, "proplist", "--verbose", "--xml", *paths) output = (line for line in output if not line.startswith('DBG:')) dom = parseString(''.join(output)) target_nodes = dom.getElementsByTagName('target') for target_node in target_nodes: filename = target_node.attributes['path'].nodeValue file_props = {} for property_node in target_node.getElementsByTagName('property'): name = property_node.attributes['name'].nodeValue if property_node.hasChildNodes(): text_node = property_node.firstChild value = text_node.nodeValue else: value = '' try: encoding = property_node.attributes['encoding'].nodeValue if encoding == 'base64': value = base64.b64decode(value) else: raise Exception("Unknown encoding '%s' for file '%s' property '%s'" % (encoding, filename, name,)) except KeyError: pass # If the property value contained a CR, or if under Windows an # "svn:*" property contains a newline, then the XML output # contains a CR character XML-encoded as ' '. The XML # parser converts it back into a CR character. So again convert # all end-of-line variants into a single LF: value = eol_re.sub('\n', value) file_props[name] = value files[filename] = file_props dom.unlink() return files ### ridiculous function. callers should do this one line themselves. def get_text(path): "Return a string with the textual contents of a file at PATH." # sanity check if not os.path.isfile(path): return None return open(path, 'r').read() def get_child(node, name): """If SVNTreeNode NODE contains a child named NAME, return child; else, return None. If SVNTreeNode is not a directory, exit completely.""" if node.children == None: print("Error: Foolish call to get_child.") sys.exit(1) for n in node.children: if name == n.name: return n return None # Helper for compare_trees def default_singleton_handler(node, description): """Print SVNTreeNode NODE's name, describing it with the string DESCRIPTION, then raise SVNTreeUnequal.""" print("Couldn't find node '%s' in %s tree" % (node.name, description)) node.pprint() raise SVNTreeUnequal # A test helper function implementing the singleton_handler_a API. def detect_conflict_files(node, extra_files): """NODE has been discovered, an extra file on disk. Verify that it matches one of the regular expressions in the EXTRA_FILES list. If it matches, remove the match from the list. If it doesn't match, raise an exception.""" for pattern in extra_files: mo = re.match(pattern, node.name) if mo: extra_files.pop(extra_files.index(pattern)) # delete pattern from list break else: msg = "Encountered unexpected disk path '" + node.name + "'" print(msg) node.pprint() raise SVNTreeUnequal(msg) ########################################################################### ########################################################################### # EXPORTED ROUTINES ARE BELOW # Main tree comparison routine! 
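# As a hypothetical usage sketch (not taken from an actual test; the
# variable names are illustrative): a test typically parses client
# output into an "actual" tree, builds an "expected" tree, and then
# lets compare_trees() raise SVNTreeUnequal on any mismatch:
#
#   actual = build_tree_from_checkout(output_lines)
#   expected = build_generic_tree(expected_nodelist)
#   try:
#     compare_trees("output", actual, expected)
#   except SVNTreeUnequal:
#     raise svntest.Failure("checkout output did not match expectations")
#
# In practice most tests go through the run_and_verify_FOO() helpers in
# actions.py (mentioned in the overview above) rather than calling
# compare_trees() directly.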
def compare_trees(label, a, b, singleton_handler_a = None, a_baton = None, singleton_handler_b = None, b_baton = None): """Compare SVNTreeNodes A (actual) and B (expected), expressing differences using FUNC_A and FUNC_B. FUNC_A and FUNC_B are functions of two arguments (a SVNTreeNode and a context baton), and may raise exception SVNTreeUnequal, in which case they use the string LABEL to describe the error (their return value is ignored). LABEL is typically "output", "disk", "status", or some other word that labels the trees being compared. If A and B are both files, then return if their contents, properties, and names are all the same; else raise a SVNTreeUnequal. If A is a file and B is a directory, raise a SVNTreeUnequal; same vice-versa. If both are directories, then for each entry that exists in both, call compare_trees on the two entries; otherwise, if the entry exists only in A, invoke FUNC_A on it, and likewise for B with FUNC_B.""" def display_nodes(a, b): 'Display two nodes, expected and actual.' print("=============================================================") print("Expected '%s' and actual '%s' in %s tree are different!" % (b.name, a.name, label)) print("=============================================================") print("EXPECTED NODE TO BE:") print("=============================================================") b.pprint() print("=============================================================") print("ACTUAL NODE FOUND:") print("=============================================================") a.pprint() # Setup singleton handlers if singleton_handler_a is None: singleton_handler_a = default_singleton_handler a_baton = "expected " + label if singleton_handler_b is None: singleton_handler_b = default_singleton_handler b_baton = "actual " + label try: # A and B are both files. if (a.children is None) and (b.children is None): if compare_file_nodes(a, b): display_nodes(a, b) raise SVNTreeUnequal # One is a file, one is a directory. elif (((a.children is None) and (b.children is not None)) or ((a.children is not None) and (b.children is None))): display_nodes(a, b) raise SVNTypeMismatch # They're both directories. else: # First, compare the directories' two hashes. if (a.props != b.props) or compare_atts(a.atts, b.atts): display_nodes(a, b) raise SVNTreeUnequal accounted_for = [] # For each child of A, check and see if it's in B. If so, run # compare_trees on the two children and add b's child to # accounted_for. If not, run FUNC_A on the child. Next, for each # child of B, check and see if it's in accounted_for. If it is, # do nothing. If not, run FUNC_B on it. for a_child in a.children: b_child = get_child(b, a_child.name) if b_child: accounted_for.append(b_child) compare_trees(label, a_child, b_child, singleton_handler_a, a_baton, singleton_handler_b, b_baton) else: singleton_handler_a(a_child, a_baton) for b_child in b.children: if b_child not in accounted_for: singleton_handler_b(b_child, b_baton) except SVNTypeMismatch: print('Unequal Types: one Node is a file, the other is a directory') raise SVNTreeUnequal except IndexError: print("Error: unequal number of children") raise SVNTreeUnequal except SVNTreeUnequal: if a.name != root_node_name: print("Unequal at node %s" % a.name) raise # Visually show a tree's structure def dump_tree(n,indent=""): """Print out a nice representation of the structure of the tree in the SVNTreeNode N. 
Prefix each line with the string INDENT.""" # Code partially stolen from Dave Beazley tmp_children = sorted(n.children or []) if n.name == root_node_name: print("%s%s" % (indent, "ROOT")) else: print("%s%s" % (indent, n.name)) indent = indent.replace("-", " ") indent = indent.replace("+", " ") for i in range(len(tmp_children)): c = tmp_children[i] if i == len(tmp_children)-1: dump_tree(c,indent + " +-- ") else: dump_tree(c,indent + " |-- ") def dump_tree_script__crawler(n, subtree="", stream=sys.stdout): "Helper for dump_tree_script. See that comment." count = 0 # skip printing the root node. if n.name != root_node_name: count += n.print_script(stream, subtree) for child in n.children or []: count += dump_tree_script__crawler(child, subtree, stream) return count def dump_tree_script(n, subtree="", stream=sys.stdout, wc_varname='wc_dir'): """Print out a python script representation of the structure of the tree in the SVNTreeNode N. Print only those nodes whose path string starts with the string SUBTREE, and print only the part of the path string that remains after SUBTREE. The result is printed to STREAM. The WC_VARNAME is inserted in the svntest.wc.State(wc_dir,{}) call that is printed out (this is used by factory.py).""" stream.write("svntest.wc.State(" + wc_varname + ", {") count = dump_tree_script__crawler(n, subtree, stream) if count > 0: stream.write('\n') stream.write("})") ################################################################### ################################################################### # PARSERS that return trees made of SVNTreeNodes.... ################################################################### # Build an "expected" static tree from a list of lists # Create a list of lists, of the form: # # [ [path, contents, props, atts], ... ] # # and run it through this parser. PATH is a string, a path to the # object. CONTENTS is either a string or None, and PROPS and ATTS are # populated dictionaries or {}. Each CONTENTS/PROPS/ATTS will be # attached to the basename-node of the associated PATH. def build_generic_tree(nodelist): "Given a list of lists of a specific format, return a tree." root = SVNTreeNode(root_node_name) for list in nodelist: new_branch = create_from_path(list[0], list[1], list[2], list[3]) root.add_child(new_branch) return root #################################################################### # Build trees from different kinds of subcommand output. # Parse co/up output into a tree. # # Tree nodes will contain no contents, a 'status' att, and a # 'treeconflict' att. def build_tree_from_checkout(lines, include_skipped=True): "Return a tree derived by parsing the output LINES from 'co' or 'up'." return svntest.wc.State.from_checkout(lines, include_skipped).old_tree() # Parse ci/im output into a tree. # # Tree nodes will contain no contents, and only one 'verb' att. def build_tree_from_commit(lines): "Return a tree derived by parsing the output LINES from 'ci' or 'im'." return svntest.wc.State.from_commit(lines).old_tree() # Parse status output into a tree. # # Tree nodes will contain no contents, and these atts: # # 'status', 'wc_rev', # ... and possibly 'locked', 'copied', 'switched', # 'writelocked' and 'treeconflict', # IFF columns non-empty. # def build_tree_from_status(lines): "Return a tree derived by parsing the output LINES from 'st -vuq'." 
return svntest.wc.State.from_status(lines).old_tree() # Parse merge "skipped" output def build_tree_from_skipped(lines): return svntest.wc.State.from_skipped(lines).old_tree() def build_tree_from_diff_summarize(lines): "Build a tree from output of diff --summarize" return svntest.wc.State.from_summarize(lines).old_tree() #################################################################### # Build trees by looking at the working copy # The reason the 'load_props' flag is off by default is because it # creates a drastic slowdown -- we spawn a new 'svn proplist' # process for every file and dir in the working copy! def build_tree_from_wc(wc_path, load_props=0, ignore_svn=1): """Takes WC_PATH as the path to a working copy. Walks the tree below that path, and creates the tree based on the actual found files. If IGNORE_SVN is true, then exclude SVN admin dirs from the tree. If LOAD_PROPS is true, the props will be added to the tree.""" return svntest.wc.State.from_wc(wc_path, load_props, ignore_svn).old_tree() cvs2svn-2.4.0/svntest/objects.py0000664000076500007650000003104511434364627017731 0ustar mhaggermhagger00000000000000#!/usr/bin/env python # # objects.py: Objects that keep track of state during a test # # Subversion is a tool for revision control. # See http://subversion.tigris.org for more information. # # ==================================================================== # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. ###################################################################### # General modules import shutil, sys, re, os, subprocess # Our testing module import svntest from svntest import actions, main, tree, verify, wc ###################################################################### # Helpers def local_path(path): """Convert a path from internal style ('/' separators) to the local style.""" if path == '': path = '.' return os.sep.join(path.split('/')) def db_dump(db_dump_name, repo_path, table): """Yield a human-readable representation of the rows of the BDB table TABLE in the repo at REPO_PATH. Yield one line of text at a time. Calls the external program "db_dump" which is supplied with BDB.""" table_path = repo_path + "/db/" + table process = subprocess.Popen([db_dump_name, "-p", table_path], stdout=subprocess.PIPE, universal_newlines=True) retcode = process.wait() assert retcode == 0 # Strip out the header and footer; copy the rest into FILE. copying = False for line in process.stdout.readlines(): if line == "HEADER=END\n": copying = True; elif line == "DATA=END\n": break elif copying: yield line def pretty_print_skel(line): """Return LINE modified so as to look prettier for human reading, but no longer unambiguous or machine-parsable. LINE is assumed to be in the "Skel" format in which some values are preceded by a decimal length. 
This function removes the length indicator, and also replaces a zero-length value with a pair of single quotes.""" new_line = '' while line: # an explicit-length atom explicit_atom = re.match(r'\d+ ', line) if explicit_atom: n = int(explicit_atom.group()) line = line[explicit_atom.end():] new_line += line[:n] line = line[n:] if n == 0: new_line += "''" continue # an implicit-length atom implicit_atom = re.match(r'[A-Za-z][^\s()]*', line) if implicit_atom: n = implicit_atom.end() new_line += line[:n] line = line[n:] continue # parentheses, white space or any other non-atom new_line += line[:1] line = line[1:] return new_line def dump_bdb(db_dump_name, repo_path, dump_dir): """Dump all the known BDB tables in the repository at REPO_PATH into a single text file in DUMP_DIR. Omit any "next-key" records.""" dump_file = dump_dir + "/all.bdb" file = open(dump_file, 'w') for table in ['revisions', 'transactions', 'changes', 'copies', 'nodes', 'node-origins', 'representations', 'checksum-reps', 'strings', 'locks', 'lock-tokens', 'miscellaneous', 'uuids']: file.write(table + ":\n") next_key_line = False for line in db_dump(db_dump_name, repo_path, table): # Omit any 'next-key' line and the following line. if next_key_line: next_key_line = False continue if line == ' next-key\n': next_key_line = True continue # The line isn't necessarily a skel, but pretty_print_skel() shouldn't # do too much harm if it isn't. file.write(pretty_print_skel(line)) file.write("\n") file.close() def locate_db_dump(): """Locate a db_dump executable""" # Assume that using the newest version is OK. for db_dump_name in ['db4.8_dump', 'db4.7_dump', 'db4.6_dump', 'db4.5_dump', 'db4.4_dump', 'db_dump']: try: if subprocess.Popen([db_dump_name, "-V"]).wait() == 0: return db_dump_name except OSError, e: pass return 'none' ###################################################################### # Class SvnRepository class SvnRepository: """An object of class SvnRepository represents a Subversion repository, providing both a client-side view and a server-side view.""" def __init__(self, repo_url, repo_dir): self.repo_url = repo_url self.repo_absdir = os.path.abspath(repo_dir) self.db_dump_name = locate_db_dump() # This repository object's idea of its own head revision. self.head_rev = 0 def dump(self, output_dir): """Dump the repository into the directory OUTPUT_DIR""" ldir = local_path(output_dir) os.mkdir(ldir) """Run a BDB dump on the repository""" if self.db_dump_name != 'none': dump_bdb(self.db_dump_name, self.repo_absdir, ldir) """Run 'svnadmin dump' on the repository.""" exit_code, stdout, stderr = \ actions.run_and_verify_svnadmin(None, None, None, 'dump', self.repo_absdir) ldumpfile = local_path(output_dir + "/svnadmin.dump") main.file_write(ldumpfile, ''.join(stderr)) main.file_append(ldumpfile, ''.join(stdout)) def obliterate_node_rev(self, path, rev, exp_out=None, exp_err=[], exp_exit=0): """Obliterate the single node-rev PATH in revision REV. Check the expected stdout, stderr and exit code (EXP_OUT, EXP_ERR, EXP_EXIT).""" arg = self.repo_url + '/' + path + '@' + str(rev) actions.run_and_verify_svn2(None, exp_out, exp_err, exp_exit, 'obliterate', arg) def svn_mkdirs(self, *dirs): """Run 'svn mkdir' on the repository. 
DIRS is a list of directories to make, and each directory is a path relative to the repository root, neither starting nor ending with a slash.""" urls = [self.repo_url + '/' + dir for dir in dirs] actions.run_and_verify_svn(None, None, [], 'mkdir', '-m', 'svn_mkdirs()', '--parents', *urls) self.head_rev += 1 ###################################################################### # Class SvnWC class SvnWC: """An object of class SvnWC represents a WC, and provides methods for operating on it. It keeps track of the state of the WC and of the repository, so that the expected results of common operations are automatically known. Path arguments to class methods paths are relative to the WC dir and in Subversion canonical form ('/' separators). """ def __init__(self, wc_dir, repo): """Initialize the object to use the existing WC at path WC_DIR and the existing repository object REPO.""" self.wc_absdir = os.path.abspath(wc_dir) # 'state' is, at all times, the 'wc.State' representation of the state # of the WC, with paths relative to 'wc_absdir'. #self.state = wc.State('', {}) initial_wc_tree = tree.build_tree_from_wc(self.wc_absdir, load_props=True) self.state = initial_wc_tree.as_state() self.state.add({ '': wc.StateItem() }) self.repo = repo def __str__(self): return "SvnWC(head_rev=" + str(self.repo.head_rev) + ", state={" + \ str(self.state.desc) + \ "})" def svn_mkdir(self, rpath): lpath = local_path(rpath) actions.run_and_verify_svn(None, None, [], 'mkdir', lpath) self.state.add({ rpath : wc.StateItem(status='A ') }) # def propset(self, pname, pvalue, *rpaths): # "Set property 'pname' to value 'pvalue' on each path in 'rpaths'" # local_paths = tuple([local_path(rpath) for rpath in rpaths]) # actions.run_and_verify_svn(None, None, [], 'propset', pname, pvalue, # *local_paths) def svn_set_props(self, rpath, props): """Change the properties of PATH to be the dictionary {name -> value} PROPS. """ lpath = local_path(rpath) #for prop in path's existing props: # actions.run_and_verify_svn(None, None, [], 'propdel', # prop, lpath) for prop in props: actions.run_and_verify_svn(None, None, [], 'propset', prop, props[prop], lpath) self.state.tweak(rpath, props=props) def svn_file_create_add(self, rpath, content=None, props=None): "Make and add a file with some default content, and keyword expansion." lpath = local_path(rpath) ldirname, filename = os.path.split(lpath) if content is None: # Default content content = "This is the file '" + filename + "'.\n" + \ "Last changed in '$Revision$'.\n" main.file_write(lpath, content) actions.run_and_verify_svn(None, None, [], 'add', lpath) self.state.add({ rpath : wc.StateItem(status='A ') }) if props is None: # Default props props = { 'svn:keywords': 'Revision' } self.svn_set_props(rpath, props) def file_modify(self, rpath, content=None, props=None): "Make text and property mods to a WC file." lpath = local_path(rpath) if content is not None: #main.file_append(lpath, "An extra line.\n") #actions.run_and_verify_svn(None, None, [], 'propset', # 'newprop', 'v', lpath) main.file_write(lpath, content) self.state.tweak(rpath, content=content) if props is not None: self.set_props(rpath, props) self.state.tweak(rpath, props=props) def svn_move(self, rpath1, rpath2, parents=False): """Move/rename the existing WC item RPATH1 to become RPATH2. RPATH2 must not already exist. 
If PARENTS is true, any missing parents of RPATH2 will be created.""" lpath1 = local_path(rpath1) lpath2 = local_path(rpath2) args = [lpath1, lpath2] if parents: args += ['--parents'] actions.run_and_verify_svn(None, None, [], 'copy', *args) self.state.add({ rpath2: self.state.desc[rpath1] }) self.state.remove(rpath1) def svn_copy(self, rpath1, rpath2, parents=False, rev=None): """Copy the existing WC item RPATH1 to become RPATH2. RPATH2 must not already exist. If PARENTS is true, any missing parents of RPATH2 will be created. If REV is not None, copy revision REV of the node identified by WC item RPATH1.""" lpath1 = local_path(rpath1) lpath2 = local_path(rpath2) args = [lpath1, lpath2] if rev is not None: args += ['-r', rev] if parents: args += ['--parents'] actions.run_and_verify_svn(None, None, [], 'copy', *args) self.state.add({ rpath2: self.state.desc[rpath1] }) def svn_delete(self, rpath, even_if_modified=False): "Delete a WC path locally." lpath = local_path(rpath) args = [] if even_if_modified: args += ['--force'] actions.run_and_verify_svn(None, None, [], 'delete', lpath, *args) def svn_commit(self, rpath='', log=''): "Commit a WC path (recursively). Return the new revision number." lpath = local_path(rpath) actions.run_and_verify_svn(None, verify.AnyOutput, [], 'commit', '-m', log, lpath) actions.run_and_verify_update(lpath, None, None, None) self.repo.head_rev += 1 return self.repo.head_rev def svn_update(self, rpath='', rev='HEAD'): "Update the WC to the specified revision" lpath = local_path(rpath) actions.run_and_verify_update(lpath, None, None, None) # def svn_merge(self, rev_spec, source, target, exp_out=None): # """Merge a single change from path 'source' to path 'target'. # SRC_CHANGE_NUM is either a number (to cherry-pick that specific change) # or a command-line option revision range string such as '-r10:20'.""" # lsource = local_path(source) # ltarget = local_path(target) # if isinstance(rev_spec, int): # rev_spec = '-c' + str(rev_spec) # if exp_out is None: # target_re = re.escape(target) # exp_1 = "--- Merging r.* into '" + target_re + ".*':" # exp_2 = "(A |D |[UG] | [UG]|[UG][UG]) " + target_re + ".*" # exp_out = verify.RegexOutput(exp_1 + "|" + exp_2) # actions.run_and_verify_svn(None, exp_out, [], # 'merge', rev_spec, lsource, ltarget) cvs2svn-2.4.0/svntest/__init__.py0000664000076500007650000000343711434364627020043 0ustar mhaggermhagger00000000000000# # # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. # # any bozos that do "from svntest import *" should die. export nothing # to the dumbasses. __all__ = [ ] import sys if sys.hexversion < 0x2040000: sys.stderr.write('[SKIPPED] at least Python 2.4 is required\n') # note: exiting is a bit harsh for a library module, but we really do # require Python 2.4. this package isn't going to work otherwise. 
# we're skipping this test, not failing, so exit with 0 sys.exit(0) try: import sqlite3 except ImportError: try: from pysqlite2 import dbapi2 as sqlite3 except ImportError: sys.stderr.write('[SKIPPED] Python sqlite3 module required\n') sys.exit(0) # don't export this name del sys class Failure(Exception): 'Base class for exceptions that indicate test failure' pass class Skip(Exception): 'Base class for exceptions that indicate test was skipped' pass # import in a specific order: things with the fewest circular imports first. import testcase import wc import verify import tree import sandbox import main import actions import factory cvs2svn-2.4.0/svntest/actions.py0000664000076500007650000031112111434364627017734 0ustar mhaggermhagger00000000000000# # actions.py: routines that actually run the svn client. # # Subversion is a tool for revision control. # See http://subversion.tigris.org for more information. # # ==================================================================== # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. ###################################################################### import os, shutil, re, sys, errno import difflib, pprint import xml.parsers.expat from xml.dom.minidom import parseString import svntest from svntest import main, verify, tree, wc from svntest import Failure def no_sleep_for_timestamps(): os.environ['SVN_I_LOVE_CORRUPTED_WORKING_COPIES_SO_DISABLE_SLEEP_FOR_TIMESTAMPS'] = 'yes' def do_sleep_for_timestamps(): os.environ['SVN_I_LOVE_CORRUPTED_WORKING_COPIES_SO_DISABLE_SLEEP_FOR_TIMESTAMPS'] = 'no' def setup_pristine_repository(): """Create the pristine repository and 'svn import' the greek tree""" # these directories don't exist out of the box, so we may have to create them if not os.path.exists(main.general_wc_dir): os.makedirs(main.general_wc_dir) if not os.path.exists(main.general_repo_dir): os.makedirs(main.general_repo_dir) # this also creates all the intermediate dirs # If there's no pristine repos, create one. if not os.path.exists(main.pristine_dir): main.create_repos(main.pristine_dir) # if this is dav, gives us access rights to import the greek tree. if main.is_ra_type_dav(): authz_file = os.path.join(main.work_dir, "authz") main.file_write(authz_file, "[/]\n* = rw\n") # dump the greek tree to disk. main.greek_state.write_to_disk(main.greek_dump_dir) # import the greek tree, using l:foo/p:bar ### todo: svn should not be prompting for auth info when using ### repositories with no auth/auth requirements exit_code, output, errput = main.run_svn(None, 'import', '-m', 'Log message for revision 1.', main.greek_dump_dir, main.pristine_url) # check for any errors from the import if len(errput): display_lines("Errors during initial 'svn import':", 'STDERR', None, errput) sys.exit(1) # verify the printed output of 'svn import'. 
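  # On success the last line reads something like "Committed revision 1."
  # (or "Imported revision 1."), which is what the regex below checks for.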
lastline = output.pop().strip() match = re.search("(Committed|Imported) revision [0-9]+.", lastline) if not match: print("ERROR: import did not succeed, while creating greek repos.") print("The final line from 'svn import' was:") print(lastline) sys.exit(1) output_tree = wc.State.from_commit(output) expected_output_tree = main.greek_state.copy(main.greek_dump_dir) expected_output_tree.tweak(verb='Adding', contents=None) try: expected_output_tree.compare_and_display('output', output_tree) except tree.SVNTreeUnequal: verify.display_trees("ERROR: output of import command is unexpected.", "OUTPUT TREE", expected_output_tree.old_tree(), output_tree.old_tree()) sys.exit(1) # Finally, disallow any changes to the "pristine" repos. error_msg = "Don't modify the pristine repository" create_failing_hook(main.pristine_dir, 'start-commit', error_msg) create_failing_hook(main.pristine_dir, 'pre-lock', error_msg) create_failing_hook(main.pristine_dir, 'pre-revprop-change', error_msg) ###################################################################### def guarantee_empty_repository(path): """Guarantee that a local svn repository exists at PATH, containing nothing.""" if path == main.pristine_dir: print("ERROR: attempt to overwrite the pristine repos! Aborting.") sys.exit(1) # create an empty repository at PATH. main.safe_rmtree(path) main.create_repos(path) # Used by every test, so that they can run independently of one # another. Every time this routine is called, it recursively copies # the `pristine repos' to a new location. # Note: make sure setup_pristine_repository was called once before # using this function. def guarantee_greek_repository(path): """Guarantee that a local svn repository exists at PATH, containing nothing but the greek-tree at revision 1.""" if path == main.pristine_dir: print("ERROR: attempt to overwrite the pristine repos! Aborting.") sys.exit(1) # copy the pristine repository to PATH. main.safe_rmtree(path) if main.copy_repos(main.pristine_dir, path, 1): print("ERROR: copying repository failed.") sys.exit(1) # make the repos world-writeable, for mod_dav_svn's sake. 
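  # (0666 is octal read/write permission for owner, group, and others, so the
  # server process, which may run as a different user, can write to it.)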
main.chmod_tree(path, 0666, 0666) def run_and_verify_svnlook(message, expected_stdout, expected_stderr, *varargs): """Like run_and_verify_svnlook2, but the expected exit code is assumed to be 0 if no output is expected on stderr, and 1 otherwise.""" expected_exit = 0 if expected_stderr is not None and expected_stderr != []: expected_exit = 1 return run_and_verify_svnlook2(message, expected_stdout, expected_stderr, expected_exit, *varargs) def run_and_verify_svnlook2(message, expected_stdout, expected_stderr, expected_exit, *varargs): """Run svnlook command and check its output and exit code.""" exit_code, out, err = main.run_svnlook(*varargs) verify.verify_outputs("Unexpected output", out, err, expected_stdout, expected_stderr) verify.verify_exit_code(message, exit_code, expected_exit) return exit_code, out, err def run_and_verify_svnadmin(message, expected_stdout, expected_stderr, *varargs): """Like run_and_verify_svnadmin2, but the expected exit code is assumed to be 0 if no output is expected on stderr, and 1 otherwise.""" expected_exit = 0 if expected_stderr is not None and expected_stderr != []: expected_exit = 1 return run_and_verify_svnadmin2(message, expected_stdout, expected_stderr, expected_exit, *varargs) def run_and_verify_svnadmin2(message, expected_stdout, expected_stderr, expected_exit, *varargs): """Run svnadmin command and check its output and exit code.""" exit_code, out, err = main.run_svnadmin(*varargs) verify.verify_outputs("Unexpected output", out, err, expected_stdout, expected_stderr) verify.verify_exit_code(message, exit_code, expected_exit) return exit_code, out, err def run_and_verify_svnversion(message, wc_dir, repo_url, expected_stdout, expected_stderr): """like run_and_verify_svnversion2, but the expected exit code is assumed to be 0 if no output is expected on stderr, and 1 otherwise.""" expected_exit = 0 if expected_stderr is not None and expected_stderr != []: expected_exit = 1 return run_and_verify_svnversion2(message, wc_dir, repo_url, expected_stdout, expected_stderr, expected_exit) def run_and_verify_svnversion2(message, wc_dir, repo_url, expected_stdout, expected_stderr, expected_exit): """Run svnversion command and check its output and exit code.""" exit_code, out, err = main.run_svnversion(wc_dir, repo_url) verify.verify_outputs("Unexpected output", out, err, expected_stdout, expected_stderr) verify.verify_exit_code(message, exit_code, expected_exit) return exit_code, out, err def run_and_verify_svn(message, expected_stdout, expected_stderr, *varargs): """like run_and_verify_svn2, but the expected exit code is assumed to be 0 if no output is expected on stderr, and 1 otherwise.""" expected_exit = 0 if expected_stderr is not None: if isinstance(expected_stderr, verify.ExpectedOutput): if not expected_stderr.matches([]): expected_exit = 1 elif expected_stderr != []: expected_exit = 1 return run_and_verify_svn2(message, expected_stdout, expected_stderr, expected_exit, *varargs) def run_and_verify_svn2(message, expected_stdout, expected_stderr, expected_exit, *varargs): """Invoke main.run_svn() with *VARARGS. Return exit code as int; stdout, stderr as lists of lines (including line terminators). For both EXPECTED_STDOUT and EXPECTED_STDERR, create an appropriate instance of verify.ExpectedOutput (if necessary): - If it is an array of strings, create a vanilla ExpectedOutput. - If it is a single string, create a RegexOutput that must match every line (for stdout) or any line (for stderr) of the expected output. 
- If it is already an instance of ExpectedOutput (e.g. UnorderedOutput), leave it alone. ...and invoke compare_and_display_lines() on MESSAGE, a label based on the name of the stream being compared (e.g. STDOUT), the ExpectedOutput instance, and the actual output. If EXPECTED_STDOUT is None, do not check stdout. EXPECTED_STDERR may not be None. If output checks pass, the expected and actual codes are compared. If a comparison fails, a Failure will be raised.""" if expected_stderr is None: raise verify.SVNIncorrectDatatype("expected_stderr must not be None") want_err = None if isinstance(expected_stderr, verify.ExpectedOutput): if not expected_stderr.matches([]): want_err = True elif expected_stderr != []: want_err = True exit_code, out, err = main.run_svn(want_err, *varargs) verify.verify_outputs(message, out, err, expected_stdout, expected_stderr) verify.verify_exit_code(message, exit_code, expected_exit) return exit_code, out, err def run_and_verify_load(repo_dir, dump_file_content): "Runs 'svnadmin load' and reports any errors." if not isinstance(dump_file_content, list): raise TypeError("dump_file_content argument should have list type") expected_stderr = [] exit_code, output, errput = main.run_command_stdin( main.svnadmin_binary, expected_stderr, 0, 1, dump_file_content, 'load', '--force-uuid', '--quiet', repo_dir) verify.verify_outputs("Unexpected stderr output", None, errput, None, expected_stderr) def run_and_verify_dump(repo_dir): "Runs 'svnadmin dump' and reports any errors, returning the dump content." exit_code, output, errput = main.run_svnadmin('dump', repo_dir) verify.verify_outputs("Missing expected output(s)", output, errput, verify.AnyOutput, verify.AnyOutput) return output def load_repo(sbox, dumpfile_path = None, dump_str = None): "Loads the dumpfile into sbox" if not dump_str: dump_str = open(dumpfile_path, "rb").read() # Create a virgin repos and working copy main.safe_rmtree(sbox.repo_dir, 1) main.safe_rmtree(sbox.wc_dir, 1) main.create_repos(sbox.repo_dir) # Load the mergetracking dumpfile into the repos, and check it out the repo run_and_verify_load(sbox.repo_dir, dump_str.splitlines(True)) run_and_verify_svn(None, None, [], "co", sbox.repo_url, sbox.wc_dir) return dump_str ###################################################################### # Subversion Actions # # These are all routines that invoke 'svn' in particular ways, and # then verify the results by comparing expected trees with actual # trees. # def run_and_verify_checkout(URL, wc_dir_name, output_tree, disk_tree, singleton_handler_a = None, a_baton = None, singleton_handler_b = None, b_baton = None, *args): """Checkout the URL into a new directory WC_DIR_NAME. *ARGS are any extra optional args to the checkout subcommand. The subcommand output will be verified against OUTPUT_TREE, and the working copy itself will be verified against DISK_TREE. For the latter comparison, SINGLETON_HANDLER_A and SINGLETON_HANDLER_B will be passed to tree.compare_trees -- see that function's doc string for more details. Return if successful, raise on failure. WC_DIR_NAME is deleted if present unless the '--force' option is passed in *ARGS.""" if isinstance(output_tree, wc.State): output_tree = output_tree.old_tree() if isinstance(disk_tree, wc.State): disk_tree = disk_tree.old_tree() # Remove dir if it's already there, unless this is a forced checkout. # In that case assume we want to test a forced checkout's toleration # of obstructing paths. 
if '--force' not in args: main.safe_rmtree(wc_dir_name) # Checkout and make a tree of the output, using l:foo/p:bar ### todo: svn should not be prompting for auth info when using ### repositories with no auth/auth requirements exit_code, output, errput = main.run_svn(None, 'co', URL, wc_dir_name, *args) actual = tree.build_tree_from_checkout(output) # Verify actual output against expected output. try: tree.compare_trees("output", actual, output_tree) except tree.SVNTreeUnequal: print("ACTUAL OUTPUT TREE:") tree.dump_tree_script(actual, wc_dir_name + os.sep) raise # Create a tree by scanning the working copy actual = tree.build_tree_from_wc(wc_dir_name) # Verify expected disk against actual disk. try: tree.compare_trees("disk", actual, disk_tree, singleton_handler_a, a_baton, singleton_handler_b, b_baton) except tree.SVNTreeUnequal: print("ACTUAL DISK TREE:") tree.dump_tree_script(actual, wc_dir_name + os.sep) raise def run_and_verify_export(URL, export_dir_name, output_tree, disk_tree, *args): """Export the URL into a new directory WC_DIR_NAME. The subcommand output will be verified against OUTPUT_TREE, and the exported copy itself will be verified against DISK_TREE. Return if successful, raise on failure. """ assert isinstance(output_tree, wc.State) assert isinstance(disk_tree, wc.State) disk_tree = disk_tree.old_tree() output_tree = output_tree.old_tree() # Export and make a tree of the output, using l:foo/p:bar ### todo: svn should not be prompting for auth info when using ### repositories with no auth/auth requirements exit_code, output, errput = main.run_svn(None, 'export', URL, export_dir_name, *args) actual = tree.build_tree_from_checkout(output) # Verify actual output against expected output. try: tree.compare_trees("output", actual, output_tree) except tree.SVNTreeUnequal: print("ACTUAL OUTPUT TREE:") tree.dump_tree_script(actual, export_dir_name + os.sep) raise # Create a tree by scanning the working copy. Don't ignore # the .svn directories so that we generate an error if they # happen to show up. actual = tree.build_tree_from_wc(export_dir_name, ignore_svn=False) # Verify expected disk against actual disk. try: tree.compare_trees("disk", actual, disk_tree) except tree.SVNTreeUnequal: print("ACTUAL DISK TREE:") tree.dump_tree_script(actual, export_dir_name + os.sep) raise # run_and_verify_log_xml class LogEntry: def __init__(self, revision, changed_paths=None, revprops=None): self.revision = revision if changed_paths == None: self.changed_paths = {} else: self.changed_paths = changed_paths if revprops == None: self.revprops = {} else: self.revprops = revprops def assert_changed_paths(self, changed_paths): """Not implemented, so just raises svntest.Failure. """ raise Failure('NOT IMPLEMENTED') def assert_revprops(self, revprops): """Assert that the dict revprops is the same as this entry's revprops. Raises svntest.Failure if not. """ if self.revprops != revprops: raise Failure('\n' + '\n'.join(difflib.ndiff( pprint.pformat(revprops).splitlines(), pprint.pformat(self.revprops).splitlines()))) class LogParser: def parse(self, data): """Return a list of LogEntrys parsed from the sequence of strings data. This is the only method of interest to callers. 
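    A minimal usage sketch (this mirrors how run_and_verify_log_xml below
    uses this class; the author and log message shown are illustrative only):

      entries = LogParser().parse(stdout_lines)
      entries[0].assert_revprops({'svn:author': 'jrandom',
                                  'svn:date': '',
                                  'svn:log': 'Log message for revision 1.'})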
""" try: for i in data: self.parser.Parse(i) self.parser.Parse('', True) except xml.parsers.expat.ExpatError, e: raise verify.SVNUnexpectedStdout('%s\n%s\n' % (e, ''.join(data),)) return self.entries def __init__(self): # for expat self.parser = xml.parsers.expat.ParserCreate() self.parser.StartElementHandler = self.handle_start_element self.parser.EndElementHandler = self.handle_end_element self.parser.CharacterDataHandler = self.handle_character_data # Ignore some things. self.ignore_elements('log', 'paths', 'path', 'revprops') self.ignore_tags('logentry_end', 'author_start', 'date_start', 'msg_start') # internal state self.cdata = [] self.property = None # the result self.entries = [] def ignore(self, *args, **kwargs): del self.cdata[:] def ignore_tags(self, *args): for tag in args: setattr(self, tag, self.ignore) def ignore_elements(self, *args): for element in args: self.ignore_tags(element + '_start', element + '_end') # expat handlers def handle_start_element(self, name, attrs): getattr(self, name + '_start')(attrs) def handle_end_element(self, name): getattr(self, name + '_end')() def handle_character_data(self, data): self.cdata.append(data) # element handler utilities def use_cdata(self): result = ''.join(self.cdata).strip() del self.cdata[:] return result def svn_prop(self, name): self.entries[-1].revprops['svn:' + name] = self.use_cdata() # element handlers def logentry_start(self, attrs): self.entries.append(LogEntry(int(attrs['revision']))) def author_end(self): self.svn_prop('author') def msg_end(self): self.svn_prop('log') def date_end(self): # svn:date could be anything, so just note its presence. self.cdata[:] = [''] self.svn_prop('date') def property_start(self, attrs): self.property = attrs['name'] def property_end(self): self.entries[-1].revprops[self.property] = self.use_cdata() def run_and_verify_log_xml(message=None, expected_paths=None, expected_revprops=None, expected_stdout=None, expected_stderr=None, args=[]): """Call run_and_verify_svn with log --xml and args (optional) as command arguments, and pass along message, expected_stdout, and expected_stderr. If message is None, pass the svn log command as message. expected_paths checking is not yet implemented. expected_revprops is an optional list of dicts, compared to each revision's revprops. The list must be in the same order the log entries come in. Any svn:date revprops in the dicts must be '' in order to match, as the actual dates could be anything. expected_paths and expected_revprops are ignored if expected_stdout or expected_stderr is specified. """ if message == None: message = ' '.join(args) # We'll parse the output unless the caller specifies expected_stderr or # expected_stdout for run_and_verify_svn. 
parse = True if expected_stderr == None: expected_stderr = [] else: parse = False if expected_stdout != None: parse = False log_args = list(args) if expected_paths != None: log_args.append('-v') (exit_code, stdout, stderr) = run_and_verify_svn( message, expected_stdout, expected_stderr, 'log', '--xml', *log_args) if not parse: return entries = LogParser().parse(stdout) for index in range(len(entries)): entry = entries[index] if expected_revprops != None: entry.assert_revprops(expected_revprops[index]) if expected_paths != None: entry.assert_changed_paths(expected_paths[index]) def verify_update(actual_output, actual_mergeinfo_output, actual_elision_output, wc_dir_name, output_tree, mergeinfo_output_tree, elision_output_tree, disk_tree, status_tree, singleton_handler_a=None, a_baton=None, singleton_handler_b=None, b_baton=None, check_props=False): """Verify update of WC_DIR_NAME. The subcommand output (found in ACTUAL_OUTPUT, ACTUAL_MERGEINFO_OUTPUT, and ACTUAL_ELISION_OUTPUT) will be verified against OUTPUT_TREE, MERGEINFO_OUTPUT_TREE, and ELISION_OUTPUT_TREE respectively (if any of these is provided, they may be None in which case a comparison is not done). The working copy itself will be verified against DISK_TREE (if provided), and the working copy's 'svn status' output will be verified against STATUS_TREE (if provided). (This is a good way to check that revision numbers were bumped.) Return if successful, raise on failure. For the comparison with DISK_TREE, pass SINGLETON_HANDLER_A and SINGLETON_HANDLER_B to tree.compare_trees -- see that function's doc string for more details. If CHECK_PROPS is set, then disk comparison will examine props.""" if isinstance(actual_output, wc.State): actual_output = actual_output.old_tree() if isinstance(actual_mergeinfo_output, wc.State): actual_mergeinfo_output = actual_mergeinfo_output.old_tree() if isinstance(actual_elision_output, wc.State): actual_elision_output = actual_elision_output.old_tree() if isinstance(output_tree, wc.State): output_tree = output_tree.old_tree() if isinstance(mergeinfo_output_tree, wc.State): mergeinfo_output_tree = mergeinfo_output_tree.old_tree() if isinstance(elision_output_tree, wc.State): elision_output_tree = elision_output_tree.old_tree() if isinstance(disk_tree, wc.State): disk_tree = disk_tree.old_tree() if isinstance(status_tree, wc.State): status_tree = status_tree.old_tree() # Verify actual output against expected output. if output_tree: try: tree.compare_trees("output", actual_output, output_tree) except tree.SVNTreeUnequal: print("ACTUAL OUTPUT TREE:") tree.dump_tree_script(actual_output, wc_dir_name + os.sep) raise # Verify actual mergeinfo recording output against expected output. if mergeinfo_output_tree: try: tree.compare_trees("mergeinfo_output", actual_mergeinfo_output, mergeinfo_output_tree) except tree.SVNTreeUnequal: print("ACTUAL MERGEINFO OUTPUT TREE:") tree.dump_tree_script(actual_mergeinfo_output, wc_dir_name + os.sep) raise # Verify actual mergeinfo elision output against expected output. 
if elision_output_tree: try: tree.compare_trees("elision_output", actual_elision_output, elision_output_tree) except tree.SVNTreeUnequal: print("ACTUAL ELISION OUTPUT TREE:") tree.dump_tree_script(actual_elision_output, wc_dir_name + os.sep) raise # Create a tree by scanning the working copy, and verify it if disk_tree: actual_disk = tree.build_tree_from_wc(wc_dir_name, check_props) try: tree.compare_trees("disk", actual_disk, disk_tree, singleton_handler_a, a_baton, singleton_handler_b, b_baton) except tree.SVNTreeUnequal: print("ACTUAL DISK TREE:") tree.dump_tree_script(actual_disk) raise # Verify via 'status' command too, if possible. if status_tree: run_and_verify_status(wc_dir_name, status_tree) def verify_disk(wc_dir_name, disk_tree, check_props=False): """Verify WC_DIR_NAME against DISK_TREE. If CHECK_PROPS is set, the comparison will examin props. Returns if successful, raises on failure.""" verify_update(None, None, None, wc_dir_name, None, None, None, disk_tree, None, check_props=check_props) def run_and_verify_update(wc_dir_name, output_tree, disk_tree, status_tree, error_re_string = None, singleton_handler_a = None, a_baton = None, singleton_handler_b = None, b_baton = None, check_props = False, *args): """Update WC_DIR_NAME. *ARGS are any extra optional args to the update subcommand. NOTE: If *ARGS is specified at all, explicit target paths must be passed in *ARGS as well (or a default `.' will be chosen by the 'svn' binary). This allows the caller to update many items in a single working copy dir, but still verify the entire working copy dir. If ERROR_RE_STRING, the update must exit with error, and the error message must match regular expression ERROR_RE_STRING. Else if ERROR_RE_STRING is None, then: If OUTPUT_TREE is not None, the subcommand output will be verified against OUTPUT_TREE. If DISK_TREE is not None, the working copy itself will be verified against DISK_TREE. If STATUS_TREE is not None, the 'svn status' output will be verified against STATUS_TREE. (This is a good way to check that revision numbers were bumped.) For the DISK_TREE verification, SINGLETON_HANDLER_A and SINGLETON_HANDLER_B will be passed to tree.compare_trees -- see that function's doc string for more details. If CHECK_PROPS is set, then disk comparison will examine props. Return if successful, raise on failure.""" # Update and make a tree of the output. 
if len(args): exit_code, output, errput = main.run_svn(error_re_string, 'up', *args) else: exit_code, output, errput = main.run_svn(error_re_string, 'up', wc_dir_name, *args) if error_re_string: rm = re.compile(error_re_string) for line in errput: match = rm.search(line) if match: return raise main.SVNUnmatchedError actual = wc.State.from_checkout(output) verify_update(actual, None, None, wc_dir_name, output_tree, None, None, disk_tree, status_tree, singleton_handler_a, a_baton, singleton_handler_b, b_baton, check_props) def run_and_parse_info(*args): """Run 'svn info' and parse its output into a list of dicts, one dict per target.""" # the returned array all_infos = [] # per-target variables iter_info = {} prev_key = None lock_comment_lines = 0 lock_comments = [] exit_code, output, errput = main.run_svn(None, 'info', *args) for line in output: line = line[:-1] # trim '\n' if lock_comment_lines > 0: # mop up any lock comment lines lock_comments.append(line) lock_comment_lines = lock_comment_lines - 1 if lock_comment_lines == 0: iter_info[prev_key] = lock_comments elif len(line) == 0: # separator line between items all_infos.append(iter_info) iter_info = {} prev_key = None lock_comment_lines = 0 lock_comments = [] elif line[0].isspace(): # continuation line (for tree conflicts) iter_info[prev_key] += line[1:] else: # normal line key, value = line.split(':', 1) if re.search(' \(\d+ lines?\)$', key): # numbered continuation lines match = re.match('^(.*) \((\d+) lines?\)$', key) key = match.group(1) lock_comment_lines = int(match.group(2)) elif len(value) > 1: # normal normal line iter_info[key] = value[1:] else: ### originally added for "Tree conflict:\n" lines; ### tree-conflicts output format has changed since then # continuation lines are implicit (prefixed by whitespace) iter_info[key] = '' prev_key = key return all_infos def run_and_verify_info(expected_infos, *args): """Run 'svn info' with the arguments in *ARGS and verify the results against expected_infos. The latter should be a list of dicts (in the same order as the targets). In the dicts, each key is the before-the-colon part of the 'svn info' output, and each value is either None (meaning that the key should *not* appear in the 'svn info' output) or a regex matching the output value. Output lines not matching a key in the dict are ignored. Return if successful, raise on failure.""" actual_infos = run_and_parse_info(*args) try: # zip() won't complain, so check this manually if len(actual_infos) != len(expected_infos): raise verify.SVNUnexpectedStdout( "Expected %d infos, found %d infos" % (len(expected_infos), len(actual_infos))) for actual, expected in zip(actual_infos, expected_infos): # compare dicts for key, value in expected.items(): assert ':' not in key # caller passed impossible expectations? 
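        # For illustration, an expected dict might look like
        #   {'Revision': '[0-9]+', 'Repository UUID': '.*', 'Lock Token': None}
        # where each regex value must match the corresponding output value
        # and a value of None means the key must not appear at all.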
if value is None and key in actual: raise main.SVNLineUnequal("Found unexpected key '%s' with value '%s'" % (key, actual[key])) if value is not None and key not in actual: raise main.SVNLineUnequal("Expected key '%s' (with value '%s') " "not found" % (key, value)) if value is not None and not re.search(value, actual[key]): raise verify.SVNUnexpectedStdout("Values of key '%s' don't match:\n" " Expected: '%s' (regex)\n" " Found: '%s' (string)\n" % (key, value, actual[key])) except: sys.stderr.write("Bad 'svn info' output:\n" " Received: %s\n" " Expected: %s\n" % (actual_infos, expected_infos)) raise def run_and_verify_merge(dir, rev1, rev2, url1, url2, output_tree, mergeinfo_output_tree, elision_output_tree, disk_tree, status_tree, skip_tree, error_re_string = None, singleton_handler_a = None, a_baton = None, singleton_handler_b = None, b_baton = None, check_props = False, dry_run = True, *args): """Run 'svn merge URL1@REV1 URL2@REV2 DIR' if URL2 is not None (for a three-way merge between URLs and WC). If URL2 is None, run 'svn merge -rREV1:REV2 URL1 DIR'. If both REV1 and REV2 are None, leave off the '-r' argument. If ERROR_RE_STRING, the merge must exit with error, and the error message must match regular expression ERROR_RE_STRING. Else if ERROR_RE_STRING is None, then: The subcommand output will be verified against OUTPUT_TREE. Output related to mergeinfo notifications will be verified against MERGEINFO_OUTPUT_TREE if that is not None. Output related to mergeinfo elision will be verified against ELISION_OUTPUT_TREE if that is not None. The working copy itself will be verified against DISK_TREE. If optional STATUS_TREE is given, then 'svn status' output will be compared. The 'skipped' merge output will be compared to SKIP_TREE. For the DISK_TREE verification, SINGLETON_HANDLER_A and SINGLETON_HANDLER_B will be passed to tree.compare_trees -- see that function's doc string for more details. If CHECK_PROPS is set, then disk comparison will examine props. If DRY_RUN is set then a --dry-run merge will be carried out first and the output compared with that of the full merge. Return if successful, raise on failure.""" merge_command = [ "merge" ] if url2: merge_command.extend((url1 + "@" + str(rev1), url2 + "@" + str(rev2))) else: if not (rev1 is None and rev2 is None): merge_command.append("-r" + str(rev1) + ":" + str(rev2)) merge_command.append(url1) merge_command.append(dir) merge_command = tuple(merge_command) if dry_run: pre_disk = tree.build_tree_from_wc(dir) dry_run_command = merge_command + ('--dry-run',) dry_run_command = dry_run_command + args exit_code, out_dry, err_dry = main.run_svn(error_re_string, *dry_run_command) post_disk = tree.build_tree_from_wc(dir) try: tree.compare_trees("disk", post_disk, pre_disk) except tree.SVNTreeError: print("=============================================================") print("Dry-run merge altered working copy") print("=============================================================") raise # Update and make a tree of the output. 
merge_command = merge_command + args exit_code, out, err = main.run_svn(error_re_string, *merge_command) if error_re_string: if not error_re_string.startswith(".*"): error_re_string = ".*(" + error_re_string + ")" expected_err = verify.RegexOutput(error_re_string, match_all=False) verify.verify_outputs(None, None, err, None, expected_err) return elif err: raise verify.SVNUnexpectedStderr(err) # Split the output into that related to application of the actual diff # and that related to the recording of mergeinfo describing the merge. merge_diff_out = [] mergeinfo_notification_out = [] mergeinfo_elision_out = [] mergeinfo_notifications = False elision_notifications = False for line in out: if line.startswith('--- Recording'): mergeinfo_notifications = True elision_notifications = False elif line.startswith('--- Eliding'): mergeinfo_notifications = False elision_notifications = True elif line.startswith('--- Merging') or \ line.startswith('--- Reverse-merging') or \ line.startswith('Summary of conflicts') or \ line.startswith('Skipped missing target'): mergeinfo_notifications = False elision_notifications = False if mergeinfo_notifications: mergeinfo_notification_out.append(line) elif elision_notifications: mergeinfo_elision_out.append(line) else: merge_diff_out.append(line) if dry_run and merge_diff_out != out_dry: # Due to the way ra_serf works, it's possible that the dry-run and # real merge operations did the same thing, but the output came in # a different order. Let's see if maybe that's the case. # # NOTE: Would be nice to limit this dance to serf tests only, but... out_copy = merge_diff_out[:] out_dry_copy = out_dry[:] out_copy.sort() out_dry_copy.sort() if out_copy != out_dry_copy: print("=============================================================") print("Merge outputs differ") print("The dry-run merge output:") for x in out_dry: sys.stdout.write(x) print("The full merge output:") for x in out: sys.stdout.write(x) print("=============================================================") raise main.SVNUnmatchedError def missing_skip(a, b): print("=============================================================") print("Merge failed to skip: " + a.path) print("=============================================================") raise Failure def extra_skip(a, b): print("=============================================================") print("Merge unexpectedly skipped: " + a.path) print("=============================================================") raise Failure myskiptree = tree.build_tree_from_skipped(out) if isinstance(skip_tree, wc.State): skip_tree = skip_tree.old_tree() try: tree.compare_trees("skip", myskiptree, skip_tree, extra_skip, None, missing_skip, None) except tree.SVNTreeUnequal: print("ACTUAL SKIP TREE:") tree.dump_tree_script(myskiptree, dir + os.sep) raise actual_diff = svntest.wc.State.from_checkout(merge_diff_out, False) actual_mergeinfo = svntest.wc.State.from_checkout(mergeinfo_notification_out, False) actual_elision = svntest.wc.State.from_checkout(mergeinfo_elision_out, False) verify_update(actual_diff, actual_mergeinfo, actual_elision, dir, output_tree, mergeinfo_output_tree, elision_output_tree, disk_tree, status_tree, singleton_handler_a, a_baton, singleton_handler_b, b_baton, check_props) def run_and_verify_patch(dir, patch_path, output_tree, disk_tree, status_tree, skip_tree, error_re_string=None, check_props=False, dry_run=True, *args): """Run 'svn patch patch_path DIR'. 
If ERROR_RE_STRING, 'svn patch' must exit with error, and the error message must match regular expression ERROR_RE_STRING. Else if ERROR_RE_STRING is None, then: The subcommand output will be verified against OUTPUT_TREE, and the working copy itself will be verified against DISK_TREE. If optional STATUS_TREE is given, then 'svn status' output will be compared. The 'skipped' merge output will be compared to SKIP_TREE. If CHECK_PROPS is set, then disk comparison will examine props. If DRY_RUN is set then a --dry-run patch will be carried out first and the output compared with that of the full patch application. Returns if successful, raises on failure.""" patch_command = [ "patch" ] patch_command.append(patch_path) patch_command.append(dir) patch_command = tuple(patch_command) if dry_run: pre_disk = tree.build_tree_from_wc(dir) dry_run_command = patch_command + ('--dry-run',) dry_run_command = dry_run_command + args exit_code, out_dry, err_dry = main.run_svn(error_re_string, *dry_run_command) post_disk = tree.build_tree_from_wc(dir) try: tree.compare_trees("disk", post_disk, pre_disk) except tree.SVNTreeError: print("=============================================================") print("'svn patch --dry-run' altered working copy") print("=============================================================") raise # Update and make a tree of the output. patch_command = patch_command + args exit_code, out, err = main.run_svn(True, *patch_command) if error_re_string: rm = re.compile(error_re_string) match = None for line in err: match = rm.search(line) if match: break if not match: raise main.SVNUnmatchedError elif err: print("UNEXPECTED STDERR:") for x in err: sys.stdout.write(x) raise verify.SVNUnexpectedStderr if dry_run and out != out_dry: print("=============================================================") print("Outputs differ") print("'svn patch --dry-run' output:") for x in out_dry: sys.stdout.write(x) print("'svn patch' output:") for x in out: sys.stdout.write(x) print("=============================================================") raise main.SVNUnmatchedError def missing_skip(a, b): print("=============================================================") print("'svn patch' failed to skip: " + a.path) print("=============================================================") raise Failure def extra_skip(a, b): print("=============================================================") print("'svn patch' unexpectedly skipped: " + a.path) print("=============================================================") raise Failure myskiptree = tree.build_tree_from_skipped(out) if isinstance(skip_tree, wc.State): skip_tree = skip_tree.old_tree() tree.compare_trees("skip", myskiptree, skip_tree, extra_skip, None, missing_skip, None) mytree = tree.build_tree_from_checkout(out, 0) # when the expected output is a list, we want a line-by-line # comparison to happen instead of a tree comparison if isinstance(output_tree, list): verify.verify_outputs(None, out, err, output_tree, error_re_string) output_tree = None verify_update(mytree, None, None, dir, output_tree, None, None, disk_tree, status_tree, check_props=check_props) def run_and_verify_mergeinfo(error_re_string = None, expected_output = [], *args): """Run 'svn mergeinfo ARGS', and compare the result against EXPECTED_OUTPUT, a list of string representations of revisions expected in the output. 
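  For example, EXPECTED_OUTPUT might be ['3', '5'] when revisions r3 and r5
  are expected to be listed; the first character of each output line
  (normally the leading 'r') is stripped before the comparison is made.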
Raise an exception if an unexpected output is encountered.""" mergeinfo_command = ["mergeinfo"] mergeinfo_command.extend(args) exit_code, out, err = main.run_svn(error_re_string, *mergeinfo_command) if error_re_string: if not error_re_string.startswith(".*"): error_re_string = ".*(" + error_re_string + ")" expected_err = verify.RegexOutput(error_re_string, match_all=False) verify.verify_outputs(None, None, err, None, expected_err) return out = sorted([_f for _f in [x.rstrip()[1:] for x in out] if _f]) expected_output.sort() extra_out = [] if out != expected_output: exp_hash = dict.fromkeys(expected_output) for rev in out: if rev in exp_hash: del(exp_hash[rev]) else: extra_out.append(rev) extra_exp = list(exp_hash.keys()) raise Exception("Unexpected 'svn mergeinfo' output:\n" " expected but not found: %s\n" " found but not expected: %s" % (', '.join([str(x) for x in extra_exp]), ', '.join([str(x) for x in extra_out]))) def run_and_verify_switch(wc_dir_name, wc_target, switch_url, output_tree, disk_tree, status_tree, error_re_string = None, singleton_handler_a = None, a_baton = None, singleton_handler_b = None, b_baton = None, check_props = False, *args): """Switch WC_TARGET (in working copy dir WC_DIR_NAME) to SWITCH_URL. If ERROR_RE_STRING, the switch must exit with error, and the error message must match regular expression ERROR_RE_STRING. Else if ERROR_RE_STRING is None, then: The subcommand output will be verified against OUTPUT_TREE, and the working copy itself will be verified against DISK_TREE. If optional STATUS_TREE is given, then 'svn status' output will be compared. (This is a good way to check that revision numbers were bumped.) For the DISK_TREE verification, SINGLETON_HANDLER_A and SINGLETON_HANDLER_B will be passed to tree.compare_trees -- see that function's doc string for more details. If CHECK_PROPS is set, then disk comparison will examine props. Return if successful, raise on failure.""" # Update and make a tree of the output. exit_code, output, errput = main.run_svn(error_re_string, 'switch', switch_url, wc_target, *args) if error_re_string: if not error_re_string.startswith(".*"): error_re_string = ".*(" + error_re_string + ")" expected_err = verify.RegexOutput(error_re_string, match_all=False) verify.verify_outputs(None, None, errput, None, expected_err) return elif errput: raise verify.SVNUnexpectedStderr(err) actual = wc.State.from_checkout(output) verify_update(actual, None, None, wc_dir_name, output_tree, None, None, disk_tree, status_tree, singleton_handler_a, a_baton, singleton_handler_b, b_baton, check_props) def process_output_for_commit(output): """Helper for run_and_verify_commit(), also used in the factory.""" # Remove the final output line, and verify that the commit succeeded. lastline = "" if len(output): lastline = output.pop().strip() cm = re.compile("(Committed|Imported) revision [0-9]+.") match = cm.search(lastline) if not match: print("ERROR: commit did not succeed.") print("The final line from 'svn ci' was:") print(lastline) raise main.SVNCommitFailure # The new 'final' line in the output is either a regular line that # mentions {Adding, Deleting, Sending, ...}, or it could be a line # that says "Transmitting file data ...". If the latter case, we # want to remove the line from the output; it should be ignored when # building a tree. if len(output): lastline = output.pop() tm = re.compile("Transmitting file data.+") match = tm.search(lastline) if not match: # whoops, it was important output, put it back. 
output.append(lastline) return output def run_and_verify_commit(wc_dir_name, output_tree, status_tree, error_re_string = None, *args): """Commit and verify results within working copy WC_DIR_NAME, sending ARGS to the commit subcommand. The subcommand output will be verified against OUTPUT_TREE. If optional STATUS_TREE is given, then 'svn status' output will be compared. (This is a good way to check that revision numbers were bumped.) If ERROR_RE_STRING is None, the commit must not exit with error. If ERROR_RE_STRING is a string, the commit must exit with error, and the error message must match regular expression ERROR_RE_STRING. Return if successful, raise on failure.""" if isinstance(output_tree, wc.State): output_tree = output_tree.old_tree() if isinstance(status_tree, wc.State): status_tree = status_tree.old_tree() # Commit. exit_code, output, errput = main.run_svn(error_re_string, 'ci', '-m', 'log msg', *args) if error_re_string: if not error_re_string.startswith(".*"): error_re_string = ".*(" + error_re_string + ")" expected_err = verify.RegexOutput(error_re_string, match_all=False) verify.verify_outputs(None, None, errput, None, expected_err) return # Else not expecting error: # Convert the output into a tree. output = process_output_for_commit(output) actual = tree.build_tree_from_commit(output) # Verify actual output against expected output. try: tree.compare_trees("output", actual, output_tree) except tree.SVNTreeError: verify.display_trees("Output of commit is unexpected", "OUTPUT TREE", output_tree, actual) print("ACTUAL OUTPUT TREE:") tree.dump_tree_script(actual, wc_dir_name + os.sep) raise # Verify via 'status' command too, if possible. if status_tree: run_and_verify_status(wc_dir_name, status_tree) # This function always passes '-q' to the status command, which # suppresses the printing of any unversioned or nonexistent items. def run_and_verify_status(wc_dir_name, output_tree, singleton_handler_a = None, a_baton = None, singleton_handler_b = None, b_baton = None): """Run 'status' on WC_DIR_NAME and compare it with the expected OUTPUT_TREE. SINGLETON_HANDLER_A and SINGLETON_HANDLER_B will be passed to tree.compare_trees - see that function's doc string for more details. Returns on success, raises on failure.""" if isinstance(output_tree, wc.State): output_state = output_tree output_tree = output_tree.old_tree() else: output_state = None exit_code, output, errput = main.run_svn(None, 'status', '-v', '-u', '-q', wc_dir_name) actual = tree.build_tree_from_status(output) # Verify actual output against expected output. try: tree.compare_trees("status", actual, output_tree, singleton_handler_a, a_baton, singleton_handler_b, b_baton) except tree.SVNTreeError: verify.display_trees(None, 'STATUS OUTPUT TREE', output_tree, actual) print("ACTUAL STATUS TREE:") tree.dump_tree_script(actual, wc_dir_name + os.sep) raise # if we have an output State, and we can/are-allowed to create an # entries-based State, then compare the two. if output_state: entries_state = wc.State.from_entries(wc_dir_name) if entries_state: tweaked = output_state.copy() tweaked.tweak_for_entries_compare() try: tweaked.compare_and_display('entries', entries_state) except tree.SVNTreeUnequal: ### do something more raise # A variant of previous func, but doesn't pass '-q'. This allows us # to verify unversioned or nonexistent items in the list. def run_and_verify_unquiet_status(wc_dir_name, status_tree): """Run 'status' on WC_DIR_NAME and compare it with the expected STATUS_TREE. 
Returns on success, raises on failure.""" if isinstance(status_tree, wc.State): status_tree = status_tree.old_tree() exit_code, output, errput = main.run_svn(None, 'status', '-v', '-u', wc_dir_name) actual = tree.build_tree_from_status(output) # Verify actual output against expected output. try: tree.compare_trees("UNQUIET STATUS", actual, status_tree) except tree.SVNTreeError: print("ACTUAL UNQUIET STATUS TREE:") tree.dump_tree_script(actual, wc_dir_name + os.sep) raise def run_and_verify_diff_summarize_xml(error_re_string = [], expected_prefix = None, expected_paths = [], expected_items = [], expected_props = [], expected_kinds = [], *args): """Run 'diff --summarize --xml' with the arguments *ARGS, which should contain all arguments beyond for your 'diff --summarize --xml' omitting said arguments. EXPECTED_PREFIX will store a "common" path prefix expected to be at the beginning of each summarized path. If EXPECTED_PREFIX is None, then EXPECTED_PATHS will need to be exactly as 'svn diff --summarize --xml' will output. If ERROR_RE_STRING, the command must exit with error, and the error message must match regular expression ERROR_RE_STRING. Else if ERROR_RE_STRING is None, the subcommand output will be parsed into an XML document and will then be verified by comparing the parsed output to the contents in the EXPECTED_PATHS, EXPECTED_ITEMS, EXPECTED_PROPS and EXPECTED_KINDS. Returns on success, raises on failure.""" exit_code, output, errput = run_and_verify_svn(None, None, error_re_string, 'diff', '--summarize', '--xml', *args) # Return if errors are present since they were expected if len(errput) > 0: return doc = parseString(''.join(output)) paths = doc.getElementsByTagName("path") items = expected_items kinds = expected_kinds for path in paths: modified_path = path.childNodes[0].data if (expected_prefix is not None and modified_path.find(expected_prefix) == 0): modified_path = modified_path.replace(expected_prefix, '')[1:].strip() # Workaround single-object diff if len(modified_path) == 0: modified_path = path.childNodes[0].data.split(os.sep)[-1] # From here on, we use '/' as path separator. if os.sep != "/": modified_path = modified_path.replace(os.sep, "/") if modified_path not in expected_paths: print("ERROR: %s not expected in the changed paths." % modified_path) raise Failure index = expected_paths.index(modified_path) expected_item = items[index] expected_kind = kinds[index] expected_prop = expected_props[index] actual_item = path.getAttribute('item') actual_kind = path.getAttribute('kind') actual_prop = path.getAttribute('props') if expected_item != actual_item: print("ERROR: expected: %s actual: %s" % (expected_item, actual_item)) raise Failure if expected_kind != actual_kind: print("ERROR: expected: %s actual: %s" % (expected_kind, actual_kind)) raise Failure if expected_prop != actual_prop: print("ERROR: expected: %s actual: %s" % (expected_prop, actual_prop)) raise Failure def run_and_verify_diff_summarize(output_tree, *args): """Run 'diff --summarize' with the arguments *ARGS. The subcommand output will be verified against OUTPUT_TREE. Returns on success, raises on failure. """ if isinstance(output_tree, wc.State): output_tree = output_tree.old_tree() exit_code, output, errput = main.run_svn(None, 'diff', '--summarize', *args) actual = tree.build_tree_from_diff_summarize(output) # Verify actual output against expected output. 
try: tree.compare_trees("output", actual, output_tree) except tree.SVNTreeError: verify.display_trees(None, 'DIFF OUTPUT TREE', output_tree, actual) print("ACTUAL DIFF OUTPUT TREE:") tree.dump_tree_script(actual) raise def run_and_validate_lock(path, username): """`svn lock' the given path and validate the contents of the lock. Use the given username. This is important because locks are user specific.""" comment = "Locking path:%s." % path # lock the path run_and_verify_svn(None, ".*locked by user", [], 'lock', '--username', username, '-m', comment, path) # Run info and check that we get the lock fields. exit_code, output, err = run_and_verify_svn(None, None, [], 'info','-R', path) ### TODO: Leverage RegexOuput([...], match_all=True) here. # prepare the regexs to compare against token_re = re.compile(".*?Lock Token: opaquelocktoken:.*?", re.DOTALL) author_re = re.compile(".*?Lock Owner: %s\n.*?" % username, re.DOTALL) created_re = re.compile(".*?Lock Created:.*?", re.DOTALL) comment_re = re.compile(".*?%s\n.*?" % re.escape(comment), re.DOTALL) # join all output lines into one output = "".join(output) # Fail even if one regex does not match if ( not (token_re.match(output) and author_re.match(output) and created_re.match(output) and comment_re.match(output))): raise Failure def _run_and_verify_resolve(cmd, expected_paths, *args): """Run "svn CMD" (where CMD is 'resolve' or 'resolved') with arguments ARGS, and verify that it resolves the paths in EXPECTED_PATHS and no others. If no ARGS are specified, use the elements of EXPECTED_PATHS as the arguments.""" # TODO: verify that the status of PATHS changes accordingly. if len(args) == 0: args = expected_paths expected_output = verify.UnorderedOutput([ "Resolved conflicted state of '" + path + "'\n" for path in expected_paths]) run_and_verify_svn(None, expected_output, [], cmd, *args) def run_and_verify_resolve(expected_paths, *args): """Run "svn resolve" with arguments ARGS, and verify that it resolves the paths in EXPECTED_PATHS and no others. If no ARGS are specified, use the elements of EXPECTED_PATHS as the arguments.""" _run_and_verify_resolve('resolve', expected_paths, *args) def run_and_verify_resolved(expected_paths, *args): """Run "svn resolved" with arguments ARGS, and verify that it resolves the paths in EXPECTED_PATHS and no others. If no ARGS are specified, use the elements of EXPECTED_PATHS as the arguments.""" _run_and_verify_resolve('resolved', expected_paths, *args) ###################################################################### # Other general utilities # This allows a test to *quickly* bootstrap itself. def make_repo_and_wc(sbox, create_wc = True, read_only = False): """Create a fresh 'Greek Tree' repository and check out a WC from it. If read_only is False, a dedicated repository will be created, named TEST_NAME. The repository will live in the global dir 'general_repo_dir'. If read_only is True the pristine repository will be used. If create_wc is True, a dedicated working copy will be checked out from the repository, named TEST_NAME. The wc directory will live in the global dir 'general_wc_dir'. Both variables 'general_repo_dir' and 'general_wc_dir' are defined at the top of this test suite.) Returns on success, raises on failure.""" # Create (or copy afresh) a new repos with a greek tree in it. if not read_only: guarantee_greek_repository(sbox.repo_dir) if create_wc: # Generate the expected output tree. 
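    # (Every item in a fresh checkout is reported as added, hence the blanket
    # status 'A ' below; contents are irrelevant for the output comparison.)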
expected_output = main.greek_state.copy() expected_output.wc_dir = sbox.wc_dir expected_output.tweak(status='A ', contents=None) # Generate an expected wc tree. expected_wc = main.greek_state # Do a checkout, and verify the resulting output and disk contents. run_and_verify_checkout(sbox.repo_url, sbox.wc_dir, expected_output, expected_wc) else: # just make sure the parent folder of our working copy is created try: os.mkdir(main.general_wc_dir) except OSError, err: if err.errno != errno.EEXIST: raise # Duplicate a working copy or other dir. def duplicate_dir(wc_name, wc_copy_name): """Copy the working copy WC_NAME to WC_COPY_NAME. Overwrite any existing tree at that location.""" main.safe_rmtree(wc_copy_name) shutil.copytree(wc_name, wc_copy_name) def get_virginal_state(wc_dir, rev): "Return a virginal greek tree state for a WC and repos at revision REV." rev = str(rev) ### maybe switch rev to an integer? # copy the greek tree, shift it to the new wc_dir, insert a root elem, # then tweak all values state = main.greek_state.copy() state.wc_dir = wc_dir state.desc[''] = wc.StateItem() state.tweak(contents=None, status=' ', wc_rev=rev) return state def remove_admin_tmp_dir(wc_dir): "Remove the tmp directory within the administrative directory." tmp_path = os.path.join(wc_dir, main.get_admin_name(), 'tmp') ### Any reason not to use main.safe_rmtree()? os.rmdir(os.path.join(tmp_path, 'prop-base')) os.rmdir(os.path.join(tmp_path, 'props')) os.rmdir(os.path.join(tmp_path, 'text-base')) os.rmdir(tmp_path) # Cheap administrative directory locking def lock_admin_dir(wc_dir): "Lock a SVN administrative directory" db = svntest.sqlite3.connect(os.path.join(wc_dir, main.get_admin_name(), 'wc.db')) db.execute('insert into wc_lock (wc_id, local_dir_relpath, locked_levels) ' + 'values (?, ?, ?)', (1, '', 0)) db.commit() db.close() def get_wc_uuid(wc_dir): "Return the UUID of the working copy at WC_DIR." return run_and_parse_info(wc_dir)[0]['Repository UUID'] def get_wc_base_rev(wc_dir): "Return the BASE revision of the working copy at WC_DIR." return run_and_parse_info(wc_dir)[0]['Revision'] def hook_failure_message(hook_name): """Return the error message that the client prints for failure of the specified hook HOOK_NAME. The wording changed with Subversion 1.5.""" if svntest.main.options.server_minor_version < 5: return "'%s' hook failed with error output:\n" % hook_name else: if hook_name in ["start-commit", "pre-commit"]: action = "Commit" elif hook_name == "pre-revprop-change": action = "Revprop change" elif hook_name == "pre-lock": action = "Lock" elif hook_name == "pre-unlock": action = "Unlock" else: action = None if action is None: message = "%s hook failed (exit code 1)" % (hook_name,) else: message = "%s blocked by %s hook (exit code 1)" % (action, hook_name) return message + " with output:\n" def create_failing_hook(repo_dir, hook_name, text): """Create a HOOK_NAME hook in the repository at REPO_DIR that prints TEXT to stderr and exits with an error.""" hook_path = os.path.join(repo_dir, 'hooks', hook_name) # Embed the text carefully: it might include characters like "%" and "'". 
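  # As an illustration, text = "no!" would produce a hook body of roughly:
  #   import sys
  #   sys.stderr.write('no!')
  #   sys.exit(1)
  # (repr() supplies the quoting and escaping, so arbitrary TEXT is safe.)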
main.create_python_hook_script(hook_path, 'import sys\n' 'sys.stderr.write(' + repr(text) + ')\n' 'sys.exit(1)\n') def enable_revprop_changes(repo_dir): """Enable revprop changes in the repository at REPO_DIR by creating a pre-revprop-change hook script and (if appropriate) making it executable.""" hook_path = main.get_pre_revprop_change_hook_path(repo_dir) main.create_python_hook_script(hook_path, 'import sys; sys.exit(0)') def disable_revprop_changes(repo_dir): """Disable revprop changes in the repository at REPO_DIR by creating a pre-revprop-change hook script that prints "pre-revprop-change" followed by its arguments, and returns an error.""" hook_path = main.get_pre_revprop_change_hook_path(repo_dir) main.create_python_hook_script(hook_path, 'import sys\n' 'sys.stderr.write("pre-revprop-change %s" % " ".join(sys.argv[1:6]))\n' 'sys.exit(1)\n') def create_failing_post_commit_hook(repo_dir): """Create a post-commit hook script in the repository at REPO_DIR that always reports an error.""" hook_path = main.get_post_commit_hook_path(repo_dir) main.create_python_hook_script(hook_path, 'import sys\n' 'sys.stderr.write("Post-commit hook failed")\n' 'sys.exit(1)') # set_prop can be used for properties with NULL characters which are not # handled correctly when passed to subprocess.Popen() and values like "*" # which are not handled correctly on Windows. def set_prop(name, value, path, expected_err=None): """Set a property with specified value""" if value and (value[0] == '-' or '\x00' in value or sys.platform == 'win32'): from tempfile import mkstemp (fd, value_file_path) = mkstemp() value_file = open(value_file_path, 'wb') value_file.write(value) value_file.flush() value_file.close() main.run_svn(expected_err, 'propset', '-F', value_file_path, name, path) os.close(fd) os.remove(value_file_path) else: main.run_svn(expected_err, 'propset', name, value, path) def check_prop(name, path, exp_out): """Verify that property NAME on PATH has a value of EXP_OUT""" # Not using run_svn because binary_mode must be set exit_code, out, err = main.run_command(main.svn_binary, None, 1, 'pg', '--strict', name, path, '--config-dir', main.default_config_dir, '--username', main.wc_author, '--password', main.wc_passwd) if out != exp_out: print("svn pg --strict %s output does not match expected." % name) print("Expected standard output: %s\n" % exp_out) print("Actual standard output: %s\n" % out) raise Failure def fill_file_with_lines(wc_path, line_nbr, line_descrip=None, append=True): """Change the file at WC_PATH (adding some lines), and return its new contents. LINE_NBR indicates the line number at which the new contents should assume that it's being appended. LINE_DESCRIP is something like 'This is line' (the default) or 'Conflicting line'.""" if line_descrip is None: line_descrip = "This is line" # Generate the new contents for the file. contents = "" for n in range(line_nbr, line_nbr + 3): contents = contents + line_descrip + " " + repr(n) + " in '" + \ os.path.basename(wc_path) + "'.\n" # Write the new contents to the file. if append: main.file_append(wc_path, contents) else: main.file_write(wc_path, contents) return contents def inject_conflict_into_wc(sbox, state_path, file_path, expected_disk, expected_status, merged_rev): """Create a conflict at FILE_PATH by replacing its contents, committing the change, backdating it to its previous revision, changing its contents again, then updating it to merge in the previous change.""" wc_dir = sbox.wc_dir # Make a change to the file. 
contents = fill_file_with_lines(file_path, 1, "This is line", append=False) # Commit the changed file, first taking note of the current revision. prev_rev = expected_status.desc[state_path].wc_rev expected_output = wc.State(wc_dir, { state_path : wc.StateItem(verb='Sending'), }) if expected_status: expected_status.tweak(state_path, wc_rev=merged_rev) run_and_verify_commit(wc_dir, expected_output, expected_status, None, file_path) # Backdate the file. exit_code, output, errput = main.run_svn(None, "up", "-r", str(prev_rev), file_path) if expected_status: expected_status.tweak(state_path, wc_rev=prev_rev) # Make a conflicting change to the file, and backdate the file. conflicting_contents = fill_file_with_lines(file_path, 1, "Conflicting line", append=False) # Merge the previous change into the file to produce a conflict. if expected_disk: expected_disk.tweak(state_path, contents="") expected_output = wc.State(wc_dir, { state_path : wc.StateItem(status='C '), }) inject_conflict_into_expected_state(state_path, expected_disk, expected_status, conflicting_contents, contents, merged_rev) exit_code, output, errput = main.run_svn(None, "up", "-r", str(merged_rev), sbox.repo_url + "/" + state_path, file_path) if expected_status: expected_status.tweak(state_path, wc_rev=merged_rev) def inject_conflict_into_expected_state(state_path, expected_disk, expected_status, wc_text, merged_text, merged_rev): """Update the EXPECTED_DISK and EXPECTED_STATUS trees for the conflict at STATE_PATH (ignored if None). WC_TEXT, MERGED_TEXT, and MERGED_REV are used to determine the contents of the conflict (the text parameters should be newline-terminated).""" if expected_disk: conflict_marker = make_conflict_marker_text(wc_text, merged_text, merged_rev) existing_text = expected_disk.desc[state_path].contents or "" expected_disk.tweak(state_path, contents=existing_text + conflict_marker) if expected_status: expected_status.tweak(state_path, status='C ') def make_conflict_marker_text(wc_text, merged_text, merged_rev): """Return the conflict marker text described by WC_TEXT (the current text in the working copy, MERGED_TEXT (the conflicting text merged in), and MERGED_REV (the revision from whence the conflicting text came).""" return "<<<<<<< .working\n" + wc_text + "=======\n" + \ merged_text + ">>>>>>> .merge-right.r" + str(merged_rev) + "\n" def build_greek_tree_conflicts(sbox): """Create a working copy that has tree-conflict markings. After this function has been called, sbox.wc_dir is a working copy that has specific tree-conflict markings. In particular, this does two conflicting sets of edits and performs an update so that tree conflicts appear. Note that this function calls sbox.build() because it needs a clean sbox. So, there is no need to call sbox.build() before this. The conflicts are the result of an 'update' on the following changes: Incoming Local A/D/G/pi text-mod del A/D/G/rho del text-mod A/D/G/tau del del This function is useful for testing that tree-conflicts are handled properly once they have appeared, e.g. that commits are blocked, that the info output is correct, etc. See also the tree-conflicts tests using deep_trees in various other .py files, and tree_conflict_tests.py. """ sbox.build() wc_dir = sbox.wc_dir j = os.path.join G = j(wc_dir, 'A', 'D', 'G') pi = j(G, 'pi') rho = j(G, 'rho') tau = j(G, 'tau') # Make incoming changes and "store them away" with a commit. 
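# Illustrative sketch, not part of the original file: the conflict marker
# text produced by make_conflict_marker_text() above for simple inputs
# (derived directly from the function body; the sample texts are invented).
#
#   make_conflict_marker_text('local line\n', 'merged line\n', 4) returns:
#
#     <<<<<<< .working
#     local line
#     =======
#     merged line
#     >>>>>>> .merge-right.r4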
main.file_append(pi, "Incoming edit.\n") main.run_svn(None, 'del', rho) main.run_svn(None, 'del', tau) expected_output = wc.State(wc_dir, { 'A/D/G/pi' : Item(verb='Sending'), 'A/D/G/rho' : Item(verb='Deleting'), 'A/D/G/tau' : Item(verb='Deleting'), }) expected_status = get_virginal_state(wc_dir, 1) expected_status.tweak('A/D/G/pi', wc_rev='2') expected_status.remove('A/D/G/rho', 'A/D/G/tau') run_and_verify_commit(wc_dir, expected_output, expected_status, None, '-m', 'Incoming changes.', wc_dir ) # Update back to the pristine state ("time-warp"). expected_output = wc.State(wc_dir, { 'A/D/G/pi' : Item(status='U '), 'A/D/G/rho' : Item(status='A '), 'A/D/G/tau' : Item(status='A '), }) expected_disk = main.greek_state expected_status = get_virginal_state(wc_dir, 1) run_and_verify_update(wc_dir, expected_output, expected_disk, expected_status, None, None, None, None, None, False, '-r', '1', wc_dir) # Make local changes main.run_svn(None, 'del', pi) main.file_append(rho, "Local edit.\n") main.run_svn(None, 'del', tau) # Update, receiving the incoming changes on top of the local changes, # causing tree conflicts. Don't check for any particular result: that is # the job of other tests. run_and_verify_svn(None, verify.AnyOutput, [], 'update', wc_dir) def make_deep_trees(base): """Helper function for deep trees conflicts. Create a set of trees, each in its own "container" dir. Any conflicts can be tested separately in each container. """ j = os.path.join # Create the container dirs. F = j(base, 'F') D = j(base, 'D') DF = j(base, 'DF') DD = j(base, 'DD') DDF = j(base, 'DDF') DDD = j(base, 'DDD') os.makedirs(F) os.makedirs(j(D, 'D1')) os.makedirs(j(DF, 'D1')) os.makedirs(j(DD, 'D1', 'D2')) os.makedirs(j(DDF, 'D1', 'D2')) os.makedirs(j(DDD, 'D1', 'D2', 'D3')) # Create their files. alpha = j(F, 'alpha') beta = j(DF, 'D1', 'beta') gamma = j(DDF, 'D1', 'D2', 'gamma') main.file_append(alpha, "This is the file 'alpha'.\n") main.file_append(beta, "This is the file 'beta'.\n") main.file_append(gamma, "This is the file 'gamma'.\n") def add_deep_trees(sbox, base_dir_name): """Prepare a "deep_trees" within a given directory. The directory / is created and a deep_tree is created within. The items are only added, a commit has to be called separately, if needed. will thus be a container for the set of containers mentioned in make_deep_trees(). """ j = os.path.join base = j(sbox.wc_dir, base_dir_name) make_deep_trees(base) main.run_svn(None, 'add', base) Item = wc.StateItem # initial deep trees state deep_trees_virginal_state = wc.State('', { 'F' : Item(), 'F/alpha' : Item("This is the file 'alpha'.\n"), 'D' : Item(), 'D/D1' : Item(), 'DF' : Item(), 'DF/D1' : Item(), 'DF/D1/beta' : Item("This is the file 'beta'.\n"), 'DD' : Item(), 'DD/D1' : Item(), 'DD/D1/D2' : Item(), 'DDF' : Item(), 'DDF/D1' : Item(), 'DDF/D1/D2' : Item(), 'DDF/D1/D2/gamma' : Item("This is the file 'gamma'.\n"), 'DDD' : Item(), 'DDD/D1' : Item(), 'DDD/D1/D2' : Item(), 'DDD/D1/D2/D3' : Item(), }) # Many actions on deep trees and their resulting states... def deep_trees_leaf_edit(base): """Helper function for deep trees test cases. 
Append text to files, create new files in empty directories, and change leaf node properties.""" j = os.path.join F = j(base, 'F', 'alpha') DF = j(base, 'DF', 'D1', 'beta') DDF = j(base, 'DDF', 'D1', 'D2', 'gamma') main.file_append(F, "More text for file alpha.\n") main.file_append(DF, "More text for file beta.\n") main.file_append(DDF, "More text for file gamma.\n") run_and_verify_svn(None, verify.AnyOutput, [], 'propset', 'prop1', '1', F, DF, DDF) D = j(base, 'D', 'D1') DD = j(base, 'DD', 'D1', 'D2') DDD = j(base, 'DDD', 'D1', 'D2', 'D3') run_and_verify_svn(None, verify.AnyOutput, [], 'propset', 'prop1', '1', D, DD, DDD) D = j(base, 'D', 'D1', 'delta') DD = j(base, 'DD', 'D1', 'D2', 'epsilon') DDD = j(base, 'DDD', 'D1', 'D2', 'D3', 'zeta') main.file_append(D, "This is the file 'delta'.\n") main.file_append(DD, "This is the file 'epsilon'.\n") main.file_append(DDD, "This is the file 'zeta'.\n") run_and_verify_svn(None, verify.AnyOutput, [], 'add', D, DD, DDD) # deep trees state after a call to deep_trees_leaf_edit deep_trees_after_leaf_edit = wc.State('', { 'F' : Item(), 'F/alpha' : Item("This is the file 'alpha'.\nMore text for file alpha.\n"), 'D' : Item(), 'D/D1' : Item(), 'D/D1/delta' : Item("This is the file 'delta'.\n"), 'DF' : Item(), 'DF/D1' : Item(), 'DF/D1/beta' : Item("This is the file 'beta'.\nMore text for file beta.\n"), 'DD' : Item(), 'DD/D1' : Item(), 'DD/D1/D2' : Item(), 'DD/D1/D2/epsilon' : Item("This is the file 'epsilon'.\n"), 'DDF' : Item(), 'DDF/D1' : Item(), 'DDF/D1/D2' : Item(), 'DDF/D1/D2/gamma' : Item("This is the file 'gamma'.\nMore text for file gamma.\n"), 'DDD' : Item(), 'DDD/D1' : Item(), 'DDD/D1/D2' : Item(), 'DDD/D1/D2/D3' : Item(), 'DDD/D1/D2/D3/zeta' : Item("This is the file 'zeta'.\n"), }) def deep_trees_leaf_del(base): """Helper function for deep trees test cases. Delete files and empty dirs.""" j = os.path.join F = j(base, 'F', 'alpha') D = j(base, 'D', 'D1') DF = j(base, 'DF', 'D1', 'beta') DD = j(base, 'DD', 'D1', 'D2') DDF = j(base, 'DDF', 'D1', 'D2', 'gamma') DDD = j(base, 'DDD', 'D1', 'D2', 'D3') main.run_svn(None, 'rm', F, D, DF, DD, DDF, DDD) # deep trees state after a call to deep_trees_leaf_del deep_trees_after_leaf_del = wc.State('', { 'F' : Item(), 'D' : Item(), 'DF' : Item(), 'DF/D1' : Item(), 'DD' : Item(), 'DD/D1' : Item(), 'DDF' : Item(), 'DDF/D1' : Item(), 'DDF/D1/D2' : Item(), 'DDD' : Item(), 'DDD/D1' : Item(), 'DDD/D1/D2' : Item(), }) def deep_trees_tree_del(base): """Helper function for deep trees test cases. Delete top-level dirs.""" j = os.path.join F = j(base, 'F', 'alpha') D = j(base, 'D', 'D1') DF = j(base, 'DF', 'D1') DD = j(base, 'DD', 'D1') DDF = j(base, 'DDF', 'D1') DDD = j(base, 'DDD', 'D1') main.run_svn(None, 'rm', F, D, DF, DD, DDF, DDD) def deep_trees_rmtree(base): """Helper function for deep trees test cases. 
Delete top-level dirs with rmtree instead of svn del.""" j = os.path.join F = j(base, 'F', 'alpha') D = j(base, 'D', 'D1') DF = j(base, 'DF', 'D1') DD = j(base, 'DD', 'D1') DDF = j(base, 'DDF', 'D1') DDD = j(base, 'DDD', 'D1') os.unlink(F) main.safe_rmtree(D) main.safe_rmtree(DF) main.safe_rmtree(DD) main.safe_rmtree(DDF) main.safe_rmtree(DDD) # deep trees state after a call to deep_trees_tree_del deep_trees_after_tree_del = wc.State('', { 'F' : Item(), 'D' : Item(), 'DF' : Item(), 'DD' : Item(), 'DDF' : Item(), 'DDD' : Item(), }) # deep trees state without any files deep_trees_empty_dirs = wc.State('', { 'F' : Item(), 'D' : Item(), 'D/D1' : Item(), 'DF' : Item(), 'DF/D1' : Item(), 'DD' : Item(), 'DD/D1' : Item(), 'DD/D1/D2' : Item(), 'DDF' : Item(), 'DDF/D1' : Item(), 'DDF/D1/D2' : Item(), 'DDD' : Item(), 'DDD/D1' : Item(), 'DDD/D1/D2' : Item(), 'DDD/D1/D2/D3' : Item(), }) def deep_trees_tree_del_repos(base): """Helper function for deep trees test cases. Delete top-level dirs, directly in the repository.""" j = '/'.join F = j([base, 'F', 'alpha']) D = j([base, 'D', 'D1']) DF = j([base, 'DF', 'D1']) DD = j([base, 'DD', 'D1']) DDF = j([base, 'DDF', 'D1']) DDD = j([base, 'DDD', 'D1']) main.run_svn(None, 'mkdir', '-m', '', F, D, DF, DD, DDF, DDD) # Expected merge/update/switch output. deep_trees_conflict_output = wc.State('', { 'F/alpha' : Item(status=' ', treeconflict='C'), 'D/D1' : Item(status=' ', treeconflict='C'), 'DF/D1' : Item(status=' ', treeconflict='C'), 'DD/D1' : Item(status=' ', treeconflict='C'), 'DDF/D1' : Item(status=' ', treeconflict='C'), 'DDD/D1' : Item(status=' ', treeconflict='C'), }) deep_trees_conflict_output_skipped = wc.State('', { 'D/D1' : Item(verb='Skipped'), 'F/alpha' : Item(verb='Skipped'), 'DD/D1' : Item(verb='Skipped'), 'DF/D1' : Item(verb='Skipped'), 'DDD/D1' : Item(verb='Skipped'), 'DDF/D1' : Item(verb='Skipped'), }) # Expected status output after merge/update/switch. 
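# Illustrative sketch, not part of the original file: a hypothetical test
# fragment that creates one "deep trees" container with add_deep_trees()
# above and commits it.  The container name 'X' is invented for illustration.
#
#   def deep_trees_setup(sbox):
#     "create and commit a deep_trees container under X/"
#     sbox.build()
#     add_deep_trees(sbox, 'X')    # adds sbox.wc_dir/X/{F,D,DF,DD,DDF,DDD}
#     main.run_svn(None, 'commit', '-m', 'add deep trees', sbox.wc_dir)
#     # The tree under X/ now corresponds to deep_trees_virginal_state.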
deep_trees_status_local_tree_del = wc.State('', { '' : Item(status=' ', wc_rev=3), 'D' : Item(status=' ', wc_rev=3), 'D/D1' : Item(status='D ', wc_rev=2, treeconflict='C'), 'DD' : Item(status=' ', wc_rev=3), 'DD/D1' : Item(status='D ', wc_rev=2, treeconflict='C'), 'DD/D1/D2' : Item(status='D ', wc_rev=2), 'DDD' : Item(status=' ', wc_rev=3), 'DDD/D1' : Item(status='D ', wc_rev=2, treeconflict='C'), 'DDD/D1/D2' : Item(status='D ', wc_rev=2), 'DDD/D1/D2/D3' : Item(status='D ', wc_rev=2), 'DDF' : Item(status=' ', wc_rev=3), 'DDF/D1' : Item(status='D ', wc_rev=2, treeconflict='C'), 'DDF/D1/D2' : Item(status='D ', wc_rev=2), 'DDF/D1/D2/gamma' : Item(status='D ', wc_rev=2), 'DF' : Item(status=' ', wc_rev=3), 'DF/D1' : Item(status='D ', wc_rev=2, treeconflict='C'), 'DF/D1/beta' : Item(status='D ', wc_rev=2), 'F' : Item(status=' ', wc_rev=3), 'F/alpha' : Item(status='D ', wc_rev=2, treeconflict='C'), }) deep_trees_status_local_leaf_edit = wc.State('', { '' : Item(status=' ', wc_rev=3), 'D' : Item(status=' ', wc_rev=3), 'D/D1' : Item(status=' M', wc_rev=2, treeconflict='C'), 'D/D1/delta' : Item(status='A ', wc_rev=0), 'DD' : Item(status=' ', wc_rev=3), 'DD/D1' : Item(status=' ', wc_rev=2, treeconflict='C'), 'DD/D1/D2' : Item(status=' M', wc_rev=2), 'DD/D1/D2/epsilon' : Item(status='A ', wc_rev=0), 'DDD' : Item(status=' ', wc_rev=3), 'DDD/D1' : Item(status=' ', wc_rev=2, treeconflict='C'), 'DDD/D1/D2' : Item(status=' ', wc_rev=2), 'DDD/D1/D2/D3' : Item(status=' M', wc_rev=2), 'DDD/D1/D2/D3/zeta' : Item(status='A ', wc_rev=0), 'DDF' : Item(status=' ', wc_rev=3), 'DDF/D1' : Item(status=' ', wc_rev=2, treeconflict='C'), 'DDF/D1/D2' : Item(status=' ', wc_rev=2), 'DDF/D1/D2/gamma' : Item(status='MM', wc_rev=2), 'DF' : Item(status=' ', wc_rev=3), 'DF/D1' : Item(status=' ', wc_rev=2, treeconflict='C'), 'DF/D1/beta' : Item(status='MM', wc_rev=2), 'F' : Item(status=' ', wc_rev=3), 'F/alpha' : Item(status='MM', wc_rev=2, treeconflict='C'), }) class DeepTreesTestCase: """Describes one tree-conflicts test case. See deep_trees_run_tests_scheme_for_update(), ..._switch(), ..._merge(). The name field is the subdirectory name in which the test should be run. The local_action and incoming_action are the functions to run to construct the local changes and incoming changes, respectively. See deep_trees_leaf_edit, deep_trees_tree_del, etc. The expected_* and error_re_string arguments are described in functions run_and_verify_[update|switch|merge] except expected_info, which is a dict that has path keys with values that are dicts as passed to run_and_verify_info(): expected_info = { 'F/alpha' : { 'Revision' : '3', 'Tree conflict' : '^local delete, incoming edit upon update' + ' Source left: .file.*/F/alpha@2' + ' Source right: .file.*/F/alpha@3$', }, 'DF/D1' : { 'Tree conflict' : '^local delete, incoming edit upon update' + ' Source left: .dir.*/DF/D1@2' + ' Source right: .dir.*/DF/D1@3$', }, ... } Note: expected_skip is only used in merge, i.e. using deep_trees_run_tests_scheme_for_merge. 
""" def __init__(self, name, local_action, incoming_action, expected_output = None, expected_disk = None, expected_status = None, expected_skip = None, error_re_string = None, commit_block_string = ".*remains in conflict.*", expected_info = None): self.name = name self.local_action = local_action self.incoming_action = incoming_action self.expected_output = expected_output self.expected_disk = expected_disk self.expected_status = expected_status self.expected_skip = expected_skip self.error_re_string = error_re_string self.commit_block_string = commit_block_string self.expected_info = expected_info def deep_trees_run_tests_scheme_for_update(sbox, greater_scheme): """ Runs a given list of tests for conflicts occuring at an update operation. This function wants to save time and perform a number of different test cases using just a single repository and performing just one commit for all test cases instead of one for each test case. 1) Each test case is initialized in a separate subdir. Each subdir again contains one set of "deep_trees", being separate container dirs for different depths of trees (F, D, DF, DD, DDF, DDD). 2) A commit is performed across all test cases and depths. (our initial state, -r2) 3) In each test case subdir (e.g. "local_tree_del_incoming_leaf_edit"), its *incoming* action is performed (e.g. "deep_trees_leaf_edit"), in each of the different depth trees (F, D, DF, ... DDD). 4) A commit is performed across all test cases and depths: our "incoming" state is "stored away in the repository for now", -r3. 5) All test case dirs and contained deep_trees are time-warped (updated) back to -r2, the initial state containing deep_trees. 6) In each test case subdir (e.g. "local_tree_del_incoming_leaf_edit"), its *local* action is performed (e.g. "deep_trees_leaf_del"), in each of the different depth trees (F, D, DF, ... DDD). 7) An update to -r3 is performed across all test cases and depths. This causes tree-conflicts between the "local" state in the working copy and the "incoming" state from the repository, -r3. 8) A commit is performed in each separate container, to verify that each tree-conflict indeed blocks a commit. The sbox parameter is just the sbox passed to a test function. No need to call sbox.build(), since it is called (once) within this function. The "table" greater_scheme models all of the different test cases that should be run using a single repository. greater_scheme is a list of DeepTreesTestCase items, which define complete test setups, so that they can be performed as described above. 
""" j = os.path.join sbox.build() wc_dir = sbox.wc_dir # 1) create directories for test_case in greater_scheme: try: add_deep_trees(sbox, test_case.name) except: print("ERROR IN: Tests scheme for update: " + "while setting up deep trees in '%s'" % test_case.name) raise # 2) commit initial state main.run_svn(None, 'commit', '-m', 'initial state', wc_dir) # 3) apply incoming changes for test_case in greater_scheme: try: test_case.incoming_action(j(sbox.wc_dir, test_case.name)) except: print("ERROR IN: Tests scheme for update: " + "while performing incoming action in '%s'" % test_case.name) raise # 4) commit incoming changes main.run_svn(None, 'commit', '-m', 'incoming changes', wc_dir) # 5) time-warp back to -r2 main.run_svn(None, 'update', '-r2', wc_dir) # 6) apply local changes for test_case in greater_scheme: try: test_case.local_action(j(wc_dir, test_case.name)) except: print("ERROR IN: Tests scheme for update: " + "while performing local action in '%s'" % test_case.name) raise # 7) update to -r3, conflicting with incoming changes. # A lot of different things are expected. # Do separate update operations for each test case. for test_case in greater_scheme: try: base = j(wc_dir, test_case.name) x_out = test_case.expected_output if x_out != None: x_out = x_out.copy() x_out.wc_dir = base x_disk = test_case.expected_disk x_status = test_case.expected_status if x_status != None: x_status.copy() x_status.wc_dir = base run_and_verify_update(base, x_out, x_disk, None, error_re_string = test_case.error_re_string) if x_status: run_and_verify_unquiet_status(base, x_status) x_info = test_case.expected_info or {} for path in x_info: run_and_verify_info([x_info[path]], j(base, path)) except: print("ERROR IN: Tests scheme for update: " + "while verifying in '%s'" % test_case.name) raise # 8) Verify that commit fails. for test_case in greater_scheme: try: base = j(wc_dir, test_case.name) x_status = test_case.expected_status if x_status != None: x_status.copy() x_status.wc_dir = base run_and_verify_commit(base, None, x_status, test_case.commit_block_string, base) except: print("ERROR IN: Tests scheme for update: " + "while checking commit-blocking in '%s'" % test_case.name) raise def deep_trees_skipping_on_update(sbox, test_case, skip_paths, chdir_skip_paths): """ Create tree conflicts, then update again, expecting the existing tree conflicts to be skipped. SKIP_PATHS is a list of paths, relative to the "base dir", for which "update" on the "base dir" should report as skipped. CHDIR_SKIP_PATHS is a list of (target-path, skipped-path) pairs for which an update of "target-path" (relative to the "base dir") should result in "skipped-path" (relative to "target-path") being reported as skipped. """ """FURTHER_ACTION is a function that will make a further modification to each target, this being the modification that we expect to be skipped. The function takes the "base dir" (the WC path to the test case directory) as its only argument.""" further_action = deep_trees_tree_del_repos j = os.path.join wc_dir = sbox.wc_dir base = j(wc_dir, test_case.name) # Initialize: generate conflicts. (We do not check anything here.) setup_case = DeepTreesTestCase(test_case.name, test_case.local_action, test_case.incoming_action, None, None, None) deep_trees_run_tests_scheme_for_update(sbox, [setup_case]) # Make a further change to each target in the repository so there is a new # revision to update to. (This is r4.) 
further_action(sbox.repo_url + '/' + test_case.name) # Update whole working copy, expecting the nodes still in conflict to be # skipped. x_out = test_case.expected_output if x_out != None: x_out = x_out.copy() x_out.wc_dir = base x_disk = test_case.expected_disk x_status = test_case.expected_status if x_status != None: x_status = x_status.copy() x_status.wc_dir = base # Account for nodes that were updated by further_action x_status.tweak('', 'D', 'F', 'DD', 'DF', 'DDD', 'DDF', wc_rev=4) run_and_verify_update(base, x_out, x_disk, None, error_re_string = test_case.error_re_string) run_and_verify_unquiet_status(base, x_status) # Try to update each in-conflict subtree. Expect a 'Skipped' output for # each, and the WC status to be unchanged. for path in skip_paths: run_and_verify_update(j(base, path), wc.State(base, {path : Item(verb='Skipped')}), None, None) run_and_verify_unquiet_status(base, x_status) # Try to update each in-conflict subtree. Expect a 'Skipped' output for # each, and the WC status to be unchanged. # This time, cd to the subdir before updating it. was_cwd = os.getcwd() for path, skipped in chdir_skip_paths: #print("CHDIR TO: %s" % j(base, path)) os.chdir(j(base, path)) run_and_verify_update('', wc.State('', {skipped : Item(verb='Skipped')}), None, None) os.chdir(was_cwd) run_and_verify_unquiet_status(base, x_status) # Verify that commit still fails. for path, skipped in chdir_skip_paths: run_and_verify_commit(j(base, path), None, None, test_case.commit_block_string, base) run_and_verify_unquiet_status(base, x_status) def deep_trees_run_tests_scheme_for_switch(sbox, greater_scheme): """ Runs a given list of tests for conflicts occuring at a switch operation. This function wants to save time and perform a number of different test cases using just a single repository and performing just one commit for all test cases instead of one for each test case. 1) Each test case is initialized in a separate subdir. Each subdir again contains two subdirs: one "local" and one "incoming" for the switch operation. These contain a set of deep_trees each. 2) A commit is performed across all test cases and depths. (our initial state, -r2) 3) In each test case subdir's incoming subdir, the incoming actions are performed. 4) A commit is performed across all test cases and depths. (-r3) 5) In each test case subdir's local subdir, the local actions are performed. They remain uncommitted in the working copy. 6) In each test case subdir's local dir, a switch is performed to its corresponding incoming dir. This causes conflicts between the "local" state in the working copy and the "incoming" state from the incoming subdir (still -r3). 7) A commit is performed in each separate container, to verify that each tree-conflict indeed blocks a commit. The sbox parameter is just the sbox passed to a test function. No need to call sbox.build(), since it is called (once) within this function. The "table" greater_scheme models all of the different test cases that should be run using a single repository. greater_scheme is a list of DeepTreesTestCase items, which define complete test setups, so that they can be performed as described above. """ j = os.path.join sbox.build() wc_dir = sbox.wc_dir # 1) Create directories. 
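# Illustrative sketch, not part of the original file: how a test might drive
# deep_trees_run_tests_scheme_for_update() above with a single test case.
# The case name and the choice of expected trees are invented for
# illustration; real callers usually pass several DeepTreesTestCase items
# with fully populated expected_* trees.
#
#   def tree_conflicts_on_update(sbox):
#     "local leaf edits vs. incoming tree deletes"
#     test_case = DeepTreesTestCase(
#       'local_leaf_edit_incoming_tree_del',
#       deep_trees_leaf_edit,                 # local action
#       deep_trees_tree_del,                  # incoming action
#       expected_output = deep_trees_conflict_output,
#       expected_status = deep_trees_status_local_leaf_edit)
#     deep_trees_run_tests_scheme_for_update(sbox, [test_case])
#
# deep_trees_skipping_on_update() above takes one such test case plus the
# lists of paths expected to be reported as 'Skipped' on a second update.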
for test_case in greater_scheme: try: base = j(sbox.wc_dir, test_case.name) os.makedirs(base) make_deep_trees(j(base, "local")) make_deep_trees(j(base, "incoming")) main.run_svn(None, 'add', base) except: print("ERROR IN: Tests scheme for switch: " + "while setting up deep trees in '%s'" % test_case.name) raise # 2) Commit initial state (-r2). main.run_svn(None, 'commit', '-m', 'initial state', wc_dir) # 3) Apply incoming changes for test_case in greater_scheme: try: test_case.incoming_action(j(sbox.wc_dir, test_case.name, "incoming")) except: print("ERROR IN: Tests scheme for switch: " + "while performing incoming action in '%s'" % test_case.name) raise # 4) Commit all changes (-r3). main.run_svn(None, 'commit', '-m', 'incoming changes', wc_dir) # 5) Apply local changes in their according subdirs. for test_case in greater_scheme: try: test_case.local_action(j(sbox.wc_dir, test_case.name, "local")) except: print("ERROR IN: Tests scheme for switch: " + "while performing local action in '%s'" % test_case.name) raise # 6) switch the local dir to the incoming url, conflicting with incoming # changes. A lot of different things are expected. # Do separate switch operations for each test case. for test_case in greater_scheme: try: local = j(wc_dir, test_case.name, "local") incoming = sbox.repo_url + "/" + test_case.name + "/incoming" x_out = test_case.expected_output if x_out != None: x_out = x_out.copy() x_out.wc_dir = local x_disk = test_case.expected_disk x_status = test_case.expected_status if x_status != None: x_status.copy() x_status.wc_dir = local run_and_verify_switch(local, local, incoming, x_out, x_disk, None, error_re_string = test_case.error_re_string) run_and_verify_unquiet_status(local, x_status) x_info = test_case.expected_info or {} for path in x_info: run_and_verify_info([x_info[path]], j(local, path)) except: print("ERROR IN: Tests scheme for switch: " + "while verifying in '%s'" % test_case.name) raise # 7) Verify that commit fails. for test_case in greater_scheme: try: local = j(wc_dir, test_case.name, 'local') x_status = test_case.expected_status if x_status != None: x_status.copy() x_status.wc_dir = local run_and_verify_commit(local, None, x_status, test_case.commit_block_string, local) except: print("ERROR IN: Tests scheme for switch: " + "while checking commit-blocking in '%s'" % test_case.name) raise def deep_trees_run_tests_scheme_for_merge(sbox, greater_scheme, do_commit_local_changes): """ Runs a given list of tests for conflicts occuring at a merge operation. This function wants to save time and perform a number of different test cases using just a single repository and performing just one commit for all test cases instead of one for each test case. 1) Each test case is initialized in a separate subdir. Each subdir initially contains another subdir, called "incoming", which contains a set of deep_trees. 2) A commit is performed across all test cases and depths. (a pre-initial state) 3) In each test case subdir, the "incoming" subdir is copied to "local", via the `svn copy' command. Each test case's subdir now has two sub- dirs: "local" and "incoming", initial states for the merge operation. 4) An update is performed across all test cases and depths, so that the copies made in 3) are pulled into the wc. 5) In each test case's "incoming" subdir, the incoming action is performed. 6) A commit is performed across all test cases and depths, to commit the incoming changes. If do_commit_local_changes is True, this becomes step 7 (swap steps). 
7) In each test case's "local" subdir, the local_action is performed. If do_commit_local_changes is True, this becomes step 6 (swap steps). Then, in effect, the local changes are committed as well. 8) In each test case subdir, the "incoming" subdir is merged into the "local" subdir. This causes conflicts between the "local" state in the working copy and the "incoming" state from the incoming subdir. 9) A commit is performed in each separate container, to verify that each tree-conflict indeed blocks a commit. The sbox parameter is just the sbox passed to a test function. No need to call sbox.build(), since it is called (once) within this function. The "table" greater_scheme models all of the different test cases that should be run using a single repository. greater_scheme is a list of DeepTreesTestCase items, which define complete test setups, so that they can be performed as described above. """ j = os.path.join sbox.build() wc_dir = sbox.wc_dir # 1) Create directories. for test_case in greater_scheme: try: base = j(sbox.wc_dir, test_case.name) os.makedirs(base) make_deep_trees(j(base, "incoming")) main.run_svn(None, 'add', base) except: print("ERROR IN: Tests scheme for merge: " + "while setting up deep trees in '%s'" % test_case.name) raise # 2) Commit pre-initial state (-r2). main.run_svn(None, 'commit', '-m', 'pre-initial state', wc_dir) # 3) Copy "incoming" to "local". for test_case in greater_scheme: try: base_url = sbox.repo_url + "/" + test_case.name incoming_url = base_url + "/incoming" local_url = base_url + "/local" main.run_svn(None, 'cp', incoming_url, local_url, '-m', 'copy incoming to local') except: print("ERROR IN: Tests scheme for merge: " + "while copying deep trees in '%s'" % test_case.name) raise # 4) Update to load all of the "/local" subdirs into the working copies. try: main.run_svn(None, 'up', sbox.wc_dir) except: print("ERROR IN: Tests scheme for merge: " + "while updating local subdirs") raise # 5) Perform incoming actions for test_case in greater_scheme: try: test_case.incoming_action(j(sbox.wc_dir, test_case.name, "incoming")) except: print("ERROR IN: Tests scheme for merge: " + "while performing incoming action in '%s'" % test_case.name) raise # 6) or 7) Commit all incoming actions if not do_commit_local_changes: try: main.run_svn(None, 'ci', '-m', 'Committing incoming actions', sbox.wc_dir) except: print("ERROR IN: Tests scheme for merge: " + "while committing incoming actions") raise # 7) or 6) Perform all local actions. for test_case in greater_scheme: try: test_case.local_action(j(sbox.wc_dir, test_case.name, "local")) except: print("ERROR IN: Tests scheme for merge: " + "while performing local action in '%s'" % test_case.name) raise # 6) or 7) Commit all incoming actions if do_commit_local_changes: try: main.run_svn(None, 'ci', '-m', 'Committing incoming and local actions', sbox.wc_dir) except: print("ERROR IN: Tests scheme for merge: " + "while committing incoming and local actions") raise # 8) Merge all "incoming" subdirs to their respective "local" subdirs. # This creates conflicts between the local changes in the "local" wc # subdirs and the incoming states committed in the "incoming" subdirs. 
for test_case in greater_scheme: try: local = j(sbox.wc_dir, test_case.name, "local") incoming = sbox.repo_url + "/" + test_case.name + "/incoming" x_out = test_case.expected_output if x_out != None: x_out = x_out.copy() x_out.wc_dir = local x_disk = test_case.expected_disk x_status = test_case.expected_status if x_status != None: x_status.copy() x_status.wc_dir = local x_skip = test_case.expected_skip if x_skip != None: x_skip.copy() x_skip.wc_dir = local run_and_verify_merge(local, None, None, incoming, None, x_out, None, None, x_disk, None, x_skip, error_re_string = test_case.error_re_string, dry_run = False) run_and_verify_unquiet_status(local, x_status) except: print("ERROR IN: Tests scheme for merge: " + "while verifying in '%s'" % test_case.name) raise # 9) Verify that commit fails. for test_case in greater_scheme: try: local = j(wc_dir, test_case.name, 'local') x_status = test_case.expected_status if x_status != None: x_status.copy() x_status.wc_dir = local run_and_verify_commit(local, None, x_status, test_case.commit_block_string, local) except: print("ERROR IN: Tests scheme for merge: " + "while checking commit-blocking in '%s'" % test_case.name) raise cvs2svn-2.4.0/svntest/factory.py0000664000076500007650000016173711434364627017763 0ustar mhaggermhagger00000000000000# # factory.py: Automatically generate a (near-)complete new cmdline test # from a series of shell commands. # # Subversion is a tool for revision control. # See http://subversion.tigris.org for more information. # # ==================================================================== # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. ###################################################################### ## HOW TO USE: # # (1) Edit the py test script you want to enhance (for example # cmdline/basic_tests.py), add a new test header as usual. # Insert a call to factory.make() into the empty test: # # def my_new_test(sbox): # "my new test modifies iota" # svntest.factory.make(sbox, """ # echo "foo" > A/D/foo # svn add A/D/foo # svn st # svn ci # """) # # (2) Add the test to the tests list at the bottom of the py file. # [...] # some_other_test, # my_new_test, # ] # # # (3) Run the test, paste the output back into your new test, # replacing the factory call. # # $ ./foo_tests.py my_new_test # OR # $ ./foo_tests.py my_new_test > new_test.snippet # OR # $ ./foo_tests.py my_new_test >> basic_tests.py # Then edit (e.g.) basic_tests.py to put the script in the right place. # # Ensure that the py script (e.g. basic_tests.py) has these imports, # so that the composed script that you pasted back finds everything # that it uses: # import os, shutil # from svntest import main, wc, actions, verify # # Be aware that you have to paste the result back to the .py file. 
# # Be more aware that you have to read every single line and understand # that it makes sense. If the current behaviour is wrong, you need to # make the changes to expect the correct behaviour and XFail() your test. # # factory.make() just probes the current situation and writes a test that # PASSES any success AND ANY FAILURE THAT IT FINDS. The resulting script # will never fail anything (if it works correctly), not even a failure. # # ### TODO: some sort of intelligent pasting directly into the # right place, like looking for the factory call, # inserting the new test there, back-up-ing the old file. # # # TROUBLESHOOTING # If the result has a problem somewhere along the middle, you can, # of course, only use the first part of the output result, maybe tweak # something, and continue with another factory.make() at the end of that. # # Or you can first do stuff to your sbox and then call factory on it. # Factory will notice if the sbox has already been built and calls # sbox.build() only if you didn't already. # # You can also have any number of factory.make() calls scattered # around "real" test code. # # Note that you can pass a prev_status and prev_disk to factory, to make # the expected_* trees re-use a pre-existing one in the test, entirely # for code beauty :P (has to match the wc_dir you will be using next). # # # YOU ARE CORDIALLY INVITED to add/tweak/change to your needs. # If you want to know what's going on, look at the switch() # funtion of TestFactory below. # # # DETAILS # ======= # # The input to factory.make(sbox, input) is not "real" shell-script. # Factory goes at great lengths to try and understand your script, it # parses common shell operations during tests and translates them. # # All arguments are tokenized similarly to shell, so if you need a space # in an argument, use quotes. # echo "my content" > A/new_file # Quote char escaping is done like this: # echo "my \\" content" > A/new_file # echo 'my \\' content' > A/new_file # If you use r""" echo 'my \' content' > A/new_file """ (triple quotes # with a leading 'r' character), you don't need to double-escape any # characters. # # You can either supply multiple lines, or separate the lines with ';'. # factory.make(sbox, 'echo foo > bar; svn add bar') # factory.make(sbox, 'echo foo > bar\n svn add bar') # factory.make(sbox, r""" # echo "foo\nbar" > bar # svn add bar # """) # # # WORKING COPY PATHS # - Factory will automatically build sbox.wc_dir if you didn't do so yet. # # - If you supply any path or file name, factory will prepend sbox.wc_dir # to it. # echo more >> iota # --> main.file_append( # os.path.join(sbox.wc_dir, 'iota'), # "more") # You can also do so explicitly. # echo more >> wc_dir/iota # --> main.file_append( # os.path.join(sbox.wc_dir, 'iota'), # "more") # # Factory implies the sbox.wc_dir if you fail to supply an explicit # working copy dir. If you want to supply one explicitly, you can # choose among these wildcards: # 'wc_dir', 'wcdir', '$WC_DIR', '$WC' -- all expanded to sbox.wc_dir # For example: # 'svn mkdir wc_dir/A/D/X' # But as long as you want to use only the default sbox.wc_dir, you usually # don't need to supply any wc_dir-wildcard: # 'mkdir A/X' creates the directory sbox.wc_dir/A/X # (Factory tries to know which arguments of the commands you supplied # are eligible to be path arguments. If something goes wrong here, try # to fix factory.py to not mistake the arg for something different. # You usually just need to tweak some parameters to args2svntest() to # achieve correct expansion.) 
# # - If you want to use a second (or Nth) working copy, just supply any # working copy wildcard with any made-up suffix, e.g. like this: # 'svn st wc_dir_2' or 'svn info $WC_2' # Factory will detect that you used another wc_dir and will automatically # add a corresponding directory to your sbox. The directory will initially # be nonexistent, so call 'mkdir', 'svn co' or 'cp' before using: # 'cp wc_dir wc_dir_other' -- a copy of the current WC # 'svn co $URL wc_dir_new' -- a clean checkout # 'mkdir wc_dir_empty' -- an empty directory # You can subsequently use any wc-dir wildcard with your suffix added. # # cp wc_dir wc_dir_2 # echo more >> wc_dir_2/iota # --> wc_dir_2 = sbox.add_wc_path('2') # shutil.copytrees(wc_dir, wc_dir_2) # main.file_append( # os.path.join(wc_dir_2, 'iota'), # "more") # # # URLs # Factory currently knows only one repository, thus only one repos root. # The wildcards you can use for it are: # 'url', '$URL' # A URL is not inserted automatically like wc_dir, you need to supply a # URL wildcard. # Alternatively, you can use '^/' URLs. However, that is in effect a different # test from an explicit entire URL. The test needs to chdir to the working # copy in order find which URL '^/' should expand to. # (currently, factory will chdir to sbox.wc_dir. It will only chdir # to another working copy if one of the other arguments involved a WC. # ### TODO add a 'cd wc_dir_2' command to select another WC as default.) # Example: # 'svn co $URL Y' -- make a new nested working copy in sbox.wc_dir/Y # 'svn co $URL wc_dir_2' -- create a new separate working copy # 'svn cp ^/A ^/X' -- do a URL copy, creating $URL/X (branch) # # # SOME EXAMPLES # These commands should work: # # - "svn " # Some subcommands are parsed specially, others by a catch-all default # parser (cmd_svn()), see switch(). # 'svn commit', 'svn commit --force', 'svn ci wc_dir_2' # 'svn copy url/A url/X' # # - "echo contents > file" (replace) # "echo contents >> file" (append) # Calls main.file_write() / main.file_append(). # 'echo "froogle" >> A/D/G/rho' -- append to an existing file # 'echo "bar" > A/quux' -- create a new file # 'echo "fool" > wc_dir_2/me' -- manipulate other working copies # # - "mkdir ..." # Calls os.makedirs(). # You probably want 'svn mkdir' instead, or use 'svn add' after this. # 'mkdir A/D/X' -- create an unversioned directory # 'mkdir wc_dir_5' -- create a new, empty working copy # # - "rm " # Calls main.safe_rmtree(). # You probably want to use 'svn delete' instead. # 'rm A/D/G' # 'rm wc_dir_2' # # - "mv [ ...] " # Calls shutil.move() # You probably want to use 'svn move' instead. # 'mv iota A/D/' -- move sbox.wc_dir/iota to sbox.wc_dir/A/D/. # # - "cp [ ...] " # Do a filesystem copy. # You probably want to use 'svn copy' instead. # 'cp wc_dir wc_dir_copy' # 'cp A/D/G A/X' # # IF YOU NEED ANY OTHER COMMANDS: # - first check if it doesn't work already. If not, # - add your desired commands to factory.py! :) # - alternatively, use a number of separate factory calls, doing what # you need done in "real" svntest language in-between. # # IF YOU REALLY DON'T GROK THIS: # - ask #svn-dev # - ask dev@ # - ask neels import sys, re, os, shutil, bisect, textwrap, shlex import svntest from svntest import main, actions, tree from svntest import Failure if sys.version_info[0] >= 3: # Python >=3.0 from io import StringIO else: # Python <3.0 from StringIO import StringIO def make(wc_dir, commands, prev_status=None, prev_disk=None, verbose=True): """The Factory Invocation Function. 
This is typically the only one called from outside this file. See top comment in factory.py. Prints the resulting py script to stdout when verbose is True and returns the resulting line-list containing items as: [ ['pseudo-shell input line #1', ' translation\n to\n py #1'], ...]""" fac = TestFactory(wc_dir, prev_status, prev_disk) fac.make(commands) fac.print_script() return fac.lines class TestFactory: """This class keeps all state around a factory.make() call.""" def __init__(self, sbox, prev_status=None, prev_disk=None): self.sbox = sbox # The input lines and their translations. # Each translation usually has multiple output lines ('\n' characters). self.lines = [] # [ ['in1', 'out1'], ['in2', 'out'], ... # Any expected_status still there from a previous verification self.prev_status = None if prev_status: self.prev_status = [None, prev_status] # svntest.wc.State # Any expected_disk still there from a previous verification self.prev_disk = None if prev_disk: self.prev_disk = [None, prev_disk] # svntest.wc.State # Those command line options that expect an argument following # which is not a path. (don't expand args following these) self.keep_args_of = ['--depth', '--encoding', '-r', '--changelist', '-m', '--message'] # A stack of $PWDs, to be able to chdir back after a chdir. self.prevdirs = [] # The python variables we want to be declared at the beginning. # These are path variables like "A_D = os.path.join(wc_dir, 'A', 'D')". # The original wc_dir and url vars are not kept here. self.vars = {} # An optimized list kept up-to-date by variable additions self.sorted_vars_by_pathlen = [] # Wether we ever used the variables 'wc_dir' and 'url' (tiny tweak) self.used_wc_dir = False self.used_url = False # The alternate working copy directories created that need to be # registered with sbox (are not inside another working copy). self.other_wc_dirs = {} def make(self, commands): "internal main function, delegates everything except final output." # keep a spacer for init self.add_line(None, None) init = "" if not self.sbox.is_built(): self.sbox.build() init += "sbox.build()\n" try: # split input args input_lines = commands.replace(';','\n').splitlines() for str in input_lines: if len(str.strip()) > 0: self.add_line(str) for i in range(len(self.lines)): if self.lines[i][0] is not None: # This is where everything happens: self.lines[i][1] = self.switch(self.lines[i][0]) # We're done. Add a final greeting. self.add_line( None, "Remember, this only saves you typing. 
Doublecheck everything.") # -- Insert variable defs in the first line -- # main wc_dir and url if self.used_wc_dir: init += 'wc_dir = sbox.wc_dir\n' if self.used_url: init += 'url = sbox.repo_url\n' # registration of new WC dirs sorted_names = self.get_sorted_other_wc_dir_names() for name in sorted_names: init += name + ' = ' + self.other_wc_dirs[name][0] + '\n' if len(init) > 0: init += '\n' # general variable definitions sorted_names = self.get_sorted_var_names() for name in sorted_names: init += name + ' = ' + self.vars[name][0] + '\n' # Insert at the first line, being the spacer from above if len(init) > 0: self.lines[0][1] = init # This usually goes to make() below (outside this class) return self.lines except: for line in self.lines: if line[1] is not None: print(line[1]) raise def print_script(self, stream=sys.stdout): "Output the resulting script of the preceding make() call" if self.lines is not None: for line in self.lines: if line[1] is None: # fall back to just that line as it was in the source stripped = line[0].strip() if not stripped.startswith('#'): # for comments, don't say this: stream.write(" # don't know how to handle:\n") stream.write(" " + line[0].strip() + '\n') else: if line[0] is not None: stream.write( wrap_each_line(line[0].strip(), " # ", " # ", True) + '\n') stream.write(wrap_each_line(line[1], " ", " ", False) + '\n\n') else: stream.write(" # empty.\n") stream.flush() # End of public functions. # "Shell" command handlers: def switch(self, line): "Given one input line, delegates to the appropriate sub-functions." args = shlex.split(line) if len(args) < 1: return "" first = args[0] # This is just an if-cascade. Feel free to change that. if first == 'svn': second = args[1] if second == 'add': return self.cmd_svn(args[1:], False, self.keep_args_of) if second in ['changelist', 'cl']: keep_count = 2 if '--remove' in args: keep_count = 1 return self.cmd_svn(args[1:], False, self.keep_args_of, keep_count) if second in ['status','stat','st']: return self.cmd_svn_status(args[2:]) if second in ['commit','ci']: return self.cmd_svn_commit(args[2:]) if second in ['update','up']: return self.cmd_svn_update(args[2:]) if second in ['copy', 'cp', 'move', 'mv', 'rename', 'ren']: return self.cmd_svn_copy_move(args[1:]) if second in ['checkout', 'co']: return self.cmd_svn_checkout(args[2:]) if second in ['propset','pset','ps']: return self.cmd_svn(args[1:], False, self.keep_args_of, 3) if second in ['delete','del','remove', 'rm']: return self.cmd_svn(args[1:], False, self.keep_args_of + ['--with-revprop']) # NOTE that not all commands need to be listed here, since # some are already adequately handled by self.cmd_svn(). # If you find yours is not, add another self.cmd_svn_xxx(). return self.cmd_svn(args[1:], False, self.keep_args_of) if first == 'echo': return self.cmd_echo(args[1:]) if first == 'mkdir': return self.cmd_mkdir(args[1:]) if first == 'rm': return self.cmd_rm(args[1:]) if first == 'mv': return self.cmd_mv(args[1:]) if first == 'cp': return self.cmd_cp(args[1:]) # if all fails, take the line verbatim return None def cmd_svn_standard_run(self, pyargs, runargs, do_chdir, wc): "The generic invocation of svn, helper function." 
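# Illustrative sketch, not part of the original file: driving TestFactory
# directly instead of going through make() above, e.g. to capture the
# generated script in a string rather than printing it to stdout.  `sbox`
# is the usual test sandbox; StringIO is imported near the top of this file.
#
#   fac = TestFactory(sbox)
#   fac.make('echo foo > A/new_file; svn add A/new_file')
#   buf = StringIO()
#   fac.print_script(stream=buf)
#   generated_py = buf.getvalue()  # the snippet you would paste into a test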
pychdir = self.chdir(do_chdir, wc) code, out, err = main.run_svn("Maybe", *runargs) if code == 0 and len(err) < 1: # write a test that expects success pylist = self.strlist2py(out) if len(out) <= 1: py = "expected_stdout = " + pylist + "\n\n" else: py = "expected_stdout = verify.UnorderedOutput(" + pylist + ")\n\n" py += pychdir py += "actions.run_and_verify_svn2('OUTPUT', expected_stdout, [], 0" else: # write a test that expects failure pylist = self.strlist2py(err) if len(err) <= 1: py = "expected_stderr = " + pylist + "\n\n" else: py = "expected_stderr = verify.UnorderedOutput(" + pylist + ")\n\n" py += pychdir py += ("actions.run_and_verify_svn2('OUTPUT', " + "[], expected_stderr, " + str(code)) if len(pyargs) > 0: py += ", " + ", ".join(pyargs) py += ")\n" py += self.chdir_back(do_chdir) return py def cmd_svn(self, svnargs, append_wc_dir_if_missing = False, keep_args_of = [], keep_first_count = 1, drop_with_arg = []): "Handles all svn calls not handled by more specific functions." pyargs, runargs, do_chdir, targets = self.args2svntest(svnargs, append_wc_dir_if_missing, keep_args_of, keep_first_count, drop_with_arg) return self.cmd_svn_standard_run(pyargs, runargs, do_chdir, self.get_first_wc(targets)) def cmd_svn_status(self, status_args): "Runs svn status, looks what happened and writes the script for it." pyargs, runargs, do_chdir, targets = self.args2svntest( status_args, True, self.keep_args_of, 0) py = "" for target in targets: if not target.wc: py += '# SKIPPING NON-WC ' + target.runarg + '\n' continue if '-q' in status_args: pystatus = self.get_current_status(target.wc, True) py += (pystatus + "actions.run_and_verify_status(" + target.wc.py + ", expected_status)\n") else: pystatus = self.get_current_status(target.wc, False) py += (pystatus + "actions.run_and_verify_unquiet_status(" + target.wc.py + ", expected_status)\n") return py def cmd_svn_commit(self, commit_args): "Runs svn commit, looks what happened and writes the script for it." # these are the options that are followed by something that should not # be parsed as a filename in the WC. commit_arg_opts = [ "--depth", "--with-revprop", "--changelist", # "-F", "--file", these take a file argument, don't list here. # "-m", "--message", treated separately ] pyargs, runargs, do_chdir, targets = self.args2svntest( commit_args, True, commit_arg_opts, 0, ['-m', '--message']) wc = self.get_first_wc(targets) pychdir = self.chdir(do_chdir, wc) code, output, err = main.run_svn("Maybe", 'ci', '-m', 'log msg', *runargs) if code == 0 and len(err) < 1: # write a test that expects success output = actions.process_output_for_commit(output) actual_out = tree.build_tree_from_commit(output) py = ("expected_output = " + self.tree2py(actual_out, wc) + "\n\n") pystatus = self.get_current_status(wc) py += pystatus py += pychdir py += ("actions.run_and_verify_commit(" + wc.py + ", " + "expected_output, expected_status, " + "None") else: # write a test that expects error py = "expected_error = " + self.strlist2py(err) + "\n\n" py += pychdir py += ("actions.run_and_verify_commit(" + wc.py + ", " + "None, None, expected_error") if len(pyargs) > 0: py += ', ' + ', '.join(pyargs) py += ")" py += self.chdir_back(do_chdir) return py def cmd_svn_update(self, update_args): "Runs svnn update, looks what happened and writes the script for it." 
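# Illustrative sketch, not part of the original file: roughly the kind of
# snippet cmd_svn_commit() above emits for a successful `svn ci`.  The paths
# and verbs are invented, and the exact expected_status lines depend on the
# probed working-copy state (see get_current_status()/tweaks2py()).
#
#   expected_output = svntest.wc.State(wc_dir, {
#     'A/mu' : Item(verb='Sending'),
#   })
#   expected_status.tweak('A/mu', wc_rev='2')
#   actions.run_and_verify_commit(wc_dir, expected_output, expected_status,
#                                 None)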
pyargs, runargs, do_chdir, targets = self.args2svntest( update_args, True, self.keep_args_of, 0) wc = self.get_first_wc(targets) pychdir = self.chdir(do_chdir, wc) code, output, err = main.run_svn('Maybe', 'up', *runargs) if code == 0 and len(err) < 1: # write a test that expects success actual_out = svntest.wc.State.from_checkout(output).old_tree() py = ("expected_output = " + self.tree2py(actual_out, wc) + "\n\n") pydisk = self.get_current_disk(wc) py += pydisk pystatus = self.get_current_status(wc) py += pystatus py += pychdir py += ("actions.run_and_verify_update(" + wc.py + ", " + "expected_output, expected_disk, expected_status, " + "None, None, None, None, None, False") else: # write a test that expects error py = "expected_error = " + self.strlist2py(err) + "\n\n" py += pychdir py += ("actions.run_and_verify_update(" + wc.py + ", None, None, " + "None, expected_error, None, None, None, None, False") if len(pyargs) > 0: py += ', ' + ', '.join(pyargs) py += ")" py += self.chdir_back(do_chdir) return py def cmd_svn_checkout(self, checkout_args): "Runs svn checkout, looks what happened and writes the script for it." pyargs, runargs, do_chdir, targets = self.args2svntest( checkout_args, True, self.keep_args_of, 0) # Sort out the targets. We need one URL and one dir, in that order. if len(targets) < 2: raise Failure("Sorry, I'm currently enforcing two targets for svn " + "checkout. If you want to supply less, remove this " + "check and implement whatever seems appropriate.") # We need this separate for the call to run_and_verify_checkout() # that's composed in the output script. wc_arg = targets[1] del pyargs[wc_arg.argnr] del runargs[wc_arg.argnr] url_arg = targets[0] del pyargs[url_arg.argnr] del runargs[url_arg.argnr] wc = wc_arg.wc pychdir = self.chdir(do_chdir, wc) if '--force' in runargs: self.really_safe_rmtree(wc_arg.runarg) code, output, err = main.run_svn('Maybe', 'co', url_arg.runarg, wc_arg.runarg, *runargs) py = "" if code == 0 and len(err) < 1: # write a test that expects success actual_out = tree.build_tree_from_checkout(output) pyout = ("expected_output = " + self.tree2py(actual_out, wc) + "\n\n") py += pyout pydisk = self.get_current_disk(wc) py += pydisk py += pychdir py += ("actions.run_and_verify_checkout(" + url_arg.pyarg + ", " + wc_arg.pyarg + ", expected_output, expected_disk, None, None, None, None") else: # write a test that expects failure pylist = self.strlist2py(err) if len(err) <= 1: py += "expected_stderr = " + pylist + "\n\n" else: py += "expected_stderr = verify.UnorderedOutput(" + pylist + ")\n\n" py += pychdir py += ("actions.run_and_verify_svn2('OUTPUT', " + "[], expected_stderr, " + str(code) + ", " + url_arg.pyarg + ", " + wc_arg.pyarg) # Append the remaining args if len(pyargs) > 0: py += ', ' + ', '.join(pyargs) py += ")" py += self.chdir_back(do_chdir) return py def cmd_svn_copy_move(self, args): "Runs svn copy or move, looks what happened and writes the script for it." pyargs, runargs, do_chdir, targets = self.args2svntest(args, False, self.keep_args_of, 1) if len(targets) == 2 and targets[1].is_url: # The second argument is a URL. # This needs a log message. Is one supplied? 
has_message = False for arg in runargs: if arg.startswith('-m') or arg == '--message': has_message = True break if not has_message: # add one runargs += [ '-m', 'copy log' ] pyargs = [] for arg in runargs: pyargs += [ self.str2svntest(arg) ] return self.cmd_svn_standard_run(pyargs, runargs, do_chdir, self.get_first_wc(targets)) def cmd_echo(self, echo_args): "Writes a string to a file and writes the script for it." # split off target target_arg = None replace = True contents = None for i in range(len(echo_args)): arg = echo_args[i] if arg.startswith('>'): if len(arg) > 1: if arg[1] == '>': # it's a '>>' replace = False arg = arg[2:] else: arg = arg[1:] if len(arg) > 0: target_arg = arg if target_arg is None: # we need an index (i+1) to exist, and # we need (i+1) to be the only existing index left in the list. if i+1 != len(echo_args)-1: raise Failure("don't understand: echo " + " ".join(echo_args)) target_arg = echo_args[i+1] else: # already got the target. no more indexes should exist. if i != len(echo_args)-1: raise Failure("don't understand: echo " + " ".join(echo_args)) contents = " ".join(echo_args[:i]) if target_arg is None: raise Failure("echo needs a '>' pipe to a file name: echo " + " ".join(echo_args)) target = self.path2svntest(target_arg) if replace: main.file_write(target.runarg, contents) py = "main.file_write(" else: main.file_append(target.runarg, contents) py = "main.file_append(" py += target.pyarg + ", " + self.str2svntest(contents) + ")" return py def cmd_mkdir(self, mkdir_args): "Makes a new directory and writes the script for it." # treat all mkdirs as -p, ignore all -options. out = "" for arg in mkdir_args: if not arg.startswith('-'): target = self.path2svntest(arg) # don't check for not being a url, # maybe it's desired by the test or something. os.makedirs(target.runarg) out += "os.makedirs(" + target.pyarg + ")\n" return out def cmd_rm(self, rm_args): "Removes a directory tree and writes the script for it." # treat all removes as -rf, ignore all -options. out = "" for arg in rm_args: if not arg.startswith('-'): target = self.path2svntest(arg) self.really_safe_rmtree(target.runarg) out += "main.safe_rmtree(" + target.pyarg + ")\n" return out def cmd_mv(self, mv_args): "Moves things in the filesystem and writes the script for it." # ignore all -options. out = "" sources = [] target = None for arg in mv_args: if not arg.startswith('-'): if target is not None: sources += [target] target = self.path2svntest(arg) out = "" for source in sources: out += "shutil.move(" + source.pyarg + ", " + target.pyarg + ")\n" shutil.move(source.runarg, target.runarg) return out def cmd_cp(self, mv_args): "Copies in the filesystem and writes the script for it." # ignore all -options. out = "" sources = [] target = None for arg in mv_args: if not arg.startswith('-'): if target is not None: sources += [target] target = self.path2svntest(arg) if not target: raise Failure("cp needs a source and a target 'cp wc_dir wc_dir_2'") out = "" for source in sources: if os.path.exists(target.runarg): raise Failure("cp target exists, remove first: " + target.pyarg) if os.path.isdir(source.runarg): shutil.copytree(source.runarg, target.runarg) out += "shutil.copytree(" + source.pyarg + ", " + target.pyarg + ")\n" elif os.path.isfile(source.runarg): shutil.copy2(source.runarg, target.runarg) out += "shutil.copy2(" + source.pyarg + ", " + target.pyarg + ")\n" else: raise Failure("cp copy source does not exist: " + source.pyarg) return out # End of "shell" command handling functions. 
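# Illustrative sketch, not part of the original file: a pseudo-shell script
# that exercises the handlers above (cmd_mkdir, cmd_echo, cmd_cp and the
# svn handlers).  The file and directory names are invented for
# illustration.
#
#   svntest.factory.make(sbox, """
#     mkdir A/X
#     echo "new text" > A/X/new_file
#     svn add A/X
#     cp wc_dir wc_dir_backup
#     svn st
#     svn ci
#     """)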
# Internal helpers: class WorkingCopy: "Defines the list of info we need around a working copy." def __init__(self, py, realpath, suffix): self.py = py self.realpath = realpath self.suffix = suffix class Target: "Defines the list of info we need around a command line supplied target." def __init__(self, pyarg, runarg, argnr, is_url=False, wc=None): self.pyarg = pyarg self.runarg = runarg self.argnr = argnr self.is_url = is_url self.wc = wc def add_line(self, args, translation=None): "Definition of how to add a new in/out line pair to LINES." self.lines += [ [args, translation] ] def really_safe_rmtree(self, dir): # Safety catch. We don't want to remove outside the sandbox. if dir.find('svn-test-work') < 0: raise Failure("Tried to remove path outside working area: " + dir) main.safe_rmtree(dir) def get_current_disk(self, wc): "Probes the given working copy and writes an expected_disk for it." actual_disk = svntest.wc.State.from_wc(wc.realpath, False, True) actual_disk.wc_dir = wc.realpath make_py, prev_disk = self.get_prev_disk(wc) # The tests currently compare SVNTreeNode trees, so let's do that too. actual_disk_tree = actual_disk.old_tree() prev_disk_tree = prev_disk.old_tree() # find out the tweaks tweaks = self.diff_trees(prev_disk_tree, actual_disk_tree, wc) if tweaks == 'Purge': make_py = '' else: tweaks = self.optimize_tweaks(tweaks, actual_disk_tree, wc) self.remember_disk(wc, actual_disk) pydisk = make_py + self.tweaks2py(tweaks, "expected_disk", wc) if len(pydisk) > 0: pydisk += '\n' return pydisk def get_prev_disk(self, wc): "Retrieves the last used expected_disk tree if any." make_py = "" # If a disk was supplied via __init__(), self.prev_disk[0] is set # to None, in which case we always use it, not checking WC. if self.prev_disk is None or \ not self.prev_disk[0] in [None, wc.realpath]: disk = svntest.main.greek_state.copy() disk.wc_dir = wc.realpath self.remember_disk(wc, disk) make_py = "expected_disk = svntest.main.greek_state.copy()\n" else: disk = self.prev_disk[1] return make_py, disk def remember_disk(self, wc, actual): "Remembers the current disk tree for future reference." self.prev_disk = [wc.realpath, actual] def get_current_status(self, wc, quiet=True): "Probes the given working copy and writes an expected_status for it." if quiet: code, output, err = main.run_svn(None, 'status', '-v', '-u', '-q', wc.realpath) else: code, output, err = main.run_svn(None, 'status', '-v', '-u', wc.realpath) if code != 0 or len(err) > 0: raise Failure("Hmm. `svn status' failed. What now.") make_py, prev_status = self.get_prev_status(wc) actual_status = svntest.wc.State.from_status(output) # The tests currently compare SVNTreeNode trees, so let's do that too. prev_status_tree = prev_status.old_tree() actual_status_tree = actual_status.old_tree() # Get the tweaks tweaks = self.diff_trees(prev_status_tree, actual_status_tree, wc) if tweaks == 'Purge': # The tree is empty (happens with invalid WC dirs) make_py = "expected_status = wc.State(" + wc.py + ", {})\n" tweaks = [] else: tweaks = self.optimize_tweaks(tweaks, actual_status_tree, wc) self.remember_status(wc, actual_status) pystatus = make_py + self.tweaks2py(tweaks, "expected_status", wc) if len(pystatus) > 0: pystatus += '\n' return pystatus def get_prev_status(self, wc): "Retrieves the last used expected_status tree if any." make_py = "" prev_status = None # re-use any previous status if we are still in the same WC dir. 
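# Illustrative sketch, not part of the original file: seeding the factory
# with pre-existing expected trees so that the prev_disk/prev_status
# handling of this class re-uses them instead of starting from the virginal
# state.  The commands are invented for illustration, and the supplied trees
# must match the wc_dir used next (see the module header).
#
#   expected_status = actions.get_virginal_state(sbox.wc_dir, 1)
#   expected_disk = svntest.main.greek_state.copy()
#   svntest.factory.make(sbox, 'svn mkdir A/X; svn ci',
#                        prev_status=expected_status,
#                        prev_disk=expected_disk)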
# If a status was supplied via __init__(), self.prev_status[0] is set # to None, in which case we always use it, not checking WC. if self.prev_status is None or \ not self.prev_status[0] in [None, wc.realpath]: # There is no or no matching previous status. Make new one. try: # If it's really a WC, use its base revision base_rev = actions.get_wc_base_rev(wc.realpath) except: # Else, just use zero. Whatever. base_rev = 0 prev_status = actions.get_virginal_state(wc.realpath, base_rev) make_py += ("expected_status = actions.get_virginal_state(" + wc.py + ", " + str(base_rev) + ")\n") else: # We will re-use the previous expected_status. prev_status = self.prev_status[1] # no need to make_py anything return make_py, prev_status def remember_status(self, wc, actual_status): "Remembers the current status tree for future reference." self.prev_status = [wc.realpath, actual_status] def chdir(self, do_chdir, wc): "Pushes the current dir onto the dir stack, does an os.chdir()." if not do_chdir: return "" self.prevdirs.append(os.getcwd()) os.chdir(wc.realpath) py = ("orig_dir = os.getcwd() # Need to chdir because of '^/' args\n" + "os.chdir(" + wc.py + ")\n") return py def chdir_back(self, do_chdir): "Does os.chdir() back to the directory popped from the dir stack's top." if not do_chdir: return "" # If this fails, there's a missing chdir() call: os.chdir(self.prevdirs.pop()) return "os.chdir(orig_dir)\n" def get_sorted_vars_by_pathlen(self): """Compose a listing of variable names to be expanded in script output. This is intended to be stored in self.sorted_vars_by_pathlen.""" list = [] for dict in [self.vars, self.other_wc_dirs]: for name in dict: runpath = dict[name][1] strlen = len(runpath) item = [strlen, name, runpath] bisect.insort(list, item) return list def get_sorted_var_names(self): """Compose a listing of variable names to be declared. This is used by TestFactory.make().""" paths = [] urls = [] for name in self.vars: if name.startswith('url_'): bisect.insort(urls, [name.lower(), name]) else: bisect.insort(paths, [name.lower(), name]) list = [] for path in paths: list += [path[1]] for url in urls: list += [url[1]] return list def get_sorted_other_wc_dir_names(self): """Compose a listing of working copies to be declared with sbox. This is used by TestFactory.make().""" list = [] for name in self.other_wc_dirs: bisect.insort(list, [name.lower(), name]) names = [] for item in list: names += [item[1]] return names def str2svntest(self, str): "Like str2py(), but replaces any known paths with variable names." if str is None: return "None" str = str2py(str) quote = str[0] def replace(str, path, name, quote): return str.replace(path, quote + " + " + name + " + " + quote) # We want longer paths first. for var in reversed(self.sorted_vars_by_pathlen): name = var[1] path = var[2] str = replace(str, path, name, quote) str = replace(str, self.sbox.wc_dir, 'wc_dir', quote) str = replace(str, self.sbox.repo_url, 'url', quote) # now remove trailing null-str adds: # '' + url_A_C + '' str = str.replace("'' + ",'').replace(" + ''",'') # "" + url_A_C + "" str = str.replace('"" + ',"").replace(' + ""',"") # just a stupid check. tiny tweak. (don't declare wc_dir and url # if they never appear) if not self.used_wc_dir: self.used_wc_dir = (re.search('\bwc_dir\b', str) is not None) if not self.used_url: self.used_url = str.find('url') >= 0 return str def strlist2py(self, list): "Given a list of strings, composes a py script that produces the same." 
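    # For illustration, a rough sketch of the intended output (assuming no
    # path or url substitutions apply to the given strings):
    #
    #   self.strlist2py(None)            ==>  "None"
    #   self.strlist2py([])              ==>  "[]"
    #   self.strlist2py(['one'])         ==>  "['one']"
    #   self.strlist2py(['one', 'two'])  ==>  a multi-line list literal with
    #                                         one quoted string per line,
    #                                         ready to be pasted into a
    #                                         generated test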
if list is None: return "None" if len(list) < 1: return "[]" if len(list) == 1: return "[" + self.str2svntest(list[0]) + "]" py = "[\n" for line in list: py += " " + self.str2svntest(line) + ",\n" py += "]" return py def get_node_path(self, node, wc): "Tries to return the node path relative to the given working copy." path = node.get_printable_path() if path.startswith(wc.realpath + os.sep): path = path[len(wc.realpath + os.sep):] elif path.startswith(wc.realpath): path = path[len(wc.realpath):] return path def node2py(self, node, wc, prepend="", drop_empties=True): "Creates a line like 'A/C' : Item({ ... }) for wc.State composition." buf = StringIO() node.print_script(buf, wc.realpath, prepend, drop_empties) return buf.getvalue() def tree2py(self, node, wc): "Writes the wc.State definition for the given SVNTreeNode in given WC." # svntest.wc.State(wc_dir, { # 'A/mu' : Item(verb='Sending'), # 'A/D/G/rho' : Item(verb='Sending'), # }) buf = StringIO() tree.dump_tree_script(node, stream=buf, subtree=wc.realpath, wc_varname=wc.py) return buf.getvalue() def diff_trees(self, left, right, wc): """Compares the two trees given by the SVNTreeNode instances LEFT and RIGHT in the given working copy and composes an internal list of tweaks necessary to make LEFT into RIGHT.""" if not right.children: return 'Purge' return self._diff_trees(left, right, wc) def _diff_trees(self, left, right, wc): "Used by self.diff_trees(). No need to call this. See there." # all tweaks collected tweaks = [] # the current tweak in composition path = self.get_node_path(left, wc) tweak = [] # node attributes if ((left.contents is None) != (right.contents is None)) or \ (left.contents != right.contents): tweak += [ ["contents", right.contents] ] for key in left.props: if key not in right.props: tweak += [ [key, None] ] elif left.props[key] != right.props[key]: tweak += [ [key, right.props[key]] ] for key in right.props: if key not in left.props: tweak += [ [key, right.props[key]] ] for key in left.atts: if key not in right.atts: tweak += [ [key, None] ] elif left.atts[key] != right.atts[key]: tweak += [ [key, right.atts[key]] ] for key in right.atts: if key not in left.atts: tweak += [ [key, right.atts[key]] ] if len(tweak) > 0: changetweak = [ 'Change', [path], tweak] tweaks += [changetweak] if left.children is not None: for leftchild in left.children: rightchild = None if right.children is not None: rightchild = tree.get_child(right, leftchild.name) if rightchild is None: paths = leftchild.recurse(lambda n: self.get_node_path(n, wc)) removetweak = [ 'Remove', paths ] tweaks += [removetweak] if right.children is not None: for rightchild in right.children: leftchild = None if left.children is not None: leftchild = tree.get_child(left, rightchild.name) if leftchild is None: paths_and_nodes = rightchild.recurse( lambda n: [ self.get_node_path(n, wc), n ] ) addtweak = [ 'Add', paths_and_nodes ] tweaks += [addtweak] else: tweaks += self._diff_trees(leftchild, rightchild, wc) return tweaks def optimize_tweaks(self, tweaks, actual_tree, wc): "Given an internal list of tweaks, make them optimal by common sense." 
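    # For clarity, a rough sketch of the internal "tweaks" format produced
    # by _diff_trees() above (the concrete paths and values here are only
    # illustrative).  Each entry is a verb, the affected paths, and, for
    # 'Change', a list of [name, value] modifications:
    #
    #   ['Remove', ['A/B/E/alpha', 'A/B/E/beta']]
    #   ['Add',    [['A/C/new', <SVNTreeNode instance>], ...]]
    #   ['Change', ['A/mu'], [['status', 'M '], ['wc_rev', '2']]]
    #
    # optimize_tweaks() merges and generalizes such entries; tweaks2py()
    # below renders them as expected_*.remove()/.add()/.tweak() calls.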
if tweaks == 'Purge': return tweaks subtree = actual_tree.find_node(wc.realpath) if not subtree: subtree = actual_tree remove_paths = [] additions = [] changes = [] for tweak in tweaks: if tweak[0] == 'Remove': remove_paths += tweak[1] elif tweak[0] == 'Add': additions += tweak[1] else: changes += [tweak] # combine removals removal = [] if len(remove_paths) > 0: removal = [ [ 'Remove', remove_paths] ] # combine additions addition = [] if len(additions) > 0: addition = [ [ 'Add', additions ] ] # find those changes that should be done on all nodes at once. def remove_mod(mod): for change in changes: if mod in change[2]: change[2].remove(mod) seen = [] tweak_all = [] for change in changes: tweak = change[2] for mod in tweak: if mod in seen: continue seen += [mod] # here we see each single "name=value" tweak in mod. # Check if the actual tree had this anyway all the way through. name = mod[0] val = mod[1] def check_node(node): if ( (name == 'contents' and node.contents == val) or (node.props and (name in node.props) and node.props[name] == val) or (node.atts and (name in node.atts) and node.atts[name] == val)): # has this same thing set. count on the left. return [node, None] return [None, node] results = subtree.recurse(check_node) have = [] havent = [] for result in results: if result[0]: have += [result[0]] else: havent += [result[1]] if havent == []: # ok, then, remove all tweaks that are like this, then # add a generic tweak. remove_mod(mod) tweak_all += [mod] elif len(havent) < len(have) * 3: # this is "an empirical factor" remove_mod(mod) tweak_all += [mod] # record the *other* nodes' actual item, overwritten above for node in havent: name = mod[0] if name == 'contents': value = node.contents elif name in node.props: value = node.props[name] elif name in node.atts: value = node.atts[name] else: continue changes += [ ['Change', [self.get_node_path(node, wc)], [[name, value]] ] ] # combine those paths that have exactly the same changes i = 0 j = 0 while i < len(changes): # find other changes that are identical j = i + 1 while j < len(changes): if changes[i][2] == changes[j][2]: changes[i][1] += changes[j][1] del changes[j] else: j += 1 i += 1 # combine those changes that have exactly the same paths i = 0 j = 0 while i < len(changes): # find other paths that are identical j = i + 1 while j < len(changes): if changes[i][1] == changes[j][1]: changes[i][2] += changes[j][2] del changes[j] else: j += 1 i += 1 if tweak_all != []: changes = [ ['Change', [], tweak_all ] ] + changes return removal + addition + changes def tweaks2py(self, tweaks, var_name, wc): "Given an internal list of tweaks, write the tweak script for it." 
py = "" if tweaks is None: return "" if tweaks == 'Purge': return var_name + " = wc.State(" + wc.py + ", {})\n" for tweak in tweaks: if tweak[0] == 'Remove': py += var_name + ".remove(" paths = tweak[1] py += self.str2svntest(paths[0]) for path in paths[1:]: py += ", " + self.str2svntest(path) py += ")\n" elif tweak[0] == 'Add': # add({'A/D/H/zeta' : Item(status=' ', wc_rev=9), ...}) py += var_name + ".add({" adds = tweak[1] for add in adds: path = add[0] node = add[1] py += self.node2py(node, wc, "\n ", False) py += "\n})\n" else: paths = tweak[1] mods = tweak[2] if mods != []: py += var_name + ".tweak(" for path in paths: py += self.str2svntest(path) + ", " def mod2py(mod): return mod[0] + "=" + self.str2svntest(mod[1]) py += mod2py(mods[0]) for mod in mods[1:]: py += ", " + mod2py(mod) py += ")\n" return py def path2svntest(self, path, argnr=None): """Given an input argument, do one hell of a path expansion on it. ARGNR is simply inserted into the resulting Target. Returns a self.Target instance. """ wc = self.WorkingCopy('wc_dir', self.sbox.wc_dir, None) url = self.sbox.repo_url # do we need multiple URLs too?? pathsep = '/' if path.find('/') < 0 and path.find('\\') >= 0: pathsep = '\\' is_url = False # If you add to these, make sure you add longer ones first, to # avoid e.g. '$WC_DIR' matching '$WC' first. wc_dir_wildcards = ['wc_dir', 'wcdir', '$WC_DIR', '$WC'] url_wildcards = ['url', '$URL'] first = path.split(pathsep, 1)[0] if first in wc_dir_wildcards: path = path[len(first):] elif first in url_wildcards: path = path[len(first):] is_url = True else: for url_scheme in ['^/', 'file:/', 'http:/', 'svn:/', 'svn+ssh:/']: if path.startswith(url_scheme): is_url = True # keep it as it is pyarg = self.str2svntest(path) runarg = path return self.Target(pyarg, runarg, argnr, is_url, None) for wc_dir_wildcard in wc_dir_wildcards: if first.startswith(wc_dir_wildcard): # The first path element starts with "wc_dir" (or similar), # but it has more attached to it. Like "wc_dir.2" or "wc_dir_other" # Record a new wc dir name. # try to figure out a nice suffix to pass to sbox. # (it will create a new dir called sbox.wc_dir + '.' + suffix) suffix = '' if first[len(wc_dir_wildcard)] in ['.','-','_']: # it's a separator already, don't duplicate the dot. (warm&fuzzy) suffix = first[len(wc_dir_wildcard) + 1:] if len(suffix) < 1: suffix = first[len(wc_dir_wildcard):] if len(suffix) < 1: raise Failure("no suffix supplied to other-wc_dir arg") # Streamline the var name suffix = suffix.replace('.','_').replace('-','_') other_wc_dir_varname = 'wc_dir_' + suffix path = path[len(first):] real_path = self.get_other_wc_real_path(other_wc_dir_varname, suffix, do_remove_on_new_wc_path) wc = self.WorkingCopy(other_wc_dir_varname, real_path, suffix) # found a match, no need to loop further, but still process # the path further. break if len(path) < 1 or path == pathsep: if is_url: self.used_url = True pyarg = 'url' runarg = url wc = None else: if wc.suffix is None: self.used_wc_dir = True pyarg = wc.py runarg = wc.realpath else: pathelements = split_remove_empty(path, pathsep) # make a new variable, if necessary if is_url: pyarg, runarg = self.ensure_url_var(pathelements) wc = None else: pyarg, runarg = self.ensure_path_var(wc, pathelements) return self.Target(pyarg, runarg, argnr, is_url, wc) def get_other_wc_real_path(self, varname, suffix, do_remove): "Create or retrieve the path of an alternate working copy." if varname in self.other_wc_dirs: return self.other_wc_dirs[varname][1] # else, we must still create one. 
path = self.sbox.add_wc_path(suffix, do_remove) py = "sbox.add_wc_path(" + str2py(suffix) if not do_remove: py += ", remove=False" py += ')' value = [py, path] self.other_wc_dirs[varname] = [py, path] self.sorted_vars_by_pathlen = self.get_sorted_vars_by_pathlen() return path def define_var(self, name, value): "Add a variable definition, don't allow redefinitions." # see if we already have this var if name in self.vars: if self.vars[name] != value: raise Failure("Variable name collision. Hm, fix factory.py?") # ok, it's recorded correctly. Nothing needs to happen. return # a new variable needs to be recorded self.vars[name] = value # update the sorted list of vars for substitution by str2svntest() self.sorted_vars_by_pathlen = self.get_sorted_vars_by_pathlen() def ensure_path_var(self, wc, pathelements): "Given a path in a working copy, make sure we have a variable for it." name = "_".join(pathelements) if wc.suffix is not None: # This is an "other" working copy (not the default). # The suffix of the wc_dir variable serves as the prefix: # wc_dir_other ==> other_A_D = os.path.join(wc_dir_other, 'A', 'D') name = wc.suffix + "_" + name if name[0].isdigit(): name = "_" + name else: self.used_wc_dir = True py = 'os.path.join(' + wc.py if len(pathelements) > 0: py += ", '" + "', '".join(pathelements) + "'" py += ')' wc_dir_real_path = wc.realpath run = os.path.join(wc_dir_real_path, *pathelements) value = [py, run] self.define_var(name, value) return name, run def ensure_url_var(self, pathelements): "Given a path in the test repository, ensure we have a url var for it." name = "url_" + "_".join(pathelements) joined = "/" + "/".join(pathelements) py = 'url' if len(pathelements) > 0: py += " + " + str2py(joined) self.used_url = True run = self.sbox.repo_url + joined value = [py, run] self.define_var(name, value) return name, run def get_first_wc(self, target_list): """In a list of Target instances, find the first one that is in a working copy and return that WorkingCopy. Default to sbox.wc_dir. This is useful if we need a working copy for a '^/' URL.""" for target in target_list: if target.wc: return target.wc return self.WorkingCopy('wc_dir', self.sbox.wc_dir, None) def args2svntest(self, args, append_wc_dir_if_missing = False, keep_args_of = [], keep_first_count = 1, drop_with_arg = []): """Tries to be extremely intelligent at parsing command line arguments. It needs to know which args are file targets that should be in a working copy. File targets are magically expanded. args: list of string tokens as passed to factory.make(), e.g. ['svn', 'commit', '--force', 'wc_dir2'] append_wc_dir_if_missing: It's a switch. keep_args_of: See TestFactory.keep_args_of (comment in __init__) keep_first_count: Don't expand the first N non-option args. This is used to preserve e.g. the token 'update' in '[svn] update wc_dir' (the 'svn' is usually split off before this function is called). drop_with_arg: list of string tokens that are commandline options with following argument which we want to drop from the list of args (e.g. -m message). """ wc_dir = self.sbox.wc_dir url = self.sbox.repo_url target_supplied = False pyargs = [] runargs = [] do_chdir = False targets = [] wc_dirs = [] i = 0 while i < len(args): arg = args[i] if arg in drop_with_arg: # skip this and the next arg if not arg.startswith('--') and len(arg) > 2: # it is a concatenated arg like -r123 instead of -r 123 # skip only this one. Do nothing. 
i = i else: # skip this and the next arg i += 1 elif arg.startswith('-'): # keep this option arg verbatim. pyargs += [ self.str2svntest(arg) ] runargs += [ arg ] # does this option expect a non-filename argument? # take that verbatim as well. if arg in keep_args_of: i += 1 if i < len(args): arg = args[i] pyargs += [ self.str2svntest(arg) ] runargs += [ arg ] elif keep_first_count > 0: # args still to be taken verbatim. pyargs += [ self.str2svntest(arg) ] runargs += [ arg ] keep_first_count -= 1 elif arg.startswith('^/'): # this is a ^/url, keep it verbatim. # if we use "^/", we need to chdir(wc_dir). do_chdir = True pyarg = str2py(arg) targets += [ self.Target(pyarg, arg, len(pyargs), True, None) ] pyargs += [ pyarg ] runargs += [ arg ] else: # well, then this must be a filename or url, autoexpand it. target = self.path2svntest(arg, argnr=len(pyargs)) pyargs += [ target.pyarg ] runargs += [ target.runarg ] target_supplied = True targets += [ target ] i += 1 if not target_supplied and append_wc_dir_if_missing: # add a simple wc_dir target self.used_wc_dir = True wc = self.WorkingCopy('wc_dir', wc_dir, None) targets += [ self.Target('wc_dir', wc_dir, len(pyargs), False, wc) ] pyargs += [ 'wc_dir' ] runargs += [ wc_dir ] return pyargs, runargs, do_chdir, targets ###### END of the TestFactory class ###### # Quotes-preserving text wrapping for output def find_quote_end(text, i): "In string TEXT, find the end of the qoute that starts at TEXT[i]" # don't handle """ quotes quote = text[i] i += 1 while i < len(text): if text[i] == '\\': i += 1 elif text[i] == quote: return i i += 1 return len(text) - 1 class MyWrapper(textwrap.TextWrapper): "A textwrap.TextWrapper that doesn't break a line within quotes." ### TODO regexes would be nice, maybe? def _split(self, text): parts = [] i = 0 start = 0 # This loop will break before and after each space, but keep # quoted strings in one piece. Example, breaks marked '/': # /(one,/ /two(blagger),/ /'three three three',)/ while i < len(text): if text[i] in ['"', "'"]: # handle """ quotes. (why, actually?) if text[i:i+3] == '"""': end = text[i+3:].find('"""') if end >= 0: i += end + 2 else: i = len(text) - 1 else: # handle normal quotes i = find_quote_end(text, i) elif text[i].isspace(): # split off previous section, if any if start < i: parts += [text[start:i]] start = i # split off this space parts += [text[i]] start = i + 1 i += 1 if start < len(text): parts += [text[start:]] return parts def wrap_each_line(str, ii, si, blw): """Wrap lines to a defined width (<80 chars). Feed the lines single to MyWrapper, so that it preserves the current line endings already in there. We only want to insert new wraps, not remove existing newlines.""" wrapper = MyWrapper(77, initial_indent=ii, subsequent_indent=si) lines = str.splitlines() for i in range(0,len(lines)): if lines[i] != '': lines[i] = wrapper.fill(lines[i]) return '\n'.join(lines) # Other miscellaneous helpers def sh2str(string): "un-escapes away /x sequences" if string is None: return None return string.decode("string-escape") def get_quote_style(str): """find which quote is the outer one, ' or ".""" quote_char = None at = None found = str.find("'") found2 = str.find('"') # If found == found2, both must be -1, so nothing was found. if found != found2: # If a quote was found if found >= 0 and found2 >= 0: # If both were found, invalidate the later one if found < found2: found2 = -1 else: found = -1 # See which one remains. 
if found >= 0: at = found + 1 quote_char = "'" elif found2 >= 0: at = found2 + 1 quote_char = '"' return quote_char, at def split_remove_empty(str, sep): "do a split, then remove empty elements." list = str.split(sep) return filter(lambda item: item and len(item) > 0, list) def str2py(str): "returns the string enclosed in quotes, suitable for py scripts." if str is None: return "None" # try to make a nice choice of quoting character if str.find("'") >= 0: return '"' + str.encode("string-escape" ).replace("\\'", "'" ).replace('"', '\\"') + '"' else: return "'" + str.encode("string-escape") + "'" return str ### End of file. cvs2svn-2.4.0/svntest/update.sh0000775000076500007650000000026311434364627017545 0ustar mhaggermhagger00000000000000#!/bin/sh set -ex # Update the svntest library from Subversion's subversion svn export --force http://svn.apache.org/repos/asf/subversion/trunk/subversion/tests/cmdline/svntest . cvs2svn-2.4.0/svntest/wc.py0000664000076500007650000007020011434364627016705 0ustar mhaggermhagger00000000000000# # wc.py: functions for interacting with a Subversion working copy # # Subversion is a tool for revision control. # See http://subversion.tigris.org for more information. # # ==================================================================== # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. ###################################################################### import os import sys import re import urllib import svntest # # 'status -v' output looks like this: # # "%c%c%c%c%c%c%c %c %6s %6s %-12s %s\n" # # (Taken from 'print_status' in subversion/svn/status.c.) # # Here are the parameters. The middle number or string in parens is the # match.group(), followed by a brief description of the field: # # - text status (1) (single letter) # - prop status (1) (single letter) # - wc-lockedness flag (2) (single letter: "L" or " ") # - copied flag (3) (single letter: "+" or " ") # - switched flag (4) (single letter: "S", "X" or " ") # - repos lock status (5) (single letter: "K", "O", "B", "T", " ") # - tree conflict flag (6) (single letter: "C" or " ") # # [one space] # # - out-of-date flag (7) (single letter: "*" or " ") # # [three spaces] # # - working revision ('wc_rev') (either digits or "-", "?" or " ") # # [one space] # # - last-changed revision (either digits or "?" or " ") # # [one space] # # - last author (optional string of non-whitespace # characters) # # [spaces] # # - path ('path') (string of characters until newline) # # Working revision, last-changed revision, and last author are whitespace # only if the item is missing. # _re_parse_status = re.compile('^([?!MACDRUGI_~ ][MACDRUG_ ])' '([L ])' '([+ ])' '([SX ])' '([KOBT ])' '([C ]) ' '([* ]) +' '((?P\d+|-|\?) +(\d|-|\?)+ +(\S+) +)?' 
'(?P.+)$') _re_parse_skipped = re.compile("^Skipped.* '(.+)'\n") _re_parse_summarize = re.compile("^([MAD ][M ]) (.+)\n") _re_parse_checkout = re.compile('^([RMAGCUDE_ ][MAGCUDE_ ])' '([B ])' '([C ])\s+' '(.+)') _re_parse_co_skipped = re.compile('^(Restored|Skipped)\s+\'(.+)\'') _re_parse_co_restored = re.compile('^(Restored)\s+\'(.+)\'') # Lines typically have a verb followed by whitespace then a path. _re_parse_commit = re.compile('^(\w+( \(bin\))?)\s+(.+)') class State: """Describes an existing or expected state of a working copy. The primary metaphor here is a dictionary of paths mapping to instances of StateItem, which describe each item in a working copy. Note: the paths should be *relative* to the root of the working copy, using '/' for the separator (see to_relpath()), and the root of the working copy is identified by the empty path: ''. """ def __init__(self, wc_dir, desc): "Create a State using the specified description." assert isinstance(desc, dict) self.wc_dir = wc_dir self.desc = desc # dictionary: path -> StateItem def add(self, more_desc): "Add more state items into the State." assert isinstance(more_desc, dict) self.desc.update(more_desc) def add_state(self, parent, state): "Import state items from a State object, reparent the items to PARENT." assert isinstance(state, State) if parent and parent[-1] != '/': parent += '/' for path, item in state.desc.items(): path = parent + path self.desc[path] = item def remove(self, *paths): "Remove a path from the state (the path must exist)." for path in paths: del self.desc[to_relpath(path)] def copy(self, new_root=None): """Make a deep copy of self. If NEW_ROOT is not None, then set the copy's wc_dir NEW_ROOT instead of to self's wc_dir.""" desc = { } for path, item in self.desc.items(): desc[path] = item.copy() if new_root is None: new_root = self.wc_dir return State(new_root, desc) def tweak(self, *args, **kw): """Tweak the items' values. Each argument in ARGS is the path of a StateItem that already exists in this State. Each keyword argument in KW is a modifiable property of StateItem. The general form of this method is .tweak([paths...,] key=value...). If one or more paths are provided, then those items' values are modified. If no paths are given, then all items are modified. """ if args: for path in args: try: path_ref = self.desc[to_relpath(path)] except KeyError, e: e.args = ["Path '%s' not present in WC state descriptor" % path] raise path_ref.tweak(**kw) else: for item in self.desc.values(): item.tweak(**kw) def tweak_some(self, filter, **kw): "Tweak the items for which the filter returns true." for path, item in self.desc.items(): if list(filter(path, item)): item.tweak(**kw) def subtree(self, subtree_path): """Return a State object which is a deep copy of the sub-tree identified by SUBTREE_PATH (which is assumed to contain only one element rooted at the tree of this State object's WC_DIR).""" desc = { } for path, item in self.desc.items(): path_elements = path.split("/") if len(path_elements) > 1 and path_elements[0] == subtree_path: desc["/".join(path_elements[1:])] = item.copy() return State(self.wc_dir, desc) def write_to_disk(self, target_dir): """Construct a directory structure on disk, matching our state. WARNING: any StateItem that does not have contents (.contents is None) is assumed to be a directory. 
""" if not os.path.exists(target_dir): os.makedirs(target_dir) for path, item in self.desc.items(): fullpath = os.path.join(target_dir, path) if item.contents is None: # a directory if not os.path.exists(fullpath): os.makedirs(fullpath) else: # a file # ensure its directory exists dirpath = os.path.dirname(fullpath) if not os.path.exists(dirpath): os.makedirs(dirpath) # write out the file contents now open(fullpath, 'wb').write(item.contents) def normalize(self): """Return a "normalized" version of self. A normalized version has the following characteristics: * wc_dir == '' * paths use forward slashes * paths are relative If self is already normalized, then it is returned. Otherwise, a new State is constructed with (shallow) references to self's StateItem instances. If the caller needs a fully disjoint State, then use .copy() on the result. """ if self.wc_dir == '': return self base = to_relpath(os.path.normpath(self.wc_dir)) desc = dict([(repos_join(base, path), item) for path, item in self.desc.items()]) return State('', desc) def compare(self, other): """Compare this State against an OTHER State. Three new set objects will be returned: CHANGED, UNIQUE_SELF, and UNIQUE_OTHER. These contain paths of StateItems that are different between SELF and OTHER, paths of items unique to SELF, and paths of item that are unique to OTHER, respectively. """ assert isinstance(other, State) norm_self = self.normalize() norm_other = other.normalize() # fast-path the easy case if norm_self == norm_other: fs = frozenset() return fs, fs, fs paths_self = set(norm_self.desc.keys()) paths_other = set(norm_other.desc.keys()) changed = set() for path in paths_self.intersection(paths_other): if norm_self.desc[path] != norm_other.desc[path]: changed.add(path) return changed, paths_self - paths_other, paths_other - paths_self def compare_and_display(self, label, other): """Compare this State against an OTHER State, and display differences. Information will be written to stdout, displaying any differences between the two states. LABEL will be used in the display. SELF is the "expected" state, and OTHER is the "actual" state. If any changes are detected/displayed, then SVNTreeUnequal is raised. """ norm_self = self.normalize() norm_other = other.normalize() changed, unique_self, unique_other = norm_self.compare(norm_other) if not changed and not unique_self and not unique_other: return # Use the shortest path as a way to find the "root-most" affected node. def _shortest_path(path_set): shortest = None for path in path_set: if shortest is None or len(path) < len(shortest): shortest = path return shortest if changed: path = _shortest_path(changed) display_nodes(label, path, norm_self.desc[path], norm_other.desc[path]) elif unique_self: path = _shortest_path(unique_self) default_singleton_handler('actual ' + label, path, norm_self.desc[path]) elif unique_other: path = _shortest_path(unique_other) default_singleton_handler('expected ' + label, path, norm_other.desc[path]) raise svntest.tree.SVNTreeUnequal def tweak_for_entries_compare(self): for path, item in self.desc.copy().items(): if item.status: # If this is an unversioned tree-conflict, remove it. # These are only in their parents' THIS_DIR, they don't have entries. if item.status[0] in '!?' and item.treeconflict == 'C': del self.desc[path] else: # when reading the entry structures, we don't examine for text or # property mods, so clear those flags. we also do not examine the # filesystem, so we cannot detect missing or obstructed files. 
if item.status[0] in 'M!~': item.status = ' ' + item.status[1] if item.status[1] == 'M': item.status = item.status[0] + ' ' # under wc-ng terms, we may report a different revision than the # backwards-compatible code should report. if there is a special # value for compatibility, then use it. if item.entry_rev is not None: item.wc_rev = item.entry_rev item.entry_rev = None if item.writelocked: # we don't contact the repository, so our only information is what # is in the working copy. 'K' means we have one and it matches the # repos. 'O' means we don't have one but the repos says the item # is locked by us, elsewhere. 'T' means we have one, and the repos # has one, but it is now owned by somebody else. 'B' means we have # one, but the repos does not. # # for each case of "we have one", set the writelocked state to 'K', # and clear it to None for the others. this will match what is # generated when we examine our working copy state. if item.writelocked in 'TB': item.writelocked = 'K' elif item.writelocked == 'O': item.writelocked = None def old_tree(self): "Return an old-style tree (for compatibility purposes)." nodelist = [ ] for path, item in self.desc.items(): nodelist.append(item.as_node_tuple(os.path.join(self.wc_dir, path))) tree = svntest.tree.build_generic_tree(nodelist) if 0: check = tree.as_state() if self != check: import pprint pprint.pprint(self.desc) pprint.pprint(check.desc) # STATE -> TREE -> STATE is lossy. # In many cases, TREE -> STATE -> TREE is not. # Even though our conversion from a TREE has lost some information, we # may be able to verify that our lesser-STATE produces the same TREE. svntest.tree.compare_trees('mismatch', tree, check.old_tree()) return tree def __str__(self): return str(self.old_tree()) def __eq__(self, other): if not isinstance(other, State): return False norm_self = self.normalize() norm_other = other.normalize() return norm_self.desc == norm_other.desc def __ne__(self, other): return not self.__eq__(other) @classmethod def from_status(cls, lines): """Create a State object from 'svn status' output.""" def not_space(value): if value and value != ' ': return value return None desc = { } for line in lines: if line.startswith('DBG:'): continue # Quit when we hit an externals status announcement. ### someday we can fix the externals tests to expect the additional ### flood of externals status data. 
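      # For illustration (a rough sketch, not an exact transcript of real
      # 'svn status' output): a verbose status line along the lines of
      #
      #   "M              2        2    jrandom      A/mu"
      #
      # is matched by _re_parse_status and recorded below as roughly
      #
      #   desc['A/mu'] = StateItem(status='M ', wc_rev='2')
      #
      # with locked/copied/switched/writelocked/treeconflict left as None
      # because their columns are blank.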
if line.startswith('Performing'): break match = _re_parse_status.search(line) if not match or match.group(10) == '-': # ignore non-matching lines, or items that only exist on repos continue item = StateItem(status=match.group(1), locked=not_space(match.group(2)), copied=not_space(match.group(3)), switched=not_space(match.group(4)), writelocked=not_space(match.group(5)), treeconflict=not_space(match.group(6)), wc_rev=not_space(match.group('wc_rev')), ) desc[to_relpath(match.group('path'))] = item return cls('', desc) @classmethod def from_skipped(cls, lines): """Create a State object from 'Skipped' lines.""" desc = { } for line in lines: if line.startswith('DBG:'): continue match = _re_parse_skipped.search(line) if match: desc[to_relpath(match.group(1))] = StateItem() return cls('', desc) @classmethod def from_summarize(cls, lines): """Create a State object from 'svn diff --summarize' lines.""" desc = { } for line in lines: if line.startswith('DBG:'): continue match = _re_parse_summarize.search(line) if match: desc[to_relpath(match.group(2))] = StateItem(status=match.group(1)) return cls('', desc) @classmethod def from_checkout(cls, lines, include_skipped=True): """Create a State object from 'svn checkout' lines.""" if include_skipped: re_extra = _re_parse_co_skipped else: re_extra = _re_parse_co_restored desc = { } for line in lines: if line.startswith('DBG:'): continue match = _re_parse_checkout.search(line) if match: if match.group(3) == 'C': treeconflict = 'C' else: treeconflict = None desc[to_relpath(match.group(4))] = StateItem(status=match.group(1), treeconflict=treeconflict) else: match = re_extra.search(line) if match: desc[to_relpath(match.group(2))] = StateItem(verb=match.group(1)) return cls('', desc) @classmethod def from_commit(cls, lines): """Create a State object from 'svn commit' lines.""" desc = { } for line in lines: if line.startswith('DBG:') or line.startswith('Transmitting'): continue match = _re_parse_commit.search(line) if match: desc[to_relpath(match.group(3))] = StateItem(verb=match.group(1)) return cls('', desc) @classmethod def from_wc(cls, base, load_props=False, ignore_svn=True): """Create a State object from a working copy. Walks the tree at PATH, building a State based on the actual files and directories found. If LOAD_PROPS is True, then the properties will be loaded for all nodes (Very Expensive!). If IGNORE_SVN is True, then the .svn subdirectories will be excluded from the State. """ if not base: # we're going to walk the base, and the OS wants "." base = '.' desc = { } dot_svn = svntest.main.get_admin_name() for dirpath, dirs, files in os.walk(base): parent = path_to_key(dirpath, base) if ignore_svn and dot_svn in dirs: dirs.remove(dot_svn) for name in dirs + files: node = os.path.join(dirpath, name) if os.path.isfile(node): contents = open(node, 'r').read() else: contents = None desc[repos_join(parent, name)] = StateItem(contents=contents) if load_props: paths = [os.path.join(base, to_ospath(p)) for p in desc.keys()] paths.append(base) all_props = svntest.tree.get_props(paths) for node, props in all_props.items(): if node == base: desc['.'] = StateItem(props=props) else: if base == '.': # 'svn proplist' strips './' from the paths. put it back on. node = os.path.join('.', node) desc[path_to_key(node, base)].props = props return cls('', desc) @classmethod def from_entries(cls, base): """Create a State object from a working copy, via the old "entries" API. 
Walks the tree at PATH, building a State based on the information provided by the old entries API, as accessed via the 'entries-dump' program. """ if not base: # we're going to walk the base, and the OS wants "." base = '.' if os.path.isfile(base): # a few tests run status on a single file. quick-and-dirty this. we # really should analyze the entry (similar to below) to be general. dirpath, basename = os.path.split(base) entries = svntest.main.run_entriesdump(dirpath) return cls('', { to_relpath(base): StateItem.from_entry(entries[basename]), }) desc = { } dot_svn = svntest.main.get_admin_name() for dirpath, dirs, files in os.walk(base): if dot_svn in dirs: # don't visit the .svn subdir dirs.remove(dot_svn) else: # this is not a versioned directory. remove all subdirectories since # we don't want to visit them. then skip this directory. dirs[:] = [] continue entries = svntest.main.run_entriesdump(dirpath) if dirpath == '.': parent = '' elif dirpath.startswith('.' + os.sep): parent = to_relpath(dirpath[2:]) else: parent = to_relpath(dirpath) parent_url = entries[''].url for name, entry in entries.items(): # if the entry is marked as DELETED *and* it is something other than # schedule-add, then skip it. we can add a new node "over" where a # DELETED node lives. if entry.deleted and entry.schedule != 1: continue if name and entry.kind == 2: # stub subdirectory. leave a "missing" StateItem in here. note # that we can't put the status as "! " because that gets tweaked # out of our expected tree. item = StateItem(status=' ', wc_rev='?') desc[repos_join(parent, name)] = item continue item = StateItem.from_entry(entry) if name: desc[repos_join(parent, name)] = item implied_url = repos_join(parent_url, svn_url_quote(name)) else: item._url = entry.url # attach URL to directory StateItems desc[parent] = item grandpa, this_name = repos_split(parent) if grandpa in desc: implied_url = repos_join(desc[grandpa]._url, svn_url_quote(this_name)) else: implied_url = None if implied_url and implied_url != entry.url: item.switched = 'S' # only recurse into directories found in this entries. remove any # which are not mentioned. unmentioned = set(dirs) - set(entries.keys()) for subdir in unmentioned: dirs.remove(subdir) return cls('', desc) class StateItem: """Describes an individual item within a working copy. Note that the location of this item is not specified. An external mechanism, such as the State class, will provide location information for each item. """ def __init__(self, contents=None, props=None, status=None, verb=None, wc_rev=None, entry_rev=None, locked=None, copied=None, switched=None, writelocked=None, treeconflict=None): # provide an empty prop dict if it wasn't provided if props is None: props = { } ### keep/make these ints one day? if wc_rev is not None: wc_rev = str(wc_rev) # Any attribute can be None if not relevant, unless otherwise stated. # A string of content (if the node is a file). self.contents = contents # A dictionary mapping prop name to prop value; never None. self.props = props # A two-character string from the first two columns of 'svn status'. self.status = status # The action word such as 'Adding' printed by commands like 'svn update'. self.verb = verb # The base revision number of the node in the WC, as a string. self.wc_rev = wc_rev # This one will be set when we expect the wc_rev to differ from the one # found ni the entries code. 
self.entry_rev = entry_rev # For the following attributes, the value is the status character of that # field from 'svn status', except using value None instead of status ' '. self.locked = locked self.copied = copied self.switched = switched self.writelocked = writelocked # Value 'C' or ' ', or None as an expected status meaning 'do not check'. self.treeconflict = treeconflict def copy(self): "Make a deep copy of self." new = StateItem() vars(new).update(vars(self)) new.props = self.props.copy() return new def tweak(self, **kw): for name, value in kw.items(): # Refine the revision args (for now) to ensure they are strings. if value is not None and name == 'wc_rev': value = str(value) setattr(self, name, value) def __eq__(self, other): if not isinstance(other, StateItem): return False v_self = dict([(k, v) for k, v in vars(self).items() if not k.startswith('_')]) v_other = dict([(k, v) for k, v in vars(other).items() if not k.startswith('_')]) if self.treeconflict is None: v_other = v_other.copy() v_other['treeconflict'] = None if other.treeconflict is None: v_self = v_self.copy() v_self['treeconflict'] = None return v_self == v_other def __ne__(self, other): return not self.__eq__(other) def as_node_tuple(self, path): atts = { } if self.status is not None: atts['status'] = self.status if self.verb is not None: atts['verb'] = self.verb if self.wc_rev is not None: atts['wc_rev'] = self.wc_rev if self.locked is not None: atts['locked'] = self.locked if self.copied is not None: atts['copied'] = self.copied if self.switched is not None: atts['switched'] = self.switched if self.writelocked is not None: atts['writelocked'] = self.writelocked if self.treeconflict is not None: atts['treeconflict'] = self.treeconflict return (os.path.normpath(path), self.contents, self.props, atts) @classmethod def from_entry(cls, entry): status = ' ' if entry.schedule == 1: # svn_wc_schedule_add status = 'A ' elif entry.schedule == 2: # svn_wc_schedule_delete status = 'D ' elif entry.schedule == 3: # svn_wc_schedule_replace status = 'R ' elif entry.conflict_old: ### I'm assuming we only need to check one, rather than all conflict_* status = 'C ' ### is this the sufficient? guessing here w/o investigation. if entry.prejfile: status = status[0] + 'C' if entry.locked: locked = 'L' else: locked = None if entry.copied: wc_rev = '-' copied = '+' else: if entry.revision == -1: wc_rev = '?' else: wc_rev = entry.revision copied = None ### figure out switched switched = None if entry.lock_token: writelocked = 'K' else: writelocked = None return cls(status=status, wc_rev=wc_rev, locked=locked, copied=copied, switched=switched, writelocked=writelocked, ) if os.sep == '/': to_relpath = to_ospath = lambda path: path else: def to_relpath(path): """Return PATH but with all native path separators changed to '/'.""" return path.replace(os.sep, '/') def to_ospath(path): """Return PATH but with each '/' changed to the native path separator.""" return path.replace('/', os.sep) def path_to_key(path, base): """Return the relative path that represents the absolute path PATH under the absolute path BASE. PATH must be a path under BASE. The returned path has '/' separators.""" if path == base: return '' if base.endswith(os.sep) or base.endswith('/') or base.endswith(':'): # Special path format on Windows: # 'C:/' Is a valid root which includes its separator ('C:/file') # 'C:' is a valid root which isn't followed by a separator ('C:file') # # In this case, we don't need a separator between the base and the path. 
pass else: # Account for a separator between the base and the relpath we're creating base += os.sep assert path.startswith(base), "'%s' is not a prefix of '%s'" % (base, path) return to_relpath(path[len(base):]) def repos_split(repos_relpath): """Split a repos path into its directory and basename parts.""" idx = repos_relpath.rfind('/') if idx == -1: return '', repos_relpath return repos_relpath[:idx], repos_relpath[idx+1:] def repos_join(base, path): """Join two repos paths. This generally works for URLs too.""" if base == '': return path if path == '': return base return base + '/' + path def svn_url_quote(url): # svn defines a different set of "safe" characters than Python does, so # we need to avoid escaping them. see subr/path.c:uri_char_validity[] return urllib.quote(url, "!$&'()*+,-./:=@_~") # ------------ def text_base_path(file_path): """Return the path to the text-base file for the versioned file FILE_PATH.""" dot_svn = svntest.main.get_admin_name() return os.path.join(os.path.dirname(file_path), dot_svn, 'text-base', os.path.basename(file_path) + '.svn-base') # ------------ ### probably toss these at some point. or major rework. or something. ### just bootstrapping some changes for now. # def item_to_node(path, item): tree = svntest.tree.build_generic_tree([item.as_node_tuple(path)]) while tree.children: assert len(tree.children) == 1 tree = tree.children[0] return tree ### yanked from tree.compare_trees() def display_nodes(label, path, expected, actual): 'Display two nodes, expected and actual.' expected = item_to_node(path, expected) actual = item_to_node(path, actual) print("=============================================================") print("Expected '%s' and actual '%s' in %s tree are different!" % (expected.name, actual.name, label)) print("=============================================================") print("EXPECTED NODE TO BE:") print("=============================================================") expected.pprint() print("=============================================================") print("ACTUAL NODE FOUND:") print("=============================================================") actual.pprint() ### yanked from tree.py def default_singleton_handler(description, path, item): node = item_to_node(path, item) print("Couldn't find node '%s' in %s tree" % (node.name, description)) node.pprint() raise svntest.tree.SVNTreeUnequal cvs2svn-2.4.0/svntest/verify.py0000664000076500007650000002711511434364627017607 0ustar mhaggermhagger00000000000000# # verify.py: routines that handle comparison and display of expected # vs. actual output # # Subversion is a tool for revision control. # See http://subversion.tigris.org for more information. # # ==================================================================== # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. 
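# For illustration, a rough sketch of how the helpers in this module are
# typically combined (the names refer to functions and classes defined
# below; the literal lines are only examples):
#
#   # exact, ordered comparison of stdout lines
#   compare_and_display_lines("commit output", "STDOUT",
#                             ['Adding         foo\n'], actual_stdout)
#
#   # unordered comparison
#   expected = UnorderedOutput(['A  alpha\n', 'A  beta\n'])
#   if not expected.matches(actual_lines):
#     expected.display_differences("update output", "STDOUT", actual_lines)
#
# A failed compare_and_display_lines() raises svntest.main.SVNLineUnequal
# by default.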
###################################################################### import re, sys import svntest ###################################################################### # Exception types class SVNUnexpectedOutput(svntest.Failure): """Exception raised if an invocation of svn results in unexpected output of any kind.""" pass class SVNUnexpectedStdout(SVNUnexpectedOutput): """Exception raised if an invocation of svn results in unexpected output on STDOUT.""" pass class SVNUnexpectedStderr(SVNUnexpectedOutput): """Exception raised if an invocation of svn results in unexpected output on STDERR.""" pass class SVNExpectedStdout(SVNUnexpectedOutput): """Exception raised if an invocation of svn results in no output on STDOUT when output was expected.""" pass class SVNExpectedStderr(SVNUnexpectedOutput): """Exception raised if an invocation of svn results in no output on STDERR when output was expected.""" pass class SVNUnexpectedExitCode(SVNUnexpectedOutput): """Exception raised if an invocation of svn exits with a value other than what was expected.""" pass class SVNIncorrectDatatype(SVNUnexpectedOutput): """Exception raised if invalid input is passed to the run_and_verify_* API""" pass ###################################################################### # Comparison of expected vs. actual output def createExpectedOutput(expected, output_type, match_all=True): """Return EXPECTED, promoted to an ExpectedOutput instance if not None. Raise SVNIncorrectDatatype if the data type of EXPECTED is not handled.""" if isinstance(expected, list): expected = ExpectedOutput(expected) elif isinstance(expected, str): expected = RegexOutput(expected, match_all) elif expected is AnyOutput: expected = AnyOutput() elif expected is not None and not isinstance(expected, ExpectedOutput): raise SVNIncorrectDatatype("Unexpected type for '%s' data" % output_type) return expected class ExpectedOutput: """Contains expected output, and performs comparisons.""" is_regex = False is_unordered = False def __init__(self, output, match_all=True): """Initialize the expected output to OUTPUT which is a string, or a list of strings, or None meaning an empty list. If MATCH_ALL is True, the expected strings will be matched with the actual strings, one-to-one, in the same order. If False, they will be matched with a subset of the actual strings, one-to-one, in the same order, ignoring any other actual strings among the matching ones.""" self.output = output self.match_all = match_all def __str__(self): return str(self.output) def __cmp__(self, other): raise 'badness' def matches(self, other): """Return whether SELF.output matches OTHER (which may be a list of newline-terminated lines, or a single string). Either value may be None.""" if self.output is None: expected = [] else: expected = self.output if other is None: actual = [] else: actual = other if not isinstance(actual, list): actual = [actual] if not isinstance(expected, list): expected = [expected] return self.is_equivalent_list(expected, actual) def is_equivalent_list(self, expected, actual): "Return whether EXPECTED and ACTUAL are equivalent." if not self.is_regex: if self.match_all: # The EXPECTED lines must match the ACTUAL lines, one-to-one, in # the same order. return expected == actual # The EXPECTED lines must match a subset of the ACTUAL lines, # one-to-one, in the same order, with zero or more other ACTUAL # lines interspersed among the matching ACTUAL lines. 
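      # For illustration (a rough sketch): with expected = ['A\n', 'B\n'],
      #
      #   match_all=True  requires actual == ['A\n', 'B\n'] exactly, while
      #   match_all=False also accepts e.g. ['x\n', 'A\n', 'y\n', 'B\n'],
      #
      # i.e. the expected lines only have to appear, in order, as a
      # subsequence of the actual lines.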
i_expected = 0 for actual_line in actual: if expected[i_expected] == actual_line: i_expected += 1 if i_expected == len(expected): return True return False expected_re = expected[0] # If we want to check that every line matches the regexp # assume they all match and look for any that don't. If # only one line matching the regexp is enough, assume none # match and look for even one that does. if self.match_all: all_lines_match_re = True else: all_lines_match_re = False # If a regex was provided assume that we actually require # some output. Fail if we don't have any. if len(actual) == 0: return False for actual_line in actual: if self.match_all: if not re.match(expected_re, actual_line): return False else: # As soon an actual_line matches something, then we're good. if re.match(expected_re, actual_line): return True return all_lines_match_re def display_differences(self, message, label, actual): """Delegate to the display_lines() routine with the appropriate args. MESSAGE is ignored if None.""" display_lines(message, label, self.output, actual, self.is_regex, self.is_unordered) class AnyOutput(ExpectedOutput): def __init__(self): ExpectedOutput.__init__(self, None, False) def is_equivalent_list(self, ignored, actual): if len(actual) == 0: # No actual output. No match. return False for line in actual: # If any line has some text, then there is output, so we match. if line: return True # We did not find a line with text. No match. return False def display_differences(self, message, label, actual): if message: print(message) class RegexOutput(ExpectedOutput): is_regex = True class UnorderedOutput(ExpectedOutput): """Marks unordered output, and performs comparisons.""" is_unordered = True def __cmp__(self, other): raise 'badness' def is_equivalent_list(self, expected, actual): "Disregard the order of ACTUAL lines during comparison." e_set = set(expected) a_set = set(actual) if self.match_all: if len(e_set) != len(a_set): return False if self.is_regex: for expect_re in e_set: for actual_line in a_set: if re.match(expect_re, actual_line): a_set.remove(actual_line) break else: # One of the regexes was not found return False return True # All expected lines must be in the output. return e_set == a_set if self.is_regex: # If any of the expected regexes are in the output, then we match. for expect_re in e_set: for actual_line in a_set: if re.match(expect_re, actual_line): return True return False # If any of the expected lines are in the output, then we match. return len(e_set.intersection(a_set)) > 0 class UnorderedRegexOutput(UnorderedOutput, RegexOutput): is_regex = True is_unordered = True ###################################################################### # Displaying expected and actual output def display_trees(message, label, expected, actual): 'Print two trees, expected and actual.' if message is not None: print(message) if expected is not None: print('EXPECTED %s:' % label) svntest.tree.dump_tree(expected) if actual is not None: print('ACTUAL %s:' % label) svntest.tree.dump_tree(actual) def display_lines(message, label, expected, actual, expected_is_regexp=None, expected_is_unordered=None): """Print MESSAGE, unless it is None, then print EXPECTED (labeled with LABEL) followed by ACTUAL (also labeled with LABEL). 
Both EXPECTED and ACTUAL may be strings or lists of strings.""" if message is not None: print(message) if expected is not None: output = 'EXPECTED %s' % label if expected_is_regexp: output += ' (regexp)' if expected_is_unordered: output += ' (unordered)' output += ':' print(output) for x in expected: sys.stdout.write(x) if expected_is_regexp: sys.stdout.write('\n') if actual is not None: print('ACTUAL %s:' % label) for x in actual: sys.stdout.write(x) def compare_and_display_lines(message, label, expected, actual, raisable=None): """Compare two sets of output lines, and print them if they differ, preceded by MESSAGE iff not None. EXPECTED may be an instance of ExpectedOutput (and if not, it is wrapped as such). RAISABLE is an exception class, an instance of which is thrown if ACTUAL doesn't match EXPECTED.""" if raisable is None: raisable = svntest.main.SVNLineUnequal ### It'd be nicer to use createExpectedOutput() here, but its ### semantics don't match all current consumers of this function. if not isinstance(expected, ExpectedOutput): expected = ExpectedOutput(expected) if isinstance(actual, str): actual = [actual] actual = [line for line in actual if not line.startswith('DBG:')] if not expected.matches(actual): expected.display_differences(message, label, actual) raise raisable def verify_outputs(message, actual_stdout, actual_stderr, expected_stdout, expected_stderr, all_stdout=True): """Compare and display expected vs. actual stderr and stdout lines: if they don't match, print the difference (preceded by MESSAGE iff not None) and raise an exception. If EXPECTED_STDERR or EXPECTED_STDOUT is a string the string is interpreted as a regular expression. For EXPECTED_STDOUT and ACTUAL_STDOUT to match, every line in ACTUAL_STDOUT must match the EXPECTED_STDOUT regex, unless ALL_STDOUT is false. For EXPECTED_STDERR regexes only one line in ACTUAL_STDERR need match.""" expected_stderr = createExpectedOutput(expected_stderr, 'stderr', False) expected_stdout = createExpectedOutput(expected_stdout, 'stdout', all_stdout) for (actual, expected, label, raisable) in ( (actual_stderr, expected_stderr, 'STDERR', SVNExpectedStderr), (actual_stdout, expected_stdout, 'STDOUT', SVNExpectedStdout)): if expected is None: continue if isinstance(expected, RegexOutput): raisable = svntest.main.SVNUnmatchedError elif not isinstance(expected, AnyOutput): raisable = svntest.main.SVNLineUnequal compare_and_display_lines(message, label, expected, actual, raisable) def verify_exit_code(message, actual, expected, raisable=SVNUnexpectedExitCode): """Compare and display expected vs. actual exit codes: if they don't match, print the difference (preceded by MESSAGE iff not None) and raise an exception.""" if expected != actual: display_lines(message, "Exit Code", str(expected) + '\n', str(actual) + '\n') raise raisable cvs2svn-2.4.0/svntest/main.py0000664000076500007650000015153611434364627017234 0ustar mhaggermhagger00000000000000# # main.py: a shared, automated test suite for Subversion # # Subversion is a tool for revision control. # See http://subversion.tigris.org for more information. # # ==================================================================== # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. 
The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. ###################################################################### import sys # for argv[] import os import shutil # for rmtree() import re import stat # for ST_MODE import subprocess import copy # for deepcopy() import time # for time() import traceback # for print_exc() import threading import optparse # for argument parsing try: # Python >=3.0 import queue from urllib.parse import quote as urllib_parse_quote from urllib.parse import unquote as urllib_parse_unquote except ImportError: # Python <3.0 import Queue as queue from urllib import quote as urllib_parse_quote from urllib import unquote as urllib_parse_unquote import svntest from svntest import Failure from svntest import Skip ###################################################################### # # HOW TO USE THIS MODULE: # # Write a new python script that # # 1) imports this 'svntest' package # # 2) contains a number of related 'test' routines. (Each test # routine should take no arguments, and return None on success # or throw a Failure exception on failure. Each test should # also contain a short docstring.) # # 3) places all the tests into a list that begins with None. # # 4) calls svntest.main.client_test() on the list. # # Also, your tests will probably want to use some of the common # routines in the 'Utilities' section below. # ##################################################################### # Global stuff class SVNProcessTerminatedBySignal(Failure): "Exception raised if a spawned process segfaulted, aborted, etc." pass class SVNLineUnequal(Failure): "Exception raised if two lines are unequal" pass class SVNUnmatchedError(Failure): "Exception raised if an expected error is not found" pass class SVNCommitFailure(Failure): "Exception raised if a commit failed" pass class SVNRepositoryCopyFailure(Failure): "Exception raised if unable to copy a repository" pass class SVNRepositoryCreateFailure(Failure): "Exception raised if unable to create a repository" pass # Windows specifics if sys.platform == 'win32': windows = True file_scheme_prefix = 'file:///' _exe = '.exe' _bat = '.bat' os.environ['SVN_DBG_STACKTRACES_TO_STDERR'] = 'y' else: windows = False file_scheme_prefix = 'file://' _exe = '' _bat = '' # The location of our mock svneditor script. if windows: svneditor_script = os.path.join(sys.path[0], 'svneditor.bat') else: svneditor_script = os.path.join(sys.path[0], 'svneditor.py') # Username and password used by the working copies wc_author = 'jrandom' wc_passwd = 'rayjandom' # Username and password used by the working copies for "second user" # scenarios wc_author2 = 'jconstant' # use the same password as wc_author # Set C locale for command line programs os.environ['LC_ALL'] = 'C' # This function mimics the Python 2.3 urllib function of the same name. def pathname2url(path): """Convert the pathname PATH from the local syntax for a path to the form used in the path component of a URL. This does not produce a complete URL. 
  The return value will already be quoted using the quote() function."""

  # Don't leave ':' in file://C%3A/ escaped as our canonicalization
  # rules will replace this with a ':' on input.
  return urllib_parse_quote(path.replace('\\', '/')).replace('%3A', ':')

# This function mimics the Python 2.3 urllib function of the same name.
def url2pathname(path):
  """Convert the path component PATH from an encoded URL to the local
  syntax for a path.  This does not accept a complete URL.

  This function uses unquote() to decode PATH."""

  return os.path.normpath(urllib_parse_unquote(path))

######################################################################
# The locations of the svn, svnadmin and svnlook binaries, relative to
# the only scripts that import this file right now (they live in ../).
# Use --bin to override these defaults.
svn_binary = os.path.abspath('../../svn/svn' + _exe)
svnadmin_binary = os.path.abspath('../../svnadmin/svnadmin' + _exe)
svnlook_binary = os.path.abspath('../../svnlook/svnlook' + _exe)
svnsync_binary = os.path.abspath('../../svnsync/svnsync' + _exe)
svnversion_binary = os.path.abspath('../../svnversion/svnversion' + _exe)
svndumpfilter_binary = os.path.abspath('../../svndumpfilter/svndumpfilter' + \
                                       _exe)
entriesdump_binary = os.path.abspath('entries-dump' + _exe)

# Location to the pristine repository, will be calculated from test_area_url
# when we know what the user specified for --url.
pristine_url = None

# Global variable to track all of our options
options = None

# End of command-line-set global variables.
######################################################################

# All temporary repositories and working copies are created underneath
# this dir, so there's one point at which to mount, e.g., a ramdisk.
work_dir = "svn-test-work"

# Constant for the merge info property.
SVN_PROP_MERGEINFO = "svn:mergeinfo"

# Where we want all the repositories and working copies to live.
# Each test will have its own!
general_repo_dir = os.path.join(work_dir, "repositories")
general_wc_dir = os.path.join(work_dir, "working_copies")

# temp directory in which we will create our 'pristine' local
# repository and other scratch data.  This should be removed when we
# quit and when we startup.
temp_dir = os.path.join(work_dir, 'local_tmp')

# (derivatives of the tmp dir.)
pristine_dir = os.path.join(temp_dir, "repos")
greek_dump_dir = os.path.join(temp_dir, "greekfiles")
default_config_dir = os.path.abspath(os.path.join(temp_dir, "config"))

#
# Our pristine greek-tree state.
#
# If a test wishes to create an "expected" working-copy tree, it should
# call main.greek_state.copy().  That method will return a copy of this
# State object which can then be edited.
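#
# For instance, a test could build its expected disk tree from a tweaked
# copy of this state.  (An illustrative sketch only, not code used by
# this module; the path and replacement text below are made up.)
#
#   expected_disk = svntest.main.greek_state.copy()
#   expected_disk.tweak('A/D/G/rho', contents="New contents for 'rho'.\n")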
#

_item = svntest.wc.StateItem
greek_state = svntest.wc.State('', {
  'iota'        : _item("This is the file 'iota'.\n"),
  'A'           : _item(),
  'A/mu'        : _item("This is the file 'mu'.\n"),
  'A/B'         : _item(),
  'A/B/lambda'  : _item("This is the file 'lambda'.\n"),
  'A/B/E'       : _item(),
  'A/B/E/alpha' : _item("This is the file 'alpha'.\n"),
  'A/B/E/beta'  : _item("This is the file 'beta'.\n"),
  'A/B/F'       : _item(),
  'A/C'         : _item(),
  'A/D'         : _item(),
  'A/D/gamma'   : _item("This is the file 'gamma'.\n"),
  'A/D/G'       : _item(),
  'A/D/G/pi'    : _item("This is the file 'pi'.\n"),
  'A/D/G/rho'   : _item("This is the file 'rho'.\n"),
  'A/D/G/tau'   : _item("This is the file 'tau'.\n"),
  'A/D/H'       : _item(),
  'A/D/H/chi'   : _item("This is the file 'chi'.\n"),
  'A/D/H/psi'   : _item("This is the file 'psi'.\n"),
  'A/D/H/omega' : _item("This is the file 'omega'.\n"),
  })


######################################################################
# Utilities shared by the tests

def wrap_ex(func):
  "Wrap a function, catch, print and ignore exceptions"
  def w(*args, **kwds):
    try:
      return func(*args, **kwds)
    except Failure, ex:
      if ex.__class__ != Failure or ex.args:
        ex_args = str(ex)
        if ex_args:
          print('EXCEPTION: %s: %s' % (ex.__class__.__name__, ex_args))
        else:
          print('EXCEPTION: %s' % ex.__class__.__name__)
  return w

def setup_development_mode():
  "Wraps functions in module actions"
  l = [ 'run_and_verify_svn',
        'run_and_verify_svnversion',
        'run_and_verify_load',
        'run_and_verify_dump',
        'run_and_verify_checkout',
        'run_and_verify_export',
        'run_and_verify_update',
        'run_and_verify_merge',
        'run_and_verify_switch',
        'run_and_verify_commit',
        'run_and_verify_unquiet_status',
        'run_and_verify_status',
        'run_and_verify_diff_summarize',
        'run_and_verify_diff_summarize_xml',
        'run_and_validate_lock' ]

  for func in l:
    setattr(svntest.actions, func, wrap_ex(getattr(svntest.actions, func)))

def get_admin_name():
  "Return name of SVN administrative subdirectory."

  if (windows or sys.platform == 'cygwin') \
      and 'SVN_ASP_DOT_NET_HACK' in os.environ:
    return '_svn'
  else:
    return '.svn'

def get_start_commit_hook_path(repo_dir):
  "Return the path of the start-commit-hook conf file in REPO_DIR."

  return os.path.join(repo_dir, "hooks", "start-commit")

def get_pre_commit_hook_path(repo_dir):
  "Return the path of the pre-commit-hook conf file in REPO_DIR."

  return os.path.join(repo_dir, "hooks", "pre-commit")

def get_post_commit_hook_path(repo_dir):
  "Return the path of the post-commit-hook conf file in REPO_DIR."

  return os.path.join(repo_dir, "hooks", "post-commit")

def get_pre_revprop_change_hook_path(repo_dir):
  "Return the path of the pre-revprop-change hook script in REPO_DIR."

  return os.path.join(repo_dir, "hooks", "pre-revprop-change")

def get_svnserve_conf_file_path(repo_dir):
  "Return the path of the svnserve.conf file in REPO_DIR."

  return os.path.join(repo_dir, "conf", "svnserve.conf")

def get_fsfs_conf_file_path(repo_dir):
  "Return the path of the fsfs.conf file in REPO_DIR."

  return os.path.join(repo_dir, "db", "fsfs.conf")

def get_fsfs_format_file_path(repo_dir):
  "Return the path of the format file in REPO_DIR."

  return os.path.join(repo_dir, "db", "format")

# Run any binary, logging the command line and return code
def run_command(command, error_expected, binary_mode=0, *varargs):
  """Run COMMAND with VARARGS.  Return exit code as int; stdout, stderr
  as lists of lines (including line terminators).  See
  run_command_stdin() for details.
  If ERROR_EXPECTED is None, any stderr also will be printed."""

  return run_command_stdin(command, error_expected, 0, binary_mode,
                           None, *varargs)

# A regular expression that matches arguments that are trivially safe
# to pass on a command line without quoting on any supported operating
# system:
_safe_arg_re = re.compile(r'^[A-Za-z\d\.\_\/\-\:\@]+$')

def _quote_arg(arg):
  """Quote ARG for a command line.

  Simply surround every argument in double-quotes unless it contains
  only universally harmless characters.

  WARNING: This function cannot handle arbitrary command-line
  arguments.  It can easily be confused by shell metacharacters.  A
  perfect job would be difficult and OS-dependent (see, for example,
  http://msdn.microsoft.com/library/en-us/vccelng/htm/progs_12.asp).
  In other words, this function is just good enough for what we need
  here."""

  arg = str(arg)
  if _safe_arg_re.match(arg):
    return arg
  else:
    if os.name != 'nt':
      arg = arg.replace('$', '\$')
    return '"%s"' % (arg,)

def open_pipe(command, bufsize=0, stdin=None, stdout=None, stderr=None):
  """Opens a subprocess.Popen pipe to COMMAND using STDIN, STDOUT, and
  STDERR.  BUFSIZE is passed to subprocess.Popen's argument of the same
  name.

  Returns (infile, outfile, errfile, waiter); waiter should be passed
  to wait_on_pipe."""
  command = [str(x) for x in command]

  # On Windows subprocess.Popen() won't accept a Python script as
  # a valid program to execute, rather it wants the Python executable.
  if (sys.platform == 'win32') and (command[0].endswith('.py')):
    command.insert(0, sys.executable)

  # Quote only the arguments on Windows.  Later versions of subprocess,
  # 2.5.2+ confirmed, don't require this quoting, but versions < 2.4.3 do.
  if sys.platform == 'win32':
    args = command[1:]
    args = ' '.join([_quote_arg(x) for x in args])
    command = command[0] + ' ' + args
    command_string = command
  else:
    command_string = ' '.join(command)

  if not stdin:
    stdin = subprocess.PIPE
  if not stdout:
    stdout = subprocess.PIPE
  if not stderr:
    stderr = subprocess.PIPE

  p = subprocess.Popen(command,
                       bufsize,
                       stdin=stdin,
                       stdout=stdout,
                       stderr=stderr,
                       close_fds=not windows)
  return p.stdin, p.stdout, p.stderr, (p, command_string)

def wait_on_pipe(waiter, binary_mode, stdin=None):
  """WAITER is (KID, COMMAND_STRING).  Wait for KID (opened with
  open_pipe) to finish, dying if it does.  If KID fails, create an
  error message containing any stdout and stderr from the kid.  Show
  COMMAND_STRING in diagnostic messages.  Normalize Windows line
  endings of stdout and stderr if not BINARY_MODE.  Return KID's exit
  code as int; stdout, stderr as lists of lines (including line
  terminators)."""
  if waiter is None:
    return

  kid, command_string = waiter
  stdout, stderr = kid.communicate(stdin)
  exit_code = kid.returncode

  # Normalize Windows line endings if in text mode.
  if windows and not binary_mode:
    stdout = stdout.replace('\r\n', '\n')
    stderr = stderr.replace('\r\n', '\n')

  # Convert output strings to lists.
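  # (splitlines(True) keeps each line's terminator, so joining these
  # lists would reproduce the child's output byte for byte.)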
  stdout_lines = stdout.splitlines(True)
  stderr_lines = stderr.splitlines(True)

  if exit_code < 0:
    if not windows:
      exit_signal = os.WTERMSIG(-exit_code)
    else:
      exit_signal = exit_code

    if stdout_lines is not None:
      sys.stdout.write("".join(stdout_lines))
      sys.stdout.flush()
    if stderr_lines is not None:
      sys.stderr.write("".join(stderr_lines))
      sys.stderr.flush()
    if options.verbose:
      # show the whole path to make it easier to start a debugger
      sys.stderr.write("CMD: %s terminated by signal %d\n"
                       % (command_string, exit_signal))
      sys.stderr.flush()
    raise SVNProcessTerminatedBySignal
  else:
    if exit_code and options.verbose:
      sys.stderr.write("CMD: %s exited with %d\n"
                       % (command_string, exit_code))
      sys.stderr.flush()

  return stdout_lines, stderr_lines, exit_code

def spawn_process(command, bufsize=0, binary_mode=0, stdin_lines=None,
                  *varargs):
  """Run any binary, supplying input text, logging the command line.

  BUFSIZE dictates the pipe buffer size used in communication with the
  subprocess: 0 means unbuffered, 1 means line buffered, any other
  positive value means use a buffer of (approximately) that size.  A
  negative bufsize means to use the system default, which usually means
  fully buffered.  The default value for bufsize is 0 (unbuffered).

  Normalize Windows line endings of stdout and stderr if not
  BINARY_MODE.  Return exit code as int; stdout, stderr as lists of
  lines (including line terminators)."""
  if stdin_lines and not isinstance(stdin_lines, list):
    raise TypeError("stdin_lines should have list type")

  # Log the command line
  if options.verbose and not command.endswith('.py'):
    sys.stdout.write('CMD: %s %s\n'
                     % (os.path.basename(command),
                        ' '.join([_quote_arg(x) for x in varargs])))
    sys.stdout.flush()

  infile, outfile, errfile, kid = open_pipe([command] + list(varargs), bufsize)

  if stdin_lines:
    for x in stdin_lines:
      infile.write(x)

  stdout_lines, stderr_lines, exit_code = wait_on_pipe(kid, binary_mode)
  infile.close()

  outfile.close()
  errfile.close()

  return exit_code, stdout_lines, stderr_lines

def run_command_stdin(command, error_expected, bufsize=0, binary_mode=0,
                      stdin_lines=None, *varargs):
  """Run COMMAND with VARARGS; input STDIN_LINES (a list of strings
  which should include newline characters) to program via stdin - this
  should not be very large, as if the program outputs more than the OS
  is willing to buffer, this will deadlock, with both Python and
  COMMAND waiting to write to each other for ever.  For tests where
  this is a problem, setting BUFSIZE to a sufficiently large value will
  prevent the deadlock, see spawn_process().

  Normalize Windows line endings of stdout and stderr if not
  BINARY_MODE.  Return exit code as int; stdout, stderr as lists of
  lines (including line terminators).

  If ERROR_EXPECTED is None, any stderr also will be printed."""

  if options.verbose:
    start = time.time()

  exit_code, stdout_lines, stderr_lines = spawn_process(command,
                                                        bufsize,
                                                        binary_mode,
                                                        stdin_lines,
                                                        *varargs)

  if options.verbose:
    stop = time.time()
    print('