# Build Status

 + [![Github Actions Build Status](https://github.com/perltidy/perltidy/actions/workflows/perltest.yml/badge.svg)](https://github.com/perltidy/perltidy/actions)

 * [CPAN Testers](https://www.cpantesters.org/distro/P/Perl-Tidy.html)

# Welcome to Perltidy

Perltidy is a tool to indent and reformat scripts written in Perl.

Perltidy is free software released under the GNU General Public License -- please see the included file "COPYING" for details.

Documentation can be found at the web site [at GitHub](https://perltidy.github.io/perltidy/) or [at Sourceforge](http://perltidy.sourceforge.net) or [at metacpan](https://metacpan.org/pod/distribution/Perl-Tidy/bin/perltidy)

A copy of the web site is contained in the docs folder of the distribution.

# PERLTIDY INSTALLATION NOTES

# Get a distribution file

- Source Files in .tar.gz and .zip format

    This document tells how to install perltidy from the basic source distribution files in `.tar.gz` or `.zip` format. These files are identical except for the line endings. The `.tar.gz` has Unix style line endings, and the `.zip` file has Windows style line endings. The standard perl MakeMaker method should work for these in most cases.

- Source files in RPM and .deb format

    The web site also has links to RPM and Debian .deb Linux packages, which may be convenient for some users.

# Quick Test Drive

If you want to do a quick test of perltidy without doing any installation, get a `.tar.gz` or a `.zip` source file and see the section below "Method 2: Installation as a single binary script".

# Uninstall older versions

In certain circumstances, it is best to remove an older version of perltidy before installing the latest version. These are:

- Uninstall a Version older than 20020225

    You can use perltidy -v to determine the version number. The first version of perltidy to use Makefile.PL for installation was 20020225, so if your previous installation is older than this, it is best to remove it, because the installation path may now be different. There were up to 3 files in these older installations: the script `perltidy` and possibly two man pages, `perltidy.1` and `perl2web.1`. If you saved your Makefile, you can probably use `make uninstall`. Otherwise, you can use a `locate` or `find` command to find and remove these files.

- Uninstall older versions when changing installation method

    If you switch from one installation method to another, the paths to the components of perltidy may change, so it is probably best to remove the older version before installing the new version. If your older installation method had an uninstall option (such as with RPM's and debian packages), use it. Otherwise, you can locate and remove the older files by hand. There are two key files: `Tidy.pm` and `perltidy`. In addition, there may be one or two man pages, something like `Perl::Tidy.3pm` and `perltidy.1p`. You can use a `locate` and/or `find` command to find and remove these files. After installation, you can verify that the new version of perltidy is working with the `perltidy -v` command.

# Two Installation Methods - Overview

These are generic instructions. Some system-specific notes and hints are given in later sections.

Two separate installation methods are possible.

- Method 1: Standard Installation Method

    The standard method based on MakeMaker should work in a normal perl environment. This is the recommended installation procedure for systems which support it.

            perl Makefile.PL
            make
            make test
            make install

    The `make` command is probably `nmake` under a Windows system. You may need to become root (or administrator) before doing the `make install` step.

- Method 2: Installation as a single binary script

    If you just want to take perltidy for a quick test drive without installing it, or are having trouble installing modules, you can bundle it all in one independent executable script. This might also be helpful on a system for which the Makefile.PL method does not work, or if you are temporarily a guest on some system, or if you want to try hacking a special version of perltidy without messing up your regular version.

    You just need to uncompress the source distribution, cd down into it, and enter the command:

            perl pm2pl

    which will combine the pieces of perltidy into a single script named `perltidy` in the current directory. This script should be fully functional. Try it out on a handy perl script, for example

            perl perltidy Makefile.PL

    This should create `Makefile.PL.tdy`.

- After Installation

    After installation by either method, verify that the installation worked and that the correct new version is being used by entering:

            perltidy -v

    If the version number disagrees with the version number embedded in the distribution file name, search for and remove the old version. For example, under a Unix system, the command `which perltidy` might show where it is. Also, see the above notes on uninstalling older versions.

    On a Unix system running the `bash` shell, if you had a previous installation of perltidy, you may have to use

            hash -r

    to get the shell to find the new one.

    After `perltidy` is installed, you can find where it will look for configuration files and environment variables on your system with the command:

            perltidy -dpro

- How to Uninstall

    Unfortunately, the standard Perl installation method does not seem able to do an uninstall.

    But try this:

            make uninstall

    On some systems, it will give you a list of files to remove by hand. If not, you need to find the script `perltidy` and its module file `Tidy.pm`, which will be in a subdirectory named `Perl` in the site library.

    If you installed perltidy with the alternative method, you should just reverse the steps that you used.

## Unix Installation Notes

- Alternative method - Unix

    If the alternative method is used, test the script produced by the `pm2pl` perl script:

            perl ./perltidy somefile.pl

    where `somefile.pl` is any convenient test file, such as `Makefile.PL` itself. Then,

    1\. If the script is not executable, use

            chmod +x perltidy

    2\. Verify that the initial line in perltidy works for your system by entering:

            ./perltidy -h

    which should produce the usage text and then exit. This should usually work, but if it does not, you will need to change the first line in `perltidy` to reflect the location of perl on your system. On a Unix system, you might find the path to perl with the command 'which perl'.

    3\. A sample `Makefile` for this installation method is `Makefile.npm`. Edit it to have the correct paths.

    You will need to become root unless you change the paths to point to somewhere in your home directory. Then issue the command

            make -f Makefile.npm install

    This installs perltidy and the man page perltidy.1.

    4\. Test the installation using

            perltidy -h

    You should see the usage screen. Then, if you installed the man pages, try

            man perltidy

    which should bring up the manual page.

    If you ever want to remove perltidy, you can remove perltidy and its man pages by hand or use

            make uninstall

## Windows Installation Notes

On a Windows 9x/Me system you should CLOSE ANY OPEN APPLICATIONS to avoid losing unsaved data in case of trouble.

- Standard Method - Windows

    After you unzip the distribution file, the procedure is probably this:

            perl Makefile.PL
            nmake
            nmake test
            nmake install

    You may need to download a copy of `unzip` to unzip the `.zip` distribution file; you can get this at http://www.info-zip.org/pub/infozip/UnZip.html

    If you have ActiveState Perl, the installation method is outlined at http://aspn.activestate.com//ASPN/Reference/Products/ActivePerl/faq/Windows/ActivePerl-Winfaq9.html#How\_can\_I\_use\_modules\_from\_CPAN\_

    You may need to download a copy of Microsoft's `nmake` program from ftp://ftp.microsoft.com/Softlib/MSLFILES/nmake15.exe

    If you are not familiar with installing modules, or have trouble doing so, and want to start testing perltidy quickly, you may want to use the alternative method instead (next section).

- Alternative Method - Windows

    From the main installation directory, just enter

            perl pm2pl

    Placing the resulting file `perltidy` and the example batch file `perltidy.bat`, located in the `examples` directory, in your path should work. (You can determine your path by issuing the msdos command `PATH`). However, the batch file probably will not support file redirection. So, for example, to pipe the long help message through 'more', you might have to invoke perltidy with perl directly, like this:

            perl \somepath\perltidy -h | more

    The batch file will not work properly with wildcard filenames, but you may use wildcard filenames if you place them in quotes. For example

            perltidy '*.pl'

## VMS Installation Notes

- Links to VMS Utilities and Documentation

    To install perltidy you will need the following utilities:

    Perl, of course, source with VMS goodies available from http://www.sidhe.org/vmsperl or binary available from the Compaq OpenVMS freeware CD.

    To unpack the source, either gunzip and vmstar available from the Compaq OpenVMS freeware CD or zip available from http://www.info-zip.org/

    To build perltidy you can use either **MMS**, Compaq's VMS equivalent of make, or **MMK**, an **MMS** clone available from http://www.madgoat.com.

    Information on running perl under VMS can be found at: http://w4.lns.cornell.edu/~pvhp/perl/VMS.html

- Unpack the source:

            $ unzip -a perl-tidy-yyyymmdd.zip  ! or

            $ unzip /text=auto perl-tidy-yyyymmdd.zip ! or

            $ gunzip perl-tidy-yyyymmdd.tgz
            $ vmstar perl-tidy-yyyymmdd.tar

- Build and install perltidy under VMS:

            $ set default [.perl-tidy-yyyymmdd]
            $ perl perltidy.pl
            $ mmk
            $ mmk test
            $ mmk install

- Using Perltidy under VMS

    Create a symbol. This should be put in a logon script, eg sylogin.com

            $ perltidy == "perl perl_root:[utils]perltidy."

    Default parameters can be placed in a `perltidyrc` file. Perltidy looks for one in the following places and uses the first found:

    - if the logical `PERLTIDY` is a file and the file exists then that is used
    - if the logical `PERLTIDY` is a directory then look for a `.perltidyrc` file in the directory
    - look for a `.perltidyrc` file in the user's home directory

    To see where the search is done and which `.perltidyrc` is used type

            $ perltidy -dpro

    A system `PERLTIDY` logical can be defined pointing to a file with a minimal configuration, and users can define their own logical to use a personal `.perltidyrc` file.

            $ define /system perltidy perl_root:[utils]perltidy.rc

- The -x Parameter

    If you have one of the magic incantations at the start of perl scripts, so that they can be invoked as a .com file, then you will need to use the **-x** parameter which causes perltidy to skip all lines until it finds a hash bang line eg `#!perl -w`. Since it is such a common option this is probably a good thing to put in a `.perltidyrc` file.

- VMS File Extensions

    VMS file extensions will use an underscore character instead of a dot, when necessary, to create a valid filename. So

            perltidy myfile.pl

    will generate the output file `myfile.pl_tdy` instead of `myfile.pl.tdy`, and so on.

# Troubleshooting / Other Operating Systems

If there seems to be a problem locating a configuration file, you can see what is going on in the config file search with:

    perltidy -dpro

If you want to customize where perltidy looks for configuration files, look at the routine 'find\_config\_file' in module 'Tidy.pm'. You should be able to at least use the '-pro=filename' method under most systems.

Remember to place quotes (either single or double) around input parameters which contain spaces, such as file names. For example:

    perltidy "file name with spaces"

Without the quotes, perltidy would look for four files: `file`, `name`, `with`, and `spaces`.

If you develop a system-dependent patch that might be of general interest, please let us know.

# CONFIGURATION FILE

You do not need a configuration file, but you may eventually want to create one to save typing; the tutorial and man page discuss this.

# SYSTEM TEMPORARY FILES

Perltidy needs to create a system temporary file when it invokes Pod::Html to format pod text under the -html option. For Unix systems, this will normally be a file in /tmp, and for other systems, it will be a file in the current working directory named `perltidy.TMP`. This file will be removed when the run finishes.

# DOCUMENTATION

Documentation is contained in **.pod** format, either in the `docs` directory or appended to the scripts.

These documents can also be found at http://perltidy.sourceforge.net

Reading the brief tutorial should help you use perltidy effectively. The tutorial can be read interactively with **perldoc**, for example

    cd docs
    perldoc tutorial.pod

or else an `html` version can be made with **pod2html**:

    pod2html tutorial.pod >tutorial.html

If you use the Makefile.PL installation method on a Unix system, the **perltidy** and **Perl::Tidy** man pages should automatically be installed. Otherwise, you can extract the man pages with the **pod2xxxx** utilities, as follows:

    cd bin
    pod2text perltidy >perltidy.txt
    pod2html perltidy >perltidy.html

    cd lib/Perl
    pod2text Tidy.pm >Tidy.txt
    pod2html Tidy.pm >Tidy.html

After installation, the installation directory of files may be deleted.

Perltidy is still being developed, so please check sourceforge occasionally for updates if you find that it is useful. New releases are announced on freshmeat.net.

# CREDITS

Thanks to the many programmers who have documented problems, made suggestions and sent patches.

# FEEDBACK / BUG REPORTS

If you see ways to improve these notes, please let us know.

A list of current bugs and issues can be found at the CPAN site [https://rt.cpan.org/Public/Dist/Display.html?Name=Perl-Tidy](https://rt.cpan.org/Public/Dist/Display.html?Name=Perl-Tidy)

To report a new bug or problem, use the link on this page.

# MANIFEST

    .pre-commit-hooks.yaml
    bin/perltidy
    BUGS.md
    CHANGES.md
    COPYING
    docs/BugLog.html
    docs/ChangeLog.html
    docs/eos_flag.md
    docs/index.html
    docs/index.md
    docs/INSTALL.html
    docs/perltidy.html
    docs/stylekey.html
    docs/Tidy.html
    docs/tutorial.html
    examples/bbtidy.pl
    examples/break_long_quotes.pl
    examples/delete_ending_blank_lines.pl
    examples/ex_mp.pl
    examples/filter_example.in
    examples/filter_example.pl
    examples/find_naughty.pl
    examples/lextest
    examples/perlcomment.pl
    examples/perllinetype.pl
    examples/perlmask.pl
    examples/perltidy_hide.pl
    examples/perltidy_okw.pl
    examples/perltidyrc_dump.pl
    examples/perlxmltok.pl
    examples/pt.bat
    examples/README
    examples/testfa.t
    examples/testff.t
    INSTALL.md
    lib/Perl/Tidy.pm
    lib/Perl/Tidy.pod
    lib/Perl/Tidy/Debugger.pm
    lib/Perl/Tidy/DevNull.pm
    lib/Perl/Tidy/Diagnostics.pm
    lib/Perl/Tidy/FileWriter.pm
    lib/Perl/Tidy/Formatter.pm
    lib/Perl/Tidy/HtmlWriter.pm
    lib/Perl/Tidy/IndentationItem.pm
    lib/Perl/Tidy/IOScalar.pm
    lib/Perl/Tidy/IOScalarArray.pm
    lib/Perl/Tidy/LineBuffer.pm
    lib/Perl/Tidy/LineSink.pm
    lib/Perl/Tidy/LineSource.pm
    lib/Perl/Tidy/Logger.pm
    lib/Perl/Tidy/Tokenizer.pm
    lib/Perl/Tidy/VerticalAligner.pm
    lib/Perl/Tidy/VerticalAligner/Alignment.pm
    lib/Perl/Tidy/VerticalAligner/Line.pm
    Makefile.PL
    MANIFEST                    This list of files
    pm2pl
    README.md
    t/.gitattributes
    t/atee.t
    t/filter_example.t
    t/snippets1.t
    t/snippets10.t
    t/snippets11.t
    t/snippets12.t
    t/snippets13.t
    t/snippets14.t
    t/snippets15.t
    t/snippets16.t
    t/snippets17.t
    t/snippets18.t
    t/snippets19.t
    t/snippets2.t
    t/snippets20.t
    t/snippets21.t
    t/snippets22.t
    t/snippets23.t
    t/snippets24.t
    t/snippets25.t
    t/snippets26.t
    t/snippets27.t
    t/snippets28.t
    t/snippets3.t
    t/snippets4.t
    t/snippets5.t
    t/snippets6.t
    t/snippets7.t
    t/snippets8.t
    t/snippets9.t
    t/test-eol.t
    t/test.t
    t/test_DEBUG.t
    t/testsa.t
    t/testss.t
    t/testwide-passthrough.pl.src
    t/testwide-passthrough.t
    t/testwide-tidy.pl.src
    t/testwide-tidy.pl.srctdy
    t/testwide-tidy.t
    t/testwide.pl.src
    t/testwide.t
    META.yml                    Module YAML meta-data (added by MakeMaker)
    META.json                   Module JSON meta-data (added by MakeMaker)

NAME

Perl::Tidy - Parses and beautifies perl source

SYNOPSIS

    use Perl::Tidy;

    my $error_flag = Perl::Tidy::perltidy(
        source            => $source,
        destination       => $destination,
        stderr            => $stderr,
        argv              => $argv,
        perltidyrc        => $perltidyrc,
        logfile           => $logfile,
        errorfile         => $errorfile,
        teefile           => $teefile,
        debugfile         => $debugfile,
        formatter         => $formatter,           # callback object (see below)
        dump_options      => $dump_options,
        dump_options_type => $dump_options_type,
        prefilter         => $prefilter_coderef,
        postfilter        => $postfilter_coderef,
    );

DESCRIPTION

This module makes the functionality of the perltidy utility available to perl scripts. Any or all of the input parameters may be omitted, in which case the @ARGV array will be used to provide input parameters as described in the perltidy(1) man page.

For example, the perltidy script is basically just this:

    use Perl::Tidy;
    Perl::Tidy::perltidy();

The call to perltidy returns a scalar $error_flag which is TRUE if an error caused premature termination, and FALSE if the process ran to normal completion. Additional discussion of errors is contained below in the ERROR HANDLING section.

The module accepts input and output streams by a variety of methods. Each stream parameter in the following list may be given as any of the following: a filename, an ARRAY reference, a SCALAR reference, or an object with either a getline or print method, as appropriate.

        source            - the source of the script to be formatted
        destination       - the destination of the formatted output
        stderr            - standard error output
        perltidyrc        - the .perltidyrc file
        logfile           - the .LOG file stream, if any
        errorfile         - the .ERR file stream, if any
        dump_options      - ref to a hash to receive parameters (see below)
        dump_options_type - controls contents of dump_options
        dump_getopt_flags - ref to a hash to receive Getopt flags
        dump_options_category - ref to a hash giving category of options
        dump_abbreviations    - ref to a hash giving all abbreviations

The following chart illustrates the logic used to decide how to treat a parameter.

   ref($param)  $param is assumed to be:
   -----------  ---------------------
   undef        a filename
   SCALAR       ref to string
   ARRAY        ref to array
   (other)      object with getline (if source) or print method

If the parameter is an object, and the object has a close method, that close method will be called at the end of the stream.
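As a rough illustration of the chart above, the following sketch passes a SCALAR reference as the source and an ARRAY reference as the destination. The variable names are arbitrary, and the -npro flag is included only so that a local .perltidyrc does not influence the result.

```perl
use strict;
use warnings;
use Perl::Tidy;

# Untidy input held in memory; passed below as a SCALAR reference.
my $source = <<'EOT';
my$x=1;
if($x){
print "yes\n";
}
EOT

# Formatted output lines are collected here (ARRAY reference destination).
my @tidied_lines;
my $stderr_text = '';

my $error_flag = Perl::Tidy::perltidy(
    argv        => '-npro',           # ignore any .perltidyrc at this site
    source      => \$source,
    destination => \@tidied_lines,
    stderr      => \$stderr_text,
);

print @tidied_lines unless $error_flag;
```

A filename could be given for either parameter instead, in which case it is passed as a plain string rather than a reference.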

source

If the source parameter is given, it defines the source of the input stream. If an input stream is defined with the source parameter then no other source filenames may be specified in the @ARGV array or argv parameter.

destination

If the destination parameter is given, it will be used to define the file or memory location to receive output of perltidy.

Important note if destination is a string or array reference. Perl strings of characters which are decoded as utf8 by Perl::Tidy can be returned in either of two possible states, decoded or encoded, and it is important that the calling program and Perl::Tidy are in agreement regarding the state to be returned. A flag --encode-output-strings, or simply -eos, was added in Perl::Tidy version 20220217 for this purpose.

For some background information see https://github.com/perltidy/perltidy/blob/master/docs/eos_flag.md.

This change in default behavior was made over a period of time as follows:

stderr

The stderr parameter allows the calling program to redirect the stream that would otherwise go to the standard error output device to any of the stream types listed above. This stream contains important warnings and errors related to the parameters passed to perltidy.

perltidyrc

If the perltidyrc parameter is given, it will be used instead of any .perltidyrc configuration file that would otherwise be used.

errorfile

The errorfile parameter allows the calling program to capture the stream that would otherwise go to a .ERR file. This stream contains warnings or errors related to the contents of one source file or stream.

The reason that this is different from the stderr stream is that when perltidy is called to process multiple files there will be up to one .ERR file created for each file and it would be very confusing if they were combined.

However if perltidy is called to process just a single perl script then it may be more convenient to combine the errorfile stream with the stderr stream. This can be done by setting the -se parameter, in which case this parameter is ignored.

logfile

The logfile parameter allows the calling program to capture the log stream. This stream is only created if requested with a -g parameter. It contains detailed diagnostic information about a script which may be useful for debugging.

teefile

The teefile parameter allows the calling program to capture the tee stream. This stream is only created if requested with one of the 'tee' parameters: --tee-pod, --tee-block-comments, --tee-side-comments, or --tee-all-comments.

debugfile

The debugfile parameter allows the calling program to capture the stream produced by the --DEBUG parameter. This parameter is mainly used for debugging perltidy itself.

argv

If the argv parameter is given, it will be used instead of the @ARGV array. The argv parameter may be a string, a reference to a string, or a reference to an array. If it is a string or reference to a string, it will be parsed into an array of items just as if it were a command line string.
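The two common forms of argv can be compared with a small sketch like the one below. It assumes that the string form and the array-reference form carry the same options, so both calls should produce the same output; the option -i=2 (two-column indentation) is just an arbitrary choice for the demonstration.

```perl
use strict;
use warnings;
use Perl::Tidy;

my $source = <<'EOT';
if($x){
$y=1;
}
EOT

my ( $out_from_string, $out_from_array ) = ( '', '' );
my ( $err_a, $err_b ) = ( '', '' );

# argv as a single string, parsed like a command line
Perl::Tidy::perltidy(
    argv        => '-npro -i=2',
    source      => \$source,
    destination => \$out_from_string,
    stderr      => \$err_a,
);

# argv as a reference to an array, one item per element
Perl::Tidy::perltidy(
    argv        => [ '-npro', '-i=2' ],
    source      => \$source,
    destination => \$out_from_array,
    stderr      => \$err_b,
);

print $out_from_string;
print "same result\n" if $out_from_string eq $out_from_array;
```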

dump_options

If the dump_options parameter is given, it must be the reference to a hash. In this case, the parameters contained in any perltidyrc configuration file will be placed in this hash and perltidy will return immediately. This is equivalent to running perltidy with --dump-options, except that the parameters are returned in a hash rather than dumped to standard output. Also, by default only the parameters in the perltidyrc file are returned, but this can be changed (see the next parameter). This parameter provides a convenient method for external programs to read a perltidyrc file. An example program using this feature, perltidyrc_dump.pl, is included in the distribution.

Any combination of the dump_ parameters may be used together.

dump_options_type

This parameter is a string which can be used to control the parameters placed in the hash reference supplied by dump_options. The possible values are 'perltidyrc' (default) and 'full'. The 'full' parameter causes both the default options plus any options found in a perltidyrc file to be returned.
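A minimal sketch of reading a configuration through dump_options follows. To keep it self-contained, the perltidyrc is supplied as a reference to a string rather than a file, and the two settings in it are arbitrary examples. The empty argv is an assumption made here so that the caller's own @ARGV is not consulted.

```perl
use strict;
use warnings;
use Perl::Tidy;

# A perltidyrc held in memory; a filename could be passed instead.
my $rc_text = "--indent-columns=2\n--maximum-line-length=100\n";

my %options;             # receives the parameters found in the perltidyrc
my $stderr_text = '';

Perl::Tidy::perltidy(
    argv              => '',
    perltidyrc        => \$rc_text,
    dump_options      => \%options,
    dump_options_type => 'perltidyrc',    # use 'full' to include defaults
    stderr            => \$stderr_text,
);

# Keys are the long option names.
for my $name ( sort keys %options ) {
    print "$name = $options{$name}\n";
}
```

The distributed program perltidyrc_dump.pl shows a fuller version of the same idea.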

dump_getopt_flags

If the dump_getopt_flags parameter is given, it must be the reference to a hash. This hash will receive all of the parameters that perltidy understands and flags that are passed to Getopt::Long. This parameter may be used alone or with the dump_options flag. Perltidy will exit immediately after filling this hash. See the demo program perltidyrc_dump.pl for example usage.

dump_options_category

If the dump_options_category parameter is given, it must be the reference to a hash. This hash will receive a hash with keys equal to all long parameter names and values equal to the title of the corresponding section of the perltidy manual. See the demo program perltidyrc_dump.pl for example usage.

dump_abbreviations

If the dump_abbreviations parameter is given, it must be the reference to a hash. This hash will receive all abbreviations used by Perl::Tidy. See the demo program perltidyrc_dump.pl for example usage.

prefilter

A code reference that will be applied to the source before tidying. It is expected to take the full content as a string in its input, and output the transformed content.

postfilter

A code reference that will be applied to the tidied result before outputting. It is expected to take the full content as a string in its input, and output the transformed content.

Note: A convenient way to check the function of your custom prefilter and postfilter code is to use the --notidy option, first with just the prefilter and then with both the prefilter and postfilter. See also the file filter_example.pl in the perltidy distribution.
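The sketch below shows one possible pair of filters. The `##TEMP` marker and the header comment are hypothetical conventions invented for this illustration: the prefilter drops every line carrying the marker before tidying, and the postfilter prepends a comment line to the tidied text.

```perl
use strict;
use warnings;
use Perl::Tidy;

my $source = <<'EOT';
sub  hello{
print "debug\n"; ##TEMP
print "hi\n";
}
EOT

my ( $tidied, $stderr_text ) = ( '', '' );

my $error_flag = Perl::Tidy::perltidy(
    argv        => '-npro',
    source      => \$source,
    destination => \$tidied,
    stderr      => \$stderr_text,

    # Runs on the whole source text before tidying:
    # remove any line tagged with the (hypothetical) ##TEMP marker.
    prefilter => sub {
        my ($text) = @_;
        $text =~ s/^.*##TEMP\n//mg;
        return $text;
    },

    # Runs on the whole tidied text before it is written out:
    # prepend a header comment.
    postfilter => sub {
        my ($text) = @_;
        return "# formatted by perltidy\n" . $text;
    },
);

print $tidied unless $error_flag;
```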

ERROR HANDLING

An exit value of 0, 1, or 2 is returned by perltidy to indicate the status of the result.

An exit value of 0 indicates that perltidy ran to completion with no error messages.

An exit value of 1 indicates that the process had to be terminated early due to errors in the input parameters. This can happen for example if a parameter is misspelled or given an invalid value. The calling program should check for this flag because if it is set the destination stream will be empty or incomplete and should be ignored. Error messages in the stderr stream will indicate the cause of any problem.

An exit value of 2 indicates that perltidy ran to completion but there are warning messages in the stderr stream related to parameter errors or conflicts and/or warning messages in the errorfile stream relating to possible syntax errors in the source code being tidied.

In the event of a catastrophic error for which recovery is not possible perltidy terminates by making calls to croak or confess to help the programmer localize the problem. These should normally only occur during program development.
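One way a caller might act on the three exit values is sketched below. The helper name tidy_or_report is hypothetical. The second call deliberately passes an unrecognized option, --spell-check, to provoke the early-termination case; this assumes a reasonably recent Perl::Tidy that reports such errors through the return value.

```perl
use strict;
use warnings;
use Perl::Tidy;

# Hypothetical helper: returns (tidied text or undef, exit value, messages).
sub tidy_or_report {
    my ( $source_text, $argv ) = @_;
    my ( $tidied, $stderr_text, $errorfile_text ) = ( '', '', '' );

    my $status = Perl::Tidy::perltidy(
        argv        => $argv,
        source      => \$source_text,
        destination => \$tidied,
        stderr      => \$stderr_text,
        errorfile   => \$errorfile_text,
    );

    # 1: terminated early, so the destination must be ignored
    return ( undef, $status, $stderr_text ) if $status == 1;

    # 0: clean run; 2: completed, but warnings are available
    return ( $tidied, $status, $stderr_text . $errorfile_text );
}

my ( $ok_text, $ok_status ) = tidy_or_report( "my\$x=1;\n", '-npro' );
my ( $bad_text, $bad_status, $bad_messages ) =
  tidy_or_report( "my\$x=1;\n", '-npro --spell-check' );

print "first call:  status $ok_status\n";
print "second call: status $bad_status\n";
```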

NOTES ON FORMATTING PARAMETERS

Parameters which control formatting may be passed in several ways: in a .perltidyrc configuration file, in the perltidyrc parameter, and in the argv parameter.

The -syn (--check-syntax) flag may be used with all source and destination streams except for standard input and output. However data streams which are not associated with a filename will be copied to a temporary file before being passed to Perl. This use of temporary files can cause somewhat confusing output from Perl.

If the -pbp style is used it will typically be necessary to also specify a -nst flag. This is necessary to turn off the -st flag contained in the -pbp parameter set which otherwise would direct the output stream to the standard output.

EXAMPLES

The following example uses string references to hold the input and output code and error streams, and illustrates checking for errors.

  use Perl::Tidy;

  my $source_string = <<'EOT';
  my$error=Perl::Tidy::perltidy(argv=>$argv,source=>\$source_string,
    destination=>\$dest_string,stderr=>\$stderr_string,
  errorfile=>\$errorfile_string,);
  EOT

  my $dest_string;
  my $stderr_string;
  my $errorfile_string;
  my $argv = "-npro";   # Ignore any .perltidyrc at this site
  $argv .= " -pbp";     # Format according to perl best practices
  $argv .= " -nst";     # Must turn off -st in case -pbp is specified
  $argv .= " -se";      # -se appends the errorfile to stderr
  ## $argv .= " --spell-check";  # uncomment to trigger an error

  print "<<RAW SOURCE>>\n$source_string\n";

  my $error = Perl::Tidy::perltidy(
      argv        => $argv,
      source      => \$source_string,
      destination => \$dest_string,
      stderr      => \$stderr_string,
      errorfile   => \$errorfile_string,    # ignored when -se flag is set
      ##phasers   => 'stun',                # uncomment to trigger an error
  );

  if ($error) {

      # serious error in input parameters, no tidied output
      print "<<STDERR>>\n$stderr_string\n";
      die "Exiting because of serious errors\n";
  }

  if ($dest_string)      { print "<<TIDIED SOURCE>>\n$dest_string\n" }
  if ($stderr_string)    { print "<<STDERR>>\n$stderr_string\n" }
  if ($errorfile_string) { print "<<.ERR file>>\n$errorfile_string\n" }

Additional examples are given in the examples section of the perltidy distribution.

Using the formatter Callback Object

The formatter parameter is an optional callback object which allows the calling program to receive tokenized lines directly from perltidy for further specialized processing. When this parameter is used, the two formatting options which are built into perltidy (beautification or html) are ignored. The following diagram illustrates the logical flow:

                    |-- (normal route)   -> code beautification
  caller->perltidy->|-- (-html flag )    -> create html
                    |-- (formatter given)-> callback to write_line

This can be useful for processing perl scripts in some way. The parameter $formatter in the perltidy call,

        formatter   => $formatter,

is an object created by the caller with a write_line method which will accept and process tokenized lines, one line per call. Here is a simple example of a write_line which merely prints the line number, the line type (as determined by perltidy), and the text of the line:

 sub write_line {

     # This is called from perltidy line-by-line
     my $self              = shift;
     my $line_of_tokens    = shift;
     my $line_type         = $line_of_tokens->{_line_type};
     my $input_line_number = $line_of_tokens->{_line_number};
     my $input_line        = $line_of_tokens->{_line_text};
     print "$input_line_number:$line_type:$input_line";
 }

The complete program, perllinetype, is contained in the examples section of the source distribution. As this example shows, the callback method receives a parameter $line_of_tokens, which is a reference to a hash of other useful information. This example uses these hash entries:

 $line_of_tokens->{_line_number} - the line number (1,2,...)
 $line_of_tokens->{_line_text}   - the text of the line
 $line_of_tokens->{_line_type}   - the type of the line, one of:

    SYSTEM         - system-specific code before hash-bang line
    CODE           - line of perl code (including comments)
    POD_START      - line starting pod, such as '=head'
    POD            - pod documentation text
    POD_END        - last line of pod section, '=cut'
    HERE           - text of here-document
    HERE_END       - last line of here-doc (target word)
    FORMAT         - format section
    FORMAT_END     - last line of format section, '.'
    DATA_START     - __DATA__ line
    DATA           - unidentified text following __DATA__
    END_START      - __END__ line
    END            - unidentified text following __END__
    ERROR          - we are in big trouble, probably not a perl script
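To see these line types in practice, a small self-contained sketch is given here. The package name LineTypeCounter and its counts field are hypothetical; the only method perltidy requires of the object is write_line. A destination is supplied as an assumption to keep any output away from standard output.

```perl
use strict;
use warnings;
use Perl::Tidy;

package LineTypeCounter;    # hypothetical callback class

sub new {
    my ($class) = @_;
    return bless { counts => {} }, $class;
}

# Called by perltidy once per input line
sub write_line {
    my ( $self, $line_of_tokens ) = @_;
    $self->{counts}{ $line_of_tokens->{_line_type} }++;
    return;
}

package main;

my $source = <<'EOT';
my $x = 1;

=pod

Some documentation.

=cut

print $x;
EOT

my $counter = LineTypeCounter->new();
my ( $unused_output, $stderr_text ) = ( '', '' );

Perl::Tidy::perltidy(
    argv        => '-npro',
    source      => \$source,
    destination => \$unused_output,
    stderr      => \$stderr_text,
    formatter   => $counter,
);

my $counts = $counter->{counts};
print "$_: $counts->{$_}\n" for sort keys %{$counts};
```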

Most applications will only be interested in lines of type CODE. For another example, let's write a program which checks for one of the so-called naughty matching variables $`, $&, and $', which can slow down processing. Here is a write_line, from the example program find_naughty.pl, which does that:

 sub write_line {

     # This is called back from perltidy line-by-line
     # We're looking for $`, $&, and $'
     my ( $self, $line_of_tokens ) = @_;

     # pull out some stuff we might need
     my $line_type         = $line_of_tokens->{_line_type};
     my $input_line_number = $line_of_tokens->{_line_number};
     my $input_line        = $line_of_tokens->{_line_text};
     my $rtoken_type       = $line_of_tokens->{_rtoken_type};
     my $rtokens           = $line_of_tokens->{_rtokens};
     chomp $input_line;

     # skip comments, pod, etc
     return if ( $line_type ne 'CODE' );

     # loop over tokens looking for $`, $&, and $'
     for ( my $j = 0 ; $j < @$rtoken_type ; $j++ ) {

         # we only want to examine token types 'i' (identifier)
         next unless $rtoken_type->[$j] eq 'i';

         # pull out the actual token text
         my $token = $rtokens->[$j];

         # and check it
         if ( $token =~ /^\$[\`\&\']$/ ) {
             print STDERR
               "$input_line_number: $token\n";
         }
     }
 }

This example pulls out these tokenization variables from the $line_of_tokens hash reference:

     $rtoken_type = $line_of_tokens->{_rtoken_type};
     $rtokens     = $line_of_tokens->{_rtokens};

The variable $rtoken_type is a reference to an array of token type codes, and $rtokens is a reference to a corresponding array of token text. These are obviously only defined for lines of type CODE. Perltidy classifies tokens into types, and has a brief code for each type. You can get a complete list at any time by running perltidy from the command line with

     perltidy --dump-token-types

In the present example, we are only looking for tokens of type i (identifiers), so the for loop skips past all other types. When an identifier is found, its actual text is checked to see if it is one being sought. If so, the above write_line prints the token and its line number.
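
To see how a write_line-style callback consumes these fields without running perltidy itself, the following sketch tallies token types on CODE lines. The package name TokenTypeTally, the counts accessor, and the hand-built hashes are assumptions made for illustration. The hashes only imitate the keys described above, and the type codes other than i are placeholders rather than output captured from perltidy.

```perl
use strict;
use warnings;

package TokenTypeTally;

sub new {
    my ($class) = @_;
    return bless { counts => {} }, $class;
}

# Same calling convention as the write_line shown above
sub write_line {
    my ( $self, $line_of_tokens ) = @_;

    # the token arrays are only meaningful for lines of type CODE
    return if ( $line_of_tokens->{_line_type} ne 'CODE' );

    my $rtoken_type = $line_of_tokens->{_rtoken_type};
    $self->{counts}->{$_}++ for @{$rtoken_type};
    return;
}

sub counts {
    my ($self) = @_;
    return $self->{counts};
}

package main;

# Hand-built stand-ins for the hashes perltidy would pass in (assumption:
# only the keys used in this section are filled in)
my @fake_lines = (
    {
        _line_type   => 'CODE',
        _line_number => 1,
        _line_text   => "\$x = \$&;\n",
        _rtoken_type => [ 'i', 'b', '=', 'b', 'i', ';' ],
        _rtokens     => [ '$x', ' ', '=', ' ', '$&', ';' ],
    },
    {
        _line_type   => 'POD_START',
        _line_number => 2,
        _line_text   => "=head1 NAME\n",
    },
);

my $tally = TokenTypeTally->new();
$tally->write_line($_) for @fake_lines;

my $counts = $tally->counts();
print "$_: $counts->{$_}\n" for sort keys %{$counts};
```

In real use the object would be handed to perltidy through the formatter option, and perltidy would supply the hashes one line at a time.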

The examples section of the source distribution has some examples of programs which use the formatter option.

For help with perltidy's peculiar way of breaking lines into tokens, you might run, from the command line,

 perltidy -D filename

where filename is a short script of interest. This will produce filename.DEBUG with interleaved lines of text and their token types. The -D flag has been in perltidy from the beginning for this purpose. If you want to see the code which creates this file, it is sub Perl::Tidy::Debugger::write_debug_entry.

EXPORT

  &perltidy

INSTALLATION

The module 'Perl::Tidy' comes with a binary 'perltidy' which is installed when the module is installed. The module name is case-sensitive. For example, the basic command for installing with cpanm is 'cpanm Perl::Tidy'.

VERSION

This man page documents Perl::Tidy version 20230309

LICENSE

This package is free software; you can redistribute it and/or modify it under the terms of the "GNU General Public License".

Please refer to the file "COPYING" for details.

BUG REPORTS

The source code repository is at https://github.com/perltidy/perltidy.

To report a new bug or problem, use the "issues" link on this page.

SEE ALSO

The perltidy(1) man page describes all of the features of perltidy. It can be found at http://perltidy.sourceforge.net.


Issues fixed after release 20211029

Fix tokenization issue c109

Automated random testing produced an error tokenizing the following code fragment:

    s s(..)(.)sss
    ;

This is equivalent to 's/(..)(.)//s' with 's' as the delimiter instead of '/'. It was tokenized correctly except when the final 's' was followed by a newline, as in the example. When the delimiter is a letter rather than a punctuation character, perltidy exercises some seldom-used code which had an off-by-one loop limit. This has been fixed.
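
The stated equivalence can be checked directly in Perl. This is only a small sketch; the sample string 'abcdef' and the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my $with_letter = 'abcdef';
my $with_slash  = 'abcdef';

# 's' as the delimiter, with the final 's' followed by a newline
# as in the fragment above
$with_letter =~ s s(..)(.)sss
;

# the same substitution with the usual '/' delimiter
$with_slash =~ s/(..)(.)//s;

print "$with_letter $with_slash\n";
```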

12 Nov 2021.

Fix tokenization of $$^, issue c106

Automated random testing produced an error tokenizing the following fragment:

   my$seed=$$^$^T;

The first ^ should have been tokenized as the bitwise xor operator but was not. This is fixed with this update.
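
For reference, the intended reading of the fragment can be written out with spaces. The sketch below assumes nothing beyond standard Perl special variables; the name $spaced is made up.

```perl
use strict;
use warnings;

# compact form, as in the fragment above
my$seed=$$^$^T;

# the same expression with the operator spaced out:
# $$ is the process ID, ^ is bitwise xor, $^T is the script start time
my $spaced = $$ ^ $^T;

print $seed == $spaced ? "same\n" : "different\n";
```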

8 Nov 2021

Fix coding error, issue c104

Automated random testing produced an error with something like the following input line taken from an obfuscated perl script:

    open(IN, $ 0);

The '0' was missing in the output:

    open( IN, $ );

The tokenization was correct, but a line of code in the formatter which removes the space between the '$' and the '0' should have used a 'defined' when doing a check:

    $token .= $word if ($word);             # OLD: error

This if test fails on '0'. The corrected line is

    $token .= $word if ( defined($word) );  # NEW: fixes c104

This fixes the problem and gives the correct formatted result

    open( IN, $0 );
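
The difference between the two tests can be seen in isolation with a short sketch. The sub names append_if_true and append_if_defined are hypothetical wrappers around the OLD and NEW lines quoted above; they are not perltidy routines.

```perl
use strict;
use warnings;

# OLD test: a plain truth test, which is false for the string '0'
sub append_if_true {
    my ( $token, $word ) = @_;
    $token .= $word if ($word);
    return $token;
}

# NEW test: only skips the append when there is no word at all
sub append_if_defined {
    my ( $token, $word ) = @_;
    $token .= $word if ( defined($word) );
    return $token;
}

print append_if_true( '$', '0' ),    "\n";    # $
print append_if_defined( '$', '0' ), "\n";    # $0
```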

8 Nov 2021.

Fix undefined variable reference, c102

Random testing produced an undefined variable reference for the following input

    make_sorter ( sort_sha => sub {sha512 ( $_} );
    make_sorter ( sort_ids => sub {/^ID:(\d+)/} );

when formatted with the following input parameters:

    --space-function-paren
    --maximum-line-length=26
    --noadd-newlines

Notice that the input script has a peculiar syntax error - the last two closing tokens of the first line are transposed. (Ironically, this snippet is taken from code which accompanied the book Perl Best Practices.) The perltidy tokenizer caught the syntax error, but the formatter touched an undefined variable while attempting to do the formatting. It would be possible to just skip formatting for errors like this, but seeing an attempted formatting can sometimes help in finding bugs. So the formatter coding has been corrected to avoid the undefined variable reference.

This fixes issue c102.

5 Nov 2021.

Some blocks with side comments exceed line length

In some rare cases, one-line blocks with side comments were exceeding the line length limit. These usually had a semicolon between the closing block brace and the side comment. For example:

        my $size
            = do { local $^W; -f $local && -s _ }; # no ALLO if sending data from a pipe

This update breaks the one-line block in an attempt to keep the total length below the line length limit. The result on the above is:

        my $size = do {
            local $^W;
            -f $local && -s _;
        };    # no ALLO if sending data from a pipe

Note that this break can be prevented by including the flag --ignore-side-comment-lengths or -iscl.

3 Nov 2021.

Issues fixed after release 20210625

Fix c090, inconsistent warning messages for deprecated syntax

For something like the following snippet, a warning about deprecated syntax was either going into the error file or the log file, depending on formatting. This has been fixed.

   do $roff ( &verify($tpage) );

20 Oct 2021, 72e4bb1.

Fix c091, incorrect closing side comment

An error was discovered and corrected in the behavior of the --closing-side-comment (-csc) flag when only subs were being marked with the setting -cscl='sub'. The problem was that in rare cases a closing paren could be marked with '## end'. The cause of the problem is that the pattern matching regex which was generated for this case happens to match an empty string, and it could happen that certain parens had empty strings as block names. This was fixed in two ways. First, the regex was fixed so that it cannot match an empty string. Second, a test for an empty string was added.
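
The two-part fix can be pictured with a small sketch. The actual pattern that perltidy generated is not shown in this log, so both patterns below, and the helper wants_closing_comment, are invented purely to illustrate how a pattern can match an empty block name and how the two safeguards combine.

```perl
use strict;
use warnings;

# Hypothetical patterns, invented for illustration only
my $loose_pattern  = qr/^(?:sub)?$/;    # optional group: also matches ''
my $strict_pattern = qr/^sub$/;         # cannot match ''

# Hypothetical helper: should a closing token with this block name
# receive a closing side comment?
sub wants_closing_comment {
    my ( $block_name, $pattern ) = @_;

    # explicit guard against an empty block name
    return 0 unless ( defined($block_name) && length($block_name) );

    return $block_name =~ $pattern ? 1 : 0;
}

print "loose pattern matches the empty string\n" if ( '' =~ $loose_pattern );
print "strict pattern rejects the empty string\n"
  if ( '' !~ $strict_pattern );
```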

20 Oct 2021, aa1a019.

Issue c089, improve vertical alignment for lists without parens

An update was made to improve vertical alignment in situations where parens are omitted around lists. The goal is to make lists without parens align as they would if they were contained in parens. Some examples:

    # OLD, no parens, no alignment:
    glVertex3d $cx + $s * $xs, $cy, $z;
    glVertex3d $cx, $cy + $s * $ys, $z;
    glVertex3d $cx - $s * $xs, $cy, $z;
    glVertex3d $cx, $cy - $s * $ys, $z;

    # OLD, with parens and aligned:
    glVertex3d( $cx + $s * $xs, $cy,            $z );
    glVertex3d( $cx,            $cy + $s * $ys, $z );
    glVertex3d( $cx - $s * $xs, $cy,            $z );
    glVertex3d( $cx,            $cy - $s * $ys, $z );

    # NEW, no parens but aligned
    glVertex3d $cx + $s * $xs, $cy,            $z;
    glVertex3d $cx,            $cy + $s * $ys, $z;
    glVertex3d $cx - $s * $xs, $cy,            $z;
    glVertex3d $cx,            $cy - $s * $ys, $z;

    # OLD
    mkTextConfig $c, $x, $y, -anchor => 'se', $color;
    mkTextConfig $c, $x + 30, $y, -anchor => 's',  $color;
    mkTextConfig $c, $x + 60, $y, -anchor => 'sw', $color;
    mkTextConfig $c, $x, $y + 30, -anchor => 'e', $color;

    # NEW
    mkTextConfig $c, $x,      $y,      -anchor => 'se', $color;
    mkTextConfig $c, $x + 30, $y,      -anchor => 's',  $color;
    mkTextConfig $c, $x + 60, $y,      -anchor => 'sw', $color;
    mkTextConfig $c, $x,      $y + 30, -anchor => 'e',  $color;

    # OLD
    permute_test [ 'a', 'b', 'c' ],   '/', '/', [ 'a', 'b', 'c' ];
    permute_test [ 'a,', 'b', 'c,' ], '/', '/', [ 'a,', 'b', 'c,' ];
    permute_test [ 'a', ',', '#', 'c' ], '/', '/', [ 'a', ',', '#', 'c' ];
    permute_test [ 'f_oo', 'b_ar' ], '/', '/', [ 'f_oo', 'b_ar' ];

    # NEW
    permute_test [ 'a', 'b', 'c' ],      '/', '/', [ 'a', 'b', 'c' ];
    permute_test [ 'a,', 'b', 'c,' ],    '/', '/', [ 'a,', 'b', 'c,' ];
    permute_test [ 'a', ',', '#', 'c' ], '/', '/', [ 'a', ',', '#', 'c' ];
    permute_test [ 'f_oo', 'b_ar' ],     '/', '/', [ 'f_oo', 'b_ar' ];

    # OLD:
    is $thingy, "fee",           "source filters apply to evalbytten strings";
    is "foo",   $unfiltered_foo, 'filters leak not out of byte evals';
    is $av->[2], "NAME:my_xop",          "OP_NAME returns registered name";
    is $av->[3], "DESC:XOP for testing", "OP_DESC returns registered desc";
    is $av->[4], "CLASS:$OA_UNOP",       "OP_CLASS returns registered class";
    is scalar @$av, 7, "registered peep called";
    is $av->[5], "peep:$unop", "...with correct 'o' param";
    is $av->[6], "oldop:$kid", "...and correct 'oldop' param";

    # NEW
    is $av->[2],    "NAME:my_xop",          "OP_NAME returns registered name";
    is $av->[3],    "DESC:XOP for testing", "OP_DESC returns registered desc";
    is $av->[4],    "CLASS:$OA_UNOP",       "OP_CLASS returns registered class";
    is scalar @$av, 7,                      "registered peep called";
    is $av->[5],    "peep:$unop",           "...with correct 'o' param";
    is $av->[6],    "oldop:$kid",           "...and correct 'oldop' param";

20 Oct 2021, 1dffec5.

Issue c087, breaking after anonymous sub

This update keeps both of the following configurations stable for all cases except when the -lp option is used. For the -lp option, both become one-line blocks (the second case) to prevent the -lp indentation style from being lost. This update was made to minimize changes to existing formatting.

    $obj = {
        foo => sub { "bar" }
    };

    $obj = { foo => sub { "bar" } };

17 Oct 2021, f05e6b5.

Improve formatting of some Moose structures

In some structures used in Moose coding, some asymmetrical container breaks were being caused by the braces being tokenized as hash braces rather than block braces. This was also causing some unwanted warning messages.

    # OLD
    ::is(
        ::exception { has '+bar' => ( default => sub { 100 } );
        },
        undef,
        '... extended the attribute successfully'
    );

    # NEW
    ::is(
        ::exception {
            has '+bar' => ( default => sub { 100 } );
        },
        undef,
        '... extended the attribute successfully'
    );

This fixes issue c074.

12 Oct 2021, 7e873fa.

Fix issue c081, -cscw preventing deletion of closing side comments

Random testing revealed a problem in which an old closing side comment was not being deleted when it fell below the interval specified on -csci=n and the -cscw flag was also set.

For example, the following snippet has been formatted with -csc -csci=1. The interval -csci=1 ensures that all blocks get side comments:

    if ($x3) {
        $zz;
        if ($x2) {
            if ($x1) {
                ..;
            } ## end if ($x1)
            $a;
            $b;
            $c;
        } ## end if ($x2)
    } ## end if ($x3)

If we then run with -csc -csci=6, the comment ## end if ($x1) will fall below the threshold and be removed (correctly):

    if ($x3) {
        $zz;
        if ($x2) {
            if ($x1) {
                ..;
            }
            $a;
            $b;
            $c;
        } ## end if ($x2)
    } ## end if ($x3)

But if we also add the -cscw flag (warnings) then it was not being removed. This update fixes this problem (issue c081).

2 Oct 2021, 25ef8e8

Partial fix for issue git #74 on -lp at anonymous subs

In the following snippet, the final one-line anonymous sub is not followed by a comma. This caused the -lp mode to revert to standard indentation mode because a forced line break was being made after the closing sub brace:

    # OLD, perltidy -lp
    $got = timethese(
        $iterations,
        {
          Foo => sub { ++$foo },
          Bar => '++$bar',
          Baz => sub { ++$baz }
        }
    );

An update was made to check for and fix this.

    # NEW, perltidy -lp
    $got = timethese(
                      $iterations,
                      {
                         Foo => sub { ++$foo },
                         Bar => '++$bar',
                         Baz => sub { ++$baz }
                      }
    );

But note that this only applies to one-line anonymous subs. If an anonymous sub is broken into several lines due to its length or complexity, then these forced line breaks cause indentation to revert to the standard indentation scheme.

22 Sep 2021, 4fd58f7.

Testing with random parameters produced a formatting instability related to the -vmll flag. The problem was due to a subtle difference in the definition of nesting depth and indentation level. The vertical aligner was using nesting depth instead of indentation level to compute the maximum line length when -vmll is set. In some rare cases there is a difference. The problem was fixed by passing the maximum line length to the vertical aligner so that the calculation is only done in the formatter. This fixes b1209.

20 Sep 2021, acf1d2d.

Fix issue b1208

Testing with random parameters produced a formatting instability which could be triggered when there is a short line length limit and there is a long side comment on the closing brace of a sort/map/grep/eval block. The problem was due to not correctly including the length of the side comment when testing to see if the block could fit on one line. This update fixes issue b1208.
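
The kind of length test involved can be sketched as follows. The helper name fits_on_one_line, its arguments, and the gap between code and comment are hypothetical and are not taken from perltidy's source; the point is only that the side comment has to be counted.

```perl
use strict;
use warnings;

# Hypothetical helper: would the code plus its side comment fit within
# the line length limit?
sub fits_on_one_line {
    my ( $code, $side_comment, $max_length, $gap ) = @_;
    my $length = length($code);

    # count the side comment and the spaces before it, if there is one
    if ( defined($side_comment) && length($side_comment) ) {
        $length += $gap + length($side_comment);
    }
    return $length <= $max_length ? 1 : 0;
}

my $code    = '@s = sort { $a <=> $b } @n;';
my $comment = '# a long side comment on the closing brace';

print fits_on_one_line( $code, undef,    40, 4 ), "\n";    # code alone
print fits_on_one_line( $code, $comment, 40, 4 ), "\n";    # code plus comment
```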

18 Sep 2021, 0af1321.

Fix issue git #73

The -nfpva parameter was not working correctly for functions called with pointers. For example

    # OLD: perltidy -sfp -nfpva
    $self->method                ( 'parameter_0', 'parameter_1' );
    $self->method_with_long_name ( 'parameter_0', 'parameter_1' );

    # FIXED: perltidy -sfp -nfpva
    $self->method ( 'parameter_0', 'parameter_1' );
    $self->method_with_long_name ( 'parameter_0', 'parameter_1' );

The problem had to do with how the pointer tokens '->' are represented internally and has been fixed.

17 Sep 2021, e3b4a6f.

Fix unusual parsing error b1207

Testing with random parameters produced another instability caused by misparsing an 'x' operator after a possible file handle. This is very similar to b1205, but involves a sigil immediately following a times operator.

To illustrate some cases, consider:

    sub x {print "arg is $_[0]\n"}
    my $num = 3;
    my @list=(1,2,3);
    my %hash=(1,2,3,4);
    open (my $fh, ">", "junk.txt");
    print $fh x$num;   # prints a GLOB $num times to STDOUT
    print $fh x9;      # prints 'x9' to file 'junk.txt'
    print $fh x@list;  # prints a GLOB 3 times to STDOUT
    print $fh x%hash;  # prints a GLOB 2 times to STDOUT

Note in particular the different treatment of the first two print statements.

This update fixes case b1207.

15 Sep 2021, 107586f.

Fix formatting instability b1206

Testing with random parameters produced an instability due to welding with a very short line length and a large value of -ci. This is similar to issues b1197-b1204 and is fixed with a similar method.

14 Sep 2021, 9704cd7.

Fix unusual parsing error b1205

Testing with random parameters produced an instability caused by misparsing an 'x' operator after a possible file handle. Testing with Perl showed that an 'x' followed by a '(' in this location is always the 'times' operator and never a call to a function 'x'. If x is immediately followed by a number it is subject to the usual weird parsing rules at such a location.

To illustrate, consider what these statements do:

    open( my $zz, ">", "junk.txt" );
    sub x { return $_[0] }  # never called
    print $zz x(2);    # prints a glob 2 times; not a function call
    print $zz x 2;     # prints a glob 2 times
    print $zz x2;      # syntax error
    print $zz x;       # syntax error
    print $zz z;       # prints 'z' in file 'junk.txt'

This update fixes case b1205.

13 Sep 2021, cfa2515.

Use stress_level to fix cases b1197-b1204

Testing with random input parameters produced a number of cases of unstable formatting. All of these involved some combination of a short maximum line length, a large -ci and -i, and often one or more of -xci -lp and -wn. These parameters can force lines to break at poor locations. All of these cases were fixed by introducing a quantity called the 'stress_level', which is the approximate indentation level at which the line break logic comes under high stress and becomes unstable. For default parameters the stress level is about 12, but unusual parameter combinations can make it much less, even as low as zero. For code which is at an indentation level greater than this depth, some defensive actions are taken to avoid instability, such as temporarily turning off the -xci flag when the indentation depth exceeds the stress level. Most actual working code will not be influenced by this logic. Actual code which has a very deep indentation level can avoid problems by using a long line length, a small number of indentation spaces, or even the whitespace-cycle parameter.

This update fixes issues b1197 b1198 b1199 b1200 b1201 b1202 b1203 b1204

12 Sep 2021, 0ac771e.

Fix unusual hanging side comment issue, c070

This issue can be illustrated with the following input script:

    {
        if ($xxx) {
          ...
        } ## end if ($xxx ...
        # b <filename>:<line> [<condition>]
    }


    # OLD: perltidy -fpsc=21
    {
        if ($xxx) {
            ...;
        } ## end if ($xxx ...
                        # b <filename>:<line> [<condition>]
    }

The comment '# b ..' moved over to column 21 as if it were a side comment. The reason is that it accidentally got marked as a hanging side comment. It should not have been, because the previous side comment was a closing side comment. This update fixes this:

    # NEW: perltidy -fpsc=21
    {
        if ($xxx) {
            ...;
        } ## end if ($xxx ...

        # b <filename>:<line> [<condition>]
    }

This fixes issue c070.

10 Sep 2021, ec6ccf9.

Fix parsing issue c068

This issue is illustrated with the following line:

   my $aa = $^ ? "defined" : "not defined";

If we tell perltidy to remove the space before the '?', then the output will no longer be a valid script:

   # perltidy -nwls='?':
   my $aa = $^? "defined" : "not defined";

The problem is that Perl considers '$^?' to be a special variable. So rerunning perltidy on this gives an error, and perl also gives an error. This update fixes the problem by preventing a space after anything like '$^' from being removed if a new special variable would be created.

This fixes issue c068.

7 Sep 2021, 9bc23d1.

Fix parsing problem issue c066

This issue is illustrated with the following line (rt80058):

   my $ok=$^Oeq"linux";

Running perltidy generated a warning message which is caused by the lack of space before the 'eq'. This update fixes the problem.

4 Sep 2021, df79a20.

Fix unusual parsing problem issue c065

Testing produced an unusual parsing problem in perltidy which can be illustrated with the following simple script:

    my $str = 'a'x2.2;
    print $str,"\n";

Normally an integer would follow the 'x' operator, but Perl seems to accept any valid number and truncates it to an integer. So in this case the number 2.2 is truncated to 2 and the output is 'aa'.

But perltidy, with default parameters, formats this as

    my $str = 'a' x 2 . 2;
    print $str,"\n";

which prints the result "aa2". The problem is fixed with this update. With the update, the result is

    my $str = 'a' x 2.2;
    print $str, "\n";

which is equivalent to the original script.

This fixes issue c065.
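
The two readings can be compared side by side in a few lines of Perl. This is a minimal sketch; the variable names are made up.

```perl
use strict;
use warnings;

# the original expression: the repeat count 2.2 is truncated to 2
my $original = 'a' x 2.2;

# the faulty formatting: 'x' binds tighter than '.', so this is
# ('a' x 2) . 2
my $split = 'a' x 2 . 2;

print "$original\n";    # aa
print "$split\n";       # aa2
```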

4 Sep 2021, f242f78.

Fix formatting instability issues b1195, b1196

Testing with random parameters produced two similar cases of unstable formatting which are fixed with this update.

28 Aug 2021, ab9ad39.

Fix formatting instability issue b1194

Testing with random parameters produced a case of unstable formatting which is fixed with this update.

This fixes case b1194, and at the same time it simplifies the logic which handles issues b1183 and b1191.

21 Aug 2021, 62e5b01.

Fix some problems involving tabs characters, case c062

This update fixes some problems found in random testing with tab characters. For example, in the following snippet there is a tab character after 'sub'

    do sub      : lvalue {
        return;
      }

Running perltidy on this repeatedly kept increasing the space between 'sub' and ':':

    # OLD: perltidy
    do sub       : lvalue {
        return;
      }

    # OLD: perltidy
    do sub        : lvalue {
        return;
      }

etc.

    # NEW: perltidy
    do sub : lvalue {
        return;
      }

Problems like this can occur if string comparisons use ' ' instead of the regex \s when working on spaces. Several instances of this were located and corrected.
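
The pitfall is easy to reproduce outside of perltidy. The sketch below is not perltidy code; it just applies both kinds of test to the character after 'sub' in a string that contains a tab, with made-up variable names.

```perl
use strict;
use warnings;

# a tab, not a space, follows 'sub'
my $text = "sub\t: lvalue";
my $char = substr( $text, 3, 1 );

# string comparison only recognizes a literal space
my $is_space_by_eq = ( $char eq ' ' ) ? 1 : 0;

# the regex class \s also recognizes a tab
my $is_space_by_regex = ( $char =~ /\s/ ) ? 1 : 0;

print "eq ' ':   $is_space_by_eq\n";
print "=~ /\\s/: $is_space_by_regex\n";
```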

This fixes issue c062.

18 Aug 2021, d86787f.

Correct parsing error, case c061

Testing with random input produced an error condition involving a side comment placed between a sub name and prototype, as in the following snippet:

    sub
    witch   # sc
    ()   # prototype may be on new line ...
    { print "which?\n" }
    witch();

The current version of perltidy makes an error:

    # OLD: perltidy
    sub witch   # sc ()    # prototype may be on new line ...
    { print "which?\n" }
    witch();

This update corrects the problem:

    # NEW: perltidy
    sub witch    # sc
      ()         # prototype may be on new line ...
    { print "which?\n" }
    witch();

This fixes case c061.

18 Aug 2021, 3bb2b2c.

Improved line break, case c060

The default formatting produced an undesirable line break at the last '&&' term in the following:

    my $static = grep {
             $class       =~ /^$_$/
          || $fullname    =~ /^$_$/
          || $method_name =~ /^$_$/
          && ( $class eq 'main' )
    } grep { !m![/\\.]! } $self->dispatch_to;    # filter PATH

This update corrects this to give

    my $static = grep {
             $class       =~ /^$_$/
          || $fullname    =~ /^$_$/
          || $method_name =~ /^$_$/ && ( $class eq 'main' )
    } grep { !m![/\\.]! } $self->dispatch_to;    # filter PATH

15 Aug 2021, 9d1c8a9.

Fix error check caused by -wn -iscl, case c058

Testing with random parameters triggered an internal error check. This was caused by a recent coding change which allowed a weld across a side comment. The problem was in the development version, not in the latest released version, and is fixed with this update. This closes issue c058.

14 Aug 2021, 5a13886.

Fix formatting instability, b1193

Testing with random parameters produced unstable formatting involving parameters which included -lp -sbvtc=1. This update fixes this problem, case b1193.

13 Aug 2021, d4c3425.

Fix error in tokenizer, issue c055

The ultimate cause of the undefined variable reference in the previous issue was found to be a typo in the tokenizer. This update finishes fixing issue c055.

10 Aug 2021, 2963db3

Fix undefined variable reference in development version

In testing, the following unbalanced snippet caused a reference to an undefined value in the current development version (but not in the current released version).

        if($CPAN::Config->{term_is_latin}){
                $swhat=~s{([\xC0-\xDF])([\x80-\xBF])}{chr(ord($1)<<6&0xC0|ord($2)&0x3F)}eg;}if($self->colorize_output){if($CPAN::DEBUG&&$swhat=~/^Debug\(/){
                        $ornament=$CPAN::Config->{colorize_debug}||"black on_cyan";}

A check has been added to fix this.

10 Aug 2021, a3f9774.

Fix formatting instability, b1192

Testing with random parameters produced unstable formatting with the following snippet when run with some unusual parameters:

          @EXPORT =
              ( @{$EXPORT_TAGS{standard}}, );

It could also be formatted as

          @EXPORT = (
                      @{$EXPORT_TAGS{standard}},
          );

The problem was that a list formatting style was turning on and off due to the needless terminal comma within the parens. A patch was made to ignore commas like this when deciding if list formatting should be used.

This fixes case b1192.

10 Aug 2021, b949215.

Fix formatting instability, b1191

Random testing produced an instability involving an unusual parameter combination and the following input code:

    $_[0]eq$_[1]
      or($_[1]=~/^([!~])(.)([\x00-\xff]*)/)
      and($1 eq '!')
      ^(eval{($_[2]."::".$_[0])=~/$2$3/;});

This update fixes case b1191.

9 Aug 2021, 16b4575.

Fix error parsing sub attributes without spaces, b1190

Testing with random parameters produced an instability which was caused by incorrect parsing of a sub attribute list without spaces, as in

        sub:lvalue{"a"}

This update fixes case b1190.

9 Aug 2021, 7008bcc.

Fix rare loss of vertical alignment in welded containers, c053

This update corrects a rare loss of vertical alignment in welded containers.

To illustrate the issue, the normal formatting of the following snippet is

    # perltidy -sil=1 
    ( $msg, $defstyle ) = do {
            $i == 1 ? ( "First", "Color" )
          : $i == 2 ? ( "Then", "Rarity" )
          :           ( "Then", "Name" );
    };

If it appears within a welded container, the alignment of the last line was being lost:

    # OLD: perltidy -wn -sil=1
    { {

        ( $msg, $defstyle ) = do {
                $i == 1 ? ( "First", "Color" )
              : $i == 2 ? ( "Then",  "Rarity" )
              : ( "Then", "Name" );
        };
    } }

The corrected result is

    # NEW: perltidy -wn -sil=1
    { {

        ( $msg, $defstyle ) = do {
                $i == 1 ? ( "First", "Color" )
              : $i == 2 ? ( "Then", "Rarity" )
              :           ( "Then", "Name" );
        };
    } }

Several other minor vertical alignment issues are fixed with this update. The problem was that two slightly different measures of the depth of indentation were being compared in the vertical aligner.

This fixes case c053.

8 Aug 2021, 97f02ee.

Fix edge case of formatting instability, b1189.

Testing with random parameters produced a case of unstable formatting involving welding with the parameter -notrim-qw. The problem was that the -notrim-qw flag converts a qw quote into a quote with fixed leading whitespace. The lines of these types of quotes which have no other code are output early in the formatting process, since they have a fixed format, so welding does not work. In particular, the closing tokens cannot be welded if they are on a separate line. This also holds for all types of non-qw quotes. So welding can only be done if the first and last lines of a non-qw quote contain other code. A check for this was added.

For example, in the following snippet the terminal '}' is alone on a line:

    is eval(q{
        $[ = 3;
        BEGIN { my $x = "foo\x{666}"; $x =~ /foo\p{Alnum}/; }
        $t[3];
    }
    ), "a";

In the previous version this was half-welded:

    # OLD: perltidy -wn -sil=0
    is eval( q{
        $[ = 3;
        BEGIN { my $x = "foo\x{666}"; $x =~ /foo\p{Alnum}/; }
        $t[3];
    }
      ),
      "a";

The new check avoids welding in this case, giving

    # NEW: perltidy -wn -sil=0
    is eval(
        q{
        $[ = 3;
        BEGIN { my $x = "foo\x{666}"; $x =~ /foo\p{Alnum}/; }
        $t[3];
    }
      ),
      "a";

Welding can still be done if the opening and closing container tokens have other code. For example, welding can be done for the following snippet:

    is eval(q{
        $[ = 3;
        BEGIN { my $x = "foo\x{666}"; $x =~ /foo\p{Alnum}/; }
        $t[3];
    }), "a";

And welding can still be done on all qw quotes unless the -notrim-qw flag is set.

This fixes case b1189.

7 Aug 2021, e9c25f2.

Fix edge cases of formatting instability, b1187 b1188.

Testing with random parameters produced some cases of instability involving -wn -lp and several other parameters. The mechanism involved an interaction between the formatter and vertical aligner.

This fixes cases b1187 and b1188.

3 Aug 2021, 5be949b.

Fix edge case of formatting instability, b1186.

Testing with random parameters produced a case of instability involving parameter -lp -pvt=n and a short maximum line length.

This fixes case b1186.

2 Aug 2021, f3dbee1.

Fix edge case of formatting instability, b1185.

Testing with random parameters produced a case of welding instability involving parameters -wn, -vt=2, -lp and a short maximum line length.

This fixes case b1185.

1 Aug 2021, d2ab2b7.

Fix edge case of formatting instability, b1183.

Testing with random parameters produced a case of welding instability involving parameters -wn, -vt=n, -lp. This update fixes the problem.

This fixes case b1183.

30 Jul 2021, 055650b.

Fix edge case of formatting instability, b1184.

Testing with random parameters produced a case of welding instability involving a triple weld with parameters -wn, -vt=n, -lp. This update fixes the problem.

This fixes case b1184.

29 Jul 2021, 6dd53fb.

Fix edge case of formatting instability, b1182.

Testing with random parameters produced a case of welding instability involving unusual parameters and welding long ternary expressions. This update fixes the problem.

This fixes case b1182.

28 Jul 2021, 01d6c40.

Fix edge case of formatting instability.

Testing with random input parameters produced cases of formatting instability involving welding with unusual parameter settings. This update makes a small tolerance adjustment to fix it.

This fixes cases b1180 b1181.

28 Jul 2021, b38ccfc.

Fix rare problem with formatting nested ternary statements

This update fixes an extremely rare problem in formatting nested ternary statements, illustrated in the following snippet:

  # OLD: There should be a break before the '?' in line 11 here:
  WriteMakefile(
      (
          $PERL_CORE ? ()
          : (
              (
                  eval { ExtUtils::MakeMaker->VERSION(6.48) }
                  ? ( MIN_PERL_VERSION => '5.006' )
                  : ()
              ),
              (
                  eval { ExtUtils::MakeMaker->VERSION(6.46) } ? (
                      META_MERGE => {
                          #
                      }
                    )
                  : ()
              ),
          )
      ),
  );

  # NEW: Line 12 correctly begins with a '?'
  WriteMakefile(
      (
          $PERL_CORE ? ()
          : (
              (
                  eval { ExtUtils::MakeMaker->VERSION(6.48) }
                  ? ( MIN_PERL_VERSION => '5.006' )
                  : ()
              ),
              (
                  eval { ExtUtils::MakeMaker->VERSION(6.46) }
                  ? (
                      META_MERGE => {
                          #
                      }
                    )
                  : ()
              ),
          )
      ),
  );

This fixes issue c050, 63784d8.

22 Jul 2021.

Fix conflict of -bom and -scp parameters

Automated testing with random parameters produced a case of instability caused by a conflict of parameters -bom and -scp. In the following script the -bom command says to keep the tokens ')->' on a new line, whereas the -scp command says to stack the closing paren on the previous line.

        $resource = {
                id => $package->new_from_mana(
                        $result->{data}
                )->id
        };

The parameters are:

    --break-at-old-method-breakpoints
    --indent-columns=8
    --maximum-line-length=60
    --stack-closing-paren

This caused an instability which was fixed by giving priority to the -bom flag. The stable state is then

        $resource = { id => $package->new_from_mana(
                        $result->{data}
        )->id };

This fixes case b1179.

21 Jul 2021, 4ecc078.

Fix problems with -kgb in complex structures

This update fixes two problems involving the -kgb option.

The first problem is that testing with random parameters produced some examples of formatting instability when --keyword-group-blanks was applied to complex structures, particularly welded structures. The -kgb parameters work well on simple statements or simple lists, so a check was added to prevent them from working on lists which are more than one level deep. This fixes issues b1177 and b1178 without significantly changing the -kgb functionality.

The second problem is that a terminal blank line could be misplaced if the last statement was a structure. This is illustrated with the following snippet:

    sub new {
      my $class = shift;
      my $number = shift || croak "What?? No number??\n";
      my $classToGenerate = shift || croak "Need a class to generate, man!\n";
      my $hash = shift; #No carp here, some operators do not need specific stuff
      my $self = { _number => $number,
                   _class => $classToGenerate,
                   _hash => $hash };
      bless $self, $class; # And bless it
      return $self;
    }

    # OLD: perltidy -kgb -sil=0 gave
    sub new {

        my $class           = shift;
        my $number          = shift || croak "What?? No number??\n";
        my $classToGenerate = shift || croak "Need a class to generate, man!\n";
        my $hash = shift;   #No carp here, some operators do not need specific stuff
        my $self = {
            _number => $number,

            _class => $classToGenerate,
            _hash  => $hash
        };
        bless $self, $class;    # And bless it
        return $self;
    }

The blank line which has appeared after the line '_number =>' was intended to go after the closing brace but a line count was off. This has been fixed:

    # NEW: perltidy -kgb -sil=0 gives
    sub new {

        my $class           = shift;
        my $number          = shift || croak "What?? No number??\n";
        my $classToGenerate = shift || croak "Need a class to generate, man!\n";
        my $hash = shift;   #No carp here, some operators do not need specific stuff
        my $self = {
            _number => $number,
            _class  => $classToGenerate,
            _hash   => $hash
        };

        bless $self, $class;    # And bless it
        return $self;
    }

This fixes issue c048.

19 Jul 2021, 071a3f6.

Fix to keep from losing blank lines after a code-skipping section

A problem in the current version of perltidy is that a blank line after the closing code-skipping comment can be lost if there was a blank line before the start of the code skipping section. For example, given the following code:

    $x = 1;

    #<<V
    % # = ( foo => 'bar', baz => 'buz' );
    print keys(%#), "\n";
    #>>V

    @# = ( foo, 'bar', baz, 'buz' );
    print @#, "\n";

running perltidy gives:

    $x = 1;

    #<<V
    % # = ( foo => 'bar', baz => 'buz' );
    print keys(%#), "\n";
    #>>V
    @# = ( foo, 'bar', baz, 'buz' );
    print @#, "\n";

Notice that the blank line after the closing comment #>>V is missing. The formatter counts consecutive blank lines but did not 'see' the code-skipping section, so the blank line after the closing comment looked like the second blank line in a row. It was therefore removed, since the default is --maximum-consecutive-blank-lines=1.

This update fixes this by resetting the counter. This fixes case c047. A simple workaround until the next release is to include the parameter

--maximum-consecutive-blank-lines=2, or -mbl=2.

It can be removed after the next release.
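The counting logic described above can be sketched with a toy model. This is not perltidy's code: the sub name `limit_blank_lines` and its arguments are hypothetical, and the only point is to show the blank-line counter being reset when a code-skipping section is passed through.

```perl
use strict;
use warnings;

# Toy model (not perltidy's code): copy lines, allowing at most $max_blank
# consecutive blank lines. Lines from '#<<V' through '#>>V' are copied
# verbatim. The counter is reset for every non-blank line, including the
# verbatim ones, which models the fix described above.
sub limit_blank_lines {
    my ( $lines, $max_blank ) = @_;
    my @out;
    my $blank_count = 0;
    my $in_skip     = 0;
    for my $line ( @{$lines} ) {
        if ($in_skip) {
            push @out, $line;
            $in_skip     = 0 if $line =~ /^\s*#>>V\b/;
            $blank_count = 0;
            next;
        }
        if ( $line =~ /^\s*#<<V\b/ ) {
            push @out, $line;
            $in_skip     = 1;
            $blank_count = 0;
            next;
        }
        if ( $line =~ /^\s*$/ ) {
            next if ++$blank_count > $max_blank;
        }
        else {
            $blank_count = 0;
        }
        push @out, $line;
    }
    return \@out;
}

# The blank line after '#>>V' survives because the counter was reset.
my @in = ( '$x = 1;', '', '#<<V', '% # = ( 1 );', '#>>V', '', '@a = ();' );
print "$_\n" for @{ limit_blank_lines( \@in, 1 ) };
```

If the two `$blank_count = 0` resets in the verbatim branches are removed, the second blank line in this input is dropped, which reproduces the behavior reported in case c047.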

18 Jul 2021, 9648e16.

Fix possible welding instability in ternary after fat comma

Random testing produced a case of formatting instability involving welding within a ternary expression following a fat comma:

    if ( $op and $op eq 'do_search' ) {
        @{$invoices} =
          GetInvoices(
              shipmentdatefrom =>
              $shipmentdatefrom ? output_pref( {
                             str => $shipmentdatefrom,
                             dateformat => 'iso'
              } )
              : undef,
              publicationyear => $publicationyear,
          );
    }

when the following specific parameters were used

    --extended-continuation-indentation
    --line-up-parentheses
    --maximum-line-length=38
    --variable-maximum-line-length
    --weld-nested-containers

This update fixes this issue, case b1174.

18 Jul 2021, 12ae46b.

Fix mis-tokenization before pointer

Random testing produced the following case in which formatting was unstable because the variable '$t' was being mis-tokenized as a possible indirect object.

    --break-before-all-operators
    --ignore-old-breakpoints
    --maximum-line-length=22
    -sil=0

    my $json_response
      = decode_json $t
      ->tx->res->content
      ->get_body_chunk;

This update fixes cases b1175, b1176.

17 Jul 2021, 4aa1318.

Fix to make -wn and -bbxx=n flags work together

Testing with random parameters produced some cases where -wn and various -bbxx=n flags were not working together. To illustrate, consider the following script (-sil=1 just means start at indentation level 1):

    # perltidy -sil=1
    $$d{"day_name"} = [
        [
            "lundi",    "mardi",  "mercredi", "jeudi",
            "vendredi", "samedi", "dimanche"
        ]
    ];

With welding we get:

    # -sil=1 -wn
    $$d{"day_name"} = [ [
        "lundi",    "mardi",  "mercredi", "jeudi",
        "vendredi", "samedi", "dimanche"
    ] ];

With -bbsb=3 (--break-before-square-brackets) we get:

    # -sil=1 -bbsb=3
    $$d{"day_name"} =
      [
        [
            "lundi",    "mardi",  "mercredi", "jeudi",
            "vendredi", "samedi", "dimanche"
        ]
      ];

So far, so good. But for the combination -bbsb=3 -wn we get

    # OLD: ERROR
    # -sil=1 -bbsb=3 -wn
    $$d{"day_name"} = [ [
        "lundi",    "mardi",  "mercredi", "jeudi",
        "vendredi", "samedi", "dimanche"
    ] ];

which is incorrect because it ignores the -bbsb flag. The corrected result is

    # NEW: OK
    # -sil=1 -bbsb=3 -wn
    $$d{"day_name"} =
      [ [
        "lundi",    "mardi",  "mercredi", "jeudi",
        "vendredi", "samedi", "dimanche"
      ] ];

This update fixes case b1173. It works for any number of welded containers, and the -bbxxi=n flags also work correctly.

16 Jul 2021, c71f882.

Fix problem with side comment after pattern

Testing with randomly placed side comments produced an error illustrated with the following snippet:

    testit
    /A (*THEN) X | B (*THEN) C/x#sc#
    ,
    "Simple (*THEN) test"
    ;

If 'testit' is an unknown bareword then perltidy has to guess if the '/' is a division or can start a pattern. In this case the side comment caused a bad guess. This is case c044 and is fixed with this update. There are no other known issues with side comments at the present time but testing continues.

13 Jul 2021, 8b36de8.

Fix problem with side comment after pointer, part 3

This update fixes some incorrect tokenization produced by a side comment placed between a pointer and a bareword as in this snippet:

    sub tzoffset {};

    ...

    my $J
    +=
    (
    $self
    ->#sc
    tzoffset
    / (
    24
    *
    3600
    )
    );

If a sub declaration for the bareword has been seen, the following '/' was being rejected as an operator. This update fixes this case, which is issue c043.

13 Jul 2021, cab7ed3.

Avoid line breaks before a slash in certain cases

This is a modification to the previous update for case c039 which prevents a line break before a '/' character which follows a bareword or possible indirect object. This rule will only be used to prevent creating new line breaks. Existing line breaks can remain.

11 Jul 2021, 7afee47.

Fix error parsing sub attributes with side comment

Testing with side comments produced an error in the following snippet:

    sub plugh () :#
      Ugly('\(") : Bad;

This is fixed in this update, case c038.

11 Jul 2021, 80f2a3a.

Fix case b1172, a failure to converge

Random testing produced case b1172, a failure to converge with unusual parameters. This update fixes this case. There are no other known cases of instability at the present time but testing continues.
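A "failure to converge" means that feeding the formatter its own output keeps producing changes. As a rough illustration only (this is not perltidy's test harness; `iterations_to_converge` and the two toy formatters are hypothetical names), a convergence check might look like this:

```perl
use strict;
use warnings;

# Hypothetical helper: apply $formatter (a code ref taking and returning a
# string) repeatedly until the text stops changing. Returns the number of
# passes that changed the text, or undef if no fixed point was found
# within $max_iter passes.
sub iterations_to_converge {
    my ( $formatter, $text, $max_iter ) = @_;
    for my $changes ( 0 .. $max_iter - 1 ) {
        my $next = $formatter->($text);
        return $changes if $next eq $text;
        $text = $next;
    }
    return;
}

# Stable toy formatter: collapses runs of spaces; a second pass changes nothing.
my $squeeze = sub { my $s = shift; $s =~ s/ {2,}/ /g; return $s };

# Unstable toy formatter: flips between two states forever.
my $flip = sub { my $s = shift; return $s eq "a" ? "b" : "a" };

print iterations_to_converge( $squeeze, "x  =   1;", 5 ), "\n";
my $n = iterations_to_converge( $flip, "a", 5 );
print defined($n) ? "stable\n" : "unstable\n";
```

With Perl::Tidy installed, the formatter code ref could wrap a call to Perl::Tidy::perltidy with string references for source and destination; that wiring is left out here so the sketch runs on its own.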

10 Jul 2021, 47e7f9b.

Avoid line breaks before a slash in certain cases

This update prevents a line break before a '/' character which follows a bareword or possible indirect object. The purpose is to reduce the chance of introducing a syntax error in cases where perl is using spaces to distinguish between a division and the start of a pattern.

This fixes case c039.

10 Jul 2021, 461199c.

Removed warning message if ending in code skipping section

In the previous version, a warning was produced if a 'code-skipping' opening comment '#<<V' was not followed by a closing comment '#>>V'. But the related 'format-skipping' commands do not give a warning if a '#<<<' comment is not ended with a '#>>>' closing comment. In order to be able to smoothly change between these options, it seems best to remove the warning about a missing '#>>V'. There is still a message in the log file about this, so if there is any uncertainty about it, a log file can be saved and consulted.

10 Jul 2021, 461199c.

Improve logic for distinguishing a pattern vs a division

Testing with side comments produced the following snippet which caused an error due to the side comment on the '/'

    $bond_str
    =
    VERY_WEAK #sc#
    / #sc#
    1.05
    ;

Several related examples were found in which side comments triggered errors. For example

    ok
    /[^\s]+/#sc#
    ,
    'm/[^\s]/ utf8'
    ;

This update fixes these problems, case c040.

9 Jul 2021, ffe4351.

Fix problem caused by side comment after ++

Testing with side comments produced an incorrect error message for this snippet:

    xxx 
    ++#
    $test, 
    Internals::SvREADONLY( %$copy) , 
    "cloned hash restricted?" ;

The problem was caused by the side comment between the '++' and '$test'. The same problem occurs for '--' instead of '++'. This is fixed with this update, case c042.

8 Jul 2021, 20cc9a0.

Fix problem caused by side comment after pointer, part 2

This is related to the previous issue, c037. The following snippet was misparsed at the old style ' package separator due to the side comment following the pointer.

    @ret
    =
    $o
    ->#
    SUPER'method
    (
    'whatever'
    )
    ;

This is fixed in this update, case c041.

7 Jul 2021, 1806772.

Fix problem caused by side comment after pointer

The following test script

    is(
    $one
    ->#sc#
    package
    ,
    "bar"
    ,
    "Got package"
    )
    ;

caused the following error message:

  4: package
     ^
  found package where operator expected

The problem was caused by a side comment between the pointer '->' and the word 'package'. This caused package to be misparsed as a keyword, causing the error.

This is fixed in this update, case c037, 96f2ebb.

Fix error parsing '%#' and similar combinations

Perltidy was correctly distinguishing between '$#' and '$ #' but not between '@#' and '@ #' and '%#' and '% #'. The coding for parsing these types of expressions has been corrected. Some simple examples:

    # this is a valid program, '%#' is a punctuation variable
    %# = ( foo => 'bar', baz => 'buz' );
    print keys(%#), "\n";

    # but this is a syntax error (space before # makes a side comment)
    # (perltidy was ignoring the space and forming '%#' here)
    % # = ( foo => 'bar', baz => 'buz' );
    print keys(%#), "\n";

    # this is a valid program, '@#' is a punctuation variable
    @# = ( foo , 'bar', baz , 'buz' );
    print @#, "\n";

    # this is a valid program, the space makes the '#' a side comment
    # perltidy formed %# here, causing an error
    % #
    var = ( foo => 'bar', baz => 'buz' );
    print keys(%var), "\n";

This fixes case c036.

6 Jul 2021, e233d41.

Fix error parsing '&#'

The following test script caused an error when perltidy did not correctly parse the tight side comment after the '&' (it parsed correctly if there was a space before the '#').

    print$my_bag
    &#sc#
    $your_bag
    ,
    "\n"
    ;

This update fixes this issue, c033.

5 Jul 2021, 0d784e0.

Fix error parsing format statement

The following test script caused an error when perltidy took 'format' to start a format statement.

    my$ascii#sc#
    =#sc#
    $formatter#sc#
    ->#sc#
    format#sc#
    (#sc#
    $html#sc#
    )#sc#
    ;#sc#

This was fixed by requiring a format statement to begin where a new statement can occur. This fixes issue c035.

5 Jul 2021, 2ef16fb.

Fix some incorrect error messages due to side comments

Testing with large numbers of side comments caused perltidy to produce some incorrect error messages. Two issues are fixed with this update. First, a side comment between a pointer '->' and the next identifier caused a message. Second, in some cases a comment after an opening paren could cause a message. The following snippet is an example.

    $Msg#sc#
    ->#sc#
    $field#sc#
    (#sc#
    )#sc#
    ;#sc#

This update fixes cases c029 and c030.

4 Jul 2021, caffc2c.

Fix undefined var ref involving --format-skipping

Testing produced a situation in which version 20210625 could cause an undefined variable to be accessed (the variable 'Ktoken_vars') as in this snippet:

    #!/usr/bin/perl
    #<<<
        my $ra= (
            [ 'Shine', 40 ], [ 'Specular', [ 1, 1, 0.3, 0 ] ] );
    #<<<
    ...

The conditions for this to happen are:

  (1) format skipping (#<<<) begins before the first line of code, and
  (2) the format skipping section contains the two successive characters ', ['.

The undefined variable was 'Ktoken_vars'. This problem was introduced by commit 21ef53b, an update which fixed case b1100. This undefined variable access does influence the formatted output.

This update fixes this problem.

4 Jul 2021, 82916fe.

Check for side comment within package statement

Testing with randomly placed side comments caused perltidy to produce an incorrect warning when a side comment is placed before the end of a package statement:

    package double # side comment
    ;

This update fixes this.

3 Jul 2021, c00059a.

Fix problem with --comma-arrow-breakpoints=n flag

Testing revealed a formatting irregularity which was caused when the flag -cab=2 got applied to a container which was not a list. This is fixed with this update, which fixes case b939a.

1 Jul 2021, 021b938.

Fix a formatting instability

Testing showed that a previously fixed case of instability, b1144, which was fixed 21 Jun 2021, 1b682fd, was unstable again. This update is a small change which fixes this. There are no other known unstable cases at this time but automated testing runs continue to search for instabilities.

1 Jul 2021, 021b938.

Fixed use of uninitialized value

The previous Tokenizer update caused the use of an uninitialized value when run on case b1053:

 Use of uninitialized value $next_nonblank_token in pattern match (m//) at /home/steve/bin/Perl/Tidy/Tokenizer.pm line 7589.
 Use of uninitialized value $nn_nonblank_token in pattern match (m//) at /home/steve/bin/Perl/Tidy/Tokenizer.pm line 3723.
 b1053 converged on iteration 2

This update fixes this.

1 Jul 2021, ea139bd.

Fix token type of colon introducing anonymous sub attribute list

In the following example

    print "not " unless ref +(
        map {
            sub : lvalue { "a" }
        } 1
    )[0] eq "CODE";

the colon after 'sub' was being marked as part of a label rather than the start of an attribute list. This does not cause an error, but the space before the ':' is lost. This is fixed in this update.

Note that something like 'sub :' can also be a label, so the tokenizer has to look ahead to decide what to do. For example, this 'sub :' is a label:

    my $xx = 0;
    sub : {
        $xx++;
        print "looping with label sub:, a=$xx\n";
        if ( $xx < 10 ) { goto "sub" }
    }

In this case, the goto statement needs quotes around "sub" because it is a keyword.

29 Jun 2021, d5fb3d5.

Minor adjustments to improve formatting stability

Testing with random input parameters produced several new cases of formatting instability involving unusual parameter combinations. This update fixes these cases, b1169 b1170 b1171, and all previously discovered cases remain stable with the update.

28 Jun 2021, e1f22e0.

Remove limit on -ci=n when -xci is set, see rt #136415

This update undoes the update c16c5ee of 20 Feb 2021, in which the value of -ci=n was limited to the value of -i=n when -xci was set. Recent improvements in stability tolerances allow this limit to be removed.

28 Jun 2021, 1b3c5e9.

Minor optimization

Added a quick check to bypass a needless sub call.

26 Jun 2021, e7822df.

Eliminate token variable _LEVEL_TRUE_

It was possible to eliminate this token variable by changing the order of welding operations. This reduces the number of variables per token from 12 to 11.

26 Jun 2021, 1f4f78c.

Eliminate token variable _CONTAINER_ENVIRONMENT_

Testing with NYTProf shows that the number of variables per token has a direct effect on efficiency. This update reduces the number of token variables from 13 to 12, and also simplifies the coding. It was possible to compute this variable from the others, so it was redundant.

26 Jun 2021, 300ca1e.

Issues fixed after release 20210402

Release 20210625

24 Jun 2021, a4ff53d.

Adjust tolerances to fix some unstable edge cases

Testing with random input parameters produced a number of edge cases of unstable formatting which were traced to parameter combinations that included -lp and some other unusual settings.

This fixes cases b1103 b1134 b1135 b1136 b1138 b1140 b1143 b1144 b1145 b1146 b1147 b1148 b1151 b1152 b1153 b1154 b1156 b1157 b1163 b1164 b1165

There are no other known cases of formatting instability at the present time, but testing with random input parameters will continue.

21 Jun 2021, 1b682fd.

Adjust tolerances to fix some unstable edge cases

Testing with random input parameters produced a number of edge cases of unstable formatting which were traced to parameter combinations that included -bbxi=2 and -cab=2. A small adjustment to length tolerances was made to fix the problem.

This fixes cases b1137 b1149 b1150 b1155 b1158 b1159 b1160 b1161 b1166 b1167 b1168.

19 Jun 2021, 4d4970a.

Added flag -atnl, --add-terminal-newline, see git #58

This flag, which is enabled by default, allows perltidy to terminate the last line of the output stream with a newline character, regardless of whether or not the input stream was terminated with a newline character. If this flag is negated, with -natnl, then perltidy will add a terminal newline to the output stream only if the input stream is terminated with a newline.

Negating this flag may be useful for manipulating one-line scripts intended for use on a command line.
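The rule for the terminal newline can be restated as a small decision function. The sketch below is only a model of the behavior described above, not perltidy's implementation; `finish_output` and its `$add_terminal_newline` argument (1 for the default -atnl, 0 for -natnl) are hypothetical names.

```perl
use strict;
use warnings;

# Toy model of the rule described above; $add_terminal_newline stands for
# the -atnl setting (1 = default, 0 = -natnl). Not perltidy's actual code.
sub finish_output {
    my ( $input, $formatted, $add_terminal_newline ) = @_;
    my $input_ends_with_newline = $input =~ /\n\z/ ? 1 : 0;
    $formatted =~ s/\n\z//;    # start from text with no final newline
    if ( $add_terminal_newline || $input_ends_with_newline ) {
        $formatted .= "\n";
    }
    return $formatted;
}

# A one-line script with no final newline keeps that form only with -natnl.
my $one_liner = 'print 1';
print length( finish_output( $one_liner, 'print 1;', 1 ) ), "\n";
print length( finish_output( $one_liner, 'print 1;', 0 ) ), "\n";
```

The two printed lengths differ by one character, the terminal newline that the default setting adds.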

This update also removes the description of the obsolete --check-syntax flag from the man pages and help text.

18 Jun 2021, 6f83170.

Allow --delete-side-comments to work with -nanl

The -nanl flag (--noadd-newlines) was preventing side comments from being deleted, for example:

    # perltidy -dsc -nanl
    calc()    # side comment

The same issue was happening for --delete-closing-side-comments. This has been fixed.

18 Jun 2021, dbfd802.

Update welding rule to avoid unstable states

Testing with random input parameters produced a formatting instability involving an unusual parameter combination:

    --noadd-whitespace
    --break-before-paren=3
    --continuation-indentation=8
    --delete-old-whitespace
    --line-up-parentheses
    --weld-nested-containers

and the following code

        if(defined$hints{family}){
            @infos=({
                     family=>$hints{family},
                     socktype=>$hints{socktype},
                     protocol=>$hints{protocol},
            });
        }

This update fixes the problem, case b1162.

18 Jun 2021, 76873ea.

Convert some weld sub calls to hash lookups

This is a minor optimization. These subs are eliminated: is_welded_right_at_K, is_welded_left_at_K, weld_len_right_at_K.
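The general idea of trading a small accessor sub for a direct hash lookup can be illustrated as follows. The hash `%weld_len_right_at_K` and its contents are invented for this sketch, and the body of the accessor is an assumption; only the sub name comes from the list above.

```perl
use strict;
use warnings;

# Hypothetical data: token index K => length of the weld to its right.
my %weld_len_right_at_K = ( 12 => 3, 40 => 1 );

# Before: every query goes through a small accessor sub.
sub is_welded_right_at_K {
    my ($K) = @_;
    return defined $K && exists $weld_len_right_at_K{$K};
}

# After: callers in hot loops test the hash directly, avoiding the sub call.
my $count_sub  = grep { is_welded_right_at_K($_) } 0 .. 50;
my $count_hash = grep { exists $weld_len_right_at_K{$_} } 0 .. 50;
print "$count_sub $count_hash\n";    # both forms find the same tokens
```

Both forms give the same answer. The second skips the argument unpacking and call overhead, which adds up when the test runs once per token.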

17 Jun 2021, 1691013.

Update LineSink.pm to allow undefined line endings

This update is necessary to eventually prevent an unwanted terminal newline being added to a file.

17 Jun 2021, 2600533.

Fix incorrect sub call

This fixes an incorrect call which could cause an incorrect weld.

16 Jun 2021, 068a28b.

Add --code-skipping option, see git #65

Added a new option '--code-skipping', requested in git #65, in which code between comment lines '#<<V' and '#>>V' is passed verbatim to the output stream without error checking. It is similar to --format-skipping but there is no error checking, and it is useful for skipping an extended syntax.
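The verbatim pass-through can be pictured as a line classifier. This is a toy model under stated assumptions, not perltidy's code: `tag_code_skipping` is a hypothetical name, and the marker matching is simplified to a comment line that starts with '#<<V' or '#>>V'.

```perl
use strict;
use warnings;

# Toy model (not perltidy's code): tag each line as 'verbatim' if it lies in
# a section from '#<<V' through '#>>V' (markers included), else 'format'.
# An unterminated section simply runs to the end of the input.
sub tag_code_skipping {
    my @lines = @_;
    my @tagged;
    my $in_skip = 0;
    for my $line (@lines) {
        $in_skip = 1 if !$in_skip && $line =~ /^\s*#<<V\b/;
        push @tagged, [ $in_skip ? 'verbatim' : 'format', $line ];
        $in_skip = 0 if $in_skip && $line =~ /^\s*#>>V\b/;
    }
    return @tagged;
}

printf "%-8s %s\n", @{$_} for tag_code_skipping(
    'my $x = 1;', '#<<V', 'weird { syntax', '#>>V', 'my $y = 2;'
);
```

Only the lines tagged 'format' would be handed to the tokenizer in this model, which is why an extended syntax inside the section cannot trigger error messages.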

16 Jun 2021, 99ec876.

Handle nested print format blocks

Perltidy was producing an error at nested print format blocks, such as

    format NEST =
    @<<<
    {
        my $birds = "birds";
        local *NEST = *BIRDS{FORMAT};
        write NEST;
        format BIRDS =
    @<<<<<
    $birds;
    .
    "nest"
      }
    .

It was ending the first format at the first '.' rather than the second '.' in this example. This update fixes this, issue c019.

13 Jun 2021.

Allow stacked labels without spaces

When labels are stacked in a single line, such as

    A:B:C:

the default is to space them:

    A: B: C:

This update allows the spaces to be removed if desired:

    # perltidy -naws -dws
    A:B:C:

13 Jun 2021, c2a63b2.

Fix edge cases of instability involving -wn -lp

Random testing produced some cases of instability involving -wn -lp and some unusual additional parameters. These were traced to a test for welding, and were fixed by refining a certain tolerance. This fixes cases b1141, b1142.

12 Jun 2021, 125494b.

Remove incorrect warning at repeated function paren call

This update removes an incorrect error message at the construct ')('. To illustrate, the following is a valid program:

    my @words = qw(To view this email as a web page go here);
    my @subs;
    push @subs, sub { my $i=shift; $i %= @words; print "$words[$i] "; return $subs[0]};
    $subs[0](0)(1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11);
    print "\n";

However perltidy was giving an error message at the ')(' combination, which is unusual in perl scripts. This update fixes this.

These are function call parens, so logically they should be under control of the -sfp or --space-function-parens parameter. I wrote a patch to do this, but decided not to implement it. The reason is that, as noted in the manual, subtle errors in perl scripts can occur when spaces are placed before parens. So, to avoid possible problems, the -sfp parameter will be restricted to spaces between a bareword [assumed to be a function] and a paren.

This update is in Tokenizer.pm and fixes case c017.

6 Jun 2021, 6551d65.

Add warning when lexical sub names match some builtins

This update adds a warning when lexical subs have names which match certain builtin names, because these will almost certainly cause a parsing error in the current version of perltidy. For example, the following program is valid and will run, but perltidy will produce an error.

    use feature qw(lexical_subs);
    use warnings; no warnings "experimental::lexical_subs";
    {
      my sub y { print "Hello from y: $_[0]\n"; }
      y(1);
    }

6 Jun 2021, 32729fb.

Minor cleanups

This update fixes a case of formatting instability recently found with random testing. It also does some minor coding cleanups.

This fixes case b1139.

5 Jun 2021, b8527ab.

Revised data structures for welding

This update replaces the data structures used for the welding option with simpler but more general structures. This cleans up the code and will simplify future coding. No formatting changes should occur with this update.

4 Jun 2021, 4a886c8.

Improved treatment of lexical subs

This update improves the treatment of lexical subs. Previously they were formatted correctly but an error would have been produced if the same name was used for lexical subs defined in different blocks.

For example, running the previous version of perltidy on the following:

    use feature qw(lexical_subs);
    use warnings; no warnings "experimental::lexical_subs";
    {
        my sub hello { print "Hello from hello1\n" }
        {
            my sub hello { print "Hello from hello2\n" }
            hello();
        }
        hello();
    }
    {
        my sub hello { print "Hello from hello3\n" }
        hello();
    }

gave the (incorrect) error message:

    6: already saw definition of 'sub hello' in package 'main' at line 4
    12: already saw definition of 'sub hello' in package 'main' at line 6

This update fixes that.

1 Jun 2021, 85ecb7a.

Add v-string underscores; warn of leading commas

This update cleans up a couple of open issues in the tokenizer.

A warning message will be produced for a list which begins with a comma:

            my %string = (
              ,       "saddr",    $stest,  "daddr",
              $dtest, "source",   $sname,  "dest")

This warning had been temporarily deactivated.

Underscores in v-strings without a leading 'v' are now parsed correctly.

Several comments have been updated.

31 May 2021, ef44e70.

Fix parsing error at operator following a comma

The following lines are syntactically correct but some were producing a syntax error

    print "hello1\n", || print "hi1\n";
    print "hello2\n", && print "bye2\n";
    print "hello3\n", or print "bye3\n";
    print "hello4\n", and print "bye4\n";

For example, the first line produced this message

    1: print "hello1\n", || print "hi1\n";
                       - ^
    found || where term expected (previous token underlined)

This has been fixed. This fixes case c015.

27 May 2021, b537a72.

Added optional o in octal number definitions

An optional letter 'o' or 'O' in the octal representation of numbers, which was added in perl version 5.33.5, is now recognized. The leading zero is still required.

For example:

    $a = 0o100;
    $a = 0O100;

26 May 2021, 544df8c.

Fix several problems with -lp formatting

This update fixes several problems with -lp formatting which are all somewhat related.

ISSUE #1 (cases c002 and c004): A problem involving -lp -wn and certain qw lists

The last line of a welded qw list was being outdented even if it contained text as well as the closing container token. This update fixes the problem and simplifies the logic.

A few examples (all use 'perltidy -wn -lp'):

    # OLD and NEW: OK, closing qw paren is on separate line
    $roads->add_capacity_path( qw( CoolCity 10 ChocolateGulch 8
                               PecanPeak 10 BlueberryWoods 6
                               HotCity
    ) );

    # OLD: poor; outdented text not aligned with previous text
    $roads->add_capacity_path( qw( CoolCity 10 ChocolateGulch 8
                               PecanPeak 10 BlueberryWoods 6
    HotCity ) );

    # NEW:
    $roads->add_capacity_path( qw( CoolCity 10 ChocolateGulch 8
                               PecanPeak 10 BlueberryWoods 6
                               HotCity ) );

    # OLD:
    $roads->add_capacity_path( qw( ChocolateGulch 3 StrawberryFields 0
    StrawberryFields ) );

    # NEW:
    $roads->add_capacity_path( qw( ChocolateGulch 3 StrawberryFields 0
                               StrawberryFields ) );

    # OLD:
    my $mon_name = ( qw(January February March April
                     May June July August
    September October November December) )[$mon];

    # NEW
    my $mon_name = ( qw(January February March April
                     May June July August
                     September October November December) )[$mon];

ISSUE #2 (case c007): A rare program error could be triggered with -lp -xci

In some very rare circumstances it was possible to trigger a "Program error" message. The program still ran to completion. The conditions for this to occur were that flags -lp -xci were set, and that there was a container of sort/map/grep blocks, and there was a side comment on the closing paren. For example:

    # OLD: perltidy -lp -xci, gave an error message and extra indentation here
    my %specified_opts = (
          (
                          map { /^no-?(.*)$/i ? ($1 => 0) : ($_ => 1) }
                          map { /^--([\-_\w]+)$/ } @ARGV
          ),    # this comment caused an error with flags -lp -xci
    );

    # NEW: perltidy -lp -xci, no error
    my %specified_opts = (
          (
             map { /^no-?(.*)$/i ? ( $1 => 0 ) : ( $_ => 1 ) }
             map { /^--([\-_\w]+)$/ } @ARGV
          ),    # this comment caused an error with flags -lp -xci
    );

ISSUE #3 (case c008): In some unusual cases the -lp formatting style was not being applied when it should have been. For example (text is shifted right):

    # OLD: perltidy -lp
    $result = runperl(
        switches => [ '-I.', "-w" ],
        stderr   => 1,
        prog     => <<'PROG' );
    SIG
    PROG

    # NEW: perltidy -lp
    $result = runperl(
                       switches => [ '-I.', "-w" ],
                       stderr   => 1,
                       prog     => <<'PROG' );
    SIG
    PROG

25 May 2021, 6947fe9.

Modify welding rules

This is an update to the patch 19 Apr 2021, eeeaf09. It restricts that patch to -lp formatting mode.

This fixes case b1131.

21 May 2021, a4ec4c1.

Fix inconsistency involving counting commas

Random testing produced a formatting instability involving the combination of flags -bbp=2 -xci -vt=2 -bvtc=2. The problem was traced to an error in counting the number of line ending commas in lists.

This fixes case b1130.

15 May 2021, 90cceb1.

Slightly modify line breaks for -lp indentation

Random testing produced an edge case of formatting instability for -lp indentation which was traced to checking for an old line break at a '=>'. This has been fixed. Some existing formatting with deeply nested structures may be slightly changed due to the fix, but most existing formatting will be unchanged.

This fixes b1035.

15 May 2021, dd42648.

Rewrite coding for -bom flag

Random testing produced some examples of formatting instability involving the -bom flag in combination with certain other flags which are fixed with this update. As part of this update, a previous update to fix case b977 (21 Feb 2021, commit 28114e9) was revised to use a better criterion for deciding when not to keep a ')->' break. The previous criterion was that the opening and closing containers should be separated by more than one line. The new criterion is that they should contain a list. This still fixes case b977. Another case, b1120, was fixed by requiring that only parentheses expressions be considered for keeping a line break, not '}->' or ']->'.

Some issues are illustrated in the following examples using '-bom -gnu'. In the first example the leading ')->' was being lost due to the old b977 fix:

    # input:
    $show = $top->Entry( '-width' => 20,
                       )->pack('-side' => 'left');

    # OLD: perltidy -gnu -bom
    $show = $top->Entry('-width' => 20,)->pack('-side' => 'left');

    # NEW: perltidy -gnu -bom
    $show = $top->Entry(
                        '-width' => 20,
                       )->pack('-side' => 'left');

In the following example a leading '->' was being lost. The NEW version keeps the leading '->' but has to give up on the -lp alignment because of complexity:

        # input
        $_make_phase_arg = join(" ",
                           map {CPAN::HandleConfig
                                 ->safe_quote($_)} @{$prefs->{$phase}{args}},
                          );

        # OLD: perltidy -gnu -bom
        $_make_phase_arg = join(" ",
                                map { CPAN::HandleConfig->safe_quote($_) }
                                  @{$prefs->{$phase}{args}},
                               );

        # NEW: perltidy -gnu -bom
        $_make_phase_arg = join(
            " ",
            map {
                CPAN::HandleConfig
                  ->safe_quote($_)
            } @{$prefs->{$phase}{args}},
        );

In the following example, a leading ')->' was being converted to a leading '->' due to the old b977 fix:

    # Starting script
    $lb = $t->Scrolled("Listbox", -scrollbars => "osoe"
                      )->pack(-fill => "both", -expand => 1);

    # OLD: perltidy -bom -gnu
    $lb = $t->Scrolled( "Listbox", -scrollbars => "osoe" )
      ->pack( -fill => "both", -expand => 1 );

    # NEW: perltidy -bom -gnu
    $lb = $t->Scrolled(
                       "Listbox", -scrollbars => "osoe"
                      )->pack(-fill => "both", -expand => 1);

In the following example, a leading ')->' was being lost, again due to the old b977 fix:

    $myDiag->Label(-text => $text,
                                  )->pack(-fill => 'x',
                                                  -padx => 3,
                                                  -pady => 3);

    # OLD: -gnu -bom
    $myDiag->Label(-text => $text,)->pack(
                                          -fill => 'x',
                                          -padx => 3,
                                          -pady => 3
                                         );

    # NEW -gnu -bom
    $myDiag->Label(
                   -text => $text,
      )->pack(
              -fill => 'x',
              -padx => 3,
              -pady => 3
             );

This update fixes case b1120 and revises the fix for b977.

13 May 2021, d0ac5e9.

Adjust tolerances for some line length tests

Random testing produced some edge cases of unstable formatting involving the -lp format. These were fixed by using a slightly larger tolerance in the length test for containers which were broken in the input file.

This fixes cases b1059 b1063 b1117.

13 May 2021, 24a11d3.

Do not apply -lp formatting to containers with here-doc text

If a container contains text of a here-doc then the indentation is fixed by the here-doc text, so applying -lp formatting does not work well. So this update turns off the -lp formatting in this case.

But note that if a container contains a here-doc target but not the here-doc text, it still gets the -lp indentation:

    # perltidy -lp
    &WH::Spell::AddSpell(
                          "Cause Light Wounds", "WFP",
                          "CauseLightWounds",   <<'EOH');
    ...
    EOH

This fixes case b1081.

10 May 2021, 4f7a56b.

Fix some edge welding cases

Some adjustments to the welding coding were made to maintain stability for some unusual parameter combinations.

This fixes cases b1111 b1112.

9 May 2021, 68f619a.

Improve tolerance for welding qw quotes

The tolerance for welding qw quotes has been updated to be the same as that used for welding other tokens. This fixes case b1129.

9 May 2021, d1de85f.

Revise weld tolerances, simplify code, fix welded ci

The welding process uses a tolerance to keep results stable. Basically, a zero tolerance is used if it looks like it is welding an existing weld, while a finite tolerance is used otherwise. The effect is to reject a few marginal welds to gain stability. The coding to do this was simplified and the tolerance was made more precise to fix case b1124.
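The idea of a state-dependent tolerance can be illustrated with a small sketch. This is not the actual perltidy coding; the function name `weld_fits`, its arguments, and the tolerance value are assumptions chosen only to show why a zero tolerance for existing welds and a finite tolerance for new welds keeps both states stable.

```perl
use strict;
use warnings;

# Hypothetical sketch (not Perl::Tidy code): decide if a weld is allowed.
#   $length           - estimated length of the welded line
#   $max_length       - maximum allowed line length
#   $is_existing_weld - true if the input already looks welded
#   $tolerance        - assumed safety margin applied to new welds only
sub weld_fits {
    my ( $length, $max_length, $is_existing_weld, $tolerance ) = @_;

    # Zero tolerance for an existing weld, finite tolerance otherwise
    my $tol = $is_existing_weld ? 0 : $tolerance;
    return ( $length + $tol <= $max_length ) ? 1 : 0;
}

# A marginal case: length 79 with a maximum of 80 and a tolerance of 2.
# A new weld is rejected, but an existing weld of the same length is kept.
print "new weld allowed: ",      weld_fits( 79, 80, 0, 2 ), "\n";
print "existing weld allowed: ", weld_fits( 79, 80, 1, 2 ), "\n";
```

In this sketch a marginal weld is never created, and a weld which already exists is never undone, so neither state flips on a later iteration.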

Another change with this update is that at welded containers, the value of the -ci flag of an outer opening token is transferred to the inner opening token. This may improve some indentation in a few cases if the -lp flag is also used. It has no effect if -lp is not used.

    # OLD: perltidy -wn -gnu
    emit_symbols([qw(
     ctermid
     get_sysinfo
     Perl_OS2_init
     ...
     CroakWinError
    )]);

    # NEW: perltidy -wn -gnu
    emit_symbols([qw(
        ctermid
        get_sysinfo
        Perl_OS2_init
        ...
        CroakWinError
    )]);

9 May 2021, ad8870b.

Correct brace types mismarked by tokenizer, update

This is a generalization of commit 7d23bf4 in which the tokenizer sends a signal to the formatter if the type of a brace following an unknown bareword had to be guessed. The formatter has more information and can fix the problem. This fixes case b1128.

8 May 2021, b3eaf23.

Added warning when -ci has to be reduced; ref rt #136415

In commit c16c5ee an update was made to prevent instability with -xci when the value of -ci exceeds -i (which is not recommended). This update adds a warning message to avoid confusing the user.

7 May 2021, e9e14e4.

Improve indentation of welded multiline qw quotes

Formatting of multiline qw lists with welding works best if the opening and closing qw tokens are on separate lines, like this:

    # perltidy -wn
    my $mon_name = ( qw(
        January February March April
        May June July August
        September October November December
    ) )[$mon];

    # perltidy -wn -lp
    my $mon_name = ( qw(
                     January February March April
                     May June July August
                     September October November December
    ) )[$mon];

Otherwise formatting can be poor, particularly if the last line has text and a closing container.

    # OLD: perltidy -wn
    my $mon_name = ( qw(January February March April
        May June July August
    September October November December) )[$mon];

Note that perltidy does not change the line breaks within multiline quotes, so they must be changed by hand if desired.

This update indents the last line of a multiline quote which contains both text and the closing token, such as:

    # NEW: perltidy -wn
    my $mon_name = ( qw(January February March April
        May June July August
        September October November December) )[$mon];

This update applies only when the -lp flag is not used. If -lp is used and the last line contains text, the last line is still outdented:

    # OLD and NEW: perltidy -wn -lp
    my $mon_name = ( qw(January February March April
                     May June July August
    September October November December) )[$mon];

This is difficult to fix. The best solution is to place the closing qw containers on a separate line.

This fix is for case c002.

6 May 2021, 176f8a7.

Test length of closing multiline qw quote before welding

Random testing produced an unstable state which was due to not checking for excessive length of the last line of a multiline qw quote. A check was added. This fixes issue b1039.

5 May 2021, b72ad24.

Update welding rule to avoid blinking states

Random testing with unusual parameter combinations produced some unstable welds. For example case b1106 has these parameters

    --noadd-whitespace
    --continuation-indentation=6
    --delete-old-whitespace
    --line-up-parentheses
    --maximum-line-length=36
    --variable-maximum-line-length
    --weld-nested-containers

and was switching between these two states:

            return map{
                ($_,[$self->$_(@_[1..$#_])])
            }@every;

            return map { (
                $_, [ $self->$_( @_[ 1 .. $#_ ] ) ]
            ) } @every;

An existing rule, WELD RULE 2, was updated to prevent welding to an intact one-line weld, as in the first snippet, if it is on a separate line from the first opening token. With this change, both of these states are stable.

This update fixes cases b1082 b1102 b1106 b1115.

4 May 2021, 07efa9d.

Fix problem of conflict of -otr and -lp

Several random test cases produced an instability involving -otr and -lp. In -lp mode, when an opening paren follows an equals and is far to the right, a break is made at the equals to minimize the indentation of the next lines. The -otr flag is a suggestion that an opening paren should be placed on the right. A check has been made to avoid this in -lp mode following an equals, since that defeats the purpose of the original break.

This fixes cases b964 b1040 b1062 b1083 b1089.

4 May 2021, 24a0d32.

Add option -pvtc=3, requested in rt136417

A new integer option, n=3, has been added to the vertical tightness closing flags. For a container with n=3, the closing token will behave as for n=0 if the opening token is preceded by an '=' or '=>', and like n=1 otherwise.
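The rule for n=3 can be restated as a small lookup. The following is only an illustrative sketch and not perltidy's internal coding; the function name `effective_vtc` and its arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl::Tidy): map a requested closing
# vertical tightness $n to the value which is effectively used, given the
# token which precedes the opening container token.
sub effective_vtc {
    my ( $n, $previous_token ) = @_;

    # Values other than 3 are used as given
    return $n if $n != 3;

    # n=3: behave like n=0 after '=' or '=>', and like n=1 otherwise
    my $after_assignment = defined($previous_token)
      && ( $previous_token eq '=' || $previous_token eq '=>' );
    return $after_assignment ? 0 : 1;
}

print effective_vtc( 3, '=>' ),    "\n";    # container follows a fat comma
print effective_vtc( 3, 'print' ), "\n";    # container follows something else
```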

3 May 2021, 93063a7.

Fix vertical alignment issue in rt136416

This update fixes a problem with unwanted vertical alignment raised in rt#136416. The example is

    use File::Spec::Functions 'catfile', 'catdir';
    use Mojo::Base 'Mojolicious',        '-signatures';

An update was made to reject alignments in use statements with different module names. The test file t/snippets/align35.in has more examples.
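The test can be pictured roughly as follows. This is a simplified sketch which works on raw text lines, whereas perltidy works on tokens; the function names `use_module_name` and `may_align_use_lines` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch: return the module name of a 'use' statement,
# or undef if the line is not a 'use' statement.
sub use_module_name {
    my ($line) = @_;
    return $line =~ /^\s*use\s+([A-Za-z_][\w:]*)/ ? $1 : undef;
}

# Allow alignment of two 'use' lines only if they name the same module.
# Lines which are not 'use' statements are not restricted by this rule.
sub may_align_use_lines {
    my ( $line_a, $line_b ) = @_;
    my $module_a = use_module_name($line_a);
    my $module_b = use_module_name($line_b);
    return 1 unless defined($module_a) && defined($module_b);
    return $module_a eq $module_b ? 1 : 0;
}

my $ok = may_align_use_lines(
    "use File::Spec::Functions 'catfile', 'catdir';",
    "use Mojo::Base 'Mojolicious', '-signatures';",
);
print "align: $ok\n";
```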

3 May 2021, 048126c.

Fix some rare issues with the -lp option

Random testing produced some rare cases of unstable formatting involving the -lp option which are fixed with this update. This is a generalization of commit edc7878 of 23 Jan 2021. This fixes cases b1109 b1110.

2 May 2021, a8d1c8b.

Correct brace types mismarked by tokenizer

This is a generalization of commit 7d23bf4 to fix some additional cases found in random testing in which the type of a curly brace could not be determined in the tokenizer and was not being corrected by the formatter.

This fixes cases b1125 b1126 b1127.

2 May 2021, dac97cb.

Avoid instability of combination -bbx=2 -lp and -xci

Random testing produced several cases in which the flags -bbx=2 -lp and -xci were causing formatting instability. The fix is to locally turn off -xci when -lp and -bbx=2 are in effect. This is an extension of commit 2b05051.

This fixes cases b1090 b1095 b1101 b1116 b1118 b1121 b1122 b1099

1 May 2021, 4cb81ba.

Restrict use of flag -cab=3 to simple structures

Logic was added to turn off -cab=3 in complex structures. Otherwise, instability can occur. When it is overridden, the behavior of the closest match, -cab=2, is used instead.

For example, using these parameters for case b1113

    --noadd-whitespace
    --break-before-hash-brace-and-indent=2
    --break-before-hash-brace=1
    --comma-arrow-breakpoints=3
    --continuation-indentation=9
    --maximum-line-length=76
    --variable-maximum-line-length

formatting of the following snippet was unstable:

    $STYLESHEET{'html-light'}={
        'tags'=>{
            'predefined identifier'=>
                     {
                'start'=>'<font color="#2040a0"><strong>',
                'stop'=>'</strong></font>'
                     },
        }
    };

This update fixes cases b1096 b1113.

29 Apr 2021, 32a1830.

Update docs for git #64 regarding -lp and side comments

The wording regarding when -lp reverts to the default indentation scheme has been revised to include side comments as follows:

In situations where perltidy does not have complete freedom to choose line breaks it may temporarily revert to its default indentation method. This can occur for example if there are blank lines, block comments, multi-line quotes, or side comments between the opening and closing parens, braces, or brackets.

The word 'may' is significant for side comments. In a list which is just one level deep side comments will work (perhaps with -iscl if side comments are long). For example this is ok

    # perltidy -lp
    $gif->arc(
               50, 50,     # Center x, y.
               30, 30,     # Width, Height.
               0,  360,    # Start Angle, End Angle.
               $red
    );

But if a list is more than one level deep then the default indentation is used.

28 Apr 2021, 49977b8.

Fix line break rules for uncontained commas + cleanups

This is an adjustment of update 344519e which had to do with breaking lines with commas which were not inside of a container. In a few cases it was producing very long lines when -l=0 was set. The solution was to remove the concatenation operator from the list of operators at which breaks were prevented.

Other updates are: Remove unused indentation table. Correct maximum_line_length table for -vmll when -wc is also set. Also fix whitespace rule for '$ =' within a signature to fix case b1123.

26 Apr 2021, d014c2a.

Fix problem with -wn and -wc=n

Random testing produced some cases in which the -wn flag was unstable when -wc=n was used with very small n. This has been fixed.

This fixes cases b1098 b1107.

25 Apr 2021, 92babdf.

Adjust line break rules for uncontained commas

Random testing produced case c1119 which was unstable due to the formatting rules for breaking lines at commas which occur outside of containers. The rules were modified to fix the problem.

20 Apr 2021, 344519e.

Fix a bad line break choice at a slash

Random testing produced case c001 in which the following snippet

   ok $mi/(@mtime-1) >= 0.75 && $ai/(@atime-1) >= 0.75 &&
             $ss/(@mtime+@atime) >= 0.2;

when processed with these parameters

    --maximum-line-length=20
    --nowant-right-space=' / '
    --want-break-before='* /'

produced the following result

    ok $mi
      /( @mtime - 1 ) >=
      0.75
      && $ai
      /( @atime - 1 )
      >= 0.75
      && $ss
      /( @mtime +
          @atime ) >=
      0.2;

using 'perl -cw' on this snippet gives a syntax error

    syntax error at /tmp/issues.t line 5, near "/( "
        (Might be a runaway multi-line // string starting on line 2)

The error is due to perl's weird parsing rules near a possible indirect object. This is a situation where perltidy must ignore a user spacing and line break request. This should have been done, but in this case a flag to prevent this was not being propagated to later stages of formatting. This has been fixed.

20 Apr 2021, 4fbc69a.

Fix rare problem with -lp -wn

Random testing produced case b1114 which gave unstable formatting with these parameters

    --noadd-whitespace
    --indent-columns=8
    --line-up-parentheses
    --maximum-line-length=25
    --weld-nested-containers

and this snippet

    is(length(pack("j", 0)),
        $Config{ivsize});

Fixed 19 Apr 2021, eeeaf09.

Fix issue git#63

The following lines produced an error message due to the side comment

    my $fragment = $parser->    #parse_html_string
      parse_balanced_chunk($I);

Fixed 18 Apr 2021, c2030cf.

Avoid welding at sort/map/grep paren calls

Random testing produced several cases of unstable welds in which the inner container is something like 'sort ('. The problem was that there are special rules which prevent a break following such a paren. The problem was fixed by preventing welds at these locations.

This update fixes cases b1077 b1092 b1093 b1094 b1104 b1105 b1108.

17 Apr 2021, d679b48.

Fix issue git#62

This fixes issue git #62. A similar issue for the % operator was fixed.

17 Apr 2021, f80d677.

Fix problem involving -bbx=2 -xci -osbr and similar -otr flags

Random testing produced case b1100 in which the output style produced by the --opening-token-right flags interfered with counting line-ending commas, and this in turn caused the -bbx flag to turn off the -xci flag. This problem was fixed.

15 Apr 2021, 21ef53b.

Fix rare line break problem

Random testing produced case b1097 with this parameter set

    --brace-vertical-tightness-closing=1
    --continuation-indentation=8
    --indent-columns=10
    --maximum-line-length=36

and either this output

          my (@files) = @{
                    $args{-files} };

or this output

          my (@files) =
                  @{ $args{-files}
                  };

The different results were caused by the unusual combination of parameters. The problem was fixed by not allowing the formatter to consider existing breaks at highly stressed locations such as these.

15 Apr 2021, 9f15b9d.

Fix problem parsing anonymous subs with attribute lists

Random testing produced case b994 with unstable formatting:

    do
    sub :
    lvalue
    {
    return;
    }

when run with parameters:

    --continuation-indentation=0
    --ignore-old-breakpoints
    --maximum-line-length=7
    --opening-anonymous-sub-brace-on-new-line

The line 'sub :' was being correctly parsed but the following opening block brace was not correctly marked as an anonymous sub brace. This fixes cases b994 and b1053.

15 Apr 2021, 84c1123.

Correct brace types mismarked by tokenizer

Testing with random parameters produced a case in which a brace following an unknown bareword was marked by the tokenizer as a code block brace rather than a hash brace. This can cause poor formatting. The problem was solved by having the tokenizer send a signal to the formatter if a block type was guessed. The formatter has more information and can fix the problem. This fixes case b1085.

11 Apr 2021, 7d23bf4.

Unify coding for welded quotes and other welded containers

Random testing produced some cases where welded quotes were not converging. These were found to be due to the same problem previously encountered and fixed for normal containers. The problem was fixed by moving the corrected coding to a new common sub.

This update fixes cases b1066 b1067 b1071 b1079 b1080.

10 Apr 2021, 5d73dd5.

Slight change in weld length calculation

Random testing produced some cases of instability with some unusual input parameter combinations involving the -wn parameter. This was fixed by revising a line length calculation. This fixes cases b604 and b605.

9 Apr 2021, a25cfaa.

Improve treatment of -vmll with -wn

Random testing showed a weakness in the treatment of the -vmll flag in combination with the -wn flag. This has been fixed.

This fixes cases b866 b1074 b1075 b1084 b1086 b1087 b1088

8 Apr 2021, a6effa3.

Merge weld rule 6 into rule 3

One of the welding rules, RULE 6, has been merged into RULE 3 for generality. This rule restricts welding to an opening container followed by a bare word, which can cause instability in some cases. The updated code is less restrictive and fixes some cases recently found with random testing, b1078 b1091.

8 Apr 2021, f28ab55.

Moved logic of previous update to the FileWriter module

The previous update regarding blank line generation was not sufficiently general to handle all possible parameter combinations. The problem was solved and simplified by moving the logic to a lower level, in the FileWriter module.

6 Apr 2021, 756e930.

Fix problem with excess blank line generation with -blao

Random testing produced some cases where excess blank lines could be generated with the parameter --blank-lines-after-opening-block. Case b1073 has the following script

    sub stop {

        1;
    }

and the following parameters

    --blank-lines-after-opening-block=2
    --maximum-consecutive-blank-lines=10
    --maximum-line-length=15

When run, blank lines keep getting generated until the maximum is reached. This has been fixed.

6 Apr 2021, 9216298.

Fix edge case involving -wn and -lp or -bbao

A change was made to a welding rule involving the -lp option or -wbb='=', and very short maximum line lengths. This correctly fixes case b1041. It replaces a fix for this case reported on 2 Apr 2021.

5 Apr 2021.

Modify a condition for applying -bbx=2

Random testing produced some edge cases in which formatting with the -bbx=2 flags, in combination with certain other parameters, was not converging. An existing criterion for the -bbx=2 flag to apply is that there be a broken sub-list with at least one line-ending comma. This was updated to also require either a fat comma or one additional line-ending comma. This filters out some problem cases without changing much existing formatting.
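The updated criterion can be written as a simple predicate. This is a sketch under the assumption that the number of line-ending commas and the presence of a fat comma are already known for the sub-list; the function name `bbx2_list_qualifies` is hypothetical and is not perltidy's internal name.

```perl
use strict;
use warnings;

# Hypothetical sketch of the updated -bbx=2 criterion for a broken sub-list:
#   $line_ending_commas - number of commas which end a line in the sub-list
#   $has_fat_comma      - true if the sub-list contains a '=>'
sub bbx2_list_qualifies {
    my ( $line_ending_commas, $has_fat_comma ) = @_;

    # Existing criterion: at least one line-ending comma
    return 0 if $line_ending_commas < 1;

    # New criterion: also a fat comma or one more line-ending comma
    return ( $has_fat_comma || $line_ending_commas >= 2 ) ? 1 : 0;
}

print bbx2_list_qualifies( 1, 0 ), "\n";    # one comma, no '=>'
print bbx2_list_qualifies( 1, 1 ), "\n";    # one comma and a '=>'
print bbx2_list_qualifies( 2, 0 ), "\n";    # two commas, no '=>'
```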

This update fixes cases b1068 b1069 b1070 b1072 b1073 b1076.

5 Apr 2021, 16c4591.

Improve previous -wn update

The previous update produced some problems in testing which are corrected with this update.

5 Apr 2021, ffef089.

Fix rare convergence problem with -wn

Random testing produced some cases in which unusual parameter combinations caused lack of convergence for the -wn flag. The problem was fixed by adjusting a tolerance in the line length calculation.

This fixes cases b1041 b1055.

2 Apr 2021, a8b6259.

Avoid conflict of -bli and -xci

Random testing produced a case with the combination -bli and -xci which did not converge. This was fixed by turning off -xci for braces under -bli control.

This fixes case b1065.

2 Apr 2021, d20ea80.

Issues fixed after release 20210111

Avoid conflict of -bbp=2 and -xci

Random testing produced a number of cases of unstable formatting when both -xci and -bbp=2 or similar flags were set. The problem was that -xci can cause one-line blocks to break open, causing the -bbp=2 flag to continually switch formatting. The problem is fixed by locally turning off -xci at containers which do not themselves contain broken containers.

This fixes cases b1033 b1036 b1037 b1038 b1042 b1043 b1044 b1045 b1046 b1047 b1051 b1052 b1061.

30 Mar 2021, 2b05051.

Fix rule for welding with barewords

Random testing produced a case which was not converging due to a rule which avoids welding when a bareword follows. The rule was modified to allow an exception for an existing one-line weld. A non-fatal typo was also discovered and fixed.

This fixes cases b1057 b1064.

29 Mar 2021, d677082.

Fix conflict between -wba='||' and -opr

Random testing produced a problem with convergence due to a conflict between two parameters for the following code

    my $lxy =
      ( @$cx - @$cy ) ||
      (
        length ( int ( $cx->[-1] ) ) -
        length ( int ( $cy->[-1] ) ) );

when using these parameters

    --break-after-all-operators
    --maximum-line-length=61
    --opening-paren-right

Both the '||' and the '(' want to be at the end of a line according to the parameters. The problem is resolved by giving priority to the '||'. This fixes case b1060.

29 Mar 2021, 6921a7d.

Follow user requests better to break before operators

Random testing produced some cases in which user requests to break before selected operators were not being followed. For example

    # OLD: perltidy -wbb='.='
    $value .=
      ( grep /\s/, ( $value, $next ) )
      ? " $next"
      : $next;

    # FIXED: perltidy -wbb='.='
    $value
      .= ( grep /\s/, ( $value, $next ) )
      ? " $next"
      : $next;

This fixes case b1054.

28 Mar 2021, 94f0877.

Fix problems with combinations of -iob -lp

This is a correction to the update of 13 Mar 2021, 71adc77. Random testing produced several additional problems with convergence involving the combination -iob -lp. This update fixes the problem by overriding -iob at some breakpoints which are essential to the -lp parameter.

This update fixes these old cases: b1021 b1023

and these new cases: b1034 b1048 b1049 b1050 b1056 b1058

27 Mar 2021, cc94623.

Add flag -lpxl=s to provide control over -lp formatting

The flag -lpxl=s provides control over which containers get -lp formatting. A shortcut flag -lfp is also added for limiting -lp to simple function calls.

Updated 25 Mar 2021, bfc00fp.

Fix error message for multiple conflicting specifications in -wnxl

There was an error in the coding for an error message which checks for conflicting requests in the -wnxl parameter.

Fixed 21 Mar 2021, 2ef97a2.

Fix issue git #57, Warn_count was not initialized

This update fixes issue git #57, in which a warning flag was not getting zeroed on each new call to perltidy.
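The general pattern behind this kind of bug is a file-scoped counter which survives between calls. The sketch below is a toy illustration only; the names `$Warn_count`, `record_warning` and `format_source` are used here hypothetically and the warning condition is invented for the example.

```perl
use strict;
use warnings;

# File-scoped counter: its value persists between calls unless it is reset
my $Warn_count = 0;

sub record_warning { $Warn_count++; return }

# Hypothetical entry point which processes a list of source lines
sub format_source {
    my (@lines) = @_;

    # The essential step: zero the counter at the start of each call
    $Warn_count = 0;

    foreach my $line (@lines) {

        # Toy condition: warn about lines which contain a tab
        record_warning() if $line =~ /\t/;
    }
    return $Warn_count;
}

print format_source( "a\tb", "c" ), "\n";    # first call sees one warning
print format_source("x"), "\n";               # second call starts from zero
```

Without the reset at the top of `format_source`, the second call in this sketch would report the warning left over from the first call.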

19 Mar 2021, b6d296a.

Fix rare problem with combination -lp -wn -naws

This update fixes case b1032 by including lines starting with 'if', 'or', and '||' among the stable breakpoints for welding when -lp -naws flags are also set.

This update also modifies update 7a6be43 of 16 Mar 2021 to exclude list items when checking token lengths. This reduces changes to existing formatting while still fixing the problem in case b1031.

18 Mar 2021, 6200034.

Fix definition of list within list for -bbx flags

Testing produced a blinking state involving a -bbx=2 flag with an unusual combination of other parameters. The problem was traced to the definition of a list containing another list being too restrictive. This update fixes case b1024.

17 Mar 2021, 7f5da0a.

Fix problem with -xci and long tokens

Testing produced an unstable situation involving the -xci flag and tokens which exceed the maximum line length. This fix identifies this situation and locally deactivates the -xci flag. This fixes case b1031.

16 Mar 2021, 7a6be43.

Fix error in parsing use statement curly brace

Testing with random parameters produced some cases where the -xci option was not producing stable results when the maximum line length was set to a very small value. The problem was traced to the tokenizer misparsing a hash brace of a use statement as a code block type. This influences the value of continuation indentation within the braces. The problem was fixed.

This fixes cases b1022 b1025 b1026 b1027 b1028 b1029 b1030

16 Mar 2021, 6371be2.

Fix problems with combinations of -iob -lp -wn -dws -naws

Testing with random parameters produced some situations where the parameter -iob interfered with convergence when parameters -lp and/or -wn were also set. The combination -wn -lp -dws -naws also produced some non-converging states in testing. This update fixes these issues.

The following cases are fixed: b1019 b1020 b1021 b1023

13 Mar 2021, 71adc77.

Simplify sub weld_nested_containers

This update consolidates a number of specialized rules for welding into fewer, simpler rules which accomplish the same effect.

These cases are fixed with this update: b186 b520 b872 b937 b996 b997 b1002 b1003 b1004 b1005 b1006 b1013 b1014

There are no current open issues with the weld logic.

10 Mar 2021, cf3ed23.

Adjust line length tolerance for welding

A minor tolerance adjustment was needed to fix some edge welding cases.

This fixes cases b995 b998 b1000 b1001 b1007 b1008 b1009 b1010 b1011 b1012 b1016 b1017 b1018

7 Mar 2021, b9166ca.

Fix problem with -vtc=n and outdented long lines

Random testing produced an issue with -vtc=1 and an outdented long line. The parameters for b999 are

    --maximum-line-length=75
    --paren-vertical-tightness-closing=1

File 'b999.in' state 1 is

                while ( $line =~
    s/^([^\t]*)(\t+)/$1.(" " x ((length($2)<<3)-(length($1)&7)))/e
                  )

and state 2 is

                while ( $line =~
    s/^([^\t]*)(\t+)/$1.(" " x ((length($2)<<3)-(length($1)&7)))/e)

The problem was fixed by turning off caching for outdented long lines. This fixes case b999.

7 Mar 2021, 3da7e41.

Fix problem with combination -lp and -wbb='='

Random testing produced case b932 in which the combination -lp and -wbb='=' was not stable.

File 'b932.par' is:

    --line-up-parentheses
    --maximum-line-length=51
    --want-break-before='='

File 'b932.in' in the desired state is:

    my @parts
      = decompose( '(\s+|/|\!|=)',
                   $line, undef, 1, undef, '["\']' );

The alternate state is

    my @parts = decompose( '(\s+|/|\!|=)',
                     $line, undef, 1, undef, '["\']' );

The problem was that the -lp code which set a line break at the equals did not check the -wba flag setting.

This update fixes case b932.

7 Mar 2021, 63129c3.

Fix edge formatting cases with parameter -bbx=2

Random testing produced some cases where formatting with parameters of the form --break-before-..=2 can lead to unstable final states. The problem lies in the definition of a broken list. The problem is fixed by defining a broken list for this particular flag to be a list with at least one non-terminal line-ending comma. This ensures that the list will remain broken on subsequent iterations. This fixes cases b789 and b938.
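One way to picture this definition is the following sketch, which works on the text lines between the opening and closing container tokens. It is an interpretation for illustration only: perltidy works on tokens rather than raw lines, side comments are ignored here, and the function name `has_nonterminal_line_ending_comma` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch: given the lines inside a container, report whether
# any line other than the last ends in a comma. A comma which ends the
# last line is the terminal comma of the list and does not count.
sub has_nonterminal_line_ending_comma {
    my (@lines) = @_;

    # Ignore trailing blank lines
    pop @lines while @lines && $lines[-1] !~ /\S/;
    return 0 if @lines < 2;

    # Remove the last line; a comma at its end would be terminal
    pop @lines;
    return ( grep { /,\s*$/ } @lines ) ? 1 : 0;
}

# Two items on two lines: the first comma is non-terminal
print has_nonterminal_line_ending_comma( "'a' => 1,", "'b' => 2" ), "\n";

# All items on one line: only a terminal comma ends the line
print has_nonterminal_line_ending_comma("'a' => 1, 'b' => 2,"), "\n";
```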

6 Mar 2021, 360d669.

Add flag -fpva, --function-paren-vertical-alignment

A flag -fpva, --function-paren-vertical-alignment, is added to prevent vertical alignment of function parens when the -sfp flag is used. This is on by default, so that existing formatting remains unchanged unless the user requests that vertical alignment not occur with -nfpva.

5 Mar 2021, 312be4c.

Fix for issue git #53, do not align spaced function parens

Introducing a space before a function call paren had a side effect of allowing the vertical aligner to align the parens, as in the example.

    # OLD and NEW, default without -sfp:
    log_something_with_long_function( 'This is a log message.', 2 );
    Coro::AnyEvent::sleep( 3, 4 );

    # OLD: perltidy -sfp 
    log_something_with_long_function ( 'This is a log message.', 2 );
    Coro::AnyEvent::sleep            ( 3, 4 );

    # NEW: perltidy -sfp 
    log_something_with_long_function ( 'This is a log message.', 2 );
    Coro::AnyEvent::sleep ( 3, 4 );

This update changes the default to not do this vertical alignment. This should have been the default but this side-effect was missed when the -sfp parameter was added. Note that parens following keywords are likewise not vertically aligned.

5 Mar 2021, 00431bf.

Fix issue git#54 involving -bbp=n and -bbpi=n

In this issue, different results were obtained depending upon the existence of a comma in a list. To fix this, the definition of a list was adjusted from requiring one or more commas to requiring either a fat comma or a comma.

At the same time, a known problem involving the combination -lp -bbp=n -bbpi=n was fixed. This fixes cases b826 b909 b989.

4 Mar 2021, 872d4b4.

Fix several minor weld issues

Some edge cases for the welding parameter -wn have been fixed. There are no other currently known weld issues. Some debug code for welding has been left in the code for possible future use.

This fixes cases b109 b110 b520 b756 b901 b937 b965 b982 b988 b991 b992 b993

3 Mar 2021, cfef087.

Update tokenizer recognition of indirect object

This is the parameter file b990.pro:

    --noadd-whitespace
    --continuation-indentation=0
    --maximum-line-length=7
    --space-terminal-semicolon

Applying perltidy -pro=b990.pro to the following snippet gave two states

    # State 1
    print
    H;

    # State 2
    print H
    ;

The tokenizer was alternately parsing 'H' as either a possible indirect object, 'Z', or an indirect object, 'Y'. Two fixes were tested. The first was to modify the tokenizer to recognize a ';' as well as a space as a direct object terminator. An alternative fix is to not allow a break before type 'Y' so that the tokenizer keeps parsing as type 'Y'. Both fixes work, but the latter fix would change existing formatting by the -extrude option, so the first fix was used. With this fix, the stable state is 'State 1' above.

This update is a generalization of the update "Fixed blinker related to line break at indirect object" of 16 Jan 2021.

This fixes case b990.

1 Mar 2021, 49cd66f.

Do not start a batch with a blank token

Perltidy does final formatting in discrete batches of tokens, where a batch is a continuous section of the total token list. A batch begins a new line and will be broken into one or more lines. If a batch starts with a blank token it will simply be skipped on output. However, some rare problems have been found in random testing which can occur if a batch starts with a blank. An example is case b984 which has the following parameters:

        # this is file 'b984.pro'
        --block-brace-vertical-tightness=2
        --indent-columns=10
        --maximum-line-length=27
        --outdent-keywords
        --variable-maximum-line-length

          # OLD: perltidy -pro=b984.pro
          unless (
                    exists $self->{
                              'accession_number'} )
          {         return "unknown";
          }

          # NEW: perltidy -pro=b984.pro
          unless (
                    exists $self->{
                              'accession_number'} )
          {       return "unknown";
          }

Both look OK, but the OLD version did not outdent the keyword 'return' as requested with the -okw flag.

This update fixes cases b149 b888 b984 b985 b986 b987.

28 Feb 2021, 8aaf599.

Avoid double spaces in -csc text output

Random testing produced some rare cases where two spaces could occur in -csc text. This happened when there were multiple lines and the formatter placed a leading blank in one of the continuation lines as padding. This has been fixed.

For example

    while (
        <>;
      )
    {
       ...
    } ## end while ( <>; )

Previously, the last line had an extra space after the ';'

    } ## end while ( <>;  )

Another example

    while (
        do {
            { package DB; @a = caller( $i++ ) }
        }
      )
    {
      ...
    } ## end while ( do { { package DB...}})

Previously the last line had an extra space between the opening braces:

    } ## end while ( do {  { package DB...}})

27 Feb 2021, b22e891.

Remove control of debug flag -fll

Random testing produced an unstable state when a debug flag, -nfll, was set. The only time it is appropriate to set this flag is if the -extrude option is set, so a check was added to verify this. This fixes case b935.

27 Feb 2021, 9155b3d.

Restrict previous update to just -vmll

The previous update was found to occasionally change existing formatting needlessly with very long lines, so it is now restricted to cases where -vmll is set. For example, it is ok to keep the long quote following the opening paren in the following case.

  # perltidy -gnu
  ok( "got to the end without dying (note without DEBUGGING passing this test means nothing)"
    );

26 Feb 2021, 2b88464.

Add a gap calculation in line length tests with -vmll

This fixes case b965. The -vmll flag can produce gaps in lines which need to be included in weld line length estimates.

26 Feb 2021, a643cf2.

Update rule for spacing paren after constant function

Random testing produced an unstable state for the following snippet (case b934)

        sub pi();
        if (
            $t >
            pi( )
              )

when run with these parameters:

  --continuation-indentation=6
  --maximum-line-length=17
  --paren-vertical-tightness-closing=2

The formatting was stable without the first line, which declares 'pi' to be a constant sub. The problem was fixed by updating a regex to treat the spacing of a paren following a sub the same for the two token types, 'U' or 'C' (constant function).

This fixes case b934, 12bfdfe.

26 Feb 2021.

Improve line length test for the -vtc=2 option

This is a small change to the update of 13 Feb 2021, f79a4f1. Random testing produced additional blinking states caused by the combination of -vtc=2 and -vmll flag, plus several others. The problem was that a line length check in the vertical aligner was being skipped as an optimization if it didn't appear necessary. The unusual properties of the -vmll flag require that the check always be done.

This fixes cases b656 b862 b971 b972.

26 Feb 2021, 80107e0.

Improve one-line block length tests

Some oscillating states produced in random parameter tests were traced to problems with forming one-line blocks. A more precise length test was added to fix this.

This fixes cases b562 b563 b574 b777 b778 b924 b936 b975 b976 b983.

In the process of fixing this issue, a glitch was discovered in the previous coding of the -bl (braces-left) flag that caused somewhat random results for block types sort/map/grep/eval. The problem was a conflict between the logic for forming one-line blocks and the logic for applying the -bl flag. Usually, -bl formatting was not applied to these block types, but occasionally it was. To minimize changes in existing formatting, in the new version the -bl flag is not applied to these block types. A future flag could be added to give user control over which of these block types are under -bl control.

25 Feb 2021, 92bec8d.

Add tolerance to one-line block length tests

Testing with random input parameters produced some cases in which a stable solution could not be found due to attempts to form one-line blocks near the end of a long line. The problem was fixed by adding a small tolerance to the line length test. This does not change existing formatting.

This fixes cases b069 b070 b077 b078.

21 Feb 2021, 0b97b94.

Restrict -bom at cuddled method calls

The -bom flag tries to keep old breakpoints at lines beginning with '->' and also with some lines beginning with ')->'. These latter lines can lead to blinking states in cases where the opening paren is on the previous line. To fix this, a restriction was added that the line difference between the opening and closing parens should be more than 1.

This fixes case b977.

21 Feb 2021, 28114e9.

Add weld rule to avoid conflict between -wn and -bom

Testing with random input parameters produced states which were oscillating because of a conflict between the -wn and -bom parameters. The problem was resolved by giving the -bom parameter priority over -wn.

These cases are fixed with this update: b966 b973

20 Feb 2021.

Limit the value of -ci=n to that of -i=n when -xci is set

Testing with random input parameters produced a number of oscillating states which had both parameter -xci as well as a value of -ci=n which exceeded the value of -i=n. To correct this, perltidy will silently reduce the -ci value to the -i value when -xci is also set. This should not change existing formatting because a value of -ci greater than -i would not normally be used in practice.

These cases are fixed with this update: b707 b770 b912 b920 b930 b933 b939 b940 b941 b942 b978 b974 b979 b980 b981

20 Feb 2021, c16c5ee.
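
The silent reduction described in this entry can be pictured with a minimal sketch. The sub name `effective_ci` and the option hash are hypothetical and only illustrate the rule; this is not perltidy's actual option-checking code.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Perl::Tidy: return the -ci value that
# would be used after applying the rule "ci may not exceed i when -xci is set".
sub effective_ci {
    my (%opt) = @_;
    my $ci  = $opt{'continuation-indentation'};
    my $i   = $opt{'indent-columns'};
    my $xci = $opt{'extended-continuation-indentation'};

    # Only clamp when -xci is set and ci is larger than i
    return ( $xci && $ci > $i ) ? $i : $ci;
}

# -i=4 -ci=7 -xci : ci is reduced to 4
my $ci_used = effective_ci(
    'indent-columns'                    => 4,
    'continuation-indentation'          => 7,
    'extended-continuation-indentation' => 1,
);
print "$ci_used\n";
```

Without -xci, or when -ci is already less than or equal to -i, the sketch returns the -ci value unchanged, which matches the statement that existing formatting should not be affected.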

Modify length tolerance for welding to qw lists

Several cases of alternating states were produced in random testing which were caused by line length limits being reached when welding to qw lists. This was fixed by adding a small tolerance to line length tests.

This fixes cases b654 b655 b943 b944 b967 b968 b969 b970.

19 Feb 2021, 0baafc8.

Modify space rule between binary plus or minus and a bareword

The update of 13 Feb 2021, cf414fe, has been modified to be less restrictive. Space between a binary plus or minus and a bareword may now be removed in some cases where no tokenization ambiguity exists. 18 Feb 2021, a8564c8.

Do not apply -xci if it would put tokens beyond the maximum line length

This update fixes cases b899 b935. 17 Feb 2021, b955a7c.

Do not weld to a hash brace

The reason is that it has a very strong bond strength to the next token, so a line break after it may not work. Previously we allowed welding to something like '@{' but even that caused blinking states (cases b751, b779).

This will not change much existing code. This update fixes cases b751 b779.

16 Feb 2021, eb2f4e7.

Avoid line breaks after token type 'G'

Random testing with very short maximum line lengths produced some blinking states which were traced to the tokenizer alternately parsing an unknown bareword as type 'w' or type 'G', depending on whether or not an opening block brace immediately followed on the same line. To fix this, a rule was added which prevents a line break between a type 'G' token and an opening code block brace.

This update fixes these cases: b900 b902 b928 b929

15 Feb 2021, f005a95.

Restrict breaking at old uncontained commas

Random testing with very short maximum line lengths produced some blinking states which were traced to duplicating old comma breakpoints which were not really good breakpoints. A patch was made to be more selective.

These cases are fixed with this update: b610 b757 b931

15 Feb 2021, 98b41a0.

Modify line length test for the -vtc=2 option

The line length test which was added Feb 13 2021 turns out to be more restrictive than necessary. A modification was made to only apply it if a new one-line block would be formed. This prevents it from needlessly changing existing formatting.

The following cases were re-activated after this update: b654 b655 b656 b862

15 Feb 2021, 4673fdd.

Use increased line length tolerance if ci exceeds i

In testing perltidy with random input parameters, some blinking states occurred when the value of -ci was significantly larger than the value of -i. (In actual practice, -ci is not normally set greater than -i). This update adds a tolerance to line length tests which avoids this problem. This fixes the following cases

b775 b776 b826 b908 b910 b911 b923 b925 b926 b927

14 Feb 2021, 8451f2f.

Keep space between binary plus or minus and a bareword

This update makes a space between a binary + or - and a bareword an essential whitespace. Otherwise, they may be converted into unary + or - on the next pass, which can lead to blinking states. Fixes cases b660 b670 b780 b781 b787 b788 b790.

13 Feb 2021, cf414fe.

Prevent breaks after unary plus and minus

Some alternating states were produced when extremely short maximum line lengths forced a break after a unary plus or minus. Fixes cases b670 b790.

13 Feb 2021, cf414fe.

Add line length test for the -vtc=2 option

Random testing produced a number of cases of blinking states which were caused when the -vtc=2 flag caused the vertical aligner to combine lines which exceeded the allowable line length. These long lines were then getting reduced in size on every other iteration. A line length test was added in the vertical aligner to prevent this. This fixes these cases:

b654 b655 b656 b657 b761 b762 b763 b764 b765 b766 b767 b768 b769 b862 b904 b905 b906 b907 b913 b914 b915 b916 b917 b918 b919

13 Feb 2021, f79a4f1.

Define left side bond strengths for unary plus and minus

Random testing produced a blinking state which was traced to the unary plus not having a defined strength in the line break algorithm. This was fixed by setting it to be the same as the left strength of a plus. This fixes case b511. 12 Feb 2021, 58a7895.

Fix problem with breaking at an = sign

Random testing produced some blinking cases which were related to detecting an old good breakpoint at an equals. If the user requested that a break be done before an equals, and the input script had a break after an equals, then that break should not have been marked as a good existing break point before a keyword. This update fixes cases b434 b903.

11 Feb 2021, f9a8543.

Fix conflict of -kbl=0 and essential space after =cut

Random testing produced a case where a blank line after an =cut was alternately being deleted and added due to a conflict with the flag setting -keep-old-blank-lines=0. This was resolved by giving priority to the essential blank line after the =cut line.

This fixes case b860. 11 Feb 2021, 8c13609.

Do not break one-line block at here target

A blinking state produced by random testing was traced to a line of coding which unnecessarily prevented one-line blocks from being formed when a here-target was encountered. This has been fixed.

For example, the code block in the following contains a here target and was being broken into two lines:

    unless ($INC{$file}) {
        die <<"END_DIE" }

These will now be output with the blocks intact, like this

    unless ($INC{$file}) { die <<"END_DIE" }

This fixes case b523. 11 Feb 2021, 6d5bb74.

Skip processing -kgb* flags in lists or if -maximum-consecutive-blank-lines=0

Random testing produced an alternating state which was caused by -kgb flags being active on keywords which were in a list rather than a code block. A check was added to prevent this. Also, the -kgb* flags have no effect if no blank lines can be output, so a check was added for this situation. This fixes case b760.

10 Feb 2021, 177fc3a.

Modify tolerance in testing for welds

Random testing with unusual parameters produced some blinking weld states which were fixed by modifying a tolerance used in a line length test. The following cases were fixed with this update:

b746 b748 b749 b750 b752 b753 b754 b755 b756 b758 b759 b771 b772 b773 b774 b782 b783 b784 b785 b786

9 Feb 2021, a4609ac.

Modified rule for breaking lines at old commas

Random testing produced some blinking cases resulting from the treatment of old line breaks at commas not contained within containers. The following cases were fixed with this update:

b064 b065 b068 b210 b747

This change has no effect on scripts with normal parameter values. 9 Feb 2021, 5c23661.

Restrict references to old line breaks

A number of cases of blinking states were traced to code which biased the breaking of long lines to existing breaks. This was fixed by restricting this coding to just use old comma line break points.

The following cases were fixed with this update:

b193 b194 b195 b197 b198 b199 b216 b217 b218 b219 b220 b221 b244 b245 b246 b247 b249 b251 b252 b253 b254 b256 b257 b258 b259 b260 b261 b262 b263 b264 b265 b266 b268 b269 b270 b271 b272 b274 b275 b278 b280 b281 b283 b285 b288 b291 b295 b296 b297 b299 b302 b304 b305 b307 b310 b311 b312 b313 b314 b315 b316 b317 b318 b319 b320 b321 b322 b323 b324 b325 b326 b327 b329 b330 b331 b332 b333 b334 b335 b336 b337 b338 b339 b340 b341 b342 b343 b344 b345 b346 b347 b348 b349

8 Feb 2021, 66be455.

Fix rare problem involving interaction of -olbn=n and -wn flags

Random testing revealed a rare alternating state which could occur when both flags --one-line-block-nesting=n and --weld-nested-containers are set, and the maximum line length is set very low. The problem was fixed by ignoring the first flag at welded tokens. This should not have any effect on scripts with realistic parameter values.

The following case was fixed with this update: b690.

6 Feb 2021, 3e96930.

add rule to avoid welding at some barewords

A rule was added to prevent certain rare blinking states involving welding. The rule is that if an opening container is immediately followed by a bareword which is unknown, a weld will be avoided.

The following cases were fixed with this update: b611 b626.

6 Feb 2021, 6b1f44a

further simplify -bbxi=n implementation

This update adds a new variable which indicates if a container is permanently broken due to a side comment or blank line. This helps reduce the number of cases where the -bbxi=n flag cannot be applied. Another change was to always apply the -bbx=n flag, even if the -bbxi=n flag cannot be applied. These two flags now operate almost exactly as in previous versions but without the blinking problem. The only difference is that now the -bbxi=n flag with n>0 will revert to n=0 for some short containers which might not be broken open.

reset -bbxi=2 to -bbxi=0 if -lp is set to avoid blinking states

The options of the form bbxi=2, such as break-before-paren-and-indent=2, have been found to cause blinking states if the -lp flag is set. Both of these options are fairly rare. To correct this the -bbxi=2 flag is now reset to -bbxi=0 if the -lp flag is set. Note that -bbxi=2 and -bbxi=0 give the same formatting result with the common setting -ci=4 and -i=4.

The following cases were fixed with this update:

b396 b397 b398 b429 b435 b457 b502 b503 b504 b505 b538 b540 b542 b617 b618 b619 b620 b621

3 Feb 2021, 67ab0ef.

rewrite sub break_before_list_opening_containers

sub break_before_list_opening_containers was rewritten to reduce the chance of producing alternating states.

The following cases were fixed with this update:

b030 b032 b455 b456 b458 b459 b460 b461 b462 b536 b622 b651 b652 b653 b708 b709 b710 b713 b714 b719 b723 b724 b725 b726 b727 b729 b731 b733 b735 b736 b737 b738 b739 b740 b743 b744

3 Feb 2021, 5083ab9.

redefine list to have at least one internal comma

Random testing produced some blinking states which could be fixed by changing the definition of a list, for formatting purposes, to have one or more interior commas rather than simply one or more commas. The difference is that something with a single terminal comma, like '( $x, )', is no longer classified as a list. This makes no difference except when perltidy is stress tested with unusual parameters.

The following cases were fixed with this update:

b116 b119 b122 b174 b179 b187 b361 b369 b370 b372 b376 b427 b428 b448 b449 b450 b451 b452 b453 b469 b473 b474 b475 b476 b477 b479 b480 b481 b482 b497 b552 b553 b554 b558 b559 b634 b637 b642 b644 b645 b647 b650 b661 b662 b663 b664 b666 b677 b685 b688 b698 b699 b700 b702 b703 b704 b711 b712 b715 b716 b717 b718 b721 b730 b734 b741 b742

1 Feb 2021, 35078f7.
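
The difference between "one or more commas" and "one or more interior commas" can be shown with a small sketch. The sub `has_interior_comma` is hypothetical and works on a flat array of token strings from inside a single container; nested containers are ignored for simplicity, so this is only an illustration of the definition, not the classifier used in perltidy.

```perl
use strict;
use warnings;

# Hypothetical classifier working on a flat array of token strings taken
# from inside one container (nested containers are ignored for simplicity).
sub has_interior_comma {
    my @tokens = @_;

    # A single terminal comma, as in '( $x, )', does not count
    pop @tokens if @tokens && $tokens[-1] eq ',';
    return scalar( grep { $_ eq ',' } @tokens ) ? 1 : 0;
}

my $terminal_only = has_interior_comma( '$x', ',' );          # ( $x, )
my $interior      = has_interior_comma( '$x', ',', '$y' );    # ( $x, $y )
print "terminal only: $terminal_only, interior: $interior\n";
```

Under this definition '( $x, )' is not treated as a list, while '( $x, $y )' and '( $x, $y, )' still are.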

rewrite and combine coding for -bbx=n and -bbxi=n

Random testing produced a large number of blinking states involving parameters such as --break-before-parens=n and --break-before-parens-and-indent=n and similar pairs. The problem was traced to the fact that the former parameter was implemented late in the pipeline whereas the latter parameter was implemented early in the pipeline. Normally there was no problem, but in some extreme cases, often involving very short maximum line lengths, this could produce alternating output states. The problem was resolved by combining the implementation of both flags in a single new sub to avoid any inconsistencies. The following cases were fixed with this update:

b018 b066 b071 b079 b090 b105 b146 b149 b158 b160 b161 b163 b164 b166 b167 b169 b170 b171 b172 b178 b185 b190 b192 b203 b206 b222 b223 b224 b237 b359 b362 b377 b379 b381 b382 b389 b395 b407 b408 b409 b410 b411 b412 b414 b417 b418 b419 b421 b433 b438 b443 b444 b478 b483 b484 b486 b490 b492 b493 b494 b496 b506 b507 b517 b521 b522 b524 b525 b535 b537 b539 b541 b543 b546 b550 b551 b555 b564 b566 b567 b569 b570 b572 b573 b575 b576 b577 b578 b579 b580 b582 b586 b587 b588 b589 b590 b591 b592 b593 b603 b607 b609 b613 b615 b616 b623 b624 b630 b635 b636 b638 b639 b640 b641 b643 b646 b648 b649 b658 b659 b665 b667 b668 b669 b671 b672 b673 b674 b675 b676 b678 b679 b680 b681 b682 b683 b684 b686 b687 b689 b691 b692 b693 b694 b695 b696 b697 b701 b705 b706 b720 b722 b728 b732 b745

31 Jan 2021, 10e8bfd.

adjust line length and token count tolerances for -wn

Most remaining edge cases of blinking states involving the -wn parameter have been fixed by adjusting some tolerances in sub weld_nested_containers. The following cases are fixed with this update:

b156 b157 b186 b196 b454 b520 b527 b530 b532 b533 b534 b612 b614 b625 b627

This update has no effect for realistic parameter settings.

30 Jan 2021, d359a60.

fix additional edge blinker cases involving -wn

Some blinking cases produced in random testing were traced to welding in very short lines (length = 20 for example) in which a weld was made to a square bracket containing just a single parameter, so that it had no good internal breaking points. A rule was added to avoid welding to a square bracket not containing any commas. The following cases were fixed with the update:

b002 b003 b005 b006 b007 b009 b010 b014 b015 b017 b020 b111 b112 b113 b124 b126 b128 b151 b153 b439 b606

29 Jan 2021, 33f1f2b.

fix additional edge blinker cases involving -wn

Random testing produced some blinking states which were traced to the precision of a line length test. In sub weld_nested_containers, the test

    $do_not_weld ||= $excess_length_to_K->($Kinner_opening) > 0;

was changed to allow a 1 character margin of error:

    $do_not_weld ||= $excess_length_to_K->($Kinner_opening) >= 0;

The following cases were fixed with this update:

b025 b075 b091 b109 b110 b152 b154 b155 b162 b168 b176 b422 b423 b424 b425 b426 b565

29 Jan 2021, 33f1f2b.
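
The effect of changing '> 0' to '>= 0' shows up only for a line which exactly fills the maximum line length. The following sketch uses a hypothetical sub `excess_length` as a stand-in for `$excess_length_to_K`; the real closure works on token indexes rather than strings.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $excess_length_to_K: number of characters by
# which a line exceeds the maximum line length (negative means spare room).
sub excess_length {
    my ( $line, $maximum_line_length ) = @_;
    return length($line) - $maximum_line_length;
}

# Old test: only refuse the weld when the line is strictly too long
sub do_not_weld_old { return excess_length(@_) > 0 ? 1 : 0 }

# New test: also refuse the weld when the line exactly fills the limit
sub do_not_weld_new { return excess_length(@_) >= 0 ? 1 : 0 }

my $line = 'x' x 20;    # exactly 20 characters
printf "old=%d new=%d\n", do_not_weld_old( $line, 20 ),
  do_not_weld_new( $line, 20 );
```

For a 20 character line and a limit of 20, the old test allows the weld and the new test refuses it; for shorter or longer lines both tests agree.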

fix some edge blinker cases involving -wn

Random testing produced some blinking states which were eliminated by a simplification of the definition of a one_line_weld in sub weld_nested_containers. The following cases were fixed with this update:

b131 b134 b136 b205 b233 b238 b284 b350 b352 b358 b385 b487 b604 b605

29 Jan 2021, 33f1f2b.

fix some edge blinker cases involving -bbxi=n esp. with -boc

The following cases were fixed with this update:

b041 b182 b184 b366 b367 b368 b370 b371 b382 b420 b432 b438 b464 b466 b467 b468 b500 b501 b508 b509 b510 b512 b513 b514 b515 b516 b526 b528 b529 b531 b544 b545 b547 b548 b549 b556 b557 b568 b571 b581 b583 b584 b585 b594 b595 b596 b597 b598 b599 b600 b601 b602 b608 b041 b182 b184 b355 b356 b366 b367 b368 b371 b420 b432 b464 b465 b466 b467 b468 b500 b501 b508 b509 b510 b512 b513 b514 b515 b516 b526 b528 b529 b531 b544 b545 b547 b548 b549 b556 b557 b568 b571 b581 b583 b584

28 Jan 2021.

fix problem with combination -cab=2 and bbhbi=n

Random testing produced a number of cases in which the combination -cab=2 and bbhbi=n and similar flags were in conflict, causing alternating states. This was fixed by not changing ci for containers which can fit entirely on one line, which is what -cab=2 says to do. The following cases were fixed with this update:

b046 b061 b081 b084 b089 b093 b130 b133 b135 b138 b142 b145 b147 b150 b165 b173 b191 b211 b294 b309 b360 b363 b364 b365 b373 b386 b387 b388 b392 b437 b440 b472 b488 b489

27 Jan 2021, 6d710de.

fix problem with -freeze-whitespace

Random testing produced a case in which the --freeze-whitespace flag (which is mainly useful for testing) could cause a blank space which kept increasing. The problem was caused by the "vertical tightness" mechanism. Turning it off when the --freeze-whitespace flag is on fixed the problem. The following cases were fixed with this update:

b037 b038 b043 b059 b060 b067 b072 b215 b225 b267 b273 b276 b279 b282 b289 b292 b300 b303 b354 b374 b375 b383 b384 b402 b403 b404 b405 b436 b441 b445 b446 b471 b485 b498 b499

27 Jan 2021, 6d710de.

Avoid blinking states associated with -bbpi and similar flags

Random testing with extreme parameter values revealed blinking states associated with the -bbpi and related flags. The problem was that continuation indentation was sometimes being added according to the flag but the lists were not actually being broken. After this was fixed the following cases ran correctly:

b024 b035 b036 b040 b042 b047 b049 b050 b051 b052 b053 b054 b057 b062 b063 b073 b074 b076 b080 b083 b085 b086 b087 b088 b102 b103 b104 b106 b107 b108 b115 b118 b121 b123 b125 b127 b132 b137 b139 b140 b141 b143 b144 b148 b159 b175 b177 b180 b181 b188 b189 b200 b201 b202 b204 b207 b212 b213 b214 b226 b227 b228 b229 b230 b232 b239 b240 b241 b243 b248 b250 b255 b277 b286 b287 b290 b293 b298 b301 b306 b308 b328 b351 b353 b357 b378 b380 b390 b391 b393 b394 b399 b400 b401 b406 b413 b415 b416 b430 b431 b442 b447 b463 b470 b491 b495

27 Jan 2021, 96144a3.

Revise coding for the --freeze-whitespace option

Random testing produced some blinking states which were traced to an incorrect implementation of the --freeze-whitespace option (which is mainly useful in stress testing perltidy). A related flag, --add-whitespace is involved. This update corrects these problems. Test cases include b057, b183, b242. 24 Jan 2021, 9956a57.

Fix for issue git #51, closing qw paren not outdented when -ndnl is set

The problem is that a bare closing qw paren was not being outdented if the flag '-nodelete-old-newlines' is set. For example

    # OLD (OK, outdented): perltidy -ci=4 -xci
    {
        modules => [
            qw(
                JSON
            )
        ],
    }


    # OLD (indented) : perltidy -ndnl -ci=4 -xci
    {
        modules => [
            qw(
                JSON
                )
        ],
    }

    # FIXED: perltidy -ndnl -ci=4 -xci
    {
        modules => [
            qw(
                JSON
            )
        ],
    }

The problem happened because the -ndnl flag forces each line to be written immediately, so the next line (which needs to be checked in this case) was not available when the outdent decision had to be made. A patch to work around this was added 24 Jan 2021, 52996fb.

Some issues with the -lp option

Random testing revealed some problems involving the -lp option which are fixed with this update.

The problem is illustrated with the following snippet

    # perltidy -lp
    Alien::FillOutTemplate(
                "$main::libdir/to-$main::desttype/$main::filetype/spec",
                "$workdir/$fields{NAME}-$fields{VERSION}-$fields{RELEASE}.spec",
                %fields
    );

which alternately formats to this form

    # perltidy -lp
    Alien::FillOutTemplate(
                "$main::libdir/to-$main::desttype/$main::filetype/spec",
                "$workdir/$fields{NAME}-$fields{VERSION}-$fields{RELEASE}.spec",
                %fields );

when formatted with the single parameter -lp. A number of similar examples were found in testing. The problem was traced to the treatment of the space which perltidy tentatively adds wherever there is a newline, just in case the formatted output has different line breaks. The problem was that the indentation level of these spaces was being set as the level of the next token rather than the previous token. Normally the indentation level of a space has no effect, but the -lp option does use it and this caused the problem. This was fixed 23 Jan 2021, edc7878.

added rule for -wn, do not weld to a hash brace

In random testing, the following two alternating states

    # State 1
    {
        if ( defined
        ($symbol_table{$direccion}) )
    }
    
    # State 2
    {
        if (defined (
                $symbol_table{
                    $direccion}
            )
        )
    }

were occurring with the following particular parameter set

    --weld-nested-containers
    --maximum-line-length=40
    --continuation-indentation=7
    --paren-tightness=2
    --extended-continuation-indentation

The problem was traced to welding to the opening hash brace. A rule was added to prevent this, and testing with a large body of code showed that it did not significantly change existing formatting. With this change, the above snippet formats in the stable state

    {
        if (defined(
            $symbol_table{$direccion}
        ))
    }

20 Jan 2021, 4021436.

Do not let -kgb option delete essential blank after =cut

A blinking state was found in random testing for the following snippet

    =head1 TODO
    
    handle UNIMARC encodings
    
    =cut
    
    use strict;
    use warnings;

when run with the following parameters

    --keyword-group-blanks-size='2.8'
    --keyword-group-blanks-before=0

The space after the =cut was alternately being added as an essential blank which is required by pod utilities, and then deleted by these parameters. This was fixed 17 Jan 2021, b9a5f5d.

Turn off -boc flag if -iob is set

In random testing, the cause of a blinker was traced to both flags --ignore-old-breakpoints and --break-at-old-comma-breakpoints being set. There is a warning message but the -boc flag was not actually being turned off. This was fixed 17 Jan 2021, b9a5f5d.
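
The kind of check involved can be sketched as follows. The sub name, the hash layout, and the warning text are all hypothetical; the point is only that the warning must be accompanied by actually switching the flag off.

```perl
use strict;
use warnings;

# Hypothetical option check: $rOpts is a hash reference keyed by long
# option names. Returns a list of warning messages.
sub resolve_old_breakpoint_flags {
    my ($rOpts) = @_;
    my @warnings;
    if (   $rOpts->{'ignore-old-breakpoints'}
        && $rOpts->{'break-at-old-comma-breakpoints'} )
    {
        # Hypothetical message text
        push @warnings, "Conflict: -iob and -boc both set; turning off -boc\n";

        # The step that was missing: actually switch the flag off
        $rOpts->{'break-at-old-comma-breakpoints'} = 0;
    }
    return @warnings;
}

my %opts = (
    'ignore-old-breakpoints'         => 1,
    'break-at-old-comma-breakpoints' => 1,
);
print resolve_old_breakpoint_flags( \%opts );
print "boc is now $opts{'break-at-old-comma-breakpoints'}\n";
```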

Modified spacing rule for token type Y

A space following a token type 'Y' (filehandle) should not be removed. Otherwise it might be converted into type 'Z' (possible filehandle). If that were to happen, the space could not be added back automatically in later formatting, so the user would have to do it by hand. This fix prevents this from happening. 17 Jan 2021, bef9a83.

In random testing a blinker was reduced to the following snippet

    {
             print FILE
              GD::Barcode
              ->new();
    }

which switched to the following state on each iteration

    {
             print FILE GD::Barcode
              ->new();
    }

with the following parameters

    --maximum-line-length=20
    --indent-columns=9
    --continuation-indentation=1

The problem was that the token 'FILE' was either parsed as type 'Y' or 'Z' depending on the existence of a subsequent space. These have different line break rules, causing a blinker. The problem was fixed by modifying the tokenizer to consider a newline to be a space. Updated 16 Jan 2021, d40cca9.

Turn off -bli if -bar is set

A conflict arises if both -bli and -bar are set. In this case a warning message is given and -bli is turned off. Updated 15 Jan 2021, ef69531.

A blinking state was discovered in testing between the following two states

    my$table=
         [[1,2,3],[2,4,6],[3,6,9],
         ];

    my$table=
        [[1,2,3],[2,4,6],[3,6,9],];

with these parameters

    --continuation-indentation=5
    --maximum-line-length=31
    --break-before-square-bracket-and-indent=2
    --break-before-square-bracket=1
    --noadd-whitespace

The problem was found to be caused by the -bbsb parameters causing the indentation level of the first square bracket to change depending upon whether the term was broken on input or not. Two fixes would correct this. One is to turn off the option if the -ci=n value exceeds the -i=n value. The other is to require a broken container to span at least three lines before turning this option on. The latter change was made in sub 'adjust_container_indentation'. With this change the snippet remains stable at the second state above. Fixed 14 Jan 2021, 5c793a1.

In random testing with convergence a 'blinker' (oscillating states) was found for the following script

    sub _prompt {

          print $_[0];
          return (
                readline
                       (*{$_[1]})!~
                       /^q/i)
                 ; # Return false if user types 'q'

    }

with the following specific parameters:

    --maximum-line-length=32
    --indent-columns=6
    --continuation-indentation=7
    --weld-nested-containers
    --extended-continuation-indentation
    --noadd-whitespace

The other state was

    sub _prompt {

          print $_[0];
          return (
                readline(
                      *{
                            $_
                                   [
                                   1
                                   ]
                      }
                )!~/^q/i
                 )
                 ; # Return false if user types 'q'

    }

All of the listed parameters are required to cause this, but the main cause is the very large continuation indentation and short line length. Welding was being turned on and off in this case. Normally welding is not done if all containers are on a single line, but an exception was made to detect a situation like this and keep the welded string together. Updated 13 Jan 2021, 5c793a1.
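
Many entries in this log refer to a 'blinker', so a minimal sketch of how one can be detected may be useful. The driver below is hypothetical: it repeatedly applies a formatter, given as a code reference which maps text to text, and reports whether the output converges or returns to an earlier state. It is not the test harness used for the random testing described here.

```perl
use strict;
use warnings;

# Hypothetical driver: repeatedly apply a formatter (a code reference that
# maps text to text) and report whether the output converges or oscillates.
sub classify_iterations {
    my ( $formatter, $text, $max_iterations ) = @_;
    my %seen;
    for my $n ( 1 .. $max_iterations ) {
        $seen{$text} = 1;
        my $next = $formatter->($text);

        # Output equals input: a stable state has been reached
        return ( 'converged', $n ) if $next eq $text;

        # Output equals an earlier state: the formatter is cycling
        return ( 'blinker', $n ) if $seen{$next};
        $text = $next;
    }
    return ( 'undecided', $max_iterations );
}

# A toy formatter which flips between two states, like a blinker
my %flip = ( "State 1\n" => "State 2\n", "State 2\n" => "State 1\n" );
my ( $status, $count ) =
  classify_iterations( sub { $flip{ $_[0] } }, "State 1\n", 10 );
print "$status after $count iterations\n";
```

In practice the code reference could wrap a call to `Perl::Tidy::perltidy` with string references for `source` and `destination` and the parameter set under test in `argv`.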

Fixed incorrect guess of division vs pattern

A syntax error was produced in random testing when perltidy was fed the following line:

    sub _DR () { pi2 /360 } sub _RD () { 360 /pi2 }

The bareword 'pi2' was not recognized and the text between the two slashes was taken as a possible pattern argument in a parenless call to pi2. Two changes were made to fix this. Perltidy looks for 'pi' but not 'pi2', so the first fix was to expand its table to include all variations of 'pi' in Trig.pm. Second, the fact that the first slash was followed by a number should have tipped the guess to favor division, so this was fixed. As it was, a backup spacing rule was used, which favored a pattern.

The formatted result is now

    sub _DR () { pi2 / 360 }
    sub _RD () { 360 / pi2 }

This update was made 13 Jan 2021, a50ecf8.
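
The two fixes can be illustrated with a rough sketch. The sub `guess_slash` and its lookup table are hypothetical and far simpler than the logic in perltidy's tokenizer; the constant names are those of the ':pi' export tag of Math::Trig.

```perl
use strict;
use warnings;

# Constants from Math::Trig's ':pi' tag; the table itself is a
# hypothetical stand-in for the one in perltidy's tokenizer.
my %is_pi_constant = map { $_ => 1 } qw( pi pi2 pi4 pip2 pip4 );

# Hypothetical guesser: decide how to treat a '/' which follows a bareword.
# $after is the text immediately following the slash.
sub guess_slash {
    my ( $bareword, $after ) = @_;

    # First fix: a known constant cannot take a pattern argument
    return 'division' if $is_pi_constant{$bareword};

    # Second fix: a number right after the slash favors division
    return 'division' if $after =~ /^\s*\d/;

    # Otherwise fall through to other rules (not modeled here)
    return 'unknown';
}

for my $case ( [ 'pi2', '360 }' ], [ 'myfun', '360 }' ], [ 'myfun', 'abc/' ] ) {
    my ( $word, $after ) = @{$case};
    print "$word /$after => ", guess_slash( $word, $after ), "\n";
}
```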

Correct formula for estimating line length with -wn option

A formula used to estimate maximum line length when the -wn option is set was missing a term for continuation indentation. No actual changes in formatting have been seen. This update was made 12 Jan 2021.

The following blinker was found in random testing. The following statement (with @j starting at level 0)

    @j = ( $x, $y, $z );

run with the following parameters

    --indent-columns=5
    --continuation-indentation=7
    --maximum-line-length=20
    --break-before-paren-and-indent=2
    --break-before-paren=2
    --maximum-fields-per-table=4

caused an oscillation between two states. An unusual feature which contributed to the problem is the very large ci value. This is fixed in a patch made 12 Jan 2021, 9a97dba.

Issues fixed after release 20201207

Improve indentation of multiline qw quotes when -xci flag is set

The indentation of multiline qw quotes runs into problems when there is nesting, as in the following.

    # OLD: perltidy -xci -ci=4
    for my $feep (
        qw{
        pwage      pwchange   pwclass    pwcomment
        pwexpire   pwgecos    pwpasswd   pwquota
        }
        )

The problem is that multiline qw quotes do not get the same indentation treatment as lists.

This update fixes this in the following circumstances:

  - the leading qw( and trailing ) are on separate lines
  - the closing token is one of ) } ] >
  - the -xci flag is set

The above example becomes

    # NEW: perltidy -xci -ci=4
    for my $feep (
        qw{
            pwage      pwchange   pwclass    pwcomment
            pwexpire   pwgecos    pwpasswd   pwquota
        }
        )

The reason that the -xci flag is required is to minimize unexpected changes to existing scripts. The extra indentation is removed if the -wn flag is also given, so both old and new versions with -wn give

    # OLD and NEW: perltidy -wn -xci -ci=4
    for my $feep ( qw{
        pwage      pwchange   pwclass    pwcomment
        pwexpire   pwgecos    pwpasswd   pwquota
    } )

This update was added 8 Jan 2021, 474cfa8.

Improve alignment of leading equals in rare situation

A rare case in which a vertical alignment opportunity of leading equals was missed has been fixed. This involved lines with additional varying alignment tokens, such as 'unless' and a second '=' in lines 1-3 below. In this example lines 4 and 5 were not 'looking' backwards to align their leading equals.

    # OLD:
    $them = 'localhost' unless ( $them = shift );
    $cmd  = '!print'    unless ( $cmd  = shift );
    $port = 2345        unless ( $port = shift );
    $saddr = 'S n a4 x8';
    $SIG{'INT'} = 'dokill';

    # NEW
    $them       = 'localhost' unless ( $them = shift );
    $cmd        = '!print'    unless ( $cmd  = shift );
    $port       = 2345        unless ( $port = shift );
    $saddr      = 'S n a4 x8';
    $SIG{'INT'} = 'dokill';

Fixed 5 Jan 2021, 9244678.

Moved previous patch to a better location

The previous patch was moved to a location where it only applies if there is a side comment on the line with a closing token. This minimizes changes to other side comment locations.

Further improvement in rules for forgetting last side comment location

The code for forgetting the last side comment location was rewritten to improve formatting in some edge cases. The update also fixes a very rare problem discovered during testing and illustrated with the following snippet. The problem occurs for the particular combination of parameters -sct -act=2 and when a closing paren has a side comment:

    OLD: perltidy -sct -act=2
    foreach $line (
        [0, 1, 2], [3, 4, 5], [6, 7, 8],    # rows
        [0, 3, 6], [1, 4, 7], [2, 5, 8],    # columns
        [0, 4, 8], [2, 4, 6])                                     # diagonals

    NEW: perltidy -sct -act=2
    foreach $line (
        [0, 1, 2], [3, 4, 5], [6, 7, 8],    # rows
        [0, 3, 6], [1, 4, 7], [2, 5, 8],    # columns
        [0, 4, 8], [2, 4, 6])    # diagonals

In the old version the last side comment was aligned before the closing paren was attached to the previous line, causing the final side comment to be far to the right. A patch in the new version just places it at the default location. This is the best that can be done for now, but is preferable to the old formatting. 3 Jan 2021, e57d8db.

Improve rule for forgetting last side comment location

The code which aligns side comments remembers the most recent side comment and in some cases tries to start aligning at that column for later side comments. Sometimes the old side comment column was being remembered too long, causing occasional poor formatting and causing a noticeable and unexpected drift of side comment locations to the right. The rule for forgetting the previous side comment column has been modified to reduce this problem. The new rule is essentially to forget the previous side comment location at a new side comment with a different indentation level or after a significant number of lines without side comments (about 12). The previous implementation only forgot the location at changes in indentation level across code blocks. Below is an example where the old method gets into trouble and the new method is ok:

        # OLD:
        foreach my $r (@$array) {
            $Dat{Data}{ uc $r->[0] } = join( ";", @$r );    # store all info
            my $name = $Dat{GivenName}{ uc $r->[0] } || $r->[1];

            # pass array as ad-hoc string, mark missing values
            $Dat{Data}{ uc $r->[0] } = join(
                ";",
                (
                    uc $r->[0], uc $name,                   # symbol, name
                    $r->[2],    $r->[3], $r->[4],           # price, date, time
                    $r->[5],    $r->[6],                    # change, %change
                    $r->[7],    "-", "-", "-",    # vol, avg vol, bid,ask
                    $r->[8],               $r->[9],     # previous, open
                    "$r->[10] - $r->[11]", $r->[12],    # day range,year range,
                    "-",                   "-", "-", "-", "-"
                )
            );                                          # eps,p/e,div,yld,cap
        }

The second side comment is at a deeper indentation level but was not being forgotten, causing line length limits to interfere with later alignment. The new rule gives a better result:

        # NEW:
        foreach my $r (@$array) {
            $Dat{Data}{ uc $r->[0] } = join( ";", @$r );    # store all info
            my $name = $Dat{GivenName}{ uc $r->[0] } || $r->[1];

            # pass array as ad-hoc string, mark missing values
            $Dat{Data}{ uc $r->[0] } = join(
                ";",
                (
                    uc $r->[0], uc $name,               # symbol, name
                    $r->[2],    $r->[3], $r->[4],       # price, date, time
                    $r->[5],    $r->[6],                # change, %change
                    $r->[7],    "-", "-", "-",          # vol, avg vol, bid,ask
                    $r->[8],               $r->[9],     # previous, open
                    "$r->[10] - $r->[11]", $r->[12],    # day range,year range,
                    "-",                   "-", "-", "-", "-"
                )
            );    # eps,p/e,div,yld,cap
        }

The following example shows an unexpected alignment in a cascade of trailing comments which are aligned but slowly separating from their closing containers:

    # OLD:
    {
        $a = [
            Cascade    => $menu_cb,
            -menuitems => [
                [ Checkbutton => 'Oil checked', -variable => \$OIL ],
                [
                    Button   => 'See current values',
                    -command => [
                        \&see_vars, $TOP,

                    ],    # end see_vars
                ],        # end button
            ],            # end checkbutton menuitems
        ];                # end checkbuttons cascade
    }

This was caused by forgetting side comments only across code block changes. The new result is more reasonable:

    # NEW:
    {
        $a = [
            Cascade    => $menu_cb,
            -menuitems => [
                [ Checkbutton => 'Oil checked', -variable => \$OIL ],
                [
                    Button   => 'See current values',
                    -command => [
                        \&see_vars, $TOP,

                    ],    # end see_vars
                ],    # end button
            ],    # end checkbutton menuitems
        ];    # end checkbuttons cascade
    }

This change will cause occasional differences in side comment locations from previous versions but overall it gives fewer unexpected results so it is a worthwhile change. 29-Dec-2020, 76993f4.
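The new forgetting rule can be summarized in a few lines of code. This is only a sketch of the rule as described above: the function name, the hash layout, and the exact threshold of 12 are assumptions for illustration and are not taken from perltidy's source.

```perl
use strict;
use warnings;

# Hypothetical threshold: "about 12" lines without side comments.
use constant MAX_GAP_LINES => 12;

# Decide whether the remembered side comment column should be forgotten
# when a new side comment is seen at indentation $level on line $line_number.
# $last is a hash ref describing the previous side comment, or undef.
sub forget_last_side_comment {
    my ( $last, $level, $line_number ) = @_;

    # nothing is remembered yet
    return 1 if !defined $last;

    # the indentation level changed
    return 1 if $level != $last->{level};

    # too many lines have passed without a side comment
    return 1 if $line_number - $last->{line} > MAX_GAP_LINES;

    return 0;
}

my $last = { level => 2, line => 10, column => 40 };
print forget_last_side_comment( $last, 2, 15 ), "\n";    # same level, close by: 0
print forget_last_side_comment( $last, 3, 11 ), "\n";    # deeper level: 1
print forget_last_side_comment( $last, 2, 30 ), "\n";    # long gap: 1
```

In the cascade example above, each closing bracket is at a different indentation level, so under this rule every side comment starts from the default location instead of inheriting the previous column.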

Fixed very minor inconsistency in redefining lists after prune step

In rare cases it is necessary to update the type of lists, and this influences vertical alignment. This update fixes a minor inconsistency in doing this. In some rare cases with complex list elements vertical alignment can be improved. 27 Dec, 2020, 751faec.

            # OLD
            return join( '',
                $pre,   '<IMG ',   $iconsizes{$alt} || '',
                $align, 'BORDER=', $nav_border,
                ' ALT="', $alt,        "\"\n",
                ' SRC="', $ICONSERVER, "/$icon",
                '">' );

            # NEW
            return join( '',
                $pre,     '<IMG ',     $iconsizes{$alt} || '',
                $align,   'BORDER=',   $nav_border,
                ' ALT="', $alt,        "\"\n",
                ' SRC="', $ICONSERVER, "/$icon",
                '">' );

Improved vertical alignment of some edge cases

The existing rules for aligning two lines with very different lengths were rejecting some good alignments, such as the first line of numbers in the example below:

    # OLD:
    @gg_3 = (
        [
            0.0, 1.360755E-2, 9.569446E-4, 9.569446E-4,
            1.043498E-3, 1.043498E-3
        ],
        [
            9.569446E-4, 9.569446E-4, 0.0, 7.065964E-5,
            1.422811E-4, 1.422811E-4
        ],
        ...
    );

    # NEW:
    @gg_3 = (
        [
            0.0,         1.360755E-2, 9.569446E-4, 9.569446E-4,
            1.043498E-3, 1.043498E-3
        ],
        [
            9.569446E-4, 9.569446E-4, 0.0, 7.065964E-5,
            1.422811E-4, 1.422811E-4
        ],
        ...
    );

The rule in sub 'two_line_pad' was updated to allow alignment of any lists if the patterns match exactly (all numbers in this case). Updated 27-Dec-2020, 035d2b7.

Avoid -lp style formatting of lists containing multiline qw quotes

The -lp formatting style often does not work well when lists contain multiline qw quotes. This update avoids this problem by not formatting such lists with the -lp style. For example,

    # OLD, perltidy -gnu
    @EXPORT = (
        qw(
          i Re Im rho theta arg
          sqrt log ln
          log10 logn cbrt root
          cplx cplxe
          ),
        @trig,
              );


    # NEW, perltidy -gnu
    @EXPORT = (
        qw(
          i Re Im rho theta arg
          sqrt log ln
          log10 logn cbrt root
          cplx cplxe
        ),
        @trig,
    );

27-Dec-2020, 948c9bd.

improve formatting of multiline qw

This update adds a sequence numbering system for multiline qw quotes. In the perltidy tokenizer normal container pair types, like { }, (), [], are given unique serial numbers which are used as keys to data structures. qw quoted lists do not get serial numbers from the tokenizer, so this update creates a separate serial number scheme for them to correct this problem. One formatting problem that this solves is that of preventing the closing token of a multiline quote from being outdented more than the opening token. This is a general formatting rule which should be followed. Without a sequence number, the closing qw token could not look up its corresponding opening indentation so it had to resort to a default, breaking the rule, as in the following:

    # OLD, perltidy -wn
    # qw line
    if ( $pos == 0 ) {
        @return = grep( /^$word/,
            sort qw(
              ! a b d h i m o q r u autobundle clean
              make test install force reload look
        ) ); #<-- outdented more than 'sort'
    }

    # Here is the same with a list instead of a qw; note how the
    # closing sort paren does not outdent more than the 'sort' line.
    # This is the desired result for qw.
    # perltidy -wn
    if ( $pos == 0 ) {
        @return = grep( /^$word/,
            sort (

                '!',          'a', 'b', 'd', 'h', 'i', 'm', 'o', 'q', 'r', 'u',
                'autobundle', 'clean',
                'make',       'test', 'install', 'force', 'reload', 'look'
            ) );  #<-- not outdented more than 'sort'
    }

    # NEW (perltidy -wn)
    if ( $pos == 0 ) {
        @return = grep( /^$word/,
            sort qw(
              ! a b d h i m o q r u autobundle clean
              make test install force reload look
            ) ); #<-- not outdented more than sort
    }

Here is another example

    # OLD:
    $_->meta->make_immutable(
        inline_constructor => 0,
        constructor_name   => "_new",
        inline_accessors   => 0,
      )
      for qw(
      Class::XYZ::Package
      Class::XYZ::Module
      Class::XYZ::Class

      Class::XYZ::Overload
    );  #<-- outdented more than the line with 'for qw('

    # NEW:
    $_->meta->make_immutable(
        inline_constructor => 0,
        constructor_name   => "_new",
        inline_accessors   => 0,
      )
      for qw(
      Class::XYZ::Package
      Class::XYZ::Module
      Class::XYZ::Class

      Class::XYZ::Overload
      ); #<-- outdented same as the line with 'for qw('

26 Dec 2020, cdbf0e4.

improve list marking method

In the process of making vertical alignments, lines which are simple lists of items are treated differently from other lines. The old method for finding and marking these lines had a few problems which are corrected with this update. The main problem was that the old method ran into trouble when there were side comments. For example, the old method was not marking the following list and as a result the two columns of values were not aligned:

    # OLD
    return (
        $startpos, $ldelpos - $startpos,         # PREFIX
        $ldelpos,  1,                            # OPENING BRACKET
        $ldelpos + 1, $endpos - $ldelpos - 2,    # CONTENTS
        $endpos - 1, 1,                          # CLOSING BRACKET
        $endpos, length($$textref) - $endpos,    # REMAINDER
    );

    # NEW
    return (
        $startpos,    $ldelpos - $startpos,           # PREFIX
        $ldelpos,     1,                              # OPENING BRACKET
        $ldelpos + 1, $endpos - $ldelpos - 2,         # CONTENTS
        $endpos - 1,  1,                              # CLOSING BRACKET
        $endpos,      length($$textref) - $endpos,    # REMAINDER
    );

Another problem was that occasionally unwanted alignments were made between lines which were not really lists because the lines were incorrectly marked. For example (note padding after first comma)

    # OLD: (undesirable alignment)
    my ( $isig2, $chisq ) = ( 1 / ( $sig * $sig ), 0 );
    my ( $ym,    $al, $cov, $bet, $olda, $ochisq, $di, $pivt, $info ) =
      map { null } ( 0 .. 8 );

    # NEW: (no alignment)
    my ( $isig2, $chisq ) = ( 1 / ( $sig * $sig ), 0 );
    my ( $ym, $al, $cov, $bet, $olda, $ochisq, $di, $pivt, $info ) =
      map { null } ( 0 .. 8 );

This update was made 22 Dec 2020, 36d4c35.

Fix git #51, closing quote pattern delimiters not following -cti flag settings

Closing pattern delimiter tokens of qw quotes were not following the -cti flag settings for containers in all cases, as would be expected, in particular when followed by a comma. For example, the closing qw paren below was indented with continuation indentation but would not have that extra indentation if it followed the default -cpi setting for a paren:

    # OLD:
    @EXPORT = (
        qw(
          i Re Im rho theta arg
          sqrt log ln
          log10 logn cbrt root
          cplx cplxe
          ),
        @trig
    );

    # NEW
    @EXPORT = (
        qw(
            i Re Im rho theta arg
            sqrt log ln
            log10 logn cbrt root
            cplx cplxe
        ),
        @trig
    );

This update makes closing qw quote terminators follow the settings for their corresponding container tokens as closely as possible. For a closing '>' the setting for a closing paren will now be followed. Other closing qw terminators will remain indented, to minimize changes to existing formatting. For example ('>' is outdented):

    @EXPORT = (
        qw<
          i Re Im rho theta arg
          sqrt log ln
          log10 logn cbrt root
          cplx cplxe
        >,
        @trig
    );

but (';' remains indented):

    @EXPORT = (
        qw;
          i Re Im rho theta arg
          sqrt log ln
          log10 logn cbrt root
          cplx cplxe
          ;,
        @trig
    );

This update was added 18 Dec 2020 and modified 24 Dec 2020, 538688f.

Update manual pages regarding issue git #50

Additional wording was added to the man pages regarding situations in which perltidy does not change whitespace. This update was added 17 Dec 2020.

Rewrote sub check_match

Moved inner part of sub check_match into sub match_line_pairs in order to make info available earlier. This gave some minor alignment improvements. This was done 16 Dec 2020, 7ba4f3b.

    # OLD:
    @tests = (
        @common,     '$_',
        '"\$_"',     '@_',
        '"\@_"',     '??N',
        '"??N"',     chr 256,
        '"\x{100}"', chr 65536,
        '"\x{10000}"', ord 'N' == 78 ? ( chr 11, '"\013"' ) : ()
    );

    # NEW:
    @tests = (
        @common,       '$_',
        '"\$_"',       '@_',
        '"\@_"',       '??N',
        '"??N"',       chr 256,
        '"\x{100}"',   chr 65536,
        '"\x{10000}"', ord 'N' == 78 ? ( chr 11, '"\013"' ) : ()
    );

Improved vertical alignments by avoiding pruning step

There is a step in vertical alignment where the alignments are formed into a tree with different levels, and some deeper levels are pruned to preserve lower level alignments. This usually works well, but some deeper alignments will be lost, which is what was happening in the example below. It turns out that if the tree pruning is skipped when alignment depths increase monotonically across lines, as in the example, then better overall alignment is achieved by the subsequent 'sweep' pass.

    # OLD
    my $cmd = shift @ARGV;
    if    ( $cmd eq "new" )         { $force_new = 1; }
    elsif ( $cmd eq "interactive" ) { $interactive = 1; $batch       = 0; }
    elsif ( $cmd eq "batch" )       { $batch       = 1; $interactive = 0; }
    elsif ( $cmd eq "use_old" )     { $use_old = 1; }
    elsif ( $cmd eq "show" )        { $show    = 1; last; }
    elsif ( $cmd eq "showall" )     { $showall = 1; last; }
    elsif ( $cmd eq "show_all" )    { $showall = 1; last; }
    elsif ( $cmd eq "remove" )      { $remove  = 1; last; }
    elsif ( $cmd eq "help" )        { $help    = 1; last; }

    # NEW
    my $cmd = shift @ARGV;
    if    ( $cmd eq "new" )         { $force_new   = 1; }
    elsif ( $cmd eq "interactive" ) { $interactive = 1; $batch       = 0; }
    elsif ( $cmd eq "batch" )       { $batch       = 1; $interactive = 0; }
    elsif ( $cmd eq "use_old" )     { $use_old     = 1; }
    elsif ( $cmd eq "show" )        { $show        = 1; last; }
    elsif ( $cmd eq "showall" )     { $showall     = 1; last; }
    elsif ( $cmd eq "show_all" )    { $showall     = 1; last; }
    elsif ( $cmd eq "remove" )      { $remove      = 1; last; }
    elsif ( $cmd eq "help" )        { $help        = 1; last; }

This update was made 14 Dec 2020, 44e0afa.

Improved some marginal vertical alignments

This update fixed a rare situation in which some vertical alignment was missed. The problem had to do with two lines being incorrectly marked as a marginal match. A new routine, 'match_line_pairs' was added to set a flag with the information needed to detect and prevent this. This fix was made 13 Dec 2020, 9a8e49b.

    # OLD
    $sec = $sec + ( 60 * $min );
    $graphcpu[$sec] = $line;
    $secmax  = $sec  if ( $sec > $secmax );
    $linemax = $line if ( $line > $linemax );

    # NEW
    $sec            = $sec + ( 60 * $min );
    $graphcpu[$sec] = $line;
    $secmax         = $sec  if ( $sec > $secmax );
    $linemax        = $line if ( $line > $linemax );

Do not align equals across changes in continuation indentation

A rule was added to prevent vertical alignment of lines with leading '=' across a change in continuation indentation. Sometimes aligning across a change in CI can come out okay, but sometimes it can be very poor. For example:

    # BAD:
    $!               = 2, die qq/$0: can't stat -${arg}'s "$file"./
        unless $time = ( stat($file) )[$STAT_MTIME];

    # FIXED:
    $! = 2, die qq/$0: can't stat -${arg}'s "$file"./
      unless $time = ( stat($file) )[$STAT_MTIME];

The second line is a continuation of the first, and this update prevents this alignment. The above 'BAD' formatting was in the previous developmental version of perltidy, not the previous release. This update was added 12 Dec 2020, 5b56147.

Improve vertical alignment in some two-line matches

When two lines would be perfectly aligned except for the line length limit, previously they would only be aligned if they had a common leading equals. The update removes this restriction and allows as many alignments to be made as possible. The results are generally improved. This update was made 11 Dec 2020, f3c6cd8. Some examples:

In this example the side comments were limiting the matches:

    # OLD
    shift @data if @data and $data[0] =~ /Contributed\s+Perl/;    # Skip header
    pop @data if @data and $data[-1] =~ /^\w/;    # Skip footer, like

    # NEW
    shift @data if @data and $data[0]  =~ /Contributed\s+Perl/;    # Skip header
    pop @data   if @data and $data[-1] =~ /^\w/;    # Skip footer, like

The same is true here:

    # OLD
    if ($tvg::o_span) { $tvg::hour_span = $tvg::o_span; }
    if ( $tvg::hour_span % 2 > 0 ) { $tvg::hour_span++; }    # Multiple of 2

    # NEW
    if ($tvg::o_span)              { $tvg::hour_span = $tvg::o_span; }
    if ( $tvg::hour_span % 2 > 0 ) { $tvg::hour_span++; }    # Multiple of 2

In the next example, the first comma is now aligned but not the second, because of the line length limit:

    # OLD
    is( MyClass->meta, $mc, '... these metas are still the same thing' );
    is( MyClass->meta->meta, $mc->meta, '... these meta-metas are the same thing' );

    # NEW
    is( MyClass->meta,       $mc, '... these metas are still the same thing' );
    is( MyClass->meta->meta, $mc->meta, '... these meta-metas are the same thing' );

In this last example, the first comma is not aligned, but alignment resumes after the second comma.

    # OLD
    is( $obj->name, $COMPRESS_FILE, "   Name now set to '$COMPRESS_FILE'" );
    is( $obj->prefix, '', "   Prefix now empty" );

    # NEW
    is( $obj->name, $COMPRESS_FILE, "   Name now set to '$COMPRESS_FILE'" );
    is( $obj->prefix, '',           "   Prefix now empty" );

Improve vertical alignment in some marginal matches

In perltidy a 'marginal match' occurs, for example, when two lines share some alignment tokens but are somewhat different. When this happens some limits are placed on the size of the padding spaces that can be introduced. In this update the amount of allowed padding is significantly increased for certain 'good' alignment tokens. Results of extensive testing were favorable provided that the change is restricted to alignments of '=', 'if' and 'unless'. Update made 10 Dec 2020, a585f0b.

    # OLD
    my @roles = $self->role_names;
    my $role_names = join "|", @roles;

    # NEW
    my @roles      = $self->role_names;
    my $role_names = join "|", @roles;

    # OLD
    $sysname .= 'del' if $self->label =~ /deletion/;
    $sysname .= 'ins' if $self->label =~ /insertion/;
    $sysname .= uc $self->allele_ori->seq if $self->allele_ori->seq;

    # NEW
    $sysname .= 'del'                     if $self->label =~ /deletion/;
    $sysname .= 'ins'                     if $self->label =~ /insertion/;
    $sysname .= uc $self->allele_ori->seq if $self->allele_ori->seq;

Improve vertical alignment of lines ending in fat comma

A minor adjustment was made to the rule for aligning lines which end in '=>'. When there are just two lines in an alignment group, the alignment is avoided if the first of the two ends in a '=>'. Previously, alignment was avoided if either ended in a '=>'. The old rule was preventing some good alignments in a later stage of the iteration. In the following example, the last two lines are processed separately because they do not match the comma in 'sprintf'. The new rule allows the fat comma alignment to eventually get made later in the iteration. Update made 9 Dec 2020, ca0ddf4.

    # OLD
    $template->param(
        classlist => $classlist,
        ...,
        suggestion => $suggestion,
        totspent   => sprintf( "%.2f", $totspent ),
        totcomtd   => sprintf( "%.2f", $totcomtd ),
        totavail   => sprintf( "%.2f", $totavail ),
        nobudget => $#results == -1 ? 1 : 0,
        intranetcolorstylesheet =>
          C4::Context->preference("intranetcolorstylesheet"),
        ...
    );

    # NEW
    $template->param(
        classlist => $classlist,
        ...,
        suggestion              => $suggestion,
        totspent                => sprintf( "%.2f", $totspent ),
        totcomtd                => sprintf( "%.2f", $totcomtd ),
        totavail                => sprintf( "%.2f", $totavail ),
        nobudget                => $#results == -1 ? 1 : 0,
        intranetcolorstylesheet =>
          C4::Context->preference("intranetcolorstylesheet"),
        ...
    );

Avoid processing a file more than once

In the unlikely event that a user enters a filename more than once on the command line to perltidy, as for 'file1.pl' here

  perltidy file1.pl file1.pl 

then that file will be processed more than once. This looks harmless, but if the user was also using the -b (backup) parameter, then the original backup would be overwritten, which is not good. To avoid this, a filter has been placed on the list of files to remove duplicates. 9 Dec 2020, 646a542.
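A filter of this kind is a common Perl idiom. The sketch below shows one way to write it; the function name `unique_files` is hypothetical and the code is not copied from perltidy.

```perl
use strict;
use warnings;

# Remove repeated file names while keeping the first occurrence of each,
# so the order given on the command line is preserved.
sub unique_files {
    my @files = @_;
    my %seen;
    return grep { !$seen{$_}++ } @files;
}

my @args = unique_files(qw(file1.pl file2.pl file1.pl));
print "@args\n";    # prints "file1.pl file2.pl"
```

Note that this sketch compares names as plain strings, so two different spellings of the same path (for example 'file1.pl' and './file1.pl') would not be recognized as duplicates.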

Fix for issue git #49, exit status not correctly set

The exit status flag was not being set for the -w option if either the -se or the -q flag was set. Issue git #44 was similar but a special case of the problem. The problem was fixed 8 Dec 2020, cb6028f.

Issues fixed after release 20201202

Fix for issue git #47

This issue has to do with the --weld-nested-containers option in the specific case of formatting a function which returns a list of anonymous subs. For example

    $promises[$i]->then(
        sub { $all->resolve(@_); () },
        sub {
            $results->[$i] = [@_];
            $all->reject(@$results) if --$remaining <= 0;
            return ();
        }
    );

A bug introduced in v20201202 caused an incorrect welding to occur when the -wn flag was set

    $promises[$i]->then( sub { $all->resolve(@_); () },
        sub {
        $results->[$i] = [@_];
        $all->reject(@$results) if --$remaining <= 0;
        return ();
        } );

This bug has been fixed, and code which has been incorrectly formatted will be correctly formatted with the next release. The bug was a result of new code introduced in v20201202 for fixing some issues with parsing sub signatures. Previously they were sometimes parsed the same as prototypes and sometimes as lists; now they are always parsed as lists. Fixed 6 Dec 2020, 6fd0c4f.

Issues fixed after release 20201001

removed excess spaces in a package declaration

Testing revealed that for a line such as

   package        Bob::Dog;

which has extra spaces or even tabs after the keyword 'package', the extra spaces or tabs were not being removed. This was fixed 28 Nov 2020, 008443d. The line now formats to

    package Bob::Dog;

do not automatically delete closing side comments with --indent-only

For the parameter combination --indent-only and --closing-side-comments, old closing side comments were getting deleted but new closing side comments were not made. A fix was made to prevent this deletion. This fix was made 27 Nov 2020, 957e0ca.

fix to stop at 1 iteration when using --indent-only

Previously, for the combination --indent-only and -conv, two iterations would be done. Only one iteration is necessary in this case. Fix made 23 Nov 2020, ae493d8.

fix for formatting signed numbers with spaces

In developing an improved convergence test, an issue slowing convergence was found related to signed numbers as in the following line,

    @london = (deg2rad(-  0.5), deg2rad(90 - 51.3));

The leading '-' here is separated from the following number '0.5'. This is handled by tokenizing the minus as type 'm' and the number as type 'n'. The whitespace between them is removed in formatting, and so the space is gone in the output. A small problem is that the default rule for placing spaces within the parens is based on the token count. After the first formatting the result is

    @london = ( deg2rad( -0.5 ), deg2rad( 90 - 51.3 ) );

The next time it is formatted, the '-0.5' counts as one token, resulting in

    @london = ( deg2rad(-0.5), deg2rad( 90 - 51.3 ) );

Notice that the space within the parens around the '-0.5' is gone. An update was made to fix this, so that the final state is reached in one step. This fix was made 23 Nov 2020, f477c8b.

fix to prevent conversion of a block comment to hanging side comment

A rare situation was identified during testing in which a block comment could be converted to be a hanging side comment. For example:

    sub macro_get_names {    #
    #
    # %name = macro_get_names();  (key=macrohandle, value=macroname)
    #
        local (%name) = ();
        ...
    }

For the following specific conditions the block comment in line 2 could be converted into a hanging side comment, which is undesirable:

   1. The line contains nothing except for a '#' with no leading space
   2. It follows a line with side comment
   3. It has indentation level > 0

An update was made to prevent this from happening. There are two cases, depending on the value of --maximum-consecutive-blank-lines, or -mbl. If this value is positive (the default) then a blank line is inserted above the block comment to prevent it from becoming a hanging side comment. If -mbl is zero, then the comment is converted to be a static block comment which again prevents it from becoming a hanging side comment. This fix was made 23 Nov 2020, 2eb3de1.
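The three conditions listed above can be written as a small predicate. This is a sketch for illustration only; the function name and its arguments are hypothetical and do not correspond to perltidy's internal interface.

```perl
use strict;
use warnings;

# Hypothetical predicate: true if a line is at risk of being converted
# from a block comment into a hanging side comment.
sub could_become_hanging_side_comment {
    my ( $line, $prev_line_has_side_comment, $level ) = @_;

    # 1. the line contains nothing except a '#', with no leading space
    return 0 unless $line =~ /^#\s*$/;

    # 2. it follows a line with a side comment
    return 0 unless $prev_line_has_side_comment;

    # 3. it has indentation level > 0
    return 0 unless $level > 0;

    return 1;
}

# Line 2 of the example above: a bare '#' after 'sub macro_get_names {    #'
print could_become_hanging_side_comment( "#\n", 1, 1 ), "\n";    # 1

# A bare '#' at level 0, or one with leading space, is not affected
print could_become_hanging_side_comment( "#\n",     1, 0 ), "\n";    # 0
print could_become_hanging_side_comment( "    #\n", 1, 1 ), "\n";    # 0
```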

improved convergence test

A better test for convergence has been added. When iterations are requested, the new test will stop after the first pass if no changes in line break locations are made. Previously, at least two passes were required to verify convergence unless the output stream had the same checksum as the input stream. Extensive testing has been done to verify the correctness of the new test. This update was made 23 Nov 2020, 29efb63.
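The idea of stopping when line break locations are unchanged can be sketched as follows. This is an illustration under stated assumptions, not perltidy's implementation: here the 'break locations' of a text are approximated by the number of non-whitespace characters on each line, and `break_signature`, `iterate_until_stable` and the toy formatters are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical signature of line break locations: the count of
# non-whitespace characters on each line. Two texts with the same
# signature differ at most in spacing, not in where lines are broken.
sub break_signature {
    my ($text) = @_;
    my @counts;
    for my $line ( split /\n/, $text, -1 ) {
        ( my $stripped = $line ) =~ s/\s+//g;
        push @counts, length $stripped;
    }
    return join ',', @counts;
}

# Run $formatter for at most $max_passes passes, stopping as soon as a
# pass leaves the break locations unchanged.
# Returns the final text and the number of passes used.
sub iterate_until_stable {
    my ( $text, $formatter, $max_passes ) = @_;
    my $passes = 0;
    while ( $passes < $max_passes ) {
        my $new = $formatter->($text);
        $passes++;
        my $same_breaks = break_signature($new) eq break_signature($text);
        $text = $new;
        last if $same_breaks;
    }
    return ( $text, $passes );
}

# Toy formatters: one only adjusts spacing, the other joins a line
# ending in '=' with the line that follows it.
my $respace = sub { my ($s) = @_; $s =~ s/[ \t]+/ /g;  return $s };
my $join    = sub { my ($s) = @_; $s =~ s/=\n\s*/= /g; return $s };

my ( undef, $n1 ) = iterate_until_stable( "a  =  1;\nb = 2;\n", $respace, 5 );
my ( undef, $n2 ) = iterate_until_stable( "my \$x =\n    1;\n", $join,    5 );
print "$n1 $n2\n";    # prints "1 2"
```

The first formatter changes only spacing, so one pass is enough; the second moves a line break, so a second pass is needed to confirm that nothing moves again.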

fixed problem with vertical alignments involving 'if' statements

An update was made to break vertical alignment when a new sequence of if-like statements or ternary statements is encountered. This situation was causing a loss of alignment in some cases. For example

    # OLD:
    $m1 = 0;
    if ( $value =~ /\bset\b/i )      { $m0 = 1; }
    if ( $value =~ /\barithmetic/i ) { $m1 = 1; }
    if    ( $m0 && !$m1 ) { $CONFIG[1] = 0; }
    elsif ( !$m0 && $m1 ) { $CONFIG[1] = 1; }
    else                  { $ok        = 0; last; }

    # NEW:
    $m1 = 0;
    if    ( $value =~ /\bset\b/i )      { $m0        = 1; }
    if    ( $value =~ /\barithmetic/i ) { $m1        = 1; }
    if    ( $m0 && !$m1 )               { $CONFIG[1] = 0; }
    elsif ( !$m0 && $m1 )               { $CONFIG[1] = 1; }
    else                                { $ok        = 0; last; }

This update was made 15 Nov 2020, 2b7784d.

added option -wnxl=s to give control of welding by the -wn parameter

The parameter string can restrict the types of containers which are welded. This was added 11 Nov 2020 in 'added -wnxl=s for control of -wn', 2e642d2.

merged pull request git #46

The man page gave the incorrect string for -fse. This was fixed 11 Nov 2020 in 1f9869e.

recognize overloaded RPerl operators to avoid error messages

RPerl uses some bareword operators which caused error messages. An update was made to avoid this problem in files containing 'use RPerl'. This update was made 6 Nov 2020, f8bd088.

fix issue git #45, -wn and -vtc=n now work together

When -wn was set, the -vtc=n flag was being ignored. This was a simple fix made 5 Nov 2020 in 'fix issue git #45, -wn and -vtc=n now work together', 1fbc381.

implement request RT #133649, added parameters -kbb=s and -kba=s

These parameters request that old breakpoints be kept before or after selected token types. For example, -kbb='=>' means that newlines before fat commas should be kept. This was added 4 Nov 2020.

added parameters -maxue=n and -maxle=n

These parameters had tentatively been hardwired in the tokenizer. Now the user can control them or turn the checks off altogether.

Fix problem parsing '$$*'

In random testing, an error was encountered parsing the following line

  $self->{"mod_id"}=($$*1001)%(10**(rand()*6));
                       ---^
  found Number where operator expected (previous token underlined)

The line parsed correctly with a space between the '$$' and the '*'. The problem had to do with an error in some newer code for postfix dereferencing, and this was fixed on 2 Nov 2020, 'fix problem scanning '$$'; revise call to operator_expected', 49d993b.

Update for git #44, fix exit status for assert-tidy/untidy

The exit status was always 0 for --assert-tidy if the user had turned off error messages with -quiet. This was fixed by gluesys/master in 'fix exit status for assert-tidy/untidy options', 625d250.

Fixed problem parsing extruded signature

A parsing error was encountered in a test parsing the following extruded signature:

  sub foo2
  (
  $
  first
  ,
  $
  ,
  $
  third
  )
  {
  return
  "first=$first, third=$third"
  ;
  }

The second '$' combined with the ',' on the next line to form a punctuation variable. This was fixed 20 Oct 2020 in 'fixed problem parsing extruded signature', 9b454f6.

The file parses correctly now, with formatted output

  sub foo2 ( $first, $, $third ) {
      return "first=$first, third=$third";
  }

Fixed several uses of undefined variables found in testing

Several instances of incorrect array indexing were found in testing and fixed. These each involved incorrectly indexing with index -1. They were found by placing undefs at the end of arrays. None of these was causing incorrect formatting. They were fixed 26 Oct 2020 in 'fixed several instances of incorrect array indexing', c60f694.
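
The detection trick mentioned above can be seen in isolation. The following is a minimal sketch in plain Perl, not code from perltidy; the array `@tokens` and the index `$i` are hypothetical. It shows that index -1 silently wraps to the last element, and that an undef placed at the end of the array turns the same mistake into a visible warning.

```perl
use strict;
use warnings;

my @tokens = qw( a b c );    # hypothetical token list
my $i      = 0;

# Bug pattern: asking for the element before the first one.
# Index -1 silently wraps around to the last element.
my $wrapped = $tokens[ $i - 1 ];
print "without sentinel: $wrapped\n";

# Detection trick: with an undef at the end of the array, the same
# mistake yields undef, which produces a warning as soon as it is used.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };
push @tokens, undef;
my $exposed = $tokens[ $i - 1 ];
my $text    = "with sentinel: $exposed\n";
print scalar(@warnings), " warning(s) captured\n";
```

Without the sentinel the wrong element is returned quietly, which is consistent with the note above that none of these errors had been causing incorrect formatting.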

Prevent syntax error by breaking dashed package names

In stress testing perltidy with the -extrude option, the following test snippet

  use perl6-alpha;

was broken into separate lines

  use
  perl6
  -
  alpha
  ;

A rule was added to prevent breaking around a dash separating two barewords. Rerunning gives

  use
  perl6-alpha
  ;

This was fixed 26 Oct 2020 in 'prevent breaking package names with trailing dashes', 9234be4.

Prevent syntax error by breaking dashed barewords

In stress testing perltidy with the -extrude option, using the following test snippet

  my %var;
  {
      $var{-y}  = 1;
      $var{-q}  = 1;
      $var{-qq} = 1;
      $var{-m}  = 1;
      $var{y}   = 1;
      $var{q}   = 1;
      $var{qq}  = 1;
      $var{m}   = 1;
  }

a syntax error was created when newlines were placed before or after the dashes. It is necessary to always keep a dash on the same line with its surrounding tokens. A rule was added to do this. The new 'extruded' result for the above snippet is:

  my%var
  ;
  {
  $var{-y}
  =
  1
  ;
  $var{-q}
  =
  1
  ;
  $var{-qq}
  =
  1
  ;
  $var{-m}
  =
  1
  ;
  $var{y}
  =
  1
  ;
  $var{q}
  =
  1
  ;
  $var{qq}
  =
  1
  ;
  $var{m}
  =
  1
  ;
  }

This update was added 26 Oct 2020, 'prevent syntax error by breaking dashed barewords', e121cae.

more types of severe errors will prevent formatting

Files for which 'severe errors' are found have always been output verbatim rather than being formatted. The definition of 'severe error' has been expanded to include a final indentation level error greater than 1, more than 2 brace errors, and more than 3 "unexpected token type" parsing errors. The goal is to avoid formatting a non-perl script or a perl script with severe errors. For example, the following snippet, which has a level error of 2,

  {{{{
  }}

was previously output with default parameters as

  { 
      {
          {
              {}
          }

along with an error message. But now it is just output verbatim as

  {{{{
  }}

along with an error message. This update was added 25 Oct 2020, 'avoid formatting files with more types of severe errors', 2a86f51.

added 'state' as keyword

A statement such as the following was generating an error message at the colon:

   state $a : shared;

The problem was that 'state' was not in the list of keywords. This has been fixed and the line now parses without error. The 'state.t' test file for perl 5.31 now formats without error. This was added 18 Oct 2020 in "add 'state' as keyword", d73e15f.

sub signatures no longer parsed with prototypes

Simple signatures (those without commas) were being parsed with code originally written for prototypes. This prevented them from being formatted with the usual formatting rules. This was changed so that all signatures are now formatted with the normal formatting rules. For example:

 # Old, input and after formatting:
 sub t123 ($list=wantarray) { $list ? "list" : "scalar" }

 # New, after formatting
 sub t123 ( $list = wantarray ) { $list ? "list" : "scalar" }

Notice that some spaces have been introduced within the signature. Previously the contents of the parens were not changed unless the parens contained a list.

This change introduced a problem parsing extended syntax within signatures which has been fixed. In the following snippet, the ':' caused a parsing error which was fixed.

  # perltidy -sal='method'
  method foo4 ( $class : $bar, $bubba ) { $class->bar($bar) }

The ':' here is given a type of 'A'. This may be used to change the spacing around it. For example:

  # perltidy -sal='method' -nwls='A'
  method foo4 ( $class: $bar, $bubba ) { $class->bar($bar) }

This update was added 18 Oct 2020, in 'format all signatures separately from prototypes', e6a10f3. The test file 'signatures.t' distributed with perl5.31 formats without error now.

fix parsing problem with $#

A problem with parsing variables of the form $# and $#array was found in testing and fixed. For most variables the leading sigil may be separated from the remaining part of the identifier by whitespace. An exception is for a variable beginning with '$#'. If there is any space between the '$' and '#' then the '#' starts a comment. So the following snippet has valid syntax and is equivalent to $ans=40;

    my $ #
    #
    ans = 40;

This was being misparsed and was fixed 17 Oct 2020, in 'fixed parsing error with spaces in $#' a079cdb.

fix missing line break for hash of subs with signatures

During testing the following error was found and fixed. Given the following input snippet:

    get(
        on_ready => sub ($worker) {
            $on_ready->end;
            return;
        },
        on_exit => sub ( $worker, $status ) { return; },
    );

The resulting formatting was

    get(
        on_ready => sub ($worker) {
            $on_ready->end;
            return;
        }, on_exit => sub ( $worker, $status ) { return; },
    );

Notice that the break after the comma has been lost. The problem was traced to a short-cut taken by the code looking for one-line blocks. The unique circumstances in which this occurred involved a hash of anonymous subs, one with a signature with multiple parameters and short enough to be a one-line block, as in the last sub definition line. This was fixed 17 Oct 2020 in 'fix missing line break for hash of subs with signatures', 51428db.

fix issues with prototype and signature parsing

Problems with parsing prototypes and signatures were found during testing and fixed 17 Oct 2020 in 'fixed problem parsing multi-line signatures with comments', 017fd07. For example the following snippet was mis-parsed because of the hash mark.

    sub test ( # comment ))) 
        $x, $x) { $x+$y }

Complex signature expressions such as the following are now parsed without error:

    sub t086
        ( #foo)))
        $ #foo)))
        a #foo)))
        ) #foo)))
        { $a.$b }

improve guess for pattern or division

The following line caused a tokenization error in which the two slashes were parsed as a pattern.

   my $masksize = ceil( Opcode::opcodes / 8 );    # /

This problem was discovered in random testing. When a slash follows a bareword whose prototype is not known to perltidy, it has to guess whether the slash starts a pattern or is a division. The guessing logic was rewritten and improved 14 Oct 2020 in 'rewrote logic to guess if divide or pattern', afebe2f.
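
To see why a guess is needed, consider how perl itself resolves the ambiguity. The sketch below is plain Perl and is only an illustration: the sub `opcodes` is a hypothetical stand-in declared with an empty prototype, which is the kind of information perl has available but perltidy does not.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Opcode::opcodes. The empty prototype tells
# perl that the sub takes no arguments, so an operator must follow it.
sub opcodes () { return 16 }

# Because of the () prototype, perl parses the '/' as division
# rather than as the start of a pattern.
my $masksize = opcodes / 8;    # /
print "masksize = $masksize\n";
```

A formatter that sees only the text `opcodes / 8 ); # /` has no prototype to consult, so it has to decide from context alone.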

fix -bos to keep isolated semicolon breaks after block braces

The flag -bos, or --break-at-old-semicolon-breakpoints, keeps breaks at old isolated semicolons. For example

    $z = sqrt($x**2 + $y**2)
      ;

In testing it was found not to be doing this after braces which require semicolons, such as 'do' and anonymous subs. This was fixed 12 Oct 2020 in 'fix -bos to work with semicolons after braces', 03ee7fc. For example

    my $dist = sub {
        $z = sqrt( $x**2 + $y**2 )
          ;
      }
      ;

keep break after 'use overload'

If a line break occurs after use overload then it will now be kept. Previously it was dropped. For example, this would be kept intact:

                use overload
                    '+' => sub {
                        print length $_[2], "\n";
                        my ( $x, $y ) = _order(@_);
                        Number::Roman->new( int $x + $y );
                    },
                    '-' => sub {
                        my ( $x, $y ) = _order(@_);
                        Number::Roman->new( int $x - $y );
                    },
                    ...

This keeps the list from shifting to the right and can avoid problems in formatting the list with certain styles, including with the -xci flag. Fixed 12 Oct 2020 in 'keep break after use overload statement', 8485afa.

added flag -xci to improve formatting when -ci and -i are equal, issue git #28

This flag causes continuation indentation to "extend" deeper into structures. If you use -ci=n and -i=n with the same value of n you will probably want to set this flag. Since this is a fairly new flag, the default is -nxci to avoid disturbing existing formatting.

terminal braces not indenting correctly with -bli formatting, issue git #40

This problem is illustrated with the following snippet when run with -bli -blil='*'

    # perltidy -bli -blil='*'
    try
      {
        die;
      }
    catch
      {
        die;
      };    # <-- this was not indenting

This was due to conflicting rules and was fixed 1 Oct 2020 in commit 'fix issue git #40, incorrect closing brace indentation with -bli', a5aefe9.

At the same time, it was noted that block types sort/map/grep and eval were not following -bli formatting when -blil='*' and this was fixed. For example, with corrected formatting, we would have

  # perltidy -bli -blil='*'
    eval
      {
        my $app = App::perlbrew->new( "install-patchperl", "-q" );
        $app->run();
      }
      or do
      {
        $error          = $@;
        $produced_error = 1;
      };

Issues fixed after release 20200907

This is a detailed log of changes since the release 20200907. All bugs were found with the help of automated random testing.

Keep any space between a bareword and quote

In random testing, the -mangle option introduced a syntax error by deleting the space between barewords and quotes (test file 'MxScreen'), such as:

  oops"Your login, $Bad_Login, is not valid";

Sub 'is_essential_whitespace' was updated to prevent this on 27 Sep 2020, in 'keep any space between a bareword and quote', f32553c.

Fixed some incorrect indentation disagreements reported in LOG file

The .LOG file reports any disagreements between the indentation of the input and output files. This can help locate brace errors. These were incorrect when some of the options were used, including --whitespace-cycle, -bbhb, -nib. This was corrected 24 Sep 2020, 'fixed incorrect log entry for indentation disagreement', 3d40545. At the same time, locations of closing brace indentation disagreements are now tracked and reported in the .ERR file when there is a brace error. This can help localize the error if the file was previously formatted by perltidy.

If an =cut starts a POD section within code, give a warning

Previously only a complaint was given, which went into the log file and was not normally seen. Perl silently accepts this but it can cause significant problems with pod utilities, so a clear warning is better. This situation arose in testing on random files in combination with a -dp flag and it took some time to understand the results because of the lack of a warning.

Switched from using an eval block to the ->can() function for sub finish_formatting

This is not a bug, but is cleaner coding and ensures that error messages get reported. This change was made 20 Sep 2020, 'switch from eval { } to ->can('finish_formatting')', 28f2a4f.

fixed uninitialized value reference

The following message was generated during automated testing

 Use of uninitialized value $cti in numeric eq (==) at /home/steve/bin/Perl/Tidy/Formatter.pm line 12079.
 Use of uninitialized value $cti in numeric eq (==) at /home/steve/bin/Perl/Tidy/Formatter.pm line 12089.
 Use of uninitialized value $cti in numeric eq (==) at /home/steve/bin/Perl/Tidy/Formatter.pm line 12097.

The problem could be simplified to running perltidy -wn on this snippet:

     __PACKAGE__->load_components( qw(
          Core

      ) );

This was fixed 20 Sep 2020 in 'fixed_uninitialized_value', 8d6c4ed.

fix incorrect parsing of certain deprecated empty here-docs

The following snippet was being incorrectly parsed:

 print <<
 # Hello World 13!
 
   ;
 print "DONE\n";

This is a deprecated here-doc without a specified target but currently still a valid program. It would have been correctly parsed if the semicolon followed the '<<' operator rather than the here-doc.

This was found in random testing and fixed 16 Sep 2020. A warning message about deprecated here-doc targets was added.

make the arrow a vertical alignment token, git #39

The -> can now be vertically aligned if a space is placed before it with -wls='->'. Added 15 Sep 2020 as part of previous item, 9ac6af6.

add flags -bbhb=n, -bbsb=n, -bbp=n, git #38

These flags give control over the opening token of a multiple-line list. They are described in the man pages, perltidy.html. Added 15 Sep 2020 in "added flags -bbhb=n, -bbsb=n, -bbq=n, suggestion git #38", 9ac6af6.

Allow vertical alignment of line-ending fat comma

A change was made to allow a '=>' at the end of a line to align vertically, provided that it aligns with two or more other '=>' tokens. This update was 14 Sep 2020, 'Allow line-ending '=>' to align vertically', ea96739.

fixed uninitialized value reference

The following message was generated when running perltidy on random text:

 Use of uninitialized value $K_semicolon in subtraction (-) at /home/steve/bin/Perl/Tidy/Formatter.pm line 16467.

This was fixed 14 Sep 2020, included in 'Allow line-ending '=>' to align vertically', ea96739.

Do not create a zero size file by deleting semicolons

A rule was added to prevent a file consisting of a single semicolon

 ;

from becoming a zero length file. This could cause problems with other software. Fixed 13 Sep 2020, 'do not create a zero length file by deleting semicolons', b39195e.

fixed uninitialized value reference

The following message was generated when running perltidy on random text:

 Use of uninitialized value $cti in numeric eq (==) at /home/steve/bin/Perl/Tidy/Formatter.pm line 11926.
 Use of uninitialized value $cti in numeric eq (==) at /home/steve/bin/Perl/Tidy/Formatter.pm line 11936.
 Use of uninitialized value $cti in numeric eq (==) at /home/steve/bin/Perl/Tidy/Formatter.pm line 11944.

This was fixed 13 Sep 2020 in 'fixed unitialized variable problem ', adb2096.

fixed uninitialized value reference

The following message was generated when running perltidy on random text:

 substr outside of string at /home/steve/bin/Perl/Tidy/Tokenizer.pm line 7362.
 Use of uninitialized value in concatenation (.) or string at /home/steve/bin/Perl/Tidy/Tokenizer.pm line 7362.

This was fixed 13 Sep 2020 in 'fixed unitialized variable problem', 5bf49a3.

fixed uninitialized value reference

The following message was generated when running perltidy on random text:

 Use of uninitialized value $K_opening in subtraction (-) at /home/steve/bin/Perl/Tidy/Formatter.pm line 16467.

This was fixed 13 Sep 2020 in 'fix undefined variable reference', 1919482.

hashbang warning changed

The following snippet generated a warning that there might be a hash-bang after the start of the script.

 $x = 2;
 #!  sunos does not yet provide a /usr/bin/perl
 $script = "$^X $script";

To prevent this annoyance, the warning is not given unless the first nonblank character after the '#!' is a '/'. Note that this change is just for the warning message. The actual hash bang check does not require the slash.

Fixed 13 Sep 2020, 'prevent unnecessary hash-bang warning message' 4f7733e and 'improved hash-bang warning filter', fa84904.
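
The filter described above amounts to a simple test on the text following the '#!'. Here is a minimal sketch of such a test in plain Perl. The helper name `looks_like_hash_bang` and the regular expression are assumptions made for illustration and are not taken from the perltidy source.

```perl
use strict;
use warnings;

# Hypothetical helper, not perltidy's code: true if the first nonblank
# character after '#!' is a '/', so that a warning would be worthwhile.
sub looks_like_hash_bang {
    my ($line) = @_;
    return $line =~ m{^#!\s*/} ? 1 : 0;
}

my @lines = (
    '#!/usr/bin/perl',
    '#!  /usr/bin/env perl',
    '#!  sunos does not yet provide a /usr/bin/perl',
);

for my $line (@lines) {
    my $verdict = looks_like_hash_bang($line) ? 'warn' : 'no warning';
    print "$verdict: $line\n";
}
```

The third line is the comment from the snippet above. It contains a path, but the first nonblank character after the '#!' is not a slash, so it no longer triggers the warning.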

uninitialized index referenced

An uninitialized index was referenced when running on a file of randomly generated text:

  Use of uninitialized value $K_oo in subtraction (-) at /home/steve/bin/Perl/Tidy/Formatter.pm line 7259.

This was fixed 12 Sep 2020 in 'fixed undefined index', 616bb88.

Oops message triggered

The parameter combination -lp -wc triggered an internal bug message from perltidy:

 398: Program bug with -lp.  seqno=77 should be 254 and i=1 should be less than max=-1
 713: The logfile perltidy.LOG may contain useful information
 713: 
 713: Oops, you seem to have encountered a bug in perltidy.  Please check the
 713: BUGS file at http://perltidy.sourceforge.net.  If the problem is not
 713: listed there, please report it so that it can be corrected.  Include the
 ...

The problem is that the parameters --line-up-parentheses and --whitespace-cycle=n are not compatible. The fix is to write a message and turn off the -wc parameter when both occur. This was fixed 8 Sep 2020 in "do not allow -wc and -lp together, can cause bugs", 7103781.

Internal fault detected by perltidy

This snippet, after processing with the indicated parameters, triggered a Fault message in store_token_to_go due to discontinuous internal index values:

  perltidy --noadd-newlines --space-terminal-semicolon

  if ( $_ =~ /PENCIL/ ) { $pencil_flag= 1 } ; ;
  $yy=1;

This triggered the message:

 ==============================================================================
 While operating on input stream with name: '<stdin>'
 A fault was detected at line 7472 of sub 'Perl::Tidy::Formatter::store_token_to_go'
 in file '/home/steve/bin/Perl/Tidy/Formatter.pm'
 which was called from line 8298 of sub 'Perl::Tidy::Formatter::process_line_of_CODE'
 Message: 'Unexpected break in K values: 591 != 589+1'
 This is probably an error introduced by a recent programming change. 
 ==============================================================================

The deletion of the extra, spaced semicolon had created an extra space in the token array which had not been foreseen in the original programming. It was fixed 10 Sep 2020 in "fixed very rare fault found with automated testing", eb1b1d9.

Error parsing deprecated $# variable

This problem can be illustrated with this two-line snippet:

  $#
  eq$,?print"yes\n":print"no\n";

Perltidy joined '$#' and 'eq' to get $#eq, but should have stopped at the line end to get $# followed by keyword 'eq'. (Note that $# is deprecated). This was fixed 11 Sep 2020 in "fixed several fringe parsing bugs found in testing", 85e01b7.

Error message parsing a file with angle brackets and ternaries

This problem can be illustrated with the following test snippet which was not correctly parsed.

 print$$ <300?"$$<300\n":$$<700?"$$<700\n":$$<2_000?"$$<2,000\n":$$<10_000?"$$ <10,000\n":"$$>9,999\n";

The problem is related to the '<' symbol following the '$$' variable, a possible filehandle, and is similar to a previous bug. The problem was corrected 11 Sep 2020 in "fixed several fringe parsing bugs found in testing", 85e01b7. The line now correctly formats to

 print $$ < 300  ? "$$<300\n"
   : $$ < 700    ? "$$<700\n"
   : $$ < 2_000  ? "$$<2,000\n"
   : $$ < 10_000 ? "$$ <10,000\n"
   :               "$$>9,999\n";

code crash with cuddled-else formatting on unbalanced files

A file with incorrect bracing which effectively gave negative indentation caused a crash when a stack was referenced with a negative index. The problem was fixed 8 Sept 2020 in "convert array to hash to avoid trouble with neg levels in bad files", a720e0d.
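
The difference between the two data structures can be shown with a few lines of plain Perl. This is only a sketch of the general hazard, not the perltidy code; the variable names are hypothetical.

```perl
use strict;
use warnings;

my $level = -1;    # what unbalanced braces can effectively produce

# Array used as a stack: storing at a negative index of an empty
# array is a fatal error.
my @stack;
my $array_ok = eval { $stack[$level] = 'else'; 1 } ? 1 : 0;
my $array_error = $@;

# Hash keyed by level: a negative level is just another key.
my %by_level;
my $hash_ok = eval { $by_level{$level} = 'else'; 1 } ? 1 : 0;

print "array store ok: $array_ok\n";
print "hash store ok:  $hash_ok\n";
```

With a hash, a bad file can still give meaningless levels, but looking them up or storing them cannot crash the program.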

error message 'Unterminated angle operator?'

This error can be demonstrated with this line.

  print $i <10 ? "yes" : "no";

Perl has some strange parsing rules near a possible filehandle, and they change over time. The '<' here is a less than symbol, but perltidy expected that it might be the start of an angle operator, based on the old rules, and gave a warning. The formatting was still correct, but the warning was confusing. This has been fixed 8 Sep 2020 in 'remove confusing warning message', 0a4d725.

Line broken after here target

This problem is illustrated with the following snippet

  $sth= $dbh->prepare (<<"END_OF_SELECT") or die "Couldn't prepare SQL" ;
      SELECT COUNT(duration),SUM(duration) 
      FROM logins WHERE username='$user'
  END_OF_SELECT

When run with a short line length it got broken after the here target, causing an error. This was due to a recent program change and fixed 7 Sep 2020 in 'fixed bug where long line with here target got broken', 8f7e4cb.

undefined variable named 'test2'

An uninitialized value was being referenced and triggered this message:

 undefined test2, i_opening=5, max=18, caller=Perl::Tidy::Formatter ./perltidy-20200907.pl 13465
 Use of uninitialized value $test2 in numeric eq (==) at ./perltidy-20200907.pl line 19692.

Fixed 8 Sep 2020 in 'fixed rare problem with stored index values for -lp option', 4147c8c.

Line order switched at start of quoted text

This problem arose in several scripts involving the parameter --line-up-parentheses plus one or more of the vertical tightness flags. It can be illustrated with the following snippet:

    perltidy --line-up-parentheses --paren-vertical-tightness=1

    if (
        ( $name, $chap ) =
        $cur_fname =~ m!^Bible/
          .*?/          # testament
          .*?/          # range of books
          (.*?)/        # book name
          .*?           # optional range of verses
          (\d+)$!x
      )
    {
        $cur_name = "$name $chap";
    }

This gave

    if (( $name, $chap ) =
          .*?/          # testament
        $cur_fname =~ m!^Bible/
          .*?/          # range of books
          (.*?)/        # book name
          .*?           # optional range of verses
          (\d+)$!x
      )
    {
        $cur_name = "$name $chap";
    }

Notice the incorrect line order. The problem was an incorrect order of operations in the vertical aligner flush, leaving a line stranded and coming out in the wrong order. This was fixed 11 Sep 2020.

crash due to bad index named '$j_terminal_match'

This crash was due to an index error which caused a non-existent object to be referenced. The problem is fixed 2020-09-07 in "fix problem of undefined values involving j_terminal_match", c5bfa77. The particular parameters which caused this were:

    --noadd-newlines --nowant-left-space='=' 

an issue with the -x flag

This is not a bug but did take some time to resolve. The problem was reduced to the following script run with the -x flag (--look-for-hash-bang)

 print(SCRIPT$headmaybe . <<EOB . <<'EOF' .$tailmaybe),$!;
 #!$wd/perl
 EOB
 print "\$^X is $^X, \$0 is $0\n";
 EOF

The resulting file had a syntax error (here-doc target EOB changed).

 print(SCRIPT$headmaybe . <<EOB . <<'EOF' .$tailmaybe),$!;
 #!$wd/perl
 EOB print "\$^X is $^X, \$0 is $0\n";
 EOF

The problem is that the -x flag tells perltidy not to start parsing until it sees a line starting with '#!', which happens to be in a here-doc in this case.

A warning was added to the manual 7 Sept 2020 in "add warning about inappropriate -x flag", fe66743.

error parsing sub signature

This problem was reduced to the following snippet:

 substr
 (
  $#
 )

The deprecated variable '$#' was being parsed incorrectly, and this was due to an error in which the word 'substr' followed by a paren was taken as the start of a sub signature. The problem was fixed 8 Sep 2020 in 'fix problem parsing sub prototypes' 569e05f. The code

  $container_type =~ /^sub/;

was corrected to be

  $container_type =~ /^sub\b/;
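
The effect of the added word boundary can be checked directly. In this small sketch the three sample strings are assumptions chosen for illustration; the actual values taken by `$container_type` inside perltidy may differ.

```perl
use strict;
use warnings;

# Hypothetical sample values for the container type
for my $container_type ( 'sub', 'sub foo', 'substr' ) {
    my $old = $container_type =~ /^sub/   ? 1 : 0;    # original test
    my $new = $container_type =~ /^sub\b/ ? 1 : 0;    # corrected test
    print "'$container_type': old=$old new=$new\n";
}
```

Only the corrected pattern rejects 'substr', because there is no word boundary between 'sub' and the 's' that follows it.
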

uninitialized value message, found 7 Sep 2020

Uninitialized values were referenced. An index was not being tested. Fixed 8 Sep 2020 in "fix undefined variable", 9729965.

 Use of uninitialized value $Kon in array element at /home/steve/bin/Perl/Tidy/Formatter.pm line 4022.
 Use of uninitialized value $Kon in array element at /home/steve/bin/Perl/Tidy/Formatter.pm line 4023.
 Use of uninitialized value $Ko in subtraction (-) at /home/steve/bin/Perl/Tidy/Formatter.pm line 4023.

Open Issues

These are known issues which have not been fixed.

lexical subs not fully supported

Basic parsing of lexical subs works but some aspects of lexical subs are not yet functional. One of these is that unlike regular subs, lexical subs can override names of builtin functions.

First consider the following snippet

  sub s { 
    my $arg=$_[0];
    print "s called with arg $arg\n";
  }
  s(1);
  s(2);

The 's' in the two last lines is the builtin s function, not the sub. Both perltidy and perl make the same assumption here. This program happens to still run but prints nothing. It will not run if the last semicolon is removed.

Now consider the following snippet in which the sub has a preceding 'my'

  use feature 'lexical_subs', 'signatures';
  my sub s { 
    my $arg=$_[0];
    print "s called with arg $arg\n";
  }
  s(1);
  s(2);

The builtin function 's' is replaced by the sub s here, and the program runs. Perltidy will format this but it is assuming that the s in the two last lines are the builtin s function. If the last semicolon is removed, there will be a formatting error. So perltidy and perl make different assumptions in this case.

Another issue is that perltidy does not yet remember the extent of the scope of a lexical sub.

issues with paren-less calls

Consider the following snippet:

  use Test::More;
  ok open($stdin, "<&", $1), 'open ... "<&", $magical_fileno', ||  _diag $!;

Note the unusual situation of a comma followed by an '||'. Perltidy will format this satisfactorily but it will write an error message. The syntax is correct, however. Perl knows the prototype of the 'ok' function, which is called here without parens, so the last comma marks the last arg and is needed to keep the || from attaching to the last arg.

Full support of paren-less calls will probably never be implemented in perltidy because it would require that it parse all of the modules used to find the prototypes. This would make it impossible to run perltidy on small snippets of code from within an editor.

The problem can be avoided if parens are used:

  ok ( open($stdin, "<&", $1), 'open ... "<&", $magical_fileno') ||  _diag $!;

multiple sub paren calls

Perltidy currently flags as an error a closing paren followed by an opening paren, as in the following

  $subsubs[0]()(0)

This syntax is ok. The example is from test 'current_sub.t' in perl5.31.

Perl-Tidy-20230309/docs/eos_flag.md0000644000175000017500000002427314266527055015702 0ustar stevesteve# The --encode-output-strings Flag ## What's this about? This is about making perltidy work better as a filter when called from other Perl scripts. For example, in the following example a reference to a string `$source` is passed to perltidy and it stores the formatted result in a string named `$output`: ``` my $output; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, argv => $argv, ); ``` For a filtering operation we expect to be able to directly compare the source and output strings, like this: ``` if ($output eq $source) {print "Your formatting is unchanged\n";} ``` Or we might want to optionally skip the filtering step and pass the source directly on to the next stage of processing. This requires that the source and output strings be in the same storage mode. The problem is that in versions of perltidy prior to 2022 there was a use case where this was not possible. That case was when perltidy received an encoded source and decoded it from a utf8 but did not re-encode it before storing it in the output string. So the source string was in a different storage mode than the output string, and a direct comparison was not meaningful. This problem is an unintentional result of the historical evolution of perltidy. The same problem occurs if the destination is an array rather than a string, so for simplicity we can limit this discussion to string destinations, which are more common. ## How has the problem been fixed? A fix was phased in over a couple of steps. The first step was to introduce a new flag in in version 20220217. The new flag is **--encode-output-strings**, or **-eos**. When this is set, perltidy will fix the specific problem mentioned above by doing an encoding before returning. So perltidy will behave well as a filter when **-eos** is set. 
To illustrate using this flag in the above example, we could write ``` my $output; # Make perltidy versions after 2022 behave well as a filter $argv .= " -eos" if ($Perl::Tidy::VERSION > 20220101); my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, argv => $argv, ); ``` With this modification we can make a meaningful direct comparison of `$source` and `$output`. The test on `$VERSION` allows this to work with older versions of perltidy (which would not recognize the flag -eos). In the second step, introduced in version 20220613, the new **-eos** flag became the default. ## What can go wrong? The first step is safe because the default behavior is unchanged. But the programmer has to set **-eos** for the corrected behavior to go into effect. The second step, in which **-eos** becomes the default, will have no effect on programs which do not require perltidy to decode strings, and it will make some programs start processing encoded strings correctly. But there is also the possibility of **double encoding** of the output, or in other words data corruption, in some cases. This could happen if an existing program already has already worked around this issue by encoding the output that it receives back from perltidy. It is important to check for this. To see how common this problem might be, all programs on CPAN which use Perl::Tidy as a filter were examined. Of a total of 45 programs located, one was identified for which the change in default would definitely cause double encoding, and in one program it was difficult to determine. It looked like the rest of the programs would either not be affected or would start working correctly when processing encoded files. 
Here is a slightly revised version of the code for the program which would have a problem with double encoding with the new default: ``` my $output; Perl::Tidy::perltidy( source => \$self->{data}, destination => \$output, stderr => \$perltidy_err, errorfile => \$perltidy_err, logfile => \$perltidy_log, argv => $perltidy_argv, ); # convert source back to raw encode_utf8 $output; ``` The problem is in the last line where encoding is done after the call to perltidy. This encoding operation was added by the module author to compensate for the lack of an encoding step with the old default behavior. But if we run this code with **-eos**, which is the planned new default, encoding will also be done by perltidy before it returns, with the result that `$output` gets double encoding. This must be avoided. Here is one way to modify the above code to avoid double encoding: ``` my $has_eos_flag = $Perl::Tidy::VERSION > 20220101; $perltidy_argv .= ' -eos' if $has_eos_flag; Perl::Tidy::perltidy( source => \$self->{data}, destination => \$output, stderr => \$perltidy_err, errorfile => \$perltidy_err, logfile => \$perltidy_log, argv => $perltidy_argv, ); # convert source back to raw if perltidy did not do it encode_utf8($output) if ( !$has_eos_flag ); ``` A related problem is if an update of Perl::Tidy is made without also updating a corrected version of a module such as the above. To help reduce the chance that this will occur the Change Log for perltidy will contain a warning to be alert for the double encoding problem, and how to reset the default if necessary. This is also the reason for waiting some time before the second step was made. If double encoding does appear to be occuring with the change in the default for some program which calls Perl::Tidy, then a quick emergency fix can be made by the program user by setting **-neos** to revert to the old default. 
A better fix can eventually be made by the program author by removing the second encoding using a technique such as illustrated above.

## Summary

A new flag, **-eos**, has been added to cause Perl::Tidy to behave better as a filter when called from other Perl scripts. This flag is the default setting in the current release.

## Reference

This flag was originally introduced to fix a problem with the widely-used **tidyall** program (see https://github.com/houseabsolute/perl-code-tidyall/issues/84).

## Appendix, a little closer look

A string of text (here, a Perl script) can be stored by Perl in one of two internal storage formats. For simplicity let's call them 'B' mode (for 'Byte') and 'C' mode (for 'Character'). The 'B' mode can be used for text with single-byte characters or for storing an encoded string of multi-byte characters. The 'C' mode is needed for actually working with multi-byte characters. Thinking of a Perl script as a single long string of text, we can look at the mode of the text of a source script as it is processed by perltidy at three points:

    - when it enters as a source
    - at the intermediate stage as it is processed
    - when it leaves for its destination

Since 'C' mode only has meaning within Perl scripts, a rule is that outside of the realm of Perl the text must be stored in 'B' mode.

The source can only be in 'C' mode if it arrives by a call from another Perl program, and the destination can only be in 'C' mode if the destination is a Perl program. Otherwise, if the destination is a file, or object with a print method, then it will be assumed to be ending its existence as a Perl string and will be placed in an end state which is 'B' mode.

Transition from a starting 'B' mode to 'C' mode is done by a decoding operation according to the user settings. A transition from an intermediate 'C' mode to an ending 'B' mode is done by an encoding operation.
It never makes sense to transition from a starting 'C' mode to a 'B' mode, or from an intermediate 'B' mode to an ending 'C' mode.

Let us make a list of all possible sets of string storage modes to be sure that all cases are covered. If each of the three stages listed above (entry, intermediate, and exit) could be in 'B' or 'C' mode then we would have a total of 2 x 2 x 2 = 8 combinations of states. Each end point may be either a file or a string reference. Here is a list of them, with a note indicating which ones are possible, and when:

    #   modes        when
    1 - B->B->B      always ok: (file or string)->(file or string)
    2 - B->B->C      never (trailing B->C never done)
    3 - B->C->B      ok if destination is a file or -eos is set  [NEW DEFAULT]
    4 - B->C->C      ok if destination is a string and -neos is set [OLD DEFAULT]
    5 - C->B->B      never (leading C->B never done)
    6 - C->B->C      never (leading C->B and trailing B->C never done)
    7 - C->C->B      only for string-to-file
    8 - C->C->C      only for string-to-string

So three of these cases (2, 5, and 6) cannot occur and the other five can occur. Of these five possible cases, only four are possible when the destination is a string:

    1 - B->B->B      ok
    3 - B->C->B      ok if -eos is set  [NEW DEFAULT]
    4 - B->C->C      ok if -neos is set [OLD DEFAULT]
    8 - C->C->C      string-to-string only

The first three of these may start at either a file or a string, and the last one only starts at a string.

From this we can see that, if **-eos** is set, then only cases 1, 3, and 8 can occur. In that case the starting and ending states have the same storage mode for all routes through perltidy which end at a string. This verifies that perltidy will work well as a filter in all cases when the **-eos** flag is set, which is the goal here.

The last case in this table, the C->C->C route, corresponds to programs which pass decoded strings to perltidy.
This is a common usage pattern, and this route is not influenced by the **-eos** flag setting, since it only applies to strings that have been decoded by perltidy itself. Incidentally, the full name of the flag, **--encode-output-strings**, is not the best because it does not describe what happens in this case. It was difficult to find a concise name for this flag. A more correct name would have been **--encode-output-strings-that-you-decode**, but that is rather long. A more intuitive name for the flag might have been **--be-a-nice-filter**. Finally, note that case 7 in the full table, the C->C->B route, is an unusual but possible situation involving a source string being sent directly to a file. It is the only situation in which perltidy does an encoding without having done a corresponding previous decoding. Perl-Tidy-20230309/docs/INSTALL.html0000644000175000017500000003645414401515103015557 0ustar stevesteve

PERLTIDY INSTALLATION NOTES

Get a distribution file

Quick Test Drive

If you want to do a quick test of perltidy without doing any installation, get a .tar.gz or a .zip source file and see the section below "Method 2: Installation as a single binary script".

Uninstall older versions

In certain circumstances, it is best to remove an older version of perltidy before installing the latest version. These are:

Two Installation Methods - Overview

These are generic instructions. Some system-specific notes and hints are given in later sections.

Two separate installation methods are possible.

Unix Installation Notes

Windows Installation Notes

On a Windows 9x/Me system you should CLOSE ANY OPEN APPLICATIONS to avoid losing unsaved data in case of trouble.

VMS Installation Notes

Troubleshooting / Other Operating Systems

If there seems to be a problem locating a configuration file, you can see what is going on in the config file search with:

perltidy -dpro

If you want to customize where perltidy looks for configuration files, look at the routine 'find_config_file' in module 'Tidy.pm'. You should be able to at least use the '-pro=filename' method under most systems.

Remember to place quotes (either single or double) around input parameters which contain spaces, such as file names. For example:

perltidy "file name with spaces"

Without the quotes, perltidy would look for four files: file, name, with, and spaces.
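
The effect of quoting can be checked in any POSIX shell without running perltidy at all. In this minimal sketch, count_args is a hypothetical helper (it is not part of perltidy) which simply reports how many arguments the shell handed to it:

```shell
# count_args is a hypothetical helper: it prints the number of arguments
# that the shell passed to it after word splitting.
count_args() {
    printf '%s\n' "$#"
}

count_args "file name with spaces"   # quoted: a single argument
count_args file name with spaces     # unquoted: four separate arguments
```

The same splitting happens before perltidy ever sees its command line, which is why the quotes are needed.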

If you develop a system-dependent patch that might be of general interest, please let us know.

CONFIGURATION FILE

You do not need a configuration file, but you may eventually want to create one to save typing; the tutorial and man page discuss this.

SYSTEM TEMPORARY FILES

Perltidy needs to create a system temporary file when it invokes Pod::Html to format pod text under the -html option. For Unix systems, this will normally be a file in /tmp, and for other systems, it will be a file in the current working directory named perltidy.TMP. This file will be removed when the run finishes.

DOCUMENTATION

Documentation is contained in .pod format, either in the docs directory or appended to the scripts.

These documents can also be found at http://perltidy.sourceforge.net

Reading the brief tutorial should help you use perltidy effectively.
The tutorial can be read interactively with perldoc, for example

cd docs
perldoc tutorial.pod

or else an html version can be made with pod2html:

pod2html tutorial.pod >tutorial.html

If you use the Makefile.PL installation method on a Unix system, the perltidy and Perl::Tidy man pages should automatically be installed. Otherwise, you can extract the man pages with the pod2xxxx utilities, as follows:

cd bin
pod2text perltidy >perltidy.txt
pod2html perltidy >perltidy.html

cd lib/Perl
pod2text Tidy.pm >Tidy.txt
pod2html Tidy.pm >Tidy.html

After installation, the installation directory of files may be deleted.

Perltidy is still being developed, so please check sourceforge occasionally for updates if you find that it is useful. New releases are announced on freshmeat.net.

CREDITS

Thanks to the many programmers who have documented problems, made suggestions and sent patches.

FEEDBACK / BUG REPORTS

If you see ways to improve these notes, please let us know.

A list of current bugs and issues can be found at the CPAN site https://rt.cpan.org/Public/Dist/Display.html?Name=Perl-Tidy

To report a new bug or problem, use the link on this page.

Perl-Tidy-20230309/docs/index.html0000644000175000017500000000616114401515104015551 0ustar stevesteve

Welcome to Perltidy

Perltidy is a Perl script which indents and reformats Perl scripts to make them easier to read. If you write Perl scripts, or spend much time reading them, you will probably find it useful.

Perltidy is free software released under the GNU General Public License -- please see the included file COPYING for details.

The formatting can be controlled with command line parameters. The default parameter settings approximately follow the suggestions in the Perl Style Guide.

Besides reformatting scripts, Perltidy can help in tracking down errors with missing or extra braces, parentheses, and square brackets because it is very good at localizing errors.

Documentation

Prerequisites

Perltidy should run on any system with perl 5.008 or later. The total disk space needed after removing the installation directory will be about 2 Mb.

Download

Installation

Perl::Tidy can be installed directly from CPAN using one of the standard methods.

One way is to download a distribution file, unpack it and then test and install using the Makefile.PL:

perl Makefile.PL
make
make test
make install

The INSTALL file has additional installation notes. They are mainly for older systems but also tell how to use perltidy without doing an installation.

Links

FEEDBACK / BUG REPORTS

The best place to report bugs and issues is GitHub

Bugs and issues can also be reported at the CPAN site https://rt.cpan.org/Public/Dist/Display.html?Name=Perl-Tidy

Perl-Tidy-20230309/docs/index.md0000644000175000017500000000526714064634531015226 0ustar stevesteve
# Welcome to Perltidy

Perltidy is a Perl script which indents and reformats Perl scripts to make them easier to read. If you write Perl scripts, or spend much time reading them, you will probably find it useful.

Perltidy is free software released under the GNU General Public License -- please see the included file [COPYING](../COPYING) for details.

The formatting can be controlled with command line parameters. The default parameter settings approximately follow the suggestions in the [Perl Style Guide](https://perldoc.perl.org/perlstyle.html).

Besides reformatting scripts, Perltidy can help in tracking down errors with missing or extra braces, parentheses, and square brackets because it is very good at localizing errors.

## Documentation

- [A Brief Perltidy Tutorial](./tutorial.html)
- [Perltidy Style Key](./stylekey.html) will help in methodically selecting a set of style parameters.
- [The Perltidy man page](./perltidy.html) explains how to precisely control the formatting details.
- [The Perl::Tidy man page](./Tidy.html) discusses how to use the Perl::Tidy module
- [Change Log](./ChangeLog.html)

## Prerequisites

Perltidy should run on any system with perl 5.008 or later. The total disk space needed after removing the installation directory will be about 2 Mb.

## Download

- The most recent release is always at [CPAN](https://metacpan.org/release/Perl-Tidy)
- The most recent release is also at [sourceforge](https://sourceforge.net/projects/perltidy/)

## Installation

Perl::Tidy can be installed directly from CPAN using one of the standard methods.

One way is to download a distribution file, unpack it and then test and install using the Makefile.PL:

    perl Makefile.PL
    make
    make test
    make install

The [INSTALL file](./INSTALL.html) has additional installation notes. They are mainly for older systems but also tell how to use perltidy without doing an installation.

## Links

- [Perl::Tidy source code repository at GitHub](https://github.com/perltidy/perltidy)
- [tidyall](https://metacpan.org/pod/distribution/Code-TidyAll/bin/tidyall) is a great tool for automatically running perltidy and other tools including perlcritic on a set of project files.
- [Tidyview](http://sourceforge.net/projects/tidyview) is a graphical program for tweaking your .perltidyrc configuration parameters.
- [A perltidy plugin for Sublime Text 2/3](https://github.com/vifo/SublimePerlTidy)

## FEEDBACK / BUG REPORTS

The best place to report bugs and issues is [GitHub](https://github.com/perltidy/perltidy/issues)

Bugs and issues can also be reported at the CPAN site [https://rt.cpan.org/Public/Dist/Display.html?Name=Perl-Tidy](https://rt.cpan.org/Public/Dist/Display.html?Name=Perl-Tidy)

Perl-Tidy-20230309/docs/perltidy.html0000644000175000017500000077316614401515104016316 0ustar stevesteve

NAME

perltidy - a perl script indenter and reformatter

SYNOPSIS

    perltidy [ options ] file1 file2 file3 ...
            (output goes to file1.tdy, file2.tdy, file3.tdy, ...)
    perltidy [ options ] file1 -o outfile
    perltidy [ options ] file1 -st >outfile
    perltidy [ options ] <infile >outfile

DESCRIPTION

Perltidy reads a perl script and writes an indented, reformatted script. The formatting process involves converting the script into a string of tokens, removing any non-essential whitespace, and then rewriting the string of tokens with whitespace using whatever rules are specified, or defaults. This happens in a series of operations which can be controlled with the parameters described in this document.

Perltidy is a commandline frontend to the module Perl::Tidy. For documentation describing how to call the Perl::Tidy module from other applications see the separate documentation for Perl::Tidy. It is the file Perl::Tidy.pod in the source distribution.

Many users will find enough information in "EXAMPLES" to get started. New users may benefit from the short tutorial which can be found at http://perltidy.sourceforge.net/tutorial.html

A convenient aid to systematically defining a set of style parameters can be found at http://perltidy.sourceforge.net/stylekey.html

Perltidy can produce output in either of two modes, depending on the existence of an -html flag. Without this flag, the output is passed through a formatter. The default formatting tries to follow the recommendations in perlstyle(1), but it can be controlled in detail with numerous input parameters, which are described in "FORMATTING OPTIONS".

When the -html flag is given, the output is passed through an HTML formatter which is described in "HTML OPTIONS".

EXAMPLES

  perltidy somefile.pl

This will produce a file somefile.pl.tdy containing the script reformatted using the default options, which approximate the style suggested in perlstyle(1). The source file somefile.pl is unchanged.

  perltidy *.pl

Execute perltidy on all .pl files in the current directory with the default options. The output will be in files with an appended .tdy extension. For any file with an error, there will be a file with extension .ERR.

  perltidy -b file1.pl file2.pl

Modify file1.pl and file2.pl in place, and back up the originals to file1.pl.bak and file2.pl.bak. If file1.pl.bak and/or file2.pl.bak already exist, they will be overwritten.

  perltidy -b -bext='/' file1.pl file2.pl

Same as the previous example except that the backup files file1.pl.bak and file2.pl.bak will be deleted if there are no errors.

  perltidy -gnu somefile.pl

Execute perltidy on file somefile.pl with a style which approximates the GNU Coding Standards for C programs. The output will be somefile.pl.tdy.

  perltidy -i=3 somefile.pl

Execute perltidy on file somefile.pl, with 3 columns for each level of indentation (-i=3) instead of the default 4 columns. There will not be any tabs in the reformatted script, except for any which already exist in comments, pod documents, quotes, and here documents. Output will be somefile.pl.tdy.

  perltidy -i=3 -et=8 somefile.pl

Same as the previous example, except that leading whitespace will be entabbed with one tab character per 8 spaces.

  perltidy -ce -l=72 somefile.pl

Execute perltidy on file somefile.pl with all defaults except use "cuddled elses" (-ce) and a maximum line length of 72 columns (-l=72) instead of the default 80 columns.

  perltidy -g somefile.pl

Execute perltidy on file somefile.pl and save a log file somefile.pl.LOG which shows the nesting of braces, parentheses, and square brackets at the start of every line.

  perltidy -html somefile.pl

This will produce a file somefile.pl.html containing the script with html markup. The output file will contain an embedded style sheet in the <HEAD> section which may be edited to change the appearance.

  perltidy -html -css=mystyle.css somefile.pl

This will produce a file somefile.pl.html containing the script with html markup. This output file will contain a link to a separate style sheet file mystyle.css. If the file mystyle.css does not exist, it will be created. If it exists, it will not be overwritten.

  perltidy -html -pre somefile.pl

Write an html snippet with only the PRE section to somefile.pl.html. This is useful when code snippets are being formatted for inclusion in a larger web page. No style sheet will be written in this case.

  perltidy -html -ss >mystyle.css

Write a style sheet to mystyle.css and exit.

  perltidy -html -frm mymodule.pm

Write html with a frame holding a table of contents and the source code. The output files will be mymodule.pm.html (the frame), mymodule.pm.toc.html (the table of contents), and mymodule.pm.src.html (the source code).

OPTIONS - OVERVIEW

The entire command line is scanned for options, and they are processed before any files are processed. As a result, it does not matter whether flags are before or after any filenames. However, the relative order of parameters is important, with later parameters overriding the values of earlier parameters.

For each parameter, there is a long name and a short name. The short names are convenient for keyboard input, while the long names are self-documenting and therefore useful in scripts. It is customary to use two leading dashes for long names, but one may be used.

Most parameters which serve as on/off flags can be negated with a leading "n" (for the short name) or a leading "no" or "no-" (for the long name). For example, the flag to outdent long quotes is -olq or --outdent-long-quotes. The flag to skip this is -nolq or --nooutdent-long-quotes or --no-outdent-long-quotes.
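
The naming rule for negated flags can be written out as a few lines of shell. This is only a sketch of the rule as stated above: negated_forms is a hypothetical helper, it does not consult perltidy, and it does not know which parameters are really on/off flags.

```shell
# negated_forms is a hypothetical helper: given an on/off flag written with
# its leading dash(es), print the negated spelling(s) described above.
negated_forms() {
    case "$1" in
        --*) name=${1#--}
             printf '%s\n' "--no$name" "--no-$name" ;;
        -*)  name=${1#-}
             printf '%s\n' "-n$name" ;;
    esac
}

negated_forms -olq
negated_forms --outdent-long-quotes
```

For the example in the text, the first call prints the short negated form and the second call prints both long negated forms.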

Options may not be bundled together. In other words, options -q and -g may NOT be entered as -qg.

Option names may be terminated early as long as they are uniquely identified. For example, instead of --dump-token-types, it would be sufficient to enter --dump-tok, or even --dump-t, to uniquely identify this command.

I/O Control

The following parameters concern the files which are read and written.

-h, --help

Show summary of usage and exit.

-o=filename, --outfile=filename

Name of the output file (only if a single input file is being processed). If no output file is specified, and output is not redirected to the standard output (see -st), the output will go to filename.tdy. [Note: - does not redirect to standard output. Use -st instead.]

-st, --standard-output

Perltidy must be able to operate on an arbitrarily large number of files in a single run, with each output being directed to a different output file. Obviously this would conflict with outputting to the single standard output device, so a special flag, -st, is required to request outputting to the standard output. For example,

  perltidy somefile.pl -st >somefile.new.pl

This option may only be used if there is just a single input file. The default is -nst or --nostandard-output.

-se, --standard-error-output

If perltidy detects an error when processing file somefile.pl, its default behavior is to write error messages to file somefile.pl.ERR. Use -se to cause all error messages to be sent to the standard error output stream instead. This directive may be negated with -nse. Thus, you may place -se in a .perltidyrc and override it when desired with -nse on the command line.

-oext=ext, --output-file-extension=ext

Change the extension of the output file to be ext instead of the default tdy (or html in case the --html option is used). See "Specifying File Extensions".

-opath=path, --output-path=path

When perltidy creates a filename for an output file, by default it merely appends an extension to the path and basename of the input file. This parameter causes the path to be changed to path instead.

The path should end in a valid path separator character, but perltidy will try to add one if it is missing.

For example

 perltidy somefile.pl -opath=/tmp/

will produce /tmp/somefile.pl.tdy. Otherwise, somefile.pl.tdy will appear in whatever directory contains somefile.pl.

If the path contains spaces, it should be placed in quotes.

This parameter will be ignored if output is being directed to standard output, or if it is being specified explicitly with the -o=s parameter.
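
As a rough illustration of how the output name is assembled, the following sketch reproduces the naming rule in plain shell. The helper output_name is hypothetical and does not call perltidy; it assumes a Unix-style "/" separator and the default tdy extension.

```shell
# output_name is a hypothetical helper that mimics the naming rule described
# above. Arguments: input file, optional -opath value, optional extension.
output_name() {
    input=$1
    opath=${2:-}
    ext=${3:-tdy}
    if [ -n "$opath" ]; then
        # add a trailing separator if it is missing
        case "$opath" in
            */) ;;
            *)  opath="$opath/" ;;
        esac
        # the new path replaces the directory part of the input file
        printf '%s%s.%s\n' "$opath" "$(basename "$input")" "$ext"
    else
        # default: append the extension to the path and basename of the input
        printf '%s.%s\n' "$input" "$ext"
    fi
}

output_name somefile.pl
output_name somefile.pl /tmp/
```

The second call corresponds to the -opath=/tmp/ example above.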

-b, --backup-and-modify-in-place

Modify the input file or files in-place and save the original with the extension .bak. Any existing .bak file will be deleted. See next item for changing the default backup extension, and for eliminating the backup file altogether.

Please Note: Writing back to the input file increases the risk of data loss or corruption in the event of a software or hardware malfunction. Before using the -b parameter please be sure to have backups and verify that it works correctly in your environment and operating system.

A -b flag will be ignored if input is from standard input or goes to standard output, or if the -html flag is set.

In particular, if you want to use both the -b flag and the -pbp (--perl-best-practices) flag, then you must put a -nst flag after the -pbp flag because it contains a -st flag as one of its components, which means that output will go to the standard output stream.

-bext=ext, --backup-file-extension=ext

This parameter serves two purposes: (1) to change the extension of the backup file to be something other than the default .bak, and (2) to indicate that no backup file should be saved.

To change the default extension to something other than .bak see "Specifying File Extensions".

A backup file of the source is always written, but you can request that it be deleted at the end of processing if there were no errors. This is risky unless the source code is being maintained with a source code control system.

To indicate that the backup should be deleted include one forward slash, /, in the extension. If any text remains after the slash is removed it will be used to define the backup file extension (which is always created and only deleted if there were no errors).

Here are some examples:

  Parameter           Extension          Backup File Treatment
  -bext=bak           .bak               Keep (same as the default behavior)
  -bext='/'           .bak               Delete if no errors
  -bext='/backup'     .backup            Delete if no errors
  -bext='original/'   .original          Delete if no errors
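
The slash rule in the table can be expressed as a short shell sketch. The helper parse_bext is hypothetical (it is not how perltidy itself is implemented) and it always writes the extension with a leading dot, as the table does.

```shell
# parse_bext is a hypothetical helper that applies the rule described above
# to a -bext value and prints "<extension> <keep|delete>".
parse_bext() {
    value=$1
    treatment=keep
    case "$value" in
        */*) treatment=delete
             # remove the first slash; whatever remains names the extension
             value="${value%%/*}${value#*/}" ;;
    esac
    # an empty remainder falls back to the default extension, bak
    printf '.%s %s\n' "${value:-bak}" "$treatment"
}

parse_bext bak
parse_bext /
parse_bext /backup
parse_bext original/
```

Each call corresponds to one row of the table, in the same order.
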
-bm=s, --backup-method=s

This parameter should not normally be used but is available in the event that problems arise as a transition is made from an older implementation of the backup logic to a newer implementation. The newer implementation is the default and is specified with -bm='copy'. The older implementation is specified with -bm='move'. The difference is that the older implementation made the backup by moving the input file to the backup file, and the newer implementation makes the backup by copying the input file. The newer implementation preserves the file system inode value. This may avoid problems with other software running simultaneously. This change was made as part of issue git #103 at github.

-w, --warning-output

Setting -w causes any non-critical warning messages to be reported as errors. These include messages about possible pod problems, possibly bad starting indentation level, and cautions about indirect object usage. The default, -nw or --nowarning-output, is not to include these warnings.

-q, --quiet

Deactivate error messages (for running under an editor).

For example, if you use a vi-style editor, such as vim, you may execute perltidy as a filter from within the editor using something like

 :n1,n2!perltidy -q

where n1,n2 represents the selected text. Without the -q flag, any error message may mess up your screen, so be prepared to use your "undo" key.

-log, --logfile

Save the .LOG file, which has many useful diagnostics. Perltidy always creates a .LOG file, but by default it is deleted unless a program bug is suspected. Setting the -log flag forces the log file to be saved.

-g=n, --logfile-gap=n

Set maximum interval between input code lines in the logfile. The purpose of this flag is to assist in debugging nesting errors. The value of n is optional. If you set the flag -g without the value of n, it will be taken to be 1, meaning that every line will be written to the log file. This can be helpful if you are looking for a brace, paren, or bracket nesting error.

Setting -g also causes the logfile to be saved, so it is not necessary to also include -log.

If no -g flag is given, a value of 50 will be used, meaning that at least every 50th line will be recorded in the logfile. This helps prevent excessively long log files.

Setting a negative value of n is the same as not setting -g at all.

-npro, --noprofile

Ignore any .perltidyrc command file. Normally, perltidy looks first in your current directory for a .perltidyrc file of parameters. (The format is described below). If it finds one, it applies those options to the initial default values, and then it applies any that have been defined on the command line. If no .perltidyrc file is found, it looks for one in your home directory.

If you set the -npro flag, perltidy will not look for this file.

-pro=filename or --profile=filename

To simplify testing and switching .perltidyrc files, this command may be used to specify a configuration file which will override the default name of .perltidyrc. There must not be a space on either side of the '=' sign. For example, the line

   perltidy -pro=testcfg

would cause file testcfg to be used instead of the default .perltidyrc.

A pathname which begins with three dots, e.g. ".../.perltidyrc", indicates that the file should be searched for starting in the current directory and working upwards. This makes it easier to have multiple projects each with their own .perltidyrc in their root directories.
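
The upward search can be pictured with the following shell sketch. It is not perltidy's own code: find_upwards is a hypothetical helper, and details such as symbolic link handling or the exact stopping point may differ in the real implementation.

```shell
# find_upwards is a hypothetical helper: starting in directory $2 (default:
# the current directory), look for a file named $1, moving to the parent
# directory until the file is found or the root directory has been checked.
find_upwards() {
    name=$1
    dir=$(cd "${2:-.}" && pwd) || return 1
    while :; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}

find_upwards .perltidyrc || echo "no .perltidyrc found at or above $(pwd)"
```

With a layout such as project/.perltidyrc and a working directory of project/lib/deep, the search would stop at the project root, which is the situation the ".../.perltidyrc" form is intended for.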

-opt, --show-options

Write a list of all options used to the .LOG file. Please see --dump-options for a simpler way to do this.

-f, --force-read-binary

Force perltidy to process binary files. To avoid producing excessive error messages, perltidy skips files identified by the system as non-text. However, valid perl scripts containing binary data may sometimes be identified as non-text, and this flag forces perltidy to process them.

-ast, --assert-tidy

This flag asserts that the input and output code streams are identical, or in other words that the input code is already 'tidy' according to the formatting parameters. If this is not the case, an error message noting this is produced. This error message will cause the process to return a non-zero exit code. The test for this is made by comparing an MD5 hash value for the input and output code streams. This flag has no other effect on the functioning of perltidy. This might be useful for certain code maintenance operations. Note: you will not see this message if you have error messages turned off with the -quiet flag.
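
The idea of comparing two code streams by checksum, rather than byte by byte, can be sketched in shell. This is an illustration only: streams_identical is a hypothetical helper, and it uses the POSIX cksum utility (a CRC, not MD5) as a portable stand-in for the hash that perltidy uses.

```shell
# streams_identical is a hypothetical helper. It succeeds when the two files
# have the same checksum, using POSIX cksum (a CRC) as a stand-in for MD5.
streams_identical() {
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

# Usage sketch; original.pl and tidied.pl are placeholder file names:
#   streams_identical original.pl tidied.pl || echo "input was not tidy"
```

In practice the non-zero exit code produced by -ast is what a calling script or build step would test.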

-asu, --assert-untidy

This flag asserts that the input and output code streams are different, or in other words that the input code is 'untidy' according to the formatting parameters. If this is not the case, an error message noting this is produced. This flag has no other effect on the functioning of perltidy.

FORMATTING OPTIONS

Basic Options

--notidy

This flag disables all formatting and causes the input to be copied unchanged to the output except for possible changes in line ending characters and any pre- and post-filters. This can be useful in conjunction with a hierarchical set of .perltidyrc files to avoid unwanted code tidying. See also "Skipping Selected Sections of Code" for a way to avoid tidying specific sections of code.

-i=n, --indent-columns=n

Use n columns per indentation level (default n=4).

-l=n, --maximum-line-length=n

The default maximum line length is n=80 characters. Perltidy will try to find line break points to keep lines below this length. However, long quotes and side comments may cause lines to exceed this length.

The default length of 80 comes from the past when this was the standard CRT screen width. Many programmers prefer to increase this to something like 120.

Setting -l=0 is equivalent to setting -l=(a very large number). But this is not recommended because, for example, a very long list will be formatted in a single long line.

-vmll, --variable-maximum-line-length

A problem arises using a fixed maximum line length with very deeply nested code and data structures because eventually the amount of leading whitespace used for indicating indentation takes up most or all of the available line width, leaving little or no space for the actual code or data. One solution is to use a very long line length. Another solution is to use the -vmll flag, which basically tells perltidy to ignore leading whitespace when measuring the line length.

To be precise, when the -vmll parameter is set, the maximum line length of a line of code will be M+L*I, where

      M is the value of --maximum-line-length=M (-l=M), default 80,
      I is the value of --indent-columns=I (-i=I), default 4,
      L is the indentation level of the line of code

When this flag is set, the choice of breakpoints for a block of code should be essentially independent of its nesting depth. However, the absolute line lengths, including leading whitespace, can still be arbitrarily large. This problem can be avoided by including the next parameter.

The default is not to do this (-nvmll).
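
The formula M+L*I is easy to evaluate by hand, but a small shell sketch makes the growth with nesting depth concrete. The helper vmll_limit is hypothetical and only does the arithmetic stated above.

```shell
# vmll_limit is a hypothetical helper: it evaluates M + L*I, where
# M = maximum line length, I = indent columns, L = indentation level.
vmll_limit() {
    m=${1:-80}
    i=${2:-4}
    l=${3:-0}
    echo $(( m + l * i ))
}

vmll_limit 80 4 0    # top level: the plain maximum line length
vmll_limit 80 4 5    # five levels deep: 20 extra columns are allowed
```

With the default values the limit grows by 4 columns per level, so the space available for the code itself stays at 80 columns.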

-wc=n, --whitespace-cycle=n

This flag also addresses problems with very deeply nested code and data structures. When the nesting depth exceeds the value n the leading whitespace will be reduced and start at a depth of 1 again. The result is that blocks of code will shift back to the left rather than moving arbitrarily far to the right. This occurs cyclically to any depth.

For example if one level of indentation equals 4 spaces (-i=4, the default), and one uses -wc=15, then if the leading whitespace on a line exceeds about 4*15=60 spaces it will be reduced back to 4*1=4 spaces and continue increasing from there. If the whitespace never exceeds this limit the formatting remains unchanged.

The combination of -vmll and -wc=n provides a solution to the problem of displaying arbitrarily deep data structures and code in a finite window, although -wc=n may of course be used without -vmll.

The default is not to use this, which can also be indicated using -wc=0.
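
One way to read the cycling rule is as modular arithmetic on the indentation level. The sketch below follows that reading; wc_effective_level is a hypothetical helper, and the exact point at which perltidy wraps is an assumption taken from the wording "exceeds the value n".

```shell
# wc_effective_level is a hypothetical helper. Arguments: nesting level, n.
# Levels 1..n are unchanged; level n+1 wraps back to 1, and so on cyclically.
# With n=0 (the default) the level is returned unchanged.
wc_effective_level() {
    level=$1
    n=$2
    if [ "$n" -le 0 ] || [ "$level" -le 0 ]; then
        echo "$level"
    else
        echo $(( (level - 1) % n + 1 ))
    fi
}

# Leading whitespace in columns for -i=4 -wc=15 at nesting levels 15 and 16
echo $(( 4 * $(wc_effective_level 15 15) ))
echo $(( 4 * $(wc_effective_level 16 15) ))
```

Under this reading, level 15 still uses 60 columns of leading whitespace and level 16 drops back to 4, matching the example above.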

Tabs

Using tab characters will almost certainly lead to future portability and maintenance problems, so the default and recommendation is not to use them. For those who prefer tabs, however, there are two different options.

Except for possibly introducing tab indentation characters, as outlined below, perltidy does not introduce any tab characters into your file, and it removes any tabs from the code (unless requested not to do so with -fws). If you have any tabs in your comments, quotes, or here-documents, they will remain.

-et=n, --entab-leading-whitespace

This flag causes each n leading space characters produced by the formatting process to be replaced by one tab character. The formatting process itself works with space characters. The -et=n parameter is applied as a last step, after formatting is complete, to convert leading spaces into tabs. Before starting to use tabs, it is essential to first get the indentation controls set as desired without tabs, particularly the two parameters --indent-columns=n (or -i=n) and --continuation-indentation=n (or -ci=n).

The integer n may be any value, but it is usually coordinated with the number of spaces used for indentation. For example, -et=4 -ci=4 -i=4 will produce one tab for each indentation level and one for each continuation indentation level. You may also want to coordinate the value of n with what your display software assumes for the spacing of a tab.
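The conversion of leading spaces into tabs can be illustrated with a short Python sketch. It is a model of the rule as stated here, not perltidy's implementation; the function name is hypothetical, and the handling of leftover spaces (kept as spaces after the tabs) is an assumption.

```python
def entab_leading_whitespace(line, n):
    """Replace each full group of n leading spaces with one tab character.

    Assumption: leading spaces that do not fill a complete group of n
    remain as spaces after the tabs.
    """
    if n <= 0:
        return line
    stripped = line.lstrip(" ")
    leading = len(line) - len(stripped)
    tabs, spaces = divmod(leading, n)
    return "\t" * tabs + " " * spaces + stripped


if __name__ == "__main__":
    # Two indentation levels of 4 spaces become two tabs with n=4.
    print(repr(entab_leading_whitespace("        fixit($i);", 4)))
```

Only the leading whitespace is touched, so spaces inside the code, comments, and quotes are left as they are.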

-t, --tabs

This flag causes one leading tab character to be inserted for each level of indentation. Certain other features are incompatible with this option, and if these options are also given, then a warning message will be issued and this flag will be unset. One example is the -lp option. This flag is retained for backwards compatibility, but if you use tabs, the -et=n flag is recommended. If both -t and -et=n are set, the -et=n is used.

-dt=n, --default-tabsize=n

If the first line of code passed to perltidy contains leading tabs but no tab scheme is specified for the output stream then perltidy must guess how many spaces correspond to each leading tab. This number of spaces n corresponding to each leading tab of the input stream may be specified with -dt=n. The default is n=8.

This flag has no effect if a tab scheme is specified for the output stream, because then the input stream is assumed to use the same tab scheme and indentation spaces as for the output stream (any other assumption would lead to unstable editing).

-io, --indent-only

This flag is used to deactivate all whitespace and line break changes within non-blank lines of code. When it is in effect, the only change to the script will be to the indentation and to the number of blank lines, and any flags controlling whitespace and newlines will be ignored. You might want to use this if you are perfectly happy with your whitespace and line breaks, and merely want perltidy to handle the indentation. (This also speeds up perltidy by well over a factor of two, so it might be useful when perltidy is merely being used to help find a brace error in a large script).

Setting this flag is equivalent to setting --freeze-newlines and --freeze-whitespace.

If you also want to keep your existing blank lines exactly as they are, you can add --freeze-blank-lines.

With this option perltidy is still free to modify the indenting (and outdenting) of code and comments as it normally would. If you also want to prevent long comment lines from being outdented, you can add either -noll or -l=0.

Setting this flag will prevent perltidy from doing any special operations on closing side comments. You may still delete all side comments however when this flag is in effect.

-enc=s, --character-encoding=s

This flag indicates whether the input data stream uses a character encoding. Perltidy does not look for the encoding directives in the source stream, such as use utf8, and instead relies on this flag to determine the encoding. (Note that perltidy often works on snippets of code rather than complete files so it cannot rely on use utf8 directives).

The possible values for s are:

 -enc=none if no encoding is used, or
 -enc=utf8 for encoding in utf8
 -enc=guess if perltidy should guess between these two possibilities.

The value none causes the stream to be processed without special encoding assumptions. This is appropriate for files which are written in single-byte character encodings such as latin-1.

The value utf8 causes the stream to be read and written as UTF-8. If the input stream cannot be decoded with this encoding then processing is not done.

The value guess tells perltidy to guess between either utf8 encoding or no encoding (meaning one character per byte). The guess option uses the Encode::Guess module which has been found to be reliable at detecting if a file is encoded in utf8 or not.

The current default is guess.

The abbreviations -utf8 or -UTF8 are equivalent to -enc=utf8, and the abbreviation -guess is equivalent to -enc=guess. So to process a file named file.pl which is encoded in UTF-8 you can use:

   perltidy -utf8 file.pl

or

   perltidy -guess file.pl

or simply

   perltidy file.pl

since -guess is the default.

To process files with an encoding other than UTF-8, it would be necessary to write a short program which calls the Perl::Tidy module with some pre- and post-processing to handle decoding and encoding.

-eos=s, --encode-output-strings=s

This flag was added to resolve an issue involving the interface between Perl::Tidy and calling programs, and in particular Code::TidyAll (tidyall).

If you only run the perltidy binary this flag has no effect. If you run a program which calls the Perl::Tidy module and receives a string in return, then the meaning of the flag is as follows:

The default was changed from -neos to -eos in versions after 20220217. If this change causes a program to start running incorrectly on encoded files, an emergency fix might be to set -neos. Additional information can be found in the man pages for the Perl::Tidy module and also in https://github.com/perltidy/perltidy/blob/master/docs/eos_flag.md.

-gcs, --use-unicode-gcstring

This flag controls whether or not perltidy may use module Unicode::GCString to obtain accurate display widths of wide characters. The default is --nouse-unicode-gcstring.

If this flag is set, and text is encoded, perltidy will look for the module Unicode::GCString and, if found, will use it to obtain character display widths. This can improve displayed vertical alignment for files with wide characters. It is a nice feature but it is off by default to avoid conflicting formatting when there are multiple developers. Perltidy installation does not require Unicode::GCString, so users wanting to use this feature need to set this flag and also install Unicode::GCString separately.

If this flag is set and perltidy does not find module Unicode::GCString, a warning message will be produced and processing will continue but without the potential benefit provided by the module.

Also note that actual vertical alignment depends upon the fonts used by the text display software, so vertical alignment may not be optimal even when Unicode::GCString is used.

-ole=s, --output-line-ending=s

where s=win, dos, unix, or mac. This flag tells perltidy to output line endings for a specific system. Normally, perltidy writes files with the line separator character of the host system. The win and dos flags have an identical result.

-ple, --preserve-line-endings

This flag tells perltidy to write its output files with the same line endings as the input file, if possible. It should work for dos, unix, and mac line endings. It will only work if perltidy input comes from a filename (rather than stdin, for example). If perltidy has trouble determining the input file line ending, it will revert to the default behavior of using the line ending of the host system.

-atnl, --add-terminal-newline

This flag, which is enabled by default, allows perltidy to terminate the last line of the output stream with a newline character, regardless of whether or not the input stream was terminated with a newline character. If this flag is negated, with -natnl, then perltidy will add a terminal newline to the output stream only if the input stream is terminated with a newline.

Negating this flag may be useful for manipulating one-line scripts intended for use on a command line.

-it=n, --iterations=n

This flag causes perltidy to do n complete iterations. The reason for this flag is that code beautification is an iterative process and in some cases the output from perltidy can be different if it is applied a second time. For most purposes the default of n=1 should be satisfactory. However n=2 can be useful when a major style change is being made, or when code is being beautified on check-in to a source code control system. It has been found to be extremely rare for the output to change after 2 iterations. If a value of n greater than 2 is given then a convergence test will be used to stop the iterations as soon as possible, almost always after 2 iterations. See the next item for a simplified iteration control.

This flag has no effect when perltidy is used to generate html.

-conv, --converge

This flag is equivalent to -it=4 and is included to simplify iteration control. For all practical purposes one either does or does not want to be sure that the output is converged, and there is no penalty to using a large iteration limit since perltidy will check for convergence and stop iterating as soon as possible. The default is -nconv (no convergence check). Using -conv will approximately double run time since typically one extra iteration is required to verify convergence. No extra iterations are required if no new line breaks are made, and two extra iterations are occasionally needed when reformatting complex code structures, such as deeply nested ternary statements.
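The idea of iterating up to a limit and stopping as soon as the output no longer changes can be sketched generically in Python. This is an illustration of the concept only: the names run_until_converged and strip_trailing_blanks are hypothetical, the toy formatter stands in for a real formatting pass, and comparing successive outputs is an assumed convergence test rather than the one perltidy actually uses.

```python
def run_until_converged(format_once, text, max_iterations=4):
    """Apply format_once repeatedly; stop early once the output is stable.

    Returns the final text and the number of passes actually made.
    """
    iterations = 0
    for _ in range(max_iterations):
        new_text = format_once(text)
        iterations += 1
        if new_text == text:
            break  # converged: this pass changed nothing
        text = new_text
    return text, iterations


def strip_trailing_blanks(text):
    """Toy stand-in for a formatter: removes trailing blanks on each line."""
    return "\n".join(line.rstrip() for line in text.split("\n"))


if __name__ == "__main__":
    print(run_until_converged(strip_trailing_blanks, "x = 1   \ny = 2"))
```

In this model an input that needs changes costs one extra pass to confirm that the result is stable, which is consistent with the roughly doubled run time mentioned above.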

Code Indentation Control

-ci=n, --continuation-indentation=n

Continuation indentation is extra indentation spaces applied when a long line is broken. The default is n=2, illustrated here:

 my $level =   # -ci=2
   ( $max_index_to_go >= 0 ) ? $levels_to_go[0] : $last_output_level;

The same example, with n=0, is a little harder to read:

 my $level =   # -ci=0
 ( $max_index_to_go >= 0 ) ? $levels_to_go[0] : $last_output_level;

The value given to -ci is also used by some commands when a small space is required. Examples are commands for outdenting labels, -ola, and control keywords, -okw.

When default values are not used, it is recommended that either

(1) the value n given with -ci=n be no more than about one-half of the number of spaces assigned to a full indentation level on the -i=n command, or

(2) the flag --extended-continuation-indentation is used (see next section).

-xci, --extended-continuation-indentation

This flag allows perltidy to use some improvements which have been made to its indentation model. One of the things it does is "extend" continuation indentation deeper into structures, hence the name. The improved indentation is particularly noticeable when the flags -ci=n and -i=n use the same value of n. There are no significant disadvantages to using this flag, but to avoid disturbing existing formatting the default is not to use it, -nxci.

Please see the section "-pbp, --perl-best-practices" for an example of how this flag can improve the formatting of ternary statements. It can also improve indentation of some multi-line qw lists as shown below.

            # perltidy
            foreach $color (
                qw(
                AntiqueWhite3 Bisque1 Bisque2 Bisque3 Bisque4
                SlateBlue3 RoyalBlue1 SteelBlue2 DeepSkyBlue3
                ),
                qw(
                LightBlue1 DarkSlateGray1 Aquamarine2 DarkSeaGreen2
                SeaGreen1 Yellow1 IndianRed1 IndianRed2 Tan1 Tan4
                )
              )

            # perltidy -xci
            foreach $color (
                qw(
                    AntiqueWhite3 Bisque1 Bisque2 Bisque3 Bisque4
                    SlateBlue3 RoyalBlue1 SteelBlue2 DeepSkyBlue3
                ),
                qw(
                    LightBlue1 DarkSlateGray1 Aquamarine2 DarkSeaGreen2
                    SeaGreen1 Yellow1 IndianRed1 IndianRed2 Tan1 Tan4
                )
              )

-sil=n, --starting-indentation-level=n

By default, perltidy examines the input file and tries to determine the starting indentation level. While it is often zero, it may not be zero for a code snippet being sent from an editing session.

To guess the starting indentation level perltidy simply assumes that the indentation scheme used to create the code snippet is the same as is being used for the current perltidy process. This is the only sensible guess that can be made. It should be correct if this is true, but otherwise it probably won't be. For example, if the input script was written with -i=2 and the current perltidy flags have -i=4, the wrong initial indentation will be guessed for a code snippet which has non-zero initial indentation. Likewise, if an entabbing scheme is used in the input script and not in the current process then the guessed indentation will be wrong.

If the default method does not work correctly, or you want to change the starting level, use -sil=n, to force the starting level to be n.

List indentation using --line-up-parentheses, -lp or --extended-line-up-parentheses, -xlp

These flags provide an alternative indentation method for list data. The original flag for this is -lp, but it has some limitations (explained below) which are avoided with the newer -xlp flag. So -xlp is probably the better choice for new work, but the -lp flag is retained to minimize changes to existing formatting. If you enter both -lp and -xlp, then -xlp will be used.

In the default indentation method perltidy indents lists with 4 spaces, or whatever value is specified with -i=n. Here is a small list formatted in this way:

    # perltidy (default)
    @month_of_year = (
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
    );

The -lp or -xlp flags add extra indentation to cause the data to begin past the opening parentheses of a sub call or list, or opening square bracket of an anonymous array, or opening curly brace of an anonymous hash. With this option, the above list would become:

    # perltidy -lp or -xlp
    @month_of_year = (
                       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
    );

If the available line length (see -l=n ) does not permit this much space, perltidy will use less. For alternate placement of the closing paren, see the next section.

These flags have no effect on code BLOCKS, such as if/then/else blocks, which always use whatever is specified with -i=n.

Some limitations on these flags are:

There are some potential disadvantages of this indentation method compared to the default method that should be noted:

Some things that can be done to minimize these problems are:

-lpil=s, --line-up-parentheses-inclusion-list and -lpxl=s, --line-up-parentheses-exclusion-list

The following discussion is written for -lp but applies equally to the newer -xlp version. By default, the -lp flag applies to as many containers as possible. The set of containers to which the -lp style applies can be reduced by either one of these two flags:

Use -lpil=s to specify the containers to which -lp applies, or

use -lpxl=s to specify the containers to which -lp does NOT apply.

Only one of these two flags may be used. Both flags can achieve the same result, but the -lpil=s flag is much easier to describe and use and is recommended. The -lpxl=s flag was the original implementation and is only retained for backwards compatibility.

The list s for these parameters is a string with space-separated items. Each item consists of up to three pieces of information in this order: (1) an optional letter code (2) a required container type, and (3) an optional numeric code.

The only required piece of information is a container type, which is one of '(', '[', or '{'. For example the string

  -lpil='('

means use -lp formatting only on lists within parentheses, not lists in square-brackets or braces. The same thing could alternatively be specified with

  -lpxl = '[ {'

which says to exclude lists within square-brackets and braces. So what remains is lists within parentheses.

A second optional item of information which can be given for parentheses is a letter which is used to limit the selection further depending on the type of token immediately before the paren. The possible letters are currently 'k', 'K', 'f', 'F', 'w', and 'W', with these meanings for matching whatever precedes an opening paren:

 'k' matches if the previous nonblank token is a perl built-in keyword (such as 'if', 'while'),
 'K' matches if 'k' does not, meaning that the previous token is not a keyword.
 'f' matches if the previous token is a function other than a keyword.
 'F' matches if 'f' does not.
 'w' matches if either 'k' or 'f' match.
 'W' matches if 'w' does not.

For example:

  -lpil = 'f('

means only apply -lp to function calls, and

  -lpil = 'w('

means only apply -lp to parenthesized lists which follow a function or a keyword.

This last example could alternatively be written using the -lpxl=s flag as

  -lpxl = '[ { W('

which says exclude -lp for lists within square-brackets, braces, and parens NOT preceded by a keyword or function. Clearly, the -lpil=s method is easier to understand.

An optional numeric code may follow any of the container types to further refine the selection based on container contents. The numeric codes are:

  '0' or blank: no check on contents is made
  '1' exclude -lp unless the contents is a simple list without sublists
  '2' exclude -lp unless the contents is a simple list without sublists, without
      code blocks, and without ternary operators

For example,

  -lpil = 'f(2'

means only apply -lp to function call lists which do not contain any sublists, code blocks or ternary expressions.
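The item syntax described in this section (optional letter code, required container type, optional numeric code) can be made concrete with a small Python parser. This is a sketch of the documented grammar only, not perltidy's own parsing code; the function name is hypothetical, and rejecting a letter code on '[' or '{' is an assumption based on the statement that letters apply to parentheses.

```python
import re

# optional letter, required container type, optional numeric code
ITEM_RE = re.compile(r"^([kKfFwW]?)([(\[{])([012]?)$")


def parse_container_list(s):
    """Split a -lpil / -lpxl string into (letter, container, code) tuples."""
    items = []
    for item in s.split():
        m = ITEM_RE.match(item)
        if m is None:
            raise ValueError(f"unrecognized item: {item!r}")
        letter, container, code = m.groups()
        if letter and container != "(":
            # Assumption: letter codes are only meaningful for parens.
            raise ValueError(f"letter code not allowed on {container!r}")
        # A blank numeric code is treated like '0' (no check on contents).
        items.append((letter or None, container, int(code) if code else 0))
    return items


if __name__ == "__main__":
    print(parse_container_list("f(2"))
    print(parse_container_list("[ { W("))
```

The two calls correspond to the -lpil='f(2' and -lpxl='[ { W(' examples given above.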

-cti=n, --closing-token-indentation

The -cti=n flag controls the indentation of a line beginning with a ), ], or a non-block }. Such a line receives:

 -cti = 0 no extra indentation (default)
 -cti = 1 extra indentation such that the closing token
        aligns with its opening token.
 -cti = 2 one extra indentation level if the line looks like:
        );  or  ];  or  };
 -cti = 3 one extra indentation level always

The flags -cti=1 and -cti=2 work well with the -lp flag (previous section).

    # perltidy -lp -cti=1
    @month_of_year = (
                       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
                     );

    # perltidy -lp -cti=2
    @month_of_year = (
                       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
                       );

These flags are merely hints to the formatter and they may not always be followed. In particular, if -lp is not being used, the indentation for cti=1 is constrained to be no more than one indentation level.

If desired, this control can be applied independently to each of the closing container token types. In fact, -cti=n is merely an abbreviation for -cpi=n -csbi=n -cbi=n, where: -cpi or --closing-paren-indentation controls )'s, -csbi or --closing-square-bracket-indentation controls ]'s, -cbi or --closing-brace-indentation controls non-block }'s.

-icp, --indent-closing-paren

The -icp flag is equivalent to -cti=2, described in the previous section. The -nicp flag is equivalent to -cti=0. They are included for backwards compatibility.

-icb, --indent-closing-brace

The -icb option gives one extra level of indentation to a brace which terminates a code block. For example,

        if ($task) {
            yyy();
            }    # -icb
        else {
            zzz();
            }

The default is not to do this, indicated by -nicb.

-nib, --non-indenting-braces

Normally, lines of code contained within a pair of block braces receive one additional level of indentation. This flag, which is enabled by default, causes perltidy to look for opening block braces which are followed by a special side comment. This special side comment is #<<< by default. If found, the code between this opening brace and its corresponding closing brace will not be given the normal extra indentation level. For example:

            { #<<<   a closure to contain lexical vars

            my $var;  # this line does not get one level of indentation
            ...

            }

            # this line does not 'see' $var;

This can be useful, for example, when combining code from different files. Different sections of code can be placed within braces to keep their lexical variables from being visible to the end of the file. To keep the new braces from causing all of their contained code to be indented if you run perltidy, and possibly introducing new line breaks in long lines, you can mark the opening braces with this special side comment.

Only the opening brace needs to be marked, since perltidy knows where the closing brace is. Braces contained within marked braces may also be marked as non-indenting.

If your code happens to have some opening braces followed by '#<<<', and you don't want this behavior, you can use -nnib to deactivate it. To make it easy to remember, the default string is the same as the string for starting a format-skipping section. There is no confusion because in that case it is for a block comment rather than a side-comment.

The special side comment can be changed with the next parameter.

-nibp=s, --non-indenting-brace-prefix=s

The -nibp=string parameter may be used to change the marker for non-indenting braces. The default is equivalent to -nibp='#<<<'. The string that you enter must begin with a # and should be in quotes as necessary to get past the command shell of your system. This string is the leading text of a regex pattern that is constructed by prepending a '^' and appending a '\s', so you must also include backslashes for characters to be taken literally rather than as patterns.

For example, to match the side comment '#++', the parameter would be

  -nibp='#\+\+'
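The way the pattern is built from the prefix can be illustrated with Python's re module standing in for Perl regular expressions. This is a sketch under that assumption; the function name is hypothetical, and it is also assumed that the pattern is tested against the text of the side comment itself.

```python
import re


def brace_marker_pattern(prefix):
    """Build the marker pattern: '^' + prefix + '\\s', as described above."""
    return re.compile("^" + prefix + r"\s")


if __name__ == "__main__":
    default = brace_marker_pattern("#<<<")
    custom = brace_marker_pattern(r"#\+\+")  # backslashes make '+' literal

    print(bool(default.match("#<<<   a closure to contain lexical vars")))
    print(bool(default.match("#<<<x")))  # no whitespace after the marker
    print(bool(custom.match("#++ a non-indenting block")))
```

Because a '\s' is appended, the marker must be followed by whitespace, so a side comment such as '#<<<x' is not treated as a marker in this model.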
-olq, --outdent-long-quotes

When -olq is set, a line consisting of a quoted string longer than the value of maximum-line-length will have its indentation removed to make it more readable. This is the default. To prevent such out-denting, use -nolq or --nooutdent-long-quotes.

-oll, --outdent-long-lines

This command is equivalent to --outdent-long-quotes and --outdent-long-comments, and it is included for compatibility with previous versions of perltidy. The negation of this also works, -noll or --nooutdent-long-lines, and is equivalent to setting -nolq and -nolc.

Outdenting Labels: -ola, --outdent-labels

This command will cause labels to be outdented by 2 spaces (or whatever -ci has been set to), if possible. This is the default. For example:

        my $i;
      LOOP: while ( $i = <FOTOS> ) {
            chomp($i);
            next unless $i;
            fixit($i);
        }

Use -nola to not outdent labels. To control line breaks after labels see "-bal=n, --break-after-labels=n".

Outdenting Keywords

-okw, --outdent-keywords

The command -okw will cause certain leading control keywords to be outdented by 2 spaces (or whatever -ci has been set to), if possible. By default, these keywords are redo, next, last, goto, and return. The intention is to make these control keywords easier to see. To change this list of keywords being outdented, see the next section.

For example, using perltidy -okw on the previous example gives:

        my $i;
      LOOP: while ( $i = <FOTOS> ) {
            chomp($i);
          next unless $i;
            fixit($i);
        }

The default is not to do this.

Specifying Outdented Keywords: -okwl=string, --outdent-keyword-list=string

This command can be used to change the keywords which are outdented with the -okw command. The parameter string is a required list of perl keywords, which should be placed in quotes if there is more than one. By itself, it does not cause any outdenting to occur, so the -okw command is still required.

For example, the commands -okwl="next last redo goto" -okw will cause those four keywords to be outdented. It is probably simplest to place any -okwl command in a .perltidyrc file.

Whitespace Control

Whitespace refers to the blank space between variables, operators, and other code tokens.

-fws, --freeze-whitespace

This flag causes your original whitespace to remain unchanged, and causes the rest of the whitespace commands in this section, the Code Indentation section, and the Comment Control section to be ignored.

Tightness of curly braces, parentheses, and square brackets

Here the term "tightness" will mean the closeness with which pairs of enclosing tokens, such as parentheses, contain the quantities within. A numerical value of 0, 1, or 2 defines the tightness, with 0 being least tight and 2 being most tight. Spaces within containers are always symmetric, so if there is a space after a ( then there will be a space before the corresponding ).

The -pt=n or --paren-tightness=n parameter controls the space within parens. The example below shows the effect of the three possible values, 0, 1, and 2:

 if ( ( my $len_tab = length( $tabstr ) ) > 0 ) {  # -pt=0
 if ( ( my $len_tab = length($tabstr) ) > 0 ) {    # -pt=1 (default)
 if ((my $len_tab = length($tabstr)) > 0) {        # -pt=2

When n is 0, there is always a space to the right of a '(' and to the left of a ')'. For n=2 there is never a space. For n=1, the default, there is a space unless the quantity within the parens is a single token, such as an identifier or quoted string.
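The three tightness rules for parentheses can be expressed as a short Python function. It is a simplified model of the rule as stated here, operating on an already tokenized list of the contents; the function name is hypothetical and nested containers are not handled.

```python
def format_parens(tokens, tightness=1):
    """Place tokens inside parens according to a simplified -pt=n rule.

    tokens is the list of tokens between the parens (an assumption made
    for this sketch; a real formatter tokenizes the source itself).
    """
    if not tokens:
        return "()"
    inner = " ".join(tokens)
    if tightness == 0:
        pad = True                 # always a space inside
    elif tightness == 2:
        pad = False                # never a space inside
    else:
        pad = len(tokens) != 1     # space unless a single token
    return f"( {inner} )" if pad else f"({inner})"


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(n, format_parens(["$tabstr"], n), format_parens(["$a", "+", "$b"], n))
```

The spaces are always added or omitted on both sides together, matching the statement that spaces within containers are symmetric.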

Likewise, the parameter -sbt=n or --square-bracket-tightness=n controls the space within square brackets, as illustrated below.

 $width = $col[ $j + $k ] - $col[ $j ];  # -sbt=0
 $width = $col[ $j + $k ] - $col[$j];    # -sbt=1 (default)
 $width = $col[$j + $k] - $col[$j];      # -sbt=2

Curly braces which do not contain code blocks are controlled by the parameter -bt=n or --brace-tightness=n.

 $obj->{ $parsed_sql->{ 'table' }[0] };    # -bt=0
 $obj->{ $parsed_sql->{'table'}[0] };      # -bt=1 (default)
 $obj->{$parsed_sql->{'table'}[0]};        # -bt=2

And finally, curly braces which contain blocks of code are controlled by the parameter -bbt=n or --block-brace-tightness=n as illustrated in the example below.

 %bf = map { $_ => -M $_ } grep { /\.deb$/ } dirents '.'; # -bbt=0 (default)
 %bf = map { $_ => -M $_ } grep {/\.deb$/} dirents '.';   # -bbt=1
 %bf = map {$_ => -M $_} grep {/\.deb$/} dirents '.';     # -bbt=2

To simplify input in the case that all of the tightness flags have the same value n, the parameter -act=n or --all-containers-tightness=n is an abbreviation for the combination -pt=n -sbt=n -bt=n -bbt=n.

-tso, --tight-secret-operators

The flag -tso causes certain perl token sequences (secret operators) which might be considered to be a single operator to be formatted "tightly" (without spaces). The operators currently modified by this flag are:

     0+  +0  ()x!! ~~<>  ,=>   =( )=

For example the sequence 0 +, which converts a string to a number, would be formatted without a space: 0+ when the -tso flag is set. This flag is off by default.

-sts, --space-terminal-semicolon

Some programmers prefer a space before all terminal semicolons. The default is for no such space, and is indicated with -nsts or --nospace-terminal-semicolon.

        $i = 1 ;     #  -sts
        $i = 1;      #  -nsts   (default)

-sfs, --space-for-semicolon

Semicolons within for loops may sometimes be hard to see, particularly when commas are also present. This option places spaces on both sides of these special semicolons, and is the default. Use -nsfs or --nospace-for-semicolon to deactivate it.

 for ( @a = @$ap, $u = shift @a ; @a ; $u = $v ) {  # -sfs (default)
 for ( @a = @$ap, $u = shift @a; @a; $u = $v ) {    # -nsfs

-asc, --add-semicolons

Setting -asc allows perltidy to add any missing optional semicolon at the end of a line which is followed by a closing curly brace on the next line. This is the default, and may be deactivated with -nasc or --noadd-semicolons.

-dsm, --delete-semicolons

Setting -dsm allows perltidy to delete extra semicolons which are simply empty statements. This is the default, and may be deactivated with -ndsm or --nodelete-semicolons. (Such semicolons are not deleted, however, if they would promote a side comment to a block comment).

-aws, --add-whitespace

Setting this option allows perltidy to add certain whitespace to improve code readability. This is the default. If you do not want any whitespace added, but are willing to have some whitespace deleted, use -naws. (Use -fws to leave whitespace completely unchanged).

-dws, --delete-old-whitespace

Setting this option allows perltidy to remove some old whitespace between characters, if necessary. This is the default. If you do not want any old whitespace removed, use -ndws or --nodelete-old-whitespace.

Detailed whitespace controls around tokens

For those who want more detailed control over the whitespace around tokens, there are four parameters which can directly modify the default whitespace rules built into perltidy for any token. They are:

-wls=s or --want-left-space=s,

-nwls=s or --nowant-left-space=s,

-wrs=s or --want-right-space=s,

-nwrs=s or --nowant-right-space=s.

These parameters are each followed by a quoted string, s, containing a list of token types. No more than one of each of these parameters should be specified, because repeating a command-line parameter always overwrites the previous one before perltidy ever sees it.

To illustrate how these are used, suppose it is desired that there be no space on either side of the token types = + - / *. The following two parameters would specify this desire:

  -nwls="= + - / *"    -nwrs="= + - / *"

(Note that the token types are in quotes, and that they are separated by spaces). With these modified whitespace rules, the following line of math:

  $root = -$b + sqrt( $b * $b - 4. * $a * $c ) / ( 2. * $a );

becomes this:

  $root=-$b+sqrt( $b*$b-4.*$a*$c )/( 2.*$a );

These parameters should be considered to be hints to perltidy rather than fixed rules, because perltidy must try to resolve conflicts that arise between them and all of the other rules that it uses. One conflict that can arise is if, between two tokens, the left token wants a space and the right one doesn't. In this case, the token not wanting a space takes priority.

It is necessary to have a list of all token types in order to create this type of input. Such a list can be obtained by the command --dump-token-types. Also try the -D flag on a short snippet of code and look at the .DEBUG file to see the tokenization.

WARNING Be sure to put these tokens in quotes to avoid having them misinterpreted by your command shell.

Note1: Perltidy does not always follow whitespace controls

The various parameters controlling whitespace within a program are requests which perltidy follows as well as possible, but there are a number of situations where changing whitespace could change program behavior and is not done. Some of these are obvious; for example, we should not remove the space between the two plus symbols in '$x+ +$y' to avoid creating a '++' operator. Some are more subtle and involve the whitespace around bareword symbols and locations of possible filehandles. For example, consider the problem of formatting the following subroutine:

   sub print_div {
      my ($x,$y)=@_;
      print $x/$y;
   }

Suppose the user requests that / signs have a space to the left but not to the right. Perltidy will refuse to do this, but if this were done the result would be

   sub print_div {
       my ($x,$y)=@_;
       print $x /$y;
   }

If formatted in this way, the program will not run (at least with recent versions of perl) because the $x is taken to be a filehandle and / is assumed to start a quote. In a complex program, there might happen to be a / which terminates the multiline quote without a syntax error, allowing the program to run, but not as intended.

Related issues arise with other binary operator symbols, such as + and -, and in older versions of perl there could be problems with ternary operators. So to avoid changing program behavior, perltidy has the simple rule that whitespace around possible filehandles is left unchanged. Likewise, whitespace around barewords is left unchanged. The reason is that if the barewords are defined in other modules, or in code that has not even been written yet, perltidy will not have seen their prototypes and must treat them cautiously.

In perltidy this is implemented in the tokenizer by marking the token following a print keyword as a special type Z. When formatting is being done, whitespace following this token type is generally left unchanged as a precaution against changing program behavior. This is excessively conservative but simple and easy to implement. Keywords which are treated similarly to print include printf, sort, exec, system. Changes in spacing around parameters following these keywords may have to be made manually. For example, the space, or lack of space, after the parameter $foo in the following line will be unchanged in formatting.

   system($foo );
   system($foo);

To find if a token is of type Z you can use perltidy -DEBUG. For the first line above the result is

   1: system($foo );
   1: kkkkkk{ZZZZb};

which shows that system is type k (keyword) and $foo is type Z.

Note2: Perltidy's whitespace rules are not perfect

Despite these precautions, it is still possible to introduce syntax errors with some asymmetric whitespace rules, particularly when call parameters are not placed in containing parens or braces. For example, the following two lines will be parsed by perl without a syntax error:

  # original programming, syntax ok
  my @newkeys = map $_-$nrecs+@data, @oldkeys;

  # perltidy default, syntax ok
  my @newkeys = map $_ - $nrecs + @data, @oldkeys;

But the following will give a syntax error:

  # perltidy -nwrs='-'
  my @newkeys = map $_ -$nrecs + @data, @oldkeys;

For another example, the following two lines will be parsed without syntax error:

  # original programming, syntax ok
  for my $severity ( reverse $SEVERITY_LOWEST+1 .. $SEVERITY_HIGHEST ) { ...  }

  # perltidy default, syntax ok
  for my $severity ( reverse $SEVERITY_LOWEST + 1 .. $SEVERITY_HIGHEST ) { ... }

But the following will give a syntax error:

  # perltidy -nwrs='+', syntax error:
  for my $severity ( reverse $SEVERITY_LOWEST +1 .. $SEVERITY_HIGHEST ) { ... }

To avoid subtle parsing problems like this, it is best to avoid spacing a binary operator asymmetrically with a space on the left but not on the right.

Space between specific keywords and opening paren

When an opening paren follows a Perl keyword, no space is introduced after the keyword, unless it is (by default) one of these:

   my local our and or xor eq ne if else elsif until unless
   while for foreach return switch case given when

These defaults can be modified with two commands:

-sak=s or --space-after-keyword=s adds keywords.

-nsak=s or --nospace-after-keyword=s removes keywords.

where s is a list of keywords (in quotes if necessary). For example,

  my ( $a, $b, $c ) = @_;    # default
  my( $a, $b, $c ) = @_;     # -nsak="my local our"

The abbreviation -nsak='*' is equivalent to including all of the keywords in the above list.

When both -nsak=s and -sak=s commands are included, the -nsak=s command is executed first. For example, to have space after only the keywords (my, local, our) you could use -nsak="*" -sak="my local our".
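
The order of evaluation can be pictured as simple set operations. The sketch below is only a model of the rule stated above, not perltidy's own code; the helper name space_after_keywords is hypothetical, and the default list is copied from the list given earlier in this item.

```perl
use strict;
use warnings;

# Default keywords which get a space before an opening paren.
my @default_keywords = qw(
  my local our and or xor eq ne if else elsif until unless
  while for foreach return switch case given when
);

# Hypothetical helper: model how -nsak=s and -sak=s combine.
# The -nsak list is applied first ('*' removes every default),
# and the -sak list is then added.
sub space_after_keywords {
    my ( $nsak, $sak ) = @_;
    my %wanted = map { $_ => 1 } @default_keywords;

    my @remove = split ' ', ( defined $nsak ? $nsak : '' );
    if   ( grep { $_ eq '*' } @remove ) { %wanted = () }
    else                                { delete @wanted{@remove} }

    $wanted{$_} = 1 for split ' ', ( defined $sak ? $sak : '' );
    my @result = sort keys %wanted;
    return @result;
}

# -nsak="*" -sak="my local our"
print join( ' ', space_after_keywords( '*', 'my local our' ) ), "\n";

# -nsak="my local our"
print join( ' ', space_after_keywords( 'my local our', undef ) ), "\n";
```

The first call leaves only my, local and our in the set; the second leaves every default keyword except those three.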

To put a space after all keywords, see the next item.

Space between all keywords and opening parens

When an opening paren follows a function or keyword, no space is introduced after the keyword except for the keywords noted in the previous item. To always put a space between a function or keyword and its opening paren, use the command:

-skp or --space-keyword-paren

You may also want to use the flag -sfp (next item).

Space between all function names and opening parens

When an opening paren follows a function the default and recommended formatting is not to introduce a space. To cause a space to be introduced use:

-sfp or --space-function-paren

  myfunc( $a, $b, $c );    # default
  myfunc ( $a, $b, $c );   # -sfp

You will probably also want to use the flag -skp (previous item).

This parameter is not recommended because spacing a function paren can make a program vulnerable to parsing problems by Perl. For example, the following two-line program will run as written but will have a syntax error if reformatted with -sfp:

  if ( -e filename() ) { print "I'm here\n"; }
  sub filename { return $0 }

In this particular case the syntax error can be removed if the line order is reversed, so that Perl parses 'sub filename' first.

-fpva or --function-paren-vertical-alignment

A side-effect of using the -sfp flag is that the parens may become vertically aligned. For example,

    # perltidy -sfp
    myfun     ( $aaa, $b, $cc );
    mylongfun ( $a, $b, $c );

This is the default behavior. To prevent this alignment use -nfpva:

    # perltidy -sfp -nfpva
    myfun ( $aaa, $b, $cc );
    mylongfun ( $a, $b, $c );

-spp=n or --space-prototype-paren=n

This flag can be used to control whether a function prototype is preceded by a space. For example, the following prototype does not have a space.

      sub usage();

This integer n may have the value 0, 1, or 2 as follows:

    -spp=0 means no space before the paren
    -spp=1 means follow the example of the source code [DEFAULT]
    -spp=2 means always put a space before the paren

The default is -spp=1, meaning that a space will be used if and only if there is one in the source code. Given the above line of code, the result of applying the different options would be:

        sub usage();    # n=0 [no space]
        sub usage();    # n=1 [default; follows input]
        sub usage ();   # n=2 [space]

-kpit=n or --keyword-paren-inner-tightness=n

The space inside of an opening paren, which itself follows a certain keyword, can be controlled by this parameter. The space on the inside of the corresponding closing paren will be treated in the same (balanced) manner. This parameter has precedence over any other paren spacing rules. The values of n are as follows:

   -kpit=0 means always put a space (not tight)
   -kpit=1 means ignore this parameter [default]
   -kpit=2 means never put a space (tight)

To illustrate, the following snippet is shown formatted in three ways:

    if ( seek( DATA, 0, 0 ) ) { ... }    # perltidy (default)
    if (seek(DATA, 0, 0)) { ... }        # perltidy -pt=2
    if ( seek(DATA, 0, 0) ) { ... }      # perltidy -pt=2 -kpit=0

In the second case the -pt=2 parameter makes all of the parens tight. In the third case the -kpit=0 flag puts a space inside the 'if' parens, since 'if' is one of the keywords to which the -kpit flag applies by default. The remaining parens are still tight because of the -pt=2 parameter.

By default, this parameter applies to the following keywords:

   if elsif unless while until for foreach

These can be changed with the parameter -kpitl=s described in the next section.

-kpitl=string or --keyword-paren-inner-tightness-list=string

This command can be used to change the keywords to which the -kpit=n command applies. The parameter string is a required list of either keywords or functions, which should be placed in quotes if there is more than one. By itself, this parameter does not cause any change in spacing, so the -kpit=n command is still required.

For example, the commands -kpitl="if else while" -kpit=2 will cause just the spaces inside parens following the 'if', 'else', and 'while' keywords to follow the tightness value indicated by the -kpit=2 flag.

-lop or --logical-padding

In the following example some extra space has been inserted on the second line between the two open parens. This extra space is called "logical padding" and is intended to help align similar things vertically in some logical or ternary expressions.

    # perltidy [default formatting]
    $same =
      (      ( $aP eq $bP )
          && ( $aS eq $bS )
          && ( $aT eq $bT )
          && ( $a->{'title'} eq $b->{'title'} )
          && ( $a->{'href'} eq $b->{'href'} ) );

Note that this is considered to be a different operation from "vertical alignment" because space at just one line is being adjusted, whereas in "vertical alignment" the spaces at all lines are being adjusted. So it is a sort of local version of vertical alignment.

Here is an example involving a ternary operator:

    # perltidy [default formatting]
    $bits =
        $top > 0xffff ? 32
      : $top > 0xff   ? 16
      : $top > 1      ? 8
      :                 1;

This behavior is controlled with the flag --logical-padding, which is set 'on' by default. If it is not desired it can be turned off using --nological-padding or -nlop. The above two examples become, with -nlop:

    # perltidy -nlop
    $same =
      ( ( $aP eq $bP )
          && ( $aS eq $bS )
          && ( $aT eq $bT )
          && ( $a->{'title'} eq $b->{'title'} )
          && ( $a->{'href'} eq $b->{'href'} ) );

    # perltidy -nlop
    $bits =
      $top > 0xffff ? 32
      : $top > 0xff ? 16
      : $top > 1    ? 8
      :               1;

Trimming whitespace around qw quotes

-tqw or --trim-qw provides the default behavior of trimming spaces around multi-line qw quotes and indenting them appropriately.

-ntqw or --notrim-qw causes leading and trailing whitespace around multi-line qw quotes to be left unchanged. This option will not normally be necessary, but was added for testing purposes, because in some versions of perl, trimming qw quotes changes the syntax tree.

-sbq=n or --space-backslash-quote=n

Lines like

       $str1=\"string1";
       $str2=\'string2';

can confuse syntax highlighters unless a space is included between the backslash and the single or double quotation mark.

This can be controlled with the value of n as follows:

    -sbq=0 means no space between the backslash and quote
    -sbq=1 means follow the example of the source code
    -sbq=2 means always put a space between the backslash and quote

The default is -sbq=1, meaning that a space will be used if there is one in the source code.

Trimming trailing whitespace from lines of POD

-trp or --trim-pod will remove trailing whitespace from lines of POD. The default is not to do this.

Comment Controls

Perltidy has a number of ways to control the appearance of both block comments and side comments. The term block comment here refers to a full-line comment, whereas side comment will refer to a comment which appears on a line to the right of some code.

-ibc, --indent-block-comments

Block comments normally look best when they are indented to the same level as the code which follows them. This is the default behavior, but you may use -nibc to keep block comments left-justified. Here is an example:

             # this comment is indented      (-ibc, default)
             if ($task) { yyy(); }

The alternative is -nibc:

 # this comment is not indented              (-nibc)
             if ($task) { yyy(); }

See also the next item, -isbc, as well as -sbc, for other ways to have some indented and some outdented block comments.

-isbc, --indent-spaced-block-comments

If there is no leading space on the line, then the comment will not be indented, and otherwise it may be.

If both -ibc and -isbc are set, then -isbc takes priority.

-olc, --outdent-long-comments

When -olc is set, lines which are full-line (block) comments longer than the value of maximum-line-length will have their indentation removed. This is the default; use -nolc to prevent outdenting.

-msc=n, --minimum-space-to-comment=n

Side comments look best when lined up several spaces to the right of code. Perltidy will try to keep comments at least n spaces to the right. The default is n=4 spaces.

-fpsc=n, --fixed-position-side-comment=n

This parameter tells perltidy to line up side comments in column number n whenever possible. The default, n=0, will not do this.

-iscl, --ignore-side-comment-lengths

This parameter causes perltidy to ignore the length of side comments when setting line breaks. The default, -niscl, is to include the length of side comments when breaking lines to stay within the length prescribed by the -l=n maximum line length parameter. For example, the following long single line would remain intact with -l=80 and -iscl:

     perltidy -l=80 -iscl
        $vmsfile =~ s/;[\d\-]*$//; # Clip off version number; we can use a newer version as well

whereas without the -iscl flag the line will be broken:

     perltidy -l=80
        $vmsfile =~ s/;[\d\-]*$//
          ;    # Clip off version number; we can use a newer version as well

-hsc, --hanging-side-comments

By default, perltidy tries to identify and align "hanging side comments", which are something like this:

        my $IGNORE = 0;    # This is a side comment
                           # This is a hanging side comment
                           # And so is this

A comment is considered to be a hanging side comment if (1) it immediately follows a line with a side comment, or another hanging side comment, and (2) there is some leading whitespace on the line. To deactivate this feature, use -nhsc or --nohanging-side-comments. If block comments are preceded by a blank line, or have no leading whitespace, they will not be mistaken as hanging side comments.

Closing Side Comments

A closing side comment is a special comment which perltidy can automatically create and place after the closing brace of a code block. They can be useful for code maintenance and debugging. The command -csc (or --closing-side-comments) adds or updates closing side comments. For example, here is a small code snippet

        sub message {
            if ( !defined( $_[0] ) ) {
                print("Hello, World\n");
            }
            else {
                print( $_[0], "\n" );
            }
        }

And here is the result of processing with perltidy -csc:

        sub message {
            if ( !defined( $_[0] ) ) {
                print("Hello, World\n");
            }
            else {
                print( $_[0], "\n" );
            }
        } ## end sub message

A closing side comment was added for sub message in this case, but not for the if and else blocks, because they were below the 6 line cutoff limit for adding closing side comments. This limit may be changed with the -csci command, described below.

The command -dcsc (or --delete-closing-side-comments) reverses this process and removes these comments.

Several commands are available to modify the behavior of these two basic commands, -csc and -dcsc:

-csci=n, or --closing-side-comment-interval=n

where n is the minimum number of lines that a block must have in order for a closing side comment to be added. The default value is n=6. To illustrate:

        # perltidy -csci=2 -csc
        sub message {
            if ( !defined( $_[0] ) ) {
                print("Hello, World\n");
            } ## end if ( !defined( $_[0] ))
            else {
                print( $_[0], "\n" );
            } ## end else [ if ( !defined( $_[0] ))
        } ## end sub message

Now the if and else blocks are commented. However, now this has become very cluttered.

-cscp=string, or --closing-side-comment-prefix=string

where string is the prefix used before the name of the block type. The default prefix, shown above, is ## end. This string will be added to closing side comments, and it will also be used to recognize them in order to update, delete, and format them. Any comment identified as a closing side comment will be placed just a single space to the right of its closing brace.

-cscl=string, or --closing-side-comment-list=string

where string is a list of block types to be tagged with closing side comments. By default, all code block types preceded by a keyword or label (such as if, sub, and so on) will be tagged. The -cscl command changes the default list to be any selected block types; see "Specifying Block Types". For example, the following command requests that only sub's, labels, BEGIN, and END blocks be affected by any -csc or -dcsc operation:

   -cscl="sub : BEGIN END"

-csct=n, or --closing-side-comment-maximum-text=n

The text appended to certain block types, such as an if block, is whatever lies between the keyword introducing the block, such as if, and the opening brace. Since this might be too much text for a side comment, there needs to be a limit, and that is the purpose of this parameter. The default value is n=20, meaning that no additional tokens will be appended to this text after its length reaches 20 characters. Omitted text is indicated with .... (Tokens, including sub names, are never truncated, however, so actual lengths may exceed this). To illustrate, in the above example, the appended text of the first block is ( !defined( $_[0] ).... The existing limit of n=20 caused this text to be truncated, as indicated by the .... See the next flag for additional control of the abbreviated text.

-cscb, or --closing-side-comments-balanced

As discussed in the previous item, when the closing-side-comment-maximum-text limit is exceeded the comment text must be truncated. Older versions of perltidy terminated with three dots, and this can still be achieved with -ncscb:

  perltidy -csc -ncscb
  } ## end foreach my $foo (sort { $b cmp $a ...

However this causes a problem with editors which cannot recognize comments or are not configured to do so because they cannot "bounce" around in the text correctly. The -cscb flag has been added to help them by appending appropriate balancing structure:

  perltidy -csc -cscb
  } ## end foreach my $foo (sort { $b cmp $a ... })

The default is -cscb.

-csce=n, or --closing-side-comment-else-flag=n

The default, n=0, places the text of the opening if statement after any terminal else.

If n=2 is used, then each elsif is also given the text of the opening if statement. Also, an else will include the text of a preceding elsif statement. Note that this may result in some long closing side comments.

If n=1 is used, the results will be the same as n=2 whenever the resulting line length is less than the maximum allowed.

-cscw, or --closing-side-comment-warnings

This parameter is intended to help make the initial transition to the use of closing side comments. It causes two things to happen if a closing side comment replaces an existing, different closing side comment: first, an error message will be issued, and second, the original side comment will be placed alone on a new specially marked comment line for later attention.

The intent is to avoid clobbering existing hand-written side comments which happen to match the pattern of closing side comments. This flag should only be needed on the first run with -csc.

Important Notes on Closing Side Comments:

Static Block Comments

Static block comments are block comments with a special leading pattern, ## by default, which will be treated slightly differently from other block comments. They effectively behave as if they had glue along their left and top edges, because they stick to the left edge and previous line when there are no blank spaces in those places. This option is particularly useful for controlling how commented code is displayed.

-sbc, --static-block-comments

When -sbc is used, a block comment with a special leading pattern, ## by default, will be treated specially.

Comments so identified are treated as follows:

  • If there is no leading space on the line, then the comment will not be indented, and otherwise it may be,

  • no new blank line will be inserted before such a comment, and

  • such a comment will never become a hanging side comment.

For example, assuming @month_of_year is left-adjusted:

    @month_of_year = (    # -sbc (default)
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct',
    ##  'Dec', 'Nov'
        'Nov', 'Dec');

Without this convention, the above code would become

    @month_of_year = (   # -nsbc
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct',

        ##  'Dec', 'Nov'
        'Nov', 'Dec'
    );

which is not as clear. The default is to use -sbc. This may be deactivated with -nsbc.

-sbcp=string, --static-block-comment-prefix=string

This parameter defines the prefix used to identify static block comments when the -sbc parameter is set. The default prefix is ##, corresponding to -sbcp=##. The prefix is actually part of a perl pattern used to match lines and it must either begin with # or ^#. In the first case a prefix ^\s* will be added to match any leading whitespace, while in the second case the pattern will match only comments with no leading whitespace. For example, to identify all comments as static block comments, one would use -sbcp=#. To identify all left-adjusted comments as static block comments, use -sbcp='^#'.

Please note that -sbcp merely defines the pattern used to identify static block comments; it will not be used unless the switch -sbc is set. Also, please be aware that since this string is used in a perl regular expression which identifies these comments, it must enable a valid regular expression to be formed.

A pattern which can be useful is:

    -sbcp=^#{2,}[^\s#]

This pattern requires a static block comment to have at least one character which is neither a # nor a space. It allows a line containing only '#' characters to be rejected as a static block comment. Such lines are often used at the start and end of header information in subroutines and should not be separated from the intervening comments, which typically begin with just a single '#'.
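
The way a prefix becomes a pattern can be tried out with a few lines of perl. This is a minimal sketch of the rule described above, not perltidy's own code, and the helper name static_block_comment_pattern is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper following the rule described above: a prefix
# beginning with '^#' is used as given, and a prefix beginning with '#'
# has '^\s*' added in front so that leading whitespace is allowed.
sub static_block_comment_pattern {
    my ($prefix) = @_;
    die "prefix must begin with '#' or '^#'\n" unless $prefix =~ /^\^?#/;
    my $pattern = $prefix =~ /^\^/ ? $prefix : '^\s*' . $prefix;
    return qr/$pattern/;
}

my $re = static_block_comment_pattern('^#{2,}[^\s#]');
my @lines = ( '##my $x = 1;', '########', '# ordinary comment', '    ##indented' );
for my $line (@lines) {
    printf "%-22s %s\n", $line, $line =~ $re ? 'static' : 'not static';
}
```

Only the first line is reported as a static block comment: the row of '#' characters has no other character after it, the third line has a single '#', and the last line is indented while the pattern is anchored with '^#'.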

-osbc, --outdent-static-block-comments

The command -osbc will cause static block comments to be outdented by 2 spaces (or whatever -ci=n has been set to), if possible.

Static Side Comments

Static side comments are side comments with a special leading pattern. This option can be useful for controlling how commented code is displayed when it is a side comment.

-ssc, --static-side-comments

When -ssc is used, a side comment with a static leading pattern, which is ## by default, will be spaced only a single space from the previous character, and it will not be vertically aligned with other side comments.

The default is -nssc.

-sscp=string, --static-side-comment-prefix=string

This parameter defines the prefix used to identify static side comments when the -ssc parameter is set. The default prefix is ##, corresponding to -sscp=##.

Please note that -sscp merely defines the pattern used to identify static side comments; it will not be used unless the switch -ssc is set. Also, note that this string is used in a perl regular expression which identifies these comments, so it must enable a valid regular expression to be formed.

Skipping Selected Sections of Code

Selected lines of code may be passed verbatim to the output without any formatting by marking the starting and ending lines with special comments. There are two options for doing this. The first option is called --format-skipping or -fs, and the second option is called --code-skipping or -cs.

In both cases the lines of code will be output without any changes. The difference is that in --format-skipping perltidy will still parse the marked lines of code and check for errors, whereas in --code-skipping perltidy will simply pass the lines to the output without any checking.

Both of these features are enabled by default and are invoked with special comment markers. --format-skipping uses starting and ending markers '#<<<' and '#>>>', like this:

 #<<<  format skipping: do not let perltidy change my nice formatting
    my @list = (1,
                1, 1,
                1, 2, 1,
                1, 3, 3, 1,
                1, 4, 6, 4, 1,);
 #>>>

--code-skipping uses starting and ending markers '#<<V' and '#>>V', like this:

 #<<V  code skipping: perltidy will pass this verbatim without error checking

    token ident_digit {
        [ [ <?word> | _ | <?digit> ] <?ident_digit>
        |   <''>
        ]
    };

 #>>V

Additional text may appear on the special comment lines provided that it is separated from the marker by at least one space, as in the above examples.

Any number of code-skipping or format-skipping sections may appear in a file. If an opening code-skipping or format-skipping comment is not followed by a corresponding closing comment, then skipping continues to the end of the file. If a closing code-skipping or format-skipping comment appears in a file but does not follow a corresponding opening comment, then it is treated as an ordinary comment without any special meaning.
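
These rules for unmatched markers can be summarized with a small model. The sketch below is not perltidy's own code: the helper name tag_lines is hypothetical, only the default --format-skipping markers are handled, and it assumes that a marker may be indented and must be followed by whitespace or the end of the line.

```perl
use strict;
use warnings;

# Hypothetical helper: tag each line as 'skip' (passed through verbatim)
# or 'format'. The marker lines themselves are tagged 'skip'.
sub tag_lines {
    my @lines    = @_;
    my $begin    = qr/^\s*#<<<(?!\S)/;
    my $end      = qr/^\s*#>>>(?!\S)/;
    my $skipping = 0;
    my @tagged;
    for my $line (@lines) {
        if ( !$skipping && $line =~ $begin ) { $skipping = 1 }
        push @tagged, [ $skipping ? 'skip' : 'format', $line ];
        if ( $skipping && $line =~ $end ) { $skipping = 0 }
    }
    return @tagged;
}

my @demo = (
    '#>>> stray closing marker: just a comment',
    'my $a=1;',
    '#<<< keep my layout',
    'my @list = (1,',
    '            1, 1,);',
    '#>>>',
    'my $b=2;',
    '#<<< no closing marker follows',
    'my $c=3;',
);
printf "%-6s %s\n", @{$_} for tag_lines(@demo);
```

The stray closing marker on the first line is tagged as ordinary code to format, and the final opening marker, which has no closing partner, causes skipping to run to the end of the input.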

It is recommended to use --code-skipping only if you need to hide a block of an extended syntax which would produce errors if parsed by perltidy, and use --format-skipping otherwise. This is because the --format-skipping option provides the benefits of error checking, and there are essentially no limitations on the lines to which it can be applied. The --code-skipping option, on the other hand, does not do error checking and its use is more restrictive because the code which remains, after skipping the marked lines, must be syntactically correct code with balanced containers.

These features should be used sparingly to avoid littering code with markers, but they can be helpful for working around occasional problems.

Note that it may be possible to avoid the use of --format-skipping for the specific case of a comma-separated list of values, as in the above example, by simply inserting a blank or comment somewhere between the opening and closing parens. See the section "Controlling List Formatting".

The following sections describe the available controls for these options. They should not normally be needed.

-fs, --format-skipping

As explained above, this flag, which is enabled by default, causes any code between special beginning and ending comment markers to be passed to the output without formatting. The code between the comments is still checked for errors however. The default beginning marker is #<<< and the default ending marker is #>>>.

Format skipping begins when a format skipping beginning comment is seen and continues until a format-skipping ending comment is found.

This feature can be disabled with -nfs. This should not normally be necessary.

-fsb=string, --format-skipping-begin=string

This and the next parameter allow the special beginning and ending comments to be changed. However, it is recommended that they only be changed if there is a conflict between the default values and some other use. If they are used, it is recommended that they only be entered in a .perltidyrc file, rather than on a command line. This is because properly escaping these parameters on a command line can be difficult.

If changed comment markers do not appear to be working, use the -log flag and examine the .LOG file to see if and where they are being detected.

The -fsb=string parameter may be used to change the beginning marker for format skipping. The default is equivalent to -fsb='#<<<'. The string that you enter must begin with a # and should be in quotes as necessary to get past the command shell of your system. It is actually the leading text of a pattern that is constructed by appending a '\s', so you must also include backslashes for characters to be taken literally rather than as patterns.

Some examples show how example strings become patterns:

 -fsb='#\{\{\{' becomes /^#\{\{\{\s/  which matches  #{{{ but not #{{{{
 -fsb='#\*\*'   becomes /^#\*\*\s/    which matches  #** but not #***
 -fsb='#\*{2,}' becomes /^#\*{2,}\s/  which matches  #** and #*****

-fse=string, --format-skipping-end=string

The -fse=string is the corresponding parameter used to change the ending marker for format skipping. The default is equivalent to -fse='#>>>'.

The beginning and ending strings may be the same, but it is preferable to make them different for clarity.
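
Before putting a new marker string into a .perltidyrc file it can be useful to see which comments it would match. The following minimal sketch builds the pattern as described for -fsb=string, by anchoring the string at the start of the comment and appending '\s'. The helper name marker_pattern is hypothetical and this is not perltidy's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an -fsb or -fse string into a pattern by
# anchoring it at the start of the comment and appending '\s'.
sub marker_pattern {
    my ($string) = @_;
    die "marker string must begin with '#'\n" unless $string =~ /^#/;
    my $pattern = '^' . $string . '\s';
    return qr/$pattern/;
}

my $begin = marker_pattern('#\*{2,}');
for my $comment ( "#**\n", "#*****\n", "#*\n", "#** begin skipping\n" ) {
    ( my $shown = $comment ) =~ s/\n\z//;
    printf "%-20s %s\n", $shown,
      $comment =~ $begin ? 'marker' : 'ordinary comment';
}
```

Each test comment ends with a newline, which satisfies the trailing '\s' when the marker stands alone on its line. Only the comment with a single '*' is reported as an ordinary comment.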

-cs, --code-skipping

As explained above, this flag, which is enabled by default, causes any code between special beginning and ending comment markers to be directly passed to the output without any error checking or formatting. Essentially, perltidy treats it as if it were a block of arbitrary text. The default beginning marker is #<<V and the default ending marker is #>>V.

This feature can be disabled with -ncs. This should not normally be necessary.

-csb=string, --code-skipping-begin=string

This may be used to change the beginning comment for a --code-skipping section, and its use is similar to the -fsb=string. The default is equivalent to -csb='#<<V'.

-cse=string, --code-skipping-end=string

This may be used to change the ending comment for a --code-skipping section, and its use is similar to the -fse=string. The default is equivalent to -cse='#>>V'.

Line Break Control

The parameters in this and the next sections control breaks after non-blank lines of code. Blank lines are controlled separately by parameters in the section "Blank Line Control".

-dnl, --delete-old-newlines

By default, perltidy first deletes all old line break locations, and then it looks for good break points to match the desired line length. Use -ndnl or --nodelete-old-newlines to force perltidy to retain all old line break points.

-anl, --add-newlines

By default, perltidy will add line breaks when necessary to create continuations of long lines and to improve the script appearance. Use -nanl or --noadd-newlines to prevent any new line breaks.

This flag does not prevent perltidy from eliminating existing line breaks; see --freeze-newlines to completely prevent changes to line break points.

-fnl, --freeze-newlines

If you do not want any changes to the line breaks within lines of code in your script, set -fnl, and they will remain fixed; the rest of the commands in this section and in the sections "Controlling List Formatting" and "Retaining or Ignoring Existing Line Breaks" will then be ignored. You may want to use -noll with this.

Note: If you also want to keep your blank lines exactly as they are, you can use the -fbl flag which is described in the section "Blank Line Control".

Controlling Breaks at Braces, Parens, and Square Brackets

-ce, --cuddled-else

Enable the "cuddled else" style, in which else and elsif follow immediately after the curly brace closing the previous block. The default is not to use cuddled elses, and is indicated with the flag -nce or --nocuddled-else. Here is a comparison of the alternatives:

  # -ce
  if ($task) {
      yyy();
  } else {
      zzz();
  }

  # -nce (default)
  if ($task) {
      yyy();
  }
  else {
      zzz();
  }

In this example the keyword else is placed on the same line which begins with the preceding closing block brace and is followed by its own opening block brace on the same line. Other keywords and function names which are formatted with this "cuddled" style are elsif, continue, catch, finally.

Other block types can be formatted by specifying their names on a separate parameter -cbl, described in a later section.

Cuddling between a pair of code blocks requires that the closing brace of the first block start a new line. If this block is entirely on one line in the input file, it is necessary to decide if it should be broken to allow cuddling. This decision is controlled by the flag -cbo=n discussed below. The default and recommended value of -cbo=1 bases this decision on the first block in the chain. If it spans multiple lines then cuddling is made and continues along the chain, regardless of the sizes of subsequent blocks. Otherwise, short lines remain intact.

So for example, the -ce flag would not have any effect if the above snippet is rewritten as

  if ($task) { yyy() }
  else {    zzz() }

If the first block spans multiple lines, then cuddling can be done and will continue for the subsequent blocks in the chain, as illustrated in the previous snippet.

If there are blank lines between cuddled blocks they will be eliminated. If there are comments after the closing brace where cuddling would occur then cuddling will be prevented. If this occurs, cuddling will restart later in the chain if possible.

-cb, --cuddled-blocks

This flag is equivalent to -ce.

-cbl, --cuddled-block-list

The built-in default cuddled block types are else, elsif, continue, catch, finally.

Additional block types to which the -cuddled-blocks style applies can be defined by this parameter. This parameter is a character string, giving a list of block types separated by commas or spaces. For example, to cuddle code blocks of type sort, map and grep, in addition to the default types, the string could be set to

  -cbl="sort map grep"

or equivalently

  -cbl=sort,map,grep

Note however that these particular block types are typically short so there might not be much opportunity for the cuddled format style.

Using commas avoids the need to protect spaces with quotes.

As a diagnostic check, the flag --dump-cuddled-block-list or -dcbl can be used to view the hash of values that are generated by this flag.

Finally, note that the -cbl flag by itself merely specifies which blocks are formatted with the cuddled format. It has no effect unless this formatting style is activated with -ce.
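
The way a -cbl string combines with the built-in defaults can be sketched in a few lines of Perl. This is only an illustration of the rules stated above, not perltidy's own code: the helper name cuddled_block_types and its second argument (standing in for the -cblx flag described in the next section) are assumptions made for this example, and the real option may accept syntax not handled here.

```perl
use strict;
use warnings;

# Built-in cuddled block types, as listed in the text above.
my @DEFAULT_CUDDLED = qw(else elsif continue catch finally);

# Hypothetical helper: turn a -cbl string into the effective list of
# block types. If $exclusive is true (the -cblx case), the built-in
# defaults are ignored.
sub cuddled_block_types {
    my ( $cbl_string, $exclusive ) = @_;

    # Items may be separated by commas, spaces, or a mix of both.
    my @requested = grep { length } split /[\s,]+/, $cbl_string // '';

    my @types = $exclusive ? @requested : ( @DEFAULT_CUDDLED, @requested );

    # Drop duplicates while keeping first-seen order.
    my %seen;
    return grep { !$seen{$_}++ } @types;
}

print join( ' ', cuddled_block_types('sort map grep') ), "\n";
print join( ' ', cuddled_block_types('sort,map,grep') ), "\n";
print join( ' ', cuddled_block_types( 'else elsif continue', 1 ) ), "\n";
```

The first two calls give the same list because commas and spaces are treated alike; the third shows the exclusive case, where only the named types remain.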

-cblx, --cuddled-block-list-exclusive

When cuddled else formatting is selected with -ce, setting this flag causes perltidy to ignore its built-in defaults and rely exclusively on the block types specified on the -cbl flag described in the previous section. For example, to avoid using cuddled catch and finally, which are among the defaults, the following set of parameters could be used:

  perltidy -ce -cbl='else elsif continue' -cblx

-cbo=n, --cuddled-break-option=n

Cuddled formatting is only possible between a pair of code blocks if the closing brace of the first block starts a new line. If a block is encountered which is entirely on a single line, and cuddled formatting is selected, it is necessary to make a decision as to whether or not to "break" the block, meaning to cause it to span multiple lines. This parameter controls that decision. The options are:

   cbo=0  Never force a short block to break.
   cbo=1  If the first of a pair of blocks is broken in the input file,
          then break the second [DEFAULT].
   cbo=2  Break open all blocks for maximal cuddled formatting.

The default and recommended value is cbo=1. With this value, if the starting block of a chain spans multiple lines, then a cascade of breaks will occur for remaining blocks causing the entire chain to be cuddled.

The option cbo=0 can produce erratic cuddling if there are numerous one-line blocks.

The option cbo=2 produces maximal cuddling but will not allow any short blocks.
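
The three -cbo settings amount to a small decision rule, sketched below in Perl. The function name should_break_one_line_block and its arguments are hypothetical; the sketch only restates the table above and does not reproduce how perltidy itself makes the decision.

```perl
use strict;
use warnings;

# Hypothetical helper: should a one-line block in a cuddled chain be
# broken open? $first_block_is_multiline describes the first block of
# the chain as it appears in the input file.
sub should_break_one_line_block {
    my ( $cbo, $first_block_is_multiline ) = @_;
    return 0 if $cbo == 0;    # never force a short block to break
    return 1 if $cbo == 2;    # break open all blocks
    return $first_block_is_multiline ? 1 : 0;    # cbo=1, the default
}

for my $cbo ( 0, 1, 2 ) {
    for my $multiline ( 0, 1 ) {
        printf "cbo=%d first block multiline=%d -> break=%d\n",
          $cbo, $multiline, should_break_one_line_block( $cbo, $multiline );
    }
}
```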

-bl, --opening-brace-on-new-line, or --brace-left

Use the flag -bl to place an opening block brace on a new line:

  if ( $input_file eq '-' )
  {
      ...
  }

By default it applies to all structural blocks except sort, map, grep, eval, and anonymous subs.

The default is -nbl which places an opening brace on the same line as the keyword introducing it if possible. For example,

  # default
  if ( $input_file eq '-' ) {
     ...
  }

When -bl is set, the blocks to which this applies can be controlled with the parameters --brace-left-list and --brace-left-exclusion-list described in the next sections.

-bll=s, --brace-left-list=s

Use this parameter to change the types of block braces for which the -bl flag applies; see "Specifying Block Types". For example, -bll='if elsif else sub' would apply it to only if/elsif/else and named sub blocks. The default is all blocks, -bll='*'.

-blxl=s, --brace-left-exclusion-list=s

Use this parameter to exclude types of block braces for which the -bl flag applies; see "Specifying Block Types". For example, the default settings -bll='*' and -blxl='sort map grep eval asub' mean all blocks except sort, map, grep, eval, and anonymous sub blocks.

Note that the lists -bll=s and -blxl=s control the behavior of the -bl flag but have no effect unless the -bl flag is set.
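
The interaction of the inclusion and exclusion lists can be illustrated with a short Perl sketch. The helper brace_left_applies is hypothetical and handles only plain block names and the '*' wildcard; the full syntax is described under "Specifying Block Types" and is not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical helper: does the -bl style apply to a given block type?
# $bll and $blxl are the strings given to -bll and -blxl; '*' in the
# inclusion list means "all block types".
sub brace_left_applies {
    my ( $block_type, $bll, $blxl ) = @_;
    my %include = map { $_ => 1 } split ' ', $bll;
    my %exclude = map { $_ => 1 } split ' ', $blxl;

    # An excluded type is never selected, even if the inclusion list
    # would otherwise match it.
    return 0 if $exclude{$block_type};
    return ( $include{'*'} || $include{$block_type} ) ? 1 : 0;
}

# Default settings quoted in the text above.
my ( $bll, $blxl ) = ( '*', 'sort map grep eval asub' );

for my $type (qw(if sub sort asub)) {
    printf "%-5s -> %s\n", $type,
      brace_left_applies( $type, $bll, $blxl ) ? '-bl applies' : 'unchanged';
}
```

With the default lists, if and sub blocks are selected while sort blocks and anonymous subs (asub) are left alone, matching the description above.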

-sbl, --opening-sub-brace-on-new-line

The flag -sbl provides a shortcut way to turn on -bl just for named subs. The same effect can be achieved by turning on -bl with the block list set as -bll='sub'.

For example,

 perltidy -sbl

produces this result:

 sub message
 {
    if (!defined($_[0])) {
        print("Hello, World\n");
    }
    else {
        print($_[0], "\n");
    }
 }

This flag is negated with -nsbl, which is the default.

-asbl, --opening-anonymous-sub-brace-on-new-line

The flag -asbl is like the -sbl flag except that it applies to anonymous subs instead of named subs. For example

 perltidy -asbl

produces this result:

 $a = sub
 {
     if ( !defined( $_[0] ) ) {
         print("Hello, World\n");
     }
     else {
         print( $_[0], "\n" );
     }
 };

This flag is negated with -nasbl, and the default is -nasbl.

-bli, --brace-left-and-indent

The flag -bli is similar to the -bl flag but in addition it causes one unit of continuation indentation (see -ci) to be placed before the opening and closing block braces.

For example, perltidy -bli gives

        if ( $input_file eq '-' )
          {
            important_function();
          }

By default, this extra indentation occurs for block types: if, elsif, else, unless, while, for, foreach, do, and also named subs and blocks preceded by a label. The next item shows how to change this.

Note: The -bli flag is similar to the -bl flag, with the difference being that braces get indented. But these two flags are implemented independently, and have different default settings for historical reasons. If desired, a mixture of effects can be achieved by turning them both on with different -list settings. In the event that both settings are selected for a certain block type, the -bli style has priority.

-blil=s, --brace-left-and-indent-list=s

Use this parameter to change the types of block braces for which the -bli flag applies; see "Specifying Block Types".

The default is -blil='if else elsif unless while for foreach do : sub'.

-blixl=s, --brace-left-and-indent-exclusion-list=s

Use this parameter to exclude types of block braces for which the -bli flag applies; see "Specifying Block Types".

This might be useful in conjunction with selecting all blocks -blil='*'. The default setting is -blixl=' ', which does not exclude any blocks.

Note that the two parameters -blil and -blixl control the behavior of the -bli flag but have no effect unless the -bli flag is set.

-bar, --opening-brace-always-on-right

The default style, -nbl, places the opening code block brace on a new line if it does not fit on the same line as the opening keyword, like this:

        if ( $bigwasteofspace1 && $bigwasteofspace2
          || $bigwasteofspace3 && $bigwasteofspace4 )
        {
            big_waste_of_time();
        }

To force the opening brace to always be on the right, use the -bar flag. In this case, the above example becomes

        if ( $bigwasteofspace1 && $bigwasteofspace2
          || $bigwasteofspace3 && $bigwasteofspace4 ) {
            big_waste_of_time();
        }

A conflict occurs if both -bl and -bar are specified.

-otr, --opening-token-right and related flags

The -otr flag is a hint that perltidy should not place a break between a comma and an opening token. For example:

    # default formatting
    push @{ $self->{$module}{$key} },
      {
        accno       => $ref->{accno},
        description => $ref->{description}
      };

    # perltidy -otr
    push @{ $self->{$module}{$key} }, {
        accno       => $ref->{accno},
        description => $ref->{description}
      };

The flag -otr is actually an abbreviation for three other flags which can be used to control parens, hash braces, and square brackets separately if desired:

  -opr  or --opening-paren-right
  -ohbr or --opening-hash-brace-right
  -osbr or --opening-square-bracket-right

-bbhb=n, --break-before-hash-brace=n and related flags

When a list of items spans multiple lines, the default formatting is to place the opening brace (or other container token) at the end of the starting line, like this:

    $romanNumerals = {
        one   => 'I',
        two   => 'II',
        three => 'III',
        four  => 'IV',
    };

This flag can change the default behavior to cause a line break to be placed before the opening brace according to the value given to the integer n:

  -bbhb=0 never break [default]
  -bbhb=1 stable: break if the input script had a break
  -bbhb=2 break if list is 'complex' (part of nested list structure)
  -bbhb=3 always break

For example,

    # perltidy -bbhb=3
    $romanNumerals =
      {
        one   => 'I',
        two   => 'II',
        three => 'III',
        four  => 'IV',
      };

There are several points to note about this flag:

-bbhbi=n, --break-before-hash-brace-and-indent=n

This flag is a companion to -bbhb=n for controlling the indentation of an opening hash brace which is placed on a new line by that parameter. The indentation is as follows:

  -bbhbi=0 one continuation level [default]
  -bbhbi=1 outdent by one continuation level
  -bbhbi=2 indent one full indentation level

For example:

    # perltidy -bbhb=3 -bbhbi=1
    $romanNumerals =
    {
        one   => 'I',
        two   => 'II',
        three => 'III',
        four  => 'IV',
    };

    # perltidy -bbhb=3 -bbhbi=2
    $romanNumerals =
        {
        one   => 'I',
        two   => 'II',
        three => 'III',
        four  => 'IV',
        };

Note that this parameter has no effect unless -bbhb=n is also set.

-bbsb=n, --break-before-square-bracket=n

This flag is similar to -bbhb=n, described above, except it applies to lists contained within square brackets.

  -bbsb=0 never break [default]
  -bbsb=1 stable: break if the input script had a break
  -bbsb=2 break if list is 'complex' (part of nested list structure)
  -bbsb=3 always break

-bbsbi=n, --break-before-square-bracket-and-indent=n

This flag is a companion to -bbsb=n for controlling the indentation of an opening square bracket which is placed on a new line by that parameter. The indentation is as follows:

  -bbsbi=0 one continuation level [default]
  -bbsbi=1 outdent by one continuation level
  -bbsbi=2 indent one full indentation level

-bbp=n, --break-before-paren=n

This flag is similar to -bbhb=n, described above, except it applies to lists contained within parens.

  -bbp=0 never break [default]
  -bbp=1 stable: break if the input script had a break
  -bbp=2 break if list is 'complex' (part of nested list structure)
  -bbp=3 always break

-bbpi=n, --break-before-paren-and-indent=n

This flag is a companion to -bbp=n for controlling the indentation of an opening paren which is placed on a new line by that parameter. The indentation is as follows:

  -bbpi=0 one continuation level [default]
  -bbpi=1 outdent by one continuation level
  -bbpi=2 indent one full indentation level

Welding

-wn, --weld-nested-containers

The -wn flag causes closely nested pairs of opening and closing container symbols (curly braces, brackets, or parens) to be "welded" together, meaning that they are treated as if combined into a single unit, with the indentation of the innermost code reduced to be as if there were just a single container symbol.

For example:

        # default formatting
        do {
            {
                next if $x == $y;
            }
        } until $x++ > $z;

        # perltidy -wn
        do { {
            next if $x == $y;
        } } until $x++ > $z;

When this flag is set perltidy makes a preliminary pass through the file and identifies all nested pairs of containers. To qualify as a nested pair, the closing container symbols must be immediately adjacent and the opening symbols must either (1) be adjacent as in the above example, or (2) have an anonymous sub declaration following an outer opening container symbol which is not a code block brace, or (3) have an outer opening paren separated from the inner opening symbol by any single non-container symbol or something that looks like a function evaluation, as illustrated in the next examples. An additional option (4) which can be turned on with the flag --weld-fat-comma is when the opening container symbols are separated by a hash key and fat comma (=>).

Any container symbol may serve as both the inner container of one pair and as the outer container of an adjacent pair. Consequently, any number of adjacent opening or closing symbols may join together in a weld. For example, here are three levels of wrapped function calls:

        # default formatting
        my (@date_time) = Localtime(
            Date_to_Time(
                Add_Delta_DHMS(
                    $year, $month,  $day, $hour, $minute, $second,
                    '0',   $offset, '0',  '0'
                )
            )
        );

        # perltidy -wn
        my (@date_time) = Localtime( Date_to_Time( Add_Delta_DHMS(
            $year, $month,  $day, $hour, $minute, $second,
            '0',   $offset, '0',  '0'
        ) ) );

Notice how the indentation of the inner lines is reduced by two levels in this case. This example also shows the typical result of this formatting, namely it is a sandwich consisting of an initial opening layer, a central section of any complexity forming the "meat" of the sandwich, and a final closing layer. This predictable structure helps keep the compacted structure readable.

The inner sandwich layer is required to be at least one line thick. If this cannot be achieved, welding does not occur. This constraint can cause formatting to take a couple of iterations to stabilize when it is first applied to a script. The -conv flag can be used to ensure that the final format is achieved in a single run.

Here is an example illustrating a welded container within another welded container:

        # default formatting
        $x->badd(
            bmul(
                $class->new(
                    abs(
                        $sx * int( $xr->numify() ) & $sy * int( $yr->numify() )
                    )
                ),
                $m
            )
        );

        # perltidy -wn
        $x->badd( bmul(
            $class->new( abs(
                $sx * int( $xr->numify() ) & $sy * int( $yr->numify() )
            ) ),
            $m
        ) );

The welded closing tokens are by default on a separate line but this can be modified with the -vtc=n flag (described in the next section). For example, the same example adding -vtc=2 is

        # perltidy -wn -vtc=2
        $x->badd( bmul(
            $class->new( abs(
                $sx * int( $xr->numify() ) & $sy * int( $yr->numify() ) ) ),
            $m ) );

This format option is quite general but there are some limitations.

One limitation is that any line length limit still applies and can cause long welded sections to be broken into multiple lines.

Another limitation is that an opening symbol which delimits quoted text cannot be included in a welded pair. This is because quote delimiters are treated specially in perltidy.

Finally, the stacking of containers defined by this flag has priority over any other container stacking flags. This is because any welding is done first.

-wfc, --weld-fat-comma

When the -wfc flag is set, along with -wn, perltidy is allowed to weld an opening paren to an inner opening container when they are separated by a hash key and fat comma (=>). For example:

    # perltidy -wn -wfc
    $self->call_method( method_name_foo => {
        some_arg1       => $foo,
        some_other_arg3 => $bar->{'baz'},
    } );

This option is off by default.

-wnxl=s, --weld-nested-exclusion-list=s

The -wnxl=s flag provides some control over the types of containers which can be welded. The -wn flag by default is "greedy" in welding adjacent containers. If it welds more types of containers than desired, this flag provides a capability to reduce the amount of welding by specifying a list of things which should not be welded.

The logic in perltidy to apply this is straightforward. As each container token is being considered for joining a weld, any exclusion rules are consulted and used to reject the weld if necessary.

This list is a string with space-separated items. Each item consists of up to three pieces of information: (1) an optional position, (2) an optional preceding type, and (3) a container type.

The only required piece of information is a container type, which is one of '(', '[', '{' or 'q'. The first three of these are container tokens and the last represents a quoted list. For example the string

  -wnxl='[ { q'

means do NOT include square-brackets, braces, or quotes in any welds. The only unspecified container is '(', so this string means that only welds involving parens will be made.

To illustrate, the following welded snippet consists of a chain of three welded containers with types '(' '[' and 'q':

    # perltidy -wn
    skip_symbols( [ qw(
        Perl_dump_fds
        Perl_ErrorNo
        Perl_GetVars
        PL_sys_intern
    ) ] );

Even though the qw term uses parens as the quote delimiter, it has a special type 'q' here. If it appears in a weld it always appears at the end of the welded chain.

Any of the container types '[', '{', and '(' may be prefixed with a position indicator which is either '^', to indicate the first token of a welded sequence, or '.', to indicate an interior token of a welded sequence. (Since a quoted string 'q' always ends a chain it does not need a position indicator).

For example, if we do not want a sequence of welded containers to start with a square bracket we could use

  -wnxl='^['

In the above snippet, there is a square bracket but it does not start the chain, so the formatting would be unchanged if it were formatted with this restriction.

A third optional item of information which can be given is a letter which is used to limit the selection further depending on the type of token immediately before the container. If given, it goes just before the container symbol. The possible letters are currently 'k', 'K', 'f', 'F', 'w', and 'W', with these meanings:

 'k' matches if the previous nonblank token is a perl built-in keyword (such as 'if', 'while'),
 'K' matches if 'k' does not, meaning that the previous token is not a keyword.
 'f' matches if the previous token is a function other than a keyword.
 'F' matches if 'f' does not.
 'w' matches if either 'k' or 'f' match.
 'W' matches if 'w' does not.

For example, compare

        # perltidy -wn
        if ( defined( $_Cgi_Query{
            $Config{'methods'}{'authentication'}{'remote'}{'cgi'}{'username'}
        } ) )

with

        # perltidy -wn -wnxl='^K( {'
        if ( defined(
            $_Cgi_Query{ $Config{'methods'}{'authentication'}{'remote'}{'cgi'}
                  {'username'} }
        ) )

The first case does maximum welding. In the second case the leading paren is retained by the rule (it would have been rejected if preceded by a non-keyword) but the curly brace is rejected by the rule.
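
To make the item syntax concrete, here is a minimal Perl sketch that splits a -wnxl string into items and separates each item into its three parts. The function parse_wnxl_item and the hash keys it returns are names invented for this illustration; perltidy's own parsing may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical parser for one -wnxl item: an optional position ('^' or
# '.'), an optional preceding-token letter (k K f F w W), and then a
# required container type, one of ( [ { q.
sub parse_wnxl_item {
    my ($item) = @_;
    my ( $pos, $prev, $type ) = $item =~ /^([\^.]?)([kKfFwW]?)([(\[{q])$/
      or return;
    return { position => $pos, preceding => $prev, container => $type };
}

# Items in the list are separated by spaces.
for my $item ( split ' ', '^K( { .w( q' ) {
    my $rule = parse_wnxl_item($item);
    if ( !$rule ) { print "'$item': not recognized\n"; next }

    # Show '-' for optional parts which were not given.
    my @fields = map { length( $rule->{$_} ) ? $rule->{$_} : '-' }
      qw(position preceding container);
    printf "%-6s position=%s preceding=%s container=%s\n", "'$item'", @fields;
}
```

For the item '^K(' this reports a leading position, a preceding non-keyword, and a paren container, which corresponds to the rule used in the second example above.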

Here are some additional example strings and their meanings:

    '^('   - the weld must not start with a paren
    '.('   - the second and later tokens may not be parens
    '.w('  - the second and later tokens may not be keyword or function call parens
    '('    - no parens in a weld
    '^K('  - exclude a leading paren preceded by a non-keyword
    '.k('  - exclude a secondary paren preceded by a keyword
    '[ {'  - exclude all brackets and braces
    '[ ( ^K{' - exclude everything except nested structures like do {{  ... }}

Vertical tightness of non-block curly braces, parentheses, and square brackets.

These parameters control what shall be called vertical tightness. Here are the main points:

Here are some examples:

    # perltidy -lp -vt=0 -vtc=0
    %romanNumerals = (
                       one   => 'I',
                       two   => 'II',
                       three => 'III',
                       four  => 'IV',
    );

    # perltidy -lp -vt=1 -vtc=0
    %romanNumerals = ( one   => 'I',
                       two   => 'II',
                       three => 'III',
                       four  => 'IV',
    );

    # perltidy -lp -vt=1 -vtc=1
    %romanNumerals = ( one   => 'I',
                       two   => 'II',
                       three => 'III',
                       four  => 'IV', );

    # perltidy -vtc=3
    my_function(
        one   => 'I',
        two   => 'II',
        three => 'III',
        four  => 'IV', );

    # perltidy -vtc=3
    %romanNumerals = (
        one   => 'I',
        two   => 'II',
        three => 'III',
        four  => 'IV',
    );

In the last example for -vtc=3, the opening paren is preceded by an equals so the closing paren is placed on a new line.

The difference between -vt=1 and -vt=2 is shown here:

    # perltidy -lp -vt=1
    $init->add(
                mysprintf( "(void)find_threadsv(%s);",
                           cstring( $threadsv_names[ $op->targ ] )
                )
    );

    # perltidy -lp -vt=2
    $init->add( mysprintf( "(void)find_threadsv(%s);",
                           cstring( $threadsv_names[ $op->targ ] )
                )
    );

With -vt=1, the line ending in add( does not combine with the next line because the next line is not balanced. This can help with readability, but -vt=2 can be used to ignore this rule.

The tightest, and least readable, code is produced with both -vt=2 and -vtc=2:

    # perltidy -lp -vt=2 -vtc=2
    $init->add( mysprintf( "(void)find_threadsv(%s);",
                           cstring( $threadsv_names[ $op->targ ] ) ) );

Notice how the code in all of these examples collapses vertically as -vt increases, but the indentation remains unchanged. This is because perltidy implements the -vt parameter by first formatting as if -vt=0, and then simply overwriting one output line on top of the next, if possible, to achieve the desired vertical tightness. The -lp indentation style has been designed to allow this vertical collapse to occur, which is why it is required for the -vt parameter.

The -vt=n and -vtc=n parameters apply to each type of container token. If desired, vertical tightness controls can be applied independently to each of the container token types.

The parameters for controlling parentheses are -pvt=n or --paren-vertical-tightness=n, and -pvtc=n or --paren-vertical-tightness-closing=n.

Likewise, the parameters for square brackets are -sbvt=n or --square-bracket-vertical-tightness=n, and -sbvtc=n or --square-bracket-vertical-tightness-closing=n.

Finally, the parameters for controlling non-code block braces are -bvt=n or --brace-vertical-tightness=n, and -bvtc=n or --brace-vertical-tightness-closing=n.

In fact, the parameter -vt=n is actually just an abbreviation for -pvt=n -bvt=n -sbvt=n, and likewise -vtc=n is an abbreviation for -pvtc=n -bvtc=n -sbvtc=n.

-bbvt=n or --block-brace-vertical-tightness=n

The -bbvt=n flag is just like the -vt=n flag but applies to opening code block braces.

 -bbvt=0 break after opening block brace (default).
 -bbvt=1 do not break unless this would produce more than one
         step in indentation in a line.
 -bbvt=2 do not break after opening block brace.

It is necessary to also use either -bl or -bli for this to work, because, as with other vertical tightness controls, it is implemented by simply overwriting a line ending with an opening block brace with the subsequent line. For example:

    # perltidy -bli -bbvt=0
    if ( open( FILE, "< $File" ) )
      {
        while ( $File = <FILE> )
          {
            $In .= $File;
            $count++;
          }
        close(FILE);
      }

    # perltidy -bli -bbvt=1
    if ( open( FILE, "< $File" ) )
      { while ( $File = <FILE> )
          { $In .= $File;
            $count++;
          }
        close(FILE);
      }

By default this applies to blocks associated with keywords if, elsif, else, unless, for, foreach, sub, while, until, and also with a preceding label. This can be changed with the parameter -bbvtl=string, or --block-brace-vertical-tightness-list=string, where string is a space-separated list of block types. For more information on the possible values of this string, see "Specifying Block Types".

For example, if we want to just apply this style to if, elsif, and else blocks, we could use perltidy -bli -bbvt=1 -bbvtl='if elsif else'.

There is no vertical tightness control for closing block braces; with one exception they will be placed on separate lines. The exception is that a cascade of closing block braces may be stacked on a single line. See -scbb.

-sot, --stack-opening-tokens and related flags

The -sot flag tells perltidy to "stack" opening tokens when possible to avoid lines with isolated opening tokens.

For example:

    # default
    $opt_c = Text::CSV_XS->new(
        {
            binary       => 1,
            sep_char     => $opt_c,
            always_quote => 1,
        }
    );

    # -sot
    $opt_c = Text::CSV_XS->new( {
            binary       => 1,
            sep_char     => $opt_c,
            always_quote => 1,
        }
    );

For detailed control of individual opening tokens the following controls can be used:

  -sop  or --stack-opening-paren
  -sohb or --stack-opening-hash-brace
  -sosb or --stack-opening-square-bracket
  -sobb or --stack-opening-block-brace

The flag -sot is an abbreviation for -sop -sohb -sosb.

The flag -sobb is an abbreviation for -bbvt=2 -bbvtl='*'. This will cause a cascade of opening block braces to appear on a single line, although this is an uncommon occurrence except in test scripts.

-sct, --stack-closing-tokens and related flags

The -sct flag tells perltidy to "stack" closing tokens when possible to avoid lines with isolated closing tokens.

For example:

    # default
    $opt_c = Text::CSV_XS->new(
        {
            binary       => 1,
            sep_char     => $opt_c,
            always_quote => 1,
        }
    );

    # -sct
    $opt_c = Text::CSV_XS->new(
        {
            binary       => 1,
            sep_char     => $opt_c,
            always_quote => 1,
        } );

The -sct flag is somewhat similar to the -vtc flags, and in some cases it can give a similar result. The difference is that the -vtc flags try to avoid lines with leading closing tokens by "hiding" them at the end of a previous line, whereas the -sct flag merely tries to reduce the number of lines with isolated closing tokens by stacking them but does not try to hide them. For example:

    # -vtc=2
    $opt_c = Text::CSV_XS->new(
        {
            binary       => 1,
            sep_char     => $opt_c,
            always_quote => 1, } );

For detailed control of the stacking of individual closing tokens the following controls can be used:

  -scp  or --stack-closing-paren
  -schb or --stack-closing-hash-brace
  -scsb or --stack-closing-square-bracket
  -scbb or --stack-closing-block-brace

The flag -sct is an abbreviation for stacking the non-block closing tokens, -scp -schb -scsb.

Stacking of closing block braces, -scbb, causes a cascade of isolated closing block braces to be combined into a single line as in the following example:

    # -scbb:
    for $w1 (@w1) {
        for $w2 (@w2) {
            for $w3 (@w3) {
                for $w4 (@w4) {
                    push( @lines, "$w1 $w2 $w3 $w4\n" );
                } } } }

To simplify input even further for the case in which both opening and closing non-block containers are stacked, the flag -sac or --stack-all-containers is an abbreviation for -sot -sct.
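
Several flags in this part of the document are described as abbreviations for groups of other flags. The Perl sketch below collects those stated abbreviations in one table and expands them recursively. The table and the function expand_flag are assumptions made for illustration; they only restate the equivalences given in the text and are not perltidy's internal abbreviation mechanism.

```perl
use strict;
use warnings;

# Abbreviations quoted in this document. For "=n" flags the value is
# carried over to each expanded flag.
my %ABBREV = (
    '-otr'  => [qw(-opr -ohbr -osbr)],
    '-vt'   => [qw(-pvt -bvt -sbvt)],
    '-vtc'  => [qw(-pvtc -bvtc -sbvtc)],
    '-sot'  => [qw(-sop -sohb -sosb)],
    '-sct'  => [qw(-scp -schb -scsb)],
    '-sac'  => [qw(-sot -sct)],
    '-sobb' => [ '-bbvt=2', q{-bbvtl='*'} ],
);

# Hypothetical helper: expand one flag into the flags it stands for.
# Flags which are not abbreviations are returned unchanged.
sub expand_flag {
    my ($flag) = @_;
    my ( $name, $value ) = split /=/, $flag, 2;
    my $expansion = $ABBREV{$name} or return $flag;
    return map { expand_flag( defined $value ? "$_=$value" : $_ ) } @$expansion;
}

for my $flag (qw(-vt=1 -vtc=2 -sac -sobb -bl)) {
    print "$flag => ", join( ' ', expand_flag($flag) ), "\n";
}
```

Expanding -sac goes through two levels, first to -sot -sct and then to the six individual stacking flags, which is why the helper calls itself on each expanded item.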

Please note that if both opening and closing tokens are to be stacked, then the newer flag --weld-nested-containers may be preferable because it ensures that stacking is always done symmetrically. It also removes an extra level of unnecessary indentation within welded containers. It is able to do this because it works on formatting globally rather than locally, as the -sot and -sct flags do.

Breaking Before or After Operators

Four command line parameters provide some control over whether a line break should be before or after specific token types. Two parameters give detailed control:

-wba=s or --want-break-after=s, and

-wbb=s or --want-break-before=s.

These parameters are each followed by a quoted string, s, containing a list of token types (separated only by spaces). No more than one of each of these parameters should be specified, because repeating a command-line parameter always overwrites the previous one before perltidy ever sees it.

By default, perltidy breaks after these token types: % + - * / x != == >= <= =~ !~ < > | & = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=

And perltidy breaks before these token types by default: . << >> -> && || //

To illustrate, to cause a break after a concatenation operator, '.', rather than before it, the command line would be

  -wba="."

As another example, the following command would cause a break before math operators '+', '-', '/', and '*':

  -wbb="+ - / *"

These commands should work well for most of the token types that perltidy uses (use --dump-token-types for a list). Also try the -D flag on a short snippet of code and look at the .DEBUG file to see the tokenization. However, for a few token types there may be conflicts with hardwired logic which cause unexpected results. One example is curly braces, which should be controlled with the parameter -bl provided for that purpose.

WARNING Be sure to put these tokens in quotes to avoid having them misinterpreted by your command shell.

Two additional parameters are available which, though they provide no further capability, can simplify input:

-baao or --break-after-all-operators,

-bbao or --break-before-all-operators.

The -baao flag sets the default to break after all of the following operators:

    % + - * / x != == >= <= =~ !~ < > | &
    = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=
    . : ? && || and or err xor

and the -bbao flag sets the default to break before all of these operators. These can be used to define an initial break preference which can be fine-tuned with the -wba and -wbb flags. For example, to break before all operators except an = one could use --bbao -wba='=' rather than listing every single perl operator except = on a -wbb flag.
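
The way these four parameters layer on top of each other can be sketched as follows. This is a simplified model, assuming that -baao or -bbao first sets a preference for every listed operator and that -wba and -wbb then override individual tokens; the function break_preferences and its option names are hypothetical and do not correspond to perltidy internals.

```perl
use strict;
use warnings;

# Default preferences quoted in the text above.
my @DEFAULT_AFTER = qw(
  % + - * / x != == >= <= =~ !~ < > | &
  = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=
);
my @DEFAULT_BEFORE = qw( . << >> -> && || // );

# Operators affected by -baao and -bbao, as listed above.
my @ALL_OPERATORS = ( @DEFAULT_AFTER, qw( . : ? && || and or err xor ) );

# Hypothetical helper: resolve 'before' or 'after' for each token.
# %opt may contain baao/bbao (booleans) and wba/wbb (token strings).
sub break_preferences {
    my (%opt) = @_;
    my %pref;
    $pref{$_} = 'after'  for @DEFAULT_AFTER;
    $pref{$_} = 'before' for @DEFAULT_BEFORE;

    # -baao / -bbao set an initial preference for all listed operators
    if ( $opt{baao} ) { $pref{$_} = 'after'  for @ALL_OPERATORS }
    if ( $opt{bbao} ) { $pref{$_} = 'before' for @ALL_OPERATORS }

    # -wba / -wbb then fine-tune individual tokens
    $pref{$_} = 'after'  for split ' ', $opt{wba} // '';
    $pref{$_} = 'before' for split ' ', $opt{wbb} // '';
    return \%pref;
}

# Model of the example above: break before all operators except '='
my $pref = break_preferences( bbao => 1, wba => '=' );
print "'$_' breaks $pref->{$_}\n" for qw( = + . && );
```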

-bal=n, --break-after-labels=n

This flag controls whether or not a line break occurs after a label. There are three possible values for n:

  -bal=0  break if there is a break in the input [DEFAULT]
  -bal=1  always break after a label
  -bal=2  never break after a label

For example,

      # perltidy -bal=1
      RETURN:
        return;

      # perltidy -bal=2
      RETURN: return;

Controlling List Formatting

Perltidy attempts to format lists of comma-separated values in tables which look good. Its default algorithms usually work well, but sometimes they don't. In this case, there are several methods available to control list formatting.

A very simple way to prevent perltidy from changing the line breaks within a comma-separated list of values is to insert a blank line, comment, or side-comment anywhere between the opening and closing parens (or braces or brackets). This causes perltidy to skip over its list formatting logic. (The reason is that any of these items put a constraint on line breaks, and perltidy needs complete control over line breaks within a container to adjust a list layout). For example, let us consider

    my @list = (1,
                1, 1,
                1, 2, 1,
                1, 3, 3, 1,
                1, 4, 6, 4, 1,);

The default formatting, which allows a maximum line length of 80, will flatten this down to one line:

    # perltidy (default)
    my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1, );

This formatting loses the nice structure. If we place a side comment anywhere between the opening and closing parens, the original line break points are retained. For example,

    my @list = (
        1,    # a side comment forces the original line breakpoints to be kept
        1, 1,
        1, 2, 1,
        1, 3, 3, 1,
        1, 4, 6, 4, 1,
    );

The side comment can be a single hash symbol without any text. We could achieve the same result with a blank line or full comment anywhere between the opening and closing parens. Vertical alignment of the list items will still occur if possible.

For another possibility see the -fs flag in "Skipping Selected Sections of Code".
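
One way to experiment with these techniques is to run perltidy on a string from a short Perl script and compare the results with and without a side comment. The sketch below assumes that the Perl::Tidy module is installed and calls its perltidy function with the source, destination, perltidyrc, argv and stderr parameters; the wrapper tidy_string is a name chosen for this example. If the module cannot be loaded, the wrapper simply returns undef.

```perl
use strict;
use warnings;

# Format a string with perltidy using only the given flags (passed in
# place of a .perltidyrc file). Returns the formatted text, or undef
# if Perl::Tidy is unavailable or reports an error.
sub tidy_string {
    my ( $source, $flags ) = @_;
    return unless eval { require Perl::Tidy; 1 };

    my ( $output, $errors ) = ( '', '' );
    my $failed = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$flags,
        argv        => '',          # hide any @ARGV from perltidy
        stderr      => \$errors,
    );
    return if $failed;
    return $output;
}

my $list = <<'END';
my @list = (1,
            1, 1,
            1, 2, 1,
            1, 3, 3, 1,
            1, 4, 6, 4, 1,);
END

# Same list, with an empty side comment after the first item.
( my $with_comment = $list ) =~ s/\(1,/(1, #/;

for my $input ( $list, $with_comment ) {
    my $result = tidy_string( $input, '' );
    print defined $result ? $result : "Perl::Tidy not available\n";
    print "----\n";
}
```

According to the description above, the first result should be flattened and the second should keep the original line breaks. Other flags from this section, such as -boc or -mft=2, can be tried by passing them as the second argument.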

-boc, --break-at-old-comma-breakpoints

The -boc flag is another way to prevent comma-separated lists from being reformatted. Using -boc on the above example, plus additional flags to retain the original style, yields

    # perltidy -boc -lp -pt=2 -vt=1 -vtc=1
    my @list = (1,
                1, 1,
                1, 2, 1,
                1, 3, 3, 1,
                1, 4, 6, 4, 1,);

A disadvantage of this flag compared to the methods discussed above is that all tables in the file must already be nicely formatted.

-mft=n, --maximum-fields-per-table=n

If n is a positive number, and the computed number of fields for any table exceeds n, then it will be reduced to n. This parameter might be used on a small section of code to force a list to have a particular number of fields per line, and then either the -boc flag could be used to retain this formatting, or a single comment could be introduced somewhere to freeze the formatting in future applications of perltidy. For example

    # perltidy -mft=2
    @month_of_year = (
        'Jan', 'Feb',
        'Mar', 'Apr',
        'May', 'Jun',
        'Jul', 'Aug',
        'Sep', 'Oct',
        'Nov', 'Dec'
    );

The default value is n=0, which does not place a limit on the number of fields in a table.
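
The effect of the limit can be pictured with a small model. The sketch below is not perltidy's layout code; it only assumes that some number of fields per row has already been computed, and shows how a positive n would cap it. The helper names limit_fields and rows_for are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: cap a computed field count at $mft when $mft > 0.
# With $mft == 0 (the default) no limit is applied.
sub limit_fields {
    my ( $computed, $mft ) = @_;
    return ( $mft > 0 && $computed > $mft ) ? $mft : $computed;
}

# Hypothetical helper: split a flat list into rows of $fields items each.
# Assumes $fields is at least 1.
sub rows_for {
    my ( $fields, @items ) = @_;
    my @rows;
    push @rows, [ splice( @items, 0, $fields ) ] while @items;
    return @rows;
}

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my $fields = limit_fields( 6, 2 );    # assume 6 fields had been computed
print join( ', ', @$_ ), "\n" for rows_for( $fields, @months );
```

With the assumed starting value of 6 fields, -mft=2 in this model yields six rows of two items, matching the shape of the example above.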

-cab=n, --comma-arrow-breakpoints=n

A comma which follows a comma arrow, '=>', is given special consideration. In a long list, it is common to break at all such commas. This parameter can be used to control how perltidy breaks at these commas. (However, it will have no effect if old comma breaks are being forced because -boc is used). The possible values of n are:

 n=0 break at all commas after =>
 n=1 stable: break at all commas after => if container is open,
     EXCEPT FOR one-line containers
 n=2 break at all commas after =>, BUT try to form the maximum
     one-line container lengths
 n=3 do not treat commas after => specially at all
 n=4 break everything: like n=0 but ALSO break a short container with
     a => not followed by a comma when -vt=0 is used
 n=5 stable: like n=1 but ALSO break at open one-line containers when
     -vt=0 is used (default)

For example, given the following single line, perltidy by default will not add any line breaks because it would break the existing one-line container:

    bless { B => $B, Root => $Root } => $package;

Using -cab=0 will force a break after each comma-arrow item:

    # perltidy -cab=0:
    bless {
        B    => $B,
        Root => $Root
    } => $package;

If perltidy is subsequently run with this container broken, then by default it will break after each '=>' because the container is now broken. To reform a one-line container, the parameter -cab=2 could be used.

The flag -cab=3 can be used to prevent these commas from being treated specially. In this case, an item such as "01" => 31 is treated as a single item in a table. The number of fields in this table will be determined by the same rules that are used for any other table. Here is an example.

    # perltidy -cab=3
    my %last_day = (
        "01" => 31, "02" => 29, "03" => 31, "04" => 30,
        "05" => 31, "06" => 30, "07" => 31, "08" => 31,
        "09" => 30, "10" => 31, "11" => 30, "12" => 31
    );

Adding and Deleting Commas

-drc, --delete-repeated-commas

Repeated commas in a list are undesirable and can be removed with this flag. For example, given this list with a repeated comma

      ignoreSpec( $file, "file",, \%spec, \%Rspec );

we can remove it with -drc

      # perltidy -drc:
      ignoreSpec( $file, "file", \%spec, \%Rspec );

Since the default is not to add or delete commas, this feature is off by default and must be requested.
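
In Perl a repeated comma inside a list does not add an element, so removing it should not change what the call receives. A minimal check, using a hypothetical sub count_args and placeholder variables:

```perl
use strict;
use warnings;

# Hypothetical helper: report how many arguments were received.
sub count_args { return scalar @_ }

my $file = 'a.txt';    # placeholder value
my ( %spec, %Rspec );

# The doubled comma after "file" does not create an extra argument.
my $with_repeat    = count_args( $file, "file",, \%spec, \%Rspec );
my $without_repeat = count_args( $file, "file", \%spec, \%Rspec );

print "with repeated comma: $with_repeat, without: $without_repeat\n";
```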

--want-trailing-commas=s or -wtc=s, --add-trailing-commas or -atc, and --delete-trailing-commas or -dtc

A trailing comma is a comma following the last item of a list. Perl allows trailing commas but they are not required. By default, perltidy does not add or delete trailing commas, but it is possible to manipulate them with the following set of three related parameters:

  --want-trailing-commas=s, -wtc=s - defines where trailing commas are wanted
  --add-trailing-commas,    -atc   - gives permission to add trailing commas to match the style wanted
  --delete-trailing-commas, -dtc   - gives permission to delete trailing commas which do not match the style wanted

The parameter --want-trailing-commas=s, or -wtc=s, defines a preferred style. The string s indicates which lists should get trailing commas, as follows:

  s=0 : no list should have a trailing comma
  s=1 or * : every list should have a trailing comma
  s=m a multi-line list should have a trailing comma
  s=b trailing commas should be 'bare' (comma followed by newline)
  s=h lists of key=>value pairs, with about one '=>' and one ',' per line,
      with a bare trailing comma
  s=i lists with about one comma per line, with a bare trailing comma
  s=' ' or -wtc not defined : leave trailing commas unchanged [DEFAULT].

This parameter by itself only indicates where trailing commas are wanted. Perltidy only adds these trailing commas if the flag --add-trailing-commas, or -atc, is set. And perltidy only removes unwanted trailing commas if the flag --delete-trailing-commas, or -dtc, is set.

Here are some example parameter combinations and their meanings

  -wtc=0 -dtc   : delete all trailing commas
  -wtc=1 -atc   : all lists get trailing commas
  -wtc=m -atc   : all multi-line lists get trailing commas, but
                  single line lists remain unchanged.
  -wtc=m -dtc   : multi-line lists remain unchanged, but
                  any trailing commas on single line lists are removed.
  -wtc=m -atc -dtc  : all multi-line lists get trailing commas, and
                      any trailing commas on single line lists are removed.
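
The combinations above follow from one rule: the style says whether a comma is wanted, and -atc and -dtc give permission to act on it. The sketch below is a simplified model of that rule for the styles 0, 1 (or *) and m only. It is not perltidy's implementation, and the sub name and the keys has_comma and multiline are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical model of the three-flag interaction for styles 0, 1, * and m.
# %list describes one list: has_comma (a trailing comma is present) and
# multiline (the list spans several lines).
# Returns 'add', 'delete' or 'keep'.
sub trailing_comma_action {
    my ( $style, $atc, $dtc, %list ) = @_;

    # Is a trailing comma wanted for this list under the given style?
    my $wanted;
    if    ( $style eq '0' )                  { $wanted = 0 }
    elsif ( $style eq '1' || $style eq '*' ) { $wanted = 1 }
    elsif ( $style eq 'm' )                  { $wanted = $list{multiline} ? 1 : 0 }
    else                                     { return 'keep' }   # not modeled

    return 'add'    if $wanted  && !$list{has_comma} && $atc;
    return 'delete' if !$wanted && $list{has_comma}  && $dtc;
    return 'keep';
}

# -wtc=m -atc applied to a multi-line list without a trailing comma
print trailing_comma_action( 'm', 1, 0, multiline => 1, has_comma => 0 ), "\n";
```

The styles b, h and i depend on layout details that this model does not represent, so it returns 'keep' for them.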

For example, given the following input without a trailing comma

    bless {
        B    => $B,
        Root => $Root
    } => $package;

we can add a trailing comma after the variable $Root using

    # perltidy -wtc=m -atc
    bless {
        B    => $B,
        Root => $Root,
    } => $package;

This could also be achieved in this case with -wtc=b instead of -wtc=m because the trailing comma here is bare (separated from its closing brace by a newline). And it could also be achieved with -wtc=h because this particular list is a list of key=>value pairs.

The above styles should cover the main situations of interest, but it is possible to apply a different style to each type of container token by including an opening token ahead of the style character in the above table. For example

    -wtc='(m [b'

means that lists within parens should have multi-line trailing commas, and that lists within square brackets have bare trailing commas. Since there is no specification for curly braces in this example, their trailing commas would remain unchanged.

For parentheses, an additional item of information which can be given is an alphanumeric letter which is used to limit the selection further depending on the type of token immediately before the opening paren. The possible letters are currently 'k', 'K', 'f', 'F', 'w', and 'W', with these meanings for matching whatever precedes an opening paren:

 'k' matches if the previous nonblank token is a perl built-in keyword (such as 'if', 'while'),
 'K' matches if 'k' does not, meaning that the previous token is not a keyword.
 'f' matches if the previous token is a function other than a keyword.
 'F' matches if 'f' does not.
 'w' matches if either 'k' or 'f' match.
 'W' matches if 'w' does not.

These are the same codes used for --line-up-parentheses-inclusion-list. For example,

  -wtc='w(m'

means that trailing commas are wanted for multi-line parenthesized lists following a function call or keyword.
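
The six letter codes reduce to two yes/no questions about the token before the paren. The following sketch assumes that token has already been classified as 'keyword', 'function' or 'other'; how perltidy performs that classification is not modeled, and the sub name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical model: $prev_type classifies the token before the opening
# paren as 'keyword', 'function', or 'other'.
sub paren_code_matches {
    my ( $code, $prev_type ) = @_;
    my $k = $prev_type eq 'keyword';
    my $f = $prev_type eq 'function';
    my %match = (
        k => $k,
        K => !$k,
        f => $f,
        F => !$f,
        w => ( $k || $f ),
        W => !( $k || $f ),
    );
    return $match{$code} ? 1 : 0;
}

# Show which kinds of preceding token each code accepts.
for my $code (qw(k K f F w W)) {
    printf "%s: %s\n", $code,
      join ' ', grep { paren_code_matches( $code, $_ ) } qw(keyword function other);
}
```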

Here are some points to note regarding adding and deleting trailing commas:

-dwic, --delete-weld-interfering-commas

If the closing tokens of two nested containers are separated by a comma, then welding requested with --weld-nested-containers cannot occur. Any commas in this situation are optional trailing commas and can be removed with -dwic. For example, a comma in this script prevents welding:

    # perltidy -wn
    skip_symbols(
        [ qw(
            Perl_dump_fds
            Perl_ErrorNo
            Perl_GetVars
            PL_sys_intern
        ) ],
    );

Using -dwic removes the comma and allows welding:

    # perltidy -wn -dwic
    skip_symbols( [ qw(
        Perl_dump_fds
        Perl_ErrorNo
        Perl_GetVars
        PL_sys_intern
    ) ] );

Since the default is not to add or delete commas, this feature is off by default. Here are some points to note about the -dwic parameter

Retaining or Ignoring Existing Line Breaks

Several additional parameters are available for controlling the extent to which line breaks in the input script influence the output script. In most cases, the default parameter values are set so that, if a choice is possible, the output style follows the input style. For example, if a short logical container is broken in the input script, then the default behavior is for it to remain broken in the output script.

Most of the parameters in this section would only be required for a one-time conversion of a script from short container lengths to longer container lengths. The opposite effect, of converting long container lengths to shorter lengths, can be obtained by temporarily using a short maximum line length.

-bol, --break-at-old-logical-breakpoints

By default, if a logical expression is broken at a &&, ||, and, or or, then the container will remain broken. Also, breaks at internal keywords if and unless will normally be retained. To prevent this, and thus form longer lines, use -nbol.

Please note that this flag does not duplicate old logical breakpoints. They are merely used as a hint with this flag that a statement should remain broken. Without this flag, perltidy will normally try to combine relatively short expressions into a single line.

For example, given this snippet:

    return unless $cmd = $cmd || ($dot
        && $Last_Shell) || &prompt('|');

    # perltidy -bol [default]
    return
      unless $cmd = $cmd
      || ( $dot
        && $Last_Shell )
      || &prompt('|');

    # perltidy -nbol
    return unless $cmd = $cmd || ( $dot && $Last_Shell ) || &prompt('|');

-bom, --break-at-old-method-breakpoints

By default, a method call arrow -> is considered a candidate for a breakpoint, but method chains will fill to the line width before a break is considered. With -bom, breaks before the arrow are preserved, so if you have pre-formatted a method chain:

  my $q = $rs
    ->related_resultset('CDs')
    ->related_resultset('Tracks')
    ->search({
      'track.id' => {-ident => 'none_search.id'},
    })->as_query;

It will keep these breaks, rather than become this:

  my $q = $rs->related_resultset('CDs')->related_resultset('Tracks')->search({
      'track.id' => {-ident => 'none_search.id'},
    })->as_query;

This flag will also look for and keep a 'cuddled' style of calls, in which lines begin with a closing paren followed by a call arrow, as in this example:

  # perltidy -bom -wn
  my $q = $rs->related_resultset(
      'CDs'
  )->related_resultset(
      'Tracks'
  )->search( {
      'track.id' => { -ident => 'none_search.id' },
  } )->as_query;

You may want to include the --weld-nested-containers flag in this case to keep nested braces and parens together, as in the last line.

-bos, --break-at-old-semicolon-breakpoints

Semicolons are normally placed at the end of a statement. This means that formatted lines do not normally begin with semicolons. If the input stream has some lines which begin with semicolons, these can be retained by setting this flag. For example, consider the following two-line input snippet:

  $z = sqrt($x**2 + $y**2)
  ;

The default formatting will be:

  $z = sqrt( $x**2 + $y**2 );

The result using perltidy -bos keeps the isolated semicolon:

  $z = sqrt( $x**2 + $y**2 )
    ;

The default is not to do this, -nbos.

-bok, --break-at-old-keyword-breakpoints

By default, perltidy will retain a breakpoint before keywords which may return lists, such as sort and map. This allows chains of these operators to be displayed one per line. Use -nbok to prevent retaining these breakpoints.

-bot, --break-at-old-ternary-breakpoints

By default, if a conditional (ternary) operator is broken at a :, then it will remain broken. To prevent this, and thereby form longer lines, use -nbot.

-boa, --break-at-old-attribute-breakpoints

By default, if an attribute list is broken at a : in the source file, then it will remain broken. For example, given the following code, the line breaks at the ':'s will be retained:

                    my @field
                      : field
                      : Default(1)
                      : Get('Name' => 'foo') : Set('Name');

If the attributes are on a single line in the source code then they will remain on a single line if possible.

To prevent old attribute breaks from being retained, and thereby always form longer lines, use -nboa.

Keeping old breakpoints at specific token types

It is possible to override the choice of line breaks made by perltidy, and force it to follow certain line breaks in the input stream, with these two parameters:

-kbb=s or --keep-old-breakpoints-before=s, and

-kba=s or --keep-old-breakpoints-after=s

These parameters are each followed by a quoted string, s, containing a list of token types (separated only by spaces). No more than one of each of these parameters should be specified, because repeating a command-line parameter always overwrites the previous one before perltidy ever sees it.

For example, -kbb='=>' means that if an input line begins with a '=>' then the output script should also have a line break before that token.

For example, given the script:

    method 'foo'
      => [ Int, Int ]
      => sub {
        my ( $self, $x, $y ) = ( shift, @_ );
        ...;
      };

    # perltidy [default]
    method 'foo' => [ Int, Int ] => sub {
        my ( $self, $x, $y ) = ( shift, @_ );
        ...;
    };

    # perltidy -kbb='=>'
    method 'foo'
      => [ Int, Int ]
      => sub {
        my ( $self, $x, $y ) = ( shift, @_ );
        ...;
      };

For the container tokens '{', '[' and '(', and their closing counterparts, use the token symbol. Thus, the command to keep a break after all opening parens is:

   perltidy -kba='('

It is possible to be more specific in matching parentheses by preceding them with a letter. The possible letters are 'k', 'K', 'f', 'F', 'w', and 'W', with these meanings (these are the same as used in the --weld-nested-exclusion-list and --line-up-parentheses-exclusion-list parameters):

 'k' matches if the previous nonblank token is a perl built-in keyword (such as 'if', 'while'),
 'K' matches if 'k' does not, meaning that the previous token is not a keyword.
 'f' matches if the previous token is a function other than a keyword.
 'F' matches if 'f' does not.
 'w' matches if either 'k' or 'f' match.
 'W' matches if 'w' does not.

So, for example, the following parameter will keep breaks after opening function call parens:

   perltidy -kba='f('

NOTE: A request to break before an opening container, such as -kbb='(', will be silently ignored because it can lead to formatting instability. Likewise, a request to break after a closing container, such as -kba=')', will also be silently ignored.
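
The rule in the NOTE can be expressed as a filter over the space-separated token list. This is only a sketch of the documented behavior, not perltidy's parsing code. The sub name is hypothetical, and it assumes that a letter prefix appears only before an opening paren.

```perl
use strict;
use warnings;

# Hypothetical model: split the quoted string on spaces, then drop the
# requests that would be silently ignored.
#   $which is 'before' (for -kbb) or 'after' (for -kba).
sub effective_tokens {
    my ( $which, $s ) = @_;
    my @tokens = split ' ', $s;

    # -kbb ignores opening container tokens; -kba ignores closing ones.
    my $ignored =
      $which eq 'before'
      ? qr/^(?:[kKfFwW]?\(|[\[{])$/
      : qr/^[)\]}]$/;

    return grep { $_ !~ $ignored } @tokens;
}

print join( ' ', effective_tokens( 'before', '=> (' ) ),  "\n";
print join( ' ', effective_tokens( 'after',  'f( )' ) ), "\n";
```

In this model -kbb='=> (' is reduced to the single token '=>', and -kba='f( )' to 'f('.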

-iob, --ignore-old-breakpoints

Use this flag to tell perltidy to ignore existing line breaks to the maximum extent possible. This will tend to produce the longest possible containers, regardless of type, which do not exceed the line length limit. But please note that this parameter has priority over all other parameters requesting that certain old breakpoints be kept.

To illustrate, consider the following input text:

    has subcmds => (
        is => 'ro',
        default => sub { [] },
    );

The default formatting will keep the container broken, giving

    # perltidy [default]
    has subcmds => (
        is      => 'ro',
        default => sub { [] },
    );

If old breakpoints are ignored, the list will be flattened:

    # perltidy -iob
    has subcmds => ( is => 'ro', default => sub { [] }, );

Besides flattening lists, this parameter also applies to lines broken at certain logical breakpoints such as 'if' and 'or'.

Even if this parameter is not used globally, it provides a convenient way to flatten selected lists from within an editor.

-kis, --keep-interior-semicolons

Use the -kis flag to prevent breaking at a semicolon if there was no break there in the input file. Normally perltidy places a newline after each semicolon which terminates a statement unless several statements are contained within a one-line brace block. To illustrate, consider the following input lines:

    dbmclose(%verb_delim); undef %verb_delim;
    dbmclose(%expanded); undef %expanded;

The default is to break after each statement, giving

    dbmclose(%verb_delim);
    undef %verb_delim;
    dbmclose(%expanded);
    undef %expanded;

With perltidy -kis the multiple statements are retained:

    dbmclose(%verb_delim); undef %verb_delim;
    dbmclose(%expanded);   undef %expanded;

The statements are still subject to the specified value of maximum-line-length and will be broken if this maximum is exceeded.

Blank Line Control

Blank lines can improve the readability of a script if they are carefully placed. Perltidy has several commands for controlling the insertion, retention, and removal of blank lines.

-fbl, --freeze-blank-lines

Set -fbl if you want the blank lines in your script to remain exactly as they are. The rest of the parameters in this section may then be ignored. (Note: setting the -fbl flag is equivalent to setting -mbl=0 and -kbl=2).

-bbc, --blanks-before-comments

A blank line will be introduced before a full-line comment. This is the default. Use -nbbc or --noblanks-before-comments to prevent such blank lines from being introduced.

-blbs=n, --blank-lines-before-subs=n

The parameter -blbs=n requests that at least n blank lines precede a sub definition which does not follow a comment and which is more than one line long. The default is -blbs=1. BEGIN and END blocks are included.

The requested number of blank lines will be inserted regardless of the value of --maximum-consecutive-blank-lines=n (-mbl=n), with the exception that if -mbl=0 then no blanks will be output.

This parameter interacts with the value k of the parameter --maximum-consecutive-blank-lines=k (-mbl=k) as follows:

1. If -mbl=0 then no blanks will be output. This allows all blanks to be suppressed with a single parameter. Otherwise,

2. If the number of old blank lines in the script is less than n then additional blanks will be inserted to make the total n regardless of the value of -mbl=k.

3. If the number of old blank lines in the script equals or exceeds n then this parameter has no effect; however, the total will not exceed the value specified with the -mbl=k flag.
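
The three rules can be written as a small function. This is a sketch of the rules as stated, not perltidy's code. The sub name is hypothetical, and it assumes n <= k whenever rule 3 applies, since the text does not say what happens there when k is smaller than n.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical model of rules 1-3 for the blank lines before a sub.
#   $old = blank lines already present in the script
#   $n   = value of -blbs
#   $k   = value of -mbl
sub blanks_before_sub {
    my ( $old, $n, $k ) = @_;
    return 0  if $k == 0;      # rule 1: -mbl=0 suppresses all blanks
    return $n if $old < $n;    # rule 2: top up to n, regardless of -mbl
    return min( $old, $k );    # rule 3: keep old blanks, capped by -mbl
}

# Defaults (-blbs=1, -mbl=1): three old blank lines are reduced to one.
print blanks_before_sub( 3, 1, 1 ), "\n";
```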

-blbp=n, --blank-lines-before-packages=n

The parameter -blbp=n requests that at least n blank lines precede a package which does not follow a comment. The default is -blbp=1.

This parameter interacts with the value k of the parameter --maximum-consecutive-blank-lines=k (-mbl=k) in the same way as described for the previous item -blbs=n.

-bbs, --blanks-before-subs

For compatibility with previous versions, -bbs or --blanks-before-subs is equivalent to -blbp=1 and -blbs=1.

Likewise, -nbbs or --noblanks-before-subs is equivalent to -blbp=0 and -blbs=0.

-bbb, --blanks-before-blocks

A blank line will be introduced before blocks of coding delimited by for, foreach, while, until, and if, unless, in the following circumstances:

This is the default. The intention of this option is to introduce some space within dense coding. This is negated with -nbbb or --noblanks-before-blocks.

-lbl=n, --long-block-line-count=n

This controls how often perltidy is allowed to add blank lines before certain block types (see previous section). The default is 8. Entering a value of 0 is equivalent to entering a very large number.

-blao=i or --blank-lines-after-opening-block=i

This control places a minimum of i blank lines after a line which ends with an opening block brace of a specified type. By default, this only applies to the block of a named sub, but this can be changed (see -blaol below). The default is not to do this (i=0).

Please see the note below on using the -blao and -blbc options.

-blbc=i or --blank-lines-before-closing-block=i

This control places a minimum of i blank lines before a line which begins with a closing block brace of a specified type. By default, this only applies to the block of a named sub, but this can be changed (see -blbcl below). The default is not to do this (i=0).

-blaol=s or --blank-lines-after-opening-block-list=s

The parameter s is a list of block type keywords to which the flag -blao should apply. The section "Specifying Block Types" explains how to list block types.

-blbcl=s or --blank-lines-before-closing-block-list=s

This parameter is a list of block type keywords to which the flag -blbc should apply. The section "Specifying Block Types" explains how to list block types.

Note on using the -blao and -blbc options.

These blank line controls introduce a certain minimum number of blank lines in the text, but the final number of blank lines may be greater, depending on values of the other blank line controls and the number of old blank lines. A consequence is that introducing blank lines with these and other controls cannot be exactly undone, so some experimentation with these controls is recommended before using them.

For example, suppose that for some reason we decide to introduce blank lines at the beginning and ending of all blocks. We could do this using

  perltidy -blao=2 -blbc=2 -blaol='*' -blbcl='*' filename

Now suppose the script continues to be developed, but at some later date we decide we don't want these blank lines after all. We might expect that running with the flags -blao=0 and -blbc=0 will undo them. However, by default perltidy retains single blank lines, so the blank lines remain.

We can easily fix this by telling perltidy to ignore old blank lines by including the added parameter -kbl=0 and rerunning. Then the unwanted blank lines will be gone. However, this will cause all old blank lines to be ignored, perhaps even some that were added by hand to improve formatting. So please be cautious when using these parameters.

-mbl=n, --maximum-consecutive-blank-lines=n

This parameter specifies the maximum number of consecutive blank lines which will be output within code sections of a script. The default is n=1. If the input file has more than n consecutive blank lines, the number will be reduced to n except as noted above for the -blbp and -blbs parameters. If n=0 then no blank lines will be output (unless all old blank lines are retained with the -kbl=2 flag of the next section).

This flag obviously does not apply to pod sections, here-documents, and quotes.

-kbl=n, --keep-old-blank-lines=n

The -kbl=n flag gives you control over how your existing blank lines are treated.

The possible values of n are:

 n=0 ignore all old blank lines
 n=1 stable: keep old blanks, but limited by the value of the -mbl=n flag
 n=2 keep all old blank lines, regardless of the value of the -mbl=n flag

The default is n=1.
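
As a compact restatement of the table, the sketch below computes how many of a run of old blank lines would survive when only -kbl and -mbl are considered. The other blank line controls in this section are deliberately ignored, and the sub name is hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical model: how many of $old consecutive blank lines survive,
# considering only the -kbl and -mbl values.
sub kept_blank_lines {
    my ( $old, $kbl, $mbl ) = @_;
    return 0                 if $kbl == 0;    # ignore all old blank lines
    return min( $old, $mbl ) if $kbl == 1;    # keep, but limited by -mbl
    return $old;                              # -kbl=2: keep all of them
}

# -fbl is described as -mbl=0 together with -kbl=2: everything is kept.
print kept_blank_lines( 3, 2, 0 ), "\n";
```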

-sob, --swallow-optional-blank-lines

This is equivalent to -kbl=0 and is included for compatibility with previous versions.

-nsob, --noswallow-optional-blank-lines

This is equivalent to -kbl=1 and is included for compatibility with previous versions.

Controls for blank lines around lines of consecutive keywords

The parameters in this section provide some control over the placement of blank lines within and around groups of statements beginning with selected keywords. These blank lines are called here keyword group blanks, and all of the parameters begin with --keyword-group-blanks*, or -kgb* for short. The default settings do not employ these controls but they can be enabled with the following parameters:

-kgbl=s or --keyword-group-blanks-list=s; s is a quoted string of keywords

-kgbs=s or --keyword-group-blanks-size=s; s gives the number of keywords required to form a group.

-kgbb=n or --keyword-group-blanks-before=n; n = (0, 1, or 2) controls a leading blank

-kgba=n or --keyword-group-blanks-after=n; n = (0, 1, or 2) controls a trailing blank

-kgbi or --keyword-group-blanks-inside is a switch for adding blanks between subgroups

-kgbd or --keyword-group-blanks-delete is a switch for removing initial blank lines between keywords

-kgbr=n or --keyword-group-blanks-repeat-count=n can limit the number of times this logic is applied

In addition, the following abbreviations are available for simplified usage:

-kgb or --keyword-group-blanks is short for -kgbb=2 -kgba=2 -kgbi

-nkgb or --nokeyword-group-blanks is short for -kgbb=1 -kgba=1 -nkgbi

Before describing the meaning of the parameters in detail let us look at an example which is formatted with default parameter settings.

        print "Entering test 2\n";
        use Test;
        use Encode qw(from_to encode decode
          encode_utf8 decode_utf8
          find_encoding is_utf8);
        use charnames qw(greek);
        my @encodings     = grep( /iso-?8859/, Encode::encodings() );
        my @character_set = ( '0' .. '9', 'A' .. 'Z', 'a' .. 'z' );
        my @source        = qw(ascii iso8859-1 cp1250);
        my @destiny       = qw(cp1047 cp37 posix-bc);
        my @ebcdic_sets   = qw(cp1047 cp37 posix-bc);
        my $str           = join( '', map( chr($_), 0x20 .. 0x7E ) );
        return unless ($str);

using perltidy -kgb gives:

        print "Entering test 2\n";
                                      <----------this blank controlled by -kgbb
        use Test;
        use Encode qw(from_to encode decode
          encode_utf8 decode_utf8
          find_encoding is_utf8);
        use charnames qw(greek);
                                      <---------this blank controlled by -kgbi
        my @encodings     = grep( /iso-?8859/, Encode::encodings() );
        my @character_set = ( '0' .. '9', 'A' .. 'Z', 'a' .. 'z' );
        my @source        = qw(ascii iso8859-1 cp1250);
        my @destiny       = qw(cp1047 cp37 posix-bc);
        my @ebcdic_sets   = qw(cp1047 cp37 posix-bc);
        my $str           = join( '', map( chr($_), 0x20 .. 0x7E ) );
                                      <----------this blank controlled by -kgba
        return unless ($str);

Blank lines have been introduced around the my and use sequences. What happened is that the default keyword list includes my and use but not print and return. So a continuous sequence of nine my and use statements was located. This number exceeds the default threshold of five, so blanks were placed before and after the entire group. Then, since there was also a subsequence of six my lines, a blank line was introduced to separate them.
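
The first step described above, locating a long enough run of listed keywords, can be sketched as follows. Each statement is represented only by its leading keyword. The keyword set and threshold are the defaults quoted in this section, while the scanning logic and the name find_groups are simplifications introduced here. Subgroups (the -kgbi step) are not modeled.

```perl
use strict;
use warnings;

# Assumed defaults: keyword list "use require local our my", group size 5.
my %in_list   = map { $_ => 1 } qw(use require local our my);
my $threshold = 5;

# Hypothetical helper: return [start, end] index pairs for each run of
# listed keywords that is at least $threshold statements long.
sub find_groups {
    my @kw = @_;
    my ( @groups, $start );
    for my $i ( 0 .. $#kw + 1 ) {    # one step past the end closes a run
        my $listed = $i <= $#kw && $in_list{ $kw[$i] };
        if ( $listed && !defined $start ) {
            $start = $i;
        }
        elsif ( !$listed && defined $start ) {
            push @groups, [ $start, $i - 1 ] if $i - $start >= $threshold;
            undef $start;
        }
    }
    return @groups;
}

# The leading keywords of the example: print, 3 x use, 6 x my, return.
my @kw = ( 'print', ('use') x 3, ('my') x 6, 'return' );
printf "group: statements %d..%d\n", @$_ for find_groups(@kw);
```

For the example this reports one group covering the nine use and my statements, which is where the -kgbb and -kgba blanks are placed.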

Finer control over blank placement can be achieved by using the individual parameters rather than the -kgb flag. The individual controls are as follows.

-kgbl=s or --keyword-group-blanks-list=s, where s is a quoted string, defines the set of keywords which will be formed into groups. The string is a space separated list of keywords. The default set is s="use require local our my", but any list of keywords may be used. Comment lines may also be included in a keyword group, even though they are not keywords. To include ordinary block comments, include the symbol BC. To include static block comments (which normally begin with '##'), include the symbol SBC.

-kgbs=s or --keyword-group-blanks-size=s, where s is a string describing the number of consecutive keyword statements forming a group (Note: statements separated by blank lines in the input file are considered consecutive for purposes of this count). If s is an integer then it is the minimum number required for a group. A maximum value may also be given with the format s=min.max, where min is the minimum number and max is the maximum number, and the min and max values are separated by one or more dots. No groups will be found if the maximum is less than the minimum. The maximum is unlimited if not given. The default is s=5. Some examples:

    s      min   max         number for group
    3      3     unlimited   3 or more
    1.1    1     1           1
    1..3   1     3           1 to 3
    1.0    1     0           (no match)

There is no really good default value for this parameter. If it is set too small, then an excessive number of blank lines may be generated. However, some users may prefer reducing the value somewhat below the default, perhaps to s=3.
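
The size string format can be checked with a short parser. This is a sketch based only on the description and table above; the sub names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical parser for the size string: 'min' or 'min.max', where the
# separator is one or more dots.
sub parse_group_size {
    my ($s) = @_;
    my ( $min, $max ) = split /\.+/, $s;
    return ( $min, $max );    # $max is undef when no maximum was given
}

# True if a run of $count keyword statements qualifies as a group.
sub is_group {
    my ( $s, $count ) = @_;
    my ( $min, $max ) = parse_group_size($s);
    return 0 if $count < $min;
    return 0 if defined $max && $count > $max;
    return 1;
}

# Try each size string from the table against several run lengths.
for my $s (qw(3 1.1 1..3 1.0)) {
    my @ok = grep { is_group( $s, $_ ) } 1 .. 4;
    printf "%-5s matches run lengths (of 1..4): %s\n", $s, "@ok";
}
```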

-kgbb=n or --keyword-group-blanks-before=n specifies whether a blank should appear before the first line of the group, as follows:

   n=0 => (delete) an existing blank line will be removed
   n=1 => (stable) no change to the input file is made  [DEFAULT]
   n=2 => (insert) a blank line is introduced if possible

-kgba=n or --keyword-group-blanks-after=n likewise specifies whether a blank should appear after the last line of the group, using the same scheme (0=delete, 1=stable, 2=insert).

-kgbi or --keyword-group-blanks-inside controls the insertion of blank lines between the first and last statement of the entire group. If there is a continuous run of a single statement type with more than the minimum threshold number (as specified with -kgbs=s) then this switch causes a blank line to be inserted between this subgroup and the others. In the example above this happened between the use and my statements.

-kgbd or --keyword-group-blanks-delete controls the deletion of any blank lines that exist in the group when it is first scanned. When statements are initially scanned, any existing blank lines are included in the collection. Any such original blank lines will be deleted before any other insertions are made when the parameter -kgbd is set. The default is not to do this, -nkgbd.

-kgbr=n or --keyword-group-blanks-repeat-count=n specifies n, the maximum number of times this logic will be applied to any file. The special value n=0 is the same as n=infinity which means it will be applied to an entire script [Default]. A value n=1 could be used to make it apply just one time for example. This might be useful for adjusting just the use statements in the top part of a module for example.

-kgb or --keyword-group-blanks is an abbreviation equivalent to setting -kgbb=2 -kgba=2 -kgbi. This turns on keyword group formatting with a set of default values.

-nkgb or --nokeyword-group-blanks is equivalent to -kgbb=1 -kgba=1 -nkgbi. This flag turns off keyword group blank lines and is the default setting.

Here are a few notes about the functioning of this technique.

Styles

A style refers to a convenient collection of existing parameters.

-gnu, --gnu-style

-gnu gives an approximation to the GNU Coding Standards (which do not apply to perl) as they are sometimes implemented. At present, this style overrides the default style with the following parameters:

    -lp -bl -noll -pt=2 -bt=2 -sbt=2 -icp

To use this style with -xlp instead of -lp use -gnu -xlp.

-pbp, --perl-best-practices

-pbp is an abbreviation for the parameters in the book Perl Best Practices by Damian Conway:

    -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 -nsfs -nolq
    -wbb="% + - * / x != == >= <= =~ !~ < > | & =
          **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="

Please note that this parameter set includes -st and -se flags, which make perltidy act as a filter on one file only. These can be overridden by placing -nst and/or -nse after the -pbp parameter.

Also note that the value of continuation indentation, -ci=4, is equal to the value of the full indentation, -i=4. It is recommended that either (1) the parameter -ci=2 be used instead, or (2) the flag -xci be set. This will help show structure, particularly when there are ternary statements. The following snippet illustrates these options.

    # perltidy -pbp
    $self->{_text} = (
         !$section        ? ''
        : $type eq 'item' ? "the $section entry"
        :                   "the section on $section"
        )
        . (
        $page
        ? ( $section ? ' in ' : '' ) . "the $page$page_ext manpage"
        : ' elsewhere in this document'
        );

    # perltidy -pbp -ci=2
    $self->{_text} = (
         !$section        ? ''
        : $type eq 'item' ? "the $section entry"
        :                   "the section on $section"
      )
      . (
        $page
        ? ( $section ? ' in ' : '' ) . "the $page$page_ext manpage"
        : ' elsewhere in this document'
      );

    # perltidy -pbp -xci
    $self->{_text} = (
         !$section        ? ''
        : $type eq 'item' ? "the $section entry"
        :                   "the section on $section"
        )
        . ( $page
            ? ( $section ? ' in ' : '' ) . "the $page$page_ext manpage"
            : ' elsewhere in this document'
        );

The -xci flag was developed after the -pbp parameters were published so you need to include it separately.
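The same combination can also be tried from a short program through the Perl::Tidy module instead of the command line. The following is a minimal sketch, assuming Perl::Tidy is installed; the sample input is the snippet above joined onto one line, and the variable names are only illustrative. The parameters are passed as an in-memory profile so that no .perltidyrc file is consulted.

```perl
use strict;
use warnings;
use Perl::Tidy;

# Sample input: the two ternary expressions from the text, on a single line.
my $source = <<'END_SRC';
$self->{_text} = ( !$section ? '' : $type eq 'item' ? "the $section entry" : "the section on $section" ) . ( $page ? ( $section ? ' in ' : '' ) . "the $page$page_ext manpage" : ' elsewhere in this document' );
END_SRC

# -pbp switches on -st and -se; -nst and -nse placed after it switch them
# off again. -xci is not part of -pbp, so it is added separately.
my $params = '-pbp -nst -nse -xci';

my ( $output, $stderr );
my $error_flag = Perl::Tidy::perltidy(
    source      => \$source,
    destination => \$output,
    perltidyrc  => \$params,    # used in place of any .perltidyrc file
    argv        => '',          # hide the caller's @ARGV from perltidy
    stderr      => \$stderr,
);

print $output unless $error_flag;
```

Replacing -xci with -ci=2 in the parameter string gives the second variant discussed above.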

One-Line Blocks

A one-line block is a block of code where the contents within the curly braces is short enough to fit on a single line. For example,

    if ( -e $file ) { print "'$file' exists\n" }

The alternative, a block which spans multiple lines, is said to be a broken block. With few exceptions, perltidy retains existing one-line blocks, if it is possible within the line-length constraint, but it does not attempt to form new ones. In other words, perltidy will try to follow the input file regarding broken and unbroken blocks.

The main exception to this rule is that perltidy will attempt to form new one-line blocks following the keywords sort, map, grep, and eval, because these code blocks are often small and most clearly displayed in a single line. This behavior can be controlled with the flag --one-line-block-exclusion-list described below.

When the cuddled-else style is used, the default treatment of one-line blocks may interfere with the cuddled style. In this case, the default behavior may be changed with the flag --cuddled-break-option=n described elsewhere.

When an existing one-line block is longer than the maximum line length, and must therefore be broken into multiple lines, perltidy checks for and adds any optional terminating semicolon (unless the -nasc option is used) if the block is a code block.

-olbxl=s, --one-line-block-exclusion-list=s

As noted above, perltidy will, by default, attempt to create new one-line blocks for certain block types. This flag allows the user to prevent this behavior for the block types listed in the string s. The list s may include any of the words sort, map, grep, eval, or it may be * to indicate all of these.

So for example to prevent multi-line eval blocks from becoming one-line blocks, the command would be -olbxl='eval'. In this case, existing one-line eval blocks will remain on one-line if possible, and existing multi-line eval blocks will remain multi-line blocks.

-olbn=n, --one-line-block-nesting=n

Nested one-line blocks are lines with code blocks which themselves contain code blocks. For example, the following line is a nested one-line block.

         foreach (@list) { if ($_ eq $asked_for) { last } ++$found }

The default behavior is to break such lines into multiple lines, but this behavior can be controlled with this flag. The values of n are:

  n=0 break nested one-line blocks into multiple lines [DEFAULT]
  n=1 stable: keep existing nested one-line blocks intact

For the above example, the default formatting (-olbn=0) is

    foreach (@list) {
        if ( $_ eq $asked_for ) { last }
        ++$found;
    }

If the parameter -olbn=1 is given, then the line will be left intact if it is a single line in the source, or it will be broken into multiple lines if it is broken in multiple lines in the source.
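The two settings can be compared side by side by formatting the same line twice. This is a minimal sketch, assuming the Perl::Tidy module is installed; the helper name tidy_with is made up for this illustration.

```perl
use strict;
use warnings;
use Perl::Tidy;

# A nested one-line block, written on a single line in the source.
my $source = <<'END_SRC';
foreach (@list) { if ($_ eq $asked_for) { last } ++$found }
END_SRC

# Format the sample with the given parameter string and return the result.
# (tidy_with is a hypothetical helper, not part of Perl::Tidy.)
sub tidy_with {
    my ($params) = @_;
    my ( $output, $stderr );
    Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,    # used in place of any .perltidyrc file
        argv        => '',          # hide the caller's @ARGV from perltidy
        stderr      => \$stderr,
    );
    return $output;
}

my $broken = tidy_with('-olbn=0');    # default: break nested one-line blocks
my $stable = tidy_with('-olbn=1');    # keep the single-line form of the input

print "-olbn=0:\n$broken\n-olbn=1:\n$stable";
```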

-olbs=n, --one-line-block-semicolons=n

This flag controls the placement of semicolons at the end of one-line blocks. Semicolons are optional before a closing block brace, and frequently they are omitted at the end of a one-line block containing just a single statement. By default, perltidy follows the input file regarding these semicolons, but this behavior can be controlled by this flag. The values of n are:

  n=0 remove terminal semicolons in one-line blocks having a single statement
  n=1 stable; keep input file placement of terminal semicolons [DEFAULT]
  n=2 add terminal semicolons in all one-line blocks

Note that the n=2 option has no effect if adding semicolons is prohibited with the -nasc flag. Also note that while n=2 adds missing semicolons to all one-line blocks, regardless of complexity, the n=0 option only removes ending semicolons which terminate one-line blocks containing just one semicolon. So these two options are not exact inverses.

Forming new one-line blocks

Sometimes it might be desirable to convert a script to have one-line blocks whenever possible. Although there is currently no flag for this, a simple workaround is to execute perltidy twice, once with the flag -noadd-newlines and then once again with normal parameters, like this:

     cat infile | perltidy -nanl | perltidy >outfile

When executed on this snippet

    if ( $? == -1 ) {
        die "failed to execute: $!\n";
    }
    if ( $? == -1 ) {
        print "Had enough.\n";
        die "failed to execute: $!\n";
    }

the result is

    if ( $? == -1 ) { die "failed to execute: $!\n"; }
    if ( $? == -1 ) {
        print "Had enough.\n";
        die "failed to execute: $!\n";
    }

This shows that blocks with a single statement become one-line blocks.
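The same two-pass idea can be expressed with the Perl::Tidy module when a pipe is not convenient. This is a minimal sketch, assuming Perl::Tidy is installed; the helper name tidy_pass is made up for this illustration, and an empty parameter string stands for "normal parameters".

```perl
use strict;
use warnings;
use Perl::Tidy;

my $source = <<'END_SRC';
if ( $? == -1 ) {
    die "failed to execute: $!\n";
}
if ( $? == -1 ) {
    print "Had enough.\n";
    die "failed to execute: $!\n";
}
END_SRC

# Run perltidy once on a string with the given parameters.
# (tidy_pass is a hypothetical helper, not part of Perl::Tidy.)
sub tidy_pass {
    my ( $text, $params ) = @_;
    my ( $output, $stderr );
    my $error_flag = Perl::Tidy::perltidy(
        source      => \$text,
        destination => \$output,
        perltidyrc  => \$params,    # used in place of any .perltidyrc file
        argv        => '',          # hide the caller's @ARGV from perltidy
        stderr      => \$stderr,
    );
    die "perltidy reported a problem: " . ( $stderr // '' ) if $error_flag;
    return $output;
}

my $joined = tidy_pass( $source, '-nanl' );    # pass 1: add no new line breaks
my $result = tidy_pass( $joined, '' );         # pass 2: default parameters
print $result;
```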

Breaking existing one-line blocks

There is no automatic way to break existing long one-line blocks into multiple lines, but this can be accomplished by processing a script, or section of a script, with a short value of the parameter maximum-line-length=n. Then, when the script is reformatted again with the normal parameters, the blocks which were broken will remain broken (with the exceptions noted above).

Another trick for doing this for certain block types is to format one time with the -cuddled-else flag and --cuddled-break-option=2. Then format again with the normal parameters. This will break any one-line blocks which are involved in a cuddled-else style.

Controlling Vertical Alignment

Vertical alignment refers to lining up certain symbols in a list of consecutive similar lines to improve readability. For example, the "fat commas" are aligned in the following statement:

        $data = $pkg->new(
            PeerAddr => join( ".", @port[ 0 .. 3 ] ),
            PeerPort => $port[4] * 256 + $port[5],
            Proto    => 'tcp'
        );

Vertical alignment can be completely turned off using the -novalign flag mentioned below. However, vertical alignment can be forced to stop and restart by selectively introducing blank lines. For example, a blank has been inserted in the following code to keep somewhat similar things aligned.

    %option_range = (
        'format'             => [ 'tidy', 'html', 'user' ],
        'output-line-ending' => [ 'dos',  'win',  'mac', 'unix' ],
        'character-encoding' => [ 'none', 'utf8' ],

        'block-brace-tightness'    => [ 0, 2 ],
        'brace-tightness'          => [ 0, 2 ],
        'paren-tightness'          => [ 0, 2 ],
        'square-bracket-tightness' => [ 0, 2 ],
    );

Vertical alignment is implemented by locally increasing an existing blank space to produce alignment with an adjacent line. It cannot occur if there is no blank space to increase. So if a particular space is removed by one of the existing controls then vertical alignment cannot occur. Likewise, if a space is added with one of the controls, then vertical alignment might occur.

For example,

        # perltidy -nwls='=>'
        $data = $pkg->new(
            PeerAddr=> join( ".", @port[ 0 .. 3 ] ),
            PeerPort=> $port[4] * 256 + $port[5],
            Proto=> 'tcp'
        );

Completely turning off vertical alignment with -novalign

The default is to use vertical alignment, but vertical alignment can be completely turned off with the -novalign flag.

A lower level of control of vertical alignment is possible with three parameters -vc, -vsc, and -vbc. These independently control alignment of code, side comments and block comments. They are described in the next section.

The parameter -valign is in fact an alias for -vc -vsc -vbc, and its negative -novalign is an alias for -nvc -nvsc -nvbc.

Controlling code alignment with --valign-code or -vc

The -vc flag enables alignment of code symbols such as =. The default is -vc. For detailed control of which symbols to align, see the --valign-exclusion-list parameter below.

Controlling side comment alignment with --valign-side-comments or -vsc

The -vsc flag enables alignment of side comments. If side comment alignment is disabled with -nvsc, side comments will appear at a fixed space from the preceding code token. The default is -vsc.

Controlling block comment alignment with --valign-block-comments or -vbc

When -vbc is enabled, block comments can become aligned, for example, if one comment of a consecutive sequence of comments becomes outdented due to a length in excess of the maximum line length. If this occurs, the entire group of comments will remain aligned and be outdented by the same amount. This coordinated alignment will not occur if -nvbc is set. The default is -vbc.

Finer alignment control with --valign-exclusion-list=s or -vxl=s and --valign-inclusion-list=s or -vil=s

More detailed control of alignment types is available with these two parameters. Most of the vertical alignments in typical programs occur at one of the tokens ',', '=', and '=>', but many other alignments are possible and are given in the following list:

  = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=
  { ( ? : , ; => && || ~~ !~~ =~ !~ // <=> -> q
  if unless and or err for foreach while until

These alignment types correspond to perl symbols, operators and keywords except for 'q', which refers to the special case of alignment in a 'use' statement of qw quotes and empty parens.

They are all enabled by default, but they can be selectively disabled by including one or more of these tokens in the space-separated list valign-exclusion-list=s. For example, the following would prevent alignment at = and if:

  --valign-exclusion-list='= if'

If it is simpler to specify only the token types which are to be aligned, then include the types which are to be aligned in the list of --valign-inclusion-list. In that case you may leave the valign-exclusion-list undefined, or use the special symbol * for the exclusion list. For example, the following parameters enable alignment only at commas and 'fat commas':

  --valign-inclusion-list=', =>'
  --valign-exclusion-list='*'     ( this is optional and may be omitted )

These parameter lists should consist of space-separated tokens from the above list of possible alignment tokens, or a '*'. If an unrecognized token appears, it is simply ignored. And if a specific token is entered in both lists by mistake then the exclusion list has priority.

The default values of these parameters enable all alignments and are equivalent to

  --valign-exclusion-list=' '
  --valign-inclusion-list='*'

To illustrate, consider the following snippet with default formatting

    # perltidy
    $co_description = ($color) ? 'bold cyan'  : '';           # description
    $co_prompt      = ($color) ? 'bold green' : '';           # prompt
    $co_unused      = ($color) ? 'on_green'   : 'reverse';    # unused

To exclude all alignments except the equals (i.e., include only equals) we could use:

    # perltidy -vil='='
    $co_description = ($color) ? 'bold cyan' : '';          # description
    $co_prompt      = ($color) ? 'bold green' : '';         # prompt
    $co_unused      = ($color) ? 'on_green' : 'reverse';    # unused

To exclude only the equals we could use:

    # perltidy -vxl='='
    $co_description = ($color) ? 'bold cyan' : '';     # description
    $co_prompt = ($color) ? 'bold green' : '';         # prompt
    $co_unused = ($color) ? 'on_green' : 'reverse';    # unused

Notice in this last example that although only the equals alignment was excluded, the ternary alignments were also lost. This happens because the vertical aligner sweeps from left-to-right and usually stops if an important alignment cannot be made for some reason.

But also notice that side comments remain aligned because their alignment is controlled separately with the parameter --valign-side-comments described above.
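The three runs above can be reproduced in one short program. This is a minimal sketch, assuming the Perl::Tidy module is installed and recent enough to know -vil and -vxl; the helper name tidy_with is made up for this illustration.

```perl
use strict;
use warnings;
use Perl::Tidy;

# The snippet from the text, with all alignment spaces removed.
my $source = <<'END_SRC';
$co_description = ($color) ? 'bold cyan' : ''; # description
$co_prompt = ($color) ? 'bold green' : ''; # prompt
$co_unused = ($color) ? 'on_green' : 'reverse'; # unused
END_SRC

# Format the sample with the given parameter string and return the result.
# (tidy_with is a hypothetical helper, not part of Perl::Tidy.)
sub tidy_with {
    my ($params) = @_;
    my ( $output, $stderr );
    Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,    # used in place of any .perltidyrc file
        argv        => '',          # hide the caller's @ARGV from perltidy
        stderr      => \$stderr,
    );
    return $output;
}

my $default     = tidy_with('');              # all alignments enabled
my $only_equals = tidy_with(q{-vil='='});     # align only at '='
my $no_equals   = tidy_with(q{-vxl='='});     # align everywhere except '='

print "# default\n$default\n";
print "# -vil='='\n$only_equals\n";
print "# -vxl='='\n$no_equals";
```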

Extended Syntax

This section describes some parameters for dealing with extended syntax.

For another method of handling extended syntax see the section "Skipping Selected Sections of Code".

Also note that the module Perl::Tidy supplies a pre-filter and post-filter capability. This requires calling the module from a separate program rather than through the binary perltidy.
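As an illustration of the filter capability, the following minimal sketch hides a line of non-Perl text from perltidy and restores it afterwards. It assumes the Perl::Tidy module is installed. The %%include ...%% directive and the #PLACEHOLDER marker are invented for this example; only the prefilter and postfilter parameters belong to the module, and each receives the whole text as a string and returns the modified text.

```perl
use strict;
use warnings;
use Perl::Tidy;

# Input with one line that is not valid Perl: a made-up template directive.
my $source = <<'END_SRC';
%%include header%%
sub add{my($x,$y)=@_;return $x+$y}
END_SRC

my $params = '';    # default formatting
my ( $output, $stderr );
my $error_flag = Perl::Tidy::perltidy(
    source      => \$source,
    destination => \$output,
    perltidyrc  => \$params,    # used in place of any .perltidyrc file
    argv        => '',          # hide the caller's @ARGV from perltidy
    stderr      => \$stderr,

    # Before tidying: hide each directive inside a block comment.
    prefilter => sub {
        my ($text) = @_;
        $text =~ s/^(%%.*%%)$/#PLACEHOLDER $1/gm;
        return $text;
    },

    # After tidying: turn the marked comments back into directives.
    postfilter => sub {
        my ($text) = @_;
        $text =~ s/^([ \t]*)#PLACEHOLDER (%%.*%%)$/$1$2/gm;
        return $text;
    },
);

print $output;
```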

-xs, --extended-syntax

This flag allows perltidy to handle certain common extensions to the standard syntax without complaint.

For example, without this flag a structure such as the following would generate a syntax error:

    Method deposit( Num $amount) {
        $self->balance( $self->balance + $amount );
    }

This flag is enabled by default but it can be deactivated with -nxs. Probably the only reason to deactivate this flag is to generate more diagnostic messages when debugging a script.

-sal=s, --sub-alias-list=s

This flag causes one or more words to be treated the same as if they were the keyword sub. The string s contains one or more alias words, separated by spaces or commas.

For example,

        perltidy -sal='method fun _sub M4'

will cause perltidy to treat the words 'method', 'fun', '_sub' and 'M4' the same as if they were 'sub'. Note that if the alias words are separated by spaces then the string of words should be placed in quotes.

Note that several other parameters accept a list of keywords, including 'sub' (see "Specifying Block Types"). You do not need to include any sub aliases in these lists. Just include keyword 'sub' if you wish, and all aliases are automatically included.

-gal=s, --grep-alias-list=s

This flag allows a code block following an external 'list operator' function to be formatted as if it followed one of the built-in keywords grep, map or sort. The string s contains the names of one or more such list operators, separated by spaces or commas.

By 'list operator' is meant a function which is invoked in the form

      word {BLOCK} @list

Perltidy tries to keep code blocks for these functions intact, since they are usually short, and does not automatically break after the closing brace since a list may follow. It also does some special handling of continuation indentation.

For example, the code block arguments to functions 'My_grep' and 'My_map' can be given formatting like 'grep' with

        perltidy -gal='My_grep My_map'

By default, the following list operators in List::Util are automatically included:

      all any first none notall reduce reductions

Any operators specified with --grep-alias-list are added to this list. The next parameter can be used to remove words from this default list.

-gaxl=s, --grep-alias-exclusion-list=s

The -gaxl=s flag provides a method for removing any of the default list operators given above by listing them in the string s. To remove all of the default operators use -gaxl='*'.

-uf=s, --use-feature=s

This flag tells perltidy to allow the syntax associated with a pragma in string s. Currently the only recognized values for the string are s='class' or string s=' '. The default is --use-feature='class'. This enables perltidy to recognize the special words class, method, field, and ADJUST. If this causes a conflict with other uses of these words, the default can be turned off with --use-feature=' '.

Other Controls

Deleting selected text

Perltidy can selectively delete comments and/or pod documentation. The command -dac or --delete-all-comments will delete all comments and all pod documentation, leaving just code and any leading system control lines.

The command -dp or --delete-pod will remove all pod documentation (but not comments).

Two commands which remove comments (but not pod) are: -dbc or --delete-block-comments and -dsc or --delete-side-comments. (Hanging side comments will be deleted with side comments here.)

When side comments are deleted, any special control side comments for non-indenting braces will be retained unless they are deactivated with a -nnib flag.

The negatives of these commands also work, and are the defaults. When block comments are deleted, any leading 'hash-bang' will be retained. Also, if the -x flag is used, any system commands before a leading hash-bang will be retained (even if they are in the form of comments).
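The effect of the comment-deleting flags can be checked on a small sample. This is a minimal sketch, assuming the Perl::Tidy module is installed; the sample script is invented for the illustration.

```perl
use strict;
use warnings;
use Perl::Tidy;

# Sample input with a hash-bang line, a block comment and a side comment.
my $source = <<'END_SRC';
#!/usr/bin/perl
# block comment describing the script
my $total = 0;    # side comment
$total += $_ for 1 .. 10;
print "$total\n";
END_SRC

# -dbc deletes block comments, -dsc deletes side comments; pod is untouched.
my $params = '-dbc -dsc';

my ( $output, $stderr );
my $error_flag = Perl::Tidy::perltidy(
    source      => \$source,
    destination => \$output,
    perltidyrc  => \$params,    # used in place of any .perltidyrc file
    argv        => '',          # hide the caller's @ARGV from perltidy
    stderr      => \$stderr,
);

print $output;
```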

Writing selected text to a file

When perltidy writes a formatted text file, it has the ability to also send selected text to a file with a .TEE extension. This text can include comments and pod documentation.

The command -tac or --tee-all-comments will write all comments and all pod documentation.

The command -tp or --tee-pod will write all pod documentation (but not comments).

The commands which write comments (but not pod) are: -tbc or --tee-block-comments and -tsc or --tee-side-comments. (Hanging side comments will be written with side comments here.)

The negatives of these commands also work, and are the defaults.

Using a .perltidyrc command file

If you use perltidy frequently, you probably won't be happy until you create a .perltidyrc file to avoid typing commonly-used parameters. Perltidy will first look in your current directory for a command file named .perltidyrc. If it does not find one, it will continue looking for one in other standard locations.

These other locations are system-dependent, and may be displayed with the command perltidy -dpro. Under Unix systems, it will first look for an environment variable PERLTIDY. Then it will look for a .perltidyrc file in the home directory, and then for a system-wide file /usr/local/etc/perltidyrc, and then it will look for /etc/perltidyrc. Note that these last two system-wide files do not have a leading dot. Further system-dependent information will be found in the INSTALL file distributed with perltidy.

Under Windows, perltidy will also search for a configuration file named perltidy.ini since Windows does not allow files with a leading period (.). Use perltidy -dpro to see the possible locations for your system. An example might be C:\Documents and Settings\All Users\perltidy.ini.

Another option is the use of the PERLTIDY environment variable. The method for setting environment variables depends upon the version of Windows that you are using. Instructions for Windows 95 and later versions can be found here:

http://www.netmanage.com/000/20021101_005_tcm21-6336.pdf

Under Windows NT / 2000 / XP the PERLTIDY environment variable can be placed in either the user section or the system section. The latter makes the configuration file common to all users on the machine. Be sure to enter the full path of the configuration file in the value of the environment variable. Ex. PERLTIDY=C:\Documents and Settings\perltidy.ini

The configuration file is free format, and simply a list of parameters, just as they would be entered on a command line. Any number of lines may be used, with any number of parameters per line, although it may be easiest to read with one parameter per line. Comment text begins with a #, and there must also be a space before the # for side comments. It is a good idea to put complex parameters in either single or double quotes.

Here is an example of a .perltidyrc file:

  # This is a simple example of a .perltidyrc configuration file
  # This implements a highly spaced style
  -se    # errors to standard error output
  -w     # show all warnings
  -bl    # braces on new lines
  -pt=0  # parens not tight at all
  -bt=0  # braces not tight
  -sbt=0 # square brackets not tight

The parameters in the .perltidyrc file are installed first, so any parameters given on the command line will have priority over them.
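This priority rule can be observed without touching any real configuration file, because the Perl::Tidy module accepts the profile text as a string. The following is a minimal sketch, assuming Perl::Tidy is installed; the profile contents and the sample code are invented for the illustration.

```perl
use strict;
use warnings;
use Perl::Tidy;

# Profile text in the same free format as a .perltidyrc file.
my $profile = <<'END_RC';
# a small profile held in a string
-i=2     # two-column indentation
-bl      # braces on new lines
END_RC

my $source = <<'END_SRC';
if ($ready) {
print "go\n";
}
END_SRC

my ( $output, $stderr );
my $error_flag = Perl::Tidy::perltidy(
    source      => \$source,
    destination => \$output,
    perltidyrc  => \$profile,    # stands in for the .perltidyrc file
    argv        => '-i=8',       # stands in for the command line
    stderr      => \$stderr,
);

# The print statement is indented 8 columns: the command line value wins.
print $output;
```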

To avoid confusion, perltidy ignores any command in the .perltidyrc file which would cause some kind of dump and an exit. These are:

 -h -v -ddf -dln -dop -dsn -dtt -dwls -dwrs -ss

There are several options which may be helpful in debugging a .perltidyrc file:

Creating a new abbreviation

A special notation is available for use in a .perltidyrc file for creating an abbreviation for a group of options. This can be used to create a shorthand for one or more styles which are frequently, but not always, used. The notation is to group the options within curly braces which are preceded by the name of the alias (without leading dashes), like this:

        newword {
        -opt1
        -opt2
        }

where newword is the abbreviation, and opt1, etc, are existing parameters or other abbreviations. The main syntax requirement is that the new abbreviation along with its opening curly brace must begin on a new line. Space before and after the curly braces is optional.

For a specific example, the following line

        oneliner { --maximum-line-length=0 --noadd-newlines --noadd-terminal-newline}

or equivalently with abbreviations

        oneliner { -l=0 -nanl -natnl }

could be placed in a .perltidyrc file to temporarily override the maximum line length with a large value, to temporarily prevent new line breaks from being added, and to prevent an extra newline character from being added to the file. All other settings in the .perltidyrc file still apply. Thus it provides a way to format a long 'one liner' when perltidy is invoked with

        perltidy --oneliner ...

(Either -oneliner or --oneliner may be used).
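The abbreviation can be tried out with the Perl::Tidy module by supplying the profile text as a string. This is a minimal sketch, assuming Perl::Tidy is installed and that an in-memory profile is read the same way as a .perltidyrc file; the sample statement is invented for the illustration.

```perl
use strict;
use warnings;
use Perl::Tidy;

# Profile text defining the abbreviation from the text above.
my $profile = <<'END_RC';
oneliner { -l=0 -nanl -natnl }
END_RC

# A statement longer than the default maximum line length.
my $source = <<'END_SRC';
my %h = map { $_ => length($_) } grep { defined && /\S/ } sort { lc($a) cmp lc($b) } @some_rather_long_list_name, @another_rather_long_list_name;
END_SRC

my ( $output, $stderr );
my $error_flag = Perl::Tidy::perltidy(
    source      => \$source,
    destination => \$output,
    perltidyrc  => \$profile,       # stands in for the .perltidyrc file
    argv        => '--oneliner',    # invoke the new abbreviation
    stderr      => \$stderr,
);

print $output;
```

Without --oneliner in the argument string, the same statement would be subject to the default line length and line-break rules.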

Skipping leading non-perl commands with -x or --look-for-hash-bang

If your script has leading lines of system commands or other text which are not valid perl code, and which are separated from the start of the perl code by a "hash-bang" line, ( a line of the form #!...perl ), you must use the -x flag to tell perltidy not to parse and format any lines before the "hash-bang" line. This option also invokes perl with a -x flag when checking the syntax. This option was originally added to allow perltidy to parse interactive VMS scripts, but it should be used for any script which is normally invoked with perl -x.

Please note: do not use this flag unless you are sure your script needs it. Parsing errors can occur if it does not have a hash-bang, or, for example, if the actual first hash-bang is in a here-doc. In that case a parsing error will occur because the tokenization will begin in the middle of the here-doc.

Making a file unreadable

The goal of perltidy is to improve the readability of files, but there are two commands which have the opposite effect, --mangle and --extrude. They are actually merely aliases for combinations of other parameters. Both of these strip all possible whitespace, but leave comments and pod documents, so that they are essentially reversible. The difference between these is that --mangle puts the fewest possible line breaks in a script while --extrude puts the maximum possible. Note that these options do not provide any meaningful obfuscation, because perltidy can be used to reformat the files. They were originally developed to help test the tokenization logic of perltidy, but they have other uses. One use for --mangle is the following:

  perltidy --mangle myfile.pl -st | perltidy -o myfile.pl.new

This will form the maximum possible number of one-line blocks (see next section), and can sometimes help clean up a badly formatted script.

A similar technique can be used with --extrude instead of --mangle to make the minimum number of one-line blocks.

Another use for --mangle is to combine it with -dac to reduce the file size of a perl script.

Debugging

The following flags are available for debugging:

--dump-cuddled-block-list or -dcbl will dump to standard output the internal hash of cuddled block types created by a -cuddled-block-list input string.

--dump-defaults or -ddf will write the default option set to standard output and quit

--dump-profile or -dpro will write the name of the current configuration file and its contents to standard output and quit.

--dump-options or -dop will write current option set to standard output and quit.

--dump-long-names or -dln will write all command line long names (passed to Get_options) to standard output and quit.

--dump-short-names or -dsn will write all command line short names to standard output and quit.

--dump-token-types or -dtt will write a list of all token types to standard output and quit.

--dump-want-left-space or -dwls will write the hash %want_left_space to standard output and quit. See the section on controlling whitespace around tokens.

--dump-want-right-space or -dwrs will write the hash %want_right_space to standard output and quit. See the section on controlling whitespace around tokens.

--no-memoize or -nmem will turn off memoizing. Memoization can reduce run time when running perltidy repeatedly in a single process. It is on by default but can be deactivated for testing with -nmem.

--no-timestamp or -nts will eliminate any time stamps in output files to prevent differences in dates from causing test installation scripts to fail. There are just a couple of places where timestamps normally occur. One is in the headers of html files, and another is when the -cscw option is selected. The default is to allow timestamps (--timestamp or -ts).

--file-size-order or -fso will cause files to be processed in order of increasing size, when multiple files are being processed. This is useful during program development, when large numbers of files with varying sizes are processed, because it can reduce virtual memory usage.

--maximum-file-size-mb=n or -maxfs=n specifies the maximum file size in megabytes that perltidy will attempt to format. This parameter is provided to avoid causing system problems by accidentally attempting to format an extremely large data file. Most perl scripts are less than about 2 MB in size. The integer n has a default value of 10, so perltidy will skip formatting files which have a size greater than 10 MB. The command to increase the limit to 20 MB for example would be

  perltidy -maxfs=20

This only applies to files specified by filename on the command line.

--maximum-level-errors=n or -maxle=n specifies the maximum number of indentation level errors allowed before perltidy skips formatting and just outputs a file verbatim. The default is n=1. This means that if the final indentation of a script differs from the starting indentation by more than 1 level, the file will be output verbatim. To avoid formatting if there are any indentation level errors use -maxle=0. To skip this check you can either set n equal to a large number, such as n=100, or set n=-1.

For example, the following script has a level error of 3 and will be output verbatim

    Input and default output:
    {{{


    perltidy -maxle=100
    {
        {
            {

--maximum-unexpected-errors=n or -maxue=n specifies the maximum number of unexpected tokenization errors allowed before formatting is skipped and a script is output verbatim. The intention is to avoid accidentally formatting a non-perl script, such as an html file for example. This check can be turned off by setting n=0.

A recommended value is n=3. However, the default is n=0 (skip this check) to avoid causing problems with scripts which have extended syntaxes.

-DEBUG will write a file with extension .DEBUG for each input file showing the tokenization of all lines of code.

Making a table of information on code blocks

A table listing information about the blocks of code in a file can be made with --dump-block-summary, or -dbs. This causes perltidy to read and parse the file, write a table of comma-separated values for selected code blocks to the standard output, and then exit. This parameter must be on the command line, not in a .perltidyrc file, and it requires a single file name on the command line. For example

   perltidy -dbs somefile.pl >blocks.csv

produces an output file blocks.csv whose lines hold these parameters:

    filename     - the name of the file
    line         - the line number of the opening brace of this block
    line_count   - the number of lines between opening and closing braces
    code_lines   - the number of lines excluding blanks, comments, and pod
    type         - the block type (sub, for, foreach, ...)
    name         - the block name if applicable (sub name, label, asub name)
    depth        - the nesting depth of the opening block brace
    max_change   - the change in depth to the most deeply nested code block
    block_count  - the total number of code blocks nested in this block
    mccabe_count - the McCabe complexity measure of this code block

This feature was developed to help identify complex sections of code as an aid in refactoring. The McCabe complexity measure follows the definition used by Perl::Critic. By default the table contains these values for subroutines, but the user may request them for any or all blocks of code or packages. For blocks which are loops nested within loops, a postfix '+' to the type is added to indicate possible code complexity. Although the table does not otherwise indicate which blocks are nested in other blocks, this can be determined by computing and comparing the block ending line numbers.
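Once the table exists, it can be post-processed with a few lines of Perl to pick out refactoring candidates. The following is a minimal sketch in plain Perl. It assumes the columns appear in the order listed above and that no field contains an embedded comma; the sample rows and the threshold of 10 are invented for the illustration and do not come from a real run.

```perl
use strict;
use warnings;

# Column order as listed in the text above.
my @columns = qw(
  filename line line_count code_lines type name
  depth max_change block_count mccabe_count
);

# Hypothetical rows standing in for the contents of blocks.csv.
my @rows = (
    'somefile.pl,12,40,31,sub,parse_args,0,3,7,14',
    'somefile.pl,60,25,22,sub,write_report,0,1,2,4',
    'somefile.pl,95,80,66,sub,main_loop,0,4,12,23',
);

my @blocks;
for my $row (@rows) {
    my %block;
    @block{@columns} = split /,/, $row;
    next unless $block{line} =~ /^\d+$/;    # skip a header row, if present
    push @blocks, \%block;
}

# Keep blocks at or above an arbitrary complexity threshold, worst first.
my $threshold = 10;
my @complex =
  sort { $b->{mccabe_count} <=> $a->{mccabe_count} }
  grep { $_->{mccabe_count} >= $threshold } @blocks;

printf "%-15s line %4d  mccabe %3d\n", @{$_}{qw(name line mccabe_count)}
  for @complex;
```

To work on a real table, the rows would be read from the file written by the -dbs run instead of the in-line list.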

By default the table lists subroutines with more than 20 code_lines, but this can be changed with the following two parameters:

--dump-block-minimum-lines=n, or -dbl=n, where n is the minimum number of code_lines to be included. The default is n=20. Note that code_lines is the number of lines excluding comments, blanks and pod.

--dump-block-types=s, or -dbt=s, where string s is a list of block types to be included. The type of a block is either the name of the perl builtin keyword for that block (such as sub if elsif else for foreach ..) or the word immediately before the opening brace. In addition, there are a few symbols for special block types, as follows:

   if elsif else for foreach ... any keyword introducing a block
   sub  - any sub or anonymous sub
   asub - any anonymous sub
   *    - any block except nameless blocks
   +    - any nested inner block loop
   package - any package or class
   closure - any nameless block

In addition, specific block loop types which are nested in other loops can be selected by adding a + after the block name. (Nested loops are sometimes good candidates for restructuring).

The default is -dbt='sub'.

In the following examples a table block.csv is created for a file somefile.pl:

Working with MakeMaker, AutoLoader and SelfLoader

The first $VERSION line of a file which might be eval'd by MakeMaker is passed through unchanged except for indentation. Use --nopass-version-line, or -npvl, to deactivate this feature.

If the AutoLoader module is used, perltidy will continue formatting code after seeing an __END__ line. Use --nolook-for-autoloader, or -nlal, to deactivate this feature.

Likewise, if the SelfLoader module is used, perltidy will continue formatting code after seeing a __DATA__ line. Use --nolook-for-selfloader, or -nlsl, to deactivate this feature.

Working around problems with older versions of Perl

Perltidy contains a number of rules which help avoid known subtleties and problems with older versions of perl, and these rules always take priority over whatever formatting flags have been set. For example, perltidy will usually avoid starting a new line with a bareword, because this might cause problems if use strict is active.

There is no way to override these rules.

HTML OPTIONS

The -html master switch

The flag -html causes perltidy to write an html file with extension .html. So, for example, the following command

        perltidy -html somefile.pl

will produce a syntax-colored html file named somefile.pl.html which may be viewed with a browser.

Please Note: In this case, perltidy does not do any formatting to the input file, and it does not write a formatted file with extension .tdy. This means that two perltidy runs are required to create a fully reformatted, html copy of a script.
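The two runs can be scripted. The following is only a sketch under stated assumptions: perltidy is on the PATH, the -o=filename flag is used to name the reformatted copy, and the helper names run and tidy_then_html and the DRY_RUN variable are invented for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run 1 writes a reformatted copy; run 2 converts that copy to html.
tidy_then_html() {
    src=$1
    tidied=$2
    run perltidy "$src" -o="$tidied" &&
        run perltidy -html "$tidied"
}

# Dry run in a subshell, so perltidy itself is not needed to see the commands.
( DRY_RUN=1; tidy_then_html somefile.pl somefile.tidied.pl )
```

Without DRY_RUN the same call should leave the html copy in somefile.tidied.pl.html.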

The -pre flag for code snippets

When the -pre flag is given, only the pre-formatted section, within the <PRE> and </PRE> tags, will be output. This simplifies inclusion of the output in other files. The default is to output a complete web page.

The -nnn flag for line numbering

When the -nnn flag is given, the output lines will be numbered.

The -toc, or --html-table-of-contents flag

By default, a table of contents to packages and subroutines will be written at the start of html output. Use -ntoc to prevent this. This might be useful, for example, for a pod document which contains a number of unrelated code snippets. This flag only influences the code table of contents; it has no effect on any table of contents produced by pod2html (see next item).

The -pod, or --pod2html flag

There are two options for formatting pod documentation. The default is to pass the pod through the Pod::Html module (which forms the basis of the pod2html utility). Any code sections are formatted by perltidy, and the results then merged. Note: perltidy creates a temporary file when Pod::Html is used; see "FILES". Also, Pod::Html creates temporary files for its cache.

NOTE: Perltidy counts the number of =cut lines, and either moves the pod text to the top of the html file if there is one =cut, or leaves the pod text in its original order (interleaved with code) otherwise.
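To predict which layout a file will get, the =cut lines can be counted beforehand. This is a rough sketch only: the function name pod_placement is hypothetical, and it assumes that a =cut line is any line beginning with =cut, which may not match perltidy's own test exactly.

```shell
#!/bin/sh
# Hypothetical helper: apply the rule quoted above to one file.
# Exactly one =cut line -> pod moved to the top; otherwise left interleaved.
pod_placement() {
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    n=$(grep -c '^=cut' "$1" || true)
    if [ "${n:-0}" -eq 1 ]; then
        echo "top"
    else
        echo "interleaved"
    fi
}
```

For example, pod_placement MyModule.pm prints either top or interleaved.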

Most of the flags accepted by pod2html may be included in the perltidy command line, and they will be passed to pod2html. In some cases, the flags have a prefix pod to emphasize that they are for the pod2html, and this prefix will be removed before they are passed to pod2html. The flags which have the additional pod prefix are:

   --[no]podheader --[no]podindex --[no]podrecurse --[no]podquiet
   --[no]podverbose --podflush

The flags which are unchanged from their use in pod2html are:

   --backlink=s --cachedir=s --htmlroot=s --libpods=s --title=s
   --podpath=s --podroot=s

where 's' is an appropriate character string. Not all of these flags are available in older versions of Pod::Html. See your Pod::Html documentation for more information.

The alternative, indicated with -npod, is not to use Pod::Html, but rather to format pod text in italics (or whatever the stylesheet indicates), without special html markup. This is useful, for example, if pod is being used as an alternative way to write comments.

The -frm, or --frames flag

By default, a single html output file is produced. This can be changed with the -frm option, which creates a frame holding a table of contents in the left panel and the source code in the right side. This simplifies code browsing. Assume, for example, that the input file is MyModule.pm. Then, for default file extension choices, these three files will be created:

 MyModule.pm.html      - the frame
 MyModule.pm.toc.html  - the table of contents
 MyModule.pm.src.html  - the formatted source code

Obviously this file naming scheme requires that output be directed to a real file (as opposed to, say, standard output). If this is not the case, or if the file extension is unknown, the -frm option will be ignored.
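The naming scheme can be written out as a small helper, for instance when a build script needs to know which files to publish or clean up. The function name frame_files is hypothetical; the optional second and third arguments stand for the extra extensions set with -text and -sext, with their documented defaults "toc" and "src".

```shell
#!/bin/sh
# Hypothetical helper: list the files a -frm run is expected to create.
frame_files() {
    input=$1
    toc_ext=${2:-toc}   # extra extension of the table of contents file
    src_ext=${3:-src}   # extra extension of the source code file
    echo "$input.html"
    echo "$input.$toc_ext.html"
    echo "$input.$src_ext.html"
}

frame_files MyModule.pm
```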

The -text=s, or --html-toc-extension flag

Use this flag to specify the extra file extension of the table of contents file when html frames are used. The default is "toc". See "Specifying File Extensions".

The -sext=s, or --html-src-extension flag

Use this flag to specify the extra file extension of the content file when html frames are used. The default is "src". See "Specifying File Extensions".

The -hent, or --html-entities flag

This flag controls the use of HTML::Entities for html formatting. By default, the module HTML::Entities is used to encode special symbols. This may not be the right thing for some browser/language combinations. Use --nohtml-entities or -nhent to prevent this.

Style Sheets

Style sheets make it very convenient to control and adjust the appearance of html pages. The default behavior is to write a page of html with an embedded style sheet.

An alternative to an embedded style sheet is to create a page with a link to an external style sheet. This is indicated with the -css=filename flag, where filename is the external style sheet. The external style sheet filename will be created if and only if it does not exist. This option is useful for controlling multiple pages from a single style sheet.

To cause perltidy to write a style sheet to standard output and exit, use the -ss, or --stylesheet, flag. This is useful if the style sheet could not be written for some reason, such as if the -pre flag was used. Thus, for example,

  perltidy -html -ss >mystyle.css

will write a style sheet with the default properties to file mystyle.css.

The use of style sheets is encouraged, but a web page without a style sheet can be created with the flag -nss. Use this option if you must be sure that older browsers (roughly speaking, versions prior to 4.0 of Netscape Navigator and Internet Explorer) can display the syntax-coloring of the html files.

Controlling HTML properties

Note: It is usually more convenient to accept the default properties and then edit the stylesheet which is produced. However, this section shows how to control the properties with flags to perltidy.

Syntax colors may be changed from their default values by flags of either the long form, --html-color-xxxxxx=n, or more conveniently the short form, -hcx=n, where xxxxxx is one of the following words, and x is the corresponding abbreviation:

      Token Type             xxxxxx           x
      ----------             --------         --
      comment                comment          c
      number                 numeric          n
      identifier             identifier       i
      bareword, function     bareword         w
      keyword                keyword          k
      quote, pattern         quote            q
      here doc text          here-doc-text    h
      here doc target        here-doc-target  hh
      punctuation            punctuation      pu
      parentheses            paren            p
      structural braces      structure        s
      semicolon              semicolon        sc
      colon                  colon            co
      comma                  comma            cm
      label                  label            j
      sub definition name    subroutine       m
      pod text               pod-text         pd

A default set of colors has been defined, but they may be changed by providing values to any of the parameters listed above, where n is either a 6-digit hex RGB color value or an ASCII name for a color, such as 'red'.

To illustrate, the following command will produce an html file somefile.pl.html with "aqua" keywords:

        perltidy -html -hck=00ffff somefile.pl

and this should be equivalent for most browsers:

        perltidy -html -hck=aqua somefile.pl

Perltidy merely writes any non-hex names that it sees in the html file. The following 16 color names are defined in the HTML 3.2 standard:

        black   => 000000,
        silver  => c0c0c0,
        gray    => 808080,
        white   => ffffff,
        maroon  => 800000,
        red     => ff0000,
        purple  => 800080,
        fuchsia => ff00ff,
        green   => 008000,
        lime    => 00ff00,
        olive   => 808000,
        yellow  => ffff00,
        navy    => 000080,
        blue    => 0000ff,
        teal    => 008080,
        aqua    => 00ffff,

Many more names are supported in specific browsers, but it is safest to use the hex codes for other colors. Helpful color tables can be located with an internet search for "HTML color tables".
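If you prefer to pass hex codes only, the 16 names above can be translated before perltidy is called. This is a sketch, not part of perltidy: the function name hex_for_color is hypothetical, and it accepts either one of the 16 names listed above or a 6-digit hex value.

```shell
#!/bin/sh
# Hypothetical helper: map an HTML 3.2 color name to its hex RGB value.
# A 6-digit hex value is passed through (lowercased); anything else fails.
hex_for_color() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $name in
        [0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]) echo "$name" ;;
        black)   echo 000000 ;;
        silver)  echo c0c0c0 ;;
        gray)    echo 808080 ;;
        white)   echo ffffff ;;
        maroon)  echo 800000 ;;
        red)     echo ff0000 ;;
        purple)  echo 800080 ;;
        fuchsia) echo ff00ff ;;
        green)   echo 008000 ;;
        lime)    echo 00ff00 ;;
        olive)   echo 808000 ;;
        yellow)  echo ffff00 ;;
        navy)    echo 000080 ;;
        blue)    echo 0000ff ;;
        teal)    echo 008080 ;;
        aqua)    echo 00ffff ;;
        *)       echo "unknown color: $1" >&2; return 1 ;;
    esac
}
```

A call such as perltidy -html -hck="$(hex_for_color aqua)" somefile.pl would then be equivalent to the -hck=00ffff example above.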

Besides color, two other character attributes may be set: bold, and italics. To set a token type to use bold, use the flag --html-bold-xxxxxx or -hbx, where xxxxxx or x are the long or short names from the above table. Conversely, to set a token type to NOT use bold, use --nohtml-bold-xxxxxx or -nhbx.

Likewise, to set a token type to use an italic font, use the flag --html-italic-xxxxxx or -hix, where again xxxxxx or x are the long or short names from the above table. And to set a token type to NOT use italics, use --nohtml-italic-xxxxxx or -nhix.

For example, to use bold braces and lime-colored, non-bold, italic keywords, the following command would be used:

        perltidy -html -hbs -hck=00FF00 -nhbk -hik somefile.pl

The background color can be specified with --html-color-background=n, or -hcbg=n for short, where n is a 6 character hex RGB value. The default color of text is the value given to punctuation, which is black as a default.

Here are some notes and hints:

1. If you find a preferred set of these parameters, you may want to create a .perltidyrc file containing them. See the perltidy man page for an explanation.

2. Rather than specifying values for these parameters, it is probably easier to accept the defaults and then edit a style sheet. The style sheet contains comments which should make this easy.

3. The syntax-colored html files can be very large, so it may be best to split large files into smaller pieces to improve download times.

SOME COMMON INPUT CONVENTIONS

Specifying Block Types

Several parameters which refer to code block types may be customized by also specifying an associated list of block types. The type of a block is the name of the keyword which introduces that block, such as if, else, or sub. An exception is a labeled block, which has no keyword, and should be specified with just a colon. To specify all blocks use '*'.

The keyword sub indicates a named sub. For anonymous subs, use the special keyword asub.

For example, the following parameter specifies sub, labels, BEGIN, and END blocks:

   -cscl="sub : BEGIN END"

(The meaning of the -cscl parameter is described above.) Note that quotes are required around the list of block types because of the spaces. For another example, the following list specifies all block types for vertical tightness:

   -bbvtl='*'

Specifying File Extensions

Several parameters allow default file extensions to be overridden. For example, a backup file extension may be specified with -bext=ext, where ext is some new extension. In order to provide the user some flexibility, the following convention is used in all cases to decide if a leading '.' should be used. If the extension ext begins with A-Z, a-z, or 0-9, then it will be appended to the filename with an intermediate '.' (or perhaps a '_' on VMS systems). Otherwise, it will be appended directly.

For example, suppose the file is somefile.pl. For -bext=old, a '.' is added to give somefile.pl.old. For -bext=.old, no additional '.' is added, so again the backup file is somefile.pl.old. For -bext=~, no dot is added, and the backup file will be somefile.pl~.
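The convention can be restated as a few lines of shell. This sketch only mirrors the rule as described here; the function name append_extension is hypothetical, and the VMS '_' variant is ignored.

```shell
#!/bin/sh
# Hypothetical helper: apply the leading-dot convention for extensions.
append_extension() {
    file=$1
    ext=$2
    case $ext in
        [A-Za-z0-9]*) echo "$file.$ext" ;;  # alphanumeric start: insert a '.'
        *)            echo "$file$ext" ;;   # anything else: append directly
    esac
}

append_extension somefile.pl old
append_extension somefile.pl .old
append_extension somefile.pl '~'
```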

SWITCHES WHICH MAY BE NEGATED

The following list shows all short parameter names which allow a prefix 'n' to produce the negated form:

 D      anl    asbl   asc    ast    asu    atc    atnl   aws    b
 baa    baao   bar    bbao   bbb    bbc    bbs    bl     bli    boa
 boc    bok    bol    bom    bos    bot    cblx   ce     conv   cpb
 cs     csc    cscb   cscw   dac    dbc    dbs    dcbl   dcsc   ddf
 dln    dnl    dop    dp     dpro   drc    dsc    dsm    dsn    dtc
 dtt    dwic   dwls   dwrs   dws    eos    f      fll    fpva   frm
 fs     fso    gcs    hbc    hbcm   hbco   hbh    hbhh   hbi    hbj
 hbk    hbm    hbn    hbp    hbpd   hbpu   hbq    hbs    hbsc   hbv
 hbw    hent   hic    hicm   hico   hih    hihh   hii    hij    hik
 him    hin    hip    hipd   hipu   hiq    his    hisc   hiv    hiw
 hsc    html   ibc    icb    icp    iob    isbc   iscl   kgb    kgbd
 kgbi   kis    lal    log    lop    lp     lsl    mem    nib    ohbr
 okw    ola    olc    oll    olq    opr    opt    osbc   osbr   otr
 ple    pod    pvl    q      sac    sbc    sbl    scbb   schb   scp
 scsb   sct    se     sfp    sfs    skp    sob    sobb   sohb   sop
 sosb   sot    ssc    st     sts    t      tac    tbc    toc    tp
 tqw    trp    ts     tsc    tso    vbc    vc     vmll   vsc    w
 wfc    wn     x      xci    xlp    xs

Equivalently, the prefix 'no' or 'no-' on the corresponding long names may be used.
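As an illustration of the two prefix rules, the helper below builds the negated spelling of a flag. The function name negate_flag is hypothetical, and the sketch does not check that the flag is actually one of the negatable switches listed above.

```shell
#!/bin/sh
# Hypothetical helper: build the negated form of a perltidy switch.
# Long names get the prefix 'no', short names get the prefix 'n'.
negate_flag() {
    case $1 in
        --*) printf '%s\n' "--no${1#--}" ;;
        -*)  printf '%s\n' "-n${1#-}" ;;
        *)   printf '%s\n' "n$1" ;;
    esac
}

negate_flag -hent
negate_flag --html-entities
```

The two calls print -nhent and --nohtml-entities, the forms given in the description of the -hent flag.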

LIMITATIONS

Parsing Limitations

Perltidy should work properly on most perl scripts. It does a lot of self-checking, but still, it is possible that an error could be introduced and go undetected. Therefore, it is essential to make careful backups and to test reformatted scripts.

The main current limitation is that perltidy does not scan modules included with 'use' statements. This makes it necessary to guess the context of any bare words introduced by such modules. Perltidy has good guessing algorithms, but they are not infallible. When it must guess, it leaves a message in the log file.

If you encounter a bug, please report it.

What perltidy does not parse and format

Perltidy indents but does not reformat comments and qw quotes. Perltidy does not in any way modify the contents of here documents or quoted text, even if they contain source code. (You could, however, reformat them separately). Perltidy does not format 'format' sections in any way. And, of course, it does not modify pod documents.

FILES

Temporary files

Under the -html option with the default --pod2html flag, a temporary file is required to pass text to Pod::Html. Unix systems will try to use the POSIX tmpnam() function. Otherwise the file perltidy.TMP will be temporarily created in the current working directory.

Special files when standard input is used

When standard input is used, the log file, if saved, is perltidy.LOG, and any errors are written to perltidy.ERR unless the -se flag is set. These are saved in the current working directory.

Files overwritten

The following file extensions are used by perltidy, and files with these extensions may be overwritten or deleted: .ERR, .LOG, .TEE, and/or .tdy, .html, and .bak, depending on the run type and settings.

File extension limitations

Perltidy does not operate on files for which the run could produce a file with a duplicated file extension. These extensions include .LOG, .ERR, .TEE, and perhaps .tdy and .bak, depending on the run type. The purpose of this rule is to prevent generating confusing filenames such as somefile.tdy.tdy.tdy.

ERROR HANDLING

An exit value of 0, 1, or 2 is returned by perltidy to indicate the status of the result.

An exit value of 0 indicates that perltidy ran to completion with no error messages.

A non-zero exit value indicates some kind of problem was detected.

An exit value of 1 indicates that perltidy terminated prematurely, usually due to some kind of errors in the input parameters. This can happen for example if a parameter is misspelled or given an invalid value. Error messages in the standard error output will indicate the cause of any problem. If perltidy terminates prematurely then no output files will be produced.

An exit value of 2 indicates that perltidy was able to run to completion but there are (1) warning messages in the standard error output related to parameter errors or problems and/or (2) warning messages in the perltidy error file(s) relating to possible syntax errors in one or more of the source script(s) being tidied. When multiple files are being processed, an error detected in any single file will produce this type of exit condition.
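A calling script can branch on these values. The sketch below is one possible way to do so; the function name describe_perltidy_status and the message wording are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn a perltidy exit value into a short message.
describe_perltidy_status() {
    case $1 in
        0) echo "ran to completion with no error messages" ;;
        1) echo "terminated prematurely; no output files were produced" ;;
        2) echo "ran to completion, but warnings or errors were reported" ;;
        *) echo "unexpected exit value: $1" ;;
    esac
}

# Typical use (assumes perltidy is on the PATH):
#   perltidy somefile.pl
#   describe_perltidy_status $?
```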

SEE ALSO

perlstyle(1), Perl::Tidy(3)

INSTALLATION

The perltidy binary uses the Perl::Tidy module and is installed when that module is installed. The module name is case-sensitive. For example, the basic command for installing with cpanm is 'cpanm Perl::Tidy'.

VERSION

This man page documents perltidy version 20230309

BUG REPORTS

The source code repository is at https://github.com/perltidy/perltidy.

To report a new bug or problem, use the "issues" link on this page.

COPYRIGHT

Copyright (c) 2000-2022 by Steve Hancock

LICENSE

This package is free software; you can redistribute it and/or modify it under the terms of the "GNU General Public License".

Please refer to the file "COPYING" for details.

DISCLAIMER

This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See the "GNU General Public License" for more details.


Perltidy Style Key

When perltidy was first developed, the main parameter choices were the number of indentation spaces and whether the user liked cuddled elses. As the number of users has grown so has the number of parameters. Now there are so many that it can be difficult for a new user to find a good initial set. This document is one attempt to help with this problem, and some other suggestions are given at the end.

Use this document to methodically find a starting set of perltidy parameters to approximate your style. We will be working on just one aspect of formatting at a time. Just read each question and select the best answer. Enter your parameters in a file named .perltidyrc (examples are listed at the end). Then move it to one of the places where perltidy will find it. You can run perltidy with the parameter -dpro to see where these places are for your system.

Before You Start

Before you begin, experiment using just perltidy filename.pl on some of your files. From the results (which you will find in files with a .tdy extension), you will get a sense of what formatting changes, if any, you'd like to make. If the default formatting is acceptable, you do not need a .perltidyrc file.

Use as Filter?

Do you almost always want to run perltidy as a standard filter on just one input file? If yes, use -st and -se.

Line Length Setting

Perltidy will set line breaks to prevent lines from exceeding the maximum line length.

Do you want the maximum line length to be 80 columns? If no, use -l=n, where n is the number of columns you prefer.

Indentation in Code Blocks

In the block below, the variable $anchor is one indentation level deep and is indented by 4 spaces as shown here:

    if ( $flag eq "a" ) {
        $anchor = $header;
    }  

If you want to change this to be a different number n of spaces per indentation level, use -i=n.

Continuation Indentation

Look at the statement beginning with $anchor:

    if ( $flag eq "a" ) {
        $anchor =
          substr( $header, 0, 6 )
          . substr( $char_list, $place_1, 1 )
          . substr( $char_list, $place_2, 1 );
    }

The statement is too long for the line length (80 characters by default), so it has been broken into 4 lines. The second and later lines have some extra "continuation indentation" to help make the start of the statement easy to find. The default number of extra spaces is 2. If you prefer a number n different from 2, you may specify this with -ci=n. It is probably best if it does not exceed the value of the primary indentation.

Tabs

The default, and recommendation, is to represent leading whitespace with actual space characters. However, if you prefer to entab leading whitespace with one tab character for each n spaces, use -et=n. Typically, n would be 8.
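The answers so far can be collected in a parameter file. The values below are arbitrary illustrations, not recommendations, and the file name perltidyrc.sample is made up; the file would need to be renamed to .perltidyrc and moved to a place where perltidy looks for it.

```shell
#!/bin/sh
# Write a sample parameter file holding the basic layout choices.
cat > perltidyrc.sample <<'EOF'
# maximum line length (the default is 80)
-l=100
# spaces per indentation level (the default is 4)
-i=2
# continuation indentation, kept no larger than -i (the default is 2)
-ci=2
EOF
```

No -et=n line is included because spaces are the default and recommended choice for leading whitespace.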

Opening Block Brace Right or Left?

Opening and closing curly braces, parentheses, and square brackets are divided into two separate categories and controlled separately in most cases. The two categories are (1) code block curly braces, which contain perl code, and (2) everything else. Basically, a code block brace is one which could contain semicolon-terminated lines of perl code. We will first work on the scheme for code block curly braces.

Decide which of the following opening brace styles you prefer for most blocks of code (with the possible exception of a sub block brace which will be covered later):

If you like opening braces on the right, like this, go to "Opening Braces Right".

    if ( $flag eq "h" ) {
        $headers = 0;
    }  

If you like opening braces on the left, like this, go to "Opening Braces Left".

    if ( $flag eq "h" )
    {
        $headers = 0;
    }

Opening Braces Right

In a multi-line if test expression, the default is to place the opening brace on the left, like this:

    if ( $bigwasteofspace1 && $bigwasteofspace2
        || $bigwasteofspace3 && $bigwasteofspace4 )
    {
        big_waste_of_time();
    }

This helps to visually separate the block contents from the test expression.

An alternative is to keep the brace on the right even for multiple-line test expressions, like this:

    if ( $bigwasteofspace1 && $bigwasteofspace2
        || $bigwasteofspace3 && $bigwasteofspace4 ) {
        big_waste_of_time();
    }

If you prefer this alternative, use -bar.

Cuddled Else?

Do you prefer this Cuddled Else style

    if ( $flag eq "h" ) {
        $headers = 0;
    } elsif ( $flag eq "f" ) {
        $sectiontype = 3;
    } else {
        print "invalid option: " . substr( $arg, $i, 1 ) . "\n";
        dohelp();
    }

instead of this default style?

    if ( $flag eq "h" ) {
        $headers = 0;
    }
    elsif ( $flag eq "f" ) {
        $sectiontype = 3;
    }
    else {
        print "invalid option: " . substr( $arg, $i, 1 ) . "\n";
        dohelp();
    }

If yes, you should use -ce. Now skip ahead to "Opening Sub Braces".

Opening Braces Left

Use -bl if you prefer this style:

    if ( $flag eq "h" )
    {
        $headers = 0;
    }

Use -bli if you prefer this indented-brace style:

    if ( $flag eq "h" )
      {
        $headers = 0;
      }

The number of spaces of extra indentation will be the value specified for continuation indentation with the -ci=n parameter (2 by default).

Opening Sub Braces

By default, the opening brace of a sub block will be treated the same as other code blocks. If this is okay, skip ahead to "Block Brace Vertical Tightness".

If you prefer an opening sub brace to be on a new line, like this:

    sub message
    {
        # -sbl
    }

use -sbl. If you prefer the sub brace on the right like this

    sub message {
        # -nsbl
    }

use -nsbl.

If you wish to give this opening sub brace some indentation you can do that with the parameters -bli and -blil which are described in the manual.

Block Brace Vertical Tightness

If you chose to put opening block braces of all types to the right, skip ahead to "Closing Block Brace Indentation".

If you chose to put braces of any type on the left, the default is to leave the opening brace on a line by itself, like this (shown for -bli, but also true for -bl):

    if ( $flag eq "h" )
      {
        $headers = 0;
      }

But you may also use this more compressed style if you wish:

    if ( $flag eq "h" )
      { $headers = 0;
      }

If you do not prefer this more compressed form, go to "Closing Block Brace Indentation".

Otherwise use parameter -bbvt=n, where n=1 or n=2. To decide, look at this snippet:

    # -bli -bbvt=1
    sub _directives
      {
        {
            'ENDIF' => \&_endif,
               'IF' => \&_if,
        };
      }

    # -bli -bbvt=2
    sub _directives
    {   {
            'ENDIF' => \&_endif,
            'IF'    => \&_if,
        };
    }

The difference is that -bbvt=1 breaks after an opening brace if the next line is unbalanced, whereas -bbvt=2 never breaks.

If you were expecting the 'ENDIF' word to move up vertically here, note that the second opening brace in the above example is not a code block brace (it is a hash brace), so the -bbvt does not apply to it (another parameter will).

Closing Block Brace Indentation

The default is to place closing braces at the same indentation as the opening keyword or brace of that code block, as shown here:

        if ($task) {
            yyy();
        }            # default

If you chose the -bli style, however, the default closing braces will be indented one continuation indentation like the opening brace:

        if ($task)
          {
            yyy();
          }    # -bli

If you prefer to give closing block braces one full level of indentation, independently of how the opening brace is treated, for example like this:

        if ($task) {
            yyy();
            }          # -icb

use -icb.

This completes the definition of the placement of code block braces.
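The block brace decisions made so far can be summarized as a lookup from choices to flags. This is a sketch only: the function name style_flags and the choice names are invented for this illustration, while the flags are the ones introduced in the sections above.

```shell
#!/bin/sh
# Hypothetical helper: translate style choices into perltidy flags.
style_flags() {
    flags=
    for choice in "$@"; do
        case $choice in
            brace-right)          flag= ;;      # default, no flag needed
            brace-right-always)   flag=-bar ;;
            brace-left)           flag=-bl ;;
            brace-left-indented)  flag=-bli ;;
            cuddled-else)         flag=-ce ;;
            sub-brace-left)       flag=-sbl ;;
            indent-closing-brace) flag=-icb ;;
            *) echo "unknown choice: $choice" >&2; return 1 ;;
        esac
        # Append the flag, separated by a space, unless it is empty.
        flags="$flags${flag:+ $flag}"
    done
    printf '%s\n' "${flags# }"
}

style_flags brace-right-always cuddled-else
```

The call shown prints -bar -ce, which could be pasted into a .perltidyrc file.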

Indentation Style for Other Containers

You have a choice of two basic indentation schemes for non-block containers. The default is to use a fixed number of spaces per indentation level (the same number of spaces used for code blocks, which is 4 by default). Here is an example of the default:

    $dbh = DBI->connect(
        undef, undef, undef,
        {
            PrintError => 0,
            RaiseError => 1
        }
    );

In this default indentation scheme, a simple formula is used to find the indentation of every line. Notice how the first 'undef' is indented 4 spaces (one level) to the right, and how 'PrintError' is indented 4 more spaces (one more level) to the right.

The alternate is to let the location of the opening paren (or square bracket, or curly brace) define the indentation, like this:

    $dbh = DBI->connect(
                         undef, undef, undef,
                         {
                           PrintError => 0,
                           RaiseError => 1
                         }
    );

The first scheme is completely robust. The second scheme often looks a little nicer, but be aware that in deeply nested structures it can be spoiled if the line length limit is exceeded. Also, if there are comments or blank lines within a complex structure, perltidy will temporarily fall back on the default indentation scheme. You may want to try both on large sections of code to see which works best.

If you prefer the first (default) scheme, no parameter is needed.

If you prefer the latter scheme, use -lp.

Opening Vertical Tightness

The information in this section applies mainly to the -lp style but it also applies in some cases to the default style. It will be illustrated for the -lp indentation style.

The default -lp indentation style ends a line at the opening tokens, like this:

    $dbh = DBI->connect(
                         undef, undef, undef,
                         {
                           PrintError => 0,
                           RaiseError => 1
                         }
    );

Here is a tighter alternative, which does not end a line with the opening tokens:

    $dbh = DBI->connect( undef, undef, undef,
                         { PrintError => 0,
                           RaiseError => 1
                         }
    );

The difference is that the lines have been compressed vertically without any changes to the indentation. This can almost always be done with the -lp indentation style, but only in limited cases for the default indentation style.

If you prefer the default, skip ahead to "Closing Token Placement".

Otherwise, use -vt=n, where n should be either 1 or 2. To help decide, observe the first three opening parens in the following snippet and choose the value of n you prefer. Here it is with -lp -vt=1:

    if (
         !defined(
                   start_slip( $DEVICE, $PHONE,  $ACCOUNT, $PASSWORD,
                               $LOCAL,  $REMOTE, $NETMASK, $MTU
                   )
         )
         && $continuation_flag
      )
    {
        do_something_about_it();
    }

And here it is again formatted with -lp -vt=2:

    if ( !defined( start_slip( $DEVICE, $PHONE,  $ACCOUNT, $PASSWORD,
                               $LOCAL,  $REMOTE, $NETMASK, $MTU
                   )
         )
         && $continuation_flag
      )
    {
        do_something_about_it();
    }

The -vt=1 style tries to display the structure by preventing more than one step in indentation per line. In this example, the first two opening parens were not followed by balanced lines, so -vt=1 broke after them.

The -vt=2 style does not limit itself to a single indentation step per line.

Note that in the above example the function 'do_something_about_it' started on a new line. This is because it follows an opening code block brace and is governed by the flag previously set in "Block Brace Vertical Tightness".

Closing Token Placement

You have several options for dealing with the terminal closing tokens of non-blocks. In the following examples, a closing parenthesis is shown, but these parameters apply to closing square brackets and non-block curly braces as well.

The default behavior for relatively large parenthesized lists is to place the closing paren on a separate new line. The flag -cti=n controls the amount of indentation of such a closing paren.

The default, -cti=0, for a line beginning with a closing paren, is to use the indentation defined by the next (lower) indentation level. This works well for the default indentation scheme:

    # perltidy
    @month_of_year = (
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
    );

but it may not look very good with the -lp indentation scheme:

    # perltidy -lp
    @month_of_year = (
                       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
    );

An alternative which works well with -lp indentation is -cti=1, which aligns the closing paren vertically with its opening paren, if possible:

    # perltidy -lp -cti=1
    @month_of_year = (
                       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
                     );

Another alternative, -cti=3, indents a line with leading closing paren one full indentation level:

    # perltidy -lp -cti=3
    @month_of_year = (
                       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
                       );

If you prefer the closing paren on a separate line like this, note the value of -cti=n that you prefer and skip ahead to "Define Horizontal Tightness".

Finally, the question of paren indentation can be avoided by placing it at the end of the previous line, like this:

    @month_of_year = (
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec' );

Perltidy will automatically do this to save space for very short lists but not for longer lists.

Use -vtc=n if you prefer to usually do this, where n is either 1 or 2. To determine n, we have to look at something more complex. Observe the behavior of the closing tokens in the following snippet:

Here is -lp -vtc=1:

    $srec->{'ACTION'} = [
                          $self->read_value(
                                             $lookup->{'VFMT'},
                                             $loc, $lookup, $fh
                          ),
                          $self->read_value(
                                             $lookup->{'VFMT2'},
                                             $loc, $lookup, $fh
                          ) ];

Here is -lp -vtc=2:

    $srec->{'ACTION'} = [
                          $self->read_value(
                                             $lookup->{'VFMT'},
                                             $loc, $lookup, $fh ),
                          $self->read_value(
                                             $lookup->{'VFMT2'},
                                             $loc, $lookup, $fh ) ];

Choose the one that you prefer. The difference is that -vtc=1 leaves closing tokens at the start of a line within a list, which can assist in keeping hierarchical lists readable. The -vtc=2 style always tries to move closing tokens to the end of a line.

If you choose -vtc=1, you may also want to specify a value of -cti=n (previous section) to handle cases where a line begins with a closing paren.

Stack Opening Tokens

In the following snippet the opening hash brace has been placed alone on a new line.

    $opt_c = Text::CSV_XS->new(
        {
            binary       => 1,
            sep_char     => $opt_c,
            always_quote => 1,
        }
    );

If you prefer to avoid isolated opening tokens by "stacking" them together with other opening tokens like this:

    $opt_c = Text::CSV_XS->new( {
            binary       => 1,
            sep_char     => $opt_c,
            always_quote => 1,
        }
    );

use -sot.

Stack Closing Tokens

Likewise, in the same snippet the default formatting leaves the closing paren on a line by itself here:

    $opt_c = Text::CSV_XS->new(
        {
            binary       => 1,
            sep_char     => $opt_c,
            always_quote => 1,
        }
    );

If you would like to avoid leaving isolated closing tokens by stacking them with other closing tokens, like this:

    $opt_c = Text::CSV_XS->new(
        {
            binary       => 1,
            sep_char     => $opt_c,
            always_quote => 1,
        } );

use -sct.

The -sct flag is somewhat similar to the -vtc flags, and in some cases it can give a similar result. The difference is that the -vtc flags try to avoid lines with leading closing tokens by "hiding" them at the end of a previous line, whereas the -sct flag merely tries to reduce the number of lines with isolated closing tokens by stacking multiple closing tokens together, but it does not try to hide them.

The manual shows how all of these vertical tightness controls may be applied independently to each type of non-block opening and closing token.

Define Horizontal Tightness

Horizontal tightness parameters define how much space is included within a set of container tokens.

For parentheses, decide which of the following values of -pt=n you prefer:

 if ( ( my $len_tab = length( $tabstr ) ) > 0 ) {  # -pt=0
 if ( ( my $len_tab = length($tabstr) ) > 0 ) {    # -pt=1 (default)
 if ((my $len_tab = length($tabstr)) > 0) {        # -pt=2

For n=0, space is always used, and for n=2, space is never used. For the default n=1, space is used if the parentheses contain more than one token.

For square brackets, decide which of the following values of -sbt=n you prefer:

 $width = $col[ $j + $k ] - $col[ $j ];  # -sbt=0
 $width = $col[ $j + $k ] - $col[$j];    # -sbt=1 (default)
 $width = $col[$j + $k] - $col[$j];      # -sbt=2 

For curly braces, decide which of the following values of -bt=n you prefer:

 $obj->{ $parsed_sql->{ 'table' }[0] };    # -bt=0
 $obj->{ $parsed_sql->{'table'}[0] };      # -bt=1 (default)
 $obj->{$parsed_sql->{'table'}[0]};        # -bt=2

For code block curly braces, decide which of the following values of -bbt=n you prefer:

 %bf = map { $_ => -M $_ } grep { /\.deb$/ } dirents '.'; # -bbt=0 (default)
 %bf = map { $_ => -M $_ } grep {/\.deb$/} dirents '.';   # -bbt=1
 %bf = map {$_ => -M $_} grep {/\.deb$/} dirents '.';     # -bbt=2
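
As an illustration of how these choices end up in a configuration file, a reader who prefers the most compact form in all four cases could collect the following lines. The values are taken from the examples above; the combination itself is only a sample, not a recommendation.

```
-pt=2
-sbt=2
-bt=2
-bbt=2
```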

Spaces between function names and opening parens

The default is not to place a space between a function name and its opening paren:

  myfunc( $a, $b, $c );    # default 

If you prefer a space:

  myfunc ( $a, $b, $c );   # -sfp

use -sfp.

Spaces between Perl keywords and parens

The default is to place a space between only these keywords and an opening paren:

   my local our and or eq ne if else elsif until unless 
   while for foreach return switch case given when

but no others. For example, the default is:

    $aa = pop(@bb);

If you want a space between all Perl keywords and an opening paren,

    $aa = pop (@bb);

use -skp. For detailed control of individual keywords, see the manual.

Statement Termination Semicolon Spaces

The default is not to put a space before a statement termination semicolon, like this:

    $i = 1;

If you prefer a space, like this:

    $i = 1 ; 

enter -sts.

For Loop Semicolon Spaces

The default is to place a space before a semicolon in a for statement, like this:

 for ( @a = @$ap, $u = shift @a ; @a ; $u = $v ) {  # -sfs (default)

If you prefer no such space, like this:

 for ( @a = @$ap, $u = shift @a; @a; $u = $v ) {    # -nsfs

enter -nsfs.

Block Comment Indentation

Block comments are comments which occupy a full line, as opposed to side comments. The default is to indent block comments with the same indentation as the code block that contains them (even though this will allow long comments to exceed the maximum line length).

If you would like block comments indented except when this would cause the maximum line length to be exceeded, use -olc. This will cause a group of consecutive block comments to be outdented by the amount needed to prevent any one from exceeding the maximum line length.

If you never want block comments indented, use -nibc.

If you want block comments to be indented only when they have some space characters before the leading # character in the input file, use -isbc.

The manual shows many other options for controlling comments.

Outdenting Long Quotes

Long quoted strings may exceed the specified line length limit. The default, when this happens, is to outdent them to the first column. Here is an example of an outdented long quote:

        if ($source_stream) {
            if ( @ARGV > 0 ) {
                die
 "You may not specify any filenames when a source array is given\n";
            }
        }

The effect is not too different from using a here document to represent the quote. If you prefer to leave the quote indented, like this:

        if ($source_stream) {
            if ( @ARGV > 0 ) {
                die
                  "You may not specify any filenames when a source array is given\n";
            }
        }

use -nolq.

Many Other Parameters

This document has only covered the most popular parameters. The manual contains many more and should be consulted if you did not find what you need here.

Example .perltidyrc files

Now gather together all of the parameters you prefer and enter them in a file called .perltidyrc.

Here are some example .perltidyrc files and the corresponding style.

Here is a little test snippet, shown the way it would appear with the default style.

    for (@methods) {
        push(
            @results,
            {
                name => $_->name,
                help => $_->help,
            }
        );
    }

You do not need a .perltidyrc file for this style.

Here is the same snippet

    for (@methods)
    {
        push(@results,
             {  name => $_->name,
                help => $_->help,
             }
            );
    }

for a .perltidyrc file containing these parameters:

 -bl
 -lp
 -cti=1
 -vt=1
 -pt=2

You do not need to place just one parameter per line, but this may be convenient for long lists. You may then hide any parameter by placing a # symbol before it.
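
A sketch of that layout, reusing the parameters from the example above: two parameters share a line, and -vt=1 is hidden with a leading #. The grouping and the comment lines are only illustrative.

```
# brace and indentation style
-bl -lp
-cti=1
# -vt=1
-pt=2
```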

And here is the snippet

    for (@methods) {
        push ( @results,
               { name => $_->name,
                 help => $_->help,
               } );
    }

for a .perltidyrc file containing these parameters:

 -lp
 -vt=1
 -vtc=1

Tidyview

There is a graphical program called tidyview which you can use to read a preliminary .perltidyrc file, make trial adjustments and immediately see their effect on a test file, and then write a new .perltidyrc. You can download a copy at

http://sourceforge.net/projects/tidyview

Additional Information

This document has covered the main parameters. Many more parameters are available for special purposes and for fine-tuning a style. For complete information see the perltidy manual http://perltidy.sourceforge.net/perltidy.html

For an introduction to using perltidy, see the tutorial http://perltidy.sourceforge.net/tutorial.html

Suggestions for improving this document are welcome and may be sent to perltidy at users.sourceforge.net

Perltidy Change Log

2023 03 09

- No significant bugs have been found since the last release to CPAN.
  Several minor issues have been fixed, and some new parameters have been
  added, as follows:

- Added parameter --one-line-block-exclusion-list=s, or -olbxl=s, where
  s is a list of block types which should not automatically be turned
  into one-line blocks.  This implements the issue raised in PR #111.
  The list s may include any of the words 'sort map grep eval', or
  it may be '*' to indicate all of these.  So for example to prevent
  multi-line 'eval' blocks from becoming one-line blocks, the command
  would be -olbxl='eval'.

- For the -b (--backup-and-modify-in-place) option, the handling of file
  timestamps has changed (git #113, rt #145999).  First, if there are no formatting
  changes to an input file, it will keep its original modification time.
  Second, any backup file will keep its original modification time.  This
  was previously true for --backup-method=move but not for the default
  --backup-method=copy.  The purpose of these changes is to avoid
  triggering Makefile operations when there are no actual file changes.
  If this causes a problem please open an issue for discussion on github.

- A change was made to the way line breaks are made at the '.'
  operator when the user sets -wba='.' to request breaks after a '.'
  (this setting is not recommended because it can be hard to read).
  The goal of the change is to make switching from breaks before '.'s
  to breaks after '.'s just move the dots from the beginning of
  lines to the end of lines.  For example:

        # default and recommended (--want-break-before='.'):
        $output_rules .=
          (     'class'
              . $dir
              . '.stamp: $('
              . $dir
              . '_JAVA)' . "\n" . "\t"
              . '$(CLASSPATH_ENV) $(JAVAC) -d $(JAVAROOT) '
              . '$(JAVACFLAGS) $?' . "\n" . "\t"
              . 'echo timestamp > class'
              . $dir
              . '.stamp'
              . "\n" );

        # perltidy --want-break-after='.'
        $output_rules .=
          ( 'class' .
              $dir .
              '.stamp: $(' .
              $dir .
              '_JAVA)' . "\n" . "\t" .
              '$(CLASSPATH_ENV) $(JAVAC) -d $(JAVAROOT) ' .
              '$(JAVACFLAGS) $?' . "\n" . "\t" .
              'echo timestamp > class' .
              $dir .
              '.stamp' .
              "\n" );

  For existing code formatted with -wba='.', this may cause some
  changes in the formatting of code with long concatenation chains.

- Added option --use-feature=class, or -uf=class, for issue rt #145706.
  This adds keywords 'class', 'method', 'field', and 'ADJUST' in support of
  this feature which is being tested for future inclusion in Perl.
  An effort has been made to avoid conflicts with past uses of these
  words, especially 'method' and 'class'. The default setting
  is --use-feature=class. If this causes a conflict, this option can
  be turned off by entering -uf=' '.

  In other words, perltidy should work for both old and new uses of
  these keywords with the default settings, but this flag is available
  if a conflict arises.

- Added option -bfvt=n, or --brace-follower-vertical-tightness=n,
  for part of issue git #110.  For n=2, this option looks for lines
  which would otherwise be, by default,

  }
    or ..

  and joins them into a single line

  } or ..

  where the 'or' can be one of a number of logical operators, or 'if' or
  'unless'.  The default is not to do this and can be indicated with n=1.

- Added option -cpb, or --cuddled-paren-brace, for issue git #110.
  This option will cause perltidy to join two lines which
  otherwise would be, by default,

    )
  {

  into a single line

  ) {

- Some minor changes to existing formatted output may occur as a result
  of fixing minor formatting issues with edge cases.  This is especially
  true for code which uses the -lp or -xlp styles.

- Added option -dbs, or --dump-block-summary, to dump summary
  information about code blocks in a file to standard output.
  The basic command is:

      perltidy -dbs somefile.pl >blocks.csv

  Instead of formatting ``somefile.pl``, this dumps the following
  comma-separated items describing its blocks to the standard output:

   filename     - the name of the file
   line         - the line number of the opening brace of this block
   line_count   - the number of lines between opening and closing braces
   code_lines   - the number of lines excluding blanks, comments, and pod
   type         - the block type (sub, for, foreach, ...)
   name         - the block name if applicable (sub name, label, asub name)
   depth        - the nesting depth of the opening block brace
   max_change   - the change in depth to the most deeply nested code block
   block_count  - the total number of code blocks nested in this block
   mccabe_count - the McCabe complexity measure of this code block

  This can be useful for code restructuring. The man page for perltidy
  has more information and describes controls for selecting block types.
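
  As a sketch of how the dumped data might be used for restructuring, the
  following script lists the blocks whose mccabe_count reaches a threshold.
  It assumes the fields appear in the order listed above and that no field
  contains an embedded comma; the sample rows and the helper name
  complex_blocks are hypothetical and stand in for real -dbs output.

```perl
use strict;
use warnings;

# Column order as listed above; assumed to match the order in the CSV output.
my @columns = qw(
  filename line line_count code_lines type name
  depth max_change block_count mccabe_count
);

# Hypothetical sample rows standing in for the contents of blocks.csv.
my @sample_lines = (
    'somefile.pl,12,40,31,sub,parse_args,0,3,7,14',
    'somefile.pl,20,6,5,foreach,,1,1,1,2',
    'somefile.pl,60,9,8,sub,usage,0,1,1,3',
);

# Return the rows whose mccabe_count is at least $threshold,
# sorted from most to least complex.
sub complex_blocks {
    my ( $threshold, @lines ) = @_;
    my @rows;
    foreach my $line (@lines) {
        chomp $line;
        my %row;

        # Simple split; assumes no field contains an embedded comma.
        # The -1 limit keeps trailing empty fields.
        @row{@columns} = split /,/, $line, -1;

        # Skip a header row or any line without a numeric count.
        next
          unless defined $row{mccabe_count}
          && $row{mccabe_count} =~ /^\d+$/;
        push @rows, \%row if $row{mccabe_count} >= $threshold;
    }
    my @sorted = sort { $b->{mccabe_count} <=> $a->{mccabe_count} } @rows;
    return @sorted;
}

foreach my $row ( complex_blocks( 3, @sample_lines ) ) {
    printf "%s line %d: %s %s (mccabe_count=%d)\n",
      @{$row}{qw(filename line type name mccabe_count)};
}
```

  To run it on real data, replace @sample_lines with the lines read from
  blocks.csv.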

- This version was stress-tested for over 100 cpu hours with random
  input parameters. No failures to converge, internal fault checks,
  undefined variable references or other irregularities were seen.

- This version runs a few percent faster than the previous release on
  large files due to optimizations made with the help of Devel::NYTProf.

2022 11 12

- Fix rt #145095, undef warning in Perl before 5.12. Version 20221112 is
  identical to 2022111 except for this fix for older versions of Perl.

- No significant bugs have been found since the last release to CPAN.
  Several minor issues have been fixed, and some new parameters have been
  added, as follows:

- Fixed rare problem with irregular indentation involving --cuddled-else,
  usually also with the combination -xci and -lp.  Reported in rt #144979.

- Add option --weld-fat-comma (-wfc) for issue git #108. When -wfc
  is set, along with -wn, perltidy is allowed to weld an opening paren
  to an inner opening container when they are separated by a hash key
  and fat comma (=>).  For example:

    # perltidy -wn
    $self->call_method(
        method_name_foo => {
            some_arg1       => $foo,
            some_other_arg3 => $bar->{'baz'},
        }
    );

    # perltidy -wn -wfc
    $self->call_method( method_name_foo => {
        some_arg1       => $foo,
        some_other_arg3 => $bar->{'baz'},
    } );

  This flag is off by default.

- Fix issue git #106. This fixes some edge cases of formatting with the
  combination -xlp -pt=2, mainly for two-line lists with short function
  names. One indentation space is removed to improve alignment:

    # OLD: perltidy -xlp -pt=2
    is($module->VERSION, $expected,
        "$main_module->VERSION matches $module->VERSION ($expected)");

    # NEW: perltidy -xlp -pt=2
    is($module->VERSION, $expected,
       "$main_module->VERSION matches $module->VERSION ($expected)");

- Fix for issue git #105, incorrect formatting with 5.36 experimental
  for_list feature.

- Fix for issue git #103. For parameter -b, or --backup-and-modify-in-place,
  the default backup method has been changed to preserve the inode value
  of the file being formatted.  If this causes a problem, the previous
  method is available and can be used by setting --backup-method='move', or
  -bm='move'.  The new default corresponds to -bm='copy'.  The difference
  between the two methods is as follows.  For the older method,
  -bm='move', the input file was moved to the backup, and a new file was
  created for the formatted output.  This caused the inode to change.  For
  the new default method, -bm='copy', the input is copied to the backup
  and then the input file is reopened and rewritten. This preserves the
  file inode.  Tests have not produced any problems with this change, but
  before using the --backup-and-modify-in-place parameter please verify
  that it works correctly in your environment and operating system. The
  initial update for this had an error which was caught and fixed
  in git #109.
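
  The difference between the two methods can be demonstrated with a small
  standalone script. This is not perltidy's own code, only a sketch of the
  two sequences of file operations described above, using the core modules
  File::Copy and File::Temp. The helper names are hypothetical, and a POSIX
  filesystem with meaningful inode numbers is assumed.

```perl
use strict;
use warnings;
use File::Copy qw(copy move);
use File::Temp qw(tempdir);

# Write $text to $path, replacing any existing content.
sub write_file {
    my ( $path, $text ) = @_;
    open( my $fh, '>', $path ) or die "cannot write $path: $!";
    print {$fh} $text;
    close($fh) or die "cannot close $path: $!";
    return;
}

# Inode number of $path (element 1 of the list returned by stat).
sub inode_of {
    my ($path) = @_;
    my @st = stat($path) or die "cannot stat $path: $!";
    return $st[1];
}

# 'move' style: rename the input to the backup, then create a new file.
sub rewrite_by_move {
    my ( $file, $backup, $new_text ) = @_;
    move( $file, $backup ) or die "move failed: $!";
    write_file( $file, $new_text );
    return;
}

# 'copy' style: copy the input to the backup, then reopen and rewrite it.
sub rewrite_by_copy {
    my ( $file, $backup, $new_text ) = @_;
    copy( $file, $backup ) or die "copy failed: $!";
    write_file( $file, $new_text );
    return;
}

my $dir = tempdir( CLEANUP => 1 );
my %inode_kept;
foreach my $method (qw(move copy)) {
    my $file = "$dir/$method.pl";
    write_file( $file, "print 'old';\n" );
    my $before = inode_of($file);
    if ( $method eq 'move' ) {
        rewrite_by_move( $file, "$file.bak", "print 'new';\n" );
    }
    else {
        rewrite_by_copy( $file, "$file.bak", "print 'new';\n" );
    }
    $inode_kept{$method} = ( inode_of($file) == $before ) ? 1 : 0;
    print "$method: inode ",
      ( $inode_kept{$method} ? 'preserved' : 'changed' ), "\n";
}
```

  On a typical POSIX filesystem the 'move' sequence should report a changed
  inode and the 'copy' sequence a preserved one.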

- Fix undefined value message when perltidy -D is used (git #104)

- Fixed an inconsistency in html colors near pointers when -html is used.
  Previously, a '->' at the end of a line got the 'punctuation color', black
  by default but a '->' before an identifier got the color of the following
  identifier. Now all pointers get the same color, which is black by default.
  Also, previously a word following a '->' was given the color of a bareword,
  black by default, but now it is given the color of an identifier.

- Fixed incorrect indentation of any function named 'err'.  This was
  due to some old code from when "use feature 'err'" was valid.

        # OLD:
        my ($curr) = current();
          err (@_);

        # NEW:
        my ($curr) = current();
        err(@_);

- Added parameter --delete-repeated-commas (-drc) to delete repeated
  commas. This is off by default. For example, given:

        ignoreSpec( $file, "file",, \%spec, \%Rspec );

        # perltidy -drc:
        ignoreSpec( $file, "file", \%spec, \%Rspec );

- Add continuation indentation to long C-style 'for' terms; i.e.

        # OLD
        for (
            $j = $i - $shell ;
            $j >= 0
            && ++$ncomp
            && $array->[$j] gt $array->[ $j + $shell ] ;
            $j -= $shell
          )

        # NEW
        for (
            $j = $i - $shell ;
            $j >= 0
              && ++$ncomp
              && $array->[$j] gt $array->[ $j + $shell ] ;
            $j -= $shell
          )

  This will change some existing formatting with very long 'for' terms.

- The following new parameters are available for manipulating
  trailing commas of lists. They are described in the manual.

       --want-trailing-commas=s, -wtc=s
       --add-trailing-commas,    -atc
       --delete-trailing-commas, -dtc
       --delete-weld-interfering-commas, -dwic

- Files with errors due to missing, extra or misplaced parens, braces,
  or square brackets are now written back out verbatim, without any
  attempt at formatting.

- This version runs 10 to 15 percent faster than the previous
  release on large files due to optimizations made with the help of
  Devel::NYTProf.

- This version was stress-tested for over 200 cpu hours with random
  input parameters. No failures to converge, internal fault checks,
  undefined variable references or other irregularities were seen.

2022 06 13

- No significant bugs have been found since the last release but users
  of programs which call the Perl::Tidy module should note the first
  item below, which changes a default setting.  The main change to
  existing formatting is the second item below, which adds vertical
  alignment to 'use' statements.

- The flag --encode-output-strings, or -eos, is now set 'on' by default.
  This has no effect on the use of the 'perltidy' binary script, but could
  change the behavior of some programs which use the Perl::Tidy module on
  files encoded in UTF-8.  If any problems are noticed, an emergency fix
  can be made by reverting to the old default by setting -neos.  For
  an explanation of why this change needs to be made see:

  https://github.com/perltidy/perltidy/issues/92

  https://github.com/perltidy/perltidy/blob/master/docs/eos_flag.md

- Added vertical alignment for qw quotes and empty parens in 'use'
  statements (see issue git #93).  This new alignment is 'on' by default
  and will change formatting as shown below. If this is not wanted it can
  be turned off with the parameter -vxl='q' (--valign-exclusion-list='q').

    # old default, or -vxl='q'
    use Getopt::Long qw(GetOptions);
    use Fcntl qw(O_RDONLY O_WRONLY O_EXCL O_CREAT);
    use Symbol qw(gensym);
    use Exporter ();

    # new default
    use Getopt::Long qw(GetOptions);
    use Fcntl        qw(O_RDONLY O_WRONLY O_EXCL O_CREAT);
    use Symbol       qw(gensym);
    use Exporter     ();

- The parameter -kbb (--keep-old-breakpoints-before) now ignores a request
  to break before an opening token, such as '('.  Likewise, -kba
  (--keep-old-breakpoints-after) now ignores a request to break after a
  closing token, such as ')'. This change was made to avoid a rare
  instability discovered in random testing.

- Previously, if a -dsc command was used to delete all side comments,
  then any special side comments for controlling non-indenting braces got
  deleted too. Now, these control side comments are retained when -dsc is
  set unless a -nnib (--nonon-indenting-braces) flag is also set to
  deactivate them.

- This version runs about 10 percent faster on large files than the previous
  release due to optimizations made with the help of Devel::NYTProf.  Much
  of the gain came from faster processing of blank tokens and comments.

- This version of perltidy was stress-tested for many cpu hours with
  random input parameters. No failures to converge, internal fault checks,
  undefined variable references or other irregularities were seen.

2022 02 17

- A new flag, --encode-output-strings, or -eos, has been added to resolve
  issue git #83. This issue involves the interface between Perl::Tidy and
  calling programs, and Code::TidyAll (tidyall) in particular.  The problem
  is that perltidy by default returns decoded character strings, but
  tidyall expects encoded strings.  This flag provides a fix for that.

  So, tidyall users who process encoded (utf8) files should update to this
  version of Perl::Tidy and use -eos for tidyall.  For further info see:

  https://github.com/houseabsolute/perl-code-tidyall/issues/84, and
  https://github.com/perltidy/perltidy/issues/83

  If there are other applications having utf8 problems at the interface
  with Perl::Tidy, this flag may need to be set.
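
  The distinction between a decoded character string and an encoded string
  can be illustrated without Perl::Tidy, using only the core Encode module.
  This sketch does not call perltidy; the sample line of code is
  hypothetical and simply contains one non-ASCII character.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The UTF-8 bytes of a short line of Perl, as they would be read from a
# file in raw mode; "\xC3\xA9" is the two-byte encoding of e-acute.
my $bytes_in = "my \$name = 'caf\xC3\xA9';\n";

# Decoded: a character string, one element per character.
my $characters = decode( 'UTF-8', $bytes_in );

# Encoded: a byte string again, ready to write to a raw filehandle.
my $bytes_out = encode( 'UTF-8', $characters );

printf "decoded length: %d characters\n", length($characters);
printf "encoded length: %d bytes\n",      length($bytes_out);
print "round trip ok\n" if $bytes_out eq $bytes_in;
```

  A caller that expects encoded strings but receives the decoded form (or
  the reverse) will see lengths and contents that differ in this way for
  any text containing non-ASCII characters.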

- The default value of the new flag, --encode-output-strings, -eos, is currently
  -neos BUT THIS MAY CHANGE in a future release because the current
  default is inconvenient.  So authors of programs which receive character
  strings back from Perl::Tidy should set this flag, if necessary,
  to avoid any problems when the default changes.  For more information see the
  above links and the Perl::Tidy man pages for example coding.

- The possible values of the string 's' for the flag '--character-encoding=s'
  have been limited to 'utf8' (or UTF-8), 'none', or 'guess'.  Previously an
  arbitrary encoding could also be specified, but as a result of discussions
  regarding git #83 it became clear that this could cause trouble
  since the output encoding was still restricted to UTF-8. Users
  who need to work in other encodings can write a short program calling
  Perl::Tidy with pre- and post-processing to handle encoding/decoding.

- A new flag --break-after-labels=i, or -bal=i, was added for git #86.  This
  controls line breaks after labels, to provide a uniform style, as follows:

        -bal=0 follows the input line breaks [DEFAULT]
        -bal=1 always break after a label
        -bal=2 never break after a label

  For example:

      # perltidy -bal=1
      INIT:
        {
            $xx = 1.234;
        }

      # perltidy -bal=2
      INIT: {
            $xx = 1.234;
        }

- Fix issue git #82, an error handling something like ${bareword} in a
  possible indirect object location. Perl allows this, and now perltidy does too.

- The flag -kbb=s, or --keep-old-breakpoints-before=s, and its counterpart
  -kba=s, or --keep-old-breakpoints-after=s, have expanded functionality
  for the container tokens: { [ ( } ] ).  The updated man pages have
  details.

- Two new flags have been added to provide finer vertical alignment control,
  --valign-exclusion-list=s (-vxl=s) and  --valign-inclusion-list=s (-vil=s).
  This has been requested several times, most recently in git #79, and it
  finally got done.  For example, -vil='=>' means just align on '=>'.

- A new flag -gal=s, --grep-alias-list=s, has been added as suggested in
  git #77.  This allows code blocks passed to list operator functions to
  be formatted in the same way as a code block passed to grep, map, or sort.
  By default, the following list operators in List::Util are included:

    all any first none notall reduce reductions

  They can be changed with the flag -gaxl=s, --grep-alias-exclusion-list=s.

- A new flag -xlp has been added which can be set to avoid most of the
  limitations of the -lp flag regarding side comments, blank lines, and
  code blocks.  See the man pages for more info. This fixes git #64 and git #74.
  The older -lp flag still works.

- A new flag -lpil=s, --line-up-parentheses-inclusion-list=s, has been added
  as an alternative to -lpxl=s, --line-up-parentheses-exclusion-list=s.
  It supplies equivalent information but is much easier to describe and use.
  It works for both the older -lp version and the newer -xlp.

- The coding for the older -lp flag has been updated to avoid some problems
  and limitations.  The new coding allows the -lp indentation style to
  mix smoothly with the standard indentation in a single file.  Some problems
  where -lp and -xci flags were not working well together have been fixed, such
  as happened in issue rt140025.  As a result of these updates some minor
  changes in existing code using the -lp style may occur.

- This version of perltidy was stress-tested for many cpu hours with
  random input parameters. No failures to converge, internal fault checks,
  undefined variable references or other irregularities were seen.

- Numerous minor fixes have been made, mostly very rare formatting
  instabilities found in random testing.

2021 10 29

- No significant bugs have been found since the last release, but several
  minor issues have been fixed.  Vertical alignment has been improved for
  lists of call args which are not contained within parens (next item).

- Vertical alignment of function calls without parens has been improved with
  the goal of making vertical alignment essentially the same with or
  without parens around the call args.  Some examples:

    # OLD
    mkTextConfig $c, $x, $y, -anchor => 'se', $color;
    mkTextConfig $c, $x + 30, $y, -anchor => 's',  $color;
    mkTextConfig $c, $x + 60, $y, -anchor => 'sw', $color;
    mkTextConfig $c, $x, $y + 30, -anchor => 'e', $color;

    # NEW
    mkTextConfig $c, $x,      $y,      -anchor => 'se', $color;
    mkTextConfig $c, $x + 30, $y,      -anchor => 's',  $color;
    mkTextConfig $c, $x + 60, $y,      -anchor => 'sw', $color;
    mkTextConfig $c, $x,      $y + 30, -anchor => 'e',  $color;

    # OLD
    is id_2obj($id), undef, "unregistered object not retrieved";
    is scalar keys %$ob_reg, 0, "object registry empty";
    is register($obj), $obj, "object returned by register";
    is scalar keys %$ob_reg, 1, "object registry nonempty";
    is id_2obj($id), $obj, "registered object retrieved";

    # NEW
    is id_2obj($id),         undef, "unregistered object not retrieved";
    is scalar keys %$ob_reg, 0,     "object registry empty";
    is register($obj),       $obj,  "object returned by register";
    is scalar keys %$ob_reg, 1,     "object registry nonempty";
    is id_2obj($id),         $obj,  "registered object retrieved";

  This will cause some changes in alignment, hopefully for the better,
  particularly in test code which often uses numerous parenless function
  calls with functions like 'ok', 'is', 'is_deeply', ....

- Two new parameters were added to control the block types to which the
  -bl (--opening-brace-on-new-line) flag applies.  The new parameters are
  --brace-left-list=s, or -bll=s, and --brace-left-exclusion-list=s,
  or -blxl=s.  Previously the -bl flag was 'hardwired' to apply to
  nearly all blocks. The default values of the new parameters
  retain the old default behavior but allow it to be changed.

- The default behavior of the -bli (--brace-left-and-indent) flag has changed
  slightly.  Previously, if you set -bli, then the -bl flag would also
  automatically be set.  Consequently, block types which were not included
  in the default list for -bli would get -bl formatting.  This is no longer done,
  and these two styles are now controlled independently.  The manual describes
  the controls.  If you want to recover the exact previous default behavior of
  -bli, then add the -bl flag.

- A partial fix was made for issue git #74. The -lp formatting style was
  being lost when a one-line anonymous sub was followed by a closing brace.

- Fixed issue git #73, in which the -nfpva flag was not working correctly.
  Some unwanted vertical alignments of spaced function parens
  were being made.

- Updated the man pages to clarify the flags -valign and -novalign
  for turning vertical alignment on and off (issue git #72).
  Added parameters -vc -vsc -vbc for separately turning off vertical
  alignment of code, side comments and block comments.

- Fixed issue git #68, where a blank line following a closing code-skipping
  comment, '#>>V', could be lost.

- This version runs 10 to 15 percent faster on large files than the
  previous release due to optimizations made with the help of NYTProf.

- This version of perltidy was stress-tested for many cpu hours with
  random input parameters. No instabilities, internal fault checks,
  undefined variable references or other irregularities were seen.

- Numerous minor fixes have been made, mostly very rare formatting instabilities
  found in random testing. An effort has been made to minimize changes to
  existing formatting that these fixes produce, but occasional changes
  may occur. Many of these updates are listed at:

       https://github.com/perltidy/perltidy/blob/master/local-docs/BugLog.pod

2021 07 17

- This release is being made mainly because of the next item, in which an
  error message about an uninitialized value could be produced in certain
  cases when format-skipping is used.  The error message was annoying but
  harmless to formatting.

- Fixed an undefined variable message, see git #67. When a format skipping
  comment '#<<' is placed before the first line of code in a script, a
  message 'Use of uninitialized value $Ktoken_vars in numeric ...' can
  occur.

- A warning will no longer be given if a script has an opening code-skipping
  comment '#<<V' which is not terminated with a closing comment '#>>V'. This
  makes code-skipping and format-skipping behave in a similar way: an
  opening comment without a corresponding closing comment will cause
  the rest of a file to be skipped.  If there is a question about which lines
  are skipped, a .LOG file can be produced with the -g flag and it will have
  this information.

- Removed the limit on -ci=n when -xci is set, reference: rt #136415.
  This update removes a limit in the previous two versions in which the
  value of -ci=n was limited to the value of -i=n when -xci was set.
  This limit had been placed to avoid some formatting instabilities,
  but recent coding improvements allow the limit to be removed.

- The -wn and -bbxx=n flags were not working together correctly. This has
  been fixed.

- This version may produce occasional differences in formatting compared to
  previous versions, mainly for lines which are near the specified line
  length limit.  This is due to ongoing efforts to eliminate edge cases of
  formatting instability.

- Numerous minor fixes have been made. A complete list is at:

       https://github.com/perltidy/perltidy/blob/master/local-docs/BugLog.pod

2021 06 25

- This release adds several new requested parameters.  No significant bugs have
  been found since the last release, but a number of minor problems have been
  corrected.

- Added a new option '--code-skipping', requested in git #65, in which code
  between comment lines '#<<V' and '#>>V' is passed verbatim to the output
  stream.  It is similar to --format-skipping but there is no error checking
  of the skipped code. This can be useful for skipping past code which
  employs an extended syntax.

- Added a new option for closing paren placement, -vtc=3, requested in rt #136417.

- Added flag -atnl, --add-terminal-newline, to help issue git #58.
  This flag tells perltidy to terminate the last line of the output stream
  with a newline character, regardless of whether or not the input stream
  was terminated with a newline character.  This is the default.
  If this flag is negated, with -natnl, then perltidy will add a terminal
  newline character to the output stream only if the input
  stream is terminated with a newline.

- Some nested structures formatted with the -lp indentation option may have
  some changes in indentation.  This is due to updates which were made to
  prevent formatting instability when line lengths are limited by the maximum line
  length. Most scripts will not be affected. If this causes unwanted formatting
  changes, try increasing the --maximum-line-length by a few characters.

- Numerous minor fixes have been made. A complete list is at:

       https://github.com/perltidy/perltidy/blob/master/local-docs/BugLog.pod
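
The --code-skipping option added in this release can be illustrated with a small script. This is only a sketch: the subroutine and variable names are made up for the example, and the skipped region contains ordinary Perl so that the script still runs. In practice the region would more likely hold code in an extended syntax that perltidy cannot parse.

```perl
use strict;
use warnings;

# Ordinary code: perltidy parses and reformats this as usual.
my @values = ( 1, 2, 3 );

# The lines between the two marker comments below are passed to the
# output verbatim, with no formatting and no error checking.
#<<V
sub total { my $sum = 0; $sum += $_ for @_; return $sum }
#>>V

# Normal formatting resumes after the closing marker.
print total(@values), "\n";
```

If the closing '#>>V' marker is left out, the rest of the file is skipped, so it is worth keeping the markers in matched pairs.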

2021 04 02

- This release fixes several non-critical bugs which have been found since the last
release.  An effort has been made to keep existing formatting unchanged.

- Fixed issue git #57 regarding uninitialized warning flag.

- Added experimental flag -lpxl=s requested in issue git #56 to provide some
control over which containers get -lp indentation.

- Fixed issue git #55 regarding lack of coordination of the --break-before-xxx
flags and the --line-up-parens flag.

- Fixed issue git #54 regarding irregular application of the --break-before-paren
and similar --break-before-xxx flags, in which lists without commas were not
being formatted according to these flags.

- Fixed issue git #53. A flag was added to turn off alignment of spaced function
parens.  If the --space-function-paren, -sfp flag is set, a side-effect is that the
spaced function parens may get vertically aligned.  This can be undesirable,
so a new parameter '--function-paren-vertical-alignment', or '-fpva', has been
added to turn this vertical alignment off. The default is '-fpva', so that
existing formatting is not changed.  Use '-nfpva' to turn off unwanted
vertical alignment.  To illustrate the possibilities:

    # perltidy [default]
    myfun( $aaa, $b, $cc );
    mylongfun( $a, $b, $c );

    # perltidy -sfp
    myfun     ( $aaa, $b, $cc );
    mylongfun ( $a, $b, $c );

    # perltidy -sfp -nfpva
    myfun ( $aaa, $b, $cc );
    mylongfun ( $a, $b, $c );

- Fixed issue git #51, a closing qw bare paren was not being outdented when
the -nodelete-old-newlines flag was set.

- Fixed numerous edge cases involving unusual parameter combinations which
  could cause alternating output states.  Most scripts will not be
  changed by these fixes.

- A more complete list of updates is at

       https://github.com/perltidy/perltidy/blob/master/local-docs/BugLog.pod

2021 01 11

- Fixed issue git #49, -se breaks warnings exit status behavior.
The exit status flag was not always being set when the -se flag was set.

- Some improvements have been made in the method for aligning side comments.
One of the problems that was fixed is that there was a tendency for side comment
placement to drift to the right in long scripts.  Programs with side comments
may have a few changes.

- Some improvements have been made in formatting qw quoted lists.  This
fixes issue git #51, in which closing qw pattern delimiters were not always
following the --closing-token-indentation=n settings.
Now qw closing delimiters ')', '}' and ']' follow these flags, and the
delimiter '>' follows the flag for ')'.  Other qw pattern delimiters remain
indented as they are now.  This change will cause some small formatting changes
in some existing programs.

- Another change involving qw lists is that they get full indentation,
rather than just continuation indentation, if

     (1) the closing delimiter is one of } ) ] > and is on a separate line,
     (2) the opening delimiter  (i.e. 'qw{' ) is also on a separate line, and
     (3) the -xci flag (--extended-continuation-indentation) is set.

This improves formatting when qw lists are contained in other lists. For example,

        # OLD: perltidy
        foreach $color (
            qw(
            AntiqueWhite3 Bisque1 Bisque2 Bisque3 Bisque4
            SlateBlue3 RoyalBlue1 SteelBlue2 DeepSkyBlue3
            ),
            qw(
            LightBlue1 DarkSlateGray1 Aquamarine2 DarkSeaGreen2
            SeaGreen1 Yellow1 IndianRed1 IndianRed2 Tan1 Tan4
            )
          )

        # NEW, perltidy -xci
        foreach $color (
            qw(
                AntiqueWhite3 Bisque1 Bisque2 Bisque3 Bisque4
                SlateBlue3 RoyalBlue1 SteelBlue2 DeepSkyBlue3
            ),
            qw(
                LightBlue1 DarkSlateGray1 Aquamarine2 DarkSeaGreen2
                SeaGreen1 Yellow1 IndianRed1 IndianRed2 Tan1 Tan4
            )
          )

- Some minor improvements have been made to the rules for formatting
some edge vertical alignment cases, usually involving two dissimilar lines.

- A more complete list of updates is at

       https://github.com/perltidy/perltidy/blob/master/local-docs/BugLog.pod

2020 12 07

- Fixed issue git #47, incorrect welding of anonymous subs.
  An incorrect weld format was being made when the --weld-nested-containers option
  (-wn) was used to format a function which returns a list of anonymous subs.
  For example, the following snippet was incorrectly being welded.

$promises[$i]->then(
    sub { $all->resolve(@_); () },
    sub {
        $results->[$i] = [@_];
        $all->reject(@$results) if --$remaining <= 0;
        return ();
    }
);

This was due to an error introduced in v20201201 related to parsing sub
signatures.  Reformatting with the current version will fix the problem.

2020 12 01

- This release is being made primarily to make available several new formatting
  parameters, in particular -xci, -kbb=s, -kba=s, and -wnxl=s. No significant
  bugs have been found since the previous release, but numerous minor issues have
  been found and fixed as listed below.

- This version is about 20% faster than the previous version due to optimizations
  made with the help of Devel::NYTProf.

- Added flag -wnxl=s, --weld-nested-exclusion-list=s, to provide control over which containers
  are welded with the --weld-nested-containers parameter.  This is related to issue git #45.

- Merged pull request git #46 which fixes the docs regarding the -fse flag.

- Fixed issue git #45, -vtc=n flag was ignored when -wn was set.

- implement request RT #133649, delete-old-newlines selectively. Two parameters,

  -kbb=s or --keep-old-breakpoints-before=s, and
  -kba=s or --keep-old-breakpoints-after=s

  were added to request that old breakpoints be kept before or after
  selected token types.  For example, -kbb='=>' means that newlines before
  fat commas should be kept.

- Fix git #44, fix exit status for assert-tidy/untidy.  The exit status was
  always 0 for --assert-tidy if the user had turned off all error messages with
  the -quiet flag.  This has been fixed.

- Add flag -maxfs=n, --maximum-file-size-mb=n.  This parameter is provided to
  avoid causing system problems by accidentally attempting to format an 
  extremely large data file. The default is n=10.  The command to increase 
  the limit to 20 MB for example would be  -maxfs=20.  This only applies to
  files specified by filename on the command line.

- Skip formatting if there are too many indentation level errors.  This is 
  controlled with -maxle=n, --maximum-level-errors=n.  This means that if 
  the ending indentation differs from the starting indentation by more than
  n levels, the file will be output verbatim. The default is n=1. 
  To skip this check, set n=-1 or set n to a large number.

- A related new flag, --maximum-unexpected-errors=n, or -maxue=n, is available
  but is off by default.

- Add flag -xci, --extended-continuation-indentation, regarding issue git #28
  This flag causes continuation indentation to "extend" deeper into structures.
  Since this is a fairly new flag, the default is -nxci to avoid disturbing 
  existing formatting.  BUT you will probably see some improved formatting
  in complex data structures by setting this flag if you currently use -ci=n 
  and -i=n with the same value of 'n' (as is the case if you use -pbp, 
  --perl-best-practices, where n=4).

- Fix issue git #42, clarify how --break-at-old-logical-breakpoints works.
  The man page was updated to note that it does not cause all logical breakpoints
  to be replicated in the output file.

- Fix issue git #41, typo in manual regarding -fsb.

- Fix issue git #40: when using the -bli option, a closing brace followed by 
  a semicolon was not being indented.  This applies to braces which require 
  semicolons, such as a 'do' block.

- Added 'state' as a keyword.

- A better test for convergence has been added. When iterations are requested,
  the new test will stop after the first pass if no changes in line break
  locations are made.  Previously, file checksums were used and required at least two 
  passes to verify convergence unless no formatting changes were made.  With the new test, 
  only a single pass is needed when formatting changes are limited to adjustments of 
  indentation and whitespace on the lines of code.  Extensive testing has been made to
  verify the correctness of the new convergence test.

- Line breaks are now automatically placed after 'use overload' to 
  improve formatting when there are numerous overloaded operators.  For
  example

    use overload
      '+' => sub {
      ...

- A number of minor problems with parsing signatures and prototypes have
  been corrected, particularly multi-line signatures. Some signatures 
  had previously been parsed as if they were prototypes, which meant the 
  normal spacing rules were not applied.  For example

  OLD:
    sub echo ($message= 'Hello World!' ) {
        ...;
    }

  NEW:
    sub echo ( $message = 'Hello World!' ) {
        ...;
    }

- Numerous minor issues that the average user would not encounter were found
  and fixed. They can be seen in the more complete list of updates at 

       https://github.com/perltidy/perltidy/blob/master/local-docs/BugLog.pod

2020 10 01

- Robustness of perltidy has been significantly improved.  Updating is recommended. Continual 
  automated testing runs began about 1 Sep 2020 and numerous issues have been found and fixed. 
  Many involve references to uninitialized variables when perltidy is fed random text and random
  control parameters. 

- Added the token '->' to the list of alignment tokens, as suggested in git
  #39, so that it can be vertically aligned if a space is placed before it with -wls='->'.

- Added parameters -bbhb=n (--break-before-hash-brace=n), -bbsb=n (--break-before-square-bracket=n),
  and -bbp=n (--break-before-paren=n) suggested in git #38.  These provide control over the
  opening container token of a multiple-line list.  Related new parameters -bbhbi=n, -bbsbi=n, -bbpi=n
  control indentation of these tokens.

- Added keyword 'isa'.

2020 09 07

- Fixed bug git #37, an error when the combination -scbb -csc was used.
  It occurs in perltidy versions 20200110, 20200619, and 20200822.  What happens is
  that when two consecutive lines with isolated closing braces had new side
  comments generated by the -csc parameter, a separating newline was missing.
  The resulting script will not then run, but worse, if it is reformatted with
  the same parameters then closing side comments could be overwritten and data
  lost. 

  This problem was found during automated random testing.  The parameter
  -scbb is rarely used, which is probably why this has not been reported.  Please
  upgrade your version.

- Added parameter --non-indenting-braces, or -nib, which prevents
  code from indenting one level if it follows an opening brace marked 
  with a special side comment, '#<<<'.  For example,

                { #<<<   a closure to contain lexical vars

                my $var;  # this line does not indent

                }

                # this line cannot 'see' $var;

  This is on by default.  If your code happens to have some
  opening braces followed by '#<<<', and you
  don't want this, you can use -nnib to deactivate it. 

- Side comment locations reset at a line ending in a level 0 open
  block, such as when a new multi-line sub begins.  This is intended to 
  help keep side comments from drifting too far to the right.

2020 08 22

- Fix RT #133166, encoding not set for -st.  Also reported as RT #133171
  and git #35. 

  This is a significant bug in version 20200619 which can corrupt data if
  perltidy is run as a filter on encoded text.

Please upgrade

- Fix issue RT #133161, perltidy -html was not working on pod

- Fix issue git #33, allow control of space after '->'

- Vertical alignment has been improved. Numerous minor issues have
  been fixed.

- Formatting with the -lp option is improved. 

- Fixed issue git #32, misparse of bare 'ref' in ternary

- When --assert-tidy is used and triggers an error, the first difference
  between input and output files is shown in the error output. This is
  a partial response to issue git #30.

2020 06 19

- Added support for Switch::Plain syntax, issue git #31.

- Fixed minor problem where trailing 'unless' clauses were not 
  getting vertically aligned.

- Added a parameter --logical-padding or -lop to allow logical padding
  to be turned off.  Requested by git #29. This flag is on by default.
  The man pages have examples.

- Added a parameter -kpit=n to control spaces inside of parens following
  certain keywords, requested in git#26. This flag is off by default.

- Added fix for git#25, improve vertical alignment for long lists with
  varying numbers of items per line.

- calls to the module Perl::Tidy can now capture any output produced
  by a debug flag or one of the 'tee' flags through the new 'debugfile' and
  'teefile' call parameters.  These output streams are rarely used but
  they are now treated the same as any 'logfile' stream.

- add option --break-at-old-semicolon-breakpoints, -bos, requested
  in RT#131644.  This flag will keep lines beginning with a semicolon.

- Added --use-unicode-gcstring to control use of Unicode::GCString for
  evaluating character widths of encoded data.  The default is 
  not to use this (--nouse-unicode-gcstring). If this flag is set,
  perltidy will look for Unicode::GCString and, if found, will use it 
  to evaluate character display widths.  This can improve displayed
  vertical alignment for files with wide characters.  It is a nice
  feature but it is off by default to avoid conflicting formatting
  when there are multiple developers.  Perltidy installation does not 
  require Unicode::GCString, so users wanting to use this feature need to
  set this flag and also install Unicode::GCString separately.

- Added --character-encoding=guess or -guess to have perltidy guess
  if a file (or other input stream) is encoded as utf8 or in some
  single-byte encoding. This is useful when processing a mixture 
  of file types, such as utf8 and latin-1.

  Please Note: The default encoding has been set to be 'guess'
  instead of 'none'. This seems like the best default, since 
  it allows perltidy to work properly with both
  utf8 files and older latin-1 files.  The guess mode uses Encode::Guess,
  which is included in standard perl distributions, and only tries to 
  guess if a file is utf8 or not, never any other encoding.  If the guess is 
  utf8, and if the file successfully decodes as utf8, then the encoding 
  is assumed to be utf8.  Otherwise, no encoding is assumed. 
  If you do not want to use this new default guess mode, or have a 
  problem with it, you can set --character-encoding=none (the previous 
  default) or --character-encoding=utf8 (if you deal with utf8 files).

- Specific encodings of input files other than utf8 may now be given, for
  example --character-encoding=euc-jp.

- Fix for git#22, Preserve function signature on a single line. An
  unwanted line break was being introduced when a closing signature paren
  followed a closing do brace.

- Fix RT#132059, the -dac parameter was not working and caused an error exit

- When -utf8 is used, any error output is encoded as utf8

- Fix for git#19, adjust line break around an 'xor'

- Fix for git#18, added warning for missing comma before unknown bare word.
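
The 'teefile' call parameter added in this release can be tried with a short script along the following lines. This is a sketch under stated assumptions: the sample source text and variable names are invented for the example, the --tee-pod flag is used only as one convenient way to produce tee output, and the call is guarded so that the script still runs where Perl::Tidy is not installed.

```perl
use strict;
use warnings;

# A tiny input script containing one pod section (illustrative only).
my $source = join "\n",
  'my $x=1;',
  '',
  '=head1 NAME',
  '',
  'demo - a line of pod for the tee stream',
  '',
  '=cut',
  '',
  'print $x;',
  '';

# In-memory targets for the formatted code, error text and tee output.
my ( $destination, $stderr, $teefile ) = ( '', '', '' );

if ( eval { require Perl::Tidy; 1 } ) {

    # -npro ignores any perltidyrc; --tee-pod copies pod to the tee stream.
    my $error_flag = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$destination,
        stderr      => \$stderr,
        teefile     => \$teefile,
        argv        => '-npro --tee-pod',
    );
    print "error flag: $error_flag\n";
    print "formatted code:\n$destination\n";
    print "tee stream:\n$teefile\n";
}
else {
    print "Perl::Tidy is not installed; nothing to demonstrate\n";
}
```

The 'debugfile' parameter is passed in the same way as 'teefile' and receives output only when a debug flag is set.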

2020 01 10

- This release adds a flag to control the feature RT#130394 (allow short nested blocks)
  introduced in the previous release.  Unfortunately that feature breaks 
  RPerl installations, so a control flag has been introduced and that feature is now
  off by default.  The flag is:

  --one-line-block-nesting=n, or -olbn=n, where n is an integer as follows: 

  -olbn=0 break nested one-line blocks into multiple lines [new DEFAULT]
  -olbn=1 stable; keep existing nested one-line blocks intact [previous DEFAULT]

  For example, consider this input line:

    foreach (@list) { if ($_ eq $asked_for) { last } ++$found }

  The new default behavior (-olbn=0), and behavior prior to version 20191203, is to break it into multiple lines:

    foreach (@list) {
        if ( $_ eq $asked_for ) { last }
        ++$found;
    }

  To keep nested one-line blocks such as this on a single line you can add the parameter -olbn=1.

- Fixed issue RT#131288: parse error for un-prototyped constant function without parenthesized
  call parameters followed by ternary.

- Fixed issue RT#131360, installation documentation.  Added a note that the binary 
  'perltidy' comes with the Perl::Tidy module. They can both normally be installed with 
  'cpanm Perl::Tidy'

2019 12 03

- Fixed issue RT#131115: -bli option not working correctly.
  Closing braces were not indented in some cases due to a glitch
  introduced in version 20181120.

- Fixed issue RT#130394: Allow short nested blocks.  Given the following

    $factorial = sub { reduce { $a * $b } 1 .. 11 };

  Previous versions would always break the sub block because it
  contains another block (the reduce block).  The fix keeps
  short one-line blocks such as this intact.

- Implement issue RT#130640: Allow different subroutine keywords.
  Added a flag --sub-alias-list=s or -sal=s, where s is a string with
  one or more aliases for 'sub', separated by spaces or commas.
  For example,

    perltidy -sal='method fun' 

  will cause perltidy to treat the words 'method' and 'fun' the same as
  if they were 'sub'.

- Added flag --space-prototype-paren=i, or -spp=i, to control spacing 
  before the opening paren of a prototype, where i=0, 1, or 2:
  i=0 no space
  i=1 follow input [current and default]
  i=2 always space

  Previously, perltidy always followed the input.
  For example, given the following input 

     sub usage();

  The result will be:
    sub usage();    # i=0 [no space]
    sub usage();    # i=1 [default; follows input]
    sub usage ();   # i=2 [space]

- Fixed issue git#16, minor vertical alignment issue.

- Fixed issue git#10, minor conflict of -wn and -ce

- Improved some vertical alignments involving two lines.

2019 09 15

- fixed issue RT#130344: false warning "operator in print statement" 
  for "use lib". 

- fixed issue RT#130304: standard error output should include filename.
  When perltidy error messages are directed to the standard error output 
  with -se or --standard-error-output, the message lines now have a prefix 
  'filename:' for clarification in case multiple files 
  are processed, where 'filename' is the name of the input file.  If 
  input is from the standard input the displayed filename is '<stdin>', 
  and if it is from a data structure then displayed filename 
  is '<source_stream>'.

- implement issue RT#130425: check mode.  A new flag '--assert-tidy'
  will cause an error message if the output script is not identical to
  the input script. For completeness, the opposite flag '--assert-untidy'
  has also been added.  The next item, RT#130297, ensures that the script
  will exit with a non-zero exit flag if the assertion fails.

- fixed issue RT#130297; the perltidy script now exits with a nonzero exit 
  status if it wrote to the standard error output. Previously only fatal
  run errors produced a non-zero exit flag. Now, even non-fatal messages
  requested with the -w flag will cause a non-zero exit flag.  The exit
  flag now has these values:

     0 = no errors
     1 = perltidy could not run to completion due to errors
     2 = perltidy ran to completion with error messages

- added warning message for RT#130008, which warns of conflicting input
  parameters -iob and -bom or -boc.

- fixed RT#129850; concerning a space between a closing block brace and
  opening bracket or brace, as occurs before the '[' in this line:

   my @addunix = map { File::Spec::Unix->catfile( @ROOT, @$_ ) } ['b'];

  Formerly, any space was removed. Now it is optional, and the output will
  follow the input.

- fixed issue git#13, needless trailing whitespace in error message

- fixed issue git#9: if the -ce (--cuddled-else) flag is used,
  do not try to form new one line blocks for a block type 
  specified with -cbl, particularly map, sort, grep

- iteration speedup for unchanged code.  Previously, when iterations were
  requested, at least two formatting passes were made. Now just a single pass
  is made if the formatted code is identical to the input code.

- some improved vertical alignments
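
The exit flag values listed for RT#130297 in this release can be interpreted from a calling Perl script roughly as follows. This is a minimal sketch: the function name describe_perltidy_status and the file name in the usage comment are hypothetical, and the function only decodes a wait status of the kind that system() leaves in $?.

```perl
use strict;
use warnings;

# Translate a wait status (the value of $? after system()) into a
# description based on the three exit flag values documented above.
sub describe_perltidy_status {
    my ($wait_status) = @_;
    return 'perltidy could not be started'    if $wait_status == -1;
    return 'perltidy was stopped by a signal' if $wait_status & 127;

    my $exit_flag = $wait_status >> 8;
    my %meaning   = (
        0 => 'no errors',
        1 => 'perltidy could not run to completion due to errors',
        2 => 'perltidy ran to completion with error messages',
    );
    return $meaning{$exit_flag} // "unexpected exit flag $exit_flag";
}

# Hypothetical usage; 'some_script.pl' is a placeholder file name:
#   system( 'perltidy', '--assert-tidy', 'some_script.pl' );
#   print describe_perltidy_status($?), "\n";

# Decode a simulated wait status corresponding to exit flag 2.
print describe_perltidy_status( 2 << 8 ), "\n";
```

A wrapper like this is mainly useful in a check mode built on --assert-tidy, where a failed assertion is reported through a non-zero exit flag.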

2019 06 01

- rt #128477: Prevent inconsistent owner/group and setuid/setgid bits. 
  In the -b (--backup-and-modify-in-place) mode, an attempt is made to set ownership
  of the output file equal to the input file, if they differ.
  In all cases, if the final output file ownership differs from input file, any setuid/setgid bits are cleared.

- Added option -bom  (--break-at-old-method-breakpoints) by
  merrillymeredith which preserves breakpoints of method chains. Modified to also handle a cuddled call style.

- Merged patch to fix Windows EOL translation error with UTF-8 written by
  Ron Ivy. This update prevents automatic conversion to 'DOS' CRLF line
  endings.  Also, Windows system testing at the appveyor site is working again.

- RT #128280, added flag --one-line-block-semicolons=n (-olbs=n) 
  to control semicolons in one-line blocks.  The values of n are:
    n=0 means no semicolons terminating simple one-line blocks
    n=1 means stable; do not change from input file [DEFAULT and current]
    n=2 means always add semicolons in one-line blocks
  The current behavior corresponds to the default n=1.

- RT #128216, Minor update to prevent inserting unwanted blank line at
  indentation level change.  This should not change existing scripts.

- RT #81852: Improved indentation when quoted word (qw) lists are 
  nested within other containers using the --weld-nested (-wn) flag.
  The example given previously (below) is now closer to what it would
  be with a simple list instead of qw:

  # perltidy -wn
  use_all_ok( qw{
      PPI
      PPI::Tokenizer
      PPI::Lexer
      PPI::Dumper
      PPI::Find
      PPI::Normal
      PPI::Util
      PPI::Cache
  } );

- RT#12764, introduced new feature allowing placement of blanks around
  sequences of selected keywords. This can be activated with the -kgb* 
  series of parameters described in the manual.

- Rewrote vertical alignment module.  It is better at finding
  patterns in complex code. For example,

OLD:
       /^-std$/ && do { $std       = 1;     next; };
       /^--$/   && do { @link_args = @argv; last; };
       /^-I(.*)/ && do { $path = $1 || shift @argv; next; };

NEW:
       /^-std$/  && do { $std       = 1;                 next; };
       /^--$/    && do { @link_args = @argv;             last; };
       /^-I(.*)/ && do { $path      = $1 || shift @argv; next; };

- Add repository URLs to META files 

- RT #118553, "leave only one newline at end of file". This option was not 
  added because of undesirable side effects, but a new filter script
  was added which can do this, "examples/delete_ending_blank_lines.pl".

2018 11 20

- fix RT#127736 Perl-Tidy-20181119 has the EXE_FILES entry commented out in
  Makefile.PL so it doesn't install the perltidy script or its manpage.

2018 11 19

- Removed test case 'filter_example.t' which was causing a failure on a
  Windows installation for unknown reasons, possibly due to an unexpected
  perltidyrc being read by the test script.  Added VERSION numbers to all
  new modules.

2018 11 17

- Fixed RT #126965, in which a ternary operator was misparsed if immediately
  following a function call without arguments, such as: 
    my $restrict_customer = shift ? 1 : 0;

- Fixed RT #125012: bug in -mangle --delete-all-comments
  A needed blank space before bareword tokens was being removed when comments 
  were deleted

- Fixed RT #81852: Stacked containers and quoting operators. Quoted words
  (qw) delimited by container tokens ('{', '[', '(', '<') are now included in
  the --weld-nested (-wn) flag:

      # perltidy -wn
      use_all_ok( qw{
            PPI
            PPI::Tokenizer
            PPI::Lexer
            PPI::Dumper
            PPI::Find
            PPI::Normal
            PPI::Util
            PPI::Cache
            } );

- The cuddled-else (-ce) coding was merged with the new cuddled-block (-cb)
  coding.  The change is backward compatible and simplifies input.  
  The --cuddled-block-option=n (-cbo=n) flag now applies to both -ce and -cb 
  formatting.  In fact the -cb flag is just an alias for -ce now.

- Fixed RT #124594, license text desc. changed from 'GPL-2.0+' to 'gpl_2'

- Fixed bug in which a warning about a possible code bug was issued in a
  script with brace errors. 

- added option --notimestamp or -nts to eliminate any time stamps in output 
  files.  This is used to prevent differences in test scripts from causing
  failure at installation. For example, the -cscw option will put a date
  stamp on certain closing side comments. We need to avoid this in order
  to test this feature in an installation test.

- Fixed bug with the entab option, -et=8, in which the leading space of
  some lines was not entabbed.  This happened in code which was adjusted
  for vertical alignment and in hanging side comments. Thanks to Glenn.

- Fixed RT #127633, undesirable line break after return when -baao flag is set

- Fixed RT #127035, vertical alignment. Vertical alignment has been improved 
  in several ways.  Thanks especially to Michael Wardman and Glenn for sending 
  helpful snippets. 

  - Alignment of the =~ operators has been reactivated.  

      OLD:
      $service_profile =~ s/^\s+|\s+$//g;
      $host_profile =~ s/^\s+|\s+$//g;

      NEW:
      $service_profile =~ s/^\s+|\s+$//g;
      $host_profile    =~ s/^\s+|\s+$//g;

  - Alignment of the // operator has been reactivated.  

      OLD:
      is( pop // 7,       7, 'pop // ... works' );
      is( pop() // 7,     0, 'pop() // ... works' );
      is( pop @ARGV // 7, 3, 'pop @array // ... works' );

      NEW:
      is( pop       // 7, 7, 'pop // ... works' );
      is( pop()     // 7, 0, 'pop() // ... works' );
      is( pop @ARGV // 7, 3, 'pop @array // ... works' );

  - The rules for alignment of just two lines have been adjusted,
    hopefully to be a little better overall.  In some cases, two 
    lines which were previously unaligned are now aligned, and vice-versa.

      OLD:
      $expect = "1$expect" if $expect =~ /^e/i;
      $p = "1$p" if defined $p and $p =~ /^e/i;

      NEW:
      $expect = "1$expect" if $expect =~ /^e/i;
      $p      = "1$p"      if defined $p and $p =~ /^e/i;


- RT #106493; source code repository location has been added to docs; it is 
     https://github.com/perltidy/perltidy

- The packaging for this version has changed. The Tidy.pm module is much 
  smaller.  Supporting modules have been split out from it and placed below 
  it in the path Perl/Tidy/*.

- A number of new installation test cases have been added. Updates are now
  continuously tested at Travis CI against versions back to Perl 5.08.

2018 02 20

- RT #124469, #124494, perltidy often making empty files.  The previous version had
  an index error causing it to fail, particularly in version 5.18 of Perl.

  Please avoid version 20180219.

2018 02 19

- RT #79947, cuddled-else generalization. A new flag -cb provides
  'cuddled-else' type formatting for an arbitrary type of block chain. The
  default is try-catch-finally, but this can be modified with the 
  parameter -cbl. 

- Fixed RT #124298: add space after ! operator without breaking !! secret 
  operator

- RT #123749: numerous minor improvements to the -wn flag were made.  

- Fixed a problem with convergence tests in which iterations were stopping 
  prematurely. 

- Here doc targets for <<~ type here-docs may now have leading whitespace.

- Fixed RT #124354. The '-indent-only' flag was not working correctly in the 
  previous release. A bug in version 20180101 caused extra blank lines 
  to be output.

- Issue RT #124114. Some improvements were made in vertical alignment
  involving 'fat commas'.

2018 01 01

- Added new flag -wn (--weld-nested-containers) which addresses these issues:
  RT #123749: Problem with promises; 
  RT #119970: opening token stacking strange behavior;
  RT #81853: Can't stack block braces

  This option causes closely nested pairs of opening and closing containers
  to be "welded" together and essentially be formatted as a single unit,
  with just one level of indentation.

  Since this is a new flag it is set to be "off" by default but it has given 
  excellent results in testing. 

  EXAMPLE 1, multiple blocks, default formatting:
      do {
          {
              next if $x == $y;    # do something here
          }
      } until $x++ > $z;

  perltidy -wn
      do { {
          next if $x == $y;
      } } until $x++ > $z;

   EXAMPLE 2, three levels of wrapped function calls, default formatting:
          p(
              em(
                  conjug(
                      translate( param('verb') ), param('tense'),
                      param('person')
                  )
              )
          );

      # perltidy -wn
          p( em( conjug(
              translate( param('verb') ),
              param('tense'), param('person')
          ) ) );

      # EXAMPLE 3, chained method calls, default formatting:
      get('http://mojolicious.org')->then(
          sub {
              my $mojo = shift;
              say $mojo->res->code;
              return get('http://metacpan.org');
          }
      )->then(
          sub {
              my $cpan = shift;
              say $cpan->res->code;
          }
      )->catch(
          sub {
              my $err = shift;
              warn "Something went wrong: $err";
          }
      )->wait;

      # perltidy -wn
      get('http://mojolicious.org')->then( sub {
          my $mojo = shift;
          say $mojo->res->code;
          return get('http://metacpan.org');
      } )->then( sub {
          my $cpan = shift;
          say $cpan->res->code;
      } )->catch( sub {
          my $err = shift;
          warn "Something went wrong: $err";
      } )->wait;


- Fixed RT #114359: Misparsing of "print $x ** 0.5;"

- Deactivated the --check-syntax flag for better security.  It will be
  ignored if set.  

- Corrected minimum perl version from 5.004 to 5.008 based on perlver
  report.  The change is required for coding involving wide characters.

- For certain severe errors, the source file will be copied directly to the
  output without formatting. These include ending in a quote, ending in a
  here doc, and encountering an unidentified character.

2017 12 14

- RT #123749, partial fix.  "Continuation indentation" is removed from lines 
  with leading closing parens which are part of a call chain. 
  For example, the call to pack() is now outdented to the starting
  indentation in the following expression:

      # OLD
      $mw->Button(
          -text    => "New Document",
          -command => \&new_document
        )->pack(
          -side   => 'bottom',
          -anchor => 'e'
        );

      # NEW
      $mw->Button(
          -text    => "New Document",
          -command => \&new_document
      )->pack(
          -side   => 'bottom',
          -anchor => 'e'
      );

  This modification improves readability of complex expressions, especially
  when the user uses the same value for continuation indentation (-ci=n) and 
  normal indentation (-i=n).  Perltidy was already programmed to
  do this but a minor bug was preventing it.

- RT #123774, added flag to control space between a backslash and a single or
  double quote, requested by Robert Rothenberg.  The issue is that lines like

     $str1=\"string1";
     $str2=\'string2';

  confuse syntax highlighters unless a space is left between the backslash and
  the quote.

  The new flag to control this is -sbq=n (--space-backslash-quote=n), 
  where n=0 means no space, n=1 means follow existing code, n=2 means always
  space.  The default is n=1, meaning that a space will be retained if there
  is one in the source code.
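
  As an illustration of the two spellings, the following minimal sketch shows
  that the space changes only the appearance of the code and not its meaning.
  The variable names are placeholders chosen for this example:

  ```perl
  use strict;
  use warnings;

  # Both statements create a reference to a string constant.
  my $tight = \"string1";     # no space, as written with -sbq=0
  my $loose = \ "string1";    # one space, as written with -sbq=2

  print ref($tight), " ", ${$tight}, "\n";
  print ref($loose), " ", ${$loose}, "\n";
  ```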

- Fixed RT #123492, support added for indented here doc operator <<~ added 
  in v5.26.  Thanks to Chris Weyl for the report.
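
  For readers who have not met the operator, here is a small sketch of the
  syntax that is now recognized.  The sub name and the text are invented for
  this example, and perl 5.26 or later is assumed:

  ```perl
  use v5.26;    # the <<~ operator requires perl 5.26 or later
  use warnings;

  sub usage_text {

      # The common leading whitespace is removed from each line of the
      # body, and the terminator itself may be indented.
      return <<~EOT;
          Usage: example [options]
            -h  show help
          EOT
  }

  print usage_text();
  ```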

- Fixed docs; --closing-side-comment-list-string should have been just
  --closing-side-comment-list.  Thanks to F.Li.

- Added patch for RT #122030: Perl::Tidy sometimes does not call binmode.
  Thanks to Irilis Aelae.

- Fixed RT #121959, PERLTIDY doesn't honor the 'three dot' notation for 
  locating a config file using environment variables.  Thanks to John 
  Wittkowski.

- Minor improvements to formatting, in which some additional vertical
  alignment is done. Thanks to Keith Neargarder.

- RT #119588.  Vertical alignment is no longer done for // operator.

2017 05 21

- Fixed debian #862667: failure to check for perltidy.ERR deletion can lead 
  to overwriting arbitrary files by symlink attack. Perltidy was continuing
  to write files after an unlink failure.  Thanks to Don Armstrong 
  for a patch.

- Fixed RT #116344, perltidy fails on certain anonymous hash references:
  in the following code snippet the '?' was misparsed as a pattern 
  delimiter rather than a ternary operator.
      return ref {} ? 1 : 0;

- Fixed RT #113792: misparsing of a fat comma (=>) right after 
  the __END__ or __DATA__ tokens.  These keywords were getting
  incorrectly quoted by the following => operator.

- Fixed RT #118558. Custom Getopt::Long configuration breaks parsing 
  of perltidyrc.  Perltidy was resetting the user's configuration too soon.

- Fixed RT #119140, failure to parse double diamond operator.  Code to
  handle this new operator has been added.

- Fixed RT #120968.  Fixed problem where -enc=utf8 didn't work 
  with --backup-and-modify-in-place. Thanks to Heinz Knutzen for this patch.

- Fixed minor formatting issue where one-line blocks for subs with signatures 
  were unnecessarily broken.

- RT #32905, patch to fix utf-8 error when output was STDOUT. 

- RT #79947, improved spacing of try/catch/finally blocks. Thanks to qsimpleq
  for a patch.

- Fixed #114909, Anonymous subs with signatures and prototypes misparsed as
  broken ternaries, in which a statement such as this was not being parsed
  correctly:
      return sub ( $fh, $out ) : prototype(*$) { ... }

- Implemented RT #113689, option to introduce blank lines after an opening
  block brace and before a closing block brace. Four new optional controls
  are added. The first two define the minimum number of blank lines to be
  inserted:

   -blao=i or --blank-lines-after-opening-block=i
   -blbc=i or --blank-lines-before-closing-block=i

  where i is an integer, the number of lines (the default is 0).  

  The second two define the types of blocks to which the first two apply 

   -blaol=s or --blank-lines-after-opening-block-list=s
   -blbcl=s or --blank-lines-before-closing-block-list=s

  where s is a string of possible block keywords (default is just 'sub',
  meaning a named subroutine).

  For more information please see the documentation.

- The method for specifying block types for certain input parameters has
  been generalized to distinguish between normal named subroutines and
  anonymous subs.  The keyword for normal subroutines remains 'sub', and
  the new keyword for anonymous subs is 'asub'. 

- Minor documentation changes. The BUGS sections now have a link
  to CPAN where most open bugs and issues can be reviewed and bug reports
  can be submitted.  The AUTHOR and CREDITS sections have been removed
  from the man pages to streamline the documentation. This information
  is still in the source code.

2016 03 02

- RT #112534. Corrected a minor problem in which an unwanted newline
  was placed before the closing brace of an anonymous sub with 
  a signature, if it was in a list.  Thanks to Dmytro Zagashev.

- Corrected a minor problem in which occasional extra indentation was
  given to the closing brace of an anonymous sub in a list when the -lp 
  parameter was set.

2016 03 01

 - RT #104427. Added support for signatures.

 - RT #111512.  Changed global warning flag $^W = 1 to use warnings;
   Thanks to Dmytro Zagashev.

 - RT #110297, added support for new regexp modifier /n
   Thanks to Dmytro Zagashev.

 - RT #111519.  The -io (--indent-only) and -dac (--delete-all-comments)
   can now both be used in one pass. Thanks to Dmitry Veltishev.

 - Patch to avoid error message with 'catch' used by TryCatch, as in
      catch($err){
         # do something
      }
   Thanks to Nick Tonkin.

 - RT #32905, UTF-8 coding is now more robust. Thanks to qsimpleq
   and Dmytro for patches.

 - RT #106885. Added string bitwise operators ^. &. |. ~. ^.= &.= |.=
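
   A short sketch of what these operators do may be helpful.  It assumes
   perl 5.22 or later with the 'bitwise' feature enabled, and the helper
   names are invented for this example:

   ```perl
   use v5.22;
   use warnings;
   use feature 'bitwise';
   no warnings 'experimental::bitwise';

   # |. treats both operands as strings; OR-ing ASCII letters with
   # spaces (0x20) sets the lowercase bit.
   sub lower_ascii {
       my ($text) = @_;
       return $text |. ( ' ' x length $text );
   }

   # ^.= is the assigning form of string XOR; XOR with 0x20 flips the
   # case of ASCII letters.
   sub flip_case_ascii {
       my ($text) = @_;
       $text ^.= ' ' x length $text;
       return $text;
   }

   say lower_ascii('AB');
   say flip_case_ascii('ab');
   ```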

 - Fixed RT #107832 and #106492, lack of vertical alignment of two lines
   when -boc flag (break at old commas) is set.  This bug was 
   inadvertently introduced in previous bug fix RT #98902. 

 - Some common extensions to Perl syntax are handled better.
   In particular, the following snippet is now formatted cleanly:

     method deposit( Num $amount) {
         $self->balance( $self->balance + $amount );
     }

   A new flag -xs (--extended-syntax) was added to enable this, and the default
   is to use -xs. 

   In previous versions, and now only when -nxs is set, this snippet of code
   generates the following error message:

   "syntax error at ') {', didn't see one of: case elsif for foreach given if switch unless until when while"

2015 08 15

 - Fixed RT# 105484, Invalid warning about 'else' in 'switch' statement.  The
   warning happened if a 'case' statement did not use parens.

 - Fixed RT# 101547, misparse of // caused error message.  Also..

 - Fixed RT# 102371, misparse of // caused unwanted space in //=

 - Fixed RT# 100871, "silent failure of HTML Output on Windows". 
   Changed calls to tempfile() from:
     my ( $fh_tmp, $tmpfile ) = tempfile();
   to have the full path name:
     my ( $fh_tmp, $tmpfile ) = File::Temp::tempfile();
   because of problems in the Windows version reported by Dean Pearce.

 - Fixed RT# 99514, calling the perltidy module multiple times with 
   a .perltidyrc file containing the parameter --output-line-ending 
   caused a crash.  This was a glitch in the memoization logic. 

 - Fixed RT#99961, multiple lines inside a cast block caused unwanted
   continuation indentation.  

 - RT# 32905, broken handling of UTF-8 strings. 
   A new flag -utf8 causes perltidy to assume UTF-8 encoding for input and
   output of an io stream.  Thanks to Sebastian Podjasek for a patch.  
   This feature may not work correctly in older versions of Perl. 
   It worked in a linux version 5.10.1 but not in a Windows version 5.8.3 (but
   otherwise perltidy ran correctly).

 - Warning files now report perltidy VERSION. Suggested by John Karr.

 - Fixed long flag --nostack-closing-tokens (-nsct has always worked though). 
   This was due to a typo.  This also fixed --nostack-opening-tokens to 
   behave correctly.  Thanks to Rob Dixon.

2014 07 11

- Fixed RT #94902: abbreviation parsing in .perltidyrc files was not
  working for multi-line abbreviations.  Thanks to Eric Fung for 
  supplying a patch. 

- Fixed RT #95708, misparsing of a hash when the first key was a perl
  keyword, causing a semicolon to be incorrectly added.

- Fixed RT #94338 for-loop in a parenthesized block-map.  A code block within
  parentheses of a map, sort, or grep function was being mistokenized.  In 
  rare cases this could produce an incorrect error message.  The fix will
  produce some minor formatting changes.  Thanks to Daniel Trizen for
  discovering and documenting this.

- Fixed RT #94354, excess indentation for stacked tokens.  Thanks to 
  Colin Williams for supplying a patch.

- Added support for experimental postfix dereferencing notation introduced in
  perl 5.20. RT #96021.
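
  The notation in question looks like the following sketch.  It assumes
  perl 5.24 or later, where the syntax is available without a feature
  pragma; the data is made up for this example:

  ```perl
  use v5.24;
  use warnings;

  my $aref = [ 10, 20, 30 ];
  my $href = { one => 1, two => 2 };

  my @items      = $aref->@*;              # same as @{$aref}
  my @names      = sort keys $href->%*;    # same as keys %{$href}
  my $last_index = $aref->$#*;             # same as $#{$aref}
  my @slice      = $aref->@[ 0, 1 ];       # same as @{$aref}[ 0, 1 ]

  say "@items | @names | $last_index | @slice";
  ```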

- Updated documentation to clarify the behavior of the -io flag
  in response to RT #95709.  You can add -noll or -l=0 to prevent 
  long comments from being outdented when -io is used.

- Added a check to prevent a problem reported in RT #81866, where large
  scripts which had been compressed to a single line could not be formatted
  because of a check for VERSION for MakeMaker. The workaround was to 
  use -nvpl, but this shouldn't be necessary now.

- Fixed RT #96101; Closing brace of anonymous sub in a list was being
  indented.  For example, the closing brace of the anonymous sub below 
  will now be lined up with the word 'callback'.  This problem 
  occurred if there was no comma after the closing brace of the anonymous sub.
  This update may cause minor changes to formatting of code with lists 
  of anonymous subs, especially TK code.

  # OLD
  my @menu_items = (

      #...
      {
          path     => '/_Operate/Transcode and split',
          callback => sub {
              return 1 if not $self->project_opened;
              $self->comp('project')->transcode( split => 1 );
            }
      }
  );

  # NEW
  my @menu_items = (

      #...
      {
          path     => '/_Operate/Transcode and split',
          callback => sub {
              return 1 if not $self->project_opened;
              $self->comp('project')->transcode( split => 1 );
          }
      }
  );

2014 03 28

- Fixed RT #94190 and debian Bug #742004: perltidy.LOG file left behind.
  Thanks to George Hartzell for debugging this.  The problem was
  caused by the memoization speedup patch in version 20121207.  An
  unwanted flag was being set which caused a LOG to be written if 
  perltidy was called multiple times.

- New default behavior for LOG files: If the source is from an array or 
  string (through a call to the perltidy module) then a LOG output is only
  possible if a logfile stream is specified.  This is to prevent 
  unexpected perltidy.LOG files. 

- Fixed debian Bug #740670, insecure temporary file usage.  File::Temp is now
  used to get a temporary file.  Thanks to Don Anderson for a patch.

- Any -b (--backup-and-modify-in-place) flag is silently ignored when a 
  source stream, destination stream, or standard output is used.  
  This is because the -b flag may have been in a .perltidyrc file and 
  warnings break Test::NoWarnings.  Thanks to Marijn Brand.

2013 09 22

- Fixed RT #88020. --converge was not working with wide characters.

- Fixed RT #78156. package NAMESPACE VERSION syntax not accepted.

- First attempt to fix RT #88588.  INDEX END tag change in pod2html breaks 
  perltidy -html. I put in a patch which should work but I don't yet have
  a way of testing it.

2013 08 06

- Fixed RT #87107, spelling

2013 08 05

- Fixed RT #87502, incorrect parsing of smartmatch before hash brace

- Added feature request RT #87330, trim whitespace after POD.
  The flag -trp (--trim-pod) will trim trailing whitespace from lines of POD

2013 07 17

- Fixed RT #86929, #86930, missing lhs of assignment.

- Fixed RT #84922, moved pod from Tidy.pm into Tidy.pod

2012 12 07

- The flag -cab=n or --comma-arrow-breakpoints=n has been generalized
  to give better control over breaking open short containers.  The
  possible values are now:

    n=0 break at all commas after =>  
    n=1 stable: break at all commas after => if container is open,
        EXCEPT FOR one-line containers
    n=2 break at all commas after =>, BUT try to form the maximum
        one-line container lengths
    n=3 do not treat commas after => specially at all 
    n=4 break everything: like n=0 but also break a short container with
        a => not followed by a comma
    n=5 stable: like n=1 but ALSO break at open one-line containers (default)

  New values n=4 and n=5 have been added to allow short blocks to be
  broken open.  The new default is n=5, stable.  It should more closely
  follow the breaks in the input file, and previously formatted code
  should remain unchanged.  If this causes problems use -cab=1 to recover 
  the former behavior.  Thanks to Tony Maszeroski for the suggestion.

  To illustrate the need for the new options, if perltidy is given
  the following code, then the old default (-cab=1) was to close up 
  the 'index' container even if it was open in the source.  The new 
  default (-cab=5) will keep it open if it was open in the source.

   our $fancypkg = {
       'ALL' => {
           'index' => {
               'key' => 'value',
           },
           'alpine' => {
               'one'   => '+',
               'two'   => '+',
               'three' => '+',
           },
       }
   };

- New debug flag --memoize (-mem).  This version contains a 
  patch supplied by Jonathan Swartz which can significantly speed up
  repeated calls to Perl::Tidy::perltidy in a single process by caching
  the result of parsing the formatting parameters.  A factor of up to 10
  speedup was achieved for masontidy (https://metacpan.org/module/masontidy).
  The memoization patch is on by default but can be deactivated for 
  testing with -nmem (or --no-memoize).

- New flag -tso (--tight-secret-operators) causes certain perl operator
  sequences (secret operators) to be formatted "tightly" (without spaces).  
  The most common of these are 0 +  and + 0 which become 0+ and +0.  The
  operators currently modified by this flag are: 
       =( )=  0+  +0  ()x!! ~~<>  ,=>
  Suggested by Philippe Bruhat. See https://metacpan.org/module/perlsecret
  This flag is off by default.
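
  To show that the spacing is purely cosmetic, here is a small sketch using
  two of the listed sequences, each written once loosely and once tightly.
  The variable names and data are invented for this example:

  ```perl
  use strict;
  use warnings;

  my $str = 'a,b,c';

  # =()= forces list context on the match and then counts the results.
  my $loose_count = () = $str =~ /,/g;    # loosely spaced form
  my $tight_count =()= $str =~ /,/g;      # tight form, as with -tso

  # 0+ numifies its operand.
  my $loose_num = 0 + '42';               # loosely spaced form
  my $tight_num = 0+ '42';                # tight form, as with -tso

  print "$loose_count $tight_count $loose_num $tight_num\n";
  ```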

- New flag -vmll (--variable-maximum-line-length) makes the maximum
  line length increase with the nesting depth of a line of code.  
  Basically, it causes the length of leading whitespace to be ignored when
  setting line breaks, so the formatting of a block of code is independent
  of its nesting depth.  Try this option if you have deeply nested 
  code or data structures, perhaps in conjunction with the -wc flag
  described next.  The default is not to do this.

- New flag -wc=n (--whitespace-cycle=n) also addresses problems with
  very deeply nested code and data structures.  When this parameter is
  used and the nesting depth exceeds the value n, the leading whitespace 
  will be reduced and start at 1 again.  The result is that deeply
  nested blocks of code will shift back to the left. This occurs cyclically 
  to any nesting depth.  This flag may be used either with or without -vmll.
  The default is not to use this (-wc=0).
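
  The arithmetic behind -vmll and -wc can be sketched as follows.  This is
  only a simplified reading of the two descriptions above, not perltidy's
  actual code; both helper names are hypothetical, and the defaults -l=80
  and -i=4 are assumed:

  ```perl
  use strict;
  use warnings;

  # Hypothetical helper: indentation level actually used for leading
  # whitespace when the whitespace cycle is $cycle (0 means -wc is off).
  sub cycled_level {
      my ( $depth, $cycle ) = @_;
      return $depth if $cycle <= 0 || $depth <= $cycle;
      return ( $depth - 1 ) % $cycle + 1;
  }

  # Hypothetical helper: line length limit at a given depth.  Without
  # -vmll the limit is fixed; with -vmll leading whitespace is not counted.
  sub line_length_limit {
      my ( $depth, %opt ) = @_;
      my $limit  = $opt{l} // 80;    # -l=n, maximum line length
      my $indent = $opt{i} // 4;     # -i=n, columns per indentation level
      return $limit unless $opt{vmll};
      return $limit + $indent * cycled_level( $depth, $opt{wc} // 0 );
  }

  printf "depth %2d: level %d, limit %d\n", $_, cycled_level( $_, 4 ),
    line_length_limit( $_, vmll => 1, wc => 4 )
    for 0 .. 9;
  ```

  With a cycle of 4, depths 1 through 4 keep their own level and depth 5
  starts again at level 1.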

- Fixed RT #78764, error parsing smartmatch operator followed by anonymous
  hash or array and then a ternary operator; two examples:

   qr/3/ ~~ ['1234'] ? 1 : 0;
   map { $_ ~~ [ '0', '1' ] ? 'x' : 'o' } @a;

- Fixed problem with specifying spaces around arrows using -wls='->'
  and -wrs='->'.  Thanks to Alain Valleton for documenting this problem. 

- Implemented RT #53183, wishlist, lines of code with the same indentation
  level which are contained with multiple stacked opening and closing tokens
  (requested with flags -sot -sct) now have reduced indentation.  

   # Default
   $sender->MailMsg(
       {
           to      => $addr,
           subject => $subject,
           msg     => $body
       }
   );

   # OLD: perltidy -sot -sct 
   $sender->MailMsg( {
           to      => $addr,
           subject => $subject,
           msg     => $body
   } );

   # NEW: perltidy -sot -sct 
   $sender->MailMsg( {
       to      => $addr,
       subject => $subject,
       msg     => $body
   } );

- New flag -act=n (--all-containers-tightness=n) is an abbreviation for
  -pt=n -sbt=n -bt=n -bbt=n, where n=0,1, or 2.  It simplifies input when all
  containers have the same tightness. Using the same example:

   # NEW: perltidy -sot -sct -act=2
   $sender->MailMsg({
       to      => $addr,
       subject => $subject,
       msg     => $body
   });

- New flag -sac (--stack-all-containers) is an abbreviation for -sot -sct
  This is part of wishlist item RT #53183. Using the same example again:

   # NEW: perltidy -sac -act=2
   $sender->MailMsg({
       to      => $addr,
       subject => $subject,
       msg     => $body
   });

- New flag -scbb (--stack-closing-block-brace) causes isolated closing
  block braces to stack as in the following example. (Wishlist item RT#73788)

   DEFAULT:
   for $w1 (@w1) {
       for $w2 (@w2) {
           for $w3 (@w3) {
               for $w4 (@w4) {
                   push( @lines, "$w1 $w2 $w3 $w4\n" );
               }
           }
       }
   }

   perltidy -scbb:
   for $w1 (@w1) {
       for $w2 (@w2) {
           for $w3 (@w3) {
               for $w4 (@w4) {
                   push( @lines, "$w1 $w2 $w3 $w4\n" );
               } } } }

  There is, at present, no flag to place these closing braces at the end
  of the previous line. It seems difficult to develop good rules for 
  doing this for a wide variety of code and data structures.

- Parameters defining block types may use a wildcard '*' to indicate
  all block types.  Previously it was not possible to include bare blocks.

- A flag -sobb (--stack-opening-block-brace) has been introduced as an
  alias for -bbvt=2 -bbvtl='*'.  So for example the following test code:

  {{{{{{{ $testing }}}}}}}

  cannot be formatted as above but can at least be kept vertically compact 
  using perltidy -sobb -scbb

  {   {   {   {   {   {   {   $testing
                          } } } } } } }

  Or even, perltidy -sobb -scbb -i=1 -bbt=2
  {{{{{{{$testing
        }}}}}}}


- Error message improved for conflicts due to -pbp; thanks to Djun Kim.

- Fixed RT #80645, error parsing special array name '@$' when used as 
  @{$} or $#{$}

- Eliminated the -chk debug flag which was included in version 20010406 to
  do a one-time check for a bug with multi-line quotes.  It has not been
  needed since then.

- Numerous other minor formatting improvements.

2012 07 14

- Added flag -iscl (--ignore-side-comment-lengths) which causes perltidy 
  to ignore the length of side comments when setting line breaks, 
  RT #71848.  The default is to include the length of side comments when
  breaking lines to stay within the length prescribed by the -l=n
  maximum line length parameter.  For example,

    Default behavior on a single line with long side comment:
       $vmsfile =~ s/;[\d\-]*$//
         ;    # Clip off version number; we can use a newer version as well

    perltidy -iscl leaves the line intact:

       $vmsfile =~ s/;[\d\-]*$//; # Clip off version number; we can use a newer version as well

- Fixed RT #78182, side effects with STDERR.  Error handling has been
  revised and the documentation has been updated.  STDERR can now be 
  redirected to a string reference, and perltidy now returns an 
  error flag instead of calling die when input errors are detected. 
  If the error flag is set then no tidied output was produced.
  See man Perl::Tidy for an example.
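
  A minimal sketch of such a call is shown here.  The source, destination,
  stderr and argv parameters are those described in the Perl::Tidy
  documentation; the sample input and variable names are made up, and the
  sketch assumes the Perl::Tidy module is installed:

  ```perl
  use strict;
  use warnings;
  use Perl::Tidy;

  my $source      = 'my $x=1;';
  my $destination = '';
  my $stderr_text = '';

  # A true return value means input errors were detected and no tidied
  # output was produced.
  my $error = Perl::Tidy::perltidy(
      argv        => '-npro',           # ignore any .perltidyrc for this run
      source      => \$source,
      destination => \$destination,
      stderr      => \$stderr_text,     # messages collect here, not on STDERR
  );

  if ($error) {
      print "perltidy reported errors:\n$stderr_text";
  }
  else {
      print $destination;
  }
  ```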

- Fixed RT #78156, erroneous warning message for package VERSION syntax.

- Added abbreviation -conv (--converge) to simplify iteration control.
  -conv is equivalent to -it=4 and will ensure that the tidied code is
  converged to its final state with the minimum number of iterations.

- Minor formatting modifications have been made to ensure convergence.

- Simplified and hopefully improved the method for guessing the starting 
  indentation level of entabbed code.  Added flag -dt=n (--default-tabsize=n)
  which might be helpful if the guessing method does not work well for
  some editors.

- Added support for stacked labels, upper case X/B in hex and binary, and
  CORE:: namespace.

- Eliminated warning messages for using keyword names as constants.

2012 07 01

- Corrected problem introduced by using a chomp on scalar references, RT #77978

- Added support for Perl 5.14 package block syntax, RT #78114.

- A convergence test is made if three or more iterations are requested with
  the -it=n parameter to avoid wasting computer time.  Several hundred Mb of
  code gleaned from the internet were searched with the results that: 
   - It is unusual for two iterations to be required unless a major 
     style change is being made. 
   - Only one case has been found where three iterations were required.  
   - No cases requiring four iterations have been found with this version.
  For the previous version several cases were found where the results could
  oscillate between two semi-stable states. This version corrects this.

  So if it is important that the code be converged it is okay to set -it=4
  with this version and it will probably stop after the second iteration.
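
  The idea of stopping early once the output no longer changes can be
  sketched as follows.  This is an illustration only and not perltidy's
  actual convergence test; the sub name is hypothetical and the formatter is
  a toy stand-in:

  ```perl
  use strict;
  use warnings;

  # Run $format repeatedly until the output stops changing or the
  # iteration limit is reached.  Returns the final text and pass count.
  sub format_until_converged {
      my ( $text, $max_iterations, $format ) = @_;
      my $passes = 0;
      while ( $passes < $max_iterations ) {
          my $next = $format->($text);
          $passes++;
          my $converged = $next eq $text;
          $text = $next;
          last if $converged;
      }
      return ( $text, $passes );
  }

  # Toy formatter (not perltidy): removes one redundant space per pass.
  my $squeeze_once = sub { my ($s) = @_; $s =~ s/  / /; return $s };

  my ( $result, $passes ) =
    format_until_converged( 'a   b', 4, $squeeze_once );
  print "'$result' after $passes passes\n";
  ```

  The last pass is the one that confirms nothing changed, so a result that
  is stable after one pass costs two passes in total.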

- Improved ability to identify and retain good line break points in the
  input stream, such as at commas and equals. You can always tell 
  perltidy to ignore old breakpoints with -iob.  

- Fixed glitch in which a terminal closing hash brace followed by semicolon
  was not outdented back to the leading line depth like other closing
  tokens.  Thanks to Keith Neargarder for noting this.

    OLD:
       my ( $pre, $post ) = @{
           {
               "pp_anonlist" => [ "[", "]" ],
               "pp_anonhash" => [ "{", "}" ]
           }->{ $kid->ppaddr }
         };   # terminal brace

    NEW:
       my ( $pre, $post ) = @{
           {
               "pp_anonlist" => [ "[", "]" ],
               "pp_anonhash" => [ "{", "}" ]
           }->{ $kid->ppaddr }
       };    # terminal brace

- Removed extra indentation given to trailing 'if' and 'unless' clauses 
  without parentheses because this occasionally produced undesirable 
  results.  This only applies where parens are not used after the if or
  unless.

   OLD:
       return undef
         unless my ( $who, $actions ) =
             $clause =~ /^($who_re)((?:$action_re)+)$/o; 

   NEW:
       return undef
         unless my ( $who, $actions ) =
         $clause =~ /^($who_re)((?:$action_re)+)$/o;

2012 06 19

- Updated perltidy to handle all quote modifiers defined for perl 5 version 16.

- Side comment text in perltidyrc configuration files must now begin with
  at least one space before the #.  Thus:

  OK:
    -l=78 # Max line width is 78 cols
  BAD: 
    -l=78# Max line width is 78 cols

  This is probably true of almost all existing perltidyrc files, 
  but if you get an error message about bad parameters
  involving a '#' the first time you run this version, please check the side
  comments in your perltidyrc file, and add a space before the # if necessary.
  You can quickly see the contents of your perltidyrc file, if any, with the
  command:

    perltidy -dpro

  The reason for this change is that some parameters naturally involve
  the # symbol, and this can get interpreted as a side comment unless the
  parameter is quoted.  For example, to define -sbcp=# it used to be necessary
  to write
    -sbcp='#'
  to keep the # from becoming part of a comment.  This was causing 
  trouble for new users.  Now it can also be written without quotes: 
    -sbcp=#

- Fixed bug in processing some .perltidyrc files containing parameters with
  an opening brace character, '{'.  For example the following was
  incorrectly processed:
     --static-block-comment-prefix="^#{2,}[^\s#]"
  Thanks to pdagosto.

- Added flag -boa (--break-at-old-attribute-breakpoints) which retains
  any existing line breaks at attribute separation ':'. This is now the
  default, use -nboa to deactivate.  Thanks to Daphne Phister for the patch.  
  For example, given the following code, the line breaks at the ':'s will be
  retained:

                   my @field
                     : field
                     : Default(1)
                     : Get('Name' => 'foo') : Set('Name');

  whereas the previous version would have output a single line.  If
  the attributes are on a single line then they will remain on a single line.

- Added new flags --blank-lines-before-subs=n (-blbs=n) and
  --blank-lines-before-packages=n (-blbp=n) to put n blank lines before
  subs and packages.  The old flag -bbs is now equivalent to -blbs=1 -blbp=1,
  and -nbbs is equivalent to -blbs=0 -blbp=0. Requested by M. Schwern and
  several others.

- Added feature -nsak='*' meaning no space between any keyword and opening
  paren.  This avoids entering a long list of keywords.  Requested
  by M. Schwern.

- Added option to delete a backup of original file with in-place-modify (-b)
  if there were no errors.  This can be requested with the flag -bext='/'.  
  See documentation for details.  Requested by M. Schwern and others.

- Fixed bug where the module postfilter parameter was not applied when -b 
  flag was used.  This was discovered during testing.

- Fixed in-place-modify (-b) to work with symbolic links to source files.
  Thanks to Ted Johnson.

- Fixed bug where the Perl::Tidy module did not allow -b to be used 
  in some cases.

- No extra blank line is added before a comment which follows
  a short line ending in an opening token, for example like this:
   OLD:
           if (

               # unless we follow a blank or comment line
               $last_line_leading_type !~ /^[#b]$/
               ...

   NEW:
           if (
               # unless we follow a blank or comment line
               $last_line_leading_type !~ /^[#b]$/
               ...

   The blank is not needed for readability in these cases because there
   is already space above the comment.  If a blank already
   exists there it will not be removed, so this change should not 
   change code which has previously been formatted with perltidy. 
   Thanks to R.W.Stauner.

- Likewise, no extra blank line is added above a comment consisting of a
  single #, since nothing is gained in readability.

- Fixed error in which a blank line was removed after a #>>> directive. 
  Thanks to Ricky Morse.

- Unnecessary semicolons after given/when/default blocks are now removed.

- Fixed bug where an unwanted blank line could be added before
  pod text in __DATA__ or __END__ section.  Thanks to jidani.

- Changed exit flags from 1 to 0 to indicate success for -help, -version, 
  and all -dump commands.  Also added -? as another way to dump the help.
  Requested by Keith Neargarder.

- Fixed bug where .ERR and .LOG files were not written except for -it=2 or more

- Fixed bug where trailing blank lines at the end of a file were dropped when
  -it>1.

- Fixed bug where a line occasionally ended with an extra space. This reduces
  the number of instances where a second iteration gives a result different
  from the first. 

- Updated documentation to note that the Tidy.pm module <stderr> parameter may
  not be a reference to SCALAR or ARRAY; it must be a file.

- Syntax checking with perl now works when the Tidy.pm module is processing
  references to arrays and strings.  Thanks to Charles Alderman.

- Zero-length files are no longer processed due to concerns for data loss
  due to side effects in some scenarios.

- Block labels, if any, are now included in closing side comment text
  when the -csc flag is used.  Suggested by Aaron.  For example, 
  the label L102 in the following block is now included in the -csc text:

     L102: for my $i ( 1 .. 10 ) {
       ...
     } ## end L102: for my $i ( 1 .. 10 )

2010 12 17

- Added new flag -it=n or --iterations=n.
  This flag causes perltidy to do n complete iterations.
  For most purposes the default of n=1 should be satisfactory.  However n=2
  can be useful when a major style change is being made, or when code is being
  beautified on check-in to a source code control system.  The run time will be
  approximately proportional to n, and it should seldom be necessary to use a
  value greater than n=2.  Thanks to Jonathan Swartz.

- A configuration file pathname beginning with three dots, e.g.
  ".../.perltidyrc", indicates that the file should be searched for starting
  in the current directory and working upwards. This makes it easier to have
  multiple projects each with their own .perltidyrc in their root directories.
  Thanks to Jonathan Swartz for this patch.
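
  The upward search can be pictured with the following sketch.  It is not
  the code used by perltidy; the helper name is hypothetical and only
  standard modules are used:

  ```perl
  use strict;
  use warnings;
  use Cwd qw(getcwd);
  use File::Basename qw(dirname);
  use File::Spec;

  # Hypothetical helper: look for $name in $start_dir, then in each parent
  # directory in turn; returns the first match, or nothing if none exists.
  sub find_config_upwards {
      my ( $start_dir, $name ) = @_;
      my $dir = $start_dir;
      while (1) {
          my $candidate = File::Spec->catfile( $dir, $name );
          return $candidate if -f $candidate;
          my $parent = dirname($dir);
          return if $parent eq $dir;    # no further parent to try
          $dir = $parent;
      }
  }

  my $found = find_config_upwards( getcwd(), '.perltidyrc' );
  print defined $found ? "using $found\n" : "no .perltidyrc found\n";
  ```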

- Added flag --notidy which disables all formatting and causes the input to be
  copied unchanged.  This can be useful in conjunction with hierarchical
  .perltidyrc files to prevent unwanted tidying.
  Thanks to Jonathan Swartz for this patch.

- Added prefilters and postfilters in the call to the Tidy.pm module.
  The prefilter is a code reference that will be applied to the source
  before tidying, and the postfilter is a code reference that will be
  applied to the result before outputting.

  Thanks to Jonathan Swartz for this patch.  He writes:
  This is useful for all manner of customizations. For example, I use
  it to convert the 'method' keyword to 'sub' so that perltidy will work for
  Method::Signatures::Simple code:

  Perl::Tidy::perltidy(
     prefilter => sub { $_ = $_[0]; s/^method (.*)/sub $1 \#__METHOD/gm; return $_ },
     postfilter => sub { $_ = $_[0]; s/^sub (.*?)\s* \#__METHOD/method $1/gm; return $_ }
  );
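
  The two substitutions above are meant to undo each other.  The following
  sketch applies them directly, without calling Perl::Tidy, to show the
  round trip; the sample source text is invented for this example and the
  filters are written with lexical variables instead of $_:

  ```perl
  use strict;
  use warnings;

  # Same substitutions as above: mark each 'method' line and turn it
  # into a 'sub' line, then restore it afterwards.
  my $prefilter = sub {
      my ($text) = @_;
      $text =~ s/^method (.*)/sub $1 \#__METHOD/gm;
      return $text;
  };
  my $postfilter = sub {
      my ($text) = @_;
      $text =~ s/^sub (.*?)\s* \#__METHOD/method $1/gm;
      return $text;
  };

  my $source   = "method greet {\n    return 'hi';\n}\n";
  my $filtered = $prefilter->($source);
  my $restored = $postfilter->($filtered);

  print $filtered;
  print $restored eq $source ? "round trip ok\n" : "round trip failed\n";
  ```

  The \s* in the postfilter allows for a change in the spacing in front of
  the marker comment between the two steps.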

- The starting indentation level of sections of code entabbed with -et=n
  is correctly guessed if it was also produced with the same -et=n flag.  This
  keeps the indentation stable on repeated formatting passes within an editor.
  Thanks to Sam Kington and Glenn.

- Functions with prototype '&' had a space between the function and opening
  paren.  This space now only occurs if the flag --space-function-paren (-sfp)
  is set.  Thanks to Zrajm Akfohg.

- Patch to never put spaces around a bare word in braces beginning with ^ as in:
    my $before = ${^PREMATCH};
  even if requested with the -bt=0 flag because any spaces cause a syntax error in perl.
  Thanks to Fabrice Dulanoy.

2009 06 16

- Allow configuration file to be 'perltidy.ini' for Windows systems.
  i.e. C:\Documents and Settings\User\perltidy.ini
  and added documentation for setting configuration file under Windows in man
  page.  Thanks to Stuart Clark.

- Corrected problem of unwanted semicolons in hash ref within given/when code.
 Thanks to Nelo Onyiah.

- Added new flag -cscb or --closing-side-comments-balanced
 When using closing-side-comments, and the closing-side-comment-maximum-text
 limit is exceeded, then the comment text must be truncated.  Previous
 versions of perltidy terminate with three dots, and this can still be
 achieved with -ncscb:

  perltidy -csc -ncscb

  } ## end foreach my $foo (sort { $b cmp $a ...

 However this causes a problem with older editors which cannot recognize
 comments or are not configured to do so because they cannot "bounce" around in
 the text correctly.  The -cscb flag tries to help them by
 appending appropriate terminal balancing structure:

  perltidy -csc -cscb

  } ## end foreach my $foo (sort { $b cmp $a ... })

 Since there is much to be gained and little to be lost by doing this,
 the default is B<-cscb>.  Use B<-ncscb> if you do not want this.

 Thanks to Daniel Becker for suggesting this option.

- After an isolated closing eval block the continuation indentation will be
  removed so that the braces line up more like other blocks.  Thanks to Yves Orton.

OLD:
   eval {
       #STUFF;
       1;    # return true
     }  
     or do {
       #handle error
     };

NEW:
   eval {
       #STUFF;
       1;    # return true
   } or do {
       #handle error
   };

-A new flag -asbl (or --opening-anonymous-sub-brace-on-new-line) has
 been added to put the opening brace of anonymous subs on a new line,
 as in the following snippet:

   my $code = sub
   {
       my $arg = shift;
       return $arg->(@_);
   };

 This was not possible before because the -sbl flag only applies to named
 subs. Thanks to Benjamin Krupp.

-Fix tokenization bug with the following snippet
  print 'hi' if { x => 1, }->{x};
 which resulted in a semicolon being added after the comma.  The workaround
 was to use -nasc, but this is no longer necessary.  Thanks to Brian Duggan. 

-Fixed problem in which an incorrect error message could be triggered
by the (unusual) combination of parameters  -lp -i=0 -l=2 -ci=0 for
example.  Thanks to Richard Jelinek.

-A new flag --keep-old-blank-lines=n has been added to
give more control over the treatment of old blank lines in
a script.  The manual has been revised to discuss the new
flag and clarify the treatment of old blank lines.  Thanks
to Oliver Schaefer.

2007 12 05

-Improved support for perl 5.10: New quote modifier 'p', new block type UNITCHECK, 
new keyword break, improved formatting of given/when.

-Corrected tokenization bug of something like $var{-q}.

-Numerous minor formatting improvements.

-Corrected list of operators controlled by -baao -bbao to include
  . : ? && || and or err xor

-Corrected very minor error in log file involving incorrect comment
regarding need for upper case of labels.  

-Fixed problem where perltidy could run for a very long time
when given certain non-perl text files.

-Line breaks in un-parenthesized lists now try to follow
line breaks in the input file rather than trying to fill
lines.  This usually works better, but if this causes
trouble you can use -iob to ignore any old line breaks.
Example for the following input snippet:

   print
   "conformability (Not the same dimension)\n",
   "\t", $have, " is ", text_unit($hu), "\n",
   "\t", $want, " is ", text_unit($wu), "\n",
   ;

 OLD:
   print "conformability (Not the same dimension)\n", "\t", $have, " is ",
     text_unit($hu), "\n", "\t", $want, " is ", text_unit($wu), "\n",;

 NEW:
   print "conformability (Not the same dimension)\n",
     "\t", $have, " is ", text_unit($hu), "\n",
     "\t", $want, " is ", text_unit($wu), "\n",
     ;

2007 08 01

-Added -fpsc option (--fixed-position-side-comment). Thanks to Ueli Hugenschmidt. 
For example -fpsc=40 tells perltidy to put side comments in column 40
if possible.  

-Added -bbao and -baao options (--break-before-all-operators and
--break-after-all-operators) to simplify command lines and configuration
files.  These define an initial preference for breaking at operators which can
be modified with -wba and -wbb flags.  For example to break before all operators
except an = one could use --bbao -wba='=' rather than listing every
single perl operator (except =) on a -wbb flag.

-Added -kis option (--keep-interior-semicolons).  Use the B<-kis> flag
to prevent breaking at a semicolon if there was no break there in the
input file.  To illustrate, consider the following input lines:

   dbmclose(%verb_delim); undef %verb_delim;
   dbmclose(%expanded); undef %expanded;
   dbmclose(%global); undef %global;

Normally these would be broken into six lines, but 
perltidy -kis gives:

   dbmclose(%verb_delim); undef %verb_delim;
   dbmclose(%expanded);   undef %expanded;
   dbmclose(%global);     undef %global;

-Improved formatting of complex ternary statements, with indentation
of nested statements.  
 OLD:
   return defined( $cw->{Selected} )
     ? (wantarray)
     ? @{ $cw->{Selected} }
     : $cw->{Selected}[0]
     : undef;

 NEW:
   return defined( $cw->{Selected} )
     ? (wantarray)
         ? @{ $cw->{Selected} }
         : $cw->{Selected}[0]
     : undef;

-Text following un-parenthesized if/unless/while/until statements gets a
full level of indentation.  Suggested by Jeff Armstrong and others.
OLD:
   return $ship->chargeWeapons("phaser-canon")
     if $encounter->description eq 'klingon'
     and $ship->firepower >= $encounter->firepower
     and $location->status ne 'neutral';
NEW:
   return $ship->chargeWeapons("phaser-canon")
     if $encounter->description eq 'klingon'
         and $ship->firepower >= $encounter->firepower
         and $location->status ne 'neutral';

2007 05 08

-Fixed bug where #line directives were being indented.  Thanks to
Philippe Bruhat.

2007 05 04

-Fixed problem where an extra blank line was added after an =cut when either
(a) the =cut started (not stopped) a POD section, or (b) -mbl > 1. 
Thanks to J. Robert Ray and Bill Moseley.

2007 04 24

-ole (--output-line-ending) and -ple (--preserve-line-endings) should
now work on all systems rather than just unix systems. Thanks to Dan
Tyrell.

-Fixed problem of a warning issued for multiple subs for BEGIN subs
and other control subs. Thanks to Heiko Eissfeldt.

-Fixed problem where no space was introduced between a keyword or
bareword and a colon, such as:

( ref($result) eq 'HASH' && !%$result ) ? undef: $result;

Thanks to Niek.

-Added a utility program 'break_long_quotes.pl' to the examples directory of
the distribution.  It breaks long quoted strings into a chain of concatenated
sub strings no longer than a selected length.  Suggested by Michael Renner as
a perltidy feature but was judged to be best done in a separate program.

-Updated docs to remove extra < and >= from list of tokens 
after which breaks are made by default.  Thanks to Bob Kleemann.

-Removed improper uses of $_ to avoid conflicts with external calls, giving
error message similar to:
   Modification of a read-only value attempted at 
   /usr/share/perl5/Perl/Tidy.pm line 6907.
Thanks to Michael Renner.
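
For readers unfamiliar with this class of error, the following standalone
sketch shows how a routine that assigns to the global `$_` can fail when its
caller has `$_` aliased to a read-only value, and how a lexical variable
avoids the problem. It is not perltidy's code; the sub names and sample
strings are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper that assigns to the global $_ (the problematic style).
sub upcase_with_topic {
    $_ = shift;
    tr/a-z/A-Z/;
    return $_;
}

# Same work with a lexical variable; the caller's $_ is never touched.
sub upcase_with_lexical {
    my $text = shift;
    $text =~ tr/a-z/A-Z/;
    return $text;
}

my ( $error, $result );
for ('literal') {    # inside this loop $_ is an alias to a read-only constant
    eval { upcase_with_topic('abc'); 1 } or $error = $@;
    $result = upcase_with_lexical('abc');
}

print "topic version failed: $error";
print "lexical version returned: $result\n";
```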

-Fixed problem when errorfile was not a plain filename or filehandle
in a call to Tidy.pm.  The call
perltidy(source => \$input, destination => \$output, errorfile => \$err);
gave the following error message:
 Not a GLOB reference at /usr/share/perl5/Perl/Tidy.pm line 3827.
Thanks to Michael Renner and Philippe Bruhat.

-Fixed problem where -sot would not stack an opening token followed by
a side comment.  Thanks to Jens Schicke.

-improved breakpoints in complex math and other long statements. Example:
OLD:
   return
     log($n) + 0.577215664901532 + ( 1 / ( 2 * $n ) ) -
     ( 1 / ( 12 * ( $n**2 ) ) ) + ( 1 / ( 120 * ( $n**4 ) ) );
NEW:
   return
     log($n) + 0.577215664901532 +
     ( 1 / ( 2 * $n ) ) -
     ( 1 / ( 12 * ( $n**2 ) ) ) +
     ( 1 / ( 120 * ( $n**4 ) ) );

-more robust vertical alignment of complex terminal else blocks and ternary
statements.

2006 07 19

-Eliminated bug where a here-doc invoked through an 'e' modifier on a pattern
replacement text was not recognized.  The tokenizer now recursively scans
replacement text (but does not reformat it).

-improved vertical alignment of terminal else blocks and ternary statements.
 Thanks to Chris for the suggestion. 

 OLD:
   if    ( IsBitmap() ) { return GetBitmap(); }
   elsif ( IsFiles() )  { return GetFiles(); }
   else { return GetText(); }

 NEW:
   if    ( IsBitmap() ) { return GetBitmap(); }
   elsif ( IsFiles() )  { return GetFiles(); }
   else                 { return GetText(); }

 OLD:
   $which_search =
       $opts{"t"} ? 'title'
     : $opts{"s"} ? 'subject'
     : $opts{"a"} ? 'author'
     : 'title';

 NEW:
   $which_search =
       $opts{"t"} ? 'title'
     : $opts{"s"} ? 'subject'
     : $opts{"a"} ? 'author'
     :              'title';

-improved indentation of try/catch blocks and other externally defined
functions accepting a block argument.  Thanks to jae.

-Added support for Perl 5.10 features say and smartmatch.

-Added flag -pbp (--perl-best-practices) as an abbreviation for parameters
suggested in Damian Conway's "Perl Best Practices".  -pbp is the same as:

   -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 -nsfs -nolq
   -wbb="% + - * / x != == >= <= =~ !~ < > | & >= < = 
         **= += *= &= <<= &&= -= /= |= >>= ||= .= %= ^= x="

 Please note that the -st here restricts input to standard input; use
 -nst if necessary to override.

-Eliminated some needless breaks at equals signs in -lp indentation.

   OLD:
       $c =
         Math::Complex->make(LEFT + $x * (RIGHT - LEFT) / SIZE,
                             TOP + $y * (BOTTOM - TOP) / SIZE);
   NEW:
       $c = Math::Complex->make(LEFT + $x * (RIGHT - LEFT) / SIZE,
                                TOP + $y * (BOTTOM - TOP) / SIZE);

A break at an equals is sometimes useful for preventing complex statements 
from hitting the line length limit.  The decision to do this was 
over-eager in some cases and has been improved.  Thanks to Royce Reece.

-qw quotes contained in braces, square brackets, and parens are being
treated more like those containers as far as stacking of tokens.  Also
stack of closing tokens ending ');' will outdent to where the ');' would
have outdented if the closing stack is matched with a similar opening stack.

 OLD: perltidy -soc -sct
   __PACKAGE__->load_components(
       qw(
         PK::Auto
         Core
         )
   );
 NEW: perltidy -soc -sct
   __PACKAGE__->load_components( qw(
         PK::Auto
         Core
   ) );
 Thanks to Aran Deltac

-Eliminated some undesirable or marginally desirable vertical alignments.
These include terminal colons, opening braces, and equals, and particularly
when just two lines would be aligned.

OLD:
   my $accurate_timestamps = $Stamps{lnk};
   my $has_link            = 
       ...
NEW:
   my $accurate_timestamps = $Stamps{lnk};
   my $has_link =

-Corrected a problem with -mangle in which a space would be removed
between a keyword and variable beginning with ::.

2006 06 14

-Attribute argument lists are now correctly treated as quoted strings
and not formatted.  This is the most important update in this version.
Thanks to Boris Zentner, Greg Ferguson, Steve Kirkup.

-Updated to recognize the defined-or operator, //, to be released in Perl 5.10.
Thanks to Sebastien Aperghis-Tramoni.

-A useful utility perltidyrc_dump.pl is included in the examples section.  It
will read any perltidyrc file and write it back out in a standard format
(though comments are lost).

-Added option to have perltidy read and return a hash with the contents of a
perltidyrc file.  This may be used by Leif Eriksen's tidyview code.  This
feature is used by the demonstration program 'perltidyrc_dump.pl' in the
examples directory.

-Improved error checking in perltidyrc files.  Unknown bare words were not
being caught.

-The --dump-options parameter now dumps parameters in the format required by a
perltidyrc file.

-V-Strings with underscores are now recognized.
For example: $v = v1.2_3; 

-cti=3 option added which gives one extra indentation level to closing 
tokens always.  This provides more predictable closing token placement
than cti=2.  If you are using cti=2 you might want to try cti=3.

-To identify all left-adjusted comments as static block comments, use C<-sbcp='^#'>.

-New parameters -fs, -fsb, -fse added to allow sections of code between #<<<
and #>>> to be passed through verbatim. This is enabled by default and turned
off by -nfs.  Flags -fsb and -fse allow other beginning and ending markers.
Thanks to Wolfgang Werner and Marion Berryman for suggesting this.  

-added flag -skp to put a space between all Perl keywords and following paren.
The default is to only do this for certain keywords.  Suggested by
H.Merijn Brand.

-added flag -sfp to put a space between a function name and following paren.
The default is not to do this.  Suggested by H.Merijn Brand.

-Added patch to avoid breaking GetOpt::Long::Configure set by calling program. 
Thanks to Philippe Bruhat.

-An error was fixed in which certain parameters in a .perltidyrc file given
without the equals sign were not recognized.  That is,
'--brace-tightness 0' gave an error but '--brace-tightness=0' worked
ok.  Thanks to Zac Hansen.

-An error preventing the -nwrs flag from working was corrected. Thanks to
 Greg Ferguson.

-Corrected some alignment problems with entab option.

-A bug with the combination of -lp and -extrude was fixed (though this
combination doesn't really make sense).  The bug was that a line with
a single zero would be dropped.  Thanks to Cameron Hayne.

-Updated Windows detection code to avoid an undefined variable.
Thanks to Joe Yates and Russ Jones.

-Improved formatting for short trailing statements following a closing paren.
Thanks to Joe Matarazzo.

-The handling of the -icb (indent closing block braces) flag has been changed
slightly to provide more consistent and predictable formatting of complex
structures.  Instead of giving a closing block brace the indentation of the
previous line, it is now given one extra indentation level.  The two methods
give the same result if the previous line was a complete statement, as in this
example:

       if ($task) {
           yyy();
           }    # -icb
       else {
           zzz();
           }
The change also fixes a problem with empty blocks such as:

   OLD, -icb:
   elsif ($debug) {
   }

   NEW, -icb:
   elsif ($debug) {
       }

-A problem with -icb was fixed in which a closing brace was misplaced when
it followed a quote which spanned multiple lines.

-Some improved breakpoints for -wba='&& || and or'

-Fixed problem with misaligned cuddled else in complex statements
when the -bar flag was also used.  Thanks to Alex and Royce Reese.

-Corrected documentation to show that --outdent-long-comments is the default.
Thanks to Mario Lia.

-New flag -otr (opening-token-right) is similar to -bar (braces-always-right)
but applies to non-structural opening tokens.

-new flags -sot (stack-opening-token), -sct (stack-closing-token).
Suggested by Tony.

2003 10 21

-The default has been changed to not do syntax checking with perl.  
  Use -syn if you want it.  Perltidy is very robust now, and the -syn
  flag now causes more problems than it's worth because of BEGIN blocks
  (which get executed with perl -c).  For example, perltidy will never
  return when trying to beautify this code if -syn is used:

       BEGIN { 1 while { }; }

 Although this is an obvious error, perltidy is often run on untested
 code which is more likely to have this sort of problem.  A more subtle
 example is:

       BEGIN { use FindBin; }

 which may hang on some systems using -syn if a shared file system is
 unavailable.
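
The point that `perl -c` executes BEGIN blocks can be confirmed with a small
standalone script. This is only a sketch: it launches the running interpreter
(`$^X`) with `-c` on a one-line program and captures what the BEGIN block
prints. The message text is made up, and the list-form pipe open is assumed
to be available on the platform.

```perl
use strict;
use warnings;

# One-line program whose BEGIN block has a visible side effect.
my $program = 'BEGIN { print "BEGIN block executed\n" }';

# List-form pipe open avoids shell quoting; -c asks perl to compile only.
open( my $fh, '-|', $^X, '-c', '-e', $program )
  or die "cannot run $^X: $!";
my $output = do { local $/; <$fh> };    # slurp the child's standard output
close $fh;

# The BEGIN block ran even though the program body was never executed.
print "captured: $output";
```

Any side effect placed in a BEGIN block (an endless loop, a file system
access) therefore also happens during a syntax check, which is the hazard
described above.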

-Changed style -gnu to use -cti=1 instead of -cti=2 (see next item).
 In most cases it looks better.  To recover the previous format, use
 '-gnu -cti=2'

-Added flags -cti=n for finer control of closing token indentation.
  -cti = 0 no extra indentation (default; same as -nicp)
  -cti = 1 enough indentation so that the closing token
       aligns with its opening token.
  -cti = 2 one extra indentation level if the line has the form 
         );   ];   or   };     (same as -icp).

  The new option -cti=1 works well with -lp:

  EXAMPLES:

   # perltidy -lp -cti=1
   @month_of_year = (
                      'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                      'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
                    );

   # perltidy -lp -cti=2
   @month_of_year = (
                      'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                      'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
                      );
 This is backwards compatible with -icp. See revised manual for
 details.  Suggested by Mike Pennington.

-Added flag '--preserve-line-endings' or '-ple' to cause the output
 line ending to be the same as in the input file, for unix, dos, 
 or mac line endings.  Only works under unix. Suggested by 
 Rainer Hochschild.

-Added flag '--output-line-ending=s' or '-ole=s' where s=dos or win,
 unix, or mac.  Only works under unix.

-Files with Mac line endings should now be handled properly under unix
 and dos without being passed through a converter.

-You may now include 'and', 'or', and 'xor' in the list following
 '--want-break-after' to get line breaks after those keywords rather than
 before them.  Suggested by Rainer Hochschild.

-Corrected problem with command line option for -vtc=n and -vt=n. The
 equals sign was being eaten up by the Windows shell so perltidy didn't
 see it.

2003 07 26

-Corrected cause of warning message with recent versions of Perl:
   "Possible precedence problem on bitwise & operator at ..."
 Thanks to Jim Files.

-fixed bug with -html with '=for pod2html' sections, in which code/pod
output order was incorrect.  Thanks to Tassilo von Parseval.

-fixed bug when the -html flag is used, in which the following error
message, plus others, appear:
    did not see <body> in pod2html output
This was caused by a change in the format of html output by pod2html
VERSION 1.04 (included with perl 5.8).  Thanks to Tassilo von Parseval.

-Fixed bug where an __END__ statement would be mistaken for a label
if it is immediately followed by a line with a leading colon. Thanks
to John Bayes.

-Implemented guessing logic for brace types when it is ambiguous.  This
has been on the TODO list a long time.  Thanks to Boris Zentner for
an example.

-Long options may now be negated either as '--nolong-option' 
or '--no-long-option'.  Thanks to Philip Newton for the suggestion.

-added flag --html-entities or -hent which controls the use of
Html::Entities for html formatting.  Use --nohtml-entities or -nhent to
prevent the use of Html::Entities to encode special symbols.  The
default is -hent, which uses Html::Entities to escape special symbols
when formatting perl text.  This may or may not be the right thing to do,
depending on browser/language combinations.  Thanks to Burak Gursoy for
this suggestion.

-Bareword strings with leading '-', like, '-foo' now count as 1 token
for horizontal tightness.  This way $a{'-foo'}, $a{foo}, and $a{-foo}
are now all treated similarly.  Thus, by default, OLD: $a{ -foo } will
now be NEW: $a{-foo}.  Suggested by Mark Olesen.

-added 2 new flags to control spaces between keywords and opening parens:
  -sak=s  or --space-after-keyword=s,  and
  -nsak=s or --nospace-after-keyword=s, where 's' is a list of keywords.

The new default list of keywords which get a space is:

  "my local our and or eq ne if else elsif until unless while for foreach
    return switch case given when"

Use -sak=s and -nsak=s to add and remove keywords from this list,
   respectively.

Explanation: Stephen Hildrey noted that perltidy was being inconsistent
in placing spaces between keywords and opening parens, and sent a patch
to give user control over this.  The above list was selected as being
a reasonable default keyword list.  Previously, perltidy
had a hardwired list which also included these keywords:

       push pop shift unshift join split die

but did not have 'our'.  Example: if you prefer to make perltidy behave
exactly as before, you can include the following two lines in your
.perltidyrc file: 

  -sak="push pop local shift unshift join split die"
  -nsak="our"

-Corrected html error in .toc file when -frm -html is used (extra ");
 browsers were tolerant of it.

-Improved alignment of chains of binary and ?/: operators. Example:
 OLD:
   $leapyear =
     $year % 4     ? 0
     : $year % 100 ? 1
     : $year % 400 ? 0
     : 1;
 NEW:
   $leapyear =
       $year % 4   ? 0
     : $year % 100 ? 1
     : $year % 400 ? 0
     : 1;

-improved breakpoint choices involving '->'

-Corrected tokenization of things like ${#}. For example,
 ${#} is valid, but ${# } is a syntax error.

-Corrected minor tokenization errors with indirect object notation.
 For example, 'new A::()' works now.

-Minor tokenization improvements; all perl code distributed with perl 5.8 
 seems to be parsed correctly except for one instance (lextest.t) 
 of the known bug.

2002 11 30

-Implemented scalar attributes.  Thanks to Sean Tobin for noting this.

-Fixed glitch introduced in previous release where -pre option
was not outputting a leading html <pre> tag.

-Numerous minor improvements in vertical alignment, including the following:

-Improved alignment of opening braces in many cases.  Needed for improved
switch/case formatting, and also suggested by Mark Olesen for sort/map/grep
formatting.  For example:

 OLD:
   @modified =
     map { $_->[0] }
     sort { $a->[1] <=> $b->[1] }
     map { [ $_, -M ] } @filenames;

 NEW:
   @modified =
     map  { $_->[0] }
     sort { $a->[1] <=> $b->[1] }
     map  { [ $_, -M ] } @filenames;

-Eliminated alignments across unrelated statements. Example:
 OLD:
   $borrowerinfo->configure( -state => 'disabled' );
   $borrowerinfo->grid( -col        => 1, -row => 0, -sticky => 'w' );

 NEW:  
   $borrowerinfo->configure( -state => 'disabled' );
   $borrowerinfo->grid( -col => 1, -row => 0, -sticky => 'w' );

 Thanks to Mark Olesen for suggesting this.

-Improved alignment of '='s in certain cases.
 Thanks to Norbert Gruener for sending an example.

-Outdent-long-comments (-olc) has been re-instated as a default, since
 it works much better now.  Use -nolc if you want to prevent it.

-Added check for 'perltidy file.pl -o file.pl', which causes file.pl
to be lost. (The -b option should be used instead). Thanks to mreister
for reporting this problem.

2002 11 06

-Switch/case or given/when syntax is now recognized.  Its vertical alignment
is not great yet, but it parses ok.  The words 'switch', 'case', 'given',
and 'when' are now treated as keywords.  If this causes trouble with older
code, we could introduce a switch to deactivate it.  Thanks to Stan Brown
and Jochen Schneider for recommending this.

-Corrected error parsing sub attributes with call parameters.
Thanks to Marc Kerr for catching this.

-Sub prototypes no longer need to be on the same line as sub names.  

-a new flag -frm or --frames will cause html output to be in a
frame, with table of contents in the left panel and formatted source
in the right panel.  Try 'perltidy -html -frm somemodule.pm' for example.

-The new default for -html formatting is to pass the pod through Pod::Html.
The result is syntax colored code within your pod documents. This can be
deactivated with -npod.  Thanks to those who have written to discuss this,
particularly Mark Olesen and Hugh Myers.

-the -olc (--outdent-long-comments) option works much better.  It now outdents
groups of consecutive comments together, and by just the amount needed to
avoid having any one line exceeding the maximum line length.

-block comments are now trimmed of trailing whitespace.

-if a directory specified with -opath does not exist, it will be created.

-a table of contents to packages and subs is output when -html is used.
Use -ntoc to prevent this. 

-fixed an unusual bug in which a 'for' statement following a 'format'
statement was not correctly tokenized.  Thanks to Boris Zentner for
catching this.

-Tidy.pm is no longer dependent on modules IO::Scalar and IO::ScalarArray.  
There were some speed issues.  Suggested by Joerg Walter.

-The treatment of quoted wildcards (file globs) is now system-independent. 
For example

   perltidy 'b*x.p[lm]'

would match box.pl, box.pm, brinx.pm under any operating system.  Of
course, anything unquoted will be subject to expansion by any shell.
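
To make the matching rule concrete, here is a minimal sketch of how a quoted
pattern such as 'b*x.p[lm]' can be matched against file names in a
system-independent way. It is not perltidy's own implementation: the
`glob_to_regex` helper is hypothetical, and it handles only `*`, `?`, and
simple `[...]` classes.

```perl
use strict;
use warnings;

# Translate a simple file glob (*, ?, [...]) into an anchored regex.
sub glob_to_regex {
    my ($pattern) = @_;
    my $regex = '';
    for my $piece ( split /(\*|\?|\[[^\]]*\])/, $pattern ) {
        if    ( $piece eq '*' )                 { $regex .= '.*' }
        elsif ( $piece eq '?' )                 { $regex .= '.' }
        elsif ( $piece =~ /\A\[[^\]]*\]\z/ )    { $regex .= $piece }
        else                                    { $regex .= quotemeta $piece }
    }
    return qr/\A$regex\z/;
}

my $matcher = glob_to_regex('b*x.p[lm]');
my @names   = qw(box.pl box.pm brinx.pm box.py abox.pl);
my @matched = grep { $_ =~ $matcher } @names;

print "@matched\n";    # prints "box.pl box.pm brinx.pm"
```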

-default color for keywords under -html changed from 
SaddleBrown (#8B4513) to magenta4 (#8B008B).

-fixed an arg parsing glitch in which something like:
  perltidy quick-help
would trigger the help message and exit, rather than operate on the
file 'quick-help'.

2002 09 22

-New option '-b' or '--backup-and-modify-in-place' will cause perltidy to
overwrite the original file with the tidied output file.  The original
file will be saved with a '.bak' extension (which can be changed with
-bext=s).  Thanks to Rudi Farkas for the suggestion.

-An index to all subs is included at the top of -html output, unless
only the <pre> section is written.

-Anchor lines of the form <a name="mysub"></a> are now inserted at key points
in html output, such as before sub definitions, for the convenience of
postprocessing scripts.  Suggested by Howard Owen.

-The cuddled-else (-ce) flag now also makes cuddled continues, like
this:

   while ( ( $pack, $file, $line ) = caller( $i++ ) ) {
      # bla bla
   } continue {
       $prevpack = $pack;
   }

Suggested by Simon Perreault.  

-Fixed bug in which an extra blank line was added before an =head or 
similar pod line after an __END__ or __DATA__ line each time 
perltidy was run.  Also, an extra blank was being added after
a terminal =cut.  Thanks to Mike Birdsall for reporting this.

2002 08 26

-Fixed bug in which space was inserted in a hyphenated hash key:
   my $val = $myhash{USER-NAME};
 was converted to:
   my $val = $myhash{USER -NAME}; 
 Thanks to an anonymous bug reporter at sourceforge.

-Fixed problem with the '-io' ('--indent-only') where all lines 
 were double spaced.  Thanks to Nick Andrew for reporting this bug.

-Fixed tokenization error in which something like '-e1' was 
 parsed as a number. 

-Corrected a rare problem involving older perl versions, in which 
 a line break before a bareword caused problems with 'use strict'.
 Thanks to Wolfgang Weisselberg for noting this.

-More syntax error checking added.

-Outdenting labels (-ola) has been made the default, in order to follow the
 perlstyle guidelines better.  It's probably a good idea in general, but
 if you do not want this, use -nola in your .perltidyrc file.

-Updated rules for padding logical expressions to include more cases.
 Thanks to Wolfgang Weisselberg for helpful discussions.

-Added new flag -osbc (--outdent-static-block-comments) which will
 outdent static block comments by 2 spaces (or whatever -ci equals).
 Requested by Jon Robison.

2002 04 25

-Corrected a bug, introduced in the previous release, in which some
 closing side comments (-csc) could have incorrect text.  This is
 annoying but will be correct the next time perltidy is run with -csc.

-Fixed bug where whitespace was being removed between 'Bar' and '()' 
 in a use statement like:

      use Foo::Bar ();

-Whenever possible, if a logical expression is broken with leading
 '&&', '||', 'and', or 'or', then the leading line will be padded
 with additional space to produce alignment.  This has been on the
 todo list for a long time; thanks to Frank Steinhauer for reminding
 me to do it.  Notice the first line after the open parens here:

       OLD: perltidy -lp
       if (
            !param("rules.to.$linecount")
            && !param("rules.from.$linecount")
            && !param("rules.subject.$linecount")
            && !(
                  param("rules.fieldname.$linecount")
                  && param("rules.fieldval.$linecount")
            )
            && !param("rules.size.$linecount")
            && !param("rules.custom.$linecount")
         )

       NEW: perltidy -lp
       if (
               !param("rules.to.$linecount")
            && !param("rules.from.$linecount")
            && !param("rules.subject.$linecount")
            && !(
                     param("rules.fieldname.$linecount")
                  && param("rules.fieldval.$linecount")
            )
            && !param("rules.size.$linecount")
            && !param("rules.custom.$linecount")
         )

2002 04 16

-Corrected a mistokenization of variables for a package with a name
 equal to a perl keyword.  For example: 

    my::qx();
    package my;
    sub qx{print "Hello from my::qx\n";}

 In this case, the leading 'my' was mistokenized as a keyword, and a
 space was being placed between 'my' and '::'.  This has been
 corrected.  Thanks to Martin Sluka for discovering this. 

-A new flag -bol (--break-at-old-logic-breakpoints)
 has been added to control whether containers with logical expressions
 should be broken open.  This is the default.

-A new flag -bok (--break-at-old-keyword-breakpoints)
 has been added to follow breaks at old keywords which return lists,
 such as sort and map.  This is the default.

-A new flag -bot (--break-at-old-trinary-breakpoints) has been added to
 follow breaks at trinary (conditional) operators.  This is the default.

-A new flag -cab=n has been added to control breaks at commas after
 '=>' tokens.  The default is n=1, meaning break unless this breaks
 open an existing one-line container.

-A new flag -boc has been added to allow existing list formatting
 to be retained.  (--break-at-old-comma-breakpoints).  See updated manual.

-A new flag -iob (--ignore-old-breakpoints) has been added to
 prevent the locations of old breakpoints from influencing the output
 format.

-Corrected problem where nested parentheses were not getting full
 indentation.  This has been on the todo list for some time; thanks 
 to Axel Rose for a snippet demonstrating this issue.

           OLD: inner list is not indented
           $this->sendnumeric(
               $this->server,
               (
                 $ret->name,        $user->username, $user->host,
               $user->server->name, $user->nick,     "H"
               ),
           );

           NEW:
           $this->sendnumeric(
               $this->server,
               (
                   $ret->name,          $user->username, $user->host,
                   $user->server->name, $user->nick,     "H"
               ),
           );

-Code cleaned up by removing the following unused, undocumented flags.
 They should not be in any .perltidyrc files because they were just
 experimental flags which were never documented.  Most of them placed
 artificial limits on spaces, and Wolfgang Weisselberg convinced me that
 most of them do more harm than good by causing unexpected results.

 --maximum-continuation-indentation (-mci)
 --maximum-whitespace-columns
 --maximum-space-to-comment (-xsc)
 --big-space-jump (-bsj)

-Pod file 'perltidy.pod' has been appended to the script 'perltidy', and
 Tidy.pod has been appended to the module 'Tidy.pm'.  Older MakeMaker's
 were having trouble.

-A new flag -isbc has been added for more control on comments. This flag
 has the effect that if there is no leading space on the line, then the
 comment will not be indented, and otherwise it may be.  If both -ibc and
 -isbc are set, then -isbc takes priority.  Thanks to Frank Steinhauer
 for suggesting this.

-A new document 'stylekey.pod' has been created to quickly guide new users
 through the maze of perltidy style parameters.  An html version is 
 on the perltidy web page.  Take a look! It should be very helpful.

-Parameters for controlling 'vertical tightness' have been added:
 -vt and -vtc are the main controls, but finer control is provided
 with -pvt, -pcvt, -bvt, -bcvt, -sbvt, -sbcvt.  Block brace vertical
 tightness controls have also been added.
 See updated manual and also see 'stylekey.pod'. Simple examples:

   # perltidy -lp -vt=1 -vtc=1
   @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                      'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec' );

   # perltidy -lp -vt=1 -vtc=0
   @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                      'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
   );

-Lists which do not format well in uniform columns are now better
 identified and formatted.

   OLD:
   return $c->create( 'polygon', $x, $y, $x + $ruler_info{'size'},
       $y + $ruler_info{'size'}, $x - $ruler_info{'size'},
       $y + $ruler_info{'size'} );

   NEW:
   return $c->create(
       'polygon', $x, $y,
       $x + $ruler_info{'size'},
       $y + $ruler_info{'size'},
       $x - $ruler_info{'size'},
       $y + $ruler_info{'size'}
   );

   OLD:
     radlablist($f1, pad('Initial', $p), $b->{Init}->get_panel_ref, 'None ',
                'None', 'Default', 'Default', 'Simple', 'Simple');
   NEW:
     radlablist($f1,
                pad('Initial', $p),
                $b->{Init}->get_panel_ref,
                'None ', 'None', 'Default', 'Default', 'Simple', 'Simple');

-Corrected problem where an incorrect html filename was generated for 
 external calls to Tidy.pm module.  Fixed incorrect html title when
 Tidy.pm is called with IO::Scalar or IO::Array source.

-Output file permissions are now set as follows.  An output script file
 gets the same permission as the input file, except that owner
 read/write permission is added (otherwise, perltidy could not be
 rerun).  Html output files use system defaults.  Previously chmod 0755
 was used in all cases.  Thanks to Mark Olesen for bringing this up.
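
 A minimal sketch of the permission rule described above, not the actual
 perltidy code.  The helper name 'output_mode' is hypothetical; it keeps
 the permission bits of the input file and adds owner read/write (0600):

```perl
use strict;
use warnings;

# Hypothetical helper: derive the output file mode from the input file mode.
sub output_mode {
    my ($input_mode) = @_;

    # keep only the permission bits, then add owner read/write
    return ( $input_mode & 07777 ) | 0600;
}

printf "%04o\n", output_mode(0444);    # read-only input: prints 0644
printf "%04o\n", output_mode(0755);    # executable script: prints 0755
```

 In real use the input mode would come from something like (stat $file)[2]
 and the result would be passed to chmod.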

-Missing semicolons will not be added in multi-line blocks of type
 sort, map, or grep.  This brings perltidy into closer agreement
 with common practice.  Of course, you can still put semicolons 
 there if you like.  Thanks to Simon Perreault for a discussion of this.

-Most instances of extra semicolons are now deleted.  This is
 particularly important if the -csc option is used.  Thanks to Wolfgang
 Weisselberg for noting this.  For example, the following line
 (produced by 'h2xs' :) has an extra semicolon which will now be
 removed:

    BEGIN { plan tests => 1 };

-New parameter -csce (--closing-side-comment-else-flag) can be used
 to control what text is appended to 'else' and 'elsif' blocks.
 Default is to just add leading 'if' text to an 'else'.  See manual.

-The -csc option now labels 'else' blocks with additional information
 from the opening if statement and elsif statements, if space permits.
 Thanks to Wolfgang Weisselberg for suggesting this.

-The -csc option will now remove any old closing side comments
 below the line interval threshold. Thanks to Wolfgang Weisselberg for
 suggesting this.

-The abbreviation feature, which was broken in the previous version,
 is now fixed.  Thanks to Michael Cartmell for noting this.

-Vertical alignment is now done for '||='  .. somehow this was 
 overlooked.

2002 02 25

-This version uses modules for the first time, and a standard perl
 Makefile.PL has been supplied.  However, perltidy may still be
 installed as a single script, without modules.  See INSTALL for
 details.

-The man page 'perl2web' has been merged back into the main 'perltidy'
 man page to simplify installation.  So you may remove that man page
 if you have an older installation.

-Added patch from Axel Rose for MacPerl.  The patch prompts the user
 for command line arguments before calling the module 
 Perl::Tidy::perltidy.

-Corrected bug with '-bar' which was introduced in the previous
 version.  A closing block brace was being indented.  Thanks to
 Alexandros M Manoussakis for reporting this.

-New parameter '--entab-leading-whitespace=n', or '-et=n', has been
 added for those who prefer tabs.  This behaves differently from the
 existing '-t' parameter; see updated man page.  Suggested by Mark
 Olesen.
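
 A rough sketch of the idea, assuming that each group of n leading spaces
 is replaced by one tab and any remainder is kept as spaces.  The function
 name 'entab_leading' is made up for illustration and is not part of
 perltidy:

```perl
use strict;
use warnings;

# Hypothetical helper: replace each run of $n leading spaces with one tab.
sub entab_leading {
    my ( $line, $n ) = @_;
    return $line if $n < 1;

    # split the line into its leading spaces and everything else
    my ( $lead, $rest ) = $line =~ /^( *)(.*)$/s;
    my $tabs   = int( length($lead) / $n );
    my $spaces = length($lead) % $n;
    return ( "\t" x $tabs ) . ( ' ' x $spaces ) . $rest;
}

# 12 leading spaces with n=8 become one tab plus 4 spaces
my $line = entab_leading( ( ' ' x 12 ) . 'print "hi";', 8 );
```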

-New parameter '--perl-syntax-check-flags=s'  or '-pscf=s' can be
 used to change the flags passed to perl in a syntax check.
 See updated man page.  Suggested by Mark Olesen. 

-New parameter '--output-path=s'  or '-opath=s' will cause output
 files to be placed in directory s.  See updated man page.  Thanks to
 Mark Olesen for suggesting this.

-New parameter --dump-profile (or -dpro) will dump to
 standard output information about the search for a
 configuration file, the name of whatever configuration file
 is selected, and its contents.  This should help debugging
 config files, especially on different Windows systems.

-The -w parameter now notes possible errors of the form:

       $comment = s/^\s*(\S+)\..*/$1/;   # trim whitespace
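
 The following sketch shows why this form is suspicious: '=' applies the
 substitution to $_ and stores its return value, while '=~' applies it to
 the named variable.  The sub name and sample strings are illustrative only:

```perl
use strict;
use warnings;

# Hypothetical demo: returns what each form leaves behind.
sub assign_vs_bind {
    my ($comment) = @_;
    local $_ = '  unrelated  ';

    # Intended form: '=~' binds s/// to a copy of $comment.
    ( my $bound = $comment ) =~ s/^\s+|\s+$//g;

    # Typo form: '=' runs s/// on $_ and stores the substitution count.
    my $assigned = s/^\s+|\s+$//g;

    my $topic = $_;
    return ( $bound, $assigned, $topic );
}

my ( $bound, $assigned, $topic ) = assign_vs_bind('  a comment  ');
print "bound='$bound' assigned='$assigned' topic='$topic'\n";
```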

-Corrections added for a leading ':' and for leaving a leading 'tcsh'
 line untouched.  Mark Olesen reported that lines of this form were
 accepted by perl but not by perltidy:

       : # use -*- perl -*-
       eval 'exec perl -wS $0 "$@"'  # shell should exec 'perl'
       unless 1;                     # but Perl should skip this one

 Perl will silently swallow a leading colon on line 1 of a
 script, and now perltidy will do likewise.  For example,
 this is a valid script, provided that it is the first line,
 but not otherwise:

       : print "Hello World\n";

 Also, perltidy will now mark a first line with leading ':' followed by
 '#' as type SYSTEM (just as a #!  line), not to be formatted.

-List formatting improved for certain lists with special
 initial terms, such as occur with 'printf', 'sprintf',
 'push', 'pack', 'join', 'chmod'.  The special initial term is
 now placed on a line by itself.  For example, perltidy -gnu

    OLD:
       $Addr = pack(
                    "C4",                hex($SourceAddr[0]),
                    hex($SourceAddr[1]), hex($SourceAddr[2]),
                    hex($SourceAddr[3])
                    );

    NEW:
       $Addr = pack("C4",
                    hex($SourceAddr[0]), hex($SourceAddr[1]),
                    hex($SourceAddr[2]), hex($SourceAddr[3]));

     OLD:
           push (
                 @{$$self{states}}, '64', '66', '68',
                 '70',              '72', '74', '76',
                 '78',              '80', '82', '84',
                 '86',              '88', '90', '92',
                 '94',              '96', '98', '100',
                 '102',             '104'
                 );

     NEW:
           push (
                 @{$$self{states}},
                 '64', '66', '68', '70', '72',  '74',  '76',
                 '78', '80', '82', '84', '86',  '88',  '90',
                 '92', '94', '96', '98', '100', '102', '104'
                 );

-Lists of complex items, such as matrices, are now detected
 and displayed with just one item per row:

   OLD:
   $this->{'CURRENT'}{'gfx'}{'MatrixSkew'} = Text::PDF::API::Matrix->new(
       [ 1, tan( deg2rad($a) ), 0 ], [ tan( deg2rad($b) ), 1, 0 ],
       [ 0, 0, 1 ]
   );

   NEW:
   $this->{'CURRENT'}{'gfx'}{'MatrixSkew'} = Text::PDF::API::Matrix->new(
       [ 1,                  tan( deg2rad($a) ), 0 ],
       [ tan( deg2rad($b) ), 1,                  0 ],
       [ 0,                  0,                  1 ]
   );

-The perl syntax check will be turned off for now when input is from
 standard input or output is to standard output.  The reason is that
 this requires temporary files, which has produced far too many problems during
 Windows testing.  For example, the POSIX module under Windows XP/2000
 creates temporary names in the root directory, to which only the
 administrator should have permission to write.

-Merged patch sent by Yves Orton to handle appropriate
 configuration file locations for different Windows varieties
 (2000, NT, Me, XP, 95, 98).

-Added patch to properly handle a for/foreach loop without
 parens around a list represented as a qw.  I didn't know this
 was possible until Wolfgang Weisselberg pointed it out:

       foreach my $key qw\Uno Due Tres Quadro\ {
           print "Set $key\n";
       }

 But Perl will give a syntax error without the $ variable; ie this will
 not work:

       foreach qw\Uno Due Tres Quadro\ {
           print "Set $_\n";
       }

-Merged Windows version detection code sent by Yves Orton.  Perltidy
 now automatically turns off syntax checking for Win 9x/ME versions,
 and this has solved a lot of robustness problems.  These systems 
 cannot reliably handle backtick operators.  See man page for
 details.

-Merged VMS filename handling patch sent by Michael Cartmell.  (Invalid
 output filenames were being created in some cases). 

-Numerous minor improvements have been made for -lp style indentation.

-Long C-style 'for' expressions will be broken after each ';'.   

 'perltidy -gnu' gives:

   OLD:
   for ($status = $db->seq($key, $value, R_CURSOR()) ; $status == 0
        and $key eq $origkey ; $status = $db->seq($key, $value, R_NEXT())) 

   NEW:
   for ($status = $db->seq($key, $value, R_CURSOR()) ;
        $status == 0 and $key eq $origkey ;
        $status = $db->seq($key, $value, R_NEXT()))

-For the -lp option, a single long term within parens
 (without commas) now has better alignment.  For example,
 perltidy -gnu

           OLD:
           $self->throw("Must specify a known host, not $location,"
                 . " possible values ("
                 . join (",", sort keys %hosts) . ")");

           NEW:
           $self->throw("Must specify a known host, not $location,"
                        . " possible values ("
                        . join (",", sort keys %hosts) . ")");

2001 12 31

-This version is about 20 percent faster than the previous
 version as a result of optimization work.  The largest gain
 came from switching to a dispatch hash table in the
 tokenizer.
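
 For readers unfamiliar with the technique, here is a small sketch of a
 dispatch hash.  It is not the perltidy tokenizer; the handler table and
 the 'classify' sub are invented for illustration.  The point is that one
 hash lookup replaces a long if/elsif chain:

```perl
use strict;
use warnings;

# Hypothetical handlers keyed by the first character of a token.
my %dispatch = (
    '#' => sub { 'comment' },
    '$' => sub { 'scalar' },
    '@' => sub { 'array' },
    '%' => sub { 'hash' },
);

sub classify {
    my ($token) = @_;

    # a single hash lookup selects the handler
    my $handler = $dispatch{ substr( $token, 0, 1 ) };
    return $handler ? $handler->($token) : 'other';
}

print classify('$x'), "\n";    # prints scalar
```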

-perltidy -html will check to see if HTML::Entities is
 installed, and if so, it will use it to encode unsafe
 characters.
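
 A sketch of how such an optional dependency can be handled, assuming a
 simple hand-written fallback when the module is missing.  The sub name
 'escape_html' and the fallback are illustrative, not the perltidy code:

```perl
use strict;
use warnings;

# Load HTML::Entities only if it is available.
my $have_entities = eval { require HTML::Entities; 1 };

sub escape_html {
    my ($text) = @_;
    return HTML::Entities::encode_entities($text) if $have_entities;

    # Fallback: escape just the characters that break markup.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

print escape_html('if ( $a < $b && $c > 0 )'), "\n";
```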

-Added flag -oext=ext to change the output file extension to
 be different from the default ('tdy' or 'html').  For
 example:

   perltidy -html -oext=htm filename

will produce filename.htm

-Added flag -cscw to issue warnings if a closing side comment would replace
an existing, different side comment.  See the man page for details.
Thanks to Peter Masiar for helpful discussions.

-Corrected tokenization error of signed hex/octal/binary numbers. For
example, the first hex number below would have been parsed correctly
but the second one was not:
   if ( ( $tmp >= 0x80_00_00 ) || ( $tmp < -0x80_00_00 ) ) { }

-'**=' was incorrectly tokenized as '**' and '='.  This only
    caused a problem with the -extrude option.

-Corrected a divide by zero when -extrude option is used

-With the -w flag, perltidy will now report all errors found by 'perl -c'
on the input file; without -w they are not reported.  The reason is that
perl will report lots of problems and syntax errors which are not of
interest when only a small snippet is being formatted (such as missing
modules and unknown bare words).  Perltidy will always report all
significant syntax errors that it finds, such as unbalanced braces,
unless the -q (quiet) flag is set.

-Merged modifications created by Hugh Myers into perltidy.
 These include a 'streamhandle' routine which allows perltidy
 as a module to operate on input and output arrays and strings
 in addition to files.  Documentation and new packaging as a
 module should be ready early next year; This is an elegant,
 powerful update; many thanks to Hugh for contributing it.

2001 11 28

-added a tentative patch which tries to keep any existing breakpoints
at lines with leading keywords map,sort,eval,grep. The idea is to
improve formatting of sequences of list operations, as in a schwartzian
transform.  Example:

   INPUT:
   my @sorted = map { $_->[0] }
                sort { $a->[1] <=> $b->[1] }
                map { [ $_, rand ] } @list;

   OLD:
   my @sorted =
     map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, rand ] } @list;

   NEW:
   my @sorted = map { $_->[0] }
     sort { $a->[1] <=> $b->[1] }
     map { [ $_, rand ] } @list;

 The new alignment is not as nice as the input, but this is an improvement.
 Thanks to Yves Orton for this suggestion.
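
 For reference, a complete Schwartzian transform looks like the sketch
 below.  The sub 'sort_by_length' is just an example; the sort key is
 computed once per item, cached next to the item, and discarded after
 sorting:

```perl
use strict;
use warnings;

# Sort words by length, computing each length only once.
sub sort_by_length {
    my @words = @_;
    return map { $_->[0] }                 # 3. unwrap the original value
      sort { $a->[1] <=> $b->[1] }         # 2. sort on the cached key
      map { [ $_, length $_ ] } @words;    # 1. pair each value with its key
}

print join( ' ', sort_by_length(qw(pear fig banana)) ), "\n";
```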

-modified indentation logic so that a line with leading closing paren,
brace, or square bracket will never have less indentation than the
line with the corresponding opening token.  Here's a simple example:

   OLD:
       $mw->Button(
           -text    => "New Document",
           -command => \&new_document
         )->pack(
           -side   => 'bottom',
           -anchor => 'e'
       );

   Note how the closing ');' is lined up with the first line, even
   though it closes a paren in the 'pack' line.  That seems wrong.

   NEW:
       $mw->Button(
           -text    => "New Document",
           -command => \&new_document
         )->pack(
           -side   => 'bottom',
           -anchor => 'e'
         );

  This seems nicer: you can up-arrow with an editor and arrive at the
  opening 'pack' line.

-corrected minor glitch in which cuddled else (-ce) did not get applied
to an 'unless' block, which should look like this:

       unless ($test) {

       } else {

       }

 Thanks to Jeremy Mates for reporting this.

-The man page has been reorganized to make parameters easier to find.

-Added check for multiple definitions of same subroutine.  It is easy
 to introduce this problem when cutting and pasting. Perl does not
 complain about it, but it can lead to disaster.
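
 A naive sketch of such a check, assuming a simple line-by-line scan for
 'sub name'.  This is not how perltidy does it (perltidy works from its
 tokenizer); the sub 'find_duplicate_subs' is hypothetical:

```perl
use strict;
use warnings;

# Return the names of subs that are defined more than once in $source.
sub find_duplicate_subs {
    my ($source) = @_;
    my %lines_for;
    my $line_no = 0;
    for my $line ( split /\n/, $source ) {
        $line_no++;

        # naive match: 'sub name' at the start of a line
        push @{ $lines_for{$1} }, $line_no if $line =~ /^\s*sub\s+(\w+)/;
    }
    return grep { @{ $lines_for{$_} } > 1 } sort keys %lines_for;
}

my $source = "sub foo { 1 }\nsub bar { 2 }\nsub foo { 3 }\n";
print "duplicate: $_\n" for find_duplicate_subs($source);
```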

-The command -pro=filename  or -profile=filename may be used to specify a
 configuration file which will override the default name of .perltidyrc.
 There must not be a space on either side of the '=' sign.  I needed
 this to be able to easily test perltidy with a variety of different
 configuration files.

-Side comment alignment has been improved somewhat across frequent level
 changes, as in short if/else blocks.  Thanks to Wolfgang Weisselberg 
 for pointing out this problem.  For example:

   OLD:
   if ( ref $self ) {    # Called as a method
       $format = shift;
   }
   else {    # Regular procedure call
       $format = $self;
       undef $self;
   }

   NEW:
   if ( ref $self ) {    # Called as a method
       $format = shift;
   }
   else {                # Regular procedure call
       $format = $self;
       undef $self;
   }

-New command -ssc (--static-side-comment) and a related command allow
 side comments to be spaced close to preceding character.  This is
 useful for displaying commented code as side comments.

-New command -csc (--closing-side-comment) and several related
 commands allow comments to be added to (and deleted from) any or all
 closing block braces.  This can be useful if you have to maintain large
 programs, especially those that you didn't write.  See updated man page.
 Thanks to Peter Masiar for this suggestion.  For a simple example:

       perltidy -csc

       sub foo {
           if ( !defined( $_[0] ) ) {
               print("Hello, World\n");
           }
           else {
               print( $_[0], "\n" );
           }
       } ## end sub foo

 This added '## end sub foo' to the closing brace.  
 To remove it, perltidy -ncsc.

-New commands -ola, for outdenting labels, and -okw, for outdenting
 selected control keywords, were implemented.  See the perltidy man
 page for details.  Thanks to Peter Masiar for this suggestion.

-Hanging side comment change: a comment will not be considered to be a
 hanging side comment if there is no leading whitespace on the line.
 This should improve the reliability of identifying hanging side comments.
 Thanks to Peter Masiar for this suggestion.

-Two new commands for outdenting, -olq (outdent-long-quotes) and -olc
 (outdent-long-comments), have been added.  The original -oll
 (outdent-long-lines) remains, and now is an abbreviation for -olq and -olc.
 The new default is just -olq.  This was necessary to avoid inconsistency with
 the new static block comment option.

-Static block comments:  to provide a way to display commented code
 better, the convention is used that comments with a leading '##' should
 not be formatted as usual.  Please see '-sbc' (or '--static-block-comment')
 for documentation.  It can be deactivated with -nsbc, but
 should not normally be necessary. Thanks to Peter Masiar for this 
 suggestion.

-Two changes were made to help show structure of complex lists:
 (1) breakpoints are forced after every ',' in a list where any of
 the list items spans multiple lines, and
 (2) List items which span multiple lines now get continuation indentation.

 The following example illustrates both of these points.  Many thanks to
 Wolfgang Weisselberg for this snippet and a discussion of it; this is a
 significant formatting improvement. Note how it is easier to see the call
 parameters in the NEW version:

   OLD:
   assert( __LINE__, ( not defined $check )
       or ref $check
       or $check eq "new"
       or $check eq "old", "Error in parameters",
       defined $old_new ? ( ref $old_new ? ref $old_new : $old_new ) : "undef",
       defined $db_new  ? ( ref $db_new  ? ref $db_new  : $db_new )  : "undef",
       defined $old_db ? ( ref $old_db ? ref $old_db : $old_db ) : "undef" );

   NEW: 
   assert(
       __LINE__,
       ( not defined $check )
         or ref $check
         or $check eq "new"
         or $check eq "old",
       "Error in parameters",
       defined $old_new ? ( ref $old_new ? ref $old_new : $old_new ) : "undef",
       defined $db_new  ? ( ref $db_new  ? ref $db_new  : $db_new )  : "undef",
       defined $old_db  ? ( ref $old_db  ? ref $old_db  : $old_db )  : "undef"
   );

   Another example shows how this helps displaying lists:

   OLD:
   %{ $self->{COMPONENTS} } = (
       fname =>
       { type => 'name', adj => 'yes', font => 'Helvetica', 'index' => 0 },
       street =>
       { type => 'road', adj => 'yes', font => 'Helvetica', 'index' => 2 },
   );

   The structure is clearer with the added indentation:

   NEW:
   %{ $self->{COMPONENTS} } = (
       fname =>
         { type => 'name', adj => 'yes', font => 'Helvetica', 'index' => 0 },
       street =>
         { type => 'road', adj => 'yes', font => 'Helvetica', 'index' => 2 },
   );

   -The structure of nested logical expressions is now displayed better.
   Thanks to Wolfgang Weisselberg for helpful discussions.  For example,
   note how the status of the final 'or' is displayed in the following:

   OLD:
   return ( !null($op)
         and null( $op->sibling )
         and $op->ppaddr eq "pp_null"
         and class($op) eq "UNOP"
         and ( ( $op->first->ppaddr =~ /^pp_(and|or)$/
           and $op->first->first->sibling->ppaddr eq "pp_lineseq" )
           or ( $op->first->ppaddr eq "pp_lineseq"
               and not null $op->first->first->sibling
               and $op->first->first->sibling->ppaddr eq "pp_unstack" ) ) );

   NEW:
   return (
       !null($op)
         and null( $op->sibling )
         and $op->ppaddr eq "pp_null"
         and class($op) eq "UNOP"
         and (
           (
               $op->first->ppaddr =~ /^pp_(and|or)$/
               and $op->first->first->sibling->ppaddr eq "pp_lineseq"
           )
           or ( $op->first->ppaddr eq "pp_lineseq"
               and not null $op->first->first->sibling
               and $op->first->first->sibling->ppaddr eq "pp_unstack" )
         )
   );

  -A break will always be put before a list item containing a comma-arrow.
  This will improve formatting of mixed lists of this form:

       OLD:
       $c->create(
           'text', 225, 20, -text => 'A Simple Plot',
           -font => $font,
           -fill => 'brown'
       );

       NEW:
       $c->create(
           'text', 225, 20,
           -text => 'A Simple Plot',
           -font => $font,
           -fill => 'brown'
       );

 -For convenience, the command -dac (--delete-all-comments) now also
 deletes pod.  Likewise, -tac (--tee-all-comments) now also sends pod
 to a '.TEE' file.  Complete control over the treatment of pod and
 comments is still possible, as described in the updated help message 
 and man page.

 -The logic which breaks open 'containers' has been rewritten to be completely
 symmetric in the following sense: if a line break is placed after an opening
 {, [, or (, then a break will be placed before the corresponding closing
 token.  Thus, a container either remains closed or is completely cracked
 open.

 -Improved indentation of parenthesized lists.  For example, 

           OLD:
           $GPSCompCourse =
             int(
             atan2( $GPSTempCompLong - $GPSLongitude,
             $GPSLatitude - $GPSTempCompLat ) * 180 / 3.14159265 );

           NEW:
           $GPSCompCourse = int(
               atan2(
                   $GPSTempCompLong - $GPSLongitude,
                   $GPSLatitude - $GPSTempCompLat
                 ) * 180 / 3.14159265
           );

  Further improvements will be made in future releases.

 -Some improvements were made in formatting small lists.

 -Correspondence between Input and Output line numbers reported in a 
  .LOG file should now be exact.  They were sometimes off due to the size
  of intermediate buffers.

 -Corrected minor tokenization error in which a ';' in a foreach loop
  control was tokenized as a statement termination, which forced a 
  line break:

       OLD:
       foreach ( $i = 0;
           $i <= 10;
           $i += 2
         )
       {
           print "$i ";
       }

       NEW:
       foreach ( $i = 0 ; $i <= 10 ; $i += 2 ) {
           print "$i ";
       }

 -Corrected a problem with reading config files, in which quote marks were not
  stripped.  As a result, something like -wba="&& . || " would have the leading
  quote attached to the && and not work correctly.  A workaround for older
  versions is to place a space around all tokens within the quotes, like this:
  -wba=" && . || "
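
  A small sketch of the quote handling, assuming one option per line of
  the form -name=value.  The subs 'strip_quotes' and 'parse_rc_line' are
  hypothetical and much simpler than the real config file reader:

```perl
use strict;
use warnings;

# Remove one pair of matching single or double quotes, if present.
sub strip_quotes {
    my ($value) = @_;
    $value =~ s/^(["'])(.*)\1$/$2/s;
    return $value;
}

# Split a line such as -wba="&& . || " into a name and an unquoted value.
sub parse_rc_line {
    my ($line) = @_;
    my ( $name, $value ) = $line =~ /^-{1,2}([\w-]+)=(.*)$/ or return;
    return ( $name, strip_quotes($value) );
}

my ( $name, $value ) = parse_rc_line('-wba="&& . || "');
print "$name: ", join( ',', split ' ', $value ), "\n";
```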

 -Removed any existing space between a label and its ':'
   OLD    : { }
   NEW: { }
  This was necessary because the label and its colon are a single token.

 -Corrected tokenization error for the following (highly non-recommended) 
  construct:
   $user = @vars[1] / 100;

 -Resolved cause of a difference between perltidy under perl v5.6.1 and
 5.005_03; the problem was different behavior of \G regex position
 marker(!)

2001 10 20

-Corrected a bug in which a break was not being made after a full-line
comment within a short eval/sort/map/grep block.  A flag was not being
zeroed.  The syntax error check catches this.  Here is a snippet which
illustrates the bug:

       eval {
           #open Socket to Dispatcher
           $sock = &OpenSocket;
       };

The formatter mistakenly thought that it had found the following 
one-line block:

       eval {#open Socket to Dispatcher$sock = &OpenSocket; };

The patch fixes this. Many thanks to Henry Story for reporting this bug.

-Changes were made to help diagnose and resolve problems in a
.perltidyrc file: 
  (1) processing of command parameters has been split into two separate
  batches so that any errors in a .perltidyrc file can be localized.  
  (2) commands --help, --version, and as many of the --dump-xxx
  commands as possible are handled immediately, without any command line
  processing at all.  
  (3) Perltidy will ignore any commands in the .perltidyrc file which
  cause immediate exit.  These are:  -h -v -ddf -dln -dop -dsn -dtt
  -dwls -dwrs -ss.  Thanks to Wolfgang Weisselberg for helpful
  suggestions regarding these updates.
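
  Point (3) can be pictured with the sketch below.  The flag list is the
  one given above; the sub 'filter_rc_args' is hypothetical and assumes
  the .perltidyrc contents have already been split into separate flags:

```perl
use strict;
use warnings;

# Flags which cause immediate exit, as listed in item (3).
my %immediate_exit =
  map { $_ => 1 } qw(-h -v -ddf -dln -dop -dsn -dtt -dwls -dwrs -ss);

# Drop any immediate-exit flags found in the config file.
sub filter_rc_args {
    my @rc_args = @_;
    return grep { !$immediate_exit{$_} } @rc_args;
}

print join( ' ', filter_rc_args(qw(-i=4 -v -ce -h)) ), "\n";
```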

-Syntax check has been reinstated as default for MSWin32 systems.  This
way Windows 2000 users will get syntax check by default, which seems
like a better idea, since the number of Win 95/98 systems will be
decreasing over time.  Documentation revised to warn Windows 95/98
users about the problem with empty '&1'.  Too bad these systems
all report themselves as MSWin32.

2001 10 16

-Fixed tokenization error in which a method call of the form

   Module::->new();

 got a space before the '::' like this:

   Module ::->new();

 Thanks to David Holden for reporting this.

-Added -html control over pod text, using a new abbreviation 'pd'.  See
updated perl2web man page. The default is to use the color of a comment,
but italicized.  Old .css style sheets will need a new line for
.pd to use this.  The old color was the color of a string, and there
was no control.  

-.css lines are now printed in sorted order.

-Fixed interpolation problem where html files had '$input_file' as title
instead of actual input file name.  Thanks to Simon Perreault for finding
this and sending a patch, and also to Tobias Weber.

-Breaks in ?/: expressions will now have the ':' placed at the start of a
line, one per line by default, because this shows logical structure
more clearly. This coding has been completely redone. Some 
examples of new ?/: formatting:

      OLD:
           wantarray ? map( $dir::cwd->lookup($_)->path, @_ ) :
             $dir::cwd->lookup( $_[0] )->path;

      NEW:
           wantarray 
             ? map( $dir::cwd->lookup($_)->path, @_ )
             : $dir::cwd->lookup( $_[0] )->path;

      OLD:
               $a = ( $b > 0 ) ? {
                   a => 1,
                   b => 2
               } : { a => 6, b => 8 };

      NEW:
               $a = ( $b > 0 )
                 ? {
                   a => 1,
                   b => 2
                 }
                 : { a => 6, b => 8 };

   OLD: (-gnu):
   $self->note($self->{skip} ? "Hunk #$self->{hunk} ignored at 1.\n" :
               "Hunk #$self->{hunk} failed--$@");

   NEW: (-gnu):
   $self->note($self->{skip} 
               ? "Hunk #$self->{hunk} ignored at 1.\n"
               : "Hunk #$self->{hunk} failed--$@");

   OLD:
       $which_search =
         $opts{"t"} ? 'title'   :
         $opts{"s"} ? 'subject' : $opts{"a"} ? 'author' : 'title';

   NEW:
       $which_search =
         $opts{"t"} ? 'title'
         : $opts{"s"} ? 'subject'
         : $opts{"a"} ? 'author'
         : 'title';

You can use -wba=':' to recover the previous default which placed ':'
at the end of a line.  Thanks to Michael Cartmell for helpful
discussions and examples.  

-Tokenizer updated to do syntax checking for matched ?/: pairs.  Also,
the tokenizer now outputs a unique serial number for every balanced
pair of brace types and ?/: pairs.  This greatly simplifies the
formatter.
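
The idea of serial numbers can be sketched with a stack, as below.  This
is an illustration only: it works on single characters rather than real
tokens, ignores quotes and ?/: pairs, and the sub 'number_containers' is
not a perltidy routine.

```perl
use strict;
use warnings;

# Tag every opening and closing bracket with the serial number of its pair.
sub number_containers {
    my ($text) = @_;
    my %closer_for = ( '(' => ')', '[' => ']', '{' => '}' );
    my %is_closer  = map { $_ => 1 } values %closer_for;
    my ( @stack, @tagged );
    my $next_serial = 0;
    for my $char ( split //, $text ) {
        if ( $closer_for{$char} ) {

            # opening token: issue a new serial number and remember it
            my $serial = ++$next_serial;
            push @stack,  [ $char, $serial ];
            push @tagged, "$char$serial";
        }
        elsif ( $is_closer{$char} ) {

            # closing token: it must match the most recent opening token
            my $open = pop @stack;
            die "unbalanced '$char'\n"
              unless $open && $closer_for{ $open->[0] } eq $char;
            push @tagged, "$char$open->[1]";
        }
    }
    die "unclosed '$stack[-1][0]'\n" if @stack;
    return @tagged;
}

print join( ' ', number_containers('f(a[0], {b=>1})') ), "\n";
```

Because both ends of a container carry the same number, later stages can
pair them up without rescanning the text.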

-Long lines with repeated 'and', 'or', '&&', '||'  will now have
one such item per line.  For example:

   OLD:
       if ( $opt_d || $opt_m || $opt_p || $opt_t || $opt_x
           || ( -e $archive && $opt_r ) )
       {
           ( $pAr, $pNames ) = readAr($archive);
       }

   NEW:
       if ( $opt_d
           || $opt_m
           || $opt_p
           || $opt_t
           || $opt_x
           || ( -e $archive && $opt_r ) )
       {
           ( $pAr, $pNames ) = readAr($archive);
       }

  OLD:
       if ( $vp->{X0} + 4 <= $x && $vp->{X0} + $vp->{W} - 4 >= $x
           && $vp->{Y0} + 4 <= $y && $vp->{Y0} + $vp->{H} - 4 >= $y ) 

  NEW:
       if ( $vp->{X0} + 4 <= $x
           && $vp->{X0} + $vp->{W} - 4 >= $x
           && $vp->{Y0} + 4 <= $y
           && $vp->{Y0} + $vp->{H} - 4 >= $y )

-Long lines with multiple concatenated tokens will have concatenated
terms (see below) placed one per line, except for short items.  For
example:

  OLD:
       $report .=
         "Device type:" . $ib->family . "  ID:" . $ib->serial . "  CRC:"
         . $ib->crc . ": " . $ib->model() . "\n";

  NEW:
       $report .= "Device type:"
         . $ib->family . "  ID:"
         . $ib->serial . "  CRC:"
         . $ib->crc . ": "
         . $ib->model() . "\n";

NOTE: at present 'short' means 8 characters or less.  There is a
tentative flag to change this (-scl), but it is undocumented and
is likely to be changed or removed later, so only use it for testing.  
In the above example, the tokens "  ID:", "  CRC:", and "\n" are below
this limit.  

-If a line which is short enough to fit on a single line was
nevertheless broken in the input file at a 'good' location (see below), 
perltidy will try to retain a break.  For example, the following line
will be formatted as:

   open SUM, "<$file"
     or die "Cannot open $file ($!)";

if it was broken in the input file, and like this if not:

   open SUM, "<$file" or die "Cannot open $file ($!)";

GOOD: 'good' location means before 'and','or','if','unless','&&','||'

The reason perltidy does not just always break at these points is that if
there are multiple, similar statements, this would preclude alignment.  So
rather than check for this, perltidy just tries to follow the input style,
in the hopes that the author made a good choice. Here is an example where 
we might not want to break before each 'if':

   ($Locale, @Locale) = ($English, @English) if (@English > @Locale);
   ($Locale, @Locale) = ($German,  @German)  if (@German > @Locale);
   ($Locale, @Locale) = ($French,  @French)  if (@French > @Locale);
   ($Locale, @Locale) = ($Spanish, @Spanish) if (@Spanish > @Locale);

-Added wildcard file expansion for systems with shells which lack this.
Now 'perltidy *.pl' should work under MSDOS/Windows.  Thanks to Hugh Myers 
for suggesting this.  This uses builtin glob() for now; I may change that.
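
A minimal sketch of this kind of expansion, assuming that only arguments
containing '*' or '?' need to be expanded.  The sub 'expand_wildcards' is
hypothetical and is not the routine used in perltidy:

```perl
use strict;
use warnings;

# Expand wildcard arguments with the builtin glob(); keep others as given.
sub expand_wildcards {
    my @args = @_;
    my @files;
    for my $arg (@args) {
        if ( $arg =~ /[*?]/ ) {
            push @files, glob($arg);    # let perl expand the pattern
        }
        else {
            push @files, $arg;          # plain name: keep as given
        }
    }
    return @files;
}

print "$_\n" for expand_wildcards(@ARGV);
```

Note that glob() splits its argument on whitespace, so a pattern
containing spaces would need extra care.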

-Added new flag -sbl which, if specified, overrides the value of -bl
for opening sub braces.  This allows formatting of this type:

perltidy -sbl 

sub foo
{
   if (!defined($_[0])) {
       print("Hello, World\n");
   }
   else {
       print($_[0], "\n");
   }
}
Requested by Don Alexander.

-Fixed minor parsing error which prevented a space after a $$ variable
(pid) in some cases.  Thanks to Michael Cartmell for noting this.
For example, 
  old: $$< 700 
  new: $$ < 700

-Improved line break choices at 'and' and 'or' to display logic better.
For example:

   OLD:
       exists $self->{'build_dir'} and push @e,
         "Unwrapped into directory $self->{'build_dir'}";

   NEW:
       exists $self->{'build_dir'}
         and push @e, "Unwrapped into directory $self->{'build_dir'}";

-Fixed error of multiple use of abbreviation '-dsc'.  -dsc remains 
abbreviation for delete-side-comments; -dsm is new abbreviation for 
delete-semicolons.

-Corrected and updated 'usage' help routine.  Thanks to Slaven Rezic for 
noting an error.

-The default for Windows is, for now, not to do a 'perl -c' syntax
check (but -syn will activate it).  This is because of problems with
command.com.  James Freeman sent me a patch which tries to get around
the problems, and it works in many cases, but testing revealed several
issues that still need to be resolved.  So for now, the default is no
syntax check for Windows.

-I added a -T flag when doing perl -c syntax check.
This is because I test it on a large number of scripts from sources
unknown, and who knows what might be hidden in initialization blocks?
Also, deactivated the syntax check if perltidy is run as root.  As a
benign example, running the previous version of perltidy on the
following file would cause it to disappear:

       BEGIN{
               print "Bye, bye baby!\n";
               unlink $0;
       }

The new version will not let that happen.
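
The underlying issue is that 'perl -c' executes BEGIN blocks even though
it does not run the main program.  The sketch below demonstrates this with
a harmless script; the sub 'output_during_syntax_check' is invented for
the demonstration and does not use the -T flag mentioned above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return whatever a script prints to STDOUT while 'perl -c' checks it.
sub output_during_syntax_check {
    my ($code) = @_;
    my ( $fh, $path ) = tempfile( SUFFIX => '.pl', UNLINK => 1 );
    print {$fh} $code;
    close $fh or die "close failed: $!";

    # $^X is the running perl; the list form of open avoids the shell
    open my $pipe, '-|', $^X, '-c', $path or die "cannot run perl: $!";
    local $/;
    my $out = <$pipe>;
    close $pipe;
    return defined $out ? $out : '';
}

my $script = qq{BEGIN { print "BEGIN ran\\n" }\nprint "main ran\\n";\n};
print output_during_syntax_check($script);
```

Only the BEGIN block's output appears; the 'syntax OK' message from perl
goes to standard error.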

-I am contemplating (but have not yet implemented) making '-lp' the
default indentation, because it is stable now and may be closer to how
perl is commonly formatted.  This could be in the next release.  The
reason that '-lp' was not the original default is that the coding for
it was complex and not ready for the initial release of perltidy.  If
anyone has any strong feelings about this, I'd like to hear.  The
current default could always be recovered with the '-nlp' flag.

2001 09 03

-html updates:
    - sub definition names are now specially colored, red by default.  
      The letter 'm' is used to identify them.
    - keyword 'sub' now has color of other keywords.
    - restored html keyword color to __END__ and __DATA__, which was 
      accidentally removed in the previous version.

-A new -se (--standard-error-output) flag has been implemented and
documented which causes all errors to be written to the standard error
output instead of a .ERR file.

-A new -w (--warning-output) flag has been implemented and documented
 which causes perltidy to output certain non-critical messages to the
 error output file, .ERR.  These include complaints about pod usage,
 for example.  The default is to not include these.

 NOTE: This replaces an undocumented -w=0 or --warning-level flag
 which was tentatively introduced in the previous version to avoid some
 unwanted messages.  The new default is the same as the old -w=0, so
 that is no longer needed. 

 -Improved syntax checking and corrected tokenization of functions such
 as rand, srand, sqrt, ...  These can accept either an operator or a term
 to their right.  This has been corrected.

-Corrected tokenization of semicolon: testing of the previous update showed 
that the semicolon in the following statement was being mis-tokenized.  That
did no harm, other than adding an extra blank space, but has been corrected.

         for (sort {strcoll($a,$b);} keys %investments) {
            ...
         }

-New syntax check: after wasting 5 minutes trying to resolve a syntax
 error in which I had an extra terminal ';' in a complex for (;;) statement, 
 I spent a few more minutes adding a check for this in perltidy so it won't
 happen again.

-The behavior of --break-before-subs (-bbs) and --break-before-blocks
(-bbb) has been modified.  Also, a new control parameter,
--long-block-line-count=n (-lbl=n) has been introduced to give more
control on -bbb.  This was previously a hardwired value.  The reason
for the change is to reduce the number of unwanted blank lines that
perltidy introduces, and make it less erratic.  It's annoying to remove
an unwanted blank line and have perltidy put it back.  The goal is to
be able to sprinkle a few blank lines in that dense script you
inherited from Bubba.  I did a lot of experimenting with different
schemes for introducing blank lines before and after code blocks, and
decided that there is no really good way to do it.  But I think the new
scheme is an improvement.  You can always deactivate this with -nbbb.
I've been meaning to work on this; thanks to Erik Thaysen for bringing
it to my attention.

-The .LOG file is seldom needed, and I get tired of deleting them, so
 they will now only be automatically saved if perltidy thinks that it
 made an error, which is almost never.  You can still force the logfile
 to be saved with -log or -g.

-Improved method for computing number of columns in a table.  The old
method always tried for an even number.  The new method allows odd
numbers when it is obvious that a list is not a hash initialization
list.

  old: my (
            $name,       $xsargs, $parobjs, $optypes,
            $hasp2child, $pmcode, $hdrcode, $inplacecode,
            $globalnew,  $callcopy
         )
         = @_;

  new: my (
            $name,   $xsargs,  $parobjs,     $optypes,   $hasp2child,
            $pmcode, $hdrcode, $inplacecode, $globalnew, $callcopy
         )
         = @_;

-I fiddled with the list threshold adjustment, and some small lists
look better now.  Here is the change for one of the lists in test file
'sparse.t':
old:
  %units =
    ("in", "in", "pt", "pt", "pc", "pi", "mm", "mm", "cm", "cm", "\\hsize", "%",
      "\\vsize", "%", "\\textwidth", "%", "\\textheight", "%");

new:
  %units = (
             "in",      "in", "pt",          "pt", "pc",           "pi",
             "mm",      "mm", "cm",          "cm", "\\hsize",      "%",
             "\\vsize", "%",  "\\textwidth", "%",  "\\textheight", "%"
             );

-Improved -lp formatting at '=' sign.  A break was always being added after
the '=' sign in a statement such as this, (to be sure there was enough room
for the parameters):

old: my $fee =
       CalcReserveFee(
                       $env,          $borrnum,
                       $biblionumber, $constraint,
                       $bibitems
                       );

The updated version doesn't do this unless the space is really needed:

new: my $fee = CalcReserveFee(
                              $env,          $borrnum,
                              $biblionumber, $constraint,
                              $bibitems
                              );

-I updated the tokenizer to allow $#+ and $#-, which seem to be new to
Perl 5.6.  Some experimenting with a recent version of Perl indicated
that it allows these non-alphanumeric '$#' array maximum index
variables: $#: $#- $#+ so I updated the parser accordingly.  Only $#:
seems to be valid in older versions of Perl.

-Fixed a rare formatting problem with -lp (and -gnu) which caused
excessive indentation.

-Many additional syntax checks have been added.

-Revised method for testing here-doc target strings; the following
was causing trouble with a regex test because of the '*' characters:
 print <<"*EOF*";
 bla bla
 *EOF*
Perl seems to allow almost anything to be a here doc target, so an
exact string comparison is now used.
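
The same pitfall exists in any language where a terminator string is handed to a pattern matcher. The sketch below is an analogy only: it uses shell glob patterns, not Perl regular expressions, and both function names are made up. It shows why a target such as *EOF* needs an exact comparison:

```shell
# Buggy: the unquoted $2 is interpreted as a glob pattern, so the
# '*' characters in a target like *EOF* act as wildcards.
is_terminator_pattern() {
    case $1 in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

# Safe: plain string equality, no special characters.
is_terminator_exact() {
    [ "$1" = "$2" ]
}
```

With the target *EOF*, is_terminator_pattern wrongly accepts a text line such as "see EOF here", while is_terminator_exact accepts only the line *EOF* itself.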

-Made update to allow underscores in binary numbers, like '0b1100_0000'.

-Corrected problem with scanning certain module names; a blank space was 
being inserted after 'warnings' in the following:
   use warnings::register;
The problem was that warnings (and a couple of other key modules) were 
being tokenized as keywords.  They should have just been identifiers.

-Corrected tokenization of indirect objects after sort, system, and exec,
after testing produced an incorrect error message for the following
line of code:
   print sort $sortsubref @list;

-Corrected minor problem where a line after a format had unwanted
extra continuation indentation.  

-Delete-block-comments (and -dac) now retain any leading hash-bang line

-Update for -lp (and -gnu) to not align the leading '=' of a list
with a previous '=', since this interferes with alignment of parameters.

 old:  my $hireDay = new Date;
       my $self    = {
                    firstName => undef,
                    lastName  => undef,
                    hireDay   => $hireDay
                    };

 new:  my $hireDay = new Date;
       my $self = {
                    firstName => undef,
                    lastName  => undef,
                    hireDay   => $hireDay
                    };

-Modifications made to display tables more compactly when possible,
 without adding lines. For example,
 old:
               '1', "I", '2', "II", '3', "III", '4', "IV",
               '5', "V", '6', "VI", '7', "VII", '8', "VIII",
               '9', "IX"
 new:
               '1', "I",   '2', "II",   '3', "III",
               '4', "IV",  '5', "V",    '6', "VI",
               '7', "VII", '8', "VIII", '9', "IX"

-Corrected minor bug in which -pt=2 did not keep the right paren tight
around a '++' or '--' token, like this:

           for ($i = 0 ; $i < length $key ; $i++ )

The formatting for this should be, and now is: 

           for ($i = 0 ; $i < length $key ; $i++)

Thanks to Erik Thaysen for noting this.

-Discovered a new bug involving here-docs during testing!  See BUGS.html.  

-Finally fixed parsing of subroutine attributes (A Perl 5.6 feature).
However, the attributes and prototypes must still be on the same line
as the sub name.

2001 07 31

-Corrected minor, uncommon bug found during routine testing, in which a
blank got inserted between a function name and its opening paren after
a file test operator, but only in the case that the function had not
been previously seen.  Perl uses the existence (or lack thereof) of 
the blank to guess if it is a function call.  That is,
   if (-l pid_filename()) {
became
   if (-l pid_filename ()) {
which is a syntax error if pid_filename has not been seen by perl.

-If the AutoLoader module is used, perltidy will continue formatting
code after seeing an __END__ line.  Use -nlal to deactivate this feature.  
Likewise, if the SelfLoader module is used, perltidy will continue 
formatting code after seeing a __DATA__ line.  Use -nlsl to
deactivate this feature.  Thanks to Slaven Rezic for this suggestion.

-pod text after __END__ and __DATA__ is now identified by perltidy
so that -dp works correctly.  Thanks to Slaven Rezic for this suggestion.

-The first $VERSION line which might be eval'd by MakeMaker
is now passed through unchanged.  Use -npvl to deactivate this feature.
Thanks to Manfred Winter for this suggestion.

-Improved indentation of nested parenthesized expressions.  Tests have
given favorable results.  Thanks to Wolfgang Weisselberg for helpful
examples.

2001 07 23

-Fixed a very rare problem in which an unwanted semicolon was inserted
due to misidentification of anonymous hash reference curly as a code
block curly.  (No instances of this have been reported; I discovered it
during testing).  A workaround for older versions of perltidy is to use
-nasc.

-Added -icb (-indent-closing-brace) parameter to indent a brace which
terminates a code block to the same level as the previous line.
Suggested by Andrew Cutler.  For example, 

       if ($task) {
           yyy();
           }    # -icb
       else {
           zzz();
           }

-Rewrote error message triggered by an unknown bareword in a print or
printf filehandle position, and added flag -w=0 to prevent issuing this
error message.  Suggested by Byron Jones.

-Added modification to align a one-line 'if' block with similar
following 'elsif' one-line blocks, like this:
     if    ( $something eq "simple" )  { &handle_simple }
     elsif ( $something eq "hard" )    { &handle_hard }
(Suggested by  Wolfgang Weisselberg).

2001 07 02

-Eliminated all constants with leading underscores because perl 5.005_03
does not support that.  For example, _SPACES changed to XX_SPACES.
Thanks to kromJx for this update.

2001 07 01

-the directory of test files has been moved to a separate distribution
file because it is getting large but is of little interest to most users.
For the current distribution:
  perltidy-20010701.tgz        contains the source and docs for perltidy
  perltidy-20010701-test.tgz   contains the test files

-fixed bug where temporary file perltidy.TMPI was not being deleted 
when input was from stdin.

-adjusted line break logic to not break after closing brace of an
eval block (suggested by Boris Zentner).

-added flag -gnu (--gnu-style) to give an approximation to the GNU
style as sometimes applied to perl.  The programming style in GNU
'automake' was used as a guide in setting the parameters; these
parameters will probably be adjusted over time.

-an empty code block now has one space for emphasis:
  if ( $cmd eq "bg_untested" ) {}    # old
  if ( $cmd eq "bg_untested" ) { }   # new
If this bothers anyone, we could create a parameter.

-the -bt (--brace-tightness) parameter has been split into two
parameters to give more control. -bt now applies only to non-BLOCK
braces, while a new parameter -bbt (block-brace-tightness) applies to
curly braces which contain code BLOCKS. The default value is -bbt=0.

-added flag -icp (--indent-closing-paren) which leaves a statement
termination of the form );, };, or ]; indented with the same
indentation as the previous line.  For example,

   @month_of_year = (          # default, or -nicp
       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct',
       'Nov', 'Dec'
   );

   @month_of_year = (          # -icp
       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct',
       'Nov', 'Dec'
       );

-Vertical alignment updated to synchronize with tokens &&, ||,
and, or, if, unless.  Allowable space before forcing
resynchronization has been increased.  (Suggested by  Wolfgang
Weisselberg).

-html corrected to use -nohtml-bold-xxxxxxx or -nhbx to negate bold,
and likewise -nohtml-italic-xxxxxxx or -nhbi to negate italic.  There
was no way to negate these previously.  html documentation updated and
corrected.  (Suggested by  Wolfgang Weisselberg).

-Some modifications have been made which improve the -lp formatting in
a few cases.

-Perltidy now retains or creates a blank line after an =cut to keep
podchecker happy (Suggested by Manfred H. Winter).  This appears to be
a glitch in podchecker, but it was annoying.

2001 06 17

-Added -bli flag to give continuation indentation to braces, like this:

       if ($bli_flag)
         {
           extra_indentation();
         }

-Corrected an error with the tab (-t) option which caused the last line
of a multi-line quote to receive a leading tab.  This error was in
version 2001 06 08  but not 2001 04 06.  If you formatted a script
with -t with this version, please check it by running once with the
-chk flag and perltidy will scan for this possible error.

-Corrected an invalid pattern (\R should have been just R), changed
$^W =1 to BEGIN {$^W=1} to use warnings in compile phase, and corrected
several unnecessary 'my' declarations. Many thanks to Wolfgang Weisselberg,
2001-06-12, for catching these errors.

-A '-bar' flag has been added to require braces to always be on the
right, even for multi-line if and foreach statements.  For example,
the default formatting of a long if statement would be:

       if ($bigwasteofspace1 && $bigwasteofspace2
         || $bigwasteofspace3 && $bigwasteofspace4)
       {
           bigwastoftime();
       }

With -bar, the formatting is:

       if ($bigwasteofspace1 && $bigwasteofspace2
         || $bigwasteofspace3 && $bigwasteofspace4) {
           bigwastoftime();
       }
Suggested by Eli Fidler 2001-06-11.

-Uploaded perltidy to sourceforge cvs 2001-06-10.

-An '-lp' flag (--line-up-parentheses) has been added which causes lists
to be indented with extra indentation in the manner sometimes
associated with emacs or the GNU suggestions.  Thanks to Ian Stuart for
this suggestion and for extensive help in testing it. 

-Subroutine call parameter lists are now formatted as other lists.
This should improve formatting of tables being passed via subroutine
calls.  This will also cause full indentation (-i=n, default n=4) of
continued parameter list lines rather than just the number of spaces
given with -ci=n, default n=2.

-Added support for hanging side comments.  Perltidy identifies a hanging
side comment as a comment immediately following a line with a side
comment or another hanging side comment.  This should work in most
cases.  It can be deactivated with --no-hanging-side-comments (-nhsc).
The manual has been updated to discuss this.  Suggested by Brad
Eisenberg some time ago, and finally implemented.

2001 06 08

-fixed problem with parsing command parameters containing quoted
strings in .perltidyrc files. (Reported by Roger Espel Llima 2001-06-07).

-added two command line flags, --want-break-after and 
--want-break-before, which allow changing whether perltidy
breaks lines before or after any operators.  Please see the revised 
man pages for details.

-added system-wide configuration file capability.
If perltidy does not find a .perltidyrc command line file in
the current directory, nor in the home directory, it now looks
for '/usr/local/etc/perltidyrc' and then for '/etc/perltidyrc'.
(Suggested by Roger Espel Llima 2001-05-31).
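
The lookup order described in this entry can be sketched as a first-match search. This is an illustration of the order stated here only, not perltidy's own code, and later releases may consult other locations; the function name and its two arguments are made up:

```shell
# first_perltidyrc CURRENT_DIR HOME_DIR
# Print the first configuration file that exists, in the order
# given in this change log entry. Returns 1 if none is found.
first_perltidyrc() {
    for candidate in "$1/.perltidyrc" "$2/.perltidyrc" \
        /usr/local/etc/perltidyrc /etc/perltidyrc
    do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

For example, first_perltidyrc "$PWD" "$HOME" prints the file that would win under this scheme.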

-fixed problem in which spaces were trimmed from lines of a multi-line
quote. (Reported by Roger Espel Llima 2001-05-30).  This is an 
uncommon situation, but serious, because it could conceivably change
the proper function of a script.

-fixed problem in which a semicolon was incorrectly added within 
an anonymous hash.  (Reported by A.C. Yardley, 2001-5-23).
(You would know if this happened, because perl would give a syntax
error for the resulting script).

-fixed problem in which an incorrect error message was produced
 after a version number on a 'use' line, like this ( Reported 
 by Andres Kroonmaa, 2001-5-14):

             use CGI 2.42 qw(fatalsToBrowser);

 Other than the extraneous error message, this bug was harmless.

2001 04 06

-fixed serious bug in which the last line of some multi-line quotes or
 patterns was given continuation indentation spaces.  This may make
 a pattern incorrect unless it uses the /x modifier.  To find
 instances of this error in scripts which have been formatted with
 earlier versions of perltidy, run with the -chk flag, which has
 been added for this purpose (SLH, 2001-04-05).

 ** So, please check previously formatted scripts by running with -chk
 at least once **

-continuation indentation has been reprogrammed to be hierarchical, 
 which improves deeply nested structures.

-fixed problem with undefined value in list formatting (reported by Michael
 Langner 2001-04-05)

-Switched to graphical display of nesting in .LOG files.  If an
 old format string was "(1 [0 {2", the new string is "{{(".  This
 is easier to read and also shows the order of nesting.

-added outdenting of cuddled paren structures, like  ")->pack(".

-added line break and outdenting of ')->' so that instead of

       $mw->Label(
         -text   => "perltidy",
         -relief => 'ridge')->pack;

 the current default is:

       $mw->Label(
         -text   => "perltidy",
         -relief => 'ridge'
       )->pack;

 (requested by Michael Langner 2001-03-31; in the future this could 
 be controlled by a command-line parameter).

-revised list indentation logic, so that lists following an assignment
 operator get one full indentation level, rather than just continuation 
 indentation.  Also corrected some minor glitches in the continuation 
 indentation logic. 

-Fixed problem with unwanted continuation indentation after a blank line 
(reported by Erik Thaysen 2001-03-28).

-minor update to avoid stranding a single '(' on one line

2001 03 28:

-corrected serious error tokenizing filehandles, in which a sub call 
after a print or printf, like this:
   print usage() and exit;
became this:
   print usage () and exit;
Unfortunately, this converts 'usage' to a filehandle.  To fix this, rerun
perltidy; it will look for this situation and issue a warning. 

-fixed another cuddled-else formatting bug (Reported by Craig Bourne)

-added several diagnostic --dump routines

-added token-level whitespace controls (suggested by Hans Ecke)

2001 03 23:

-added support for special variables of the form ${^WANT_BITS}

-space added between scalar and left paren in 'for' and 'foreach' loops,
 (suggestion by Michael Cartmell):

   for $i( 1 .. 20 )   # old
   for $i ( 1 .. 20 )   # new

-html now outputs cascading style sheets (thanks to suggestion from
 Hans Ecke)

-flags -o and -st now work with -html

-added missing -html documentation for comments (noted by Alex Izvorski)

-support for VMS added (thanks to Michael Cartmell for code patches and 
  testing)

-v-strings implemented (noted by Hans Ecke and Michael Cartmell; extensive
  testing by Michael Cartmell)

-fixed problem where operand may be empty at line 3970 
 (\b should be just b in lines 3970, 3973) (Thanks to Erik Thaysen, 
 Keith Marshall for bug reports)

-fixed -ce bug (cuddled else), where lines like '} else {' were indented
 (Thanks to Shawn Stepper and Rick Measham for reporting this)

2001 03 04:

-fixed undefined value in line 153 (only worked with -I set)
(Thanks to Mike Stok, Phantom of the Opcodes, Ian Ehrenwald, and others)

-fixed undefined value in line 1069 (filehandle problem with perl versions <
5.6) (Thanks to Yuri Leikind, Mike Stok, Michael Holve, Jeff Kolber)

2001 03 03:

-Initial announcement at freshmeat.net; started Change Log
(Unfortunately this version was DOA, but it was fixed the next day)

A Brief Perltidy Tutorial

Perltidy can save you a lot of tedious editing if you spend a few minutes learning to use it effectively. Perltidy is highly configurable, but for many programmers the default parameter set will be satisfactory, with perhaps a few additional parameters to account for style preferences.

This tutorial assumes that perltidy has been installed on your system. Installation instructions accompany the package. To follow along with this tutorial, please find a small Perl script and place a copy in a temporary directory. For example, here is a small (and silly) script:

 print "Help Desk -- What Editor do you use?";
 chomp($editor = <STDIN>);
 if ($editor =~ /emacs/i) {
   print "Why aren't you using vi?\n";
 } elsif ($editor =~ /vi/i) {
   print "Why aren't you using emacs?\n";
 } else {
   print "I think that's the problem\n";
 }

It is included in the docs section of the distribution.

A First Test

Assume that the name of your script is testfile.pl. You can reformat it with the default options to use the style recommended in the perlstyle man pages with the command:

 perltidy testfile.pl

For safety, perltidy never overwrites your original file. In this case, its output will go to a file named testfile.pl.tdy, which you should examine now with your editor. Here is what the above file looks like with the default options:

 print "Help Desk -- What Editor do you use?";
 chomp( $editor = <STDIN> );
 if ( $editor =~ /emacs/i ) {
     print "Why aren't you using vi?\n";
 }
 elsif ( $editor =~ /vi/i ) {
     print "Why aren't you using emacs?\n";
 }
 else {
     print "I think that's the problem\n";
 }

You'll notice an immediate style change from the "cuddled-else" style of the original to the default "non-cuddled-else" style. This is because perltidy has to make some kind of default selection of formatting options, and this default tries to follow the suggestions in the perlstyle man pages.

If you prefer the original "cuddled-else" style, don't worry, you can indicate that with a -ce flag. So if you rerun with that flag

 perltidy -ce testfile.pl

you will see a return to the original "cuddled-else" style. There are many more parameters for controlling style, and some of the most useful of these are discussed below.

Indentation

Another noticeable difference between the original and the reformatted file is that the indentation has been changed from 2 spaces to 4 spaces. That's because 4 spaces is the default. You may change this to be a different number with -i=n.

To get some practice, try these examples, and examine the resulting testfile.pl.tdy file:

 perltidy -i=8 testfile.pl

This changes the default of 4 spaces per indentation level to be 8. Now just to emphasize the point, try this and examine the result:

 perltidy -i=0 testfile.pl

There will be no indentation at all in this case.

Input Flags

This is a good place to mention a few points regarding the input flags. First, for each option, there are two forms, a long form and a short form, and either may be used.

For example, if you want to change the number of columns corresponding to one indentation level to 3 (from the default of 4) you may use either

 -i=3   or  --indent-columns=3

The short forms are convenient for entering parameters by hand, whereas the long forms, though often ridiculously long, are self-documenting and therefore useful in configuration scripts. You may use either one or two dashes ahead of the parameters. Also, the '=' sign is optional, and may be a single space instead. However, the value of a parameter must NOT be adjacent to the flag, like this -i3 (WRONG). Also, flags must be input separately, never bundled together.

Line Length and Continuation Indentation.

If you change the indentation spaces you will probably also need to change the continuation indentation spaces with the parameter -ci=n. The continuation indentation is the extra indentation -- 2 spaces by default -- given to that portion of a long line which has been placed below the start of a statement. For example:

 croak "Couldn't pop genome file"
   unless sysread( $impl->{file}, $element, $impl->{group} )
   and truncate( $impl->{file}, $new_end );

There is no fixed rule for setting the value for -ci=n, but it should probably not exceed one-half of the number of spaces of a full indentation level.
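
One way to follow this rule of thumb is to derive -ci from -i. The helper below is a hypothetical convenience, not part of perltidy; it picks the largest whole number that does not exceed half of the indentation:

```shell
# indent_flags N: print an -i/-ci pair in which the continuation
# indentation is half of the full indentation, rounded down.
indent_flags() {
    i=$1
    ci=$(( i / 2 ))
    printf '%s\n' "-i=$i -ci=$ci"
}
```

For example, indent_flags 8 prints "-i=8 -ci=4", which could be passed along as perltidy $(indent_flags 8) testfile.pl.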

In the above snippet, the statement was broken into three lines. The actual number is governed by a parameter, the maximum line length, as well as by what perltidy considers to be good break points. The maximum line length is 80 characters by default. You can change this to be any number n with the -l=n flag. Perltidy tries to produce lines which do not exceed this length, and it does this by finding good break points. For example, the above snippet would look like this with perltidy -l=40:

 croak "Couldn't pop genome file"
   unless
   sysread( $impl->{file}, $element,
     $impl->{group} )
   and
   truncate( $impl->{file}, $new_end );

You may be wondering what would happen with, say, -l=1. Go ahead and try it.

Tabs or Spaces?

With indentation, there is always a tab issue to resolve. By default, perltidy will use leading ascii space characters instead of tabs. The reason is that this will be displayed correctly by virtually all editors, and in the long run, will avoid maintenance problems.

However, if you prefer, you may have perltidy entab the leading whitespace of a line with the command -et=n, where n is the number of spaces which will be represented by one tab. But note that your text will not be displayed properly unless viewed with software that is configured to display n spaces per tab.

Input/Output Control

In the first example, we saw that if we pass perltidy the name of a file on the command line, it reformats it and creates a new filename by appending an extension, .tdy. This is the default behavior, but there are several other options.

On most systems, you may use wildcards to reformat a whole batch of files at once, like this for example:

 perltidy *.pl

and in this case, each of the output files will have a name equal to the input file with the extension .tdy appended. If you decide that the formatting is acceptable, you will want to back up your originals and then remove the .tdy extensions from the reformatted files. There is a powerful perl script called rename that can be used for this purpose; if you don't have it, you can find it for example in The Perl Cookbook.
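
If the rename script is not at hand, a plain shell loop can do the same job. This is a minimal sketch, assuming a POSIX shell; the function name and the .orig backup extension are arbitrary choices made for this example:

```shell
# accept_tdy DIR: for each DIR/*.tdy, keep a copy of the original
# as NAME.orig, then move the tidied file over the original name.
accept_tdy() {
    for tidied in "$1"/*.tdy; do
        [ -e "$tidied" ] || continue    # no matches: the glob stays literal
        original=${tidied%.tdy}
        if [ -e "$original" ]; then
            cp -p "$original" "$original.orig" || return 1
        fi
        mv "$tidied" "$original" || return 1
    done
}
```

Running accept_tdy . in the temporary directory leaves each script reformatted under its original name, with the untouched version beside it.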

If you find that the formatting done by perltidy is usually acceptable, you may want to save some effort by letting perltidy do a simple backup of the original files and then reformat them in place. You specify this with a -b flag. For example, the command

 perltidy -b *.pl

will rename the original files by appending a .bak extension, and then create reformatted files with the same names as the originals. (If you don't like the default backup extension choice .bak, the manual tells how to change it). Each time you run perltidy with the -b option, the previous .bak files will be overwritten, so please make regular separate backups.

If there is no input filename specified on the command line, then input is assumed to come from standard input and output will go to standard output. On systems with a Unix-like interface, you can use perltidy as a filter, like this:

 perltidy <somefile.pl >newfile.pl

What happens in this case is that the shell takes care of the redirected input files, '<somefile.pl', and so perltidy never sees the filename. Therefore, it knows to use the standard input and standard output channels.

If you are executing perltidy on a file and want to force the output to standard output, rather than create a .tdy file, you can indicate this with the flag -st, like this:

 perltidy somefile.pl -st >otherfile.pl

You can also control the name of the output file with the -o flag, like this:

 perltidy testfile.pl -o=testfile.new.pl

Style Variations

Perltidy has to make some kind of default selection of formatting options, and its choice is to try to follow the suggestions in the perlstyle man pages. Many programmers more or less follow these suggestions with a few exceptions. In this section we will look at just a few of the most commonly used style parameters. Later, you may want to systematically develop a set of style parameters with the help of the perltidy stylekey web page at http://perltidy.sourceforge.net/stylekey.html

-ce, cuddled elses

If you prefer cuddled elses, use the -ce flag.

-bl, braces left

Here is what the if block in the above script looks like with -bl:

 if ( $editor =~ /emacs/i )
 {
     print "Why aren't you using vi?\n";
 }
 elsif ( $editor =~ /vi/i )
 {
     print "Why aren't you using emacs?\n";
 }
 else
 {
     print "I think that's the problem\n";
 }

-lp, Lining up with parentheses

The -lp parameter can enhance the readability of lists by adding extra indentation. Consider:

        %romanNumerals = (
            one   => 'I',
            two   => 'II',
            three => 'III',
            four  => 'IV',
            five  => 'V',
            six   => 'VI',
            seven => 'VII',
            eight => 'VIII',
            nine  => 'IX',
            ten   => 'X'
        );

With the -lp flag, this is formatted as:

        %romanNumerals = (
                           one   => 'I',
                           two   => 'II',
                           three => 'III',
                           four  => 'IV',
                           five  => 'V',
                           six   => 'VI',
                           seven => 'VII',
                           eight => 'VIII',
                           nine  => 'IX',
                           ten   => 'X'
                         );

which is preferred by some. (I've actually used -lp and -cti=1 to format this block. The -cti=1 flag causes the closing paren to align vertically with the opening paren, which works well with the -lp indentation style). An advantage of -lp indentation is that it displays lists nicely. A disadvantage is that deeply nested lists can require a long line length.

-bt,-pt,-sbt: Container tightness

These are parameters for controlling the amount of space within containing parentheses, braces, and square brackets. The example below shows the effect of the three possible values, 0, 1, and 2, for the case of parentheses:

 if ( ( my $len_tab = length( $tabstr ) ) > 0 ) {  # -pt=0
 if ( ( my $len_tab = length($tabstr) ) > 0 ) {    # -pt=1 (default)
 if ((my $len_tab = length($tabstr)) > 0) {        # -pt=2

A value of 0 causes all parens to be padded on the inside with a space, and a value of 2 causes this never to happen. With a value of 1, spaces will be introduced if the item within is more than a single token.

Configuration Files

While style preferences vary, most people would agree that it is important to maintain a uniform style within a script, and this is a major benefit provided by perltidy. Once you have decided on which, if any, special options you prefer, you may want to avoid having to enter them each time you run it. You can do this by creating a special file named .perltidyrc in either your home directory, your current directory, or certain system-dependent locations. (Note the leading "." in the file name).

A handy command to know when you start using a configuration file is

  perltidy -dpro

which will dump to standard output the search that perltidy makes when looking for a configuration file, and the contents of the one that it selects, if any. This is one of a number of useful "dump and die" commands, in which perltidy will dump some information to standard output and then immediately exit. Others include -h, which dumps help information, and -v, which dumps the version number.

Another useful command when working with configuration files is

 perltidy -pro=file

which causes the contents of file to be used as the configuration file instead of a .perltidyrc file. With this command, you can easily switch among several different candidate configuration files during testing.

This .perltidyrc file is free format. It is simply a list of parameters, just as they would be entered on a command line. Any number of lines may be used, with any number of parameters per line, although it may be easiest to read with one parameter per line. Blank lines are ignored, and text after a '#' is ignored to the end of a line.

Here is an example of a .perltidyrc file:

  # This is a simple example of a .perltidyrc configuration file
  # This implements a highly spaced style
  -bl    # braces on new lines
  -pt=0  # parens not tight at all
  -bt=0  # braces not tight
  -sbt=0 # square brackets not tight
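
To see which parameters such a file actually contributes, the comments and blank lines can be stripped away. The helper below is a simplified sketch, not perltidy's own parser: the function name is made up, and it assumes that no parameter value contains a '#' character or quoted whitespace:

```shell
# rc_to_flags FILE: print the parameters in FILE on a single line,
# dropping blank lines and everything from '#' to the end of a line.
rc_to_flags() {
    awk '{
        sub(/#.*/, "")                   # remove the comment, if any
        for (i = 1; i <= NF; i++) {      # print each remaining parameter
            printf "%s%s", sep, $i
            sep = " "
        }
    } END { print "" }' "$1"
}
```

Applied to the example file above, this prints "-bl -pt=0 -bt=0 -sbt=0".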

If you experiment with this file, remember that it is in your directory, since if you are running on a Unix system, files beginning with a "." are normally hidden.

If you have a .perltidyrc file, and want perltidy to ignore it, use the -npro flag on the command line.

Error Reporting

Let's run through a 'fire drill' to see how perltidy reports errors. Try introducing an extra opening brace somewhere in a test file. For example, introducing an extra brace in the file listed above produces the following message on the terminal (or standard error output):

 ## Please see file testfile.pl.ERR!

Here is what testfile.pl.ERR contains:

 10:    final indentation level: 1
 
 Final nesting depth of '{'s is 1
 The most recent un-matched '{' is on line 6
 6: } elsif ($temperature < 68) {{
                                ^

This shows how perltidy will, by default, write error messages to a file with the extension .ERR, and it will write a note that it did so to the standard error device. If you would prefer to have the error messages sent to the standard error output, instead of to a .ERR file, use the -se flag.

Almost every programmer would want to see error messages of this type, but there are a number of messages which, if reported, would be annoying. To manage this problem, perltidy puts its messages into two categories: errors and warnings. The default is to just report the errors, but you can control this with input flags, as follows:

 flag  what this does
 ----  --------------
       default: report errors but not warnings
 -w    report all errors and warnings
 -q    quiet! do not report either errors or warnings

The default is generally a good choice, but it's not a bad idea to check programs with -w occasionally, especially if you are looking for a bug. For example, it will ask if you really want '=' instead of '=~' in this line:

    $line = s/^\s*//;

This kind of error can otherwise be hard to find.
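The fire drill can be repeated with a throwaway file. The sketch below is only an illustration: the scratch directory and the name broken.pl are invented for it, the script contents are a made-up fragment with one extra opening brace, and the exact wording of the messages may differ between perltidy versions.

```shell
demo="${TMPDIR:-/tmp}/perltidy-demo"
mkdir -p "$demo"

# A small script with one extra opening brace on the elsif line.
cat > "$demo/broken.pl" <<'EOF'
if ($temperature < 32) {
    print "freezing\n";
} elsif ($temperature < 68) {{
    print "cool\n";
}
EOF

if command -v perltidy >/dev/null 2>&1; then
    # Default behavior: a one-line note on standard error, details in a .ERR
    # file (expected next to the input file).
    perltidy "$demo/broken.pl" || true
    if [ -f "$demo/broken.pl.ERR" ]; then
        cat "$demo/broken.pl.ERR"
    fi

    # -se sends the messages to standard error instead of a .ERR file,
    # and -w reports warnings as well as errors.
    perltidy -se -w "$demo/broken.pl" || true
fi
```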

The Log File

One last topic that needs to be touched upon concerns the .LOG file. This is where perltidy records messages that are not normally of any interest, but which just might occasionally be useful. This file is not saved, though, unless perltidy detects that it has made a mistake or you ask for it to be saved.

There are a couple of ways to ask perltidy to save a log file. To create a relatively sparse log file, use

 perltidy -log testfile.pl

and for a verbose log file, use

 perltidy -g testfile.pl

The difference is that the first form saves detailed information only at intervals (at least every 50th line), while the second form saves detailed information about every line.

So returning to our example, let's force perltidy to save a verbose log file by issuing the following command:

 perltidy -g testfile.pl

You will find that a file named testfile.pl.LOG has been created in your directory.

If you open this file, you will see that it is a text file with a combination of warning messages and informative messages. All you need to know for now is that it exists; someday it may be useful.

Using Perltidy as a Filter on Selected Text from an Editor

Most programmers' editors allow a selected group of lines to be passed through an external filter. Perltidy has been designed to work well as a filter, and it is well worthwhile learning the appropriate commands to do this with your editor. This means that you can enter a few keystrokes and watch a block of text get reformatted. If you are not doing this, you are missing out on a lot of fun! You may want to supply the -q flag to prevent error messages regarding incorrect syntax, since errors may be obvious in the indentation of the reformatted text. This is entirely optional, but if you do not use the -q flag, you will need to use the undo keys in case an error message appears on the screen.

For example, within the vim editor it is only necessary to select the text by any of the text selection methods, and then issue the command !perltidy in command mode. Thus, an entire file can be formatted using

 :%!perltidy -q

or, without the -q flag, just

 :%!perltidy

It isn't necessary to format an entire file, however. Perltidy will probably work well as long as you select blocks of text whose braces, parentheses, and square brackets are properly balanced. You can even format an elsif block without the leading if block, as long as the text you select has all braces balanced.

For the emacs editor, first mark a region and then pipe it through perltidy. For example, to format an entire file, select it with C-x h and then pipe it with M-1 M-| and then perltidy. The numeric argument, M-1, causes the output from perltidy to replace the marked text. See the "GNU Emacs Manual" for more information: http://www.gnu.org/manual/emacs-20.3/html_node/emacs_toc.html

If you have difficulty with an editor, try the -st flag, which will force perltidy to send output to standard output. This might be needed, for example, if the editor passes text to perltidy as a temporary filename instead of through the standard input. If this works, you might put the -st flag in your .perltidyrc file.
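What an editor does when it filters a selection can be imitated from a shell, which is a convenient way to test the filter before wiring it into an editor. In this sketch the function name tidy_filter and the output file snippet.out are made up for the example, and the fallback to cat exists only so the pipeline still runs where perltidy is not installed.

```shell
demo="${TMPDIR:-/tmp}/perltidy-demo"
mkdir -p "$demo"

# Use perltidy as the filter when it is installed; otherwise pass the text
# through unchanged.
if command -v perltidy >/dev/null 2>&1; then
    tidy_filter() { perltidy -q; }
else
    tidy_filter() { cat; }
fi

# An elsif block without its leading if block, but with balanced braces,
# sent through standard input and collected from standard output.
printf '%s\n' 'elsif ($temperature < 68) { print "cool\n"; }' \
    | tidy_filter > "$demo/snippet.out" || true
cat "$demo/snippet.out"
```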

If you have some tips for making perltidy work with your editor, and are willing to share them, please email me (see below) and I'll try to incorporate them in this document or put up a link to them.

After you get your editor and perltidy successfully talking to each other, try formatting a snippet of code with a brace error to see what happens. (Do not use the quiet flag, -q, for this test). Perltidy will send one line starting with ## to standard error output. Your editor may either display it at the top of the reformatted text or at the bottom (or even midstream!). You probably cannot control this, and perltidy can't, but you need to know where to look when an actual error is detected.

Writing an HTML File

Perltidy can switch between two different output modes. We have been discussing what might be called its "beautifier" mode, but it can also output in HTML. To do this, use the -html flag, like this:

 perltidy -html testfile.pl

which will produce a file testfile.pl.html. There are many parameters available for adjusting the appearance of an HTML file, but a very easy way is to just write the HTML file with this simple command and then edit the stylesheet which is embedded at its top.

One important thing to know about the -html flag is that perltidy can either send its output to its beautifier or to its HTML writer, but (unfortunately) not both in a single run. So the situation can be represented like this:

                  ------------
                  |          |     --->beautifier--> testfile.pl.tdy
 testfile.pl -->  | perltidy | -->
                  |          |     --->HTML -------> testfile.pl.html
                  ------------

And in the future, there may be more output filters. So if you would like to both beautify a script and write it to HTML, you need to do it in two steps.
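The two-step procedure might look like the following sketch. The scratch directory and the contents of testfile.pl are invented for the illustration; the output names follow the diagram above, and the calls are skipped where perltidy is not installed.

```shell
demo="${TMPDIR:-/tmp}/perltidy-demo"
mkdir -p "$demo"

# A small script to process.
cat > "$demo/testfile.pl" <<'EOF'
my %h=(a=>1,b=>2);
print "$_=$h{$_}\n" for sort keys %h;
EOF

if command -v perltidy >/dev/null 2>&1; then
    # Step 1: beautifier mode, which should write testfile.pl.tdy.
    perltidy "$demo/testfile.pl" || true

    # Step 2: HTML mode, which should write testfile.pl.html.
    perltidy -html "$demo/testfile.pl" || true

    ls "$demo"
fi
```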

Summary

That's enough to get started using perltidy. When you are ready to create a .perltidyrc file, you may find it helpful to use the stylekey page as a guide at http://perltidy.sourceforge.net/stylekey.html

Many additional special features and capabilities can be found in the manual pages for perltidy at http://perltidy.sourceforge.net/perltidy.html

We hope that perltidy makes perl programming a little more fun. Please check the perltidy web site http://perltidy.sourceforge.net occasionally for updates.

The author may be contacted at perltidy at users.sourceforge.net.

Perl-Tidy-20230309/META.yml0000664000175000017500000000127314401515241014100 0ustar stevesteve--- abstract: 'indent and reformat perl scripts' author: - 'Steve Hancock ' build_requires: ExtUtils::MakeMaker: '0' configure_requires: ExtUtils::MakeMaker: '0' dynamic_config: 1 generated_by: 'ExtUtils::MakeMaker version 7.62, CPAN::Meta::Converter version 2.150010' license: gpl meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: '1.4' name: Perl-Tidy no_index: directory: - t - inc requires: perl: '5.008' resources: bugtracker: https://github.com/perltidy/perltidy/issues repository: https://github.com/perltidy/perltidy.git version: '20230309' x_serialization_backend: 'CPAN::Meta::YAML version 0.012' Perl-Tidy-20230309/COPYING0000644000175000017500000004325411776143731013702 0ustar stevesteve GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. 
Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. 
This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. 
b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. 
You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. 
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. 
If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. 
The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail. 
If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. 
Perl-Tidy-20230309/Makefile.PL0000644000175000017500000000310214364250335014600 0ustar stevesteveuse ExtUtils::MakeMaker; my $mm_ver = $ExtUtils::MakeMaker::VERSION; if ( $mm_ver =~ /_/ ) { # developer release/version $mm_ver = eval $mm_ver; die $@ if $@; } # Minimum version found by perlver: # # ------------------------------------------ # | file | explicit | syntax | external | # | ------------------------------------------ | # | perltidy.pl | v5.8.0 | v5.8.0 | n/a | # | ------------------------------------------ | # | Minimum explicit version : v5.8.0 | # | Minimum syntax version : v5.8.0 | # | Minimum version of perl : v5.8.0 | # ------------------------------------------ WriteMakefile( NAME => "Perl::Tidy", VERSION_FROM => "lib/Perl/Tidy.pm", ( $] >= 5.005 ? ( ABSTRACT => 'indent and reformat perl scripts', LICENSE => 'gpl_2', AUTHOR => 'Steve Hancock ' ) : () ), ( $mm_ver >= 6.48 ? ( MIN_PERL_VERSION => 5.008 ) : () ), EXE_FILES => ['bin/perltidy'], dist => { COMPRESS => 'gzip', SUFFIX => 'gz' }, META_MERGE => { 'meta-spec' => { version => 2 }, resources => { repository => { type => 'git', url => 'https://github.com/perltidy/perltidy.git', web => 'https://github.com/perltidy/perltidy', }, bugtracker => { "web" => "https://github.com/perltidy/perltidy/issues" }, }, }, ); Perl-Tidy-20230309/.pre-commit-hooks.yaml0000644000175000017500000000046013623240005016757 0ustar stevesteve- id: perltidy name: perltidy description: Run the perltidy source code formatter on Perl source files minimum_pre_commit_version: 2.1.0 entry: perltidy --nostandard-output --backup-and-modify-in-place args: [--standard-error-output, --backup-file-extension=/] language: perl types: [perl] Perl-Tidy-20230309/META.json0000664000175000017500000000233214401515241014245 0ustar stevesteve{ "abstract" : "indent and reformat perl scripts", "author" : [ "Steve Hancock " ], "dynamic_config" : 1, "generated_by" : "ExtUtils::MakeMaker version 7.62, CPAN::Meta::Converter version 2.150010", "license" 
: [ "gpl_2" ], "meta-spec" : { "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec", "version" : 2 }, "name" : "Perl-Tidy", "no_index" : { "directory" : [ "t", "inc" ] }, "prereqs" : { "build" : { "requires" : { "ExtUtils::MakeMaker" : "0" } }, "configure" : { "requires" : { "ExtUtils::MakeMaker" : "0" } }, "runtime" : { "requires" : { "perl" : "5.008" } } }, "release_status" : "stable", "resources" : { "bugtracker" : { "web" : "https://github.com/perltidy/perltidy/issues" }, "repository" : { "type" : "git", "url" : "https://github.com/perltidy/perltidy.git", "web" : "https://github.com/perltidy/perltidy" } }, "version" : "20230309", "x_serialization_backend" : "JSON::PP version 4.04" } Perl-Tidy-20230309/lib/0002755000175000017500000000000014401515241013372 5ustar stevestevePerl-Tidy-20230309/lib/Perl/0002755000175000017500000000000014401515241014274 5ustar stevestevePerl-Tidy-20230309/lib/Perl/Tidy.pod0000644000175000017500000004616014400733174015724 0ustar stevesteve =head1 NAME Perl::Tidy - Parses and beautifies perl source =head1 SYNOPSIS use Perl::Tidy; my $error_flag = Perl::Tidy::perltidy( source => $source, destination => $destination, stderr => $stderr, argv => $argv, perltidyrc => $perltidyrc, logfile => $logfile, errorfile => $errorfile, teefile => $teefile, debugfile => $debugfile, formatter => $formatter, # callback object (see below) dump_options => $dump_options, dump_options_type => $dump_options_type, prefilter => $prefilter_coderef, postfilter => $postfilter_coderef, ); =head1 DESCRIPTION This module makes the functionality of the perltidy utility available to perl scripts. Any or all of the input parameters may be omitted, in which case the @ARGV array will be used to provide input parameters as described in the perltidy(1) man page. 
For example, the perltidy script is basically just this: use Perl::Tidy; Perl::Tidy::perltidy(); The call to B returns a scalar B<$error_flag> which is TRUE if an error caused premature termination, and FALSE if the process ran to normal completion. Additional discuss of errors is contained below in the L section. The module accepts input and output streams by a variety of methods. The following list of parameters may be any of the following: a filename, an ARRAY reference, a SCALAR reference, or an object with either a B or B method, as appropriate. source - the source of the script to be formatted destination - the destination of the formatted output stderr - standard error output perltidyrc - the .perltidyrc file logfile - the .LOG file stream, if any errorfile - the .ERR file stream, if any dump_options - ref to a hash to receive parameters (see below), dump_options_type - controls contents of dump_options dump_getopt_flags - ref to a hash to receive Getopt flags dump_options_category - ref to a hash giving category of options dump_abbreviations - ref to a hash giving all abbreviations The following chart illustrates the logic used to decide how to treat a parameter. ref($param) $param is assumed to be: ----------- --------------------- undef a filename SCALAR ref to string ARRAY ref to array (other) object with getline (if source) or print method If the parameter is an object, and the object has a B method, that close method will be called at the end of the stream. =over 4 =item B If the B parameter is given, it defines the source of the input stream. If an input stream is defined with the B parameter then no other source filenames may be specified in the @ARGV array or B parameter. =item B If the B parameter is given, it will be used to define the file or memory location to receive output of perltidy. B. 
Perl strings of characters which are decoded as utf8 by Perl::Tidy can be returned in either of two possible states, decoded or encoded, and it is important that the calling program and Perl::Tidy are in agreement regarding the state to be returned. A flag B<--encode-output-strings>, or simply B<-eos>, was added in Perl::Tidy version 20220217 for this purpose. =over 4 =item * Use B<-eos> if Perl::Tidy should encode any string which it decodes. This is the current default because it makes perltidy behave well as a filter, and is the correct setting for most programs. But do not use this setting if the calling program will encode the data too, because double encoding will corrupt data. =item * Use B<-neos> if a string should remain decoded if it was decoded by Perl::Tidy. This is only appropriate if the calling program will handle any needed encoding before outputting the string. If needed, this flag can be added to the end of the B parameter passed to Perl::Tidy. =back For some background information see L. This change in default behavior was made over a period of time as follows: =over 4 =item * For versions before 20220217 the B<-eos> flag was not available and the behavior was equivalent to B<-neos>. =item * In version 20220217 the B<-eos> flag was added but the default remained B<-neos>. =item * For versions after 20220217 the default was set to B<-eos>. =back =item B The B parameter allows the calling program to redirect the stream that would otherwise go to the standard error output device to any of the stream types listed above. This stream contains important warnings and errors related to the parameters passed to perltidy. =item B If the B file is given, it will be used instead of any F<.perltidyrc> configuration file that would otherwise be used. =item B The B parameter allows the calling program to capture the stream that would otherwise go to either a .ERR file. This stream contains warnings or errors related to the contents of one source file or stream. 
The reason that this is different from the stderr stream is that when perltidy is called to process multiple files there will be up to one .ERR file created for each file and it would be very confusing if they were combined. However, if perltidy is called to process just a single perl script then it may be more convenient to combine the B stream with the B stream. This can be done by setting the B<-se> parameter, in which case this parameter is ignored.

=item B

The B parameter allows the calling program to capture the log stream. This stream is only created if requested with a B<-g> parameter. It contains detailed diagnostic information about a script which may be useful for debugging.

=item B

The B parameter allows the calling program to capture the tee stream. This stream is only created if requested with one of the 'tee' parameters: B<--tee-pod>, B<--tee-block-comments>, B<--tee-side-comments>, or B<--tee-all-comments>.

=item B

The B parameter allows the calling program to capture the stream produced by the B<--DEBUG> parameter. This parameter is mainly used for debugging perltidy itself.

=item B

If the B parameter is given, it will be used instead of the B<@ARGV> array. The B parameter may be a string, a reference to a string, or a reference to an array. If it is a string or reference to a string, it will be parsed into an array of items just as if it were a command line string.

=item B

If the B parameter is given, it must be a reference to a hash. In this case, the parameters contained in any perltidyrc configuration file will be placed in this hash and perltidy will return immediately. This is equivalent to running perltidy with --dump-options, except that the parameters are returned in a hash rather than dumped to standard output. Also, by default only the parameters in the perltidyrc file are returned, but this can be changed (see the next parameter). This parameter provides a convenient method for external programs to read a perltidyrc file.
An example program using this feature, F, is included in the distribution. Any combination of the B parameters may be used together. =item B This parameter is a string which can be used to control the parameters placed in the hash reference supplied by B. The possible values are 'perltidyrc' (default) and 'full'. The 'full' parameter causes both the default options plus any options found in a perltidyrc file to be returned. =item B If the B parameter is given, it must be the reference to a hash. This hash will receive all of the parameters that perltidy understands and flags that are passed to Getopt::Long. This parameter may be used alone or with the B flag. Perltidy will exit immediately after filling this hash. See the demo program F for example usage. =item B If the B parameter is given, it must be the reference to a hash. This hash will receive a hash with keys equal to all long parameter names and values equal to the title of the corresponding section of the perltidy manual. See the demo program F for example usage. =item B If the B parameter is given, it must be the reference to a hash. This hash will receive all abbreviations used by Perl::Tidy. See the demo program F for example usage. =item B A code reference that will be applied to the source before tidying. It is expected to take the full content as a string in its input, and output the transformed content. =item B A code reference that will be applied to the tidied result before outputting. It is expected to take the full content as a string in its input, and output the transformed content. Note: A convenient way to check the function of your custom prefilter and postfilter code is to use the --notidy option, first with just the prefilter and then with both the prefilter and postfilter. See also the file B in the perltidy distribution. =back =head1 ERROR HANDLING An exit value of 0, 1, or 2 is returned by perltidy to indicate the status of the result. 

An exit value of 0 indicates that perltidy ran to completion with no error messages.

An exit value of 1 indicates that the process had to be terminated early due to errors in the input parameters. This can happen, for example, if a parameter is misspelled or given an invalid value. The calling program should check for this flag because if it is set the destination stream will be empty or incomplete and should be ignored. Error messages in the B stream will indicate the cause of any problem.

An exit value of 2 indicates that perltidy ran to completion but there are warning messages in the B stream related to parameter errors or conflicts and/or warning messages in the B stream relating to possible syntax errors in the source code being tidied.

In the event of a catastrophic error for which recovery is not possible B terminates by making calls to B or B to help the programmer localize the problem. These should normally only occur during program development.

=head1 NOTES ON FORMATTING PARAMETERS

Parameters which control formatting may be passed in several ways: in a F<.perltidyrc> configuration file, in the B parameter, and in the B parameter.

The B<-syn> (B<--check-syntax>) flag may be used with all source and destination streams except for standard input and output. However, data streams which are not associated with a filename will be copied to a temporary file before being passed to Perl. This use of temporary files can cause somewhat confusing output from Perl.

If the B<-pbp> style is used it will typically be necessary to also specify a B<-nst> flag. This is necessary to turn off the B<-st> flag contained in the B<-pbp> parameter set which otherwise would direct the output stream to the standard output.

=head1 EXAMPLES

The following example uses string references to hold the input and output code and error streams, and illustrates checking for errors.

  use Perl::Tidy;

  my $source_string = <<'EOT';
  my$error=Perl::Tidy::perltidy(argv=>$argv,source=>\$source_string,
    destination=>\$dest_string,stderr=>\$stderr_string,
  errorfile=>\$errorfile_string,);
  EOT

  my $dest_string;
  my $stderr_string;
  my $errorfile_string;
  my $argv = "-npro";   # Ignore any .perltidyrc at this site
  $argv .= " -pbp";     # Format according to perl best practices
  $argv .= " -nst";     # Must turn off -st in case -pbp is specified
  $argv .= " -se";      # -se appends the errorfile to stderr
  ## $argv .= " --spell-check";  # uncomment to trigger an error

  print "<>\n$source_string\n";

  my $error = Perl::Tidy::perltidy(
      argv        => $argv,
      source      => \$source_string,
      destination => \$dest_string,
      stderr      => \$stderr_string,
      errorfile   => \$errorfile_string,    # ignored when -se flag is set
      ##phasers   => 'stun',                # uncomment to trigger an error
  );

  if ($error) {

      # serious error in input parameters, no tidied output
      print "<>\n$stderr_string\n";
      die "Exiting because of serious errors\n";
  }

  if ($dest_string)      { print "<>\n$dest_string\n" }
  if ($stderr_string)    { print "<>\n$stderr_string\n" }
  if ($errorfile_string) { print "<<.ERR file>>\n$errorfile_string\n" }

Additional examples are given in the examples section of the perltidy distribution.

=head1 Using the B Callback Object

The B parameter is an optional callback object which allows the calling program to receive tokenized lines directly from perltidy for further specialized processing. When this parameter is used, the two formatting options which are built into perltidy (beautification or html) are ignored. The following diagram illustrates the logical flow:

                    |-- (normal route)   -> code beautification
  caller->perltidy->|-- (-html flag )    -> create html
                    |-- (formatter given)-> callback to write_line

This can be useful for processing perl scripts in some way.

The parameter C<$formatter> in the perltidy call,

        formatter   => $formatter,

is an object created by the caller with a C method which will accept and process tokenized lines, one line per call. Here is a simple example of a C which merely prints the line number, the line type (as determined by perltidy), and the text of the line:

 sub write_line {

     # This is called from perltidy line-by-line
     my $self              = shift;
     my $line_of_tokens    = shift;
     my $line_type         = $line_of_tokens->{_line_type};
     my $input_line_number = $line_of_tokens->{_line_number};
     my $input_line        = $line_of_tokens->{_line_text};
     print "$input_line_number:$line_type:$input_line";
 }

The complete program, B, is contained in the examples section of the source distribution. As this example shows, the callback method receives a parameter B<$line_of_tokens>, which is a reference to a hash of other useful information. This example uses these hash entries:

 $line_of_tokens->{_line_number} - the line number (1,2,...)
 $line_of_tokens->{_line_text}   - the text of the line
 $line_of_tokens->{_line_type}   - the type of the line, one of:

    SYSTEM         - system-specific code before hash-bang line
    CODE           - line of perl code (including comments)
    POD_START      - line starting pod, such as '=head'
    POD            - pod documentation text
    POD_END        - last line of pod section, '=cut'
    HERE           - text of here-document
    HERE_END       - last line of here-doc (target word)
    FORMAT         - format section
    FORMAT_END     - last line of format section, '.'
    DATA_START     - __DATA__ line
    DATA           - unidentified text following __DATA__
    END_START      - __END__ line
    END            - unidentified text following __END__
    ERROR          - we are in big trouble, probably not a perl script

Most applications will only be interested in lines of type B. For another example, let's write a program which checks for one of the so-called I C<$`>, C<$&>, and C<$'>, which can slow down processing.
Here is a B, from the example program B, which does that: sub write_line { # This is called back from perltidy line-by-line # We're looking for $`, $&, and $' my ( $self, $line_of_tokens ) = @_; # pull out some stuff we might need my $line_type = $line_of_tokens->{_line_type}; my $input_line_number = $line_of_tokens->{_line_number}; my $input_line = $line_of_tokens->{_line_text}; my $rtoken_type = $line_of_tokens->{_rtoken_type}; my $rtokens = $line_of_tokens->{_rtokens}; chomp $input_line; # skip comments, pod, etc return if ( $line_type ne 'CODE' ); # loop over tokens looking for $`, $&, and $' for ( my $j = 0 ; $j < @$rtoken_type ; $j++ ) { # we only want to examine token types 'i' (identifier) next unless $$rtoken_type[$j] eq 'i'; # pull out the actual token text my $token = $$rtokens[$j]; # and check it if ( $token =~ /^\$[\`\&\']$/ ) { print STDERR "$input_line_number: $token\n"; } } } This example pulls out these tokenization variables from the $line_of_tokens hash reference: $rtoken_type = $line_of_tokens->{_rtoken_type}; $rtokens = $line_of_tokens->{_rtokens}; The variable C<$rtoken_type> is a reference to an array of token type codes, and C<$rtokens> is a reference to a corresponding array of token text. These are obviously only defined for lines of type B. Perltidy classifies tokens into types, and has a brief code for each type. You can get a complete list at any time by running perltidy from the command line with perltidy --dump-token-types In the present example, we are only looking for tokens of type B (identifiers), so the for loop skips past all other types. When an identifier is found, its actual text is checked to see if it is one being sought. If so, the above write_line prints the token and its line number. The B section of the source distribution has some examples of programs which use the B option. 
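
As a further illustration of the callback mechanism described above, here is a minimal sketch of a complete program which counts how many input lines fall into each line type. It relies only on the C<_line_type> hash entry and on the perltidy call parameters described in this document. The class name C<LineTypeCounter> and the sample source text are assumptions made for this sketch and are not part of the distribution.

```perl
use strict;
use warnings;
use Perl::Tidy;

# Hypothetical callback class; the name LineTypeCounter is an assumption
# made for this sketch and is not part of Perl::Tidy.
package LineTypeCounter;

sub new {
    my ($class) = @_;
    return bless { counts => {} }, $class;
}

sub write_line {

    # Called by perltidy once for each input line
    my ( $self, $line_of_tokens ) = @_;
    $self->{counts}->{ $line_of_tokens->{_line_type} }++;
    return;
}

package main;

# A small sample script, built without a here-document
my $source = join "\n",
  'my $x = 1;',
  '# a block comment',
  'print "$x\n";',
  '__END__',
  'text after the end marker',
  '';

my $counter = LineTypeCounter->new();
my $dest;
my $error = Perl::Tidy::perltidy(
    argv        => '-npro',      # ignore any .perltidyrc
    source      => \$source,
    destination => \$dest,       # probably not needed when a formatter is given
    formatter   => $counter,     # callback object with a write_line method
);
die "perltidy stopped because of a parameter error\n" if ( $error == 1 );

# Report the number of lines seen for each line type
foreach my $type ( sort keys %{ $counter->{counts} } ) {
    print "$type: $counter->{counts}->{$type}\n";
}
```

Keeping the counts inside the callback object, rather than in a global variable, lets the caller inspect them after perltidy returns.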
For help with perltidy's peculiar way of breaking lines into tokens, you might run, from the command line, perltidy -D filename where F is a short script of interest. This will produce F with interleaved lines of text and their token types. The B<-D> flag has been in perltidy from the beginning for this purpose. If you want to see the code which creates this file, it is C =head1 EXPORT &perltidy =head1 INSTALLATION The module 'Perl::Tidy' comes with a binary 'perltidy' which is installed when the module is installed. The module name is case-sensitive. For example, the basic command for installing with cpanm is 'cpanm Perl::Tidy'. =head1 VERSION This man page documents Perl::Tidy version 20230309 =head1 LICENSE This package is free software; you can redistribute it and/or modify it under the terms of the "GNU General Public License". Please refer to the file "COPYING" for details. =head1 BUG REPORTS The source code repository is at L. To report a new bug or problem, use the "issues" link on this page. =head1 SEE ALSO The perltidy(1) man page describes all of the features of perltidy. It can be found at http://perltidy.sourceforge.net. =cut Perl-Tidy-20230309/lib/Perl/Tidy/0002755000175000017500000000000014401515241015205 5ustar stevestevePerl-Tidy-20230309/lib/Perl/Tidy/Logger.pm0000644000175000017500000004030214400733205016760 0ustar stevesteve##################################################################### # # The Perl::Tidy::Logger class writes any .LOG and .ERR files # and supplies some basic run information for error handling. # ##################################################################### package Perl::Tidy::Logger; use strict; use warnings; our $VERSION = '20230309'; use English qw( -no_match_vars ); use constant DEVEL_MODE => 0; use constant EMPTY_STRING => q{}; use constant SPACE => q{ }; sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. 
This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR < 50; sub new { my ( $class, @args ) = @_; my %defaults = ( rOpts => undef, log_file => undef, warning_file => undef, fh_stderr => undef, display_name => undef, is_encoded_data => undef, ); my %args = ( %defaults, @args ); my $rOpts = $args{rOpts}; my $log_file = $args{log_file}; my $warning_file = $args{warning_file}; my $fh_stderr = $args{fh_stderr}; my $display_name = $args{display_name}; my $is_encoded_data = $args{is_encoded_data}; my $fh_warnings = $rOpts->{'standard-error-output'} ? $fh_stderr : undef; # remove any old error output file if we might write a new one unless ( $fh_warnings || ref($warning_file) ) { if ( -e $warning_file ) { unlink($warning_file) or Perl::Tidy::Die( "couldn't unlink warning file $warning_file: $ERRNO\n"); } } my $logfile_gap = defined( $rOpts->{'logfile-gap'} ) ? $rOpts->{'logfile-gap'} : DEFAULT_LOGFILE_GAP; if ( $logfile_gap == 0 ) { $logfile_gap = 1 } my $filename_stamp = $display_name ? $display_name . ':' : "??"; my $input_stream_name = $display_name ? 
$display_name : "??"; return bless { _log_file => $log_file, _logfile_gap => $logfile_gap, _rOpts => $rOpts, _fh_warnings => $fh_warnings, _last_input_line_written => 0, _at_end_of_file => 0, _use_prefix => 1, _block_log_output => 0, _line_of_tokens => undef, _output_line_number => undef, _wrote_line_information_string => 0, _wrote_column_headings => 0, _warning_file => $warning_file, _warning_count => 0, _complaint_count => 0, _is_encoded_data => $is_encoded_data, _saw_code_bug => -1, # -1=no 0=maybe 1=for sure _saw_brace_error => 0, _output_array => [], _input_stream_name => $input_stream_name, _filename_stamp => $filename_stamp, _save_logfile => $rOpts->{'logfile'}, }, $class; } ## end sub new sub get_input_stream_name { my $self = shift; return $self->{_input_stream_name}; } sub get_warning_count { my $self = shift; return $self->{_warning_count}; } sub get_use_prefix { my $self = shift; return $self->{_use_prefix}; } sub block_log_output { my $self = shift; $self->{_block_log_output} = 1; return; } sub unblock_log_output { my $self = shift; $self->{_block_log_output} = 0; return; } sub interrupt_logfile { my $self = shift; $self->{_use_prefix} = 0; $self->warning("\n"); $self->write_logfile_entry( '#' x 24 . " WARNING " . '#' x 25 . "\n" ); return; } ## end sub interrupt_logfile sub resume_logfile { my $self = shift; $self->write_logfile_entry( '#' x 60 . 
"\n" ); $self->{_use_prefix} = 1; return; } ## end sub resume_logfile sub we_are_at_the_last_line { my $self = shift; unless ( $self->{_wrote_line_information_string} ) { $self->write_logfile_entry("Last line\n\n"); } $self->{_at_end_of_file} = 1; return; } ## end sub we_are_at_the_last_line # record some stuff in case we go down in flames use constant MAX_PRINTED_CHARS => 35; sub black_box { my ( $self, $line_of_tokens, $output_line_number ) = @_; my $input_line = $line_of_tokens->{_line_text}; my $input_line_number = $line_of_tokens->{_line_number}; # save line information in case we have to write a logfile message $self->{_line_of_tokens} = $line_of_tokens; $self->{_output_line_number} = $output_line_number; $self->{_wrote_line_information_string} = 0; my $last_input_line_written = $self->{_last_input_line_written}; if ( ( ( $input_line_number - $last_input_line_written ) >= $self->{_logfile_gap} ) || ( $input_line =~ /^\s*(sub|package)\s+(\w+)/ ) ) { my $structural_indentation_level = $line_of_tokens->{_level_0}; $structural_indentation_level = 0 if ( $structural_indentation_level < 0 ); $self->{_last_input_line_written} = $input_line_number; ( my $out_str = $input_line ) =~ s/^\s*//; chomp $out_str; $out_str = ( '.' x $structural_indentation_level ) . $out_str; if ( length($out_str) > MAX_PRINTED_CHARS ) { $out_str = substr( $out_str, 0, MAX_PRINTED_CHARS ) . " ...."; } $self->logfile_output( EMPTY_STRING, "$out_str\n" ); } return; } ## end sub black_box sub write_logfile_entry { my ( $self, @msg ) = @_; # add leading >>> to avoid confusing error messages and code $self->logfile_output( ">>>", "@msg" ); return; } ## end sub write_logfile_entry sub write_column_headings { my $self = shift; $self->{_wrote_column_headings} = 1; my $routput_array = $self->{_output_array}; push @{$routput_array}, <>>) lines levels i k (code begins with one '.' 
per indent level) ------ ----- - - -------- ------------------------------------------- EOM return; } ## end sub write_column_headings sub make_line_information_string { # make columns of information when a logfile message needs to go out my $self = shift; my $line_of_tokens = $self->{_line_of_tokens}; my $input_line_number = $line_of_tokens->{_line_number}; my $line_information_string = EMPTY_STRING; if ($input_line_number) { my $output_line_number = $self->{_output_line_number}; my $brace_depth = $line_of_tokens->{_curly_brace_depth}; my $paren_depth = $line_of_tokens->{_paren_depth}; my $square_bracket_depth = $line_of_tokens->{_square_bracket_depth}; my $guessed_indentation_level = $line_of_tokens->{_guessed_indentation_level}; my $structural_indentation_level = $line_of_tokens->{_level_0}; $self->write_column_headings() unless $self->{_wrote_column_headings}; # keep logfile columns aligned for scripts up to 999 lines; # for longer scripts it doesn't really matter my $extra_space = EMPTY_STRING; $extra_space .= ( $input_line_number < 10 ) ? SPACE x 2 : ( $input_line_number < 100 ) ? SPACE : EMPTY_STRING; $extra_space .= ( $output_line_number < 10 ) ? SPACE x 2 : ( $output_line_number < 100 ) ? SPACE : EMPTY_STRING; # there are 2 possible nesting strings: # the original which looks like this: (0 [1 {2 # the new one, which looks like this: {{[ # the new one is easier to read, and shows the order, but # could be arbitrarily long, so we use it unless it is too long my $nesting_string = "($paren_depth [$square_bracket_depth {$brace_depth"; my $nesting_string_new = $line_of_tokens->{_nesting_tokens_0}; my $ci_level = $line_of_tokens->{_ci_level_0}; if ( $ci_level > 9 ) { $ci_level = '*' } my $bk = ( $line_of_tokens->{_nesting_blocks_0} =~ /1$/ ) ? '1' : '0'; if ( length($nesting_string_new) <= 8 ) { $nesting_string = $nesting_string_new . 
SPACE x ( 8 - length($nesting_string_new) ); } $line_information_string = "L$input_line_number:$output_line_number$extra_space i$guessed_indentation_level:$structural_indentation_level $ci_level $bk $nesting_string"; } return $line_information_string; } ## end sub make_line_information_string sub logfile_output { my ( $self, $prompt, $msg ) = @_; return if ( $self->{_block_log_output} ); my $routput_array = $self->{_output_array}; if ( $self->{_at_end_of_file} || !$self->{_use_prefix} ) { push @{$routput_array}, "$msg"; } else { my $line_information_string = $self->make_line_information_string(); $self->{_wrote_line_information_string} = 1; if ($line_information_string) { push @{$routput_array}, "$line_information_string $prompt$msg"; } else { push @{$routput_array}, "$msg"; } } return; } ## end sub logfile_output sub get_saw_brace_error { my $self = shift; return $self->{_saw_brace_error}; } sub increment_brace_error { my $self = shift; $self->{_saw_brace_error}++; return; } sub brace_warning { my ( $self, $msg ) = @_; use constant BRACE_WARNING_LIMIT => 10; my $saw_brace_error = $self->{_saw_brace_error}; if ( $saw_brace_error < BRACE_WARNING_LIMIT ) { $self->warning($msg); } $saw_brace_error++; $self->{_saw_brace_error} = $saw_brace_error; if ( $saw_brace_error == BRACE_WARNING_LIMIT ) { $self->warning("No further warnings of this type will be given\n"); } return; } ## end sub brace_warning sub complain { # handle non-critical warning messages based on input flag my ( $self, $msg ) = @_; my $rOpts = $self->{_rOpts}; # these appear in .ERR output only if -w flag is used if ( $rOpts->{'warning-output'} ) { $self->warning($msg); } # otherwise, they go to the .LOG file else { $self->{_complaint_count}++; $self->write_logfile_entry($msg); } return; } ## end sub complain sub warning { # report errors to .ERR file (or stdout) my ( $self, $msg ) = @_; use constant WARNING_LIMIT => 50; # Always bump the warn count, even if no message goes out 
Perl::Tidy::Warn_count_bump(); my $rOpts = $self->{_rOpts}; unless ( $rOpts->{'quiet'} ) { my $warning_count = $self->{_warning_count}; my $fh_warnings = $self->{_fh_warnings}; my $is_encoded_data = $self->{_is_encoded_data}; if ( !$fh_warnings ) { my $warning_file = $self->{_warning_file}; ( $fh_warnings, my $filename ) = Perl::Tidy::streamhandle( $warning_file, 'w', $is_encoded_data ); $fh_warnings or Perl::Tidy::Die("couldn't open $filename: $ERRNO\n"); Perl::Tidy::Warn_msg("## Please see file $filename\n") unless ref($warning_file); $self->{_fh_warnings} = $fh_warnings; $fh_warnings->print("Perltidy version is $Perl::Tidy::VERSION\n"); } my $filename_stamp = $self->{_filename_stamp}; if ( $warning_count < WARNING_LIMIT ) { if ( !$warning_count ) { # On first error always write a line with the filename. Note # that the filename will be 'perltidy' if input is from stdin # or from a data structure. if ($filename_stamp) { $fh_warnings->print( "\n$filename_stamp Begin Error Output Stream\n"); } # Turn off filename stamping unless error output is directed # to the standard error output (with -se flag) if ( !$rOpts->{'standard-error-output'} ) { $filename_stamp = EMPTY_STRING; $self->{_filename_stamp} = $filename_stamp; } } if ( $self->get_use_prefix() > 0 ) { $self->write_logfile_entry("WARNING: $msg"); # add prefix 'filename:line_no: ' to message lines my $input_line_number = Perl::Tidy::Tokenizer::get_input_line_number(); if ( !defined($input_line_number) ) { $input_line_number = -1 } my $pre_string = $filename_stamp . $input_line_number . ': '; chomp $msg; $msg =~ s/\n/\n$pre_string/g; $msg = $pre_string . $msg . "\n"; $fh_warnings->print($msg); } else { $self->write_logfile_entry($msg); # add prefix 'filename: ' to message lines if ($filename_stamp) { my $pre_string = $filename_stamp . SPACE; chomp $msg; $msg =~ s/\n/\n$pre_string/g; $msg = $pre_string . $msg . 
"\n"; } $fh_warnings->print($msg); } } $warning_count++; $self->{_warning_count} = $warning_count; if ( $warning_count == WARNING_LIMIT ) { $fh_warnings->print( $filename_stamp . "No further warnings will be given\n" ); } } return; } ## end sub warning sub report_definite_bug { my $self = shift; $self->{_saw_code_bug} = 1; return; } sub get_save_logfile { # Returns a true/false flag indicating whether or not # the logfile will be saved. my $self = shift; return $self->{_save_logfile}; } ## end sub get_save_logfile sub finish { # called after all formatting to summarize errors my ($self) = @_; my $warning_count = $self->{_warning_count}; my $save_logfile = $self->{_save_logfile}; my $log_file = $self->{_log_file}; if ($warning_count) { if ($save_logfile) { $self->block_log_output(); # avoid echoing this to the logfile $self->warning( "The logfile $log_file may contain useful information\n"); $self->unblock_log_output(); } if ( $self->{_complaint_count} > 0 ) { $self->warning( "To see $self->{_complaint_count} non-critical warnings rerun with -w\n" ); } if ( $self->{_saw_brace_error} && ( $self->{_logfile_gap} > 1 || !$save_logfile ) ) { $self->warning("To save a full .LOG file rerun with -g\n"); } } if ($save_logfile) { my $is_encoded_data = $self->{_is_encoded_data}; my ( $fh, $filename ) = Perl::Tidy::streamhandle( $log_file, 'w', $is_encoded_data ); if ($fh) { my $routput_array = $self->{_output_array}; foreach my $line ( @{$routput_array} ) { $fh->print($line) } if ( $log_file ne '-' && !ref $log_file ) { my $ok = eval { $fh->close(); 1 }; if ( !$ok && DEVEL_MODE ) { Fault("Could not close file handle(): $EVAL_ERROR\n"); } } } } return; } ## end sub finish 1; Perl-Tidy-20230309/lib/Perl/Tidy/VerticalAligner/0002755000175000017500000000000014401515241020260 5ustar stevestevePerl-Tidy-20230309/lib/Perl/Tidy/VerticalAligner/Line.pm0000644000175000017500000000505114400733207021507 0ustar stevesteve##################################################################### 
# # The Perl::Tidy::VerticalAligner::Line class supplies an object to # contain a single output line. It allows manipulation of the # alignment columns on that line. # ##################################################################### package Perl::Tidy::VerticalAligner::Line; use strict; use warnings; use English qw( -no_match_vars ); our $VERSION = '20230309'; sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR <{ralignments}->[$j]; return unless defined($alignment); return $alignment->get_column(); } ## end sub get_column sub current_field_width { my ( $self, $j ) = @_; my $col_j = 0; my $col_jm = 0; my $alignment_j = $self->{ralignments}->[$j]; $col_j = $alignment_j->get_column() if defined($alignment_j); if ( $j > 0 ) { my $alignment_jm = $self->{ralignments}->[ $j - 1 ]; $col_jm = $alignment_jm->get_column() if defined($alignment_jm); } return $col_j - $col_jm; } ## end sub current_field_width sub increase_field_width { my ( $self, $j, $pad ) = @_; my $jmax = $self->{jmax}; foreach ( $j .. 
$jmax ) { my $alignment = $self->{ralignments}->[$_]; if ( defined($alignment) ) { $alignment->increment_column($pad); } } return; } ## end sub increase_field_width sub get_available_space_on_right { my $jmax = $_[0]->{jmax}; return $_[0]->{maximum_line_length} - $_[0]->get_column($jmax); } } 1; Perl-Tidy-20230309/lib/Perl/Tidy/VerticalAligner/Alignment.pm0000644000175000017500000000316414400733206022540 0ustar stevesteve##################################################################### # # the Perl::Tidy::VerticalAligner::Alignment class holds information # on a single column being aligned # ##################################################################### package Perl::Tidy::VerticalAligner::Alignment; use strict; use warnings; { #<<< A non-indenting brace our $VERSION = '20230309'; sub new { my ( $class, $rarg ) = @_; my $self = bless $rarg, $class; return $self; } sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. 
our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR <{'column'}; } sub increment_column { $_[0]->{'column'} += $_[1]; return; } sub save_column { $_[0]->{'saved_column'} = $_[0]->{'column'}; return; } sub restore_column { $_[0]->{'column'} = $_[0]->{'saved_column'}; return; } } ## end of package VerticalAligner::Alignment 1; Perl-Tidy-20230309/lib/Perl/Tidy/DevNull.pm0000644000175000017500000000060714400733177017126 0ustar stevesteve##################################################################### # # The Perl::Tidy::DevNull class supplies a dummy print method # ##################################################################### package Perl::Tidy::DevNull; use strict; use warnings; our $VERSION = '20230309'; sub new { my $self = shift; return bless {}, $self } sub print { return } sub close { return } 1; Perl-Tidy-20230309/lib/Perl/Tidy/FileWriter.pm0000644000175000017500000003646514400733200017627 0ustar stevesteve##################################################################### # # the Perl::Tidy::FileWriter class writes the output file # ##################################################################### package Perl::Tidy::FileWriter; use strict; use warnings; our $VERSION = '20230309'; use constant DEVEL_MODE => 0; use constant EMPTY_STRING => q{}; sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR < 6; BEGIN { # Array index names for variables. # Do not combine with other BEGIN blocks (c101). 
my $i = 0; use constant { _line_sink_object_ => $i++, _logger_object_ => $i++, _rOpts_ => $i++, _output_line_number_ => $i++, _consecutive_blank_lines_ => $i++, _consecutive_nonblank_lines_ => $i++, _consecutive_new_blank_lines_ => $i++, _first_line_length_error_ => $i++, _max_line_length_error_ => $i++, _last_line_length_error_ => $i++, _first_line_length_error_at_ => $i++, _max_line_length_error_at_ => $i++, _last_line_length_error_at_ => $i++, _line_length_error_count_ => $i++, _max_output_line_length_ => $i++, _max_output_line_length_at_ => $i++, _rK_checklist_ => $i++, _K_arrival_order_matches_ => $i++, _K_sequence_error_msg_ => $i++, _K_last_arrival_ => $i++, _save_logfile_ => $i++, }; } ## end BEGIN sub Die { my ($msg) = @_; Perl::Tidy::Die($msg); return; } sub Fault { my ($msg) = @_; # This routine is called for errors that really should not occur # except if there has been a bug introduced by a recent program change. # Please add comments at calls to Fault to explain why the call # should not occur, and where to look to fix it. 
my ( $package0, $filename0, $line0, $subroutine0 ) = caller(0); my ( $package1, $filename1, $line1, $subroutine1 ) = caller(1); my ( $package2, $filename2, $line2, $subroutine2 ) = caller(2); my $pkg = __PACKAGE__; Die(<[_logger_object_]; if ($logger_object) { $logger_object->warning($msg); } return; } ## end sub warning sub write_logfile_entry { my ( $self, $msg ) = @_; my $logger_object = $self->[_logger_object_]; if ($logger_object) { $logger_object->write_logfile_entry($msg); } return; } ## end sub write_logfile_entry sub new { my ( $class, $line_sink_object, $rOpts, $logger_object ) = @_; my $self = []; $self->[_line_sink_object_] = $line_sink_object; $self->[_logger_object_] = $logger_object; $self->[_rOpts_] = $rOpts; $self->[_output_line_number_] = 1; $self->[_consecutive_blank_lines_] = 0; $self->[_consecutive_nonblank_lines_] = 0; $self->[_consecutive_new_blank_lines_] = 0; $self->[_first_line_length_error_] = 0; $self->[_max_line_length_error_] = 0; $self->[_last_line_length_error_] = 0; $self->[_first_line_length_error_at_] = 0; $self->[_max_line_length_error_at_] = 0; $self->[_last_line_length_error_at_] = 0; $self->[_line_length_error_count_] = 0; $self->[_max_output_line_length_] = 0; $self->[_max_output_line_length_at_] = 0; $self->[_rK_checklist_] = []; $self->[_K_arrival_order_matches_] = 0; $self->[_K_sequence_error_msg_] = EMPTY_STRING; $self->[_K_last_arrival_] = -1; $self->[_save_logfile_] = defined($logger_object); # save input stream name for local error messages $input_stream_name = EMPTY_STRING; if ($logger_object) { $input_stream_name = $logger_object->get_input_stream_name(); } bless $self, $class; return $self; } ## end sub new sub setup_convergence_test { my ( $self, $rlist ) = @_; if ( @{$rlist} ) { # We are going to destroy the list, so make a copy # and put in reverse order so we can pop values my @list = @{$rlist}; if ( $list[0] < $list[-1] ) { @list = reverse @list; } $self->[_rK_checklist_] = \@list; } 
$self->[_K_arrival_order_matches_] = 1; $self->[_K_sequence_error_msg_] = EMPTY_STRING; $self->[_K_last_arrival_] = -1; return; } ## end sub setup_convergence_test sub get_convergence_check { my ($self) = @_; my $rlist = $self->[_rK_checklist_]; # converged if all K arrived and in correct order return $self->[_K_arrival_order_matches_] && !@{$rlist}; } ## end sub get_convergence_check sub get_output_line_number { return $_[0]->[_output_line_number_]; } sub decrement_output_line_number { $_[0]->[_output_line_number_]--; return; } sub get_consecutive_nonblank_lines { return $_[0]->[_consecutive_nonblank_lines_]; } sub get_consecutive_blank_lines { return $_[0]->[_consecutive_blank_lines_]; } sub reset_consecutive_blank_lines { $_[0]->[_consecutive_blank_lines_] = 0; return; } # This sub call allows termination of logfile writing for efficiency when we # know that the logfile will not be saved. sub set_save_logfile { my ( $self, $save_logfile ) = @_; $self->[_save_logfile_] = $save_logfile; return; } sub want_blank_line { my $self = shift; unless ( $self->[_consecutive_blank_lines_] ) { $self->write_blank_code_line(); } return; } ## end sub want_blank_line sub require_blank_code_lines { # write out the requested number of blanks regardless of the value of -mbl # unless -mbl=0. This allows extra blank lines to be written for subs and # packages even with the default -mbl=1 my ( $self, $count ) = @_; my $need = $count - $self->[_consecutive_blank_lines_]; my $rOpts = $self->[_rOpts_]; my $forced = $rOpts->{'maximum-consecutive-blank-lines'} > 0; foreach ( 0 .. $need - 1 ) { $self->write_blank_code_line($forced); } return; } ## end sub require_blank_code_lines sub write_blank_code_line { my ( $self, $forced ) = @_; # Write a blank line of code, given: # $forced = optional flag which, if set, forces the blank line # to be written. This allows the -mbl flag to be temporarily # exceeded. 
my $rOpts = $self->[_rOpts_]; return if (!$forced && $self->[_consecutive_blank_lines_] >= $rOpts->{'maximum-consecutive-blank-lines'} ); $self->[_consecutive_nonblank_lines_] = 0; # Balance old blanks against new (forced) blanks instead of writing them. # This fixes case b1073. if ( !$forced && $self->[_consecutive_new_blank_lines_] > 0 ) { $self->[_consecutive_new_blank_lines_]--; return; } $self->[_line_sink_object_]->write_line("\n"); $self->[_output_line_number_]++; $self->[_consecutive_blank_lines_]++; $self->[_consecutive_new_blank_lines_]++ if ($forced); return; } ## end sub write_blank_code_line use constant MAX_PRINTED_CHARS => 80; sub write_code_line { my ( $self, $str, $K ) = @_; # Write a line of code, given # $str = the line of code # $K = an optional check integer which, if given, must # increase monotonically. This was added to catch cache # sequence errors in the vertical aligner. $self->[_consecutive_blank_lines_] = 0; $self->[_consecutive_new_blank_lines_] = 0; $self->[_consecutive_nonblank_lines_]++; $self->[_line_sink_object_]->write_line($str); if ( chomp $str ) { $self->[_output_line_number_]++; } if ( $self->[_save_logfile_] ) { $self->check_line_lengths($str) } #---------------------------- # Convergence and error check #---------------------------- if ( defined($K) ) { # Convergence check: we are checking if all defined K values arrive in # the order which was defined by the caller. Quit checking if any # unexpected K value arrives. if ( $self->[_K_arrival_order_matches_] ) { my $Kt = pop @{ $self->[_rK_checklist_] }; if ( !defined($Kt) || $Kt != $K ) { $self->[_K_arrival_order_matches_] = 0; } } # Check for out-of-order arrivals of index K. The K values are the # token indexes of the last token of code lines, and they should come # out in increasing order. Otherwise something is seriously wrong. # Most likely a recent programming change to VerticalAligner.pm has # caused lines to go out in the wrong order.
This could happen if # either the cache or buffer that it uses are emptied in the wrong # order. if ( !$self->[_K_sequence_error_msg_] ) { my $K_prev = $self->[_K_last_arrival_]; if ( $K < $K_prev ) { chomp $str; if ( length($str) > MAX_PRINTED_CHARS ) { $str = substr( $str, 0, MAX_PRINTED_CHARS ) . "..."; } my $msg = <warning($msg) if ( length($str) ); # Only issue this warning once $self->[_K_sequence_error_msg_] = $msg; } } $self->[_K_last_arrival_] = $K; } return; } ## end sub write_code_line sub write_line { my ( $self, $str ) = @_; # Write a line directly to the output, without any counting of blank or # non-blank lines. $self->[_line_sink_object_]->write_line($str); if ( chomp $str ) { $self->[_output_line_number_]++; } if ( $self->[_save_logfile_] ) { $self->check_line_lengths($str) } return; } ## end sub write_line sub check_line_lengths { my ( $self, $str ) = @_; # collect info on line lengths for logfile # This calculation of excess line length ignores any internal tabs my $rOpts = $self->[_rOpts_]; my $len_str = length($str); my $exceed = $len_str - $rOpts->{'maximum-line-length'}; if ( $str && substr( $str, 0, 1 ) eq "\t" && $str =~ /^\t+/g ) { $exceed += pos($str) * $rOpts->{'indent-columns'}; } # Note that we just incremented output line number to future value # so we must subtract 1 for current line number if ( $len_str > $self->[_max_output_line_length_] ) { $self->[_max_output_line_length_] = $len_str; $self->[_max_output_line_length_at_] = $self->[_output_line_number_] - 1; } if ( $exceed > 0 ) { my $output_line_number = $self->[_output_line_number_]; $self->[_last_line_length_error_] = $exceed; $self->[_last_line_length_error_at_] = $output_line_number - 1; if ( $self->[_line_length_error_count_] == 0 ) { $self->[_first_line_length_error_] = $exceed; $self->[_first_line_length_error_at_] = $output_line_number - 1; } if ( $self->[_last_line_length_error_] > $self->[_max_line_length_error_] ) { $self->[_max_line_length_error_] = $exceed; 
$self->[_max_line_length_error_at_] = $output_line_number - 1; } if ( $self->[_line_length_error_count_] < MAX_NAG_MESSAGES ) { $self->write_logfile_entry( "Line length exceeded by $exceed characters\n"); } $self->[_line_length_error_count_]++; } return; } ## end sub check_line_lengths sub report_line_length_errors { my $self = shift; # Write summary info about line lengths to the log file my $rOpts = $self->[_rOpts_]; my $line_length_error_count = $self->[_line_length_error_count_]; if ( $line_length_error_count == 0 ) { $self->write_logfile_entry( "No lines exceeded $rOpts->{'maximum-line-length'} characters\n"); my $max_output_line_length = $self->[_max_output_line_length_]; my $max_output_line_length_at = $self->[_max_output_line_length_at_]; $self->write_logfile_entry( " Maximum output line length was $max_output_line_length at line $max_output_line_length_at\n" ); } else { my $word = ( $line_length_error_count > 1 ) ? "s" : EMPTY_STRING; $self->write_logfile_entry( "$line_length_error_count output line$word exceeded $rOpts->{'maximum-line-length'} characters:\n" ); $word = ( $line_length_error_count > 1 ) ? 
"First" : EMPTY_STRING; my $first_line_length_error = $self->[_first_line_length_error_]; my $first_line_length_error_at = $self->[_first_line_length_error_at_]; $self->write_logfile_entry( " $word at line $first_line_length_error_at by $first_line_length_error characters\n" ); if ( $line_length_error_count > 1 ) { my $max_line_length_error = $self->[_max_line_length_error_]; my $max_line_length_error_at = $self->[_max_line_length_error_at_]; my $last_line_length_error = $self->[_last_line_length_error_]; my $last_line_length_error_at = $self->[_last_line_length_error_at_]; $self->write_logfile_entry( " Maximum at line $max_line_length_error_at by $max_line_length_error characters\n" ); $self->write_logfile_entry( " Last at line $last_line_length_error_at by $last_line_length_error characters\n" ); } } return; } ## end sub report_line_length_errors 1; Perl-Tidy-20230309/lib/Perl/Tidy/VerticalAligner.pm0000644000175000017500000064162214400733206020631 0ustar stevestevepackage Perl::Tidy::VerticalAligner; use strict; use warnings; use Carp; use English qw( -no_match_vars ); our $VERSION = '20230309'; use Perl::Tidy::VerticalAligner::Alignment; use Perl::Tidy::VerticalAligner::Line; use constant DEVEL_MODE => 0; use constant EMPTY_STRING => q{}; use constant SPACE => q{ }; # The Perl::Tidy::VerticalAligner package collects output lines and # attempts to line up certain common tokens, such as => and #, which are # identified by the calling routine. # # Usage: # - Initiate an object with a call to new(). # - Write lines one-by-one with calls to valign_input(). # - Make a final call to flush() to empty the pipeline. # # The sub valign_input collects lines into groups. When a group reaches # the maximum possible size it is processed for alignment and output. # The maximum group size is reached whenever there is a change in indentation # level, a blank line, a block comment, or an external flush call. The calling # routine may also force a break in alignment at any time. 
# # If the calling routine needs to interrupt the output and send other text to # the output, it must first call flush() to empty the output pipeline. This # might occur for example if a block of pod text needs to be sent to the output # between blocks of code. # It is essential that a final call to flush() be made. Otherwise some # final lines of text will be lost. # Index... # CODE SECTION 1: Preliminary code, global definitions and sub new # sub new # CODE SECTION 2: Some Basic Utilities # CODE SECTION 3: Code to accept input and form groups # sub valign_input # CODE SECTION 4: Code to process comment lines # sub _flush_comment_lines # CODE SECTION 5: Code to process groups of code lines # sub _flush_group_lines # CODE SECTION 6: Output Step A # sub valign_output_step_A # CODE SECTION 7: Output Step B # sub valign_output_step_B # CODE SECTION 8: Output Step C # sub valign_output_step_C # CODE SECTION 9: Output Step D # sub valign_output_step_D # CODE SECTION 10: Summary # sub report_anything_unusual ################################################################## # CODE SECTION 1: Preliminary code, global definitions and sub new ################################################################## sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. 
our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR < $i++, _logger_object_ => $i++, _diagnostics_object_ => $i++, _length_function_ => $i++, _rOpts_ => $i++, _rOpts_indent_columns_ => $i++, _rOpts_tabs_ => $i++, _rOpts_entab_leading_whitespace_ => $i++, _rOpts_fixed_position_side_comment_ => $i++, _rOpts_minimum_space_to_comment_ => $i++, _rOpts_valign_code_ => $i++, _rOpts_valign_block_comments_ => $i++, _rOpts_valign_side_comments_ => $i++, _last_level_written_ => $i++, _last_side_comment_column_ => $i++, _last_side_comment_line_number_ => $i++, _last_side_comment_length_ => $i++, _last_side_comment_level_ => $i++, _outdented_line_count_ => $i++, _first_outdented_line_at_ => $i++, _last_outdented_line_at_ => $i++, _consecutive_block_comments_ => $i++, _rgroup_lines_ => $i++, _group_level_ => $i++, _group_type_ => $i++, _group_maximum_line_length_ => $i++, _zero_count_ => $i++, _last_leading_space_count_ => $i++, _comment_leading_space_count_ => $i++, }; # Debug flag. This is a relic from the original program development # looking for problems with tab characters. Caution: this debug flag can # produce a lot of output. It should be 0 except when debugging small # scripts. use constant DEBUG_TABS => 0; my $debug_warning = sub { print STDOUT "VALIGN_DEBUGGING with key $_[0]\n"; return; }; DEBUG_TABS && $debug_warning->('TABS'); } ## end BEGIN # GLOBAL variables my ( %valign_control_hash, $valign_control_default, ); sub check_options { # This routine is called to check the user-supplied run parameters # and to configure the control hashes to them.
my ($rOpts) = @_; # All alignments are done by default %valign_control_hash = (); $valign_control_default = 1; # If -vil=s is entered without -vxl, assume -vxl='*' if ( !$rOpts->{'valign-exclusion-list'} && $rOpts->{'valign-inclusion-list'} ) { $rOpts->{'valign-exclusion-list'} = '*'; } # See if the user wants to exclude any alignment types ... if ( $rOpts->{'valign-exclusion-list'} ) { # The inclusion list is only relevant if there is an exclusion list if ( $rOpts->{'valign-inclusion-list'} ) { my @vil = split /\s+/, $rOpts->{'valign-inclusion-list'}; @valign_control_hash{@vil} = (1) x scalar(@vil); } # Note that the -vxl list is done after -vil, so -vxl has priority # in the event of duplicate entries. my @vxl = split /\s+/, $rOpts->{'valign-exclusion-list'}; @valign_control_hash{@vxl} = (0) x scalar(@vxl); # Optimization: revert to defaults if no exclusions. # This could happen with -vxl=' ' and any -vil list if ( !@vxl ) { %valign_control_hash = (); } # '$valign_control_default' applies to types not in the hash: # - If a '*' was entered then set it to be that default type # - Otherwise, leave it set to 1 if ( defined( $valign_control_hash{'*'} ) ) { $valign_control_default = $valign_control_hash{'*'}; } # Side comments are controlled separately and must be removed # if given in a list.
if (%valign_control_hash) { $valign_control_hash{'#'} = 1; } } return; } ## end sub check_options sub check_keys { my ( $rtest, $rvalid, $msg, $exact_match ) = @_; # Check the keys of a hash: # $rtest = ref to hash to test # $rvalid = ref to hash with valid keys # $msg = a message to write in case of error # $exact_match defines the type of check: # = false: test hash must not have unknown key # = true: test hash must have exactly same keys as known hash my @unknown_keys = grep { !exists $rvalid->{$_} } keys %{$rtest}; my @missing_keys = grep { !exists $rtest->{$_} } keys %{$rvalid}; my $error = @unknown_keys; if ($exact_match) { $error ||= @missing_keys } if ($error) { local $LIST_SEPARATOR = ')('; my @expected_keys = sort keys %{$rvalid}; @unknown_keys = sort @unknown_keys; Fault(< undef, file_writer_object => undef, logger_object => undef, diagnostics_object => undef, length_function => sub { return length( $_[0] ) }, ); my %args = ( %defaults, @args ); # Initialize other caches and buffers initialize_step_B_cache(); initialize_valign_buffer(); initialize_leading_string_cache(); initialize_decode(); set_logger_object( $args{logger_object} ); # Initialize all variables in $self. # To add an item to $self, first define a new constant index in the BEGIN # section. 
my $self = []; # objects $self->[_file_writer_object_] = $args{file_writer_object}; $self->[_logger_object_] = $args{logger_object}; $self->[_diagnostics_object_] = $args{diagnostics_object}; $self->[_length_function_] = $args{length_function}; # shortcuts to user options my $rOpts = $args{rOpts}; $self->[_rOpts_] = $rOpts; $self->[_rOpts_indent_columns_] = $rOpts->{'indent-columns'}; $self->[_rOpts_tabs_] = $rOpts->{'tabs'}; $self->[_rOpts_entab_leading_whitespace_] = $rOpts->{'entab-leading-whitespace'}; $self->[_rOpts_fixed_position_side_comment_] = $rOpts->{'fixed-position-side-comment'}; $self->[_rOpts_minimum_space_to_comment_] = $rOpts->{'minimum-space-to-comment'}; $self->[_rOpts_valign_code_] = $rOpts->{'valign-code'}; $self->[_rOpts_valign_block_comments_] = $rOpts->{'valign-block-comments'}; $self->[_rOpts_valign_side_comments_] = $rOpts->{'valign-side-comments'}; # Batch of lines being collected $self->[_rgroup_lines_] = []; $self->[_group_level_] = 0; $self->[_group_type_] = EMPTY_STRING; $self->[_group_maximum_line_length_] = undef; $self->[_zero_count_] = 0; $self->[_comment_leading_space_count_] = 0; $self->[_last_leading_space_count_] = 0; # Memory of what has been processed $self->[_last_level_written_] = -1; $self->[_last_side_comment_column_] = 0; $self->[_last_side_comment_line_number_] = 0; $self->[_last_side_comment_length_] = 0; $self->[_last_side_comment_level_] = -1; $self->[_outdented_line_count_] = 0; $self->[_first_outdented_line_at_] = 0; $self->[_last_outdented_line_at_] = 0; $self->[_consecutive_block_comments_] = 0; bless $self, $class; return $self; } ## end sub new ################################# # CODE SECTION 2: Basic Utilities ################################# sub flush { # flush() is the external call to completely empty the pipeline. my ($self) = @_; # push things out the pipeline... 
# push out any current group lines $self->_flush_group_lines(); # then anything left in the cache of step_B $self->_flush_step_B_cache(); # then anything left in the buffer of step_C $self->dump_valign_buffer(); return; } ## end sub flush sub initialize_for_new_group { my ($self) = @_; $self->[_rgroup_lines_] = []; $self->[_group_type_] = EMPTY_STRING; $self->[_zero_count_] = 0; $self->[_comment_leading_space_count_] = 0; $self->[_last_leading_space_count_] = 0; $self->[_group_maximum_line_length_] = undef; # Note that the value for _group_level_ is # handled separately in sub valign_input return; } ## end sub initialize_for_new_group sub group_line_count { return +@{ $_[0]->[_rgroup_lines_] }; } # interface to Perl::Tidy::Diagnostics routines # For debugging; not currently used sub write_diagnostics { my ( $self, $msg ) = @_; my $diagnostics_object = $self->[_diagnostics_object_]; if ($diagnostics_object) { $diagnostics_object->write_diagnostics($msg); } return; } ## end sub write_diagnostics { ## begin closure for logger routines my $logger_object; # Called once per file to initialize the logger object sub set_logger_object { $logger_object = shift; return; } sub get_logger_object { return $logger_object; } sub get_input_stream_name { my $input_stream_name = EMPTY_STRING; if ($logger_object) { $input_stream_name = $logger_object->get_input_stream_name(); } return $input_stream_name; } ## end sub get_input_stream_name sub warning { my ($msg) = @_; if ($logger_object) { $logger_object->warning($msg); } return; } ## end sub warning sub write_logfile_entry { my ($msg) = @_; if ($logger_object) { $logger_object->write_logfile_entry($msg); } return; } ## end sub write_logfile_entry } sub get_cached_line_count { my $self = shift; return $self->group_line_count() + ( get_cached_line_type() ? 
1 : 0 ); } sub get_recoverable_spaces { # return the number of spaces (+ means shift right, - means shift left) # that we would like to shift a group of lines with the same indentation # to get them to line up with their opening parens my $indentation = shift; return ref($indentation) ? $indentation->get_recoverable_spaces() : 0; } ## end sub get_recoverable_spaces ###################################################### # CODE SECTION 3: Code to accept input and form groups ###################################################### use constant DEBUG_VALIGN => 0; use constant SC_LONG_LINE_DIFF => 12; my %is_closing_token; BEGIN { my @q = qw< } ) ] >; @is_closing_token{@q} = (1) x scalar(@q); } #-------------------------------------------- # VTFLAGS: Vertical tightness types and flags #-------------------------------------------- # Vertical tightness is controlled by a 'type' and associated 'flags' for each # line. These values are set by sub Formatter::set_vertical_tightness_flags. # These are defined as follows: # Vertical Tightness Line Type Codes: # Type 0, no vertical tightness condition # Type 1, last token of this line is a non-block opening token # Type 2, first token of next line is a non-block closing token # Type 3, isolated opening block brace # Type 4, isolated closing block brace # Opening token flag values are the vertical tightness flags # 0 do not join with next line # 1 just one join per line # 2 any number of joins # Closing token flag values indicate spacing: # 0 = no space added before closing token # 1 = single space added before closing token sub valign_input { #--------------------------------------------------------------------- # This is the front door of the vertical aligner. On each call # we receive one line of specially marked text for vertical alignment.
# We compare the line with the current group, and either: # - the line joins the current group if alignments match, or # - the current group is flushed and a new group is started otherwise #--------------------------------------------------------------------- # # The key input parameters describing each line are: # $level = indentation level of this line # $rfields = ref to array of fields # $rpatterns = ref to array of patterns, one per field # $rtokens = ref to array of tokens starting fields 1,2,.. # $rfield_lengths = ref to array of field display widths # # Here is an example of what this package does. In this example, # we are trying to line up both the '=>' and the '#'. # # '18' => 'grave', # \` # '19' => 'acute', # `' # '20' => 'caron', # \v # <-tabs-><--field 2 ---><-f3-> # | | | | # | | | | # col1 col2 col3 col4 # # The calling routine has already broken the entire line into 3 fields as # indicated. (So the work of identifying promising common tokens has # already been done). # # In this example, there will be 2 tokens being matched: '=>' and '#'. # They are the leading parts of fields 2 and 3, but we do need to know # what they are so that we can dump a group of lines when these tokens # change. # # The fields contain the actual characters of each field. The patterns # are like the fields, but they contain mainly token types instead # of tokens, so they have fewer characters. They are used to be # sure we are matching fields of similar type. # # In this example, there will be 4 column indexes being adjusted. The # first one is always at zero. The interior columns are at the start of # the matching tokens, and the last one tracks the maximum line length. # # Each time a new line comes in, it joins the current vertical # group if possible. Otherwise it causes the current group to be flushed # and a new group is started. # # For each new group member, the column locations are increased, as # necessary, to make room for the new fields. 
When the group is finally # output, these column numbers are used to compute the amount of spaces of # padding needed for each field. # # Programming note: the fields are assumed not to have any tab characters. # Tabs have been previously removed except for tabs in quoted strings and # side comments. Tabs in these fields can mess up the column counting. # The log file warns the user if there are any such tabs. my ( $self, $rcall_hash ) = @_; # Unpack the call args. This form is significantly faster than getting them # one-by-one. my ( $Kend, $break_alignment_after, $break_alignment_before, $ci_level, $forget_side_comment, $indentation, $is_terminal_ternary, $level, $level_end, $list_seqno, $maximum_line_length, $outdent_long_lines, $rline_alignment, $rvertical_tightness_flags, ) = @{$rcall_hash}{ qw( Kend break_alignment_after break_alignment_before ci_level forget_side_comment indentation is_terminal_ternary level level_end list_seqno maximum_line_length outdent_long_lines rline_alignment rvertical_tightness_flags ) }; my ( $rtokens, $rfields, $rpatterns, $rfield_lengths ) = @{$rline_alignment}; # The index '$Kend' is a value which is passed along with the line text to sub # 'write_code_line' for a convergence check. # number of fields is $jmax # number of tokens between fields is $jmax-1 my $jmax = @{$rfields} - 1; my $leading_space_count = ref($indentation) ? $indentation->get_spaces() : $indentation; # set outdented flag to be sure we either align within statements or # across statement boundaries, but not both. my $is_outdented = $self->[_last_leading_space_count_] > $leading_space_count; $self->[_last_leading_space_count_] = $leading_space_count; # Identify a hanging side comment. Hanging side comments have an empty # initial field. my $is_hanging_side_comment = ( $jmax == 1 && $rtokens->[0] eq '#' && $rfields->[0] =~ /^\s*$/ ); # Undo outdented flag for a hanging side comment $is_outdented = 0 if $is_hanging_side_comment; # Identify a block comment.
my $is_block_comment = $jmax == 0 && substr( $rfields->[0], 0, 1 ) eq '#'; # Block comment .. update count if ($is_block_comment) { $self->[_consecutive_block_comments_]++; } # Not a block comment .. # Forget side comment column if we saw 2 or more block comments, # and reset the count else { if ( $self->[_consecutive_block_comments_] > 1 ) { $self->forget_side_comment(); } $self->[_consecutive_block_comments_] = 0; } # Reset side comment location if we are entering a new block from level 0. # This is intended to keep them from drifting too far to the right. if ($forget_side_comment) { $self->forget_side_comment(); } my $is_balanced_line = $level_end == $level; my $group_level = $self->[_group_level_]; my $group_maximum_line_length = $self->[_group_maximum_line_length_]; DEBUG_VALIGN && do { my $nlines = $self->group_line_count(); print STDOUT "Entering valign_input: lines=$nlines new #fields= $jmax, leading_count=$leading_space_count, level=$level, group_level=$group_level, level_end=$level_end\n"; }; # Validate cached line if necessary: If we can produce a container # with just 2 lines total by combining an existing cached opening # token with the closing token to follow, then we will mark both # cached flags as valid. my $cached_line_type = get_cached_line_type(); if ($cached_line_type) { my $cached_line_opening_flag = get_cached_line_opening_flag(); if ($rvertical_tightness_flags) { my $cached_seqno = get_cached_seqno(); if ( $cached_seqno && $rvertical_tightness_flags->{_vt_seqno} && $rvertical_tightness_flags->{_vt_seqno} == $cached_seqno ) { # Fix for b1187 and b1188: Normally this step is only done # if the number of existing lines is 0 or 1. But to prevent # blinking, this range can be controlled by the caller. # If zero values are given we fall back on the range 0 to 1. 
my $line_count = $self->group_line_count(); my $min_lines = $rvertical_tightness_flags->{_vt_min_lines}; my $max_lines = $rvertical_tightness_flags->{_vt_max_lines}; $min_lines = 0 unless ($min_lines); $max_lines = 1 unless ($max_lines); if ( ( $line_count >= $min_lines ) && ( $line_count <= $max_lines ) ) { $rvertical_tightness_flags->{_vt_valid_flag} ||= 1; set_cached_line_valid(1); } } } # do not join an opening block brace (type 3, see VTFLAGS) # with an unbalanced line unless requested with a flag value of 2 if ( $cached_line_type == 3 && !$self->group_line_count() && $cached_line_opening_flag < 2 && !$is_balanced_line ) { set_cached_line_valid(0); } } # shouldn't happen: if ( $level < 0 ) { $level = 0 } # do not align code across indentation level changes # or changes in the maximum line length # or if vertical alignment is turned off if ( $level != $group_level || ( $group_maximum_line_length && $maximum_line_length != $group_maximum_line_length ) || $is_outdented || ( $is_block_comment && !$self->[_rOpts_valign_block_comments_] ) || ( !$is_block_comment && !$self->[_rOpts_valign_side_comments_] && !$self->[_rOpts_valign_code_] ) ) { $self->_flush_group_lines( $level - $group_level ); $group_level = $level; $self->[_group_level_] = $group_level; $self->[_group_maximum_line_length_] = $maximum_line_length; # Update leading spaces after the above flush because the leading space # count may have been changed if the -icp flag is in effect $leading_space_count = ref($indentation) ? $indentation->get_spaces() : $indentation; } # -------------------------------------------------------------------- # Collect outdentable block COMMENTS # -------------------------------------------------------------------- if ( $self->[_group_type_] eq 'COMMENT' ) { if ( $is_block_comment && $outdent_long_lines && $leading_space_count == $self->[_comment_leading_space_count_] ) { # Note that for a comment group we are not storing a line # but rather just the text and its length. 
push @{ $self->[_rgroup_lines_] }, [ $rfields->[0], $rfield_lengths->[0], $Kend ]; return; } else { $self->_flush_group_lines(); } } my $rgroup_lines = $self->[_rgroup_lines_]; if ( $break_alignment_before && @{$rgroup_lines} ) { $rgroup_lines->[-1]->{'end_group'} = 1; } # -------------------------------------------------------------------- # add dummy fields for terminal ternary # -------------------------------------------------------------------- my $j_terminal_match; if ( $is_terminal_ternary && @{$rgroup_lines} ) { $j_terminal_match = fix_terminal_ternary( $rgroup_lines->[-1], $rfields, $rtokens, $rpatterns, $rfield_lengths, $group_level, ); $jmax = @{$rfields} - 1; } # -------------------------------------------------------------------- # add dummy fields for else statement # -------------------------------------------------------------------- # Note the trailing space after 'else' here. If there were no space between # the else and the next '{' then we would not be able to do vertical # alignment of the '{'. if ( $rfields->[0] eq 'else ' && @{$rgroup_lines} && $is_balanced_line ) { $j_terminal_match = fix_terminal_else( $rgroup_lines->[-1], $rfields, $rtokens, $rpatterns, $rfield_lengths ); $jmax = @{$rfields} - 1; } # -------------------------------------------------------------------- # Handle simple line of code with no fields to match. # -------------------------------------------------------------------- if ( $jmax <= 0 ) { $self->[_zero_count_]++; if ( @{$rgroup_lines} && !get_recoverable_spaces( $rgroup_lines->[0]->{'indentation'} ) ) { # flush the current group if it has some aligned columns.. # or we haven't seen a comment lately if ( $rgroup_lines->[0]->{'jmax'} > 1 || $self->[_zero_count_] > 3 ) { $self->_flush_group_lines(); # Update '$rgroup_lines' - it will become a ref to empty array. # This allows avoiding a call to get_group_line_count below. 
$rgroup_lines = $self->[_rgroup_lines_]; } } # start new COMMENT group if this comment may be outdented if ( $is_block_comment && $outdent_long_lines && !@{$rgroup_lines} ) { $self->[_group_type_] = 'COMMENT'; $self->[_comment_leading_space_count_] = $leading_space_count; $self->[_group_maximum_line_length_] = $maximum_line_length; push @{$rgroup_lines}, [ $rfields->[0], $rfield_lengths->[0], $Kend ]; return; } # just write this line directly if no current group, no side comment, # and no space recovery is needed. if ( !@{$rgroup_lines} && !get_recoverable_spaces($indentation) ) { $self->valign_output_step_B( { leading_space_count => $leading_space_count, line => $rfields->[0], line_length => $rfield_lengths->[0], side_comment_length => 0, outdent_long_lines => $outdent_long_lines, rvertical_tightness_flags => $rvertical_tightness_flags, level => $level, level_end => $level_end, Kend => $Kend, maximum_line_length => $maximum_line_length, } ); return; } } else { $self->[_zero_count_] = 0; } # -------------------------------------------------------------------- # It simplifies things to create a zero length side comment # if none exists. # -------------------------------------------------------------------- if ( ( $jmax == 0 ) || ( $rtokens->[ $jmax - 1 ] ne '#' ) ) { $jmax += 1; $rtokens->[ $jmax - 1 ] = '#'; $rfields->[$jmax] = EMPTY_STRING; $rfield_lengths->[$jmax] = 0; $rpatterns->[$jmax] = '#'; } # -------------------------------------------------------------------- # create an object to hold this line # -------------------------------------------------------------------- # The hash keys below must match the list of keys in %valid_LINE_keys. # Values in this hash are accessed directly, except for 'ralignments', # rather than with get/set calls for efficiency. 
my $new_line = Perl::Tidy::VerticalAligner::Line->new( { jmax => $jmax, rtokens => $rtokens, rfields => $rfields, rpatterns => $rpatterns, rfield_lengths => $rfield_lengths, indentation => $indentation, leading_space_count => $leading_space_count, outdent_long_lines => $outdent_long_lines, list_seqno => $list_seqno, list_type => EMPTY_STRING, is_hanging_side_comment => $is_hanging_side_comment, rvertical_tightness_flags => $rvertical_tightness_flags, is_terminal_ternary => $is_terminal_ternary, j_terminal_match => $j_terminal_match, end_group => $break_alignment_after, Kend => $Kend, ci_level => $ci_level, level => $level, level_end => $level_end, imax_pair => -1, maximum_line_length => $maximum_line_length, ralignments => [], } ); DEVEL_MODE && check_keys( $new_line, \%valid_LINE_keys, "Checking line keys at line definition", 1 ); # -------------------------------------------------------------------- # Decide if this is a simple list of items. # We use this to be less restrictive in deciding what to align. 
# -------------------------------------------------------------------- decide_if_list($new_line) if ($list_seqno); # -------------------------------------------------------------------- # Append this line to the current group (or start new group) # -------------------------------------------------------------------- push @{ $self->[_rgroup_lines_] }, $new_line; $self->[_group_maximum_line_length_] = $maximum_line_length; # output this group if it ends in a terminal else or ternary line if ( defined($j_terminal_match) ) { $self->_flush_group_lines(); } # Force break after jump to lower level elsif ($level_end < $level || $is_closing_token{ substr( $rfields->[0], 0, 1 ) } ) { $self->_flush_group_lines(-1); } # -------------------------------------------------------------------- # Some old debugging stuff # -------------------------------------------------------------------- DEBUG_VALIGN && do { print STDOUT "exiting valign_input fields:"; dump_array( @{$rfields} ); print STDOUT "exiting valign_input tokens:"; dump_array( @{$rtokens} ); print STDOUT "exiting valign_input patterns:"; dump_array( @{$rpatterns} ); }; return; } ## end sub valign_input sub join_hanging_comment { # Add dummy fields to a hanging side comment to make it look # like the first line in its potential group. This simplifies # the coding. my ( $new_line, $old_line ) = @_; my $jmax = $new_line->{'jmax'}; # must be 2 fields return 0 unless $jmax == 1; my $rtokens = $new_line->{'rtokens'}; # the second field must be a comment return 0 unless $rtokens->[0] eq '#'; my $rfields = $new_line->{'rfields'}; # the first field must be empty return 0 unless $rfields->[0] =~ /^\s*$/; # the current line must have fewer fields my $maximum_field_index = $old_line->{'jmax'}; return 0 unless $maximum_field_index > $jmax; # looks ok.. 
my $rpatterns = $new_line->{'rpatterns'}; my $rfield_lengths = $new_line->{'rfield_lengths'}; $new_line->{'is_hanging_side_comment'} = 1; $jmax = $maximum_field_index; $new_line->{'jmax'} = $jmax; $rfields->[$jmax] = $rfields->[1]; $rfield_lengths->[$jmax] = $rfield_lengths->[1]; $rtokens->[ $jmax - 1 ] = $rtokens->[0]; $rpatterns->[ $jmax - 1 ] = $rpatterns->[0]; foreach my $j ( 1 .. $jmax - 1 ) { $rfields->[$j] = EMPTY_STRING; $rfield_lengths->[$j] = 0; $rtokens->[ $j - 1 ] = EMPTY_STRING; $rpatterns->[ $j - 1 ] = EMPTY_STRING; } return 1; } ## end sub join_hanging_comment { ## closure for sub decide_if_list my %is_comma_token; BEGIN { my @q = qw( => ); push @q, ','; @is_comma_token{@q} = (1) x scalar(@q); } ## end BEGIN sub decide_if_list { my $line = shift; # A list will be taken to be a line with a forced break in which all # of the field separators are commas or comma-arrows (except for the # trailing #) my $rtokens = $line->{'rtokens'}; my $test_token = $rtokens->[0]; my ( $raw_tok, $lev, $tag, $tok_count ) = decode_alignment_token($test_token); if ( $is_comma_token{$raw_tok} ) { my $list_type = $test_token; my $jmax = $line->{'jmax'}; foreach ( 1 .. $jmax - 2 ) { ( $raw_tok, $lev, $tag, $tok_count ) = decode_alignment_token( $rtokens->[$_] ); if ( !$is_comma_token{$raw_tok} ) { $list_type = EMPTY_STRING; last; } } $line->{'list_type'} = $list_type; } return; } ## end sub decide_if_list } sub fix_terminal_ternary { # Add empty fields as necessary to align a ternary term # like this: # # my $leapyear = # $year % 4 ? 0 # : $year % 100 ? 1 # : $year % 400 ? 
0 # : 1; # # returns the index of the terminal question token, if any my ( $old_line, $rfields, $rtokens, $rpatterns, $rfield_lengths, $group_level ) = @_; return unless ($old_line); use constant EXPLAIN_TERNARY => 0; if (%valign_control_hash) { my $align_ok = $valign_control_hash{'?'}; $align_ok = $valign_control_default unless defined($align_ok); return unless ($align_ok); } my $jmax = @{$rfields} - 1; my $rfields_old = $old_line->{'rfields'}; my $rpatterns_old = $old_line->{'rpatterns'}; my $rtokens_old = $old_line->{'rtokens'}; my $maximum_field_index = $old_line->{'jmax'}; # look for the question mark after the : my ($jquestion); my $depth_question; my $pad = EMPTY_STRING; my $pad_length = 0; foreach my $j ( 0 .. $maximum_field_index - 1 ) { my $tok = $rtokens_old->[$j]; my ( $raw_tok, $lev, $tag, $tok_count ) = decode_alignment_token($tok); if ( $raw_tok eq '?' ) { $depth_question = $lev; # depth must be correct next unless ( $depth_question eq $group_level ); $jquestion = $j; if ( $rfields_old->[ $j + 1 ] =~ /^(\?\s*)/ ) { $pad_length = length($1); $pad = SPACE x $pad_length; } else { return; # shouldn't happen } last; } } return unless ( defined($jquestion) ); # shouldn't happen # Now splice the tokens and patterns of the previous line # into the else line to insure a match. Add empty fields # as necessary. 
my $jadd = $jquestion; # Work on copies of the actual arrays in case we have # to return due to an error my @fields = @{$rfields}; my @patterns = @{$rpatterns}; my @tokens = @{$rtokens}; my @field_lengths = @{$rfield_lengths}; EXPLAIN_TERNARY && do { local $LIST_SEPARATOR = '><'; print STDOUT "CURRENT FIELDS=<@{$rfields_old}>\n"; print STDOUT "CURRENT TOKENS=<@{$rtokens_old}>\n"; print STDOUT "CURRENT PATTERNS=<@{$rpatterns_old}>\n"; print STDOUT "UNMODIFIED FIELDS=<@{$rfields}>\n"; print STDOUT "UNMODIFIED TOKENS=<@{$rtokens}>\n"; print STDOUT "UNMODIFIED PATTERNS=<@{$rpatterns}>\n"; }; # handle cases of leading colon on this line if ( $fields[0] =~ /^(:\s*)(.*)$/ ) { my ( $colon, $therest ) = ( $1, $2 ); # Handle sub-case of first field with leading colon plus additional code # This is the usual situation as at the '1' below: # ... # : $year % 400 ? 0 # : 1; if ($therest) { # Split the first field after the leading colon and insert padding. # Note that this padding will remain even if the terminal value goes # out on a separate line. This does not seem to look too bad, so no # mechanism has been included to undo it. my $field1 = shift @fields; my $field_length1 = shift @field_lengths; my $len_colon = length($colon); unshift @fields, ( $colon, $pad . $therest ); unshift @field_lengths, ( $len_colon, $pad_length + $field_length1 - $len_colon ); # change the leading pattern from : to ? return unless ( $patterns[0] =~ s/^\:/?/ ); # install leading tokens and patterns of existing line unshift( @tokens, @{$rtokens_old}[ 0 .. $jquestion ] ); unshift( @patterns, @{$rpatterns_old}[ 0 .. $jquestion ] ); # insert appropriate number of empty fields splice( @fields, 1, 0, (EMPTY_STRING) x $jadd ) if $jadd; splice( @field_lengths, 1, 0, (0) x $jadd ) if $jadd; } # handle sub-case of first field just equal to leading colon. # This can happen for example in the example below where # the leading '(' would create a new alignment token # : ( $name =~ /[]}]$/ ) ? 
( $mname = $name ) # : ( $mname = $name . '->' ); else { return unless ( $jmax > 0 && $tokens[0] ne '#' ); # shouldn't happen # prepend a leading ? onto the second pattern $patterns[1] = "?b" . $patterns[1]; # pad the second field $fields[1] = $pad . $fields[1]; $field_lengths[1] = $pad_length + $field_lengths[1]; # install leading tokens and patterns of existing line, replacing # leading token and inserting appropriate number of empty fields splice( @tokens, 0, 1, @{$rtokens_old}[ 0 .. $jquestion ] ); splice( @patterns, 1, 0, @{$rpatterns_old}[ 1 .. $jquestion ] ); splice( @fields, 1, 0, (EMPTY_STRING) x $jadd ) if $jadd; splice( @field_lengths, 1, 0, (0) x $jadd ) if $jadd; } } # Handle case of no leading colon on this line. This will # be the case when -wba=':' is used. For example, # $year % 400 ? 0 : # 1; else { # install leading tokens and patterns of existing line $patterns[0] = '?' . 'b' . $patterns[0]; unshift( @tokens, @{$rtokens_old}[ 0 .. $jquestion ] ); unshift( @patterns, @{$rpatterns_old}[ 0 .. $jquestion ] ); # insert appropriate number of empty fields $jadd = $jquestion + 1; $fields[0] = $pad . $fields[0]; $field_lengths[0] = $pad_length + $field_lengths[0]; splice( @fields, 0, 0, (EMPTY_STRING) x $jadd ) if $jadd; splice( @field_lengths, 0, 0, (0) x $jadd ) if $jadd; } EXPLAIN_TERNARY && do { local $LIST_SEPARATOR = '><'; print STDOUT "MODIFIED TOKENS=<@tokens>\n"; print STDOUT "MODIFIED PATTERNS=<@patterns>\n"; print STDOUT "MODIFIED FIELDS=<@fields>\n"; }; # all ok .. 
update the arrays @{$rfields} = @fields; @{$rtokens} = @tokens; @{$rpatterns} = @patterns; @{$rfield_lengths} = @field_lengths; # force a flush after this line return $jquestion; } ## end sub fix_terminal_ternary sub fix_terminal_else { # Add empty fields as necessary to align a balanced terminal # else block to a previous if/elsif/unless block, # like this: # # if ( 1 || $x ) { print "ok 13\n"; } # else { print "not ok 13\n"; } # # returns a positive value if the else block should be indented # my ( $old_line, $rfields, $rtokens, $rpatterns, $rfield_lengths ) = @_; return unless ($old_line); my $jmax = @{$rfields} - 1; return unless ( $jmax > 0 ); if (%valign_control_hash) { my $align_ok = $valign_control_hash{'{'}; $align_ok = $valign_control_default unless defined($align_ok); return unless ($align_ok); } # check for balanced else block following if/elsif/unless my $rfields_old = $old_line->{'rfields'}; # TBD: add handling for 'case' return unless ( $rfields_old->[0] =~ /^(if|elsif|unless)\s*$/ ); # look for the opening brace after the else, and extract the depth my $tok_brace = $rtokens->[0]; my $depth_brace; if ( $tok_brace =~ /^\{(\d+)/ ) { $depth_brace = $1; } # probably: "else # side_comment" else { return } my $rpatterns_old = $old_line->{'rpatterns'}; my $rtokens_old = $old_line->{'rtokens'}; my $maximum_field_index = $old_line->{'jmax'}; # be sure the previous if/elsif is followed by an opening paren my $jparen = 0; my $tok_paren = '(' . $depth_brace; my $tok_test = $rtokens_old->[$jparen]; return unless ( $tok_test eq $tok_paren ); # shouldn't happen # Now find the opening block brace my ($jbrace); foreach my $j ( 1 .. $maximum_field_index - 1 ) { my $tok = $rtokens_old->[$j]; if ( $tok eq $tok_brace ) { $jbrace = $j; last; } } return unless ( defined($jbrace) ); # shouldn't happen # Now splice the tokens and patterns of the previous line # into the else line to insure a match. Add empty fields # as necessary. 
my $jadd = $jbrace - $jparen; splice( @{$rtokens}, 0, 0, @{$rtokens_old}[ $jparen .. $jbrace - 1 ] ); splice( @{$rpatterns}, 1, 0, @{$rpatterns_old}[ $jparen + 1 .. $jbrace ] ); splice( @{$rfields}, 1, 0, (EMPTY_STRING) x $jadd ); splice( @{$rfield_lengths}, 1, 0, (0) x $jadd ); # force a flush after this line if it does not follow a case if ( $rfields_old->[0] =~ /^case\s*$/ ) { return } else { return $jbrace } } ## end sub fix_terminal_else my %is_closing_block_type; BEGIN { my @q = qw< } ] >; @is_closing_block_type{@q} = (1) x scalar(@q); } # This is a flag for testing alignment by sub sweep_left_to_right only. # This test can help find problems with the alignment logic. # This flag should normally be zero. use constant TEST_SWEEP_ONLY => 0; use constant EXPLAIN_CHECK_MATCH => 0; sub check_match { # See if the current line matches the current vertical alignment group. my ( $self, $new_line, $base_line, $prev_line ) = @_; # Given: # $new_line = the line being considered for group inclusion # $base_line = the first line of the current group # $prev_line = the line just before $new_line # returns a flag and a value as follows: # return (0, $imax_align) if the line does not match # return (1, $imax_align) if the line matches but does not fit # return (2, $imax_align) if the line matches and fits use constant NO_MATCH => 0; use constant MATCH_NO_FIT => 1; use constant MATCH_AND_FIT => 2; my $return_value; # Returns '$imax_align' which is the index of the maximum matching token. # It will be used in the subsequent left-to-right sweep to align as many # tokens as possible for lines which partially match. 
my $imax_align = -1; # variable $GoToMsg explains reason for no match, for debugging my $GoToMsg = EMPTY_STRING; my $jmax = $new_line->{'jmax'}; my $maximum_field_index = $base_line->{'jmax'}; my $jlimit = $jmax - 2; if ( $jmax > $maximum_field_index ) { $jlimit = $maximum_field_index - 2; } if ( $new_line->{'is_hanging_side_comment'} ) { # HSC's can join the group if they fit } # Everything else else { # A group with hanging side comments ends with the first non-hanging # side comment. if ( $base_line->{'is_hanging_side_comment'} ) { $GoToMsg = "end of hanging side comments"; $return_value = NO_MATCH; } else { # The number of tokens that this line shares with the previous # line has been stored with the previous line. This value was # calculated and stored by sub 'match_line_pair'. $imax_align = $prev_line->{'imax_pair'}; if ( $imax_align != $jlimit ) { $GoToMsg = "Not all tokens match: $imax_align != $jlimit\n"; $return_value = NO_MATCH; } } } if ( !defined($return_value) ) { # The tokens match, but the lines must have an identical number of # tokens to join the group. if ( $maximum_field_index != $jmax ) { $GoToMsg = "token count differs"; $return_value = NO_MATCH; } # The tokens match. Now see if there is space for this line in the # current group. elsif ( $self->check_fit( $new_line, $base_line ) && !TEST_SWEEP_ONLY ) { $GoToMsg = "match and fit, imax_align=$imax_align, jmax=$jmax\n"; $return_value = MATCH_AND_FIT; $imax_align = $jlimit; } else { $GoToMsg = "match but no fit, imax_align=$imax_align, jmax=$jmax\n"; $return_value = MATCH_NO_FIT; $imax_align = $jlimit; } } EXPLAIN_CHECK_MATCH && print "returning $return_value because $GoToMsg, max match index =i $imax_align, jmax=$jmax\n"; return ( $return_value, $imax_align ); } ## end sub check_match sub check_fit { my ( $self, $new_line, $old_line ) = @_; # The new line has alignments identical to the current group. 
Now we have # to fit the new line into the group without causing a field to exceed the # line length limit. # return true if successful # return false if not successful my $jmax = $new_line->{'jmax'}; my $leading_space_count = $new_line->{'leading_space_count'}; my $rfield_lengths = $new_line->{'rfield_lengths'}; my $padding_available = $old_line->get_available_space_on_right(); my $jmax_old = $old_line->{'jmax'}; # Safety check ... only lines with equal array sizes should arrive here # from sub check_match. So if this error occurs, look at recent changes in # sub check_match. It is only supposed to check the fit of lines with # identical numbers of alignment tokens. if ( $jmax_old ne $jmax ) { warning(<{'ralignments'} }; foreach my $alignment (@alignments) { $alignment->save_column(); } # Loop over all alignments ... for my $j ( 0 .. $jmax ) { my $pad = $rfield_lengths->[$j] - $old_line->current_field_width($j); if ( $j == 0 ) { $pad += $leading_space_count; } # Keep going if this field does not need any space. next if ( $pad < 0 ); # Revert to the starting state if does not fit if ( $pad > $padding_available ) { #---------------------------------------------- # Line does not fit -- revert to starting state #---------------------------------------------- foreach my $alignment (@alignments) { $alignment->restore_column(); } return; } # make room for this field $old_line->increase_field_width( $j, $pad ); $padding_available -= $pad; } #------------------------------------- # The line fits, the match is accepted #------------------------------------- return 1; } ## end sub check_fit sub install_new_alignments { my ($new_line) = @_; my $jmax = $new_line->{'jmax'}; my $rfield_lengths = $new_line->{'rfield_lengths'}; my $col = $new_line->{'leading_space_count'}; my @alignments; for my $j ( 0 .. 
$jmax ) { $col += $rfield_lengths->[$j]; # create initial alignments for the new group my $alignment = Perl::Tidy::VerticalAligner::Alignment->new( { column => $col } ); push @alignments, $alignment; } $new_line->{'ralignments'} = \@alignments; return; } ## end sub install_new_alignments sub copy_old_alignments { my ( $new_line, $old_line ) = @_; my @new_alignments = @{ $old_line->{'ralignments'} }; $new_line->{'ralignments'} = \@new_alignments; return; } ## end sub copy_old_alignments sub dump_array { # debug routine to dump array contents local $LIST_SEPARATOR = ')('; print STDOUT "(@_)\n"; return; } ## end sub dump_array sub level_change { # compute decrease in level when we remove $diff spaces from the # leading spaces my ( $self, $leading_space_count, $diff, $level ) = @_; my $rOpts_indent_columns = $self->[_rOpts_indent_columns_]; if ($rOpts_indent_columns) { my $olev = int( ( $leading_space_count + $diff ) / $rOpts_indent_columns ); my $nlev = int( $leading_space_count / $rOpts_indent_columns ); $level -= ( $olev - $nlev ); if ( $level < 0 ) { $level = 0 } } return $level; } ## end sub level_change ############################################### # CODE SECTION 4: Code to process comment lines ############################################### sub _flush_comment_lines { # Output a group consisting of COMMENT lines my ($self) = @_; my $rgroup_lines = $self->[_rgroup_lines_]; return unless ( @{$rgroup_lines} ); my $group_level = $self->[_group_level_]; my $group_maximum_line_length = $self->[_group_maximum_line_length_]; my $leading_space_count = $self->[_comment_leading_space_count_]; ## my $leading_string = ## $self->get_leading_string( $leading_space_count, $group_level ); # look for excessively long lines my $max_excess = 0; foreach my $item ( @{$rgroup_lines} ) { my ( $str, $str_len ) = @{$item}; my $excess = $str_len + $leading_space_count - $group_maximum_line_length; if ( $excess > $max_excess ) { $max_excess = $excess; } } # zero leading space count if 
any lines are too long if ( $max_excess > 0 ) { $leading_space_count -= $max_excess; if ( $leading_space_count < 0 ) { $leading_space_count = 0 } my $file_writer_object = $self->[_file_writer_object_]; my $last_outdented_line_at = $file_writer_object->get_output_line_number(); my $nlines = @{$rgroup_lines}; $self->[_last_outdented_line_at_] = $last_outdented_line_at + $nlines - 1; my $outdented_line_count = $self->[_outdented_line_count_]; unless ($outdented_line_count) { $self->[_first_outdented_line_at_] = $last_outdented_line_at; } $outdented_line_count += $nlines; $self->[_outdented_line_count_] = $outdented_line_count; } # write the lines my $outdent_long_lines = 0; foreach my $item ( @{$rgroup_lines} ) { my ( $str, $str_len, $Kend ) = @{$item}; $self->valign_output_step_B( { leading_space_count => $leading_space_count, line => $str, line_length => $str_len, side_comment_length => 0, outdent_long_lines => $outdent_long_lines, rvertical_tightness_flags => undef, level => $group_level, level_end => $group_level, Kend => $Kend, maximum_line_length => $group_maximum_line_length, } ); } $self->initialize_for_new_group(); return; } ## end sub _flush_comment_lines ###################################################### # CODE SECTION 5: Code to process groups of code lines ###################################################### sub _flush_group_lines { # This is the vertical aligner internal flush, which leaves the cache # intact my ( $self, $level_jump ) = @_; # $level_jump = $next_level-$group_level, if known # = undef if not known # Note: only the sign of the jump is needed my $rgroup_lines = $self->[_rgroup_lines_]; return unless ( @{$rgroup_lines} ); my $group_type = $self->[_group_type_]; my $group_level = $self->[_group_level_]; # Debug 0 && do { my ( $a, $b, $c ) = caller(); my $nlines = @{$rgroup_lines}; print STDOUT "APPEND0: _flush_group_lines called from $a $b $c lines=$nlines, type=$group_type \n"; }; #------------------------------------------- # Section 
1: Handle a group of COMMENT lines #------------------------------------------- if ( $group_type eq 'COMMENT' ) { $self->_flush_comment_lines(); return; } #------------------------------------------------------------------------ # Section 2: Handle line(s) of CODE. Most of the actual work of vertical # aligning happens here in the following steps: #------------------------------------------------------------------------ # STEP 1: Remove most unmatched tokens. They block good alignments. my ( $max_lev_diff, $saw_side_comment ) = delete_unmatched_tokens( $rgroup_lines, $group_level ); # STEP 2: Sweep top to bottom, forming subgroups of lines with exactly # matching common alignments. The indexes of these subgroups are in the # return variable. my $rgroups = $self->sweep_top_down( $rgroup_lines, $group_level ); # STEP 3: Sweep left to right through the lines, looking for leading # alignment tokens shared by groups. sweep_left_to_right( $rgroup_lines, $rgroups, $group_level ) if ( @{$rgroups} > 1 ); # STEP 4: Move side comments to a common column if possible. if ($saw_side_comment) { $self->align_side_comments( $rgroup_lines, $rgroups ); } # STEP 5: For the -lp option, increase the indentation of lists # to the desired amount, but do not exceed the line length limit. # We are allowed to shift a group of lines to the right if: # (1) its level is greater than the level of the previous group, and # (2) its level is greater than the level of the next line to be written. my $extra_indent_ok; if ( $group_level > $self->[_last_level_written_] ) { # Use the level jump to next line to come, if given if ( defined($level_jump) ) { $extra_indent_ok = $level_jump < 0; } # Otherwise, assume the next line has the level of the end of last line. # This fixes case c008. else { my $level_end = $rgroup_lines->[-1]->{'level_end'}; $extra_indent_ok = $group_level > $level_end; } } my $extra_leading_spaces = $extra_indent_ok ? 
get_extra_leading_spaces( $rgroup_lines, $rgroups ) : 0; # STEP 6: Output the lines. # All lines in this group have the same leading spacing and maximum line # length my $group_leader_length = $rgroup_lines->[0]->{'leading_space_count'}; my $group_maximum_line_length = $rgroup_lines->[0]->{'maximum_line_length'}; foreach my $line ( @{$rgroup_lines} ) { $self->valign_output_step_A( { line => $line, min_ci_gap => 0, do_not_align => 0, group_leader_length => $group_leader_length, extra_leading_spaces => $extra_leading_spaces, level => $group_level, maximum_line_length => $group_maximum_line_length, } ); } # Let the formatter know that this object has been processed and any # recoverable spaces have been handled. This is needed for setting the # closing paren location in -lp mode. my $object = $rgroup_lines->[0]->{'indentation'}; if ( ref($object) ) { $object->set_recoverable_spaces(0) } $self->initialize_for_new_group(); return; } ## end sub _flush_group_lines { ## closure for sub sweep_top_down my $rall_lines; # all of the lines my $grp_level; # level of all lines my $rgroups; # describes the partition of lines we will make here my $group_line_count; # number of lines in current partition BEGIN { $rgroups = [] } sub initialize_for_new_rgroup { $group_line_count = 0; return; } sub add_to_rgroup { my ($jend) = @_; my $rline = $rall_lines->[$jend]; my $jbeg = $jend; if ( $group_line_count == 0 ) { install_new_alignments($rline); } else { my $rvals = pop @{$rgroups}; $jbeg = $rvals->[0]; copy_old_alignments( $rline, $rall_lines->[$jbeg] ); } push @{$rgroups}, [ $jbeg, $jend, undef ]; $group_line_count++; return; } ## end sub add_to_rgroup sub get_rgroup_jrange { return unless @{$rgroups}; return unless ( $group_line_count > 0 ); my ( $jbeg, $jend ) = @{ $rgroups->[-1] }; return ( $jbeg, $jend ); } ## end sub get_rgroup_jrange sub end_rgroup { my ($imax_align) = @_; return unless @{$rgroups}; return unless ( $group_line_count > 0 ); my ( $jbeg, $jend ) = @{ pop 
@{$rgroups} }; push @{$rgroups}, [ $jbeg, $jend, $imax_align ]; # Undo some alignments of poor two-line combinations. # We had to wait until now to know the line count. if ( $jend - $jbeg == 1 ) { my $line_0 = $rall_lines->[$jbeg]; my $line_1 = $rall_lines->[$jend]; my $imax_pair = $line_1->{'imax_pair'}; if ( $imax_pair > $imax_align ) { $imax_align = $imax_pair } ## flag for possible future use: ## my $is_isolated_pair = $imax_pair < 0 ## && ( $jbeg == 0 ## || $rall_lines->[ $jbeg - 1 ]->{'imax_pair'} < 0 ); my $imax_prev = $jbeg > 0 ? $rall_lines->[ $jbeg - 1 ]->{'imax_pair'} : -1; my ( $is_marginal, $imax_align_fix ) = is_marginal_match( $line_0, $line_1, $grp_level, $imax_align, $imax_prev ); if ($is_marginal) { combine_fields( $line_0, $line_1, $imax_align_fix ); } } initialize_for_new_rgroup(); return; } ## end sub end_rgroup sub block_penultimate_match { # emergency reset to prevent sweep_left_to_right from trying to match a # failed terminal else match return unless @{$rgroups} > 1; $rgroups->[-2]->[2] = -1; return; } ## end sub block_penultimate_match sub sweep_top_down { my ( $self, $rlines, $group_level ) = @_; # Partition the set of lines into final alignment subgroups # and store the alignments with the lines. # The alignment subgroups we are making here are groups of consecutive # lines which have (1) identical alignment tokens and (2) do not # exceed the allowable maximum line length. A later sweep from # left-to-right ('sweep_lr') will handle additional alignments. # transfer args to closure variables $rall_lines = $rlines; $grp_level = $group_level; $rgroups = []; initialize_for_new_rgroup(); return unless @{$rlines}; # shouldn't happen # Unset the _end_group flag for the last line if it is set because it # is not needed and can cause problems for -lp formatting $rall_lines->[-1]->{'end_group'} = 0; # Loop over all lines ... 
my $jline = -1; foreach my $new_line ( @{$rall_lines} ) { $jline++; # Start a new subgroup if necessary if ( !$group_line_count ) { add_to_rgroup($jline); if ( $new_line->{'end_group'} ) { end_rgroup(-1); } next; } my $j_terminal_match = $new_line->{'j_terminal_match'}; my ( $jbeg, $jend ) = get_rgroup_jrange(); if ( !defined($jbeg) ) { # safety check, shouldn't happen warning(<[$jbeg]; # Initialize a global flag saying if the last line of the group # should match end of group and also terminate the group. There # should be no returns between here and where the flag is handled # at the bottom. my $col_matching_terminal = 0; if ( defined($j_terminal_match) ) { # remember the column of the terminal ? or { to match with $col_matching_terminal = $base_line->get_column($j_terminal_match); # Ignore an undefined value as a defensive step; shouldn't # normally happen. $col_matching_terminal = 0 unless defined($col_matching_terminal); } # ------------------------------------------------------------- # Allow hanging side comment to join current group, if any. The # only advantage is to keep the other tokens in the same group. For # example, this would make the '=' align here: # $ax = 1; # side comment # # hanging side comment # $boondoggle = 5; # side comment # $beetle = 5; # side comment # here is another example.. # _rtoc_name_count => {}, # hash to track .. # _rpackage_stack => [], # stack to check .. # # name changes # _rlast_level => \$last_level, # brace indentation # # # If this were not desired, the next step could be skipped. # ------------------------------------------------------------- if ( $new_line->{'is_hanging_side_comment'} ) { join_hanging_comment( $new_line, $base_line ); } # If this line has no matching tokens, then flush out the lines # BEFORE this line unless both it and the previous line have side # comments. This prevents this line from pushing side comments out # to the right. 
elsif ( $new_line->{'jmax'} == 1 ) { # There are no matching tokens, so now check side comments. # Programming note: accessing arrays with index -1 is # risky in Perl, but we have verified there is at least one # line in the group and that there is at least one field. my $prev_comment = $rall_lines->[ $jline - 1 ]->{'rfields'}->[-1]; my $side_comment = $new_line->{'rfields'}->[-1]; end_rgroup(-1) unless ( $side_comment && $prev_comment ); } # See if the new line matches and fits the current group, # if it still exists. Flush the current group if not. my $match_code; if ($group_line_count) { ( $match_code, my $imax_align ) = $self->check_match( $new_line, $base_line, $rall_lines->[ $jline - 1 ] ); if ( $match_code != 2 ) { end_rgroup($imax_align) } } # Store the new line add_to_rgroup($jline); if ( defined($j_terminal_match) ) { # Decide if we should fix a terminal match. We can either: # 1. fix it and prevent the sweep_lr from changing it, or # 2. leave it alone and let sweep_lr try to fix it. # The current logic is to fix it if: # -it has not joined to previous lines, # -and either the previous subgroup has just 1 line, or # -this line matched but did not fit (so sweep won't work) my $fixit; if ( $group_line_count == 1 ) { $fixit ||= $match_code; if ( !$fixit ) { if ( @{$rgroups} > 1 ) { my ( $jbegx, $jendx ) = @{ $rgroups->[-2] }; my $nlines = $jendx - $jbegx + 1; $fixit ||= $nlines <= 1; } } } if ($fixit) { $base_line = $new_line; my $col_now = $base_line->get_column($j_terminal_match); # Ignore an undefined value as a defensive step; shouldn't # normally happen. 
$col_now = 0 unless defined($col_now); my $pad = $col_matching_terminal - $col_now; my $padding_available = $base_line->get_available_space_on_right(); if ( $col_now && $pad > 0 && $pad <= $padding_available ) { $base_line->increase_field_width( $j_terminal_match, $pad ); } # do not let sweep_left_to_right change an isolated 'else' if ( !$new_line->{'is_terminal_ternary'} ) { block_penultimate_match(); } } end_rgroup(-1); } # end the group if we know we cannot match next line. elsif ( $new_line->{'end_group'} ) { end_rgroup(-1); } } ## end loop over lines end_rgroup(-1); return ($rgroups); } ## end sub sweep_top_down } sub two_line_pad { my ( $line_m, $line, $imax_min ) = @_; # Given: # two isolated (list) lines # imax_min = number of common alignment tokens # Return: # $pad_max = maximum suggested pad distance # = 0 if alignment not recommended # Note that this is only for two lines which do not have alignment tokens # in common with any other lines. It is intended for lists, but it might # also be used for two non-list lines with a common leading '='. # Allow alignment if the difference in the two unpadded line lengths # is not more than either line length. The idea is to avoid # aligning lines with very different field lengths, like these two: # [ # 'VARCHAR', DBI::SQL_VARCHAR, undef, "'", "'", undef, 0, 1, # 1, 0, 0, 0, undef, 0, 0 # ]; my $rfield_lengths = $line->{'rfield_lengths'}; my $rfield_lengths_m = $line_m->{'rfield_lengths'}; # Safety check - shouldn't happen return 0 unless $imax_min < @{$rfield_lengths} && $imax_min < @{$rfield_lengths_m}; my $lensum_m = 0; my $lensum = 0; foreach my $i ( 0 .. $imax_min ) { $lensum_m += $rfield_lengths_m->[$i]; $lensum += $rfield_lengths->[$i]; } my ( $lenmin, $lenmax ) = $lensum >= $lensum_m ? 
( $lensum_m, $lensum ) : ( $lensum, $lensum_m ); my $patterns_match; if ( $line_m->{'list_type'} && $line->{'list_type'} ) { $patterns_match = 1; my $rpatterns_m = $line_m->{'rpatterns'}; my $rpatterns = $line->{'rpatterns'}; foreach my $i ( 0 .. $imax_min ) { my $pat = $rpatterns->[$i]; my $pat_m = $rpatterns_m->[$i]; if ( $pat ne $pat_m ) { $patterns_match = 0; last } } } my $pad_max = $lenmax; if ( !$patterns_match && $lenmax > 2 * $lenmin ) { $pad_max = 0 } return $pad_max; } ## end sub two_line_pad sub sweep_left_to_right { my ( $rlines, $rgroups, $group_level ) = @_; # So far we have divided the lines into groups having an equal number of # identical alignments. Here we are going to look for common leading # alignments between the different groups and align them when possible. # For example, the three lines below are in three groups because each line # has a different number of commas. In this routine we will sweep from # left to right, aligning the leading commas as we go, but stopping if we # hit the line length limit. # my ( $num, $numi, $numj, $xyza, $ka, $xyzb, $kb, $aff, $error ); # my ( $i, $j, $error, $aff, $asum, $avec ); # my ( $km, $area, $varea ); # nothing to do if just one group my $ng_max = @{$rgroups} - 1; return unless ( $ng_max > 0 ); #--------------------------------------------------------------------- # Step 1: Loop over groups to find all common leading alignment tokens #--------------------------------------------------------------------- my $line; my $rtokens; my $imax; # index of maximum non-side-comment alignment token my $istop; # an optional stopping index my $jbeg; # starting line index my $jend; # ending line index my $line_m; my $rtokens_m; my $imax_m; my $istop_m; my $jbeg_m; my $jend_m; my $istop_mm; # Look at neighboring pairs of groups and form a simple list # of all common leading alignment tokens. For each such match we # store [$i, $ng], where # $i = index of the token in the line (0,1,...) 
# $ng is the second of the two groups with this common token my @icommon; # Hash to hold the maximum alignment change for any group my %max_move; # a small number of columns my $short_pad = 4; my $ng = -1; foreach my $item ( @{$rgroups} ) { $ng++; $istop_mm = $istop_m; # save _m values of previous group $line_m = $line; $rtokens_m = $rtokens; $imax_m = $imax; $istop_m = $istop; $jbeg_m = $jbeg; $jend_m = $jend; # Get values for this group. Note that we just have to use values for # one of the lines of the group since all members have the same # alignments. ( $jbeg, $jend, $istop ) = @{$item}; $line = $rlines->[$jbeg]; $rtokens = $line->{'rtokens'}; $imax = $line->{'jmax'} - 2; $istop = -1 unless ( defined($istop) ); $istop = $imax if ( $istop > $imax ); # Initialize on first group next if ( $ng == 0 ); # Use the minimum index limit of the two groups my $imax_min = $imax > $imax_m ? $imax_m : $imax; # Also impose a limit if given. if ( $istop_m < $imax_min ) { $imax_min = $istop_m; } # Special treatment of two one-line groups isolated from other lines, # unless they form a simple list or a terminal match. Otherwise the # alignment can look strange in some cases. my $list_type = $rlines->[$jbeg]->{'list_type'}; if ( $jend == $jbeg && $jend_m == $jbeg_m && ( $ng == 1 || $istop_mm < 0 ) && ( $ng == $ng_max || $istop < 0 ) && !$line->{'j_terminal_match'} # Only do this for imperfect matches. This is normally true except # when two perfect matches cannot form a group because the line # length limit would be exceeded. In that case we can still try # to match as many alignments as possible. && ( $imax != $imax_m || $istop_m != $imax_m ) ) { # We will just align assignments and simple lists next unless ( $imax_min >= 0 ); next unless ( $rtokens->[0] =~ /^=\d/ || $list_type ); # In this case we will limit padding to a short distance. This # is a compromise to keep some vertical alignment but prevent large # gaps, which do not look good for just two lines. 
my $pad_max = two_line_pad( $rlines->[$jbeg], $rlines->[$jbeg_m], $imax_min ); next unless ($pad_max); my $ng_m = $ng - 1; $max_move{"$ng_m"} = $pad_max; $max_move{"$ng"} = $pad_max; } # Loop to find all common leading tokens. if ( $imax_min >= 0 ) { foreach my $i ( 0 .. $imax_min ) { my $tok = $rtokens->[$i]; my $tok_m = $rtokens_m->[$i]; last if ( $tok ne $tok_m ); push @icommon, [ $i, $ng, $tok ]; } } } return unless @icommon; #---------------------------------------------------------- # Step 2: Reorder and consolidate the list into a task list #---------------------------------------------------------- # We have to work first from lowest token index to highest, then by group, # sort our list first on token index then group number @icommon = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @icommon; # Make a task list of the form # [$i, ng_beg, $ng_end, $tok], .. # where # $i is the index of the token to be aligned # $ng_beg..$ng_end is the group range for this action my @todo; my ( $i, $ng_end, $tok ); foreach my $item (@icommon) { my $ng_last = $ng_end; my $i_last = $i; ( $i, $ng_end, $tok ) = @{$item}; my $ng_beg = $ng_end - 1; if ( defined($ng_last) && $ng_beg == $ng_last && $i == $i_last ) { my $var = pop(@todo); $ng_beg = $var->[1]; } my ( $raw_tok, $lev, $tag, $tok_count ) = decode_alignment_token($tok); push @todo, [ $i, $ng_beg, $ng_end, $raw_tok, $lev ]; } #------------------------------ # Step 3: Execute the task list #------------------------------ do_left_to_right_sweep( $rlines, $rgroups, \@todo, \%max_move, $short_pad, $group_level ); return; } ## end sub sweep_left_to_right { ## closure for sub do_left_to_right_sweep my %is_good_alignment_token; BEGIN { # One of the most difficult aspects of vertical alignment is knowing # when not to align. Alignment can go from looking very nice to very # bad when overdone. 
In the sweep algorithm there are two special # cases where we may need to limit padding to a '$short_pad' distance # to avoid some very ugly formatting: # 1. Two isolated lines with partial alignment # 2. A 'tail-wag-dog' situation, in which a single terminal # line with partial alignment could cause a significant pad # increase in many previous lines if allowed to join the alignment. # For most alignment tokens, we will allow only a small pad to be # introduced (the hardwired $short_pad variable) . But for some 'good' # alignments we can be less restrictive. # These are 'good' alignments, which are allowed more padding: my @q = qw( => = ? if unless or || { ); push @q, ','; @is_good_alignment_token{@q} = (0) x scalar(@q); # Promote a few of these to 'best', with essentially no pad limit: $is_good_alignment_token{'='} = 1; $is_good_alignment_token{'if'} = 1; $is_good_alignment_token{'unless'} = 1; $is_good_alignment_token{'=>'} = 1 # Note the hash values are set so that: # if ($is_good_alignment_token{$raw_tok}) => best # if defined ($is_good_alignment_token{$raw_tok}) => good or best } ## end BEGIN sub move_to_common_column { # This is a sub called by sub do_left_to_right_sweep to # move the alignment column of token $itok to $col_want for a # sequence of groups. my ( $rlines, $rgroups, $rmax_move, $ngb, $nge, $itok, $col_want, $raw_tok ) = @_; return unless ( defined($ngb) && $nge > $ngb ); foreach my $ng ( $ngb .. 
$nge ) { my ( $jbeg, $jend ) = @{ $rgroups->[$ng] }; my $line = $rlines->[$jbeg]; my $col = $line->get_column($itok); my $move = $col_want - $col; if ( $move > 0 ) { # limit padding increase in isolated two lines next if ( defined( $rmax_move->{$ng} ) && $move > $rmax_move->{$ng} && !$is_good_alignment_token{$raw_tok} ); $line->increase_field_width( $itok, $move ); } elsif ( $move < 0 ) { # spot to take special action on failure to move } } return; } ## end sub move_to_common_column sub do_left_to_right_sweep { my ( $rlines, $rgroups, $rtodo, $rmax_move, $short_pad, $group_level ) = @_; # $blocking_level[$ng] is the level at a match failure between groups # $ng-1 and $ng my @blocking_level; my $group_list_type = $rlines->[0]->{'list_type'}; foreach my $task ( @{$rtodo} ) { my ( $itok, $ng_beg, $ng_end, $raw_tok, $lev ) = @{$task}; # Nothing to do for a single group next unless ( $ng_end > $ng_beg ); my $ng_first; # index of the first group of a continuous sequence my $col_want; # the common alignment column of a sequence of groups my $col_limit; # maximum column before bumping into max line length my $line_count_ng_m = 0; my $jmax_m; my $it_stop_m; # Loop over the groups # 'ix_' = index in the array of lines # 'ng_' = index in the array of groups # 'it_' = index in the array of tokens my $ix_min = $rgroups->[$ng_beg]->[0]; my $ix_max = $rgroups->[$ng_end]->[1]; my $lines_total = $ix_max - $ix_min + 1; foreach my $ng ( $ng_beg .. $ng_end ) { my ( $ix_beg, $ix_end, $it_stop ) = @{ $rgroups->[$ng] }; my $line_count_ng = $ix_end - $ix_beg + 1; # Important: note that since all lines in a group have a common # alignments object, we just have to work on one of the lines # (the first line). All of the rest will be changed # automatically.
my $line = $rlines->[$ix_beg]; my $jmax = $line->{'jmax'}; # the maximum space without exceeding the line length: my $avail = $line->get_available_space_on_right(); my $col = $line->get_column($itok); my $col_max = $col + $avail; # Initialize on first group if ( !defined($col_want) ) { $ng_first = $ng; $col_want = $col; $col_limit = $col_max; $line_count_ng_m = $line_count_ng; $jmax_m = $jmax; $it_stop_m = $it_stop; next; } # RULE: Throw a blocking flag upon encountering a token level # different from the level of the first blocking token. For # example, in the following code, if the = matches get # blocked between two groups as shown, then we want to start # blocking matches at the commas, which are at a deeper level, so # that we do not get the big gaps shown here: # my $unknown3 = pack( "v", -2 ); # my $unknown4 = pack( "v", 0x09 ); # my $unknown5 = pack( "VVV", 0x06, 0x00, 0x00 ); # my $num_bbd_blocks = pack( "V", $num_lists ); # my $root_startblock = pack( "V", $root_start ); # my $unknown6 = pack( "VV", 0x00, 0x1000 ); # On the other hand, it is okay to keep matching at the same # level such as in a simple list of commas and/or fat commas. my $is_blocked = defined( $blocking_level[$ng] ) && $lev > $blocking_level[$ng]; # TAIL-WAG-DOG RULE: prevent a 'tail-wag-dog' syndrome, meaning: # Do not let one or two lines with a **different number of # alignments** open up a big gap in a large block. For # example, we will prevent something like this, where the first # line pries open the rest: # $worksheet->write( "B7", "http://www.perl.com", undef, $format ); # $worksheet->write( "C7", "", $format ); # $worksheet->write( "D7", "", $format ); # $worksheet->write( "D8", "", $format ); # $worksheet->write( "D8", "", $format ); # We should exclude from consideration two groups which are # effectively the same but separated because one does not # fit in the maximum allowed line length.
my $is_same_group = $jmax == $jmax_m && $it_stop_m == $jmax_m - 2; my $lines_above = $ix_beg - $ix_min; my $lines_below = $lines_total - $lines_above; # Increase the tolerable gap for certain favorable factors my $factor = 1; my $top_level = $lev == $group_level; # Align best top level alignment tokens like '=', 'if', ... # A factor of 10 allows a gap of up to 40 spaces if ( $top_level && $is_good_alignment_token{$raw_tok} ) { $factor = 10; } # Otherwise allow some minimal padding of good alignments elsif ( defined( $is_good_alignment_token{$raw_tok} ) # We have to be careful if there are just 2 lines. This # two-line factor allows large gaps only for 2 lines which # are simple lists with fewer items on the second line. It # gives results similar to previous versions of perltidy. && ( $lines_total > 2 || $group_list_type && $jmax < $jmax_m && $top_level ) ) { $factor += 1; if ($top_level) { $factor += 1; } } my $is_big_gap; if ( !$is_same_group ) { $is_big_gap ||= ( $lines_above == 1 || $lines_above == 2 && $lines_below >= 4 ) && $col_want > $col + $short_pad * $factor; $is_big_gap ||= ( $lines_below == 1 || $lines_below == 2 && $lines_above >= 4 ) && $col > $col_want + $short_pad * $factor; } # if match is limited by gap size, stop aligning at this level if ($is_big_gap) { $blocking_level[$ng] = $lev - 1; } # quit and restart if it cannot join this batch if ( $col_want > $col_max || $col > $col_limit || $is_big_gap || $is_blocked ) { # remember the level of the first blocking token if ( !defined( $blocking_level[$ng] ) ) { $blocking_level[$ng] = $lev; } move_to_common_column( $rlines, $rgroups, $rmax_move, $ng_first, $ng - 1, $itok, $col_want, $raw_tok ); $ng_first = $ng; $col_want = $col; $col_limit = $col_max; $line_count_ng_m = $line_count_ng; $jmax_m = $jmax; $it_stop_m = $it_stop; next; } $line_count_ng_m += $line_count_ng; # update the common column and limit if ( $col > $col_want ) { $col_want = $col } if ( $col_max < $col_limit ) { $col_limit = $col_max } 
} ## end loop over groups if ( $ng_end > $ng_first ) { move_to_common_column( $rlines, $rgroups, $rmax_move, $ng_first, $ng_end, $itok, $col_want, $raw_tok ); } ## end loop over groups for one task } ## end loop over tasks return; } ## end sub do_left_to_right_sweep } sub delete_selected_tokens { my ( $line_obj, $ridel ) = @_; # $line_obj is the line to be modified # $ridel is a ref to list of indexes to be deleted # remove unused alignment token(s) to improve alignment chances return unless ( defined($line_obj) && defined($ridel) && @{$ridel} ); my $jmax_old = $line_obj->{'jmax'}; my $rfields_old = $line_obj->{'rfields'}; my $rfield_lengths_old = $line_obj->{'rfield_lengths'}; my $rpatterns_old = $line_obj->{'rpatterns'}; my $rtokens_old = $line_obj->{'rtokens'}; my $j_terminal_match = $line_obj->{'j_terminal_match'}; use constant EXPLAIN_DELETE_SELECTED => 0; local $LIST_SEPARATOR = '> <'; EXPLAIN_DELETE_SELECTED && print <<EOM; old jmax: $jmax_old old tokens: <@{$rtokens_old}> old patterns: <@{$rpatterns_old}> old fields: <@{$rfields_old}> old field_lengths: <@{$rfield_lengths_old}> EOM my $rfields_new = []; my $rpatterns_new = []; my $rtokens_new = []; my $rfield_lengths_new = []; # Convert deletion list to a hash to allow any order, multiple entries, # and avoid problems with index values out of range my %delete_me; @delete_me{ @{$ridel} } = (1) x scalar( @{$ridel} ); my $pattern_0 = $rpatterns_old->[0]; my $field_0 = $rfields_old->[0]; my $field_length_0 = $rfield_lengths_old->[0]; push @{$rfields_new}, $field_0; push @{$rfield_lengths_new}, $field_length_0; push @{$rpatterns_new}, $pattern_0; # Loop to either copy items or concatenate fields and patterns my $jmin_del; foreach my $j ( 0 ..
$jmax_old - 1 ) { my $token = $rtokens_old->[$j]; my $field = $rfields_old->[ $j + 1 ]; my $field_length = $rfield_lengths_old->[ $j + 1 ]; my $pattern = $rpatterns_old->[ $j + 1 ]; if ( !$delete_me{$j} ) { push @{$rtokens_new}, $token; push @{$rfields_new}, $field; push @{$rpatterns_new}, $pattern; push @{$rfield_lengths_new}, $field_length; } else { if ( !defined($jmin_del) ) { $jmin_del = $j } $rfields_new->[-1] .= $field; $rfield_lengths_new->[-1] += $field_length; $rpatterns_new->[-1] .= $pattern; } } # ----- x ------ x ------ x ------ #t 0 1 2 <- token indexing #f 0 1 2 3 <- field and pattern my $jmax_new = @{$rfields_new} - 1; $line_obj->{'rtokens'} = $rtokens_new; $line_obj->{'rpatterns'} = $rpatterns_new; $line_obj->{'rfields'} = $rfields_new; $line_obj->{'rfield_lengths'} = $rfield_lengths_new; $line_obj->{'jmax'} = $jmax_new; # The value of j_terminal_match will be incorrect if we delete tokens prior # to it. We will have to give up on aligning the terminal tokens if this # happens. if ( defined($j_terminal_match) && $jmin_del <= $j_terminal_match ) { $line_obj->{'j_terminal_match'} = undef; } # update list type if ( $line_obj->{'list_seqno'} ) { ## This works, but for efficiency see if we need to make a change: ## decide_if_list($line_obj); # An existing list will still be a list but with possibly different # leading token my $old_list_type = $line_obj->{'list_type'}; my $new_list_type = EMPTY_STRING; if ( $rtokens_new->[0] =~ /^(=>|,)/ ) { $new_list_type = $rtokens_new->[0]; } if ( !$old_list_type || $old_list_type ne $new_list_type ) { decide_if_list($line_obj); } } EXPLAIN_DELETE_SELECTED && print <<EOM; new patterns: <@{$rpatterns_new}> new fields: <@{$rfields_new}> EOM return; } ## end sub delete_selected_tokens { ## closure for sub decode_alignment_token # This routine is called repeatedly for each token, so it needs to be # efficient. We can speed things up by remembering the inputs and outputs # in a hash.
my %decoded_token; sub initialize_decode { # We will re-initialize the hash for each file. Otherwise, there is # a danger that the hash can become arbitrarily large if a very large # number of files is processed at once. %decoded_token = (); return; } ## end sub initialize_decode sub decode_alignment_token { # Unpack the values packed in an alignment token # # Usage: # my ( $raw_tok, $lev, $tag, $tok_count ) = # decode_alignment_token($token); # Alignment tokens have a trailing decimal level and optional tag (for # commas): # For example, the first comma in the following line # sub banner { crlf; report( shift, '/', shift ); crlf } # is decorated as follows: # ,2+report-6 => (tok,lev,tag) =qw( , 2 +report-6) # An optional token count may be appended with a leading dot. # Currently this is only done for '=' tokens but this could change. # For example, consider the following line: # $nport = $port = shift || $name; # The first '=' may either be '=0' or '=0.1' [level 0, first equals] # The second '=' will be '=0.2' [level 0, second equals] my ($tok) = @_; if ( defined( $decoded_token{$tok} ) ) { return @{ $decoded_token{$tok} }; } my ( $raw_tok, $lev, $tag, $tok_count ) = ( $tok, 0, EMPTY_STRING, 1 ); if ( $tok =~ /^(\D+)(\d+)([^\.]*)(\.(\d+))?$/ ) { $raw_tok = $1; $lev = $2; $tag = $3 if ($3); $tok_count = $5 if ($5); } my @vals = ( $raw_tok, $lev, $tag, $tok_count ); $decoded_token{$tok} = \@vals; return @vals; } ## end sub decode_alignment_token } { ## closure for sub delete_unmatched_tokens my %is_assignment; my %keep_after_deleted_assignment; BEGIN { my @q; @q = qw( = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x= ); @is_assignment{@q} = (1) x scalar(@q); # These tokens may be kept following an = deletion @q = qw( if unless or || ); @keep_after_deleted_assignment{@q} = (1) x scalar(@q); } ## end BEGIN sub delete_unmatched_tokens { my ( $rlines, $group_level ) = @_; # This is an important first step in vertical alignment in which # we remove as many
obviously un-needed alignment tokens as possible. # This will prevent them from interfering with the final alignment. # Returns: my $max_lev_diff = 0; # used to avoid a call to prune_tree my $saw_side_comment = 0; # used to avoid a call for side comments # Handle no lines -- shouldn't happen return unless @{$rlines}; # Handle a single line if ( @{$rlines} == 1 ) { my $line = $rlines->[0]; my $jmax = $line->{'jmax'}; my $length = $line->{'rfield_lengths'}->[$jmax]; $saw_side_comment = $length > 0; return ( $max_lev_diff, $saw_side_comment ); } # ignore hanging side comments in these operations my @filtered = grep { !$_->{'is_hanging_side_comment'} } @{$rlines}; my $rnew_lines = \@filtered; $saw_side_comment = @filtered != @{$rlines}; $max_lev_diff = 0; # nothing to do if all lines were hanging side comments my $jmax = @{$rnew_lines} - 1; return ( $max_lev_diff, $saw_side_comment ) unless ( $jmax >= 0 ); #---------------------------------------------------- # Create a hash of alignment token info for each line #---------------------------------------------------- ( my $rline_hashes, my $requals_info, $saw_side_comment, $max_lev_diff ) = make_alignment_info( $group_level, $rnew_lines, $saw_side_comment ); #------------------------------------------------------------ # Find independent subgroups of lines. Neighboring subgroups # do not have a common alignment token. #------------------------------------------------------------ my @subgroups; push @subgroups, [ 0, $jmax ]; foreach my $jl ( 0 .. 
$jmax - 1 ) { if ( $rnew_lines->[$jl]->{'end_group'} ) { $subgroups[-1]->[1] = $jl; push @subgroups, [ $jl + 1, $jmax ]; } } #----------------------------------------------------------- # PASS 1 over subgroups to remove unmatched alignment tokens #----------------------------------------------------------- delete_unmatched_tokens_main_loop( $group_level, $rnew_lines, \@subgroups, $rline_hashes, $requals_info ); #---------------------------------------------------------------- # PASS 2: Construct a tree of matched lines and delete some small # deeper levels of tokens. They also block good alignments. #---------------------------------------------------------------- prune_alignment_tree($rnew_lines) if ($max_lev_diff); #-------------------------------------------- # PASS 3: compare all lines for common tokens #-------------------------------------------- match_line_pairs( $rlines, $rnew_lines, \@subgroups, $group_level ); return ( $max_lev_diff, $saw_side_comment ); } ## end sub delete_unmatched_tokens sub make_alignment_info { my ( $group_level, $rnew_lines, $saw_side_comment ) = @_; #------------------------------------------------------------ # Loop to create a hash of alignment token info for each line #------------------------------------------------------------ my $rline_hashes = []; my @equals_info; my @line_info; # no longer used my $jmax = @{$rnew_lines} - 1; my $max_lev_diff = 0; foreach my $line ( @{$rnew_lines} ) { my $rhash = {}; my $rtokens = $line->{'rtokens'}; my $rpatterns = $line->{'rpatterns'}; my $i = 0; my ( $i_eq, $tok_eq, $pat_eq ); my ( $lev_min, $lev_max ); foreach my $tok ( @{$rtokens} ) { my ( $raw_tok, $lev, $tag, $tok_count ) = decode_alignment_token($tok); if ( $tok ne '#' ) { if ( !defined($lev_min) ) { $lev_min = $lev; $lev_max = $lev; } else { if ( $lev < $lev_min ) { $lev_min = $lev } if ( $lev > $lev_max ) { $lev_max = $lev } } } else { if ( !$saw_side_comment ) { my $length = $line->{'rfield_lengths'}->[ $i + 1 ]; $saw_side_comment 
||= $length; } } # Possible future upgrade: for multiple matches, # record [$i1, $i2, ..] instead of $i $rhash->{$tok} = [ $i, undef, undef, $raw_tok, $lev, $tag, $tok_count ]; # remember the first equals at line level if ( !defined($i_eq) && $raw_tok eq '=' ) { if ( $lev eq $group_level ) { $i_eq = $i; $tok_eq = $tok; $pat_eq = $rpatterns->[$i]; } } $i++; } push @{$rline_hashes}, $rhash; push @equals_info, [ $i_eq, $tok_eq, $pat_eq ]; push @line_info, [ $lev_min, $lev_max ]; if ( defined($lev_min) ) { my $lev_diff = $lev_max - $lev_min; if ( $lev_diff > $max_lev_diff ) { $max_lev_diff = $lev_diff } } } #---------------------------------------------------- # Loop to compare each line pair and remember matches #---------------------------------------------------- my $rtok_hash = {}; my $nr = 0; foreach my $jl ( 0 .. $jmax - 1 ) { my $nl = $nr; $nr = 0; my $jr = $jl + 1; my $rhash_l = $rline_hashes->[$jl]; my $rhash_r = $rline_hashes->[$jr]; foreach my $tok ( keys %{$rhash_l} ) { if ( defined( $rhash_r->{$tok} ) ) { my $il = $rhash_l->{$tok}->[0]; my $ir = $rhash_r->{$tok}->[0]; $rhash_l->{$tok}->[2] = $ir; $rhash_r->{$tok}->[1] = $il; if ( $tok ne '#' ) { push @{ $rtok_hash->{$tok} }, ( $jl, $jr ); $nr++; } } } # Set a line break if no matching tokens between these lines # (this is not strictly necessary now but does not hurt) if ( $nr == 0 && $nl > 0 ) { $rnew_lines->[$jl]->{'end_group'} = 1; } # Also set a line break if both lines have simple equals but with # different leading characters in patterns. This check is similar # to one in sub check_match, and will prevent sub # prune_alignment_tree from removing alignments which otherwise # should be kept. This fix is rarely needed, but it can # occasionally improve formatting. # For example: # my $name = $this->{Name}; # $type = $this->ctype($genlooptype) if defined $genlooptype; # my $declini = ( $asgnonly ? "" : "\t$type *" ); # my $cast = ( $type ? 
"($type *)" : "" ); # The last two lines start with 'my' and will not match the # previous line starting with $type, so we do not want # prune_alignment tree to delete their ? : alignments at a deeper # level. my ( $i_eq_l, $tok_eq_l, $pat_eq_l ) = @{ $equals_info[$jl] }; my ( $i_eq_r, $tok_eq_r, $pat_eq_r ) = @{ $equals_info[$jr] }; if ( defined($i_eq_l) && defined($i_eq_r) ) { # Also, do not align equals across a change in ci level my $ci_jump = $rnew_lines->[$jl]->{'ci_level'} != $rnew_lines->[$jr]->{'ci_level'}; if ( $tok_eq_l eq $tok_eq_r && $i_eq_l == 0 && $i_eq_r == 0 && ( substr( $pat_eq_l, 0, 1 ) ne substr( $pat_eq_r, 0, 1 ) || $ci_jump ) ) { $rnew_lines->[$jl]->{'end_group'} = 1; } } } return ( $rline_hashes, \@equals_info, $saw_side_comment, $max_lev_diff ); } ## end sub make_alignment_info sub delete_unmatched_tokens_main_loop { my ( $group_level, $rnew_lines, $rsubgroups, $rline_hashes, $requals_info ) = @_; #-------------------------------------------------------------- # Main loop over subgroups to remove unmatched alignment tokens #-------------------------------------------------------------- # flag to allow skipping pass 2 - not currently used my $saw_large_group; my $has_terminal_match = $rnew_lines->[-1]->{'j_terminal_match'}; foreach my $item ( @{$rsubgroups} ) { my ( $jbeg, $jend ) = @{$item}; my $nlines = $jend - $jbeg + 1; #--------------------------------------------------- # Look for complete if/elsif/else and ternary blocks #--------------------------------------------------- # We are looking for a common '$dividing_token' like these: # if ( $b and $s ) { $p->{'type'} = 'a'; } # elsif ($b) { $p->{'type'} = 'b'; } # elsif ($s) { $p->{'type'} = 's'; } # else { $p->{'type'} = ''; } # ^----------- dividing_token # my $severity = # !$routine ? '[PFX]' # : $routine =~ /warn.*_d\z/ ? '[DS]' # : $routine =~ /ck_warn/ ? 'W' # : $routine =~ /ckWARN\d*reg_d/ ? 'S' # : $routine =~ /ckWARN\d*reg/ ? 'W' # : $routine =~ /vWARN\d/ ? 
'[WDS]' # : '[PFX]'; # ^----------- dividing_token # Only look for groups which are more than 2 lines long. Two lines # can get messed up doing this, probably due to the various # two-line rules. my $dividing_token; my %token_line_count; if ( $nlines > 2 ) { foreach my $jj ( $jbeg .. $jend ) { my %seen; my $line = $rnew_lines->[$jj]; my $rtokens = $line->{'rtokens'}; foreach my $tok ( @{$rtokens} ) { if ( !$seen{$tok} ) { $seen{$tok}++; $token_line_count{$tok}++; } } } foreach my $tok ( keys %token_line_count ) { if ( $token_line_count{$tok} == $nlines ) { if ( substr( $tok, 0, 1 ) eq '?' || substr( $tok, 0, 1 ) eq '{' && $tok =~ /^\{\d+if/ ) { $dividing_token = $tok; last; } } } } #------------------------------------------------------------- # Loop over subgroup lines to remove unwanted alignment tokens #------------------------------------------------------------- foreach my $jj ( $jbeg .. $jend ) { my $line = $rnew_lines->[$jj]; my $rtokens = $line->{'rtokens'}; my $rhash = $rline_hashes->[$jj]; my $i_eq = $requals_info->[$jj]->[0]; my @idel; my $imax = @{$rtokens} - 2; my $delete_above_level; my $deleted_assignment_token; my $saw_dividing_token = EMPTY_STRING; $saw_large_group ||= $nlines > 2 && $imax > 1; # Loop over all alignment tokens foreach my $i ( 0 .. $imax ) { my $tok = $rtokens->[$i]; next if ( $tok eq '#' ); # shouldn't happen my ( $iii, $il, $ir, $raw_tok, $lev, $tag, $tok_count ) = @{ $rhash->{$tok} }; #------------------------------------------------------ # Here is the basic RULE: remove an unmatched alignment # which does not occur in the surrounding lines. #------------------------------------------------------ my $delete_me = !defined($il) && !defined($ir); # Apply any user controls. Note that not all lines pass # this way so they have to be applied elsewhere too. 
my $align_ok = 1; if (%valign_control_hash) { $align_ok = $valign_control_hash{$raw_tok}; $align_ok = $valign_control_default unless defined($align_ok); $delete_me ||= !$align_ok; } # But now we modify this with exceptions... # EXCEPTION 1: If we are in a complete ternary or # if/elsif/else group, and this token is not on every line # of the group, should we delete it to preserve overall # alignment? if ($dividing_token) { if ( $token_line_count{$tok} >= $nlines ) { $saw_dividing_token ||= $tok eq $dividing_token; } else { # For shorter runs, delete toks to save alignment. # For longer runs, keep toks after the '{' or '?' # to allow sub-alignments within braces. The # number 5 lines is arbitrary but seems to work ok. $delete_me ||= ( $nlines < 5 || !$saw_dividing_token ); } } # EXCEPTION 2: Remove all tokens above a certain level # following a previous deletion. For example, we have to # remove tagged higher level alignment tokens following a # '=>' deletion because the tags of higher level tokens # will now be incorrect. For example, this will prevent # aligning commas as follows after deleting the second '=>' # $w->insert( # ListBox => origin => [ 270, 160 ], # size => [ 200, 55 ], # ); if ( defined($delete_above_level) ) { if ( $lev > $delete_above_level ) { $delete_me ||= 1; } else { $delete_above_level = undef } } # EXCEPTION 3: Remove all but certain tokens after an # assignment deletion. if ( $deleted_assignment_token && ( $lev > $group_level || !$keep_after_deleted_assignment{$raw_tok} ) ) { $delete_me ||= 1; } # EXCEPTION 4: Do not touch the first line of a 2 line # terminal match, such as below, because j_terminal has # already been set. 
# if ($tag) { $tago = "<$tag>"; $tagc = "</$tag>"; } # else { $tago = $tagc = ''; } # But see snippets 'else1.t' and 'else2.t' $delete_me = 0 if ( $jj == $jbeg && $has_terminal_match && $nlines == 2 ); # EXCEPTION 5: misc additional rules for commas and equals if ( $delete_me && $tok_count == 1 ) { # okay to delete second and higher copies of a token # for a comma... if ( $raw_tok eq ',' ) { # Do not delete commas before an equals $delete_me = 0 if ( defined($i_eq) && $i < $i_eq ); # Do not delete line-level commas $delete_me = 0 if ( $lev <= $group_level ); } # For an assignment at group level.. if ( $is_assignment{$raw_tok} && $lev == $group_level ) { # Do not delete if it is the last alignment of # multiple tokens; this will prevent some # undesirable alignments if ( $imax > 0 && $i == $imax ) { $delete_me = 0; } # Otherwise, set a flag to delete most # remaining tokens else { $deleted_assignment_token = $raw_tok } } } # Do not let a user exclusion be reactivated by above rules $delete_me ||= !$align_ok; #------------------------------------ # Add this token to the deletion list #------------------------------------ if ($delete_me) { push @idel, $i; # update deletion propagation flags if ( !defined($delete_above_level) || $lev < $delete_above_level ) { # delete all following higher level alignments $delete_above_level = $lev; # but keep deleting after => to next lower level # to avoid some bizarre alignments if ( $raw_tok eq '=>' ) { $delete_above_level = $lev - 1; } } } } # End loop over alignment tokens # Process all deletion requests for this line if (@idel) { delete_selected_tokens( $line, \@idel ); } } # End loop over lines } ## end main loop over subgroups return; } ## end sub delete_unmatched_tokens_main_loop } sub match_line_pairs { my ( $rlines, $rnew_lines, $rsubgroups, $group_level ) = @_; # Compare each pair of lines and save information about common matches # $rlines = list of lines including hanging side comments # $rnew_lines = list of lines without any
hanging side comments # $rsubgroups = list of subgroups of the new lines # TODO: # Maybe change: imax_pair => pair_match_info = ref to array # = [$imax_align, $rMsg, ... ] # This may eventually have multi-level match info # Previous line vars my ( $line_m, $rtokens_m, $rpatterns_m, $rfield_lengths_m, $imax_m, $list_type_m, $ci_level_m ); # Current line vars my ( $line, $rtokens, $rpatterns, $rfield_lengths, $imax, $list_type, $ci_level ); # loop over subgroups foreach my $item ( @{$rsubgroups} ) { my ( $jbeg, $jend ) = @{$item}; my $nlines = $jend - $jbeg + 1; next unless ( $nlines > 1 ); # loop over lines in a subgroup foreach my $jj ( $jbeg .. $jend ) { $line_m = $line; $rtokens_m = $rtokens; $rpatterns_m = $rpatterns; $rfield_lengths_m = $rfield_lengths; $imax_m = $imax; $list_type_m = $list_type; $ci_level_m = $ci_level; $line = $rnew_lines->[$jj]; $rtokens = $line->{'rtokens'}; $rpatterns = $line->{'rpatterns'}; $rfield_lengths = $line->{'rfield_lengths'}; $imax = @{$rtokens} - 2; $list_type = $line->{'list_type'}; $ci_level = $line->{'ci_level'}; # nothing to do for first line next if ( $jj == $jbeg ); my $ci_jump = $ci_level - $ci_level_m; my $imax_min = $imax_m < $imax ? $imax_m : $imax; my $imax_align = -1; # find number of leading common tokens #--------------------------------- # No match to hanging side comment #--------------------------------- if ( $line->{'is_hanging_side_comment'} ) { # Should not get here; HSC's have been filtered out $imax_align = -1; } #----------------------------- # Handle comma-separated lists #----------------------------- elsif ( $list_type && $list_type eq $list_type_m ) { # do not align lists across a ci jump with new list method if ($ci_jump) { $imax_min = -1 } my $i_nomatch = $imax_min + 1; foreach my $i ( 0 .. 
$imax_min ) { my $tok = $rtokens->[$i]; my $tok_m = $rtokens_m->[$i]; if ( $tok ne $tok_m ) { $i_nomatch = $i; last; } } $imax_align = $i_nomatch - 1; } #----------------- # Handle non-lists #----------------- else { my $i_nomatch = $imax_min + 1; foreach my $i ( 0 .. $imax_min ) { my $tok = $rtokens->[$i]; my $tok_m = $rtokens_m->[$i]; if ( $tok ne $tok_m ) { $i_nomatch = $i; last; } my $pat = $rpatterns->[$i]; my $pat_m = $rpatterns_m->[$i]; # If patterns don't match, we have to be careful... if ( $pat_m ne $pat ) { my $pad = $rfield_lengths->[$i] - $rfield_lengths_m->[$i]; my ( $match_code, $rmsg ) = compare_patterns( $group_level, $tok, $tok_m, $pat, $pat_m, $pad ); if ($match_code) { if ( $match_code == 1 ) { $i_nomatch = $i } elsif ( $match_code == 2 ) { $i_nomatch = 0 } last; } } } $imax_align = $i_nomatch - 1; } $line_m->{'imax_pair'} = $imax_align; } ## end loop over lines # Put fence at end of subgroup $line->{'imax_pair'} = -1; } ## end loop over subgroups # if there are hanging side comments, propagate the pair info down to them # so that lines can just look back one line for their pair info. 
    if ( @{$rlines} > @{$rnew_lines} ) {
        my $last_pair_info = -1;
        foreach my $line ( @{$rlines} ) {
            if ( $line->{'is_hanging_side_comment'} ) {
                $line->{'imax_pair'} = $last_pair_info;
            }
            else {
                $last_pair_info = $line->{'imax_pair'};
            }
        }
    }
    return;
} ## end sub match_line_pairs

sub compare_patterns {

    my ( $group_level, $tok, $tok_m, $pat, $pat_m, $pad ) = @_;

    # helper routine for sub match_line_pairs to decide if patterns in two
    # lines match well enough. Given:
    #   $tok_m, $pat_m = token and pattern of first line
    #   $tok, $pat     = token and pattern of second line
    #   $pad           = 0 if no padding is needed, !=0 otherwise
    # return code:
    #   0 = patterns match, continue
    #   1 = no match
    #   2 = no match, and lines do not match at all

    my $GoToMsg     = EMPTY_STRING;
    my $return_code = 0;

    use constant EXPLAIN_COMPARE_PATTERNS => 0;

    my ( $alignment_token, $lev, $tag, $tok_count ) =
      decode_alignment_token($tok);

    # We have to be very careful about aligning commas
    # when the patterns don't match, because it can be
    # worse to create an alignment where none is needed
    # than to omit one.  Here's an example where the ','s
    # are not in named containers.  The first line below
    # should not match the next two:
    #   ( $a, $b ) = ( $b, $r );
    #   ( $x1, $x2 ) = ( $x2 - $q * $x1, $x1 );
    #   ( $y1, $y2 ) = ( $y2 - $q * $y1, $y1 );
    if ( $alignment_token eq ',' ) {

        # do not align commas unless they are in named
        # containers
        if ( $tok !~ /[A-Za-z]/ ) {
            $return_code = 1;
            $GoToMsg     = "do not align commas in unnamed containers";
        }
        else {
            $return_code = 0;
        }
    }

    # do not align parens unless patterns match;
    # large ugly spaces can occur in math expressions.
    elsif ( $alignment_token eq '(' ) {

        # But we can allow a match if the parens don't
        # require any padding.
        if ( $pad != 0 ) {
            $return_code = 1;
            $GoToMsg = "do not align '(' unless patterns match or pad=0";
        }
        else {
            $return_code = 0;
        }
    }

    # Handle an '=' alignment with different patterns to
    # the left.
elsif ( $alignment_token eq '=' ) { # It is best to be a little restrictive when # aligning '=' tokens. Here is an example of # two lines that we will not align: # my $variable=6; # $bb=4; # The problem is that one is a 'my' declaration, # and the other isn't, so they're not very similar. # We will filter these out by comparing the first # letter of the pattern. This is crude, but works # well enough. if ( substr( $pat_m, 0, 1 ) ne substr( $pat, 0, 1 ) ) { $GoToMsg = "first character before equals differ"; $return_code = 1; } # The introduction of sub 'prune_alignment_tree' # enabled alignment of lists left of the equals with # other scalar variables. For example: # my ( $D, $s, $e ) = @_; # my $d = length $D; # my $c = $e - $s - $d; # But this would change formatting of a lot of scripts, # so for now we prevent alignment of comma lists on the # left with scalars on the left. We will also prevent # any partial alignments. # set return code 2 if the = is at line level, but # set return code 1 if the = is below line level, i.e. # sub new { my ( $p, $v ) = @_; bless \$v, $p } # sub iter { my ($x) = @_; return undef if $$x < 0; return $$x--; } elsif ( ( index( $pat_m, ',' ) >= 0 ) ne ( index( $pat, ',' ) >= 0 ) ) { $GoToMsg = "mixed commas/no-commas before equals"; $return_code = 1; if ( $lev eq $group_level ) { $return_code = 2; } } else { $return_code = 0; } } else { $return_code = 0; } EXPLAIN_COMPARE_PATTERNS && $return_code && print STDERR "no match because $GoToMsg\n"; return ( $return_code, \$GoToMsg ); } ## end sub compare_patterns sub fat_comma_to_comma { my ($str) = @_; # We are changing '=>' to ',' and removing any trailing decimal count # because currently fat commas have a count and commas do not. # For example, we will change '=>2+{-3.2' into ',2+{-3' if ( $str =~ /^=>([^\.]*)/ ) { $str = ',' . 
$1 } return $str; } ## end sub fat_comma_to_comma sub get_line_token_info { # scan lines of tokens and return summary information about the range of # levels and patterns. my ($rlines) = @_; # First scan to check monotonicity. Here is an example of several # lines which are monotonic. The = is the lowest level, and # the commas are all one level deeper. So this is not nonmonotonic. # $$d{"weeks"} = [ "w", "wk", "wks", "week", "weeks" ]; # $$d{"days"} = [ "d", "day", "days" ]; # $$d{"hours"} = [ "h", "hr", "hrs", "hour", "hours" ]; my @all_token_info; my $all_monotonic = 1; foreach my $jj ( 0 .. @{$rlines} - 1 ) { my ($line) = $rlines->[$jj]; my $rtokens = $line->{'rtokens'}; my $last_lev; my $is_monotonic = 1; my $i = -1; foreach my $tok ( @{$rtokens} ) { $i++; my ( $raw_tok, $lev, $tag, $tok_count ) = decode_alignment_token($tok); push @{ $all_token_info[$jj] }, [ $raw_tok, $lev, $tag, $tok_count ]; last if ( $tok eq '#' ); if ( $i > 0 && $lev < $last_lev ) { $is_monotonic = 0 } $last_lev = $lev; } if ( !$is_monotonic ) { $all_monotonic = 0 } } my $rline_values = []; foreach my $jj ( 0 .. @{$rlines} - 1 ) { my ($line) = $rlines->[$jj]; my $rtokens = $line->{'rtokens'}; my $i = -1; my ( $lev_min, $lev_max ); my $token_pattern_max = EMPTY_STRING; my %saw_level; my $is_monotonic = 1; # find the index of the last token before the side comment my $imax = @{$rtokens} - 2; my $imax_true = $imax; # If the entire group is monotonic, and the line ends in a comma list, # walk it back to the first such comma. this will have the effect of # making all trailing ragged comma lists match in the prune tree # routine. these trailing comma lists can better be handled by later # alignment rules. # Treat fat commas the same as commas here by converting them to # commas. This will improve the chance of aligning the leading parts # of ragged lists. 
my $tok_end = fat_comma_to_comma( $rtokens->[$imax] ); if ( $all_monotonic && $tok_end =~ /^,/ ) { my $ii = $imax - 1; while ( $ii >= 0 && fat_comma_to_comma( $rtokens->[$ii] ) eq $tok_end ) { $imax = $ii; $ii--; } } # make a first pass to find level range my $last_lev; foreach my $tok ( @{$rtokens} ) { $i++; last if ( $i > $imax ); last if ( $tok eq '#' ); my ( $raw_tok, $lev, $tag, $tok_count ) = @{ $all_token_info[$jj]->[$i] }; last if ( $tok eq '#' ); $token_pattern_max .= $tok; $saw_level{$lev}++; if ( !defined($lev_min) ) { $lev_min = $lev; $lev_max = $lev; } else { if ( $lev < $lev_min ) { $lev_min = $lev; } if ( $lev > $lev_max ) { $lev_max = $lev; } if ( $lev < $last_lev ) { $is_monotonic = 0 } } $last_lev = $lev; } # handle no levels my $rtoken_patterns = {}; my $rtoken_indexes = {}; my @levs = sort keys %saw_level; if ( !defined($lev_min) ) { $lev_min = -1; $lev_max = -1; $levs[0] = -1; $rtoken_patterns->{$lev_min} = EMPTY_STRING; $rtoken_indexes->{$lev_min} = []; } # handle one level elsif ( $lev_max == $lev_min ) { $rtoken_patterns->{$lev_max} = $token_pattern_max; $rtoken_indexes->{$lev_max} = [ ( 0 .. $imax ) ]; } # handle multiple levels else { $rtoken_patterns->{$lev_max} = $token_pattern_max; $rtoken_indexes->{$lev_max} = [ ( 0 .. 
              $imax ) ];

            my $lev_top = pop @levs;    # already did max level
            my $itok    = -1;
            foreach my $tok ( @{$rtokens} ) {
                $itok++;
                last if ( $itok > $imax );
                my ( $raw_tok, $lev, $tag, $tok_count ) =
                  @{ $all_token_info[$jj]->[$itok] };
                last if ( $raw_tok eq '#' );
                foreach my $lev_test (@levs) {
                    next if ( $lev > $lev_test );
                    $rtoken_patterns->{$lev_test} .= $tok;
                    push @{ $rtoken_indexes->{$lev_test} }, $itok;
                }
            }
            push @levs, $lev_top;
        }

        push @{$rline_values},
          [
            $lev_min,        $lev_max,      $rtoken_patterns, \@levs,
            $rtoken_indexes, $is_monotonic, $imax_true,       $imax,
          ];

        # debug
        0 && do {
            local $LIST_SEPARATOR = ')(';
            print "lev_min=$lev_min, lev_max=$lev_max, levels=(@levs)\n";
            foreach my $key ( sort keys %{$rtoken_patterns} ) {
                print "$key => $rtoken_patterns->{$key}\n";
                print "$key => @{$rtoken_indexes->{$key}}\n";
            }
        };
    } ## end loop over lines
    return ( $rline_values, $all_monotonic );
} ## end sub get_line_token_info

sub prune_alignment_tree {
    my ($rlines) = @_;
    my $jmax = @{$rlines} - 1;
    return unless $jmax > 0;

    # Vertical alignment in perltidy is done as an iterative process.  The
    # starting point is to mark all possible alignment tokens ('=', ',', '=>',
    # etc) for vertical alignment.  Then we have to delete all alignments
    # which, if actually made, would detract from overall alignment.  This
    # is done in several phases of which this is one.

    # In this routine we look at the alignments of a group of lines as a
    # hierarchical tree.  We will 'prune' the tree to limited depths if that
    # will improve overall alignment at the lower depths.
    # For each line we will be looking at its alignment patterns down to
    # different fixed depths. For each depth, we include all lower depths and
    # ignore all higher depths.  We want to see if we can get alignment of a
    # larger group of lines if we ignore alignments at some lower depth.
    # Here is an example:

    # for (
    #     [ '$var',     sub { join $_, "bar" },            0, "bar" ],
    #     [ 'CONSTANT', sub { join "foo", "bar" },         0, "bar" ],
    #     [ 'CONSTANT', sub { join "foo", "bar", 3 },      1, "barfoo3" ],
    #     [ '$myvar',   sub { my $var; join $var, "bar" }, 0, "bar" ],
    # );

    # In the above example, all lines have three commas at the lowest depth
    # (zero), so if there were no other alignments, these lines would all
    # align considering only the zero depth alignment token.  But some lines
    # have additional comma alignments at the next depth, so we need to decide
    # if we should drop those to keep the top level alignments, or keep those
    # for some additional low level alignments at the expense of losing some
    # top level alignments.  In this case we will drop the deeper level commas
    # to keep the entire collection aligned.  But in some cases the decision
    # could go the other way.

    # The tree for this example at the zero depth has one node containing
    # all four lines, since they are identical at zero level (three commas).
    # At depth one, there are three 'children' nodes, namely:
    # - lines 1 and 2, which have a single comma in the 'sub' at depth 1
    # - line 3, which has 2 commas at depth 1
    # - line 4, which has a ';' and a ',' at depth 1
    # There are no deeper alignments in this example.
    # so the tree structure for this example is:
    #
    #    depth 0         depth 1      depth 2
    #    [lines 1-4] --  [line 1-2] -  (empty)
    #                 |  [line 3]   -  (empty)
    #                 |  [line 4]   -  (empty)

    # We can carry this to any depth, but it is not really useful to go below
    # depth 2. To cleanly stop there, we will consider depth 2 to contain all
    # alignments at depth >=2.

    use constant EXPLAIN_PRUNE => 0;

    #-------------------------------------------------------------------
    # Prune Tree Step 1. Start by scanning the lines and collecting info
    #-------------------------------------------------------------------

    # Note that the caller had this info but we have to redo this now because
    # alignment tokens may have been deleted.
    my ( $rline_values, $all_monotonic ) = get_line_token_info($rlines);

    # If all the lines have levels which increase monotonically from left to
    # right, then the sweep-left-to-right pass can do a better job of alignment
    # than pruning, and without deleting alignments.
    return if ($all_monotonic);

    # Contents of $rline_values
    #   [
    #     $lev_min,        $lev_max,      $rtoken_patterns, \@levs,
    #     $rtoken_indexes, $is_monotonic, $imax_true,       $imax,
    #   ];

    # We can work to any depth, but there is little advantage to working
    # to a depth greater than 2
    my $MAX_DEPTH = 2;

    # This array will hold the tree of alignment tokens at different depths
    # for these lines.
    my @match_tree;

    # Tree nodes contain these values:
    # $match_tree[$depth] = [$jbeg, $jend, $n_parent, $level, $pattern,
    #                        $nc_beg_p, $nc_end_p, $rindexes];
    # where
    #  $depth = 0,1,2 = index of depth of the match
    #  $jbeg beginning index j of the range of lines in this match
    #  $jend ending index j of the range of lines in this match
    #  $n_parent = index of the containing group at $depth-1, if it exists
    #  $level = actual level of code being matched in this group
    #  $pattern = alignment pattern being matched
    #  $nc_beg_p = first child
    #  $nc_end_p = last child
    #  $rindexes = ref to token indexes

    # the patterns and levels of the current group being formed at each depth
    my ( @token_patterns_current, @levels_current, @token_indexes_current );

    # the patterns and levels of the next line being tested at each depth
    my ( @token_patterns_next, @levels_next, @token_indexes_next );

    #-----------------------------------------------------------
    # define a recursive worker subroutine for tree construction
    #-----------------------------------------------------------

    # This is a recursive routine which is called if a match condition changes
    # at any depth when a new line is encountered.  It ends the match node
    # which changed plus all deeper nodes attached to it.
my $end_node; $end_node = sub { my ( $depth, $jl, $n_parent ) = @_; # $depth is the tree depth # $jl is the index of the line # $n_parent is index of the parent node of this node return if ( $depth > $MAX_DEPTH ); # end any current group at this depth if ( $jl >= 0 && defined( $match_tree[$depth] ) && @{ $match_tree[$depth] } && defined( $levels_current[$depth] ) ) { $match_tree[$depth]->[-1]->[1] = $jl; } # Define the index of the node we will create below my $ng_self = 0; if ( defined( $match_tree[$depth] ) ) { $ng_self = @{ $match_tree[$depth] }; } # end any next deeper child node(s) $end_node->( $depth + 1, $jl, $ng_self ); # update the levels being matched $token_patterns_current[$depth] = $token_patterns_next[$depth]; $token_indexes_current[$depth] = $token_indexes_next[$depth]; $levels_current[$depth] = $levels_next[$depth]; # Do not start a new group at this level if it is not being used if ( !defined( $levels_next[$depth] ) || $depth > 0 && $levels_next[$depth] <= $levels_next[ $depth - 1 ] ) { return; } # Create a node for the next group at this depth. We initially assume # that it will continue to $jmax, and correct that later if the node # ends earlier. push @{ $match_tree[$depth] }, [ $jl + 1, $jmax, $n_parent, $levels_current[$depth], $token_patterns_current[$depth], undef, undef, $token_indexes_current[$depth], ]; return; }; ## end sub end_node #----------------------------------------------------- # Prune Tree Step 2. Loop to form the tree of matches. #----------------------------------------------------- foreach my $jp ( 0 .. $jmax ) { # working with two adjacent line indexes, 'm'=minus, 'p'=plus my $jm = $jp - 1; # Pull out needed values for the next line my ( $lev_min, $lev_max, $rtoken_patterns, $rlevs, $rtoken_indexes, $is_monotonic, $imax_true, $imax ) = @{ $rline_values->[$jp] }; # Transfer levels and patterns for this line to the working arrays. # If the number of levels differs from our chosen MAX_DEPTH ... 
# if fewer than MAX_DEPTH: leave levels at missing depths undefined # if more than MAX_DEPTH: set the MAX_DEPTH level to be the maximum @levels_next = @{$rlevs}[ 0 .. $MAX_DEPTH ]; if ( @{$rlevs} > $MAX_DEPTH ) { $levels_next[$MAX_DEPTH] = $rlevs->[-1]; } my $depth = 0; foreach my $item (@levels_next) { $token_patterns_next[$depth] = defined($item) ? $rtoken_patterns->{$item} : undef; $token_indexes_next[$depth] = defined($item) ? $rtoken_indexes->{$item} : undef; $depth++; } # Look for a change in match groups... # Initialize on the first line if ( $jp == 0 ) { my $n_parent; $end_node->( 0, $jm, $n_parent ); } # End groups if a hard flag has been set elsif ( $rlines->[$jm]->{'end_group'} ) { my $n_parent; $end_node->( 0, $jm, $n_parent ); } # Continue at hanging side comment elsif ( $rlines->[$jp]->{'is_hanging_side_comment'} ) { next; } # Otherwise see if anything changed and update the tree if so else { foreach my $depth ( 0 .. $MAX_DEPTH ) { my $def_current = defined( $token_patterns_current[$depth] ); my $def_next = defined( $token_patterns_next[$depth] ); last unless ( $def_current || $def_next ); if ( !$def_current || !$def_next || $token_patterns_current[$depth] ne $token_patterns_next[$depth] ) { my $n_parent; if ( $depth > 0 && defined( $match_tree[ $depth - 1 ] ) ) { $n_parent = @{ $match_tree[ $depth - 1 ] } - 1; } $end_node->( $depth, $jm, $n_parent ); last; } } } } ## end loop to form tree of matches #--------------------------------------------------------- # Prune Tree Step 3. Make links from parent to child nodes #--------------------------------------------------------- # It seemed cleaner to do this as a separate step rather than during tree # construction. The children nodes have links up to the parent node which # created them. Now make links in the opposite direction, so the parents # can find the children. We store the range of children nodes ($nc_beg, # $nc_end) of each parent with two additional indexes in the original array. 
# These will be undef if no children. foreach my $depth ( reverse( 1 .. $MAX_DEPTH ) ) { next unless defined( $match_tree[$depth] ); my $nc_max = @{ $match_tree[$depth] } - 1; my $np_now; foreach my $nc ( 0 .. $nc_max ) { my $np = $match_tree[$depth]->[$nc]->[2]; if ( !defined($np) ) { # shouldn't happen #print STDERR "lost child $np at depth $depth\n"; next; } if ( !defined($np_now) || $np != $np_now ) { $np_now = $np; $match_tree[ $depth - 1 ]->[$np]->[5] = $nc; } $match_tree[ $depth - 1 ]->[$np]->[6] = $nc; } } ## end loop to make links down to the child nodes EXPLAIN_PRUNE > 0 && do { print "Tree complete. Found these groups:\n"; foreach my $depth ( 0 .. $MAX_DEPTH ) { Dump_tree_groups( \@{ $match_tree[$depth] }, "depth=$depth" ); } }; #------------------------------------------------------ # Prune Tree Step 4. Make a list of nodes to be deleted #------------------------------------------------------ # list of lines with tokens to be deleted: # [$jbeg, $jend, $level_keep] # $jbeg..$jend is the range of line indexes, # $level_keep is the minimum level to keep my @delete_list; # Not currently used: # Groups with ending comma lists and their range of sizes: # $ragged_comma_group{$id} = [ imax_group_min, imax_group_max ] ## my %ragged_comma_group; # We work with a list of nodes to visit at the next deeper depth. my @todo_list; if ( defined( $match_tree[0] ) ) { @todo_list = ( 0 .. @{ $match_tree[0] } - 1 ); } foreach my $depth ( 0 .. $MAX_DEPTH ) { last unless (@todo_list); my @todo_next; foreach my $np (@todo_list) { my ( $jbeg_p, $jend_p, $np_p, $lev_p, $pat_p, $nc_beg_p, $nc_end_p, $rindexes_p ) = @{ $match_tree[$depth]->[$np] }; my $nlines_p = $jend_p - $jbeg_p + 1; # nothing to do if no children next unless defined($nc_beg_p); # Define the number of lines to either keep or delete a child node. # This is the key decision we have to make. We want to delete # short runs of matched lines, and keep long runs. 
It seems easier # for the eye to follow breaks in monotonic level changes than # non-monotonic level changes. For example, the following looks # best if we delete the lower level alignments: # [1] ~~ []; # [ ["foo"], ["bar"] ] ~~ [ qr/o/, qr/a/ ]; # [ qr/o/, qr/a/ ] ~~ [ ["foo"], ["bar"] ]; # [ "foo", "bar" ] ~~ [ qr/o/, qr/a/ ]; # [ qr/o/, qr/a/ ] ~~ [ "foo", "bar" ]; # $deep1 ~~ $deep1; # So we will use two thresholds. my $nmin_mono = $depth + 2; my $nmin_non_mono = $depth + 6; if ( $nmin_mono > $nlines_p - 1 ) { $nmin_mono = $nlines_p - 1; } if ( $nmin_non_mono > $nlines_p - 1 ) { $nmin_non_mono = $nlines_p - 1; } # loop to keep or delete each child node foreach my $nc ( $nc_beg_p .. $nc_end_p ) { my ( $jbeg_c, $jend_c, $np_c, $lev_c, $pat_c, $nc_beg_c, $nc_end_c ) = @{ $match_tree[ $depth + 1 ]->[$nc] }; my $nlines_c = $jend_c - $jbeg_c + 1; my $is_monotonic = $rline_values->[$jbeg_c]->[5]; my $nmin = $is_monotonic ? $nmin_mono : $nmin_non_mono; if ( $nlines_c < $nmin ) { ##print "deleting child, nlines=$nlines_c, nmin=$nmin\n"; push @delete_list, [ $jbeg_c, $jend_c, $lev_p ]; } else { ##print "keeping child, nlines=$nlines_c, nmin=$nmin\n"; push @todo_next, $nc; } } } @todo_list = @todo_next; } ## end loop to mark nodes to delete #------------------------------------------------------------ # Prune Tree Step 5. Loop to delete selected alignment tokens #------------------------------------------------------------ foreach my $item (@delete_list) { my ( $jbeg, $jend, $level_keep ) = @{$item}; foreach my $jj ( $jbeg .. $jend ) { my $line = $rlines->[$jj]; my @idel; my $rtokens = $line->{'rtokens'}; my $imax = @{$rtokens} - 2; foreach my $i ( 0 .. 
$imax ) { my $tok = $rtokens->[$i]; my ( $raw_tok, $lev, $tag, $tok_count ) = decode_alignment_token($tok); if ( $lev > $level_keep ) { push @idel, $i; } } if (@idel) { delete_selected_tokens( $line, \@idel ); } } } ## end loop to delete selected alignment tokens return; } ## end sub prune_alignment_tree sub Dump_tree_groups { my ( $rgroup, $msg ) = @_; # Debug routine print "$msg\n"; local $LIST_SEPARATOR = ')('; foreach my $item ( @{$rgroup} ) { my @fix = @{$item}; foreach my $val (@fix) { $val = "undef" unless defined $val; } $fix[4] = "..."; print "(@fix)\n"; } return; } ## end sub Dump_tree_groups { ## closure for sub is_marginal_match my %is_if_or; my %is_assignment; my %is_good_alignment; # This test did not give sufficiently better results to use as an update, # but the flag is worth keeping as a starting point for future testing. use constant TEST_MARGINAL_EQ_ALIGNMENT => 0; BEGIN { my @q = qw( if unless or || ); @is_if_or{@q} = (1) x scalar(@q); @q = qw( = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x= ); @is_assignment{@q} = (1) x scalar(@q); # Vertically aligning on certain "good" tokens is usually okay # so we can be less restrictive in marginal cases. @q = qw( { ? => = ); push @q, (','); @is_good_alignment{@q} = (1) x scalar(@q); } ## end BEGIN sub is_marginal_match { my ( $line_0, $line_1, $group_level, $imax_align, $imax_prev ) = @_; # Decide if we should undo some or all of the common alignments of a # group of just two lines. 
        # Given:
        #   $line_0 and $line_1 - the two lines
        #   $group_level = the indentation level of the group being processed
        #   $imax_align  = the maximum index of the common alignment tokens
        #                  of the two lines
        #   $imax_prev   = the maximum index of the common alignment tokens
        #                  with the line before $line_0
        #                  (=-1 if it does not exist)

        # Return:
        #   $is_marginal = true if the two lines should NOT be fully aligned
        #                = false if the two lines can remain fully aligned
        #   $imax_align  = the index of the highest alignment token shared by
        #                  these two lines to keep if the match is marginal.

        # When we have an alignment group of just two lines like this, we are
        # working in the twilight zone of what looks good and what looks bad.
        # This routine is a collection of rules which have been found to
        # work fairly well, but it will need to be updated from time to time.

        my $is_marginal = 0;

        #---------------------------------------
        # Always align certain special cases ...
        #---------------------------------------
        if (

            # always keep alignments of a terminal else or ternary
            defined( $line_1->{'j_terminal_match'} )

            # always align lists
            || $line_0->{'list_type'}

            # always align hanging side comments
            || $line_1->{'is_hanging_side_comment'}

          )
        {
            return ( $is_marginal, $imax_align );
        }

        my $jmax_0           = $line_0->{'jmax'};
        my $jmax_1           = $line_1->{'jmax'};
        my $rtokens_1        = $line_1->{'rtokens'};
        my $rtokens_0        = $line_0->{'rtokens'};
        my $rfield_lengths_0 = $line_0->{'rfield_lengths'};
        my $rfield_lengths_1 = $line_1->{'rfield_lengths'};
        my $rpatterns_0      = $line_0->{'rpatterns'};
        my $rpatterns_1      = $line_1->{'rpatterns'};
        my $imax_next        = $line_1->{'imax_pair'};

        # We will scan the alignment tokens and set a flag '$is_marginal' if
        # it seems that an alignment would look bad.
        my $max_pad            = 0;
        my $saw_good_alignment = 0;
        my $saw_if_or;    # if we saw an 'if' or 'or' at group level
        my $raw_tokb = EMPTY_STRING;    # first token seen at group level
        my $jfirst_bad;
        my $line_ending_fat_comma;      # is last token just a '=>' ?
        my $j0_eq_pad;
        my $j0_max_pad = 0;

        foreach my $j ( 0 .. $jmax_1 - 2 ) {
            my ( $raw_tok, $lev, $tag, $tok_count ) =
              decode_alignment_token( $rtokens_1->[$j] );
            if ( $raw_tok && $lev == $group_level ) {
                if ( !$raw_tokb ) { $raw_tokb = $raw_tok }
                $saw_if_or ||= $is_if_or{$raw_tok};
            }

            # When the first of the two lines ends in a bare '=>' this will
            # probably be a marginal match.  (For a bare =>, the next field
            # length will be 2 or 3, depending on side comment)
            $line_ending_fat_comma =
                 $j == $jmax_1 - 2
              && $raw_tok eq '=>'
              && $rfield_lengths_0->[ $j + 1 ] <= 3;

            my $pad = $rfield_lengths_1->[$j] - $rfield_lengths_0->[$j];
            if ( $j == 0 ) {
                $pad += $line_1->{'leading_space_count'} -
                  $line_0->{'leading_space_count'};

                # Remember the pad at a leading equals
                if ( $raw_tok eq '=' && $lev == $group_level ) {
                    $j0_eq_pad = $pad;
                    $j0_max_pad =
                      0.5 * ( $rfield_lengths_1->[0] + $rfield_lengths_0->[0] );
                    $j0_max_pad = 4 if ( $j0_max_pad < 4 );
                }
            }

            if ( $pad < 0 )        { $pad     = -$pad }
            if ( $pad > $max_pad ) { $max_pad = $pad }
            if ( $is_good_alignment{$raw_tok} && !$line_ending_fat_comma ) {
                $saw_good_alignment = 1;
            }
            else {
                $jfirst_bad = $j unless defined($jfirst_bad);
            }
            if ( $rpatterns_0->[$j] ne $rpatterns_1->[$j] ) {

                # Flag this as a marginal match since patterns differ.
                # Normally, we will not allow just two lines to match if
                # marginal. But we can allow matching in some specific cases.

                $jfirst_bad  = $j if ( !defined($jfirst_bad) );
                $is_marginal = 1  if ( $is_marginal == 0 );
                if ( $raw_tok eq '=' ) {

                    # Here is an example of a marginal match:
                    #       $done{$$op} = 1;
                    #       $op = compile_bblock($op);
                    # The left tokens are both identifiers, but
                    # one accesses a hash and the other doesn't.
                    # We'll let this be a tentative match and undo
                    # it later if we don't find more than 2 lines
                    # in the group.
                    $is_marginal = 2;
                }
            }
        }

        $is_marginal = 1 if ( $is_marginal == 0 && $line_ending_fat_comma );

        # Turn off the "marginal match" flag in some cases...
        # A "marginal match" occurs when the alignment tokens agree
        # but there are differences in the other tokens (patterns).
        # If we leave the marginal match flag set, then the rule is that we
        # will align only if there are more than two lines in the group.
        # We will turn off the flag if we almost have a match
        # and either we have seen a good alignment token or we
        # just need a small pad (2 spaces) to fit.  These rules are
        # the result of experimentation.  Tokens which are misaligned by just
        # one or two characters are annoying.  On the other hand,
        # large gaps to less important alignment tokens are also annoying.
        if ( $is_marginal == 1
            && ( $saw_good_alignment || $max_pad < 3 ) )
        {
            $is_marginal = 0;
        }

        # We will use the line endings to help decide on alignments...
        # See if the lines end with semicolons...
        my $sc_term0;
        my $sc_term1;
        if ( $jmax_0 < 1 || $jmax_1 < 1 ) {

            # shouldn't happen
        }
        else {
            my $pat0 = $rpatterns_0->[ $jmax_0 - 1 ];
            my $pat1 = $rpatterns_1->[ $jmax_1 - 1 ];
            $sc_term0 = $pat0 =~ /;b?$/;
            $sc_term1 = $pat1 =~ /;b?$/;
        }

        if ( !$is_marginal && !$sc_term0 ) {

            # First line of assignment should be semicolon terminated.
            # For example, do not align here:
            #  $$href{-NUM_TEXT_FILES} = $$href{-NUM_BINARY_FILES} =
            #    $$href{-NUM_DIRS} = 0;
            if ( $is_assignment{$raw_tokb} ) {
                $is_marginal = 1;
            }
        }

        # Try to avoid some undesirable alignments of opening tokens
        # for example, the space between grep and { here:
        #  return map { ( $_ => $_ ) }
        #    grep { /$handles/ } $self->_get_delegate_method_list;
        $is_marginal ||=
             ( $raw_tokb eq '(' || $raw_tokb eq '{' )
          && $jmax_1 == 2
          && $sc_term0 ne $sc_term1;

        #---------------------------------------
        # return if this is not a marginal match
        #---------------------------------------
        if ( !$is_marginal ) {
            return ( $is_marginal, $imax_align );
        }

        # Undo the marginal match flag in certain cases,

        # Two lines with a leading equals-like operator are allowed to
        # align if the patterns to the left of the equals are the same.
        # For example the following two lines are a marginal match but have
        # the same left side patterns, so we will align the equals.
        #     my $orig = my $format = "^<<<<< ~~\n";
        #     my $abc  = "abc";
        # But these have a different left pattern so they will not be
        # aligned
        #  $xmldoc .= $`;
        #  $self->{'leftovers'} .= "...

        my $pat0 = $rpatterns_0->[0];
        my $pat1 = $rpatterns_1->[0];

        #---------------------------------------------------------
        # Turn off the marginal flag for some types of assignments
        #---------------------------------------------------------
        if ( $is_assignment{$raw_tokb} ) {

            # undo marginal flag if first line is semicolon terminated
            # and leading patterns match
            if ($sc_term0) {    # && $sc_term1) {
                $is_marginal = $pat0 ne $pat1;
            }
        }
        elsif ( $raw_tokb eq '=>' ) {

            # undo marginal flag if patterns match
            $is_marginal = $pat0 ne $pat1 || $line_ending_fat_comma;
        }
        elsif ( $raw_tokb eq '=~' ) {

            # undo marginal flag if both lines are semicolon terminated
            # and leading patterns match
            if ( $sc_term1 && $sc_term0 ) {
                $is_marginal = $pat0 ne $pat1;
            }
        }

        #-----------------------------------------------------
        # Turn off the marginal flag if we saw an 'if' or 'or'
        #-----------------------------------------------------

        # A trailing 'if' and 'or' often gives a good alignment
        # For example, we can align these:
        #  return -1     if $_[0] =~ m/^CHAPT|APPENDIX/;
        #  return $1 + 0 if $_[0] =~ m/^SECT(\d*)$/;

        # or
        #  $d_in_m[2] = 29          if ( &Date_LeapYear($y) );
        #  $d         = $d_in_m[$m] if ( $d > $d_in_m[$m] );

        if ($saw_if_or) {

            # undo marginal flag if both lines are semicolon terminated
            if ( $sc_term0 && $sc_term1 ) {
                $is_marginal = 0;
            }
        }

        # For a marginal match, only keep matches before the first 'bad' match
        if (   $is_marginal
            && defined($jfirst_bad)
            && $imax_align > $jfirst_bad - 1 )
        {
            $imax_align = $jfirst_bad - 1;
        }

        #----------------------------------------------------------
        # Allow sweep to match lines with leading '=' in some cases
        #----------------------------------------------------------
        if ( $imax_align < 0 && defined($j0_eq_pad) )
{ if ( # If there is a following line with leading equals, or # preceding line with leading equals, then let the sweep align # them without restriction. For example, the first two lines # here are a marginal match, but they are followed by a line # with leading equals, so the sweep-lr logic can align all of # the lines: # $date[1] = $month_to_num{ $date[1] }; # <--line_0 # @xdate = split( /[:\/\s]/, $log->field('t') ); # <--line_1 # $day = sprintf( "%04d/%02d/%02d", @date[ 2, 1, 0 ] ); # $time = sprintf( "%02d:%02d:%02d", @date[ 3 .. 5 ] ); # Likewise, if we reverse the two pairs we want the same result # $day = sprintf( "%04d/%02d/%02d", @date[ 2, 1, 0 ] ); # $time = sprintf( "%02d:%02d:%02d", @date[ 3 .. 5 ] ); # $date[1] = $month_to_num{ $date[1] }; # <--line_0 # @xdate = split( /[:\/\s]/, $log->field('t') ); # <--line_1 ( $imax_next >= 0 || $imax_prev >= 0 || TEST_MARGINAL_EQ_ALIGNMENT ) && $j0_eq_pad >= -$j0_max_pad && $j0_eq_pad <= $j0_max_pad ) { # But do not do this if there is a comma before the '='. # For example, the first two lines below have commas and # therefore are not allowed to align with lines 3 & 4: # my ( $x, $y ) = $self->Size(); #<--line_0 # my ( $left, $top, $right, $bottom ) = $self->Window(); #<--l_1 # my $vx = $right - $left; # my $vy = $bottom - $top; if ( $rpatterns_0->[0] !~ /,/ && $rpatterns_1->[0] !~ /,/ ) { $imax_align = 0; } } } return ( $is_marginal, $imax_align ); } ## end sub is_marginal_match } ## end closure for sub is_marginal_match sub get_extra_leading_spaces { my ( $rlines, $rgroups ) = @_; #---------------------------------------------------------- # Define any extra indentation space (for the -lp option). # Here is why: # If a list has side comments, sub scan_list must dump the # list before it sees everything. When this happens, it sets # the indentation to the standard scheme, but notes how # many spaces it would have liked to use. 
We may be able # to recover that space here in the event that all of the # lines of a list are back together again. #---------------------------------------------------------- return 0 unless ( @{$rlines} && @{$rgroups} ); my $object = $rlines->[0]->{'indentation'}; return 0 unless ( ref($object) ); my $extra_leading_spaces = 0; my $extra_indentation_spaces_wanted = get_recoverable_spaces($object); return ($extra_leading_spaces) unless ($extra_indentation_spaces_wanted); my $min_spaces = $extra_indentation_spaces_wanted; if ( $min_spaces > 0 ) { $min_spaces = 0 } # loop over all groups my $ng = -1; my $ngroups = @{$rgroups}; foreach my $item ( @{$rgroups} ) { $ng++; my ( $jbeg, $jend ) = @{$item}; foreach my $j ( $jbeg .. $jend ) { next if ( $j == 0 ); # all indentation objects must be the same if ( $object != $rlines->[$j]->{'indentation'} ) { return 0; } } # find the maximum space without exceeding the line length for this group my $avail = $rlines->[$jbeg]->get_available_space_on_right(); my $spaces = ( $avail > $extra_indentation_spaces_wanted ) ? $extra_indentation_spaces_wanted : $avail; #-------------------------------------------------------- # Note: min spaces can be negative; for example with -gnu # f( # do { 1; !!(my $x = bless []); } # ); #-------------------------------------------------------- # The following rule is needed to match older formatting: # For multiple groups, we will keep spaces non-negative. # For a single group, we will allow a negative space. if ( $ngroups > 1 && $spaces < 0 ) { $spaces = 0 } # update the minimum spacing if ( $ng == 0 || $spaces < $extra_leading_spaces ) { $extra_leading_spaces = $spaces; } } # update the indentation object because with -icp the terminal # ');' will use the same adjustment. 
$object->permanently_decrease_available_spaces( -$extra_leading_spaces ); return $extra_leading_spaces; } ## end sub get_extra_leading_spaces sub forget_side_comment { my ($self) = @_; $self->[_last_side_comment_column_] = 0; return; } sub is_good_side_comment_column { my ( $self, $line, $line_number, $level, $num5 ) = @_; # Upon encountering the first side comment of a group, decide if # a previous side comment should be forgotten. This involves # checking several rules. # Return true to KEEP old comment location # Return false to FORGET old comment location my $KEEP = 1; my $FORGET = 0; my $rfields = $line->{'rfields'}; my $is_hanging_side_comment = $line->{'is_hanging_side_comment'}; # RULE1: Never forget comment before a hanging side comment return $KEEP if ($is_hanging_side_comment); # RULE2: Forget a side comment after a short line difference, # where 'short line difference' is computed from a formula. # Using a smooth formula helps minimize sudden large changes. my $line_diff = $line_number - $self->[_last_side_comment_line_number_]; my $alev_diff = abs( $level - $self->[_last_side_comment_level_] ); # '$num5' is the number of comments in the first 5 lines after the first # comment. It is needed to keep a compact group of side comments from # being influenced by a more distant side comment. $num5 = 1 unless ($num5); # Some values: # $adiff $num5 $short_diff # 0 * 12 # 1 1 6 # 1 2 4 # 1 3 3 # 1 4 2 # 2 1 4 # 2 2 2 # 2 3 1 # 3 1 3 # 3 2 1 my $short_diff = SC_LONG_LINE_DIFF / ( 1 + $alev_diff * $num5 ); return $FORGET if ( $line_diff > $short_diff || !$self->[_rOpts_valign_side_comments_] ); # RULE3: Forget a side comment if this line is at lower level and # ends a block my $last_sc_level = $self->[_last_side_comment_level_]; return $FORGET if ( $level < $last_sc_level && $is_closing_block_type{ substr( $rfields->[0], 0, 1 ) } ); # RULE 4: Forget the last side comment if this comment might join a cached # line ... 
    if ( my $cached_line_type = get_cached_line_type() ) {

        # ... otherwise side comment alignment will get messed up.
        # For example, in the following test script
        # run with 'perltidy -sct -act=2', the last comment would try to
        # align with the previous and then be in the wrong column when
        # the lines are combined:
        # foreach $line (
        #    [0, 1, 2], [3, 4, 5], [6, 7, 8],    # rows
        #    [0, 3, 6], [1, 4, 7], [2, 5, 8],    # columns
        #    [0, 4, 8], [2, 4, 6]
        #   )                                    # diagonals
        return $FORGET
          if ( $cached_line_type == 2 || $cached_line_type == 4 );
    }

    # Otherwise, keep it alive
    return $KEEP;
} ## end sub is_good_side_comment_column

sub align_side_comments {

    my ( $self, $rlines, $rgroups ) = @_;

    # Align any side comments in this batch of lines

    # Given:
    #  $rlines  - the lines
    #  $rgroups - the partition of the lines into groups
    #
    # We will be working group-by-group because all side comments
    # (real or fake) in each group are already aligned. So we just have
    # to make alignments between groups wherever possible.

    # An unusual aspect is that within each group we have aligned both real
    # and fake side comments. This has the consequence that the lengths of
    # long lines without real side comments can push all side comments
    # to the right. This seems unusual, but testing with and without this
    # feature shows that it is usually better this way. Otherwise, side
    # comments can be hidden between long lines without side comments and
    # thus be harder to read.

    my $group_level        = $self->[_group_level_];
    my $continuing_sc_flow = $self->[_last_side_comment_length_] > 0
      && $group_level == $self->[_last_level_written_];

    # Find groups with side comments, and remember the first nonblank comment
    my $j_sc_beg;
    my @todo;
    my $ng = -1;
    foreach my $item ( @{$rgroups} ) {
        $ng++;
        my ( $jbeg, $jend ) = @{$item};
        foreach my $j ( $jbeg ..
$jend ) { my $line = $rlines->[$j]; my $jmax = $line->{'jmax'}; if ( $line->{'rfield_lengths'}->[$jmax] ) { # this group has a line with a side comment push @todo, $ng; if ( !defined($j_sc_beg) ) { $j_sc_beg = $j; } last; } } } # done if no groups with side comments return unless @todo; # Count $num5 = number of comments in the 5 lines after the first comment # This is an important factor in a decision formula my $num5 = 1; foreach my $jj ( $j_sc_beg + 1 .. @{$rlines} - 1 ) { my $ldiff = $jj - $j_sc_beg; last if ( $ldiff > 5 ); my $line = $rlines->[$jj]; my $jmax = $line->{'jmax'}; my $sc_len = $line->{'rfield_lengths'}->[$jmax]; next unless ($sc_len); $num5++; } # Forget the old side comment location if necessary my $line_0 = $rlines->[$j_sc_beg]; my $lnum = $j_sc_beg + $self->[_file_writer_object_]->get_output_line_number(); my $keep_it = $self->is_good_side_comment_column( $line_0, $lnum, $group_level, $num5 ); my $last_side_comment_column = $keep_it ? $self->[_last_side_comment_column_] : 0; # If there are multiple groups we will do two passes # so that we can find a common alignment for all groups. my $MAX_PASS = @todo > 1 ? 2 : 1; # Loop over passes my $max_comment_column = $last_side_comment_column; foreach my $PASS ( 1 .. $MAX_PASS ) { # If there are two passes, then on the last pass make the old column # equal to the largest of the group. This will result in the comments # being aligned if possible. if ( $PASS == $MAX_PASS ) { $last_side_comment_column = $max_comment_column; } # Loop over the groups with side comments my $column_limit; foreach my $ng (@todo) { my ( $jbeg, $jend ) = @{ $rgroups->[$ng] }; # Note that since all lines in a group have common alignments, we # just have to work on one of the lines (the first line). 
my $line = $rlines->[$jbeg]; my $jmax = $line->{'jmax'}; my $is_hanging_side_comment = $line->{'is_hanging_side_comment'}; last if ( $PASS < $MAX_PASS && $is_hanging_side_comment ); # the maximum space without exceeding the line length: my $avail = $line->get_available_space_on_right(); # try to use the previous comment column my $side_comment_column = $line->get_column( $jmax - 1 ); my $move = $last_side_comment_column - $side_comment_column; # Remember the maximum possible column of the first line with # side comment if ( !defined($column_limit) ) { $column_limit = $side_comment_column + $avail; } next if ( $jmax <= 0 ); # but if this doesn't work, give up and use the minimum space my $min_move = $self->[_rOpts_minimum_space_to_comment_] - 1; if ( $move > $avail ) { $move = $min_move; } # but we want some minimum space to the comment if ( $move >= 0 && $j_sc_beg == 0 && $continuing_sc_flow ) { $min_move = 0; } # remove constraints on hanging side comments if ($is_hanging_side_comment) { $min_move = 0 } if ( $move < $min_move ) { $move = $min_move; } # don't exceed the available space if ( $move > $avail ) { $move = $avail } # We can only increase space, never decrease. if ( $move < 0 ) { $move = 0 } # Discover the largest column on the preliminary pass if ( $PASS < $MAX_PASS ) { my $col = $line->get_column( $jmax - 1 ) + $move; # but ignore columns too large for the starting line if ( $col > $max_comment_column && $col < $column_limit ) { $max_comment_column = $col; } } # Make the changes on the final pass else { $line->increase_field_width( $jmax - 1, $move ); # remember this column for the next group $last_side_comment_column = $line->get_column( $jmax - 1 ); } } ## end loop over groups } ## end loop over passes # Find the last side comment my $j_sc_last; my $ng_last = $todo[-1]; my ( $jbeg, $jend ) = @{ $rgroups->[$ng_last] }; foreach my $jj ( reverse( $jbeg .. 
$jend ) ) { my $line = $rlines->[$jj]; my $jmax = $line->{'jmax'}; if ( $line->{'rfield_lengths'}->[$jmax] ) { $j_sc_last = $jj; last; } } # Save final side comment info for possible use by the next batch if ( defined($j_sc_last) ) { my $line_number = $self->[_file_writer_object_]->get_output_line_number() + $j_sc_last; $self->[_last_side_comment_column_] = $last_side_comment_column; $self->[_last_side_comment_line_number_] = $line_number; $self->[_last_side_comment_level_] = $group_level; } return; } ## end sub align_side_comments ############################### # CODE SECTION 6: Output Step A ############################### sub valign_output_step_A { #------------------------------------------------------------ # This is Step A in writing vertically aligned lines. # The line is prepared according to the alignments which have # been found. Then it is shipped to the next step. #------------------------------------------------------------ my ( $self, $rinput_hash ) = @_; my $line = $rinput_hash->{line}; my $min_ci_gap = $rinput_hash->{min_ci_gap}; my $do_not_align = $rinput_hash->{do_not_align}; my $group_leader_length = $rinput_hash->{group_leader_length}; my $extra_leading_spaces = $rinput_hash->{extra_leading_spaces}; my $level = $rinput_hash->{level}; my $maximum_line_length = $rinput_hash->{maximum_line_length}; my $rfields = $line->{'rfields'}; my $rfield_lengths = $line->{'rfield_lengths'}; my $leading_space_count = $line->{'leading_space_count'}; my $outdent_long_lines = $line->{'outdent_long_lines'}; my $maximum_field_index = $line->{'jmax'}; my $rvertical_tightness_flags = $line->{'rvertical_tightness_flags'}; my $Kend = $line->{'Kend'}; my $level_end = $line->{'level_end'}; # Check for valid hash keys at end of lifetime of $line during development DEVEL_MODE && check_keys( $line, \%valid_LINE_keys, "Checking line keys at valign_output_step_A", 1 ); # add any extra spaces if ( $leading_space_count > $group_leader_length ) { $leading_space_count += 
$min_ci_gap; } my $str = $rfields->[0]; my $str_len = $rfield_lengths->[0]; my @alignments = @{ $line->{'ralignments'} }; if ( @alignments != $maximum_field_index + 1 ) { # Shouldn't happen: sub install_new_alignments makes jmax alignments my $jmax_alignments = @alignments - 1; if (DEVEL_MODE) { Fault( "alignment jmax=$jmax_alignments should equal $maximum_field_index\n" ); } $do_not_align = 1; } # loop to concatenate all fields of this line and needed padding my $total_pad_count = 0; for my $j ( 1 .. $maximum_field_index ) { # skip zero-length side comments last if ( ( $j == $maximum_field_index ) && ( !defined( $rfields->[$j] ) || ( $rfield_lengths->[$j] == 0 ) ) ); # compute spaces of padding before this field my $col = $alignments[ $j - 1 ]->{'column'}; my $pad = $col - ( $str_len + $leading_space_count ); if ($do_not_align) { $pad = ( $j < $maximum_field_index ) ? 0 : $self->[_rOpts_minimum_space_to_comment_] - 1; } # if the -fpsc flag is set, move the side comment to the selected # column if and only if it is possible, ignoring constraints on # line length and minimum space to comment if ( $self->[_rOpts_fixed_position_side_comment_] && $j == $maximum_field_index ) { my $newpad = $pad + $self->[_rOpts_fixed_position_side_comment_] - $col - 1; if ( $newpad >= 0 ) { $pad = $newpad; } } # accumulate the padding if ( $pad > 0 ) { $total_pad_count += $pad; } # only add padding when we have a finite field; # this avoids extra terminal spaces if we have empty fields if ( $rfield_lengths->[$j] > 0 ) { $str .= SPACE x $total_pad_count; $str_len += $total_pad_count; $total_pad_count = 0; $str .= $rfields->[$j]; $str_len += $rfield_lengths->[$j]; } else { $total_pad_count = 0; } } my $side_comment_length = $rfield_lengths->[$maximum_field_index]; # ship this line off $self->valign_output_step_B( { leading_space_count => $leading_space_count + $extra_leading_spaces, line => $str, line_length => $str_len, side_comment_length => $side_comment_length, outdent_long_lines => 
$outdent_long_lines, rvertical_tightness_flags => $rvertical_tightness_flags, level => $level, level_end => $level_end, Kend => $Kend, maximum_line_length => $maximum_line_length, } ); return; } ## end sub valign_output_step_A sub combine_fields { # We have a group of two lines for which we do not want to align tokens # between index $imax_align and the side comment. So we will delete fields # between $imax_align and the side comment. Alignments have already # been set so we have to adjust them. my ( $line_0, $line_1, $imax_align ) = @_; if ( !defined($imax_align) ) { $imax_align = -1 } # First delete the unwanted tokens my $jmax_old = $line_0->{'jmax'}; my @idel = ( $imax_align + 1 .. $jmax_old - 2 ); return unless (@idel); # Get old alignments before any changes are made my @old_alignments = @{ $line_0->{'ralignments'} }; foreach my $line ( $line_0, $line_1 ) { delete_selected_tokens( $line, \@idel ); } # Now adjust the alignments. Note that the side comment alignment # is always at jmax-1, and there is an ending alignment at jmax. my @new_alignments; if ( $imax_align >= 0 ) { @new_alignments[ 0 .. $imax_align ] = @old_alignments[ 0 .. $imax_align ]; } my $jmax_new = $line_0->{'jmax'}; $new_alignments[ $jmax_new - 1 ] = $old_alignments[ $jmax_old - 1 ]; $new_alignments[$jmax_new] = $old_alignments[$jmax_old]; $line_0->{'ralignments'} = \@new_alignments; $line_1->{'ralignments'} = \@new_alignments; return; } ## end sub combine_fields sub get_output_line_number { # The output line number reported to a caller = # the number of items still in the buffer + # the number of items written. return $_[0]->group_line_count() + $_[0]->[_file_writer_object_]->get_output_line_number(); } ## end sub get_output_line_number ############################### # CODE SECTION 7: Output Step B ############################### { ## closure for sub valign_output_step_B # These are values for a cache used by valign_output_step_B. 
    my $cached_line_text;
    my $cached_line_text_length;
    my $cached_line_type;
    my $cached_line_opening_flag;
    my $cached_line_closing_flag;
    my $cached_seqno;
    my $cached_line_valid;
    my $cached_line_leading_space_count;
    my $cached_seqno_string;
    my $cached_line_Kend;
    my $cached_line_maximum_length;

    # These are passed to step_C:
    my $seqno_string;
    my $last_nonblank_seqno_string;

    sub set_last_nonblank_seqno_string {
        my ($val) = @_;
        $last_nonblank_seqno_string = $val;
        return;
    }

    sub get_cached_line_opening_flag {
        return $cached_line_opening_flag;
    }

    sub get_cached_line_type {
        return $cached_line_type;
    }

    sub set_cached_line_valid {
        my ($val) = @_;
        $cached_line_valid = $val;
        return;
    }

    sub get_cached_seqno {
        return $cached_seqno;
    }

    sub initialize_step_B_cache {

        # valign_output_step_B cache:
        $cached_line_text                = EMPTY_STRING;
        $cached_line_text_length         = 0;
        $cached_line_type                = 0;
        $cached_line_opening_flag        = 0;
        $cached_line_closing_flag        = 0;
        $cached_seqno                    = 0;
        $cached_line_valid               = 0;
        $cached_line_leading_space_count = 0;
        $cached_seqno_string             = EMPTY_STRING;
        $cached_line_Kend                = undef;
        $cached_line_maximum_length      = undef;

        # These vars hold a string of sequence numbers joined together used by
        # the cache
        $seqno_string               = EMPTY_STRING;
        $last_nonblank_seqno_string = EMPTY_STRING;
        return;
    } ## end sub initialize_step_B_cache

    sub _flush_step_B_cache {
        my ($self) = @_;

        # Send any text in the step_B cache on to step_C
        if ($cached_line_type) {
            $seqno_string = $cached_seqno_string;
            $self->valign_output_step_C(
                $seqno_string,
                $last_nonblank_seqno_string,

                $cached_line_text,
                $cached_line_leading_space_count,
                $self->[_last_level_written_],
                $cached_line_Kend,
            );
            $cached_line_type           = 0;
            $cached_line_text           = EMPTY_STRING;
            $cached_line_text_length    = 0;
            $cached_seqno_string        = EMPTY_STRING;
            $cached_line_Kend           = undef;
            $cached_line_maximum_length = undef;
        }
        return;
    } ## end sub _flush_step_B_cache

    sub handle_cached_line {

        my ( $self, $rinput, $leading_string, $leading_string_length ) = @_;

        # The cached line will
        # either be:
        # - passed along to step_C, or
        # - combined with the current line

        my $last_level_written = $self->[_last_level_written_];

        my $leading_space_count       = $rinput->{leading_space_count};
        my $str                       = $rinput->{line};
        my $str_length                = $rinput->{line_length};
        my $rvertical_tightness_flags = $rinput->{rvertical_tightness_flags};
        my $level                     = $rinput->{level};
        my $level_end                 = $rinput->{level_end};
        my $maximum_line_length       = $rinput->{maximum_line_length};

        my ( $open_or_close, $opening_flag, $closing_flag, $seqno, $valid,
            $seqno_beg, $seqno_end );
        if ($rvertical_tightness_flags) {

            $open_or_close = $rvertical_tightness_flags->{_vt_type};
            $seqno_beg     = $rvertical_tightness_flags->{_vt_seqno_beg};
        }

        # Dump an invalid cached line
        if ( !$cached_line_valid ) {
            $self->valign_output_step_C(
                $seqno_string,
                $last_nonblank_seqno_string,

                $cached_line_text,
                $cached_line_leading_space_count,
                $last_level_written,
                $cached_line_Kend,
            );
        }

        # Handle cached line ending in OPENING tokens
        elsif ( $cached_line_type == 1 || $cached_line_type == 3 ) {

            my $gap = $leading_space_count - $cached_line_text_length;

            # handle option of just one tight opening per line:
            if ( $cached_line_opening_flag == 1 ) {
                if ( defined($open_or_close) && $open_or_close == 1 ) {
                    $gap = -1;
                }
            }

            # Do not join the lines if this might produce a one-line
            # container which exceeds the maximum line length. This is
            # necessary to prevent blinking, particularly with the combination
            # -xci -pvt=2. In that case a one-line block alternately forms
            # and breaks, causing -xci to alternately turn on and off (case
            # b765).
            # Patched to fix cases b656 b862 b971 b972: always do the check
            # if the maximum line length changes (due to -vmll).
if ( $gap >= 0 && ( $maximum_line_length != $cached_line_maximum_length || ( defined($level_end) && $level > $level_end ) ) ) { my $test_line_length = $cached_line_text_length + $gap + $str_length; # Add a small tolerance in the length test (fixes case b862) if ( $test_line_length > $cached_line_maximum_length - 2 ) { $gap = -1; } } if ( $gap >= 0 && defined($seqno_beg) ) { $maximum_line_length = $cached_line_maximum_length; $leading_string = $cached_line_text . SPACE x $gap; $leading_string_length = $cached_line_text_length + $gap; $leading_space_count = $cached_line_leading_space_count; $seqno_string = $cached_seqno_string . ':' . $seqno_beg; $level = $last_level_written; } else { $self->valign_output_step_C( $seqno_string, $last_nonblank_seqno_string, $cached_line_text, $cached_line_leading_space_count, $last_level_written, $cached_line_Kend, ); } } # Handle cached line ending in CLOSING tokens else { my $test_line = $cached_line_text . SPACE x $cached_line_closing_flag . $str; my $test_line_length = $cached_line_text_length + $cached_line_closing_flag + $str_length; if ( # The new line must start with container $seqno_beg # The container combination must be okay.. && ( # okay to combine like types ( $open_or_close == $cached_line_type ) # closing block brace may append to non-block || ( $cached_line_type == 2 && $open_or_close == 4 ) # something like ');' || ( !$open_or_close && $cached_line_type == 2 ) ) # The combined line must fit && ( $test_line_length <= $cached_line_maximum_length ) ) { $seqno_string = $cached_seqno_string . ':' . $seqno_beg; # Patch to outdent closing tokens ending # in ');' If we # are joining a line like ');' to a previous stacked set of # closing tokens, then decide if we may outdent the # combined stack to the indentation of the ');'. 
Since we # should not normally outdent any of the other tokens more # than the indentation of the lines that contained them, we # will only do this if all of the corresponding opening # tokens were on the same line. This can happen with -sot # and -sct. # For example, it is ok here: # __PACKAGE__->load_components( qw( # PK::Auto # Core # )); # # But, for example, we do not outdent in this example # because that would put the closing sub brace out farther # than the opening sub brace: # # perltidy -sot -sct # $c->Tk::bind( # '' => sub { # my ($c) = @_; # my $e = $c->XEvent; # itemsUnderArea $c; # } ); # if ( $str =~ /^\);/ && $cached_line_text =~ /^[\)\}\]\s]*$/ ) { # The way to tell this is if the stacked sequence # numbers of this output line are the reverse of the # stacked sequence numbers of the previous non-blank # line of sequence numbers. So we can join if the # previous nonblank string of tokens is the mirror # image. For example if stack )}] is 13:8:6 then we # are looking for a leading stack like [{( which # is 6:8:13. We only need to check the two ends, # because the intermediate tokens must fall in order. # Note on speed: having to split on colons and # eliminate multiple colons might appear to be slow, # but it's not an issue because we almost never come # through here. In a typical file we don't. $seqno_string =~ s/^:+//; $last_nonblank_seqno_string =~ s/^:+//; $seqno_string =~ s/:+/:/g; $last_nonblank_seqno_string =~ s/:+/:/g; # how many spaces can we outdent? my $diff = $cached_line_leading_space_count - $leading_space_count; if ( $diff > 0 && length($seqno_string) && length($last_nonblank_seqno_string) == length($seqno_string) ) { my @seqno_last = ( split /:/, $last_nonblank_seqno_string ); my @seqno_now = ( split /:/, $seqno_string ); if ( @seqno_now && @seqno_last && $seqno_now[-1] == $seqno_last[0] && $seqno_now[0] == $seqno_last[-1] ) { # OK to outdent .. 
# for absolute safety, be sure we only remove # whitespace my $ws = substr( $test_line, 0, $diff ); if ( ( length($ws) == $diff ) && $ws =~ /^\s+$/ ) { $test_line = substr( $test_line, $diff ); $cached_line_leading_space_count -= $diff; $last_level_written = $self->level_change( $cached_line_leading_space_count, $diff, $last_level_written ); $self->reduce_valign_buffer_indentation($diff); } # shouldn't happen, but not critical: ##else { ## ERROR transferring indentation here ##} } } } # Change the args to look like we received the combined line $str = $test_line; $str_length = $test_line_length; $leading_string = EMPTY_STRING; $leading_string_length = 0; $leading_space_count = $cached_line_leading_space_count; $level = $last_level_written; $maximum_line_length = $cached_line_maximum_length; } else { $self->valign_output_step_C( $seqno_string, $last_nonblank_seqno_string, $cached_line_text, $cached_line_leading_space_count, $last_level_written, $cached_line_Kend, ); } } return ( $str, $str_length, $leading_string, $leading_string_length, $leading_space_count, $level, $maximum_line_length, ); } ## end sub handle_cached_line sub valign_output_step_B { #--------------------------------------------------------- # This is Step B in writing vertically aligned lines. # Vertical tightness is applied according to preset flags. # In particular this routine handles stacking of opening # and closing tokens. 
#--------------------------------------------------------- my ( $self, $rinput ) = @_; my $leading_space_count = $rinput->{leading_space_count}; my $str = $rinput->{line}; my $str_length = $rinput->{line_length}; my $side_comment_length = $rinput->{side_comment_length}; my $outdent_long_lines = $rinput->{outdent_long_lines}; my $rvertical_tightness_flags = $rinput->{rvertical_tightness_flags}; my $level = $rinput->{level}; my $level_end = $rinput->{level_end}; my $Kend = $rinput->{Kend}; my $maximum_line_length = $rinput->{maximum_line_length}; # Useful -gcs test cases for wide characters are # perl527/(method.t.2, reg_mesg.t, mime-header.t) # handle outdenting of long lines: my $is_outdented_line; if ($outdent_long_lines) { my $excess = $str_length - $side_comment_length + $leading_space_count - $maximum_line_length; if ( $excess > 0 ) { $leading_space_count = 0; my $file_writer_object = $self->[_file_writer_object_]; my $last_outdented_line_at = $file_writer_object->get_output_line_number(); $self->[_last_outdented_line_at_] = $last_outdented_line_at; my $outdented_line_count = $self->[_outdented_line_count_]; unless ($outdented_line_count) { $self->[_first_outdented_line_at_] = $last_outdented_line_at; } $outdented_line_count++; $self->[_outdented_line_count_] = $outdented_line_count; $is_outdented_line = 1; } } # Make preliminary leading whitespace. It could get changed # later by entabbing, so we have to keep track of any changes # to the leading_space_count from here on. my $leading_string = $leading_space_count > 0 ? 
          ( SPACE x $leading_space_count ) : EMPTY_STRING;
        my $leading_string_length = length($leading_string);

        # Unpack any recombination data; it was packed by
        # sub 'Formatter::set_vertical_tightness_flags'

        # old   hash              Meaning
        # index key
        #
        # 0   _vt_type:           1=opening non-block    2=closing non-block
        #                         3=opening block brace  4=closing block brace
        #
        # 1a  _vt_opening_flag:   1=no multiple steps, 2=multiple steps ok
        # 1b  _vt_closing_flag:   spaces of padding to use if closing
        # 2   _vt_seqno:          sequence number of container
        # 3   _vt_valid_flag:     do not append if this flag is false. Will be
        #                         true if appropriate -vt flag is set. Otherwise, will be
        #                         made true only for 2 line container in parens with -lp
        # 4   _vt_seqno_beg:      sequence number of first token of line
        # 5   _vt_seqno_end:      sequence number of last token of line
        # 6   _vt_min_lines:      min number of lines for joining opening cache,
        #                         0=no constraint
        # 7   _vt_max_lines:      max number of lines for joining opening cache,
        #                         0=no constraint

        my ( $open_or_close, $opening_flag, $closing_flag, $seqno, $valid,
            $seqno_beg, $seqno_end );
        if ($rvertical_tightness_flags) {

            $open_or_close = $rvertical_tightness_flags->{_vt_type};
            $opening_flag  = $rvertical_tightness_flags->{_vt_opening_flag};
            $closing_flag  = $rvertical_tightness_flags->{_vt_closing_flag};
            $seqno         = $rvertical_tightness_flags->{_vt_seqno};
            $valid         = $rvertical_tightness_flags->{_vt_valid_flag};
            $seqno_beg     = $rvertical_tightness_flags->{_vt_seqno_beg};
            $seqno_end     = $rvertical_tightness_flags->{_vt_seqno_end};
        }

        $seqno_string = $seqno_end;

        # handle any cached line ..
        # either append this line to it or write it out
        # Note: the function length() is used in this next test out of caution.
        # All testing has shown that the variable $cached_line_text_length is
        # correct, but its calculation is complex and a loss of cached text
        # would be a disaster.
if ( length($cached_line_text) ) { ( $str, $str_length, $leading_string, $leading_string_length, $leading_space_count, $level, $maximum_line_length ) = $self->handle_cached_line( $rinput, $leading_string, $leading_string_length ); $cached_line_type = 0; $cached_line_text = EMPTY_STRING; $cached_line_text_length = 0; $cached_line_Kend = undef; $cached_line_maximum_length = undef; } # make the line to be written my $line = $leading_string . $str; my $line_length = $leading_string_length + $str_length; # Safety check: be sure that a line to be cached as a stacked block # brace line ends in the appropriate opening or closing block brace. # This should always be the case if the caller set flags correctly. # Code '3' is for -sobb, code '4' is for -scbb. if ($open_or_close) { if ( $open_or_close == 3 && $line !~ /\{\s*$/ || $open_or_close == 4 && $line !~ /\}\s*$/ ) { $open_or_close = 0; } } # write or cache this line ... # fix for case b999: do not cache an outdented line # fix for b1378: do not cache an empty line if ( !$open_or_close || $side_comment_length > 0 || $is_outdented_line || !$line_length ) { $self->valign_output_step_C( $seqno_string, $last_nonblank_seqno_string, $line, $leading_space_count, $level, $Kend, ); } else { $cached_line_text = $line; $cached_line_text_length = $line_length; $cached_line_type = $open_or_close; $cached_line_opening_flag = $opening_flag; $cached_line_closing_flag = $closing_flag; $cached_seqno = $seqno; $cached_line_valid = $valid; $cached_line_leading_space_count = $leading_space_count; $cached_seqno_string = $seqno_string; $cached_line_Kend = $Kend; $cached_line_maximum_length = $maximum_line_length; } $self->[_last_level_written_] = $level; $self->[_last_side_comment_length_] = $side_comment_length; return; } ## end sub valign_output_step_B } ############################### # CODE SECTION 8: Output Step C ############################### { ## closure for sub valign_output_step_C # Vertical alignment buffer used by 
    # valign_output_step_C
    my $valign_buffer_filling;
    my @valign_buffer;

    sub initialize_valign_buffer {
        @valign_buffer         = ();
        $valign_buffer_filling = EMPTY_STRING;
        return;
    }

    sub dump_valign_buffer {
        my ($self) = @_;

        # Send all lines in the current buffer on to step_D
        if (@valign_buffer) {
            foreach (@valign_buffer) {
                $self->valign_output_step_D( @{$_} );
            }
            @valign_buffer = ();
        }
        $valign_buffer_filling = EMPTY_STRING;
        return;
    } ## end sub dump_valign_buffer

    sub reduce_valign_buffer_indentation {

        my ( $self, $diff ) = @_;

        # Reduce the leading indentation of lines in the current
        # buffer by $diff spaces
        if ( $valign_buffer_filling && $diff ) {
            my $max_valign_buffer = @valign_buffer;
            foreach my $i ( 0 .. $max_valign_buffer - 1 ) {
                my ( $line, $leading_space_count, $level, $Kend ) =
                  @{ $valign_buffer[$i] };
                my $ws = substr( $line, 0, $diff );
                if ( ( length($ws) == $diff ) && $ws =~ /^\s+$/ ) {
                    $line = substr( $line, $diff );
                }
                if ( $leading_space_count >= $diff ) {
                    $leading_space_count -= $diff;
                    $level =
                      $self->level_change( $leading_space_count, $diff,
                        $level );
                }
                $valign_buffer[$i] =
                  [ $line, $leading_space_count, $level, $Kend ];
            }
        }
        return;
    } ## end sub reduce_valign_buffer_indentation

    sub valign_output_step_C {

        #-----------------------------------------------------------------------
        # This is Step C in writing vertically aligned lines.
        # Lines are either stored in a buffer or passed along to the next step.
        # The reason for storing lines is that we may later want to reduce their
        # indentation when -sot and -sct are both used.
        #-----------------------------------------------------------------------
        my (
            $self,
            $seqno_string,
            $last_nonblank_seqno_string,

            @args_to_D,
        ) = @_;

        # Dump any saved lines if we see a line with an unbalanced opening or
        # closing token.
        $self->dump_valign_buffer()
          if ( $seqno_string && $valign_buffer_filling );

        # Either store or write this line
        if ($valign_buffer_filling) {
            push @valign_buffer, [@args_to_D];
        }
        else {
            $self->valign_output_step_D(@args_to_D);
        }

        # For lines starting or ending with opening or closing tokens..
        if ($seqno_string) {
            $last_nonblank_seqno_string = $seqno_string;
            set_last_nonblank_seqno_string($seqno_string);

            # Start storing lines when we see a line with multiple stacked
            # opening tokens.
            # patch for RT #94354, requested by Colin Williams
            if (   index( $seqno_string, ':' ) >= 0
                && $seqno_string =~ /^\d+(\:+\d+)+$/
                && $args_to_D[0] !~ /^[\}\)\]\:\?]/ )
            {

                # This test is efficient but a little subtle: The first test
                # says that we have multiple sequence numbers and hence
                # multiple opening or closing tokens in this line.  The second
                # part of the test rejects stacked closing and ternary tokens.
                # So if we get here then we should have stacked unbalanced
                # opening tokens.

                # Here is a complex example:

                # Foo($Bar[0], {  # (side comment)
                #     baz => 1,
                # });

                # The first line has sequence 6::4.  It does not begin with
                # a closing token or ternary, so it passes the test and must be
                # stacked opening tokens.

                # The last line has sequence 4:6 but is a stack of closing
                # tokens, so it gets rejected.

                # Note that the sequence number of an opening token for a qw
                # quote is a negative number and will be rejected.  For
                # example, for the following line: skip_symbols([qw(
                # $seqno_string='10:5:-1'.  It would be okay to accept it but I
                # decided not to do this after testing.

                $valign_buffer_filling = $seqno_string;

            }
        }
        return;
    } ## end sub valign_output_step_C
}

###############################
# CODE SECTION 9: Output Step D
###############################

sub valign_output_step_D {

    #----------------------------------------------------------------
    # This is Step D in writing vertically aligned lines.
    # It is the end of the vertical alignment pipeline.
# Write one vertically aligned line of code to the output object. #---------------------------------------------------------------- my ( $self, $line, $leading_space_count, $level, $Kend ) = @_; # The line is currently correct if there is no tabbing (recommended!) # We may have to lop off some leading spaces and replace with tabs. if ( $leading_space_count > 0 ) { my $rOpts_indent_columns = $self->[_rOpts_indent_columns_]; my $rOpts_tabs = $self->[_rOpts_tabs_]; my $rOpts_entab_leading_whitespace = $self->[_rOpts_entab_leading_whitespace_]; # Nothing to do if no tabs if ( !( $rOpts_tabs || $rOpts_entab_leading_whitespace ) || $rOpts_indent_columns <= 0 ) { # nothing to do } # Handle entab option elsif ($rOpts_entab_leading_whitespace) { # Patch 12-nov-2018 based on report from Glenn. Extra padding was # not correctly entabbed, nor were side comments: Increase leading # space count for a padded line to get correct tabbing if ( $line =~ /^(\s+)(.*)$/ ) { my $spaces = length($1); if ( $spaces > $leading_space_count ) { $leading_space_count = $spaces; } } my $space_count = $leading_space_count % $rOpts_entab_leading_whitespace; my $tab_count = int( $leading_space_count / $rOpts_entab_leading_whitespace ); my $leading_string = "\t" x $tab_count . 
SPACE x $space_count; if ( $line =~ /^\s{$leading_space_count,$leading_space_count}/ ) { substr( $line, 0, $leading_space_count ) = $leading_string; } else { # shouldn't happen - program error counting whitespace # - skip entabbing DEBUG_TABS && warning( "Error entabbing in valign_output_step_D: expected count=$leading_space_count\n" ); } } # Handle option of one tab per level else { my $leading_string = ( "\t" x $level ); my $space_count = $leading_space_count - $level * $rOpts_indent_columns; # shouldn't happen: if ( $space_count < 0 ) { # But it could be an outdented comment if ( $line !~ /^\s*#/ ) { DEBUG_TABS && warning( "Error entabbing in valign_output_step_D: for level=$level count=$leading_space_count\n" ); } $leading_string = ( SPACE x $leading_space_count ); } else { $leading_string .= ( SPACE x $space_count ); } if ( $line =~ /^\s{$leading_space_count,$leading_space_count}/ ) { substr( $line, 0, $leading_space_count ) = $leading_string; } else { # shouldn't happen - program error counting whitespace # we'll skip entabbing DEBUG_TABS && warning( "Error entabbing in valign_output_step_D: expected count=$leading_space_count\n" ); } } } my $file_writer_object = $self->[_file_writer_object_]; $file_writer_object->write_code_line( $line . "\n", $Kend ); return; } ## end sub valign_output_step_D { ## closure for sub get_leading_string my @leading_string_cache; sub initialize_leading_string_cache { @leading_string_cache = (); return; } sub get_leading_string { # define the leading whitespace string for this line.. 
my ( $self, $leading_whitespace_count, $group_level ) = @_; # Handle case of zero whitespace, which includes multi-line quotes # (which may have a finite level; this prevents tab problems) if ( $leading_whitespace_count <= 0 ) { return EMPTY_STRING; } # look for previous result elsif ( $leading_string_cache[$leading_whitespace_count] ) { return $leading_string_cache[$leading_whitespace_count]; } # must compute a string for this number of spaces my $leading_string; # Handle simple case of no tabs my $rOpts_indent_columns = $self->[_rOpts_indent_columns_]; my $rOpts_tabs = $self->[_rOpts_tabs_]; my $rOpts_entab_leading_whitespace = $self->[_rOpts_entab_leading_whitespace_]; if ( !( $rOpts_tabs || $rOpts_entab_leading_whitespace ) || $rOpts_indent_columns <= 0 ) { $leading_string = ( SPACE x $leading_whitespace_count ); } # Handle entab option elsif ($rOpts_entab_leading_whitespace) { my $space_count = $leading_whitespace_count % $rOpts_entab_leading_whitespace; my $tab_count = int( $leading_whitespace_count / $rOpts_entab_leading_whitespace ); $leading_string = "\t" x $tab_count . 
SPACE x $space_count; } # Handle option of one tab per level else { $leading_string = ( "\t" x $group_level ); my $space_count = $leading_whitespace_count - $group_level * $rOpts_indent_columns; # shouldn't happen: if ( $space_count < 0 ) { DEBUG_TABS && warning( "Error in get_leading_string: for level=$group_level count=$leading_whitespace_count\n" ); # -- skip entabbing $leading_string = ( SPACE x $leading_whitespace_count ); } else { $leading_string .= ( SPACE x $space_count ); } } $leading_string_cache[$leading_whitespace_count] = $leading_string; return $leading_string; } ## end sub get_leading_string } ## end get_leading_string ########################## # CODE SECTION 10: Summary ########################## sub report_anything_unusual { my $self = shift; my $outdented_line_count = $self->[_outdented_line_count_]; if ( $outdented_line_count > 0 ) { write_logfile_entry( "$outdented_line_count long lines were outdented:\n"); my $first_outdented_line_at = $self->[_first_outdented_line_at_]; write_logfile_entry( " First at output line $first_outdented_line_at\n"); if ( $outdented_line_count > 1 ) { my $last_outdented_line_at = $self->[_last_outdented_line_at_]; write_logfile_entry( " Last at output line $last_outdented_line_at\n"); } write_logfile_entry( " use -noll to prevent outdenting, -l=n to increase line length\n" ); write_logfile_entry("\n"); } return; } ## end sub report_anything_unusual 1; Perl-Tidy-20230309/lib/Perl/Tidy/Debugger.pm0000644000175000017500000000665614400733176017312 0ustar stevesteve##################################################################### # # The Perl::Tidy::Debugger class shows line tokenization # ##################################################################### package Perl::Tidy::Debugger; use strict; use warnings; use English qw( -no_match_vars ); our $VERSION = '20230309'; use constant EMPTY_STRING => q{}; use constant SPACE => q{ }; sub new { my ( $class, $filename, $is_encoded_data ) = @_; return bless { _debug_file 
=> $filename, _debug_file_opened => 0, _fh => undef, _is_encoded_data => $is_encoded_data, }, $class; } ## end sub new sub really_open_debug_file { my $self = shift; my $debug_file = $self->{_debug_file}; my $is_encoded_data = $self->{_is_encoded_data}; my ( $fh, $filename ) = Perl::Tidy::streamhandle( $debug_file, 'w', $is_encoded_data ); if ( !$fh ) { Perl::Tidy::Warn("can't open $debug_file: $ERRNO\n"); } $self->{_debug_file_opened} = 1; $self->{_fh} = $fh; $fh->print( "Use -dump-token-types (-dtt) to get a list of token type codes\n"); return; } ## end sub really_open_debug_file sub close_debug_file { my $self = shift; if ( $self->{_debug_file_opened} ) { if ( !eval { $self->{_fh}->close(); 1 } ) { # ok, maybe no close function } } return; } ## end sub close_debug_file sub write_debug_entry { # This is a debug dump routine which may be modified as necessary # to dump tokens on a line-by-line basis. The output will be written # to the .DEBUG file when the -D flag is entered. my ( $self, $line_of_tokens ) = @_; my $input_line = $line_of_tokens->{_line_text}; my $rtoken_type = $line_of_tokens->{_rtoken_type}; my $rtokens = $line_of_tokens->{_rtokens}; my $rlevels = $line_of_tokens->{_rlevels}; my $input_line_number = $line_of_tokens->{_line_number}; my $line_type = $line_of_tokens->{_line_type}; my ( $j, $num ); my $token_str = "$input_line_number: "; my $reconstructed_original = "$input_line_number: "; my $pattern = EMPTY_STRING; my @next_char = ( '"', '"' ); my $i_next = 0; unless ( $self->{_debug_file_opened} ) { $self->really_open_debug_file() } my $fh = $self->{_fh}; foreach my $j ( 0 .. 
@{$rtoken_type} - 1 ) { # testing patterns if ( $rtoken_type->[$j] eq 'k' ) { $pattern .= $rtokens->[$j]; } else { $pattern .= $rtoken_type->[$j]; } $reconstructed_original .= $rtokens->[$j]; $num = length( $rtokens->[$j] ); my $type_str = $rtoken_type->[$j]; # be sure there are no blank tokens (shouldn't happen) # This can only happen if a programming error has been made # because all valid tokens are non-blank if ( $type_str eq SPACE ) { $fh->print("BLANK TOKEN on the next line\n"); $type_str = $next_char[$i_next]; $i_next = 1 - $i_next; } if ( length($type_str) == 1 ) { $type_str = $type_str x $num; } $token_str .= $type_str; } # Write what you want here ... # $fh->print "$input_line\n"; # $fh->print "$pattern\n"; $fh->print("$reconstructed_original\n"); $fh->print("$token_str\n"); return; } ## end sub write_debug_entry 1; Perl-Tidy-20230309/lib/Perl/Tidy/IOScalarArray.pm0000644000175000017500000000575314400733202020205 0ustar stevesteve##################################################################### # # This is a stripped down version of IO::ScalarArray # Given a reference to an array, it supplies either: # a getline method which reads lines (mode='r'), or # a print method which writes lines (mode='w') # # NOTE: this routine assumes that there aren't any embedded # newlines within any of the array elements. There are no checks # for that. # ##################################################################### package Perl::Tidy::IOScalarArray; use strict; use warnings; use Carp; our $VERSION = '20230309'; sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. 
our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR <[1]; if ( $mode ne 'r' ) { confess <[2]++; return $self->[0]->[$i]; } sub print { my ( $self, $msg ) = @_; my $mode = $self->[1]; if ( $mode ne 'w' ) { confess <[0] }, $msg; return; } sub close { return } 1; Perl-Tidy-20230309/lib/Perl/Tidy/Tokenizer.pm0000644000175000017500000133651414400733205017531 0ustar stevesteve##################################################################### # # The Perl::Tidy::Tokenizer package is essentially a filter which # reads lines of perl source code from a source object and provides # corresponding tokenized lines through its get_line() method. Lines # flow from the source_object to the caller like this: # # source_object --> LineBuffer_object --> Tokenizer --> calling routine # get_line() get_line() get_line() line_of_tokens # # The source object can be any object with a get_line() method which # supplies one line (a character string) per call. # The LineBuffer object is created by the Tokenizer. # The Tokenizer returns a reference to a data structure 'line_of_tokens' # containing one tokenized line for each call to its get_line() method. # # WARNING: This is not a real class. Only one tokenizer may be used. # ######################################################################## package Perl::Tidy::Tokenizer; use strict; use warnings; use English qw( -no_match_vars ); our $VERSION = '20230309'; use Perl::Tidy::LineBuffer; use Carp; use constant DEVEL_MODE => 0; use constant EMPTY_STRING => q{}; use constant SPACE => q{ }; # Decimal values of some ascii characters for quick checks use constant ORD_TAB => 9; use constant ORD_SPACE => 32; use constant ORD_PRINTABLE_MIN => 33; use constant ORD_PRINTABLE_MAX => 126; # PACKAGE VARIABLES for processing an entire FILE. # These must be package variables because most may get localized during # processing. 
Most are initialized in sub prepare_for_a_new_file. use vars qw{ $tokenizer_self $last_nonblank_token $last_nonblank_type $last_nonblank_block_type $statement_type $in_attribute_list $current_package $context %is_constant %is_user_function %user_function_prototype %is_block_function %is_block_list_function %saw_function_definition %saw_use_module $brace_depth $paren_depth $square_bracket_depth @current_depth @total_depth $total_depth $next_sequence_number @nesting_sequence_number @current_sequence_number @paren_type @paren_semicolon_count @paren_structural_type @brace_type @brace_structural_type @brace_context @brace_package @square_bracket_type @square_bracket_structural_type @depth_array @nested_ternary_flag @nested_statement_type @starting_line_of_current_depth }; # GLOBAL CONSTANTS for routines in this package, # Initialized in a BEGIN block. use vars qw{ %is_indirect_object_taker %is_block_operator %expecting_operator_token %expecting_operator_types %expecting_term_types %expecting_term_token %is_digraph %can_start_digraph %is_file_test_operator %is_trigraph %is_tetragraph %is_valid_token_type %is_keyword %is_my_our_state %is_code_block_token %is_sort_map_grep_eval_do %is_sort_map_grep %is_grep_alias %really_want_term @opening_brace_names @closing_brace_names %is_keyword_taking_list %is_keyword_taking_optional_arg %is_keyword_rejecting_slash_as_pattern_delimiter %is_keyword_rejecting_question_as_pattern_delimiter %is_q_qq_qx_qr_s_y_tr_m %is_q_qq_qw_qx_qr_s_y_tr_m %is_sub %is_package %is_comma_question_colon %is_if_elsif_unless %is_if_elsif_unless_case_when %other_line_endings %is_END_DATA_format_sub %is_semicolon_or_t $code_skipping_pattern_begin $code_skipping_pattern_end }; # GLOBAL VARIABLES which are constant after being configured by user-supplied # parameters. They remain constant as a file is being processed. 
my ( $rOpts_code_skipping, $code_skipping_pattern_begin, $code_skipping_pattern_end, ); # possible values of operator_expected() use constant TERM => -1; use constant UNKNOWN => 0; use constant OPERATOR => 1; # possible values of context use constant SCALAR_CONTEXT => -1; use constant UNKNOWN_CONTEXT => 0; use constant LIST_CONTEXT => 1; # Maximum number of little messages; probably need not be changed. use constant MAX_NAG_MESSAGES => 6; BEGIN { # Array index names for $self. # Do not combine with other BEGIN blocks (c101). my $i = 0; use constant { _rhere_target_list_ => $i++, _in_here_doc_ => $i++, _here_doc_target_ => $i++, _here_quote_character_ => $i++, _in_data_ => $i++, _in_end_ => $i++, _in_format_ => $i++, _in_error_ => $i++, _in_pod_ => $i++, _in_skipped_ => $i++, _in_attribute_list_ => $i++, _in_quote_ => $i++, _quote_target_ => $i++, _line_start_quote_ => $i++, _starting_level_ => $i++, _know_starting_level_ => $i++, _tabsize_ => $i++, _indent_columns_ => $i++, _look_for_hash_bang_ => $i++, _trim_qw_ => $i++, _continuation_indentation_ => $i++, _outdent_labels_ => $i++, _last_line_number_ => $i++, _saw_perl_dash_P_ => $i++, _saw_perl_dash_w_ => $i++, _saw_use_strict_ => $i++, _saw_v_string_ => $i++, _hit_bug_ => $i++, _look_for_autoloader_ => $i++, _look_for_selfloader_ => $i++, _saw_autoloader_ => $i++, _saw_selfloader_ => $i++, _saw_hash_bang_ => $i++, _saw_end_ => $i++, _saw_data_ => $i++, _saw_negative_indentation_ => $i++, _started_tokenizing_ => $i++, _line_buffer_object_ => $i++, _debugger_object_ => $i++, _diagnostics_object_ => $i++, _logger_object_ => $i++, _unexpected_error_count_ => $i++, _started_looking_for_here_target_at_ => $i++, _nearly_matched_here_target_at_ => $i++, _line_of_text_ => $i++, _rlower_case_labels_at_ => $i++, _extended_syntax_ => $i++, _maximum_level_ => $i++, _true_brace_error_count_ => $i++, _rOpts_maximum_level_errors_ => $i++, _rOpts_maximum_unexpected_errors_ => $i++, _rOpts_logfile_ => $i++, _rOpts_ => $i++, }; } 
## end BEGIN { ## closure for subs to count instances # methods to count instances my $_count = 0; sub get_count { return $_count; } sub _increment_count { return ++$_count } sub _decrement_count { return --$_count } } sub DESTROY { my $self = shift; $self->_decrement_count(); return; } sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR <{$opt_name}; unless ($param) { $param = $default } $param =~ s/^\s*//; # allow leading spaces to be like format-skipping if ( $param !~ /^#/ ) { Die("ERROR: the $opt_name parameter '$param' must begin with '#'\n"); } my $pattern = '^\s*' . $param . '\b'; if ( bad_pattern($pattern) ) { Die( "ERROR: the $opt_name parameter '$param' causes the invalid regex '$pattern'\n" ); } return $pattern; } ## end sub make_code_skipping_pattern sub check_options { # Check Tokenizer parameters my $rOpts = shift; %is_sub = (); $is_sub{'sub'} = 1; %is_END_DATA_format_sub = ( '__END__' => 1, '__DATA__' => 1, 'format' => 1, 'sub' => 1, ); # Install any aliases to 'sub' if ( $rOpts->{'sub-alias-list'} ) { # Note that any 'sub-alias-list' has been preprocessed to # be a trimmed, space-separated list which includes 'sub' # for example, it might be 'sub method fun' my @sub_alias_list = split /\s+/, $rOpts->{'sub-alias-list'}; foreach my $word (@sub_alias_list) { $is_sub{$word} = 1; $is_END_DATA_format_sub{$word} = 1; } } #------------------------------------------------ # Update hash values for any -use-feature options #------------------------------------------------ my $use_feature_class = $rOpts->{'use-feature'} =~ /\bclass\b/; # These are the main updates for this option. 
There are additional # changes elsewhere, usually indicated with a comment 'rt145706' # Update hash values for use_feature=class, added for rt145706 # see 'perlclass.pod' # IMPORTANT: We are changing global hash values initially set in a BEGIN # block. Values must be defined (true or false) for each of these new # words whether true or false. Otherwise, programs using the module which # change options between runs (such as test code) will have # incorrect settings and fail. # There are 4 new keywords: # 'class' - treated specially as generalization of 'package' # Note: we must not set 'class' to be a keyword to avoid problems # with older uses. $is_package{'class'} = $use_feature_class; # 'method' - treated like sub using the sub-alias-list option # Note: we must not set 'method' to be a keyword to avoid problems # with older uses. # 'field' - added as a keyword, and works like 'my' $is_keyword{'field'} = $use_feature_class; $is_my_our_state{'field'} = $use_feature_class; # 'ADJUST' - added as a keyword and works like 'BEGIN' # TODO: if ADJUST gets a paren list, this will need to be updated $is_keyword{'ADJUST'} = $use_feature_class; $is_code_block_token{'ADJUST'} = $use_feature_class; %is_grep_alias = (); if ( $rOpts->{'grep-alias-list'} ) { # Note that 'grep-alias-list' has been preprocessed to be a trimmed, # space-separated list my @q = split /\s+/, $rOpts->{'grep-alias-list'}; @{is_grep_alias}{@q} = (1) x scalar(@q); } $rOpts_code_skipping = $rOpts->{'code-skipping'}; $code_skipping_pattern_begin = make_code_skipping_pattern( $rOpts, 'code-skipping-begin', '#<<V' ); $code_skipping_pattern_end = make_code_skipping_pattern( $rOpts, 'code-skipping-end', '#>>V' ); return; } ## end sub check_options sub new { my ( $class, @args ) = @_; # Note: 'tabs' and 'indent_columns' are temporary and should be # removed asap my %defaults = ( source_object => undef, debugger_object => undef, diagnostics_object => undef, logger_object => undef, starting_level => undef, indent_columns => 4, tabsize => 8, look_for_hash_bang => 0, trim_qw => 1, look_for_autoloader => 1, 
look_for_selfloader => 1, starting_line_number => 1, extended_syntax => 0, rOpts => {}, ); my %args = ( %defaults, @args ); # we are given an object with a get_line() method to supply source lines my $source_object = $args{source_object}; my $rOpts = $args{rOpts}; # we create another object with a get_line() and peek_ahead() method my $line_buffer_object = Perl::Tidy::LineBuffer->new($source_object); # Tokenizer state data is as follows: # _rhere_target_list_ reference to list of here-doc targets # _here_doc_target_ the target string for a here document # _here_quote_character_ the type of here-doc quoting (" ' ` or none) # to determine if interpolation is done # _quote_target_ character we seek if chasing a quote # _line_start_quote_ line where we started looking for a long quote # _in_here_doc_ flag indicating if we are in a here-doc # _in_pod_ flag set if we are in pod documentation # _in_skipped_ flag set if we are in a skipped section # _in_error_ flag set if we saw severe error (binary in script) # _in_data_ flag set if we are in __DATA__ section # _in_end_ flag set if we are in __END__ section # _in_format_ flag set if we are in a format description # _in_attribute_list_ flag telling if we are looking for attributes # _in_quote_ flag telling if we are chasing a quote # _starting_level_ indentation level of first line # _line_buffer_object_ object with get_line() method to supply source code # _diagnostics_object_ place to write debugging information # _unexpected_error_count_ error count used to limit output # _lower_case_labels_at_ line numbers where lower case labels seen # _hit_bug_ program bug detected my $self = []; $self->[_rhere_target_list_] = []; $self->[_in_here_doc_] = 0; $self->[_here_doc_target_] = EMPTY_STRING; $self->[_here_quote_character_] = EMPTY_STRING; $self->[_in_data_] = 0; $self->[_in_end_] = 0; $self->[_in_format_] = 0; $self->[_in_error_] = 0; $self->[_in_pod_] = 0; $self->[_in_skipped_] = 0; $self->[_in_attribute_list_] = 0; 
$self->[_in_quote_] = 0; $self->[_quote_target_] = EMPTY_STRING; $self->[_line_start_quote_] = -1; $self->[_starting_level_] = $args{starting_level}; $self->[_know_starting_level_] = defined( $args{starting_level} ); $self->[_tabsize_] = $args{tabsize}; $self->[_indent_columns_] = $args{indent_columns}; $self->[_look_for_hash_bang_] = $args{look_for_hash_bang}; $self->[_trim_qw_] = $args{trim_qw}; $self->[_continuation_indentation_] = $args{continuation_indentation}; $self->[_outdent_labels_] = $args{outdent_labels}; $self->[_last_line_number_] = $args{starting_line_number} - 1; $self->[_saw_perl_dash_P_] = 0; $self->[_saw_perl_dash_w_] = 0; $self->[_saw_use_strict_] = 0; $self->[_saw_v_string_] = 0; $self->[_hit_bug_] = 0; $self->[_look_for_autoloader_] = $args{look_for_autoloader}; $self->[_look_for_selfloader_] = $args{look_for_selfloader}; $self->[_saw_autoloader_] = 0; $self->[_saw_selfloader_] = 0; $self->[_saw_hash_bang_] = 0; $self->[_saw_end_] = 0; $self->[_saw_data_] = 0; $self->[_saw_negative_indentation_] = 0; $self->[_started_tokenizing_] = 0; $self->[_line_buffer_object_] = $line_buffer_object; $self->[_debugger_object_] = $args{debugger_object}; $self->[_diagnostics_object_] = $args{diagnostics_object}; $self->[_logger_object_] = $args{logger_object}; $self->[_unexpected_error_count_] = 0; $self->[_started_looking_for_here_target_at_] = 0; $self->[_nearly_matched_here_target_at_] = undef; $self->[_line_of_text_] = EMPTY_STRING; $self->[_rlower_case_labels_at_] = undef; $self->[_extended_syntax_] = $args{extended_syntax}; $self->[_maximum_level_] = 0; $self->[_true_brace_error_count_] = 0; $self->[_rOpts_maximum_level_errors_] = $rOpts->{'maximum-level-errors'}; $self->[_rOpts_maximum_unexpected_errors_] = $rOpts->{'maximum-unexpected-errors'}; $self->[_rOpts_logfile_] = $rOpts->{'logfile'}; $self->[_rOpts_] = $rOpts; # These vars are used for guessing indentation and must be positive $self->[_tabsize_] = 8 if ( !$self->[_tabsize_] ); 
$self->[_indent_columns_] = 4 if ( !$self->[_indent_columns_] ); bless $self, $class; $tokenizer_self = $self; prepare_for_a_new_file(); $self->find_starting_indentation_level(); # This is not a full class yet, so die if an attempt is made to # create more than one object. if ( _increment_count() > 1 ) { confess "Attempt to create more than 1 object in $class, which is not a true class yet\n"; } return $self; } ## end sub new # interface to Perl::Tidy::Logger routines sub warning { my $msg = shift; my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { $logger_object->warning($msg); } return; } ## end sub warning sub get_input_stream_name { my $input_stream_name = EMPTY_STRING; my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { $input_stream_name = $logger_object->get_input_stream_name(); } return $input_stream_name; } ## end sub get_input_stream_name sub complain { my $msg = shift; my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { my $input_line_number = $tokenizer_self->[_last_line_number_] + 1; $msg = "Line $input_line_number: $msg"; $logger_object->complain($msg); } return; } ## end sub complain sub write_logfile_entry { my $msg = shift; my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { $logger_object->write_logfile_entry($msg); } return; } ## end sub write_logfile_entry sub interrupt_logfile { my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { $logger_object->interrupt_logfile(); } return; } ## end sub interrupt_logfile sub resume_logfile { my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { $logger_object->resume_logfile(); } return; } ## end sub resume_logfile sub increment_brace_error { my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { $logger_object->increment_brace_error(); } return; } ## end sub increment_brace_error sub report_definite_bug { $tokenizer_self->[_hit_bug_] 
= 1; my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { $logger_object->report_definite_bug(); } return; } ## end sub report_definite_bug sub brace_warning { my $msg = shift; my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { $logger_object->brace_warning($msg); } return; } ## end sub brace_warning sub get_saw_brace_error { my $logger_object = $tokenizer_self->[_logger_object_]; if ($logger_object) { return $logger_object->get_saw_brace_error(); } else { return 0; } } ## end sub get_saw_brace_error sub get_unexpected_error_count { my ($self) = @_; return $self->[_unexpected_error_count_]; } # interface to Perl::Tidy::Diagnostics routines sub write_diagnostics { my $msg = shift; if ( $tokenizer_self->[_diagnostics_object_] ) { $tokenizer_self->[_diagnostics_object_]->write_diagnostics($msg); } return; } ## end sub write_diagnostics sub get_maximum_level { return $tokenizer_self->[_maximum_level_]; } sub report_tokenization_errors { my ($self) = @_; # Report any tokenization errors and return a flag '$severe_error'. # Set $severe_error = 1 if the tokenization errors are so severe that # the formatter should not attempt to format the file. Instead, it will # just output the file verbatim. # set severe error flag if tokenizer has encountered file reading problems # (i.e. unexpected binary characters) my $severe_error = $self->[_in_error_]; my $maxle = $self->[_rOpts_maximum_level_errors_]; my $maxue = $self->[_rOpts_maximum_unexpected_errors_]; $maxle = 1 unless defined($maxle); $maxue = 0 unless defined($maxue); my $level = get_indentation_level(); if ( $level != $tokenizer_self->[_starting_level_] ) { warning("final indentation level: $level\n"); my $level_diff = $tokenizer_self->[_starting_level_] - $level; if ( $level_diff < 0 ) { $level_diff = -$level_diff } # Set severe error flag if the level error is greater than 1. 
# The formatter can function for any level error but it is probably # best not to attempt formatting for a high level error. if ( $maxle >= 0 && $level_diff > $maxle ) { $severe_error = 1; warning(<[_true_brace_error_count_] > 2 ) { $severe_error = 1; } if ( $tokenizer_self->[_look_for_hash_bang_] && !$tokenizer_self->[_saw_hash_bang_] ) { warning( "hit EOF without seeing hash-bang line; maybe don't need -x?\n"); } if ( $tokenizer_self->[_in_format_] ) { warning("hit EOF while in format description\n"); } if ( $tokenizer_self->[_in_skipped_] ) { write_logfile_entry( "hit EOF while in lines skipped with --code-skipping\n"); } if ( $tokenizer_self->[_in_pod_] ) { # Just write log entry if this is after __END__ or __DATA__ # because this happens too often, and it is not likely to be # a parsing error. if ( $tokenizer_self->[_saw_data_] || $tokenizer_self->[_saw_end_] ) { write_logfile_entry( "hit eof while in pod documentation (no =cut seen)\n\tthis can cause trouble with some pod utilities\n" ); } else { complain( "hit eof while in pod documentation (no =cut seen)\n\tthis can cause trouble with some pod utilities\n" ); } } if ( $tokenizer_self->[_in_here_doc_] ) { $severe_error = 1; my $here_doc_target = $tokenizer_self->[_here_doc_target_]; my $started_looking_for_here_target_at = $tokenizer_self->[_started_looking_for_here_target_at_]; if ($here_doc_target) { warning( "hit EOF in here document starting at line $started_looking_for_here_target_at with target: $here_doc_target\n" ); } else { warning(<[_nearly_matched_here_target_at_]; if ($nearly_matched_here_target_at) { warning( "NOTE: almost matched at input line $nearly_matched_here_target_at except for whitespace\n" ); } } # Something is seriously wrong if we ended inside a quote if ( $tokenizer_self->[_in_quote_] ) { $severe_error = 1; my $line_start_quote = $tokenizer_self->[_line_start_quote_]; my $quote_target = $tokenizer_self->[_quote_target_]; my $what = ( $tokenizer_self->[_in_attribute_list_] ) ? 
"attribute list" : "quote/pattern"; warning( "hit EOF seeking end of $what starting at line $line_start_quote ending in $quote_target\n" ); } if ( $tokenizer_self->[_hit_bug_] ) { $severe_error = 1; } # Multiple "unexpected" type tokenization errors usually indicate parsing # non-perl scripts, or that something is seriously wrong, so we should # avoid formatting them. This can happen for example if we run perltidy on # a shell script or an html file. But unfortunately this check can # interfere with some extended syntaxes, such as RPerl, so it has to be off # by default. my $ue_count = $tokenizer_self->[_unexpected_error_count_]; if ( $maxue > 0 && $ue_count > $maxue ) { warning(< -maxue=$maxue; use -maxue=0 to force formatting EOM $severe_error = 1; } unless ( $tokenizer_self->[_saw_perl_dash_w_] ) { if ( $] < 5.006 ) { write_logfile_entry("Suggest including '-w parameter'\n"); } else { write_logfile_entry("Suggest including 'use warnings;'\n"); } } if ( $tokenizer_self->[_saw_perl_dash_P_] ) { write_logfile_entry("Use of -P parameter for defines is discouraged\n"); } unless ( $tokenizer_self->[_saw_use_strict_] ) { write_logfile_entry("Suggest including 'use strict;'\n"); } # it is suggested that labels have at least one upper case character # for legibility and to avoid code breakage as new keywords are introduced if ( $tokenizer_self->[_rlower_case_labels_at_] ) { my @lower_case_labels_at = @{ $tokenizer_self->[_rlower_case_labels_at_] }; write_logfile_entry( "Suggest using upper case characters in label(s)\n"); local $LIST_SEPARATOR = ')('; write_logfile_entry(" defined at line(s): (@lower_case_labels_at)\n"); } return $severe_error; } ## end sub report_tokenization_errors sub report_v_string { # warn if this version can't handle v-strings my $tok = shift; unless ( $tokenizer_self->[_saw_v_string_] ) { $tokenizer_self->[_saw_v_string_] = $tokenizer_self->[_last_line_number_]; } if ( $] < 5.006 ) { warning( "Found v-string '$tok' but v-strings are not 
implemented in your version of perl; see Camel 3 book ch 2\n" ); } return; } ## end sub report_v_string sub is_valid_token_type { my ($type) = @_; return $is_valid_token_type{$type}; } sub get_input_line_number { return $tokenizer_self->[_last_line_number_]; } sub log_numbered_msg { my ( $self, $msg ) = @_; # write input line number + message to logfile my $input_line_number = $self->[_last_line_number_]; write_logfile_entry("Line $input_line_number: $msg"); return; } ## end sub log_numbered_msg # returns the next tokenized line sub get_line { my $self = shift; # USES GLOBAL VARIABLES: # $brace_depth, $square_bracket_depth, $paren_depth my $input_line = $self->[_line_buffer_object_]->get_line(); $self->[_line_of_text_] = $input_line; return unless ($input_line); my $input_line_number = ++$self->[_last_line_number_]; # Find and remove what characters terminate this line, including any # control r my $input_line_separator = EMPTY_STRING; if ( chomp($input_line) ) { $input_line_separator = $INPUT_RECORD_SEPARATOR; } # The first test here very significantly speeds things up, but be sure to # keep the regex and hash %other_line_endings the same. if ( $other_line_endings{ substr( $input_line, -1 ) } ) { if ( $input_line =~ s/((\r|\035|\032)+)$// ) { $input_line_separator = $2 . $input_line_separator; } } # for backwards compatibility we keep the line text terminated with # a newline character $input_line .= "\n"; $self->[_line_of_text_] = $input_line; # create a data structure describing this line which will be # returned to the caller. # _line_type codes are: # SYSTEM - system-specific code before hash-bang line # CODE - line of perl code (including comments) # POD_START - line starting pod, such as '=head' # POD - pod documentation text # POD_END - last line of pod section, '=cut' # HERE - text of here-document # HERE_END - last line of here-doc (target word) # FORMAT - format section # FORMAT_END - last line of format section, '.' 
# SKIP - code skipping section # SKIP_END - last line of code skipping section, '#>>V' # DATA_START - __DATA__ line # DATA - unidentified text following __DATA__ # END_START - __END__ line # END - unidentified text following __END__ # ERROR - we are in big trouble, probably not a perl script # Other variables: # _curly_brace_depth - depth of curly braces at start of line # _square_bracket_depth - depth of square brackets at start of line # _paren_depth - depth of parens at start of line # _starting_in_quote - this line continues a multi-line quote # (so don't trim leading blanks!) # _ending_in_quote - this line ends in a multi-line quote # (so don't trim trailing blanks!) my $line_of_tokens = { _line_type => 'EOF', _line_text => $input_line, _line_number => $input_line_number, _guessed_indentation_level => 0, _curly_brace_depth => $brace_depth, _square_bracket_depth => $square_bracket_depth, _paren_depth => $paren_depth, _quote_character => EMPTY_STRING, ## Skip these needless initializations for efficiency: ## _rtoken_type => undef, ## _rtokens => undef, ## _rlevels => undef, ## _rblock_type => undef, ## _rtype_sequence => undef, ## _rci_levels => undef, ## _starting_in_quote => 0, ## _ending_in_quote => 0, }; # must print line unchanged if we are in a here document if ( $self->[_in_here_doc_] ) { $line_of_tokens->{_line_type} = 'HERE'; my $here_doc_target = $self->[_here_doc_target_]; my $here_quote_character = $self->[_here_quote_character_]; my $candidate_target = $input_line; chomp $candidate_target; # Handle <<~ targets, which are indicated here by a leading space on # the here quote character if ( $here_quote_character =~ /^\s/ ) { $candidate_target =~ s/^\s*//; } if ( $candidate_target eq $here_doc_target ) { $self->[_nearly_matched_here_target_at_] = undef; $line_of_tokens->{_line_type} = 'HERE_END'; $self->log_numbered_msg("Exiting HERE document $here_doc_target\n"); my $rhere_target_list = $self->[_rhere_target_list_]; if ( @{$rhere_target_list} ) { # 
there can be multiple here targets ( $here_doc_target, $here_quote_character ) = @{ shift @{$rhere_target_list} }; $self->[_here_doc_target_] = $here_doc_target; $self->[_here_quote_character_] = $here_quote_character; $self->log_numbered_msg( "Entering HERE document $here_doc_target\n"); $self->[_nearly_matched_here_target_at_] = undef; $self->[_started_looking_for_here_target_at_] = $input_line_number; } else { $self->[_in_here_doc_] = 0; $self->[_here_doc_target_] = EMPTY_STRING; $self->[_here_quote_character_] = EMPTY_STRING; } } # check for error of extra whitespace # note for PERL6: leading whitespace is allowed else { $candidate_target =~ s/\s*$//; $candidate_target =~ s/^\s*//; if ( $candidate_target eq $here_doc_target ) { $self->[_nearly_matched_here_target_at_] = $input_line_number; } } return $line_of_tokens; } # Print line unchanged if we are in a format section elsif ( $self->[_in_format_] ) { if ( $input_line =~ /^\.[\s#]*$/ ) { # Decrement format depth count at a '.' after a 'format' $self->[_in_format_]--; # This is the end when count reaches 0 if ( !$self->[_in_format_] ) { $self->log_numbered_msg("Exiting format section\n"); $line_of_tokens->{_line_type} = 'FORMAT_END'; } } else { $line_of_tokens->{_line_type} = 'FORMAT'; if ( $input_line =~ /^\s*format\s+\w+/ ) { # Increment format depth count at a 'format' within a 'format' # This is a simple way to handle nested formats (issue c019). $self->[_in_format_]++; } } return $line_of_tokens; } # must print line unchanged if we are in pod documentation elsif ( $self->[_in_pod_] ) { $line_of_tokens->{_line_type} = 'POD'; if ( $input_line =~ /^=cut/ ) { $line_of_tokens->{_line_type} = 'POD_END'; $self->log_numbered_msg("Exiting POD section\n"); $self->[_in_pod_] = 0; } if ( $input_line =~ /^\#\!.*perl\b/ && !$self->[_in_end_] ) { warning( "Hash-bang in pod can cause older versions of perl to fail! 
\n" ); } return $line_of_tokens; } # print line unchanged if in skipped section elsif ( $self->[_in_skipped_] ) { $line_of_tokens->{_line_type} = 'SKIP'; if ( $input_line =~ /$code_skipping_pattern_end/ ) { $line_of_tokens->{_line_type} = 'SKIP_END'; $self->log_numbered_msg("Exiting code-skipping section\n"); $self->[_in_skipped_] = 0; } return $line_of_tokens; } # must print line unchanged if we have seen a severe error (i.e., we # are seeing illegal tokens and cannot continue. Syntax errors do # not pass this route). Calling routine can decide what to do, but # the default can be to just pass all lines as if they were after __END__ elsif ( $self->[_in_error_] ) { $line_of_tokens->{_line_type} = 'ERROR'; return $line_of_tokens; } # print line unchanged if we are __DATA__ section elsif ( $self->[_in_data_] ) { # ...but look for POD # Note that the _in_data and _in_end flags remain set # so that we return to that state after seeing the # end of a pod section if ( $input_line =~ /^=(\w+)\b/ && $1 ne 'cut' ) { $line_of_tokens->{_line_type} = 'POD_START'; $self->log_numbered_msg("Entering POD section\n"); $self->[_in_pod_] = 1; return $line_of_tokens; } else { $line_of_tokens->{_line_type} = 'DATA'; return $line_of_tokens; } } # print line unchanged if we are in __END__ section elsif ( $self->[_in_end_] ) { # ...but look for POD # Note that the _in_data and _in_end flags remain set # so that we return to that state after seeing the # end of a pod section if ( $input_line =~ /^=(\w+)\b/ && $1 ne 'cut' ) { $line_of_tokens->{_line_type} = 'POD_START'; $self->log_numbered_msg("Entering POD section\n"); $self->[_in_pod_] = 1; return $line_of_tokens; } else { $line_of_tokens->{_line_type} = 'END'; return $line_of_tokens; } } # check for a hash-bang line if we haven't seen one if ( !$self->[_saw_hash_bang_] ) { if ( $input_line =~ /^\#\!.*perl\b/ ) { $self->[_saw_hash_bang_] = $input_line_number; # check for -w and -P flags if ( $input_line =~ /^\#\!.*perl\s.*-.*P/ ) { 
$self->[_saw_perl_dash_P_] = 1; } if ( $input_line =~ /^\#\!.*perl\s.*-.*w/ ) { $self->[_saw_perl_dash_w_] = 1; } if ( $input_line_number > 1 # leave any hash bang in a BEGIN block alone # i.e. see 'debugger-duck_type.t' && !( $last_nonblank_block_type && $last_nonblank_block_type eq 'BEGIN' ) && !$self->[_look_for_hash_bang_] # Try to avoid giving a false alarm at a simple comment. # These look like valid hash-bang lines: #!/usr/bin/perl -w #! /usr/bin/perl -w #!c:\perl\bin\perl.exe # These are comments: #! I love perl #! sunos does not yet provide a /usr/bin/perl # Comments typically have multiple spaces, which suggests # the filter && $input_line =~ /^\#\!(\s+)?(\S+)?perl/ ) { # this is helpful for VMS systems; we may have accidentally # tokenized some DCL commands if ( $self->[_started_tokenizing_] ) { warning( "There seems to be a hash-bang after line 1; do you need to run with -x ?\n" ); } else { complain("Useless hash-bang after line 1\n"); } } # Report the leading hash-bang as a system line # This will prevent -dac from deleting it else { $line_of_tokens->{_line_type} = 'SYSTEM'; return $line_of_tokens; } } } # wait for a hash-bang before parsing if the user invoked us with -x if ( $self->[_look_for_hash_bang_] && !$self->[_saw_hash_bang_] ) { $line_of_tokens->{_line_type} = 'SYSTEM'; return $line_of_tokens; } # a first line of the form ': #' will be marked as SYSTEM # since lines of this form may be used by tcsh if ( $input_line_number == 1 && $input_line =~ /^\s*\:\s*\#/ ) { $line_of_tokens->{_line_type} = 'SYSTEM'; return $line_of_tokens; } # now we know that it is ok to tokenize the line... 
# the line tokenizer will modify any of these private variables: # _rhere_target_list_ # _in_data_ # _in_end_ # _in_format_ # _in_error_ # _in_skipped_ # _in_pod_ # _in_quote_ $self->tokenize_this_line($line_of_tokens); # Now finish defining the return structure and return it $line_of_tokens->{_ending_in_quote} = $self->[_in_quote_]; # handle severe error (binary data in script) if ( $self->[_in_error_] ) { $self->[_in_quote_] = 0; # to avoid any more messages warning("Giving up after error\n"); $line_of_tokens->{_line_type} = 'ERROR'; reset_indentation_level(0); # avoid error messages return $line_of_tokens; } # handle start of pod documentation if ( $self->[_in_pod_] ) { # This gets tricky..above a __DATA__ or __END__ section, perl # accepts '=cut' as the start of pod section. But afterwards, # only pod utilities see it and they may ignore an =cut without # leading =head. In any case, this isn't good. if ( $input_line =~ /^=cut\b/ ) { if ( $self->[_saw_data_] || $self->[_saw_end_] ) { complain("=cut while not in pod ignored\n"); $self->[_in_pod_] = 0; $line_of_tokens->{_line_type} = 'POD_END'; } else { $line_of_tokens->{_line_type} = 'POD_START'; warning( "=cut starts a pod section .. 
this can fool pod utilities.\n" ) unless (DEVEL_MODE); $self->log_numbered_msg("Entering POD section\n"); } } else { $line_of_tokens->{_line_type} = 'POD_START'; $self->log_numbered_msg("Entering POD section\n"); } return $line_of_tokens; } # handle start of skipped section if ( $self->[_in_skipped_] ) { $line_of_tokens->{_line_type} = 'SKIP'; $self->log_numbered_msg("Entering code-skipping section\n"); return $line_of_tokens; } # see if this line contains here doc targets my $rhere_target_list = $self->[_rhere_target_list_]; if ( @{$rhere_target_list} ) { my ( $here_doc_target, $here_quote_character ) = @{ shift @{$rhere_target_list} }; $self->[_in_here_doc_] = 1; $self->[_here_doc_target_] = $here_doc_target; $self->[_here_quote_character_] = $here_quote_character; $self->log_numbered_msg("Entering HERE document $here_doc_target\n"); $self->[_started_looking_for_here_target_at_] = $input_line_number; } # NOTE: __END__ and __DATA__ statements are written unformatted # because they can theoretically contain additional characters # which are not tokenized (and cannot be read with either!). 
if ( $self->[_in_data_] ) { $line_of_tokens->{_line_type} = 'DATA_START'; $self->log_numbered_msg("Starting __DATA__ section\n"); $self->[_saw_data_] = 1; # keep parsing after __DATA__ if use SelfLoader was seen if ( $self->[_saw_selfloader_] ) { $self->[_in_data_] = 0; $self->log_numbered_msg( "SelfLoader seen, continuing; -nlsl deactivates\n"); } return $line_of_tokens; } elsif ( $self->[_in_end_] ) { $line_of_tokens->{_line_type} = 'END_START'; $self->log_numbered_msg("Starting __END__ section\n"); $self->[_saw_end_] = 1; # keep parsing after __END__ if use AutoLoader was seen if ( $self->[_saw_autoloader_] ) { $self->[_in_end_] = 0; $self->log_numbered_msg( "AutoLoader seen, continuing; -nlal deactivates\n"); } return $line_of_tokens; } # now, finally, we know that this line is type 'CODE' $line_of_tokens->{_line_type} = 'CODE'; # remember if we have seen any real code if ( !$self->[_started_tokenizing_] && $input_line !~ /^\s*$/ && $input_line !~ /^\s*#/ ) { $self->[_started_tokenizing_] = 1; } if ( $self->[_debugger_object_] ) { $self->[_debugger_object_]->write_debug_entry($line_of_tokens); } # Note: if keyword 'format' occurs in this line code, it is still CODE # (keyword 'format' need not start a line) if ( $self->[_in_format_] ) { $self->log_numbered_msg("Entering format section\n"); } if ( $self->[_in_quote_] and ( $self->[_line_start_quote_] < 0 ) ) { #if ( ( my $quote_target = get_quote_target() ) !~ /^\s*$/ ) { if ( ( my $quote_target = $self->[_quote_target_] ) !~ /^\s*$/ ) { $self->[_line_start_quote_] = $input_line_number; $self->log_numbered_msg( "Start multi-line quote or pattern ending in $quote_target\n"); } } elsif ( ( $self->[_line_start_quote_] >= 0 ) && !$self->[_in_quote_] ) { $self->[_line_start_quote_] = -1; $self->log_numbered_msg("End of multi-line quote or pattern\n"); } # we are returning a line of CODE return $line_of_tokens; } ## end sub get_line sub find_starting_indentation_level { # We need to find the indentation level of the 
first line of the # script being formatted. Often it will be zero for an entire file, # but if we are formatting a local block of code (within an editor for # example) it may not be zero. The user may specify this with the # -sil=n parameter but normally doesn't so we have to guess. # my ($self) = @_; my $starting_level = 0; # use value if given as parameter if ( $self->[_know_starting_level_] ) { $starting_level = $self->[_starting_level_]; } # if we know there is a hash_bang line, the level must be zero elsif ( $self->[_look_for_hash_bang_] ) { $self->[_know_starting_level_] = 1; } # otherwise figure it out from the input file else { my $line; my $i = 0; # keep looking at lines until we find a hash bang or piece of code my $msg = EMPTY_STRING; while ( $line = $self->[_line_buffer_object_]->peek_ahead( $i++ ) ) { # if first line is #! then assume starting level is zero if ( $i == 1 && $line =~ /^\#\!/ ) { $starting_level = 0; last; } next if ( $line =~ /^\s*#/ ); # skip past comments next if ( $line =~ /^\s*$/ ); # skip past blank lines $starting_level = guess_old_indentation_level($line); last; } $msg = "Line $i implies starting-indentation-level = $starting_level\n"; write_logfile_entry("$msg"); } $self->[_starting_level_] = $starting_level; reset_indentation_level($starting_level); return; } ## end sub find_starting_indentation_level sub guess_old_indentation_level { my ($line) = @_; # Guess the indentation level of an input line. # # For the first line of code this result will define the starting # indentation level. It will mainly be non-zero when perltidy is applied # within an editor to a local block of code. # # This is an impossible task in general because we can't know what tabs # meant for the old script and how many spaces were used for one # indentation level in the given input script. For example it may have # been previously formatted with -i=7 -et=3. 
But we can at least try to # make sure that perltidy guesses correctly if it is applied repeatedly to # a block of code within an editor, so that the block stays at the same # level when perltidy is applied repeatedly. # # USES GLOBAL VARIABLES: $tokenizer_self my $level = 0; # find leading tabs, spaces, and any statement label my $spaces = 0; if ( $line =~ /^(\t+)?(\s+)?(\w+:[^:])?/ ) { # If there are leading tabs, we use the tab scheme for this run, if # any, so that the code will remain stable when editing. if ($1) { $spaces += length($1) * $tokenizer_self->[_tabsize_] } if ($2) { $spaces += length($2) } # correct for outdented labels if ( $3 && $tokenizer_self->[_outdent_labels_] ) { $spaces += $tokenizer_self->[_continuation_indentation_]; } } # compute indentation using the value of -i for this run. # If -i=0 is used for this run (which is possible) it doesn't matter # what we do here but we'll guess that the old run used 4 spaces per level. my $indent_columns = $tokenizer_self->[_indent_columns_]; $indent_columns = 4 if ( !$indent_columns ); $level = int( $spaces / $indent_columns ); return ($level); } ## end sub guess_old_indentation_level # This is a currently unused debug routine sub dump_functions { my $fh = *STDOUT; foreach my $pkg ( keys %is_user_function ) { $fh->print("\nnon-constant subs in package $pkg\n"); foreach my $sub ( keys %{ $is_user_function{$pkg} } ) { my $msg = EMPTY_STRING; if ( $is_block_list_function{$pkg}{$sub} ) { $msg = 'block_list'; } if ( $is_block_function{$pkg}{$sub} ) { $msg = 'block'; } $fh->print("$sub $msg\n"); } } foreach my $pkg ( keys %is_constant ) { $fh->print("\nconstants and constant subs in package $pkg\n"); foreach my $sub ( keys %{ $is_constant{$pkg} } ) { $fh->print("$sub\n"); } } return; } ## end sub dump_functions sub prepare_for_a_new_file { # previous tokens needed to determine what to expect next $last_nonblank_token = ';'; # the only possible starting state which $last_nonblank_type = ';'; # will make a 
leading brace a code block $last_nonblank_block_type = EMPTY_STRING; # scalars for remembering statement types across multiple lines $statement_type = EMPTY_STRING; # '' or 'use' or 'sub..' or 'case..' $in_attribute_list = 0; # scalars for remembering where we are in the file $current_package = "main"; $context = UNKNOWN_CONTEXT; # hashes used to remember function information %is_constant = (); # user-defined constants %is_user_function = (); # user-defined functions %user_function_prototype = (); # their prototypes %is_block_function = (); %is_block_list_function = (); %saw_function_definition = (); %saw_use_module = (); # variables used to track depths of various containers # and report nesting errors $paren_depth = 0; $brace_depth = 0; $square_bracket_depth = 0; @current_depth = (0) x scalar @closing_brace_names; $total_depth = 0; @total_depth = (); @nesting_sequence_number = ( 0 .. @closing_brace_names - 1 ); @current_sequence_number = (); $next_sequence_number = 2; # The value 1 is reserved for SEQ_ROOT @paren_type = (); @paren_semicolon_count = (); @paren_structural_type = (); @brace_type = (); @brace_structural_type = (); @brace_context = (); @brace_package = (); @square_bracket_type = (); @square_bracket_structural_type = (); @depth_array = (); @nested_ternary_flag = (); @nested_statement_type = (); @starting_line_of_current_depth = (); $paren_type[$paren_depth] = EMPTY_STRING; $paren_semicolon_count[$paren_depth] = 0; $paren_structural_type[$brace_depth] = EMPTY_STRING; $brace_type[$brace_depth] = ';'; # identify opening brace as code block $brace_structural_type[$brace_depth] = EMPTY_STRING; $brace_context[$brace_depth] = UNKNOWN_CONTEXT; $brace_package[$paren_depth] = $current_package; $square_bracket_type[$square_bracket_depth] = EMPTY_STRING; $square_bracket_structural_type[$square_bracket_depth] = EMPTY_STRING; initialize_tokenizer_state(); return; } ## end sub prepare_for_a_new_file { ## closure for sub tokenize_this_line use constant BRACE => 0; use 
constant SQUARE_BRACKET => 1; use constant PAREN => 2; use constant QUESTION_COLON => 3; # TV1: scalars for processing one LINE. # Re-initialized on each entry to sub tokenize_this_line. my ( $block_type, $container_type, $expecting, $i, $i_tok, $input_line, $input_line_number, $last_nonblank_i, $max_token_index, $next_tok, $next_type, $peeked_ahead, $prototype, $rhere_target_list, $rtoken_map, $rtoken_type, $rtokens, $tok, $type, $type_sequence, $indent_flag, ); # TV2: refs to ARRAYS for processing one LINE # Re-initialized on each call. my $routput_token_list = []; # stack of output token indexes my $routput_token_type = []; # token types my $routput_block_type = []; # types of code block my $routput_container_type = []; # paren types, such as if, elsif, .. my $routput_type_sequence = []; # nesting sequential number my $routput_indent_flag = []; # # TV3: SCALARS for quote variables. These are initialized with a # subroutine call and continually updated as lines are processed. my ( $in_quote, $quote_type, $quote_character, $quote_pos, $quote_depth, $quoted_string_1, $quoted_string_2, $allowed_quote_modifiers, ); # TV4: SCALARS for multi-line identifiers and # statements. These are initialized with a subroutine call # and continually updated as lines are processed. my ( $id_scan_state, $identifier, $want_paren ); # TV5: SCALARS for tracking indentation level. # Initialized once and continually updated as lines are # processed. my ( $nesting_token_string, $nesting_type_string, $nesting_block_string, $nesting_block_flag, $nesting_list_string, $nesting_list_flag, $ci_string_in_tokenizer, $continuation_string_in_tokenizer, $in_statement_continuation, $level_in_tokenizer, $slevel_in_tokenizer, $rslevel_stack, ); # TV6: SCALARS for remembering several previous # tokens. Initialized once and continually updated as # lines are processed. 
my ( $last_nonblank_container_type, $last_nonblank_type_sequence, $last_last_nonblank_token, $last_last_nonblank_type, $last_last_nonblank_block_type, $last_last_nonblank_container_type, $last_last_nonblank_type_sequence, $last_nonblank_prototype, ); # ---------------------------------------------------------------- # beginning of tokenizer variable access and manipulation routines # ---------------------------------------------------------------- sub initialize_tokenizer_state { # TV1: initialized on each call # TV2: initialized on each call # TV3: $in_quote = 0; $quote_type = 'Q'; $quote_character = EMPTY_STRING; $quote_pos = 0; $quote_depth = 0; $quoted_string_1 = EMPTY_STRING; $quoted_string_2 = EMPTY_STRING; $allowed_quote_modifiers = EMPTY_STRING; # TV4: $id_scan_state = EMPTY_STRING; $identifier = EMPTY_STRING; $want_paren = EMPTY_STRING; # TV5: $nesting_token_string = EMPTY_STRING; $nesting_type_string = EMPTY_STRING; $nesting_block_string = '1'; # initially in a block $nesting_block_flag = 1; $nesting_list_string = '0'; # initially not in a list $nesting_list_flag = 0; # initially not in a list $ci_string_in_tokenizer = EMPTY_STRING; $continuation_string_in_tokenizer = "0"; $in_statement_continuation = 0; $level_in_tokenizer = 0; $slevel_in_tokenizer = 0; $rslevel_stack = []; # TV6: $last_nonblank_container_type = EMPTY_STRING; $last_nonblank_type_sequence = EMPTY_STRING; $last_last_nonblank_token = ';'; $last_last_nonblank_type = ';'; $last_last_nonblank_block_type = EMPTY_STRING; $last_last_nonblank_container_type = EMPTY_STRING; $last_last_nonblank_type_sequence = EMPTY_STRING; $last_nonblank_prototype = EMPTY_STRING; return; } ## end sub initialize_tokenizer_state sub save_tokenizer_state { my $rTV1 = [ $block_type, $container_type, $expecting, $i, $i_tok, $input_line, $input_line_number, $last_nonblank_i, $max_token_index, $next_tok, $next_type, $peeked_ahead, $prototype, $rhere_target_list, $rtoken_map, $rtoken_type, $rtokens, $tok, $type, 
$type_sequence, $indent_flag, ]; my $rTV2 = [ $routput_token_list, $routput_token_type, $routput_block_type, $routput_container_type, $routput_type_sequence, $routput_indent_flag, ]; my $rTV3 = [ $in_quote, $quote_type, $quote_character, $quote_pos, $quote_depth, $quoted_string_1, $quoted_string_2, $allowed_quote_modifiers, ]; my $rTV4 = [ $id_scan_state, $identifier, $want_paren ]; my $rTV5 = [ $nesting_token_string, $nesting_type_string, $nesting_block_string, $nesting_block_flag, $nesting_list_string, $nesting_list_flag, $ci_string_in_tokenizer, $continuation_string_in_tokenizer, $in_statement_continuation, $level_in_tokenizer, $slevel_in_tokenizer, $rslevel_stack, ]; my $rTV6 = [ $last_nonblank_container_type, $last_nonblank_type_sequence, $last_last_nonblank_token, $last_last_nonblank_type, $last_last_nonblank_block_type, $last_last_nonblank_container_type, $last_last_nonblank_type_sequence, $last_nonblank_prototype, ]; return [ $rTV1, $rTV2, $rTV3, $rTV4, $rTV5, $rTV6 ]; } ## end sub save_tokenizer_state sub restore_tokenizer_state { my ($rstate) = @_; my ( $rTV1, $rTV2, $rTV3, $rTV4, $rTV5, $rTV6 ) = @{$rstate}; ( $block_type, $container_type, $expecting, $i, $i_tok, $input_line, $input_line_number, $last_nonblank_i, $max_token_index, $next_tok, $next_type, $peeked_ahead, $prototype, $rhere_target_list, $rtoken_map, $rtoken_type, $rtokens, $tok, $type, $type_sequence, $indent_flag, ) = @{$rTV1}; ( $routput_token_list, $routput_token_type, $routput_block_type, $routput_container_type, $routput_type_sequence, $routput_indent_flag, ) = @{$rTV2}; ( $in_quote, $quote_type, $quote_character, $quote_pos, $quote_depth, $quoted_string_1, $quoted_string_2, $allowed_quote_modifiers, ) = @{$rTV3}; ( $id_scan_state, $identifier, $want_paren ) = @{$rTV4}; ( $nesting_token_string, $nesting_type_string, $nesting_block_string, $nesting_block_flag, $nesting_list_string, $nesting_list_flag, $ci_string_in_tokenizer, $continuation_string_in_tokenizer, $in_statement_continuation, 
            $level_in_tokenizer,
            $slevel_in_tokenizer,
            $rslevel_stack,
        ) = @{$rTV5};

        (
            $last_nonblank_container_type,
            $last_nonblank_type_sequence,
            $last_last_nonblank_token,
            $last_last_nonblank_type,
            $last_last_nonblank_block_type,
            $last_last_nonblank_container_type,
            $last_last_nonblank_type_sequence,
            $last_nonblank_prototype,
        ) = @{$rTV6};
        return;
    } ## end sub restore_tokenizer_state

    sub split_pretoken {

        my ($numc) = @_;

        # Split the leading $numc characters from the current token (at index=$i)
        # which is pre-type 'w' and insert the remainder back into the pretoken
        # stream with appropriate settings. Since we are splitting a pre-type 'w',
        # there are three cases, depending on whether the remainder starts with a
        # digit:
        # Case 1: remainder is type 'd', all digits
        # Case 2: remainder is type 'd' and type 'w': digits and other characters
        # Case 3: remainder is type 'w'

        # Examples, for $numc=1:
        #   $tok    => $tok_0 $tok_1 $tok_2
        #   'x10'   => 'x'    '10'                # case 1
        #   'x10if' => 'x'    '10'   'if'         # case 2
        #   'One'   => 'O'           'ne'         # case 3

        # where:
        #   $tok_1 is a possible string of digits (pre-type 'd')
        #   $tok_2 is a possible word (pre-type 'w')

        # return 1 if successful
        # return undef if error (shouldn't happen)

        # Calling routine should update '$type' and '$tok' if successful.

        my $pretoken = $rtokens->[$i];
        if (   $pretoken
            && length($pretoken) > $numc
            && substr( $pretoken, $numc ) =~ /^(\d*)(.*)$/ )
        {

            # Split $tok into up to 3 tokens:
            my $tok_0 = substr( $pretoken, 0, $numc );
            my $tok_1 = defined($1) ? $1 : EMPTY_STRING;
            my $tok_2 = defined($2) ?
$2 : EMPTY_STRING; my $len_0 = length($tok_0); my $len_1 = length($tok_1); my $len_2 = length($tok_2); my $pre_type_0 = 'w'; my $pre_type_1 = 'd'; my $pre_type_2 = 'w'; my $pos_0 = $rtoken_map->[$i]; my $pos_1 = $pos_0 + $len_0; my $pos_2 = $pos_1 + $len_1; my $isplice = $i + 1; # Splice in any digits if ($len_1) { splice @{$rtoken_map}, $isplice, 0, $pos_1; splice @{$rtokens}, $isplice, 0, $tok_1; splice @{$rtoken_type}, $isplice, 0, $pre_type_1; $max_token_index++; $isplice++; } # Splice in any trailing word if ($len_2) { splice @{$rtoken_map}, $isplice, 0, $pos_2; splice @{$rtokens}, $isplice, 0, $tok_2; splice @{$rtoken_type}, $isplice, 0, $pre_type_2; $max_token_index++; } $rtokens->[$i] = $tok_0; return 1; } else { # Shouldn't get here if (DEVEL_MODE) { Fault(< '{', ']' => '[', ')' => '(' ); # These block types terminate statements and do not need a trailing # semicolon # patched for SWITCH/CASE/ my %is_zero_continuation_block_type; my @q; @q = qw( } { BEGIN END CHECK INIT AUTOLOAD DESTROY UNITCHECK continue ; if elsif else unless while until for foreach switch case given when); @is_zero_continuation_block_type{@q} = (1) x scalar(@q); my %is_logical_container; @q = qw(if elsif unless while and or err not && ! 
|| for foreach); @is_logical_container{@q} = (1) x scalar(@q); my %is_binary_type; @q = qw(|| &&); @is_binary_type{@q} = (1) x scalar(@q); my %is_binary_keyword; @q = qw(and or err eq ne cmp); @is_binary_keyword{@q} = (1) x scalar(@q); # 'L' is token for opening { at hash key my %is_opening_type; @q = qw< L { ( [ >; @is_opening_type{@q} = (1) x scalar(@q); # 'R' is token for closing } at hash key my %is_closing_type; @q = qw< R } ) ] >; @is_closing_type{@q} = (1) x scalar(@q); my %is_redo_last_next_goto; @q = qw(redo last next goto); @is_redo_last_next_goto{@q} = (1) x scalar(@q); my %is_use_require; @q = qw(use require); @is_use_require{@q} = (1) x scalar(@q); # This hash holds the array index in $tokenizer_self for these keywords: # Fix for issue c035: removed 'format' from this hash my %is_END_DATA = ( '__END__' => _in_end_, '__DATA__' => _in_data_, ); my %is_list_end_type; @q = qw( ; { } ); push @q, ','; @is_list_end_type{@q} = (1) x scalar(@q); # original ref: camel 3 p 147, # but perl may accept undocumented flags # perl 5.10 adds 'p' (preserve) # Perl version 5.22 added 'n' # From http://perldoc.perl.org/perlop.html we have # /PATTERN/msixpodualngc or m?PATTERN?msixpodualngc # s/PATTERN/REPLACEMENT/msixpodualngcer # y/SEARCHLIST/REPLACEMENTLIST/cdsr # tr/SEARCHLIST/REPLACEMENTLIST/cdsr # qr/STRING/msixpodualn my %quote_modifiers = ( 's' => '[msixpodualngcer]', 'y' => '[cdsr]', 'tr' => '[cdsr]', 'm' => '[msixpodualngc]', 'qr' => '[msixpodualn]', 'q' => EMPTY_STRING, 'qq' => EMPTY_STRING, 'qw' => EMPTY_STRING, 'qx' => EMPTY_STRING, ); # table showing how many quoted things to look for after quote operator.. 
# s, y, tr have 2 (pattern and replacement) # others have 1 (pattern only) my %quote_items = ( 's' => 2, 'y' => 2, 'tr' => 2, 'm' => 1, 'qr' => 1, 'q' => 1, 'qq' => 1, 'qw' => 1, 'qx' => 1, ); my %is_for_foreach; @q = qw(for foreach); @is_for_foreach{@q} = (1) x scalar(@q); # These keywords may introduce blocks after parenthesized expressions, # in the form: # keyword ( .... ) { BLOCK } # patch for SWITCH/CASE: added 'switch' 'case' 'given' 'when' # NOTE for --use-feature=class: if ADJUST blocks eventually take a # parameter list, then ADJUST might need to be added to this list (see # perlclass.pod) my %is_blocktype_with_paren; @q = qw(if elsif unless while until for foreach switch case given when catch); @is_blocktype_with_paren{@q} = (1) x scalar(@q); my %is_case_default; @q = qw(case default); @is_case_default{@q} = (1) x scalar(@q); #------------------------ # end of tokenizer hashes #------------------------ # ------------------------------------------------------------ # beginning of various scanner interface routines # ------------------------------------------------------------ sub scan_replacement_text { # check for here-docs in replacement text invoked by # a substitution operator with executable modifier 'e'. 
# # given: # $replacement_text # return: # $rht = reference to any here-doc targets my ($replacement_text) = @_; # quick check return unless ( $replacement_text =~ /<[_logger_object_]; # localize all package variables local ( $tokenizer_self, $last_nonblank_token, $last_nonblank_type, $last_nonblank_block_type, $statement_type, $in_attribute_list, $current_package, $context, %is_constant, %is_user_function, %user_function_prototype, %is_block_function, %is_block_list_function, %saw_function_definition, $brace_depth, $paren_depth, $square_bracket_depth, @current_depth, @total_depth, $total_depth, @nesting_sequence_number, @current_sequence_number, @paren_type, @paren_semicolon_count, @paren_structural_type, @brace_type, @brace_structural_type, @brace_context, @brace_package, @square_bracket_type, @square_bracket_structural_type, @depth_array, @starting_line_of_current_depth, @nested_ternary_flag, @nested_statement_type, $next_sequence_number, ); # save all lexical variables my $rstate = save_tokenizer_state(); _decrement_count(); # avoid error check for multiple tokenizers # make a new tokenizer my $rOpts = {}; my $source_object = Perl::Tidy::LineSource->new( input_file => \$replacement_text, rOpts => $rOpts, ); my $tokenizer = Perl::Tidy::Tokenizer->new( source_object => $source_object, logger_object => $logger_object, starting_line_number => $input_line_number, ); # scan the replacement text 1 while ( $tokenizer->get_line() ); # remove any here doc targets my $rht = undef; if ( $tokenizer_self->[_in_here_doc_] ) { $rht = []; push @{$rht}, [ $tokenizer_self->[_here_doc_target_], $tokenizer_self->[_here_quote_character_] ]; if ( $tokenizer_self->[_rhere_target_list_] ) { push @{$rht}, @{ $tokenizer_self->[_rhere_target_list_] }; $tokenizer_self->[_rhere_target_list_] = undef; } $tokenizer_self->[_in_here_doc_] = undef; } # now its safe to report errors my $severe_error = $tokenizer->report_tokenization_errors(); # TODO: Could propagate a severe error up # restore 
all tokenizer lexical variables restore_tokenizer_state($rstate); # return the here doc targets return $rht; } ## end sub scan_replacement_text sub scan_bare_identifier { ( $i, $tok, $type, $prototype ) = scan_bare_identifier_do( $input_line, $i, $tok, $type, $prototype, $rtoken_map, $max_token_index ); return; } ## end sub scan_bare_identifier sub scan_identifier { ( $i, $tok, $type, $id_scan_state, $identifier, my $split_pretoken_flag ) = scan_complex_identifier( $i, $id_scan_state, $identifier, $rtokens, $max_token_index, $expecting, $paren_type[$paren_depth] ); # Check for signal to fix a special variable adjacent to a keyword, # such as '$^One$0'. if ($split_pretoken_flag) { # Try to fix it by splitting the pretoken if ( $i > 0 && $rtokens->[ $i - 1 ] eq '^' && split_pretoken(1) ) { $identifier = substr( $identifier, 0, 3 ); $tok = $identifier; } else { # This shouldn't happen ... my $var = substr( $tok, 0, 3 ); my $excess = substr( $tok, 3 ); interrupt_logfile(); warning(< 0; my %fast_scan_context; BEGIN { %fast_scan_context = ( '$' => SCALAR_CONTEXT, '*' => SCALAR_CONTEXT, '@' => LIST_CONTEXT, '%' => LIST_CONTEXT, '&' => UNKNOWN_CONTEXT, ); } ## end BEGIN sub scan_simple_identifier { # This is a wrapper for sub scan_identifier. It does a fast preliminary # scan for certain common identifiers: # '$var', '@var', %var, *var, &var, '@{...}', '%{...}' # If it does not find one of these, or this is a restart, it calls the # original scanner directly. # This gives the same results as the full scanner in about 1/4 the # total runtime for a typical input stream. 
# Notation: # $var * 2 # ^^ ^ # || | # || ---- $i_next [= next nonblank pretoken ] # |----$i_plus_1 [= a bareword ] # ---$i_begin [= a sigil] my $i_begin = $i; my $tok_begin = $tok; my $i_plus_1 = $i + 1; my $fast_scan_type; #------------------------------------------------------- # Do full scan for anything following a pointer, such as # $cref->&*; # a postderef #------------------------------------------------------- if ( $last_nonblank_token eq '->' ) { } #------------------------------ # quick scan with leading sigil #------------------------------ elsif ( !$id_scan_state && $i_plus_1 <= $max_token_index && $fast_scan_context{$tok} ) { $context = $fast_scan_context{$tok}; # look for $var, @var, ... if ( $rtoken_type->[$i_plus_1] eq 'w' ) { my $pretype_next = EMPTY_STRING; if ( $i_plus_1 < $max_token_index ) { my $i_next = $i_plus_1 + 1; if ( $rtoken_type->[$i_next] eq 'b' && $i_next < $max_token_index ) { $i_next += 1; } $pretype_next = $rtoken_type->[$i_next]; } if ( $pretype_next ne ':' && $pretype_next ne "'" ) { # Found type 'i' like '$var', '@var', or '%var' $identifier = $tok . $rtokens->[$i_plus_1]; $tok = $identifier; $type = 'i'; $i = $i_plus_1; $fast_scan_type = $type; } } # Look for @{ or %{ . # But we must let the full scanner handle things ${ because it may # keep going to get a complete identifier like '${#}' . 
elsif ( $rtoken_type->[$i_plus_1] eq '{' && ( $tok_begin eq '@' || $tok_begin eq '%' ) ) { $identifier = $tok; $type = 't'; $fast_scan_type = $type; } } #--------------------------- # Quick scan with leading -> # Look for ->[ and ->{ #--------------------------- elsif ( $tok eq '->' && $i < $max_token_index && ( $rtokens->[$i_plus_1] eq '{' || $rtokens->[$i_plus_1] eq '[' ) ) { $type = $tok; $fast_scan_type = $type; $identifier = $tok; $context = UNKNOWN_CONTEXT; } #-------------------------------------- # Verify correctness during development #-------------------------------------- if ( VERIFY_FASTSCAN && $fast_scan_type ) { # We will call the full method my $identifier_simple = $identifier; my $tok_simple = $tok; my $i_simple = $i; my $context_simple = $context; $tok = $tok_begin; $i = $i_begin; scan_identifier(); if ( $tok ne $tok_simple || $type ne $fast_scan_type || $i != $i_simple || $identifier ne $identifier_simple || $id_scan_state || $context ne $context_simple ) { print STDERR < sub { # return; # }; # from do_scan_sub: my $i_beg = $i + 1; my $pos_beg = $rtoken_map->[$i_beg]; pos($input_line) = $pos_beg; # TEST 1: look for a valid sub NAME if ( $input_line =~ m/\G\s* ((?:\w*(?:'|::))*) # package - something that ends in :: or ' (\w+) # NAME - required /gcx ) { # For possible future use.. my $subname = $2; my $package = $1 ? $1 : EMPTY_STRING; } else { return; } # TEST 2: look for invalid characters after name, such as here: # method paint => sub { # ... 
        #     }
        my $next_char = EMPTY_STRING;
        if ( $input_line =~ m/\s*(\S)/gcx ) { $next_char = $1 }
        if ( !$next_char || $next_char eq '#' ) {
            ( $next_char, my $i_next ) =
              find_next_nonblank_token( $max_token_index,
                $rtokens, $max_token_index );
        }

        if ( !$next_char ) {

            # out of characters - give up
            return;
        }

        # Possibly valid next token types:
        # '(' could start prototype or signature
        # ':' could start ATTRIBUTE
        # '{' could start BLOCK
        # ';' or '}' could end a statement
        if ( $next_char !~ /^[\(\:\{\;\}]/ ) {

            # This does not match use feature 'class' syntax
            return;
        }

        # We will stop here and assume that this is valid syntax for
        # use feature 'class'.
        return 1;
    } ## end sub method_ok_here

    sub class_ok_here {

        # Return:
        #   false if this is definitely an invalid class declaration
        #   true otherwise (even if not sure)

        # We are trying to avoid problems with old uses of 'class'
        # when --use-feature=class is set (rt145706).  We look ahead
        # to see if this use of 'class' is obviously inconsistent with
        # the syntax of use feature 'class'.  This allows the default
        # setting --use-feature=class to work for old syntax too.

        # Valid class declarations look like
        #   class NAME ?ATTRS ?VERSION ?BLOCK
        # where ATTRS VERSION and BLOCK are optional

        # For example, this should produce a return of 'false':
        #
        #   class ExtendsBasicAttributes is BasicAttributes{

        # TEST 1: class stmt can only go where a new statement can start
        if ( !new_statement_ok() ) { return }

        my $i_beg   = $i + 1;
        my $pos_beg = $rtoken_map->[$i_beg];
        pos($input_line) = $pos_beg;

        # TEST 2: look for a valid NAME
        if (
            $input_line =~ m/\G\s*
        ((?:\w*(?:'|::))*)  # package - something that ends in :: or '
        (\w+)               # NAME    - required
        /gcx
          )
        {
            # For possible future use..
            my $subname = $2;
            my $package = $1 ?
              $1 : EMPTY_STRING;
        }
        else {
            return;
        }

        # TEST 3: look for valid characters after NAME
        my $next_char = EMPTY_STRING;
        if ( $input_line =~ m/\s*(\S)/gcx ) { $next_char = $1 }
        if ( !$next_char || $next_char eq '#' ) {
            ( $next_char, my $i_next ) =
              find_next_nonblank_token( $max_token_index,
                $rtokens, $max_token_index );
        }
        if ( !$next_char ) {

            # out of characters - give up
            return;
        }

        # Must see one of: ATTRIBUTE, VERSION, BLOCK, or end stmt

        # Possibly valid next token types:
        # ':' could start ATTRIBUTE
        # '\d' could start VERSION
        # '{' could start BLOCK
        # ';' could end a statement
        # '}' could end statement but would be strange

        if ( $next_char !~ /^[\:\d\{\;\}]/ ) {

            # This does not match use feature 'class' syntax
            return;
        }

        # We will stop here and assume that this is valid syntax for
        # use feature 'class'.
        return 1;
    } ## end sub class_ok_here

    sub scan_id {
        ( $i, $tok, $type, $id_scan_state ) =
          scan_id_do( $input_line, $i, $tok, $rtokens, $rtoken_map,
            $id_scan_state, $max_token_index );
        return;
    } ## end sub scan_id

    sub scan_number {
        my $number;
        ( $i, $type, $number ) =
          scan_number_do( $input_line, $i, $rtoken_map, $type,
            $max_token_index );
        return $number;
    } ## end sub scan_number

    use constant VERIFY_FASTNUM => 0;

    sub scan_number_fast {

        # This is a wrapper for sub scan_number. It does a fast preliminary
        # scan for a simple integer.  It calls the original scan_number if it
        # does not find one.
my $i_begin = $i; my $tok_begin = $tok; my $number; #--------------------------------- # Quick check for (signed) integer #--------------------------------- # This will be the string of digits: my $i_d = $i; my $tok_d = $tok; my $typ_d = $rtoken_type->[$i_d]; # check for signed integer my $sign = EMPTY_STRING; if ( $typ_d ne 'd' && ( $typ_d eq '+' || $typ_d eq '-' ) && $i_d < $max_token_index ) { $sign = $tok_d; $i_d++; $tok_d = $rtokens->[$i_d]; $typ_d = $rtoken_type->[$i_d]; } # Handle integers if ( $typ_d eq 'd' && ( $i_d == $max_token_index || ( $i_d < $max_token_index && $rtoken_type->[ $i_d + 1 ] ne '.' && $rtoken_type->[ $i_d + 1 ] ne 'w' ) ) ) { # Let the full scanner handle multi-digit integers beginning with # '0' because there could be error messages. For example, '009' is # not a valid number. if ( $tok_d eq '0' || substr( $tok_d, 0, 1 ) ne '0' ) { $number = $sign . $tok_d; $type = 'n'; $i = $i_d; } } #-------------------------------------- # Verify correctness during development #-------------------------------------- if ( VERIFY_FASTNUM && defined($number) ) { # We will call the full method my $type_simple = $type; my $i_simple = $i; my $number_simple = $number; $tok = $tok_begin; $i = $i_begin; $number = scan_number(); if ( $type ne $type_simple || ( $i != $i_simple && $i <= $max_token_index ) || $number ne $number_simple ) { print STDERR <' error_if_expecting_TERM() if ( $expecting == TERM ); return; } ## end sub do_GREATER_THAN_SIGN sub do_VERTICAL_LINE { # '|' error_if_expecting_TERM() if ( $expecting == TERM ); return; } ## end sub do_VERTICAL_LINE sub do_DOLLAR_SIGN { # '$' # start looking for a scalar error_if_expecting_OPERATOR("Scalar") if ( $expecting == OPERATOR ); scan_simple_identifier(); if ( $identifier eq '$^W' ) { $tokenizer_self->[_saw_perl_dash_w_] = 1; } # Check for identifier in indirect object slot # (vorboard.pl, sort.t). 
Something like: # /^(print|printf|sort|exec|system)$/ if ( $is_indirect_object_taker{$last_nonblank_token} && $last_nonblank_type eq 'k' || ( ( $last_nonblank_token eq '(' ) && $is_indirect_object_taker{ $paren_type[$paren_depth] } ) || ( $last_nonblank_type eq 'w' || $last_nonblank_type eq 'U' ) # possible object ) { # An identifier followed by '->' is not indirect object; # fixes b1175, b1176 my ( $next_nonblank_type, $i_next ) = find_next_noncomment_type( $i, $rtokens, $max_token_index ); $type = 'Z' if ( $next_nonblank_type ne '->' ); } return; } ## end sub do_DOLLAR_SIGN sub do_LEFT_PARENTHESIS { # '(' ++$paren_depth; $paren_semicolon_count[$paren_depth] = 0; if ($want_paren) { $container_type = $want_paren; $want_paren = EMPTY_STRING; } elsif ( $statement_type =~ /^sub\b/ ) { $container_type = $statement_type; } else { $container_type = $last_nonblank_token; # We can check for a syntax error here of unexpected '(', # but this is going to get messy... if ( $expecting == OPERATOR # Be sure this is not a method call of the form # &method(...), $method->(..), &{method}(...), # $ref[2](list) is ok & short for $ref[2]->(list) # NOTE: at present, braces in something like &{ xxx } # are not marked as a block, we might have a method call. # Added ')' to fix case c017, something like ()()() && $last_nonblank_token !~ /^([\]\}\)\&]|\-\>)/ ) { # ref: camel 3 p 703. 
if ( $last_last_nonblank_token eq 'do' ) { complain( "do SUBROUTINE is deprecated; consider & or -> notation\n" ); } else { # if this is an empty list, (), then it is not an # error; for example, we might have a constant pi and # invoke it with pi() or just pi; my ( $next_nonblank_token, $i_next ) = find_next_nonblank_token( $i, $rtokens, $max_token_index ); # Patch for c029: give up error check if # a side comment follows if ( $next_nonblank_token ne ')' && $next_nonblank_token ne '#' ) { my $hint; error_if_expecting_OPERATOR('('); if ( $last_nonblank_type eq 'C' ) { $hint = "$last_nonblank_token has a void prototype\n"; } elsif ( $last_nonblank_type eq 'i' ) { if ( $i_tok > 0 && $last_nonblank_token =~ /^\$/ ) { $hint = "Do you mean '$last_nonblank_token->(' ?\n"; } } if ($hint) { interrupt_logfile(); warning($hint); resume_logfile(); } } ## end if ( $next_nonblank_token... } ## end else [ if ( $last_last_nonblank_token... } ## end if ( $expecting == OPERATOR... } # Do not update container type at ') ('; fix for git #105. This will # propagate the container type onward so that any subsequent brace gets # correctly marked. I have implemented this as a general rule, which # should be safe, but if necessary it could be restricted to certain # container statement types such as 'for'. $paren_type[$paren_depth] = $container_type if ( $last_nonblank_token ne ')' ); ( $type_sequence, $indent_flag ) = increase_nesting_depth( PAREN, $rtoken_map->[$i_tok] ); # propagate types down through nested parens # for example: the second paren in 'if ((' would be structural # since the first is. if ( $last_nonblank_token eq '(' ) { $type = $last_nonblank_type; } # We exclude parens as structural after a ',' because it # causes subtle problems with continuation indentation for # something like this, where the first 'or' will not get # indented. 
# # assert( # __LINE__, # ( not defined $check ) # or ref $check # or $check eq "new" # or $check eq "old", # ); # # Likewise, we exclude parens where a statement can start # because of problems with continuation indentation, like # these: # # ($firstline =~ /^#\!.*perl/) # and (print $File::Find::name, "\n") # and (return 1); # # (ref($usage_fref) =~ /CODE/) # ? &$usage_fref # : (&blast_usage, &blast_params, &blast_general_params); else { $type = '{'; } if ( $last_nonblank_type eq ')' ) { warning( "Syntax error? found token '$last_nonblank_type' then '('\n"); } $paren_structural_type[$paren_depth] = $type; return; } ## end sub do_LEFT_PARENTHESIS sub do_RIGHT_PARENTHESIS { # ')' ( $type_sequence, $indent_flag ) = decrease_nesting_depth( PAREN, $rtoken_map->[$i_tok] ); if ( $paren_structural_type[$paren_depth] eq '{' ) { $type = '}'; } $container_type = $paren_type[$paren_depth]; # restore statement type as 'sub' at closing paren of a signature # so that a subsequent ':' is identified as an attribute if ( $container_type =~ /^sub\b/ ) { $statement_type = $container_type; } # /^(for|foreach)$/ if ( $is_for_foreach{ $paren_type[$paren_depth] } ) { my $num_sc = $paren_semicolon_count[$paren_depth]; if ( $num_sc > 0 && $num_sc != 2 ) { warning("Expected 2 ';' in 'for(;;)' but saw $num_sc\n"); } } if ( $paren_depth > 0 ) { $paren_depth-- } return; } ## end sub do_RIGHT_PARENTHESIS sub do_COMMA { # ',' if ( $last_nonblank_type eq ',' ) { complain("Repeated ','s \n"); } # Note that we have to check both token and type here because a # comma following a qw list can have last token='(' but type = 'q' elsif ( $last_nonblank_token eq '(' && $last_nonblank_type eq '{' ) { warning("Unexpected leading ',' after a '('\n"); } # patch for operator_expected: note if we are in the list (use.t) if ( $statement_type eq 'use' ) { $statement_type = '_use' } return; } ## end sub do_COMMA sub do_SEMICOLON { # ';' $context = UNKNOWN_CONTEXT; $statement_type = EMPTY_STRING; $want_paren = 
EMPTY_STRING; # /^(for|foreach)$/ if ( $is_for_foreach{ $paren_type[$paren_depth] } ) { # mark ; in for loop # Be careful: we do not want a semicolon such as the # following to be included: # # for (sort {strcoll($a,$b);} keys %investments) { if ( $brace_depth == $depth_array[PAREN][BRACE][$paren_depth] && $square_bracket_depth == $depth_array[PAREN][SQUARE_BRACKET][$paren_depth] ) { $type = 'f'; $paren_semicolon_count[$paren_depth]++; } } return; } ## end sub do_SEMICOLON sub do_QUOTATION_MARK { # '"' error_if_expecting_OPERATOR("String") if ( $expecting == OPERATOR ); $in_quote = 1; $type = 'Q'; $allowed_quote_modifiers = EMPTY_STRING; return; } ## end sub do_QUOTATION_MARK sub do_APOSTROPHE { # "'" error_if_expecting_OPERATOR("String") if ( $expecting == OPERATOR ); $in_quote = 1; $type = 'Q'; $allowed_quote_modifiers = EMPTY_STRING; return; } ## end sub do_APOSTROPHE sub do_BACKTICK { # '`' error_if_expecting_OPERATOR("String") if ( $expecting == OPERATOR ); $in_quote = 1; $type = 'Q'; $allowed_quote_modifiers = EMPTY_STRING; return; } ## end sub do_BACKTICK sub do_SLASH { # '/' my $is_pattern; # a pattern cannot follow certain keywords which take optional # arguments, like 'shift' and 'pop'. See also '?'. if ( $last_nonblank_type eq 'k' && $is_keyword_rejecting_slash_as_pattern_delimiter{ $last_nonblank_token} ) { $is_pattern = 0; } elsif ( $expecting == UNKNOWN ) { # indeterminate, must guess.. 
my $msg; ( $is_pattern, $msg ) = guess_if_pattern_or_division( $i, $rtokens, $rtoken_map, $max_token_index ); if ($msg) { write_diagnostics("DIVIDE:$msg\n"); write_logfile_entry($msg); } } else { $is_pattern = ( $expecting == TERM ) } if ($is_pattern) { $in_quote = 1; $type = 'Q'; $allowed_quote_modifiers = '[msixpodualngc]'; } else { # not a pattern; check for a /= token if ( $rtokens->[ $i + 1 ] eq '=' ) { # form token /= $i++; $tok = '/='; $type = $tok; } #DEBUG - collecting info on what tokens follow a divide # for development of guessing algorithm #if ( is_possible_numerator( $i, $rtokens, $max_token_index ) < 0 ) { # #write_diagnostics( "DIVIDE? $input_line\n" ); #} } return; } ## end sub do_SLASH sub do_LEFT_CURLY_BRACKET { # '{' # if we just saw a ')', we will label this block with # its type. We need to do this to allow sub # code_block_type to determine if this brace starts a # code block or anonymous hash. (The type of a paren # pair is the preceding token, such as 'if', 'else', # etc). 
$container_type = EMPTY_STRING; # ATTRS: for a '{' following an attribute list, reset # things to look like we just saw the sub name # Added 'package' (can be 'class') for --use-feature=class (rt145706) if ( $statement_type =~ /^(sub|package)\b/ ) { $last_nonblank_token = $statement_type; $last_nonblank_type = 'i'; $statement_type = EMPTY_STRING; } # patch for SWITCH/CASE: hide these keywords from an immediately # following opening brace elsif ( ( $statement_type eq 'case' || $statement_type eq 'when' ) && $statement_type eq $last_nonblank_token ) { $last_nonblank_token = ";"; } elsif ( $last_nonblank_token eq ')' ) { $last_nonblank_token = $paren_type[ $paren_depth + 1 ]; # defensive move in case of a nesting error (pbug.t) # in which this ')' had no previous '(' # this nesting error will have been caught if ( !defined($last_nonblank_token) ) { $last_nonblank_token = 'if'; } # check for syntax error here; unless ( $is_blocktype_with_paren{$last_nonblank_token} ) { if ( $tokenizer_self->[_extended_syntax_] ) { # we append a trailing () to mark this as an unknown # block type. This allows perltidy to format some # common extensions of perl syntax. # This is used by sub code_block_type $last_nonblank_token .= '()'; } else { my $list = join( SPACE, sort keys %is_blocktype_with_paren ); warning( "syntax error at ') {', didn't see one of: <<$list>>; If this code is okay try using the -xs flag\n" ); } } } # patch for paren-less for/foreach glitch, part 2. # see note below under 'qw' elsif ($last_nonblank_token eq 'qw' && $is_for_foreach{$want_paren} ) { $last_nonblank_token = $want_paren; if ( $last_last_nonblank_token eq $want_paren ) { warning( "syntax error at '$want_paren .. {' -- missing \$ loop variable\n" ); } $want_paren = EMPTY_STRING; } # now identify which of the three possible types of # curly braces we have: hash index container, anonymous # hash reference, or code block. 
# non-structural (hash index) curly brace pair # get marked 'L' and 'R' if ( is_non_structural_brace() ) { $type = 'L'; # patch for SWITCH/CASE: # allow paren-less identifier after 'when' # if the brace is preceded by a space if ( $statement_type eq 'when' && $last_nonblank_type eq 'i' && $last_last_nonblank_type eq 'k' && ( $i_tok == 0 || $rtoken_type->[ $i_tok - 1 ] eq 'b' ) ) { $type = '{'; $block_type = $statement_type; } } # code and anonymous hash have the same type, '{', but are # distinguished by 'block_type', # which will be blank for an anonymous hash else { $block_type = code_block_type( $i_tok, $rtokens, $rtoken_type, $max_token_index ); # patch to promote bareword type to function taking block if ( $block_type && $last_nonblank_type eq 'w' && $last_nonblank_i >= 0 ) { if ( $routput_token_type->[$last_nonblank_i] eq 'w' ) { $routput_token_type->[$last_nonblank_i] = $is_grep_alias{$block_type} ? 'k' : 'G'; } } # patch for SWITCH/CASE: if we find a stray opening block brace # where we might accept a 'case' or 'when' block, then take it if ( $statement_type eq 'case' || $statement_type eq 'when' ) { if ( !$block_type || $block_type eq '}' ) { $block_type = $statement_type; } } } $brace_type[ ++$brace_depth ] = $block_type; # Patch for CLASS BLOCK definitions: do not update the package for the # current depth if this is a BLOCK type definition. 
# TODO: should make 'class' separate from 'package' and only do # this for 'class' $brace_package[$brace_depth] = $current_package if ( substr( $block_type, 0, 8 ) ne 'package ' ); $brace_structural_type[$brace_depth] = $type; $brace_context[$brace_depth] = $context; ( $type_sequence, $indent_flag ) = increase_nesting_depth( BRACE, $rtoken_map->[$i_tok] ); return; } ## end sub do_LEFT_CURLY_BRACKET sub do_RIGHT_CURLY_BRACKET { # '}' $block_type = $brace_type[$brace_depth]; if ($block_type) { $statement_type = EMPTY_STRING } if ( defined( $brace_package[$brace_depth] ) ) { $current_package = $brace_package[$brace_depth]; } # can happen on brace error (caught elsewhere) else { } ( $type_sequence, $indent_flag ) = decrease_nesting_depth( BRACE, $rtoken_map->[$i_tok] ); if ( $brace_structural_type[$brace_depth] eq 'L' ) { $type = 'R'; } # propagate type information for 'do' and 'eval' blocks, and also # for smartmatch operator. This is necessary to enable us to know # if an operator or term is expected next. if ( $is_block_operator{$block_type} ) { $tok = $block_type; } $context = $brace_context[$brace_depth]; if ( $brace_depth > 0 ) { $brace_depth--; } return; } ## end sub do_RIGHT_CURLY_BRACKET sub do_AMPERSAND { # '&' = maybe sub call? start looking # We have to check for sub call unless we are sure we # are expecting an operator. This example from s2p # got mistaken as a q operator in an early version: # print BODY &q(<<'EOT'); if ( $expecting != OPERATOR ) { # But only look for a sub call if we are expecting a term or # if there is no existing space after the &. # For example we probably don't want & as sub call here: # Fcntl::S_IRUSR & $mode; if ( $expecting == TERM || $next_type ne 'b' ) { scan_simple_identifier(); } } else { } return; } ## end sub do_AMPERSAND sub do_LESS_THAN_SIGN { # '<' - angle operator or less than? 
if ( $expecting != OPERATOR ) { ( $i, $type ) = find_angle_operator_termination( $input_line, $i, $rtoken_map, $expecting, $max_token_index ); ## This message is not very helpful and quite confusing if the above ## routine decided not to write a message with the line number. ## if ( $type eq '<' && $expecting == TERM ) { ## error_if_expecting_TERM(); ## interrupt_logfile(); ## warning("Unterminated <> operator?\n"); ## resume_logfile(); ## } } else { } return; } ## end sub do_LESS_THAN_SIGN sub do_QUESTION_MARK { # '?' = conditional or starting pattern? my $is_pattern; # Patch for rt #126965 # a pattern cannot follow certain keywords which take optional # arguments, like 'shift' and 'pop'. See also '/'. if ( $last_nonblank_type eq 'k' && $is_keyword_rejecting_question_as_pattern_delimiter{ $last_nonblank_token} ) { $is_pattern = 0; } # patch for RT#131288, user constant function without prototype # last type is 'U' followed by ?. elsif ( $last_nonblank_type =~ /^[FUY]$/ ) { $is_pattern = 0; } elsif ( $expecting == UNKNOWN ) { # In older versions of Perl, a bare ? can be a pattern # delimiter. In perl version 5.22 this was # dropped, but we have to support it in order to format # older programs. See: ## https://perl.developpez.com/documentations/en/5.22.0/perl5211delta.html # For example, the following line worked # at one time: # ?(.*)? && (print $1,"\n"); # In current versions it would have to be written with slashes: # /(.*)/ && (print $1,"\n"); my $msg; ( $is_pattern, $msg ) = guess_if_pattern_or_conditional( $i, $rtokens, $rtoken_map, $max_token_index ); if ($msg) { write_logfile_entry($msg) } } else { $is_pattern = ( $expecting == TERM ) } if ($is_pattern) { $in_quote = 1; $type = 'Q'; $allowed_quote_modifiers = '[msixpodualngc]'; } else { ( $type_sequence, $indent_flag ) = increase_nesting_depth( QUESTION_COLON, $rtoken_map->[$i_tok] ); } return; } ## end sub do_QUESTION_MARK sub do_STAR { # '*' = typeglob, or multiply? 
if ( $expecting == UNKNOWN && $last_nonblank_type eq 'Z' ) { if ( $next_type ne 'b' && $next_type ne '(' && $next_type ne '#' ) # Fix c036 { $expecting = TERM; } } if ( $expecting == TERM ) { scan_simple_identifier(); } else { if ( $rtokens->[ $i + 1 ] eq '=' ) { $tok = '*='; $type = $tok; $i++; } elsif ( $rtokens->[ $i + 1 ] eq '*' ) { $tok = '**'; $type = $tok; $i++; if ( $rtokens->[ $i + 1 ] eq '=' ) { $tok = '**='; $type = $tok; $i++; } } } return; } ## end sub do_STAR sub do_DOT { # '.' = what kind of . ? if ( $expecting != OPERATOR ) { scan_number(); if ( $type eq '.' ) { error_if_expecting_TERM() if ( $expecting == TERM ); } } else { } return; } ## end sub do_DOT sub do_COLON { # ':' = label, ternary, attribute, ? # if this is the first nonblank character, call it a label # since perl seems to just swallow it if ( $input_line_number == 1 && $last_nonblank_i == -1 ) { $type = 'J'; } # ATTRS: check for a ':' which introduces an attribute list # either after a 'sub' keyword or within a paren list # Added 'package' (can be 'class') for --use-feature=class (rt145706) elsif ( $statement_type =~ /^(sub|package)\b/ ) { $type = 'A'; $in_attribute_list = 1; } # Within a signature, unless we are in a ternary. For example, # from 't/filter_example.t': # method foo4 ( $class: $bar ) { $class->bar($bar) } elsif ( $paren_type[$paren_depth] =~ /^sub\b/ && !is_balanced_closing_container(QUESTION_COLON) ) { $type = 'A'; $in_attribute_list = 1; } # check for scalar attribute, such as # my $foo : shared = 1; elsif ($is_my_our_state{$statement_type} && $current_depth[QUESTION_COLON] == 0 ) { $type = 'A'; $in_attribute_list = 1; } # Look for Switch::Plain syntax if an error would otherwise occur # here. Note that we do not need to check if the extended syntax # flag is set because otherwise an error would occur, and we would # then have to output a message telling the user to set the # extended syntax flag to avoid the error. 
# case 1: { # default: { # default: # Note that the line 'default:' will be parsed as a label elsewhere. elsif ( $is_case_default{$statement_type} && !is_balanced_closing_container(QUESTION_COLON) ) { # mark it as a perltidy label type $type = 'J'; } # otherwise, it should be part of a ?/: operator else { ( $type_sequence, $indent_flag ) = decrease_nesting_depth( QUESTION_COLON, $rtoken_map->[$i_tok] ); if ( $last_nonblank_token eq '?' ) { warning("Syntax error near ? :\n"); } } return; } ## end sub do_COLON sub do_PLUS_SIGN { # '+' = what kind of plus? if ( $expecting == TERM ) { my $number = scan_number_fast(); # unary plus is safest assumption if not a number if ( !defined($number) ) { $type = 'p'; } } elsif ( $expecting == OPERATOR ) { } else { if ( $next_type eq 'w' ) { $type = 'p' } } return; } ## end sub do_PLUS_SIGN sub do_AT_SIGN { # '@' = sigil for array? error_if_expecting_OPERATOR("Array") if ( $expecting == OPERATOR ); scan_simple_identifier(); return; } ## end sub do_AT_SIGN sub do_PERCENT_SIGN { # '%' = hash or modulo? # first guess is hash if no following blank or paren if ( $expecting == UNKNOWN ) { if ( $next_type ne 'b' && $next_type ne '(' ) { $expecting = TERM; } } if ( $expecting == TERM ) { scan_simple_identifier(); } return; } ## end sub do_PERCENT_SIGN sub do_LEFT_SQUARE_BRACKET { # '[' $square_bracket_type[ ++$square_bracket_depth ] = $last_nonblank_token; ( $type_sequence, $indent_flag ) = increase_nesting_depth( SQUARE_BRACKET, $rtoken_map->[$i_tok] ); # It may seem odd, but structural square brackets have # type '{' and '}'. This simplifies the indentation logic. 
if ( !is_non_structural_brace() ) { $type = '{'; } $square_bracket_structural_type[$square_bracket_depth] = $type; return; } ## end sub do_LEFT_SQUARE_BRACKET sub do_RIGHT_SQUARE_BRACKET { # ']' ( $type_sequence, $indent_flag ) = decrease_nesting_depth( SQUARE_BRACKET, $rtoken_map->[$i_tok] ); if ( $square_bracket_structural_type[$square_bracket_depth] eq '{' ) { $type = '}'; } # propagate type information for smartmatch operator. This is # necessary to enable us to know if an operator or term is expected # next. if ( $square_bracket_type[$square_bracket_depth] eq '~~' ) { $tok = $square_bracket_type[$square_bracket_depth]; } if ( $square_bracket_depth > 0 ) { $square_bracket_depth--; } return; } ## end sub do_RIGHT_SQUARE_BRACKET sub do_MINUS_SIGN { # '-' = what kind of minus? if ( ( $expecting != OPERATOR ) && $is_file_test_operator{$next_tok} ) { my ( $next_nonblank_token, $i_next ) = find_next_nonblank_token( $i + 1, $rtokens, $max_token_index ); # check for a quoted word like "-w=>xx"; # it is sufficient to just check for a following '=' if ( $next_nonblank_token eq '=' ) { $type = 'm'; } else { $i++; $tok .= $next_tok; $type = 'F'; } } elsif ( $expecting == TERM ) { my $number = scan_number_fast(); # maybe part of bareword token? unary is safest if ( !defined($number) ) { $type = 'm'; } } elsif ( $expecting == OPERATOR ) { } else { if ( $next_type eq 'w' ) { $type = 'm'; } } return; } ## end sub do_MINUS_SIGN sub do_CARAT_SIGN { # '^' # check for special variables like ${^WARNING_BITS} if ( $expecting == TERM ) { if ( $last_nonblank_token eq '{' && ( $next_tok !~ /^\d/ ) && ( $next_tok =~ /^\w/ ) ) { if ( $next_tok eq 'W' ) { $tokenizer_self->[_saw_perl_dash_w_] = 1; } $tok = $tok . $next_tok; $i = $i + 1; $type = 'w'; # Optional coding to try to catch syntax errors. This can # be removed if it ever causes incorrect warning messages. 
# The '{^' should be preceded by either by a type or '$#' # Examples: # $#{^CAPTURE} ok # *${^LAST_FH}{NAME} ok # @{^HOWDY} ok # $hash{^HOWDY} error # Note that a type sigil '$' may be tokenized as 'Z' # after something like 'print', so allow type 'Z' if ( $last_last_nonblank_type ne 't' && $last_last_nonblank_type ne 'Z' && $last_last_nonblank_token ne '$#' ) { warning("Possible syntax error near '{^'\n"); } } else { unless ( error_if_expecting_TERM() ) { # Something like this is valid but strange: # undef ^I; complain("The '^' seems unusual here\n"); } } } return; } ## end sub do_CARAT_SIGN sub do_DOUBLE_COLON { # '::' = probably a sub call scan_bare_identifier(); return; } ## end sub do_DOUBLE_COLON sub do_LEFT_SHIFT { # '<<' = maybe a here-doc? ## This check removed because it could be a deprecated here-doc with ## no specified target. See example in log 16 Sep 2020. ## return ## unless ( $i < $max_token_index ) ## ; # here-doc not possible if end of line if ( $expecting != OPERATOR ) { my ( $found_target, $here_doc_target, $here_quote_character, $saw_error ); ( $found_target, $here_doc_target, $here_quote_character, $i, $saw_error ) = find_here_doc( $expecting, $i, $rtokens, $rtoken_map, $max_token_index ); if ($found_target) { push @{$rhere_target_list}, [ $here_doc_target, $here_quote_character ]; $type = 'h'; if ( length($here_doc_target) > 80 ) { my $truncated = substr( $here_doc_target, 0, 80 ); complain("Long here-target: '$truncated' ...\n"); } elsif ( !$here_doc_target ) { warning( 'Use of bare << to mean <<"" is deprecated' . 
"\n" ) unless ($here_quote_character); } elsif ( $here_doc_target !~ /^[A-Z_]\w+$/ ) { complain( "Unconventional here-target: '$here_doc_target'\n"); } } elsif ( $expecting == TERM ) { unless ($saw_error) { # shouldn't happen..arriving here implies an error in # the logic in sub 'find_here_doc' if (DEVEL_MODE) { Fault(< 80 ) { my $truncated = substr( $here_doc_target, 0, 80 ); complain("Long here-target: '$truncated' ...\n"); } elsif ( $here_doc_target !~ /^[A-Z_]\w+$/ ) { complain( "Unconventional here-target: '$here_doc_target'\n"); } # Note that we put a leading space on the here quote # character indicate that it may be preceded by spaces $here_quote_character = SPACE . $here_quote_character; push @{$rhere_target_list}, [ $here_doc_target, $here_quote_character ]; $type = 'h'; } elsif ( $expecting == TERM ) { unless ($saw_error) { # shouldn't happen..arriving here implies an error in # the logic in sub 'find_here_doc' if (DEVEL_MODE) { Fault(<' return; } sub do_PLUS_PLUS { # '++' # type = 'pp' for pre-increment, '++' for post-increment if ( $expecting == TERM ) { $type = 'pp' } elsif ( $expecting == UNKNOWN ) { my ( $next_nonblank_token, $i_next ) = find_next_nonblank_token( $i, $rtokens, $max_token_index ); # Fix for c042: look past a side comment if ( $next_nonblank_token eq '#' ) { ( $next_nonblank_token, $i_next ) = find_next_nonblank_token( $max_token_index, $rtokens, $max_token_index ); } if ( $next_nonblank_token eq '$' ) { $type = 'pp' } } return; } ## end sub do_PLUS_PLUS sub do_FAT_COMMA { # '=>' if ( $last_nonblank_type eq $tok ) { complain("Repeated '=>'s \n"); } # patch for operator_expected: note if we are in the list (use.t) # TODO: make version numbers a new token type if ( $statement_type eq 'use' ) { $statement_type = '_use' } return; } ## end sub do_FAT_COMMA sub do_MINUS_MINUS { # '--' # type = 'mm' for pre-decrement, '--' for post-decrement if ( $expecting == TERM ) { $type = 'mm' } elsif ( $expecting == UNKNOWN ) { my ( 
$next_nonblank_token, $i_next ) = find_next_nonblank_token( $i, $rtokens, $max_token_index ); # Fix for c042: look past a side comment if ( $next_nonblank_token eq '#' ) { ( $next_nonblank_token, $i_next ) = find_next_nonblank_token( $max_token_index, $rtokens, $max_token_index ); } if ( $next_nonblank_token eq '$' ) { $type = 'mm' } } return; } ## end sub do_MINUS_MINUS sub do_LOGICAL_AND { # '&&' error_if_expecting_TERM() if ( $expecting == TERM && $last_nonblank_token ne ',' ); #c015 return; } ## end sub do_LOGICAL_AND sub do_LOGICAL_OR { # '||' error_if_expecting_TERM() if ( $expecting == TERM && $last_nonblank_token ne ',' ); #c015 return; } ## end sub do_LOGICAL_OR sub do_SLASH_SLASH { # '//' error_if_expecting_TERM() if ( $expecting == TERM ); return; } ## end sub do_SLASH_SLASH sub do_DIGITS { # 'd' = string of digits error_if_expecting_OPERATOR("Number") if ( $expecting == OPERATOR ); my $number = scan_number_fast(); if ( !defined($number) ) { # shouldn't happen - we should always get a number if (DEVEL_MODE) { Fault(< $input_line, i => $i, i_beg => $i_beg, tok => $tok, type => $type, rtokens => $rtokens, rtoken_map => $rtoken_map, id_scan_state => $id_scan_state, max_token_index => $max_token_index, } ); # If successful, mark as type 'q' to be consistent # with other attributes. Type 'w' would also work. if ( $i > $i_beg ) { $type = 'q'; return 1; } # If not successful, continue and parse as a quote. 
            }

            # All other attribute lists must be parsed as quotes
            # (see 'signatures.t' for good examples)
            $in_quote                = $quote_items{'q'};
            $allowed_quote_modifiers = $quote_modifiers{'q'};
            $type                    = 'q';
            $quote_type              = 'q';
            return 1;
        }

        # handle bareword not followed by open paren
        else {
            $type = 'w';
            return 1;
        }

        # attribute not found
        return;
    } ## end sub do_ATTRIBUTE_LIST

    sub do_QUOTED_BAREWORD {

        # find type of a bareword followed by a '=>'
        if ( $is_constant{$current_package}{$tok} ) {
            $type = 'C';
        }
        elsif ( $is_user_function{$current_package}{$tok} ) {
            $type      = 'U';
            $prototype = $user_function_prototype{$current_package}{$tok};
        }
        elsif ( $tok =~ /^v\d+$/ ) {
            $type = 'v';
            report_v_string($tok);
        }
        else {

            # Bareword followed by a fat comma - see 'git18.in'
            # If tok is something like 'x17' then it could
            # actually be operator x followed by number 17.
            # For example, here:
            #     123x17 => [ 792, 1224 ],
            # (a key of 123 repeated 17 times, perhaps not
            # what was intended). We will mark x17 as type
            # 'n' and it will be split. If the previous token
            # was also a bareword then it is not very clear what is
            # going on.  In this case we will not be sure that
            # an operator is expected, so we just mark it as a
            # bareword.  Perl is a little murky in what it does
            # with stuff like this, and its behavior can change
            # over time.  Something like
            #    a x18 => [792, 1224], will compile as
            # a key with 18 a's.  But something like
            #    push @array, a x18;
            # is a syntax error.
            if (
                   $expecting == OPERATOR
                && substr( $tok, 0, 1 ) eq 'x'
                && ( length($tok) == 1
                    || substr( $tok, 1, 1 ) =~ /^\d/ )
              )
            {
                $type = 'n';
                if ( split_pretoken(1) ) {
                    $type = 'x';
                    $tok  = 'x';
                }
            }
            else {

                # git #18
                $type = 'w';
                error_if_expecting_OPERATOR();
            }
        }
        return;
    } ## end sub do_QUOTED_BAREWORD

    sub do_X_OPERATOR {

        if ( $tok eq 'x' ) {
            if ( $rtokens->[ $i + 1 ] eq '=' ) {    # x=
                $tok  = 'x=';
                $type = $tok;
                $i++;
            }
            else {
                $type = 'x';
            }
        }
        else {

            # Split a pretoken like 'x10' into 'x' and '10'.
# Note: In previous versions of perltidy it was marked # as a number, $type = 'n', and fixed downstream by the # Formatter. $type = 'n'; if ( split_pretoken(1) ) { $type = 'x'; $tok = 'x'; } } return; } ## end sub do_X_OPERATOR sub do_USE_CONSTANT { scan_bare_identifier(); my ( $next_nonblank_tok2, $i_next2 ) = find_next_nonblank_token( $i, $rtokens, $max_token_index ); if ($next_nonblank_tok2) { if ( $is_keyword{$next_nonblank_tok2} ) { # Assume qw is used as a quote and okay, as in: # use constant qw{ DEBUG 0 }; # Not worth trying to parse for just a warning # NOTE: This warning is deactivated because recent # versions of perl do not complain here, but # the coding is retained for reference. if ( 0 && $next_nonblank_tok2 ne 'qw' ) { warning( "Attempting to define constant '$next_nonblank_tok2' which is a perl keyword\n" ); } } else { $is_constant{$current_package}{$next_nonblank_tok2} = 1; } } return; } ## end sub do_USE_CONSTANT sub do_KEYWORD { # found a keyword - set any associated flags $type = 'k'; # Since for and foreach may not be followed immediately # by an opening paren, we have to remember which keyword # is associated with the next '(' if ( $is_for_foreach{$tok} ) { if ( new_statement_ok() ) { $want_paren = $tok; } } # recognize 'use' statements, which are special elsif ( $is_use_require{$tok} ) { $statement_type = $tok; error_if_expecting_OPERATOR() if ( $expecting == OPERATOR ); } # remember my and our to check for trailing ": shared" elsif ( $is_my_our_state{$tok} ) { $statement_type = $tok; } # Check for misplaced 'elsif' and 'else', but allow isolated # else or elsif blocks to be formatted. 
        # This is indicated by a last nonblank token of ';'
        elsif ( $tok eq 'elsif' ) {
            if (
                $last_nonblank_token ne ';'

                ## !~ /^(if|elsif|unless)$/
                && !$is_if_elsif_unless{$last_nonblank_block_type}
              )
            {
                warning(
                    "expecting '$tok' to follow one of 'if|elsif|unless'\n");
            }
        }
        elsif ( $tok eq 'else' ) {

            # patched for SWITCH/CASE
            if (
                $last_nonblank_token ne ';'

                ## !~ /^(if|elsif|unless|case|when)$/
                && !$is_if_elsif_unless_case_when{$last_nonblank_block_type}

                # patch to avoid an unwanted error message for
                # the case of a parenless 'case' (RT 105484):
                # switch ( 1 ) { case x { 2 } else { } }
                ## !~ /^(if|elsif|unless|case|when)$/
                && !$is_if_elsif_unless_case_when{$statement_type}
              )
            {
                warning(
"expecting '$tok' to follow one of 'if|elsif|unless|case|when'\n"
                );
            }
        }

        # patch for SWITCH/CASE if 'case' and 'when' are
        # treated as keywords. Also 'default' for Switch::Plain
        elsif ( $tok eq 'when' || $tok eq 'case' || $tok eq 'default' ) {
            $statement_type = $tok;    # next '{' is block
        }

        # feature 'err' was removed in Perl 5.10. So mark this as
        # a bareword unless an operator is expected (see c158).
        elsif ( $tok eq 'err' ) {
            if ( $expecting != OPERATOR ) { $type = 'w' }
        }

        return;
    } ## end sub do_KEYWORD

    sub do_QUOTE_OPERATOR {

        if ( $expecting == OPERATOR ) {

            # Be careful not to call an error for a qw quote
            # where a parenthesized list is allowed. For example,
            # it could also be a for/foreach construct such as
            #
            #    foreach my $key qw\Uno Due Tres Quadro\ {
            #        print "Set $key\n";
            #    }
            #
            # Or it could be a function call.
            # NOTE: Braces in something like &{ xxx } are not
            # marked as a block, we might have a method call.
# &method(...), $method->(..), &{method}(...), # $ref[2](list) is ok & short for $ref[2]->(list) # # See notes in 'sub code_block_type' and # 'sub is_non_structural_brace' unless ( $tok eq 'qw' && ( $last_nonblank_token =~ /^([\]\}\&]|\-\>)/ || $is_for_foreach{$want_paren} ) ) { error_if_expecting_OPERATOR(); } } $in_quote = $quote_items{$tok}; $allowed_quote_modifiers = $quote_modifiers{$tok}; # All quote types are 'Q' except possibly qw quotes. # qw quotes are special in that they may generally be trimmed # of leading and trailing whitespace. So they are given a # separate type, 'q', unless requested otherwise. $type = ( $tok eq 'qw' && $tokenizer_self->[_trim_qw_] ) ? 'q' : 'Q'; $quote_type = $type; return; } ## end sub do_QUOTE_OPERATOR sub do_UNKNOWN_BAREWORD { my ($next_nonblank_token) = @_; scan_bare_identifier(); if ( $statement_type eq 'use' && $last_nonblank_token eq 'use' ) { $saw_use_module{$current_package}->{$tok} = 1; } if ( $type eq 'w' ) { if ( $expecting == OPERATOR ) { # Patch to avoid error message for RPerl overloaded # operator functions: use overload # '+' => \&sse_add, # '-' => \&sse_sub, # '*' => \&sse_mul, # '/' => \&sse_div; # TODO: this could eventually be generalized if ( $saw_use_module{$current_package}->{'RPerl'} && $tok =~ /^sse_(mul|div|add|sub)$/ ) { } # Fix part 1 for git #63 in which a comment falls # between an -> and the following word. An # alternate fix would be to change operator_expected # to return an UNKNOWN for this type. elsif ( $last_nonblank_type eq '->' ) { } # don't complain about possible indirect object # notation. # For example: # package main; # sub new($) { ... } # $b = new A::; # calls A::new # $c = new A; # same thing but suspicious # This will call A::new but we have a 'new' in # main:: which looks like a constant. 
# elsif ( $last_nonblank_type eq 'C' ) { if ( $tok !~ /::$/ ) { complain(<[ $i + 1 ]; if ( $next_tok eq '(' ) { # Patch for issue c151, where we are processing a snippet and # have not seen that SPACE is a constant. In this case 'x' is # probably an operator. The only disadvantage with an incorrect # guess is that the space after it may be incorrect. For example # $str .= SPACE x ( 16 - length($str) ); See also b1410. if ( $tok eq 'x' && $last_nonblank_type eq 'w' ) { $type = 'x' } # Fix part 2 for git #63. Leave type as 'w' to keep # the type the same as if the -> were not separated elsif ( $last_nonblank_type ne '->' ) { $type = 'U' } } # underscore after file test operator is file handle if ( $tok eq '_' && $last_nonblank_type eq 'F' ) { $type = 'Z'; } # patch for SWITCH/CASE if 'case' and 'when are # not treated as keywords: if ( ( $tok eq 'case' && $brace_type[$brace_depth] eq 'switch' ) || ( $tok eq 'when' && $brace_type[$brace_depth] eq 'given' ) ) { $statement_type = $tok; # next '{' is block $type = 'k'; # for keyword syntax coloring } if ( $next_nonblank_token eq '(' ) { # patch for SWITCH/CASE if switch and given not keywords # Switch is not a perl 5 keyword, but we will gamble # and mark switch followed by paren as a keyword. This # is only necessary to get html syntax coloring nice, # and does not commit this as being a switch/case. if ( $tok eq 'switch' || $tok eq 'given' ) { $type = 'k'; # for keyword syntax coloring } # mark 'x' as operator for something like this (see b1410) # my $line = join( LD_X, map { LD_H x ( $_ + 2 ) } @$widths ); elsif ( $tok eq 'x' && $last_nonblank_type eq 'w' ) { $type = 'x'; } } } return; } ## end sub do_UNKNOWN_BAREWORD sub sub_attribute_ok_here { my ( $tok_kw, $next_nonblank_token, $i_next ) = @_; # Decide if 'sub :' can be the start of a sub attribute list. # We will decide based on if the colon is followed by a # bareword which is not a keyword. # Changed inext+1 to inext to fixed case b1190. 
        my $sub_attribute_ok_here;
        if (   $is_sub{$tok_kw}
            && $expecting != OPERATOR
            && $next_nonblank_token eq ':' )
        {
            my ( $nn_nonblank_token, $i_nn ) =
              find_next_nonblank_token( $i_next, $rtokens, $max_token_index );
            $sub_attribute_ok_here =
                 $nn_nonblank_token =~ /^\w/
              && $nn_nonblank_token !~ /^\d/
              && !$is_keyword{$nn_nonblank_token};
        }
        return $sub_attribute_ok_here;
    } ## end sub sub_attribute_ok_here

    sub do_BAREWORD {

        my ($is_END_or_DATA) = @_;

        # handle a bareword token:
        # returns
        #    true if this token ends the current line
        #    false otherwise

        my ( $next_nonblank_token, $i_next ) =
          find_next_nonblank_token( $i, $rtokens, $max_token_index );

        # a bare word immediately followed by :: is not a keyword;
        # use $tok_kw when testing for keywords to avoid a mistake
        my $tok_kw = $tok;
        if (   $rtokens->[ $i + 1 ] eq ':'
            && $rtokens->[ $i + 2 ] eq ':' )
        {
            $tok_kw .= '::';
        }

        if ($in_attribute_list) {
            my $is_attribute = do_ATTRIBUTE_LIST($next_nonblank_token);
            return if ($is_attribute);
        }

        #----------------------------------------
        # Starting final if-elsif- chain of tests
        #----------------------------------------

        # This is the return flag:
        #   true => this is the last token on the line
        #   false => keep tokenizing the line
        my $is_last;

        # The following blocks of code must update these vars:
        # $type - the final token type, must always be set

        # In addition, if additional pretokens are added:
        # $tok - the final token
        # $i - the index of the last pretoken

        # They may also need to check and set various flags

        # Scan a bare word following a -> as an identifier; it could
        # have a long package name. Fixes c037, c041.
        if ( $last_nonblank_token eq '->' ) {
            scan_bare_identifier();

            # a bareword after '->' gets type 'i'
            $type = 'i';
        }

        # Quote a word followed by => operator
        # unless the word is __END__ or __DATA__ and is the only word on
        # the line.
elsif ( !$is_END_or_DATA && $next_nonblank_token eq '=' && $rtokens->[ $i_next + 1 ] eq '>' ) { do_QUOTED_BAREWORD(); } # quote a bare word within braces..like xxx->{s}; note that we # must be sure this is not a structural brace, to avoid # mistaking {s} in the following for a quoted bare word: # for(@[){s}bla}BLA} # Also treat q in something like var{-q} as a bare word, not # a quote operator elsif ( $next_nonblank_token eq '}' && ( $last_nonblank_type eq 'L' || ( $last_nonblank_type eq 'm' && $last_last_nonblank_type eq 'L' ) ) ) { $type = 'w'; } # handle operator x (now we know it isn't $x=) elsif ( $expecting == OPERATOR && substr( $tok, 0, 1 ) eq 'x' && ( length($tok) == 1 || substr( $tok, 1, 1 ) =~ /^\d/ ) ) { do_X_OPERATOR(); } elsif ( $tok_kw eq 'CORE::' ) { $type = $tok = $tok_kw; $i += 2; } elsif ( ( $tok eq 'strict' ) and ( $last_nonblank_token eq 'use' ) ) { $tokenizer_self->[_saw_use_strict_] = 1; scan_bare_identifier(); } elsif ( ( $tok eq 'warnings' ) and ( $last_nonblank_token eq 'use' ) ) { $tokenizer_self->[_saw_perl_dash_w_] = 1; # scan as identifier, so that we pick up something like: # use warnings::register scan_bare_identifier(); } elsif ( $tok eq 'AutoLoader' && $tokenizer_self->[_look_for_autoloader_] && ( $last_nonblank_token eq 'use' # these regexes are from AutoSplit.pm, which we want # to mimic || $input_line =~ /^\s*(use|require)\s+AutoLoader\b/ || $input_line =~ /\bISA\s*=.*\bAutoLoader\b/ ) ) { write_logfile_entry("AutoLoader seen, -nlal deactivates\n"); $tokenizer_self->[_saw_autoloader_] = 1; $tokenizer_self->[_look_for_autoloader_] = 0; scan_bare_identifier(); } elsif ( $tok eq 'SelfLoader' && $tokenizer_self->[_look_for_selfloader_] && ( $last_nonblank_token eq 'use' || $input_line =~ /^\s*(use|require)\s+SelfLoader\b/ || $input_line =~ /\bISA\s*=.*\bSelfLoader\b/ ) ) { write_logfile_entry("SelfLoader seen, -nlsl deactivates\n"); $tokenizer_self->[_saw_selfloader_] = 1; $tokenizer_self->[_look_for_selfloader_] = 0; 
scan_bare_identifier(); } elsif ( ( $tok eq 'constant' ) and ( $last_nonblank_token eq 'use' ) ) { do_USE_CONSTANT(); } # various quote operators elsif ( $is_q_qq_qw_qx_qr_s_y_tr_m{$tok} ) { do_QUOTE_OPERATOR(); } # check for a statement label elsif ( ( $next_nonblank_token eq ':' ) && ( $rtokens->[ $i_next + 1 ] ne ':' ) && ( $i_next <= $max_token_index ) # colon on same line # like 'sub : lvalue' ? && !sub_attribute_ok_here( $tok_kw, $next_nonblank_token, $i_next ) && label_ok() ) { if ( $tok !~ /[A-Z]/ ) { push @{ $tokenizer_self->[_rlower_case_labels_at_] }, $input_line_number; } $type = 'J'; $tok .= ':'; $i = $i_next; } # 'sub' or other sub alias elsif ( $is_sub{$tok_kw} ) { # Update for --use-feature=class (rt145706): # We have to be extra careful to avoid misparsing other uses of # 'method' in older scripts. if ( $tok_kw eq 'method' ) { if ( $expecting == OPERATOR || $next_nonblank_token !~ /^(\w|\:)/ || !method_ok_here() ) { do_UNKNOWN_BAREWORD($next_nonblank_token); } else { initialize_subname(); scan_id(); } } else { error_if_expecting_OPERATOR() if ( $expecting == OPERATOR ); initialize_subname(); scan_id(); } } # 'package' elsif ( $is_package{$tok_kw} ) { # Update for --use-feature=class (rt145706): # We have to be extra careful because 'class' may be used for other # purposes on older code; i.e. # class($x) - valid sub call # package($x) - error if ( $tok_kw eq 'class' ) { if ( $expecting == OPERATOR || $next_nonblank_token !~ /^(\w|\:)/ || !class_ok_here() ) { do_UNKNOWN_BAREWORD($next_nonblank_token); } else { scan_id() } } else { error_if_expecting_OPERATOR() if ( $expecting == OPERATOR ); scan_id(); } } # Fix for c035: split 'format' from 'is_format_END_DATA' to be # more restrictive. Require a new statement to be ok here. 
elsif ( $tok_kw eq 'format' && new_statement_ok() ) { $type = ';'; # make tokenizer look for TERM next $tokenizer_self->[_in_format_] = 1; $is_last = 1; ## is last token on this line } # Note on token types for format, __DATA__, __END__: # It simplifies things to give these type ';', so that when we # start rescanning we will be expecting a token of type TERM. # We will switch to type 'k' before outputting the tokens. elsif ( $is_END_DATA{$tok_kw} ) { $type = ';'; # make tokenizer look for TERM next # Remember that we are in one of these three sections $tokenizer_self->[ $is_END_DATA{$tok_kw} ] = 1; $is_last = 1; ## is last token on this line } elsif ( $is_keyword{$tok_kw} ) { do_KEYWORD(); } # check for inline label following # /^(redo|last|next|goto)$/ elsif (( $last_nonblank_type eq 'k' ) && ( $is_redo_last_next_goto{$last_nonblank_token} ) ) { $type = 'j'; } # something else -- else { do_UNKNOWN_BAREWORD($next_nonblank_token); } return $is_last; } ## end sub do_BAREWORD sub do_FOLLOW_QUOTE { # Continue following a quote on a new line $type = $quote_type; unless ( @{$routput_token_list} ) { # initialize if continuation line push( @{$routput_token_list}, $i ); $routput_token_type->[$i] = $type; } # scan for the end of the quote or pattern ( $i, $in_quote, $quote_character, $quote_pos, $quote_depth, $quoted_string_1, $quoted_string_2, ) = do_quote( $i, $in_quote, $quote_character, $quote_pos, $quote_depth, $quoted_string_1, $quoted_string_2, $rtokens, $rtoken_map, $max_token_index, ); # all done if we didn't find it if ($in_quote) { return } # save pattern and replacement text for rescanning my $qs1 = $quoted_string_1; # re-initialize for next search $quote_character = EMPTY_STRING; $quote_pos = 0; $quote_type = 'Q'; $quoted_string_1 = EMPTY_STRING; $quoted_string_2 = EMPTY_STRING; if ( ++$i > $max_token_index ) { return } # look for any modifiers if ($allowed_quote_modifiers) { # check for exact quote modifiers if ( $rtokens->[$i] =~ /^[A-Za-z_]/ ) { my $str = 
$rtokens->[$i]; my $saw_modifier_e; while ( $str =~ /\G$allowed_quote_modifiers/gc ) { my $pos = pos($str); my $char = substr( $str, $pos - 1, 1 ); $saw_modifier_e ||= ( $char eq 'e' ); } # For an 'e' quote modifier we must scan the replacement # text for here-doc targets... # but if the modifier starts a new line we can skip # this because either the here doc will be fully # contained in the replacement text (so we can # ignore it) or Perl will not find it. # See test 'here2.in'. if ( $saw_modifier_e && $i_tok >= 0 ) { my $rht = scan_replacement_text($qs1); # Change type from 'Q' to 'h' for quotes with # here-doc targets so that the formatter (see sub # process_line_of_CODE) will not make any line # breaks after this point. if ($rht) { push @{$rhere_target_list}, @{$rht}; $type = 'h'; if ( $i_tok < 0 ) { my $ilast = $routput_token_list->[-1]; $routput_token_type->[$ilast] = $type; } } } if ( defined( pos($str) ) ) { # matched if ( pos($str) == length($str) ) { if ( ++$i > $max_token_index ) { return } } # Looks like a joined quote modifier # and keyword, maybe something like # s/xxx/yyy/gefor @k=... # Example is "galgen.pl". Would have to split # the word and insert a new token in the # pre-token list. This is so rare that I haven't # done it. Will just issue a warning citation. 
# This error might also be triggered if my quote # modifier characters are incomplete else { warning(<[$i]\n"; # my $num = length($str) - pos($str); # $rtokens->[$i]=substr($rtokens->[$i],pos($str),$num); # print "continuing with new token $rtokens->[$i]\n"; # skipping past this token does least damage if ( ++$i > $max_token_index ) { return } } } else { # example file: rokicki4.pl # This error might also be triggered if my quote # modifier characters are incomplete write_logfile_entry( "Note: found word $str at quote modifier location\n"); } } # re-initialize $allowed_quote_modifiers = EMPTY_STRING; } return; } ## end sub do_FOLLOW_QUOTE # ------------------------------------------------------------ # begin hash of code for handling most token types # ------------------------------------------------------------ my $tokenization_code = { '>' => \&do_GREATER_THAN_SIGN, '|' => \&do_VERTICAL_LINE, '$' => \&do_DOLLAR_SIGN, '(' => \&do_LEFT_PARENTHESIS, ')' => \&do_RIGHT_PARENTHESIS, ',' => \&do_COMMA, ';' => \&do_SEMICOLON, '"' => \&do_QUOTATION_MARK, "'" => \&do_APOSTROPHE, '`' => \&do_BACKTICK, '/' => \&do_SLASH, '{' => \&do_LEFT_CURLY_BRACKET, '}' => \&do_RIGHT_CURLY_BRACKET, '&' => \&do_AMPERSAND, '<' => \&do_LESS_THAN_SIGN, '?' => \&do_QUESTION_MARK, '*' => \&do_STAR, '.' => \&do_DOT, ':' => \&do_COLON, '+' => \&do_PLUS_SIGN, '@' => \&do_AT_SIGN, '%' => \&do_PERCENT_SIGN, '[' => \&do_LEFT_SQUARE_BRACKET, ']' => \&do_RIGHT_SQUARE_BRACKET, '-' => \&do_MINUS_SIGN, '^' => \&do_CARAT_SIGN, '::' => \&do_DOUBLE_COLON, '<<' => \&do_LEFT_SHIFT, '<<~' => \&do_NEW_HERE_DOC, '->' => \&do_POINTER, '++' => \&do_PLUS_PLUS, '=>' => \&do_FAT_COMMA, '--' => \&do_MINUS_MINUS, '&&' => \&do_LOGICAL_AND, '||' => \&do_LOGICAL_OR, '//' => \&do_SLASH_SLASH, # No special code for these types yet, but syntax checks # could be added. ## '!' => undef, ## '!=' => undef, ## '!~' => undef, ## '%=' => undef, ## '&&=' => undef, ## '&=' => undef, ## '+=' => undef, ## '-=' => undef, ## '..' 
=> undef,
        ## '..' => undef,
        ## '...' => undef,
        ## '.=' => undef,
        ## '<<=' => undef,
        ## '<=' => undef,
        ## '<=>' => undef,
        ## '<>' => undef,
        ## '=' => undef,
        ## '==' => undef,
        ## '=~' => undef,
        ## '>=' => undef,
        ## '>>' => undef,
        ## '>>=' => undef,
        ## '\\' => undef,
        ## '^=' => undef,
        ## '|=' => undef,
        ## '||=' => undef,
        ## '//=' => undef,
        ## '~' => undef,
        ## '~~' => undef,
        ## '!~~' => undef,
    };

    # ------------------------------------------------------------
    # end hash of code for handling individual token types
    # ------------------------------------------------------------

    use constant DEBUG_TOKENIZE => 0;

    sub tokenize_this_line {

  # This routine breaks a line of perl code into tokens which are of use in
  # indentation and reformatting. One of my goals has been to define tokens
  # such that a newline may be inserted between any pair of tokens without
  # changing or invalidating the program. This version comes close to this,
  # although there are necessarily a few exceptions which must be caught by
  # the formatter. Many of these involve the treatment of bare words.
  #
  # The tokens and their types are returned in arrays. See previous
  # routine for their names.
  #
  # See also the array "valid_token_types" in the BEGIN section for an
  # up-to-date list.
  #
  # To simplify things, token types are either a single character, or they
  # are identical to the tokens themselves.
  #
  # As a debugging aid, the -D flag creates a file containing a side-by-side
  # comparison of the input string and its tokenization for each line of a file.
  # This is an invaluable debugging aid.
  #
  # In addition to tokens, and some associated quantities, the tokenizer
  # also returns flags indicating any special line types. These include
  # quotes, here_docs, formats.
  #
  # -----------------------------------------------------------------------
  #
  # How to add NEW_TOKENS:
  #
  # New token types will undoubtedly be needed in the future both to keep up
  # with changes in perl and to help adapt the tokenizer to other applications.
  #
  # Here are some notes on the minimal steps. I wrote these notes while
  # adding the 'v' token type for v-strings, which are things like version
  # numbers 5.6.0, and IP addresses, and will use that as an example. (You
  # can use your editor to search for the string "NEW_TOKENS" to find the
  # appropriate sections to change):
  #
  # *. Try to talk somebody else into doing it! If not, ..
  #
  # *. Make a backup of your current version in case things don't work out!
  #
  # *. Think of a new, unused character for the token type, and add it to
  # the array @valid_token_types in the BEGIN section of this package.
  # For example, I used 'v' for v-strings.
  #
  # *. Implement coding to recognize the $type of the token in this routine.
  # This is the hardest part, and is best done by imitating or modifying
  # some of the existing coding. For example, to recognize v-strings, I
  # patched 'sub scan_bare_identifier' to recognize v-strings beginning with
  # 'v' and 'sub scan_number' to recognize v-strings without the leading 'v'.
  #
  # *. Update sub operator_expected. This update is critically important but
  # the coding is trivial. Look at the comments in that routine for help.
  # For v-strings, which should behave like numbers, I just added 'v' to the
  # regex used to handle numbers and strings (types 'n' and 'Q').
  #
  # *. Implement a 'bond strength' rule in sub set_bond_strengths in
  # Perl::Tidy::Formatter for breaking lines around this token type. You can
  # skip this step and take the default at first, then adjust later to get
  # desired results. For adding type 'v', I looked at sub bond_strength and
  # saw that number type 'n' was using default strengths, so I didn't do
  # anything.
  # I may tune it up someday if I don't like the way line
  # breaks with v-strings look.
  #
  # *. Implement a 'whitespace' rule in sub set_whitespace_flags in
  # Perl::Tidy::Formatter. For adding type 'v', I looked at this routine
  # and saw that type 'n' used spaces on both sides, so I just added 'v'
  # to the array @spaces_both_sides.
  #
  # *. Update HtmlWriter package so that users can colorize the token as
  # desired. This is quite easy; see comments identified by 'NEW_TOKENS' in
  # that package. For v-strings, I initially chose to use a default color
  # equal to the default for numbers, but it might be nice to change that
  # eventually.
  #
  # *. Update comments in Perl::Tidy::Tokenizer::dump_token_types.
  #
  # *. Run lots and lots of debug tests. Start with special files designed
  # to test the new token type. Run with the -D flag to create a .DEBUG
  # file which shows the tokenization. When these work ok, test as many old
  # scripts as possible. Start with all of the '.t' files in the 'test'
  # directory of the distribution file. Compare .tdy output with previous
  # version and updated version to see the differences. Then include as
  # many more files as possible. My own technique has been to collect a huge
  # number of perl scripts (thousands!) into one directory and run perltidy
  # *, then run diff between the output of the previous version and the
  # current version.
  #
  # *. For another example, search for the smartmatch operator '~~'
  # with your editor to see where updates were made for it.
# # ----------------------------------------------------------------------- my ( $self, $line_of_tokens ) = @_; my ($untrimmed_input_line) = $line_of_tokens->{_line_text}; # Extract line number for use in error messages $input_line_number = $line_of_tokens->{_line_number}; # Check for pod documentation if ( substr( $untrimmed_input_line, 0, 1 ) eq '=' && $untrimmed_input_line =~ /^=[A-Za-z_]/ ) { # Must not be in multi-line quote # and must not be in an equation if ( !$in_quote && ( operator_expected( [ 'b', '=', 'b' ] ) == TERM ) ) { $self->[_in_pod_] = 1; return; } } $input_line = $untrimmed_input_line; chomp $input_line; # Set a flag to indicate if we might be at an __END__ or __DATA__ line # This will be used below to avoid quoting a bare word followed by # a fat comma. my $is_END_or_DATA; # Reinitialize the multi-line quote flag if ( $in_quote && $quote_type eq 'Q' ) { $line_of_tokens->{_starting_in_quote} = 1; } else { $line_of_tokens->{_starting_in_quote} = 0; # Trim start of this line unless we are continuing a quoted line. # Do not trim end because we might end in a quote (test: deken4.pl) # Perl::Tidy::Formatter will delete needless trailing blanks $input_line =~ s/^(\s+)//; # Calculate a guessed level for nonblank lines to avoid calls to # sub guess_old_indentation_level() if ( length($input_line) && $1 ) { my $leading_spaces = $1; my $spaces = length($leading_spaces); # handle leading tabs if ( ord( substr( $leading_spaces, 0, 1 ) ) == ORD_TAB && $leading_spaces =~ /^(\t+)/ ) { my $tabsize = $self->[_tabsize_]; $spaces += length($1) * ( $tabsize - 1 ); } my $indent_columns = $self->[_indent_columns_]; $line_of_tokens->{_guessed_indentation_level} = int( $spaces / $indent_columns ); } $is_END_or_DATA = substr( $input_line, 0, 1 ) eq '_' && $input_line =~ /^__(END|DATA)__\s*$/; } # Optimize for a full-line comment. 
if ( !$in_quote ) { if ( substr( $input_line, 0, 1 ) eq '#' ) { # and check for skipped section if ( $rOpts_code_skipping && $input_line =~ /$code_skipping_pattern_begin/ ) { $self->[_in_skipped_] = 1; return; } # Optional fast processing of a block comment my $ci_string_sum = ( my $str = $ci_string_in_tokenizer ) =~ tr/1/0/; my $ci_string_i = $ci_string_sum + $in_statement_continuation; $line_of_tokens->{_line_type} = 'CODE'; $line_of_tokens->{_rtokens} = [$input_line]; $line_of_tokens->{_rtoken_type} = ['#']; $line_of_tokens->{_rlevels} = [$level_in_tokenizer]; $line_of_tokens->{_rci_levels} = [$ci_string_i]; $line_of_tokens->{_rblock_type} = [EMPTY_STRING]; $line_of_tokens->{_nesting_tokens_0} = $nesting_token_string; $line_of_tokens->{_nesting_blocks_0} = $nesting_block_string; return; } # Optimize handling of a blank line if ( !length($input_line) ) { $line_of_tokens->{_line_type} = 'CODE'; $line_of_tokens->{_rtokens} = []; $line_of_tokens->{_rtoken_type} = []; $line_of_tokens->{_rlevels} = []; $line_of_tokens->{_rci_levels} = []; $line_of_tokens->{_rblock_type} = []; $line_of_tokens->{_nesting_tokens_0} = $nesting_token_string; $line_of_tokens->{_nesting_blocks_0} = $nesting_block_string; return; } } # update the copy of the line for use in error messages # This must be exactly what we give the pre_tokenizer $self->[_line_of_text_] = $input_line; # re-initialize for the main loop $routput_token_list = []; # stack of output token indexes $routput_token_type = []; # token types $routput_block_type = []; # types of code block $routput_container_type = []; # paren types, such as if, elsif, .. 
$routput_type_sequence = []; # nesting sequential number $rhere_target_list = []; $tok = $last_nonblank_token; $type = $last_nonblank_type; $prototype = $last_nonblank_prototype; $last_nonblank_i = -1; $block_type = $last_nonblank_block_type; $container_type = $last_nonblank_container_type; $type_sequence = $last_nonblank_type_sequence; $indent_flag = 0; $peeked_ahead = 0; $self->tokenizer_main_loop($is_END_or_DATA); #----------------------------------------------- # all done tokenizing this line ... # now prepare the final list of tokens and types #----------------------------------------------- $self->tokenizer_wrapup_line($line_of_tokens); return; } ## end sub tokenize_this_line sub tokenizer_main_loop { my ( $self, $is_END_or_DATA ) = @_; #--------------------------------- # Break one input line into tokens #--------------------------------- # Input parameter: # $is_END_or_DATA is true for a __END__ or __DATA__ line # start by breaking the line into pre-tokens my $max_tokens_wanted = 0; # this signals pre_tokenize to get all tokens ( $rtokens, $rtoken_map, $rtoken_type ) = pre_tokenize( $input_line, $max_tokens_wanted ); $max_token_index = scalar( @{$rtokens} ) - 1; push( @{$rtokens}, SPACE, SPACE, SPACE ) ; # extra whitespace simplifies logic push( @{$rtoken_map}, 0, 0, 0 ); # shouldn't be referenced push( @{$rtoken_type}, 'b', 'b', 'b' ); # initialize for main loop if (0) { #<<< this is not necessary foreach my $ii ( 0 .. 
              $max_token_index + 3 ) {
                $routput_token_type->[$ii] = EMPTY_STRING;
                $routput_block_type->[$ii] = EMPTY_STRING;
                $routput_container_type->[$ii] = EMPTY_STRING;
                $routput_type_sequence->[$ii] = EMPTY_STRING;
                $routput_indent_flag->[$ii] = 0;
            }
        }

        $i = -1;
        $i_tok = -1;

        #-----------------------------
        # begin main tokenization loop
        #-----------------------------

        # we are looking at each pre-token of one line and combining them
        # into tokens
        while ( ++$i <= $max_token_index ) {

            # continue looking for the end of a quote
            if ($in_quote) {
                do_FOLLOW_QUOTE();
                last if ( $in_quote || $i > $max_token_index );
            }

            if ( $type ne 'b' && $tok ne 'CORE::' ) {

                # try to catch some common errors
                if ( ( $type eq 'n' ) && ( $tok ne '0' ) ) {

                    if ( $last_nonblank_token eq 'eq' ) {
                        complain("Should 'eq' be '==' here ?\n");
                    }
                    elsif ( $last_nonblank_token eq 'ne' ) {
                        complain("Should 'ne' be '!=' here ?\n");
                    }
                }

                # fix c090, only rotate vars if a new token will be stored
                if ( $i_tok >= 0 ) {
                    $last_last_nonblank_token = $last_nonblank_token;
                    $last_last_nonblank_type = $last_nonblank_type;
                    $last_last_nonblank_block_type = $last_nonblank_block_type;
                    $last_last_nonblank_container_type =
                      $last_nonblank_container_type;
                    $last_last_nonblank_type_sequence =
                      $last_nonblank_type_sequence;

                    # Fix part #3 for git82: propagate type 'Z' through L-R pair
                    unless ( $type eq 'R' && $last_nonblank_type eq 'Z' ) {
                        $last_nonblank_token = $tok;
                        $last_nonblank_type = $type;
                    }
                    $last_nonblank_prototype = $prototype;
                    $last_nonblank_block_type = $block_type;
                    $last_nonblank_container_type = $container_type;
                    $last_nonblank_type_sequence = $type_sequence;
                    $last_nonblank_i = $i_tok;
                }

                # Patch for c030: Fix things in case a '->' got separated from
                # the subsequent identifier by a side comment. We need the
                # last_nonblank_token to have a leading -> to avoid triggering
                # an operator expected error message at the next '('. See also
                # fix for git #63.
if ( $last_last_nonblank_token eq '->' ) { if ( $last_nonblank_type eq 'w' || $last_nonblank_type eq 'i' ) { $last_nonblank_token = '->' . $last_nonblank_token; $last_nonblank_type = 'i'; } } } # store previous token type if ( $i_tok >= 0 ) { $routput_token_type->[$i_tok] = $type; $routput_block_type->[$i_tok] = $block_type; $routput_container_type->[$i_tok] = $container_type; $routput_type_sequence->[$i_tok] = $type_sequence; $routput_indent_flag->[$i_tok] = $indent_flag; } # get the next pre-token and type # $tok and $type will be modified to make the output token my $pre_tok = $tok = $rtokens->[$i]; # get the next pre-token my $pre_type = $type = $rtoken_type->[$i]; # and type # remember the starting index of this token; we will be updating $i $i_tok = $i; # re-initialize various flags for the next output token $block_type &&= EMPTY_STRING; $container_type &&= EMPTY_STRING; $type_sequence &&= EMPTY_STRING; $indent_flag &&= 0; $prototype &&= EMPTY_STRING; # this pre-token will start an output token push( @{$routput_token_list}, $i_tok ); #-------------------------- # handle a whitespace token #-------------------------- next if ( $pre_type eq 'b' ); #----------------- # handle a comment #----------------- last if ( $pre_type eq '#' ); # continue gathering identifier if necessary if ($id_scan_state) { if ( $is_sub{$id_scan_state} || $is_package{$id_scan_state} ) { scan_id(); } else { scan_identifier(); } if ($id_scan_state) { # Still scanning ... # Check for side comment between sub and prototype (c061) # done if nothing left to scan on this line last if ( $i > $max_token_index ); my ( $next_nonblank_token, $i_next ) = find_next_nonblank_token_on_this_line( $i, $rtokens, $max_token_index ); # done if it was just some trailing space last if ( $i_next > $max_token_index ); # something remains on the line ... 
must be a side comment next; } next if ( ( $i > 0 ) || $type ); # didn't find any token; start over $type = $pre_type; $tok = $pre_tok; } ## my $prev_tok = $i > 0 ? $rtokens->[ $i - 1 ] : SPACE; my $prev_type = $i > 0 ? $rtoken_type->[ $i - 1 ] : 'b'; #----------------------------------------------------------- # Combine pre-tokens into digraphs and trigraphs if possible #----------------------------------------------------------- # See if we can make a digraph... # The following tokens are excluded and handled specially: # '/=' is excluded because the / might start a pattern. # 'x=' is excluded since it might be $x=, with $ on previous line # '**' and *= might be typeglobs of punctuation variables # I have allowed tokens starting with <, such as <=, # because I don't think these could be valid angle operators. # test file: storrs4.pl if ( $can_start_digraph{$tok} && $i < $max_token_index && $is_digraph{ $tok . $rtokens->[ $i + 1 ] } ) { my $combine_ok = 1; my $test_tok = $tok . $rtokens->[ $i + 1 ]; # check for special cases which cannot be combined # '//' must be defined_or operator if an operator is expected. # TODO: Code for other ambiguous digraphs (/=, x=, **, *=) # could be migrated here for clarity # Patch for RT#102371, misparsing a // in the following snippet: # state $b //= ccc(); # The solution is to always accept the digraph (or trigraph) # after type 'Z' (possible file handle). The reason is that # sub operator_expected gives TERM expected here, which is # wrong in this case. if ( $test_tok eq '//' && $last_nonblank_type ne 'Z' ) { # note that here $tok = '/' and the next tok and type is '/' $expecting = operator_expected( [ $prev_type, $tok, '/' ] ); # Patched for RT#101547, was 'unless ($expecting==OPERATOR)' $combine_ok = 0 if ( $expecting == TERM ); } # Patch for RT #114359: Missparsing of "print $x ** 0.5; # Accept the digraphs '**' only after type 'Z' # Otherwise postpone the decision. 
                if ( $test_tok eq '**' ) {
                    if ( $last_nonblank_type ne 'Z' ) { $combine_ok = 0 }
                }

                if (

                    # still ok to combine?
                    $combine_ok

                    && ( $test_tok ne '/=' )    # might be pattern
                    && ( $test_tok ne 'x=' )    # might be $x
                    && ( $test_tok ne '*=' )    # typeglob?

                    # Moved above as part of fix for
                    # RT #114359: Misparsing of "print $x ** 0.5;
                    # && ( $test_tok ne '**' )    # typeglob?
                  )
                {
                    $tok = $test_tok;
                    $i++;

                    # Now try to assemble trigraphs.  Note that all possible
                    # perl trigraphs can be constructed by appending a character
                    # to a digraph.
                    $test_tok = $tok . $rtokens->[ $i + 1 ];

                    if ( $is_trigraph{$test_tok} ) {
                        $tok = $test_tok;
                        $i++;
                    }

                    # The only current tetragraph is the double diamond operator
                    # and its first three characters are not a trigraph, so
                    # we can do a special test for it
                    elsif ( $test_tok eq '<<>' ) {
                        $test_tok .= $rtokens->[ $i + 2 ];
                        if ( $is_tetragraph{$test_tok} ) {
                            $tok = $test_tok;
                            $i += 2;
                        }
                    }
                }
            }

            $type      = $tok;
            $next_tok  = $rtokens->[ $i + 1 ];
            $next_type = $rtoken_type->[ $i + 1 ];

            DEBUG_TOKENIZE && do {
                local $LIST_SEPARATOR = ')(';
                my @debug_list = (
                    $last_nonblank_token,      $tok,
                    $next_tok,                 $brace_depth,
                    $brace_type[$brace_depth], $paren_depth,
                    $paren_type[$paren_depth],
                );
                print STDOUT "TOKENIZE:(@debug_list)\n";
            };

            # Turn off attribute list on first non-blank, non-bareword.
            # Added '#' to fix c038 (later moved above).
            if ( $in_attribute_list && $pre_type ne 'w' ) {
                $in_attribute_list = 0;
            }

            #--------------------------------------------------------
            # We have the next token, $tok.
# Now we have to examine this token and decide what it is # and define its $type # # section 1: bare words #-------------------------------------------------------- if ( $pre_type eq 'w' ) { $expecting = operator_expected( [ $prev_type, $tok, $next_type ] ); my $is_last = do_BAREWORD($is_END_or_DATA); last if ($is_last); } #----------------------------- # section 2: strings of digits #----------------------------- elsif ( $pre_type eq 'd' ) { $expecting = operator_expected( [ $prev_type, $tok, $next_type ] ); do_DIGITS(); } #---------------------------- # section 3: all other tokens #---------------------------- else { my $code = $tokenization_code->{$tok}; if ($code) { $expecting = operator_expected( [ $prev_type, $tok, $next_type ] ); $code->(); redo if $in_quote; } } } # ----------------------------- # end of main tokenization loop # ----------------------------- # Store the final token if ( $i_tok >= 0 ) { $routput_token_type->[$i_tok] = $type; $routput_block_type->[$i_tok] = $block_type; $routput_container_type->[$i_tok] = $container_type; $routput_type_sequence->[$i_tok] = $type_sequence; $routput_indent_flag->[$i_tok] = $indent_flag; } # Remember last nonblank values if ( $type ne 'b' && $type ne '#' ) { $last_last_nonblank_token = $last_nonblank_token; $last_last_nonblank_type = $last_nonblank_type; $last_last_nonblank_block_type = $last_nonblank_block_type; $last_last_nonblank_container_type = $last_nonblank_container_type; $last_last_nonblank_type_sequence = $last_nonblank_type_sequence; $last_nonblank_token = $tok; $last_nonblank_type = $type; $last_nonblank_block_type = $block_type; $last_nonblank_container_type = $container_type; $last_nonblank_type_sequence = $type_sequence; $last_nonblank_prototype = $prototype; } # reset indentation level if necessary at a sub or package # in an attempt to recover from a nesting error if ( $level_in_tokenizer < 0 ) { if ( $input_line =~ /^\s*(sub|package)\s+(\w+)/ ) { reset_indentation_level(0); 
brace_warning("resetting level to 0 at $1 $2\n"); } } $self->[_in_attribute_list_] = $in_attribute_list; $self->[_in_quote_] = $in_quote; $self->[_quote_target_] = $in_quote ? matching_end_token($quote_character) : EMPTY_STRING; $self->[_rhere_target_list_] = $rhere_target_list; return; } ## end sub tokenizer_main_loop sub tokenizer_wrapup_line { my ( $self, $line_of_tokens ) = @_; #--------------------------------------------------------- # Package a line of tokens for shipping back to the caller #--------------------------------------------------------- # Most of the remaining work involves defining the two indentation # parameters that the formatter needs for each token: # - $level = structural indentation level and # - $ci_level = continuation indentation level # The method for setting the indentation level is straightforward. # But the method used to define the continuation indentation is # complicated because it has evolved over a long time by trial and # error. It could undoubtedly be simplified but it works okay as is. # Here is a brief description of how indentation is computed. # Perl::Tidy computes indentation as the sum of 2 terms: # # (1) structural indentation, such as if/else/elsif blocks # (2) continuation indentation, such as long parameter call lists. # # These are occasionally called primary and secondary indentation. # # Structural indentation is introduced by tokens of type '{', # although the actual tokens might be '{', '(', or '['. Structural # indentation is of two types: BLOCK and non-BLOCK. Default # structural indentation is 4 characters if the standard indentation # scheme is used. # # Continuation indentation is introduced whenever a line at BLOCK # level is broken before its termination. Default continuation # indentation is 2 characters in the standard indentation scheme. # # Both types of indentation may be nested arbitrarily deep and # interlaced. The distinction between the two is somewhat arbitrary. 
        #
        # For each token, we will define two variables which would apply if
        # the current statement were broken just before that token, so that
        # that token started a new line:
        #
        #   $level = the structural indentation level,
        #   $ci_level = the continuation indentation level
        #
        # The total indentation will be $level * (4 spaces) + $ci_level * (2
        # spaces), assuming defaults.  However, in some special cases it is
        # customary to modify $ci_level from this strict value.
        #
        # The total structural indentation is easy to compute by adding and
        # subtracting 1 from a saved value as types '{' and '}' are seen.
        # The running value of this variable is $level_in_tokenizer.
        #
        # The total continuation is much more difficult to compute, and
        # requires several variables.  These variables are:
        #
        #   $ci_string_in_tokenizer = a string of 1's and 0's indicating, for
        #     each indentation level, if there are intervening open secondary
        #     structures just prior to that level.
        #   $continuation_string_in_tokenizer = a string of 1's and 0's
        #     indicating if the last token at that level is "continued", meaning
        #     that it is not the first token of an expression.
        #   $nesting_block_string = a string of 1's and 0's indicating, for each
        #     indentation level, if the level is of type BLOCK or not.
        #   $nesting_block_flag = the most recent 1 or 0 of $nesting_block_string
        #   $nesting_list_string = a string of 1's and 0's indicating, for each
        #     indentation level, if it is appropriate for list formatting.
        #     If so, continuation indentation is used to indent long list items.
        #   $nesting_list_flag = the most recent 1 or 0 of $nesting_list_string
        #   @{$rslevel_stack} = a stack of total nesting depths at each
        #     structural indentation level, where "total nesting depth" means
        #     the nesting depth that would occur if every nesting token
        #     -- '{', '[', and '(' -- , regardless of context, is used to
        #     compute a nesting depth.
        # Notes on the Continuation Indentation
        #
        # There is a sort of chicken-and-egg problem with continuation
        # indentation.  The formatter can't make decisions on line breaks
        # without knowing what 'ci' will be at arbitrary locations.
        #
        # But a problem with setting the continuation indentation (ci) here
        # in the tokenizer is that we do not know where line breaks will
        # actually be.  As a result, we don't know if we should propagate
        # continuation indentation to higher levels of structure.
        #
        # For nesting of only structural indentation, we never need to do
        # this.  For example, in a long if statement, like this
        #
        #   if ( !$output_block_type[$i]
        #     && ($in_statement_continuation) )
        #   {           <--outdented
        #       do_something();
        #   }
        #
        # the second line has ci but we do not normally give the lines within
        # the BLOCK any ci.  This would be true if we had blocks nested
        # arbitrarily deeply.
        #
        # But consider something like this, where we have created a break
        # after an opening paren on line 1, and the paren is not (currently)
        # a structural indentation token:
        #
        # my $file = $menubar->Menubutton(
        #   qw/-text File -underline 0 -menuitems/ => [
        #       [
        #           Cascade    => '~View',
        #           -menuitems => [
        #           ...
        #
        # The second line has ci, so it would seem reasonable to propagate
        # it down, giving the third line 1 ci + 1 indentation.  This
        # suggests the following rule, which is currently used to
        # propagate ci down: if there are any non-structural opening
        # parens (or brackets, or braces), before an opening structural
        # brace, then ci is propagated down, and otherwise
        # not.  The variable $intervening_secondary_structure contains this
        # information for the current token, and the string
        # "$ci_string_in_tokenizer" is a stack of previous values of this
        # variable.
        my @token_type    = ();    # stack of output token types
        my @block_type    = ();    # stack of output code block types
        my @type_sequence = ();    # stack of output type sequence numbers
        my @tokens        = ();    # output tokens
        my @levels        = ();    # structural brace levels of output tokens
        my @ci_string = ();  # string needed to compute continuation indentation

        # Count the number of '1's in the string (previously sub ones_count)
        my $ci_string_sum = ( my $str = $ci_string_in_tokenizer ) =~ tr/1/0/;

        $line_of_tokens->{_nesting_tokens_0} = $nesting_token_string;

        my ( $ci_string_i, $level_i );

        #-----------------
        # Loop over tokens
        #-----------------
        my $rtoken_map_im;
        foreach my $i ( @{$routput_token_list} ) {

            my $type_i = $routput_token_type->[$i];

            $level_i = $level_in_tokenizer;

            # Quick handling of indentation levels for blanks and comments
            if ( $type_i eq 'b' || $type_i eq '#' ) {
                $ci_string_i = $ci_string_sum + $in_statement_continuation;
            }

            # All other types
            else {

                # $tok_i is the PRE-token.  It only equals the token for symbols
                my $tok_i = $rtokens->[$i];

                # Check for an invalid token type..
                # This can happen by running perltidy on non-scripts although
                # it could also be a bug introduced by a programming change.
                # Perl silently accepts a 032 (^Z) and takes it as the end
                if ( !$is_valid_token_type{$type_i} ) {
                    my $val = ord($type_i);
                    warning(
                        "unexpected character decimal $val ($type_i) in script\n"
                    );
                    $self->[_in_error_] = 1;
                }

                # $ternary_indentation_flag indicates that we need a change
                # in level at a nested ternary, as follows
                #     1 => at a nested ternary ?
# -1 => at a nested ternary : # 0 => otherwise my $ternary_indentation_flag = $routput_indent_flag->[$i]; #------------------------------------------- # Section 1: handle a level-increasing token #------------------------------------------- # set primary indentation levels based on structural braces # Note: these are set so that the leading braces have a HIGHER # level than their CONTENTS, which is convenient for indentation # Also, define continuation indentation for each token. if ( $type_i eq '{' || $type_i eq 'L' || $ternary_indentation_flag > 0 ) { # if the difference between total nesting levels is not 1, # there are intervening non-structural nesting types between # this '{' and the previous unclosed '{' my $intervening_secondary_structure = 0; if ( @{$rslevel_stack} ) { $intervening_secondary_structure = $slevel_in_tokenizer - $rslevel_stack->[-1]; } # save the current states push( @{$rslevel_stack}, 1 + $slevel_in_tokenizer ); $level_in_tokenizer++; if ( $level_in_tokenizer > $self->[_maximum_level_] ) { $self->[_maximum_level_] = $level_in_tokenizer; } if ($ternary_indentation_flag) { # break BEFORE '?' in a nested ternary if ( $type_i eq '?' 
) { $level_i = $level_in_tokenizer; } $nesting_block_string .= "$nesting_block_flag"; } ## end if ($ternary_indentation_flag) else { if ( $routput_block_type->[$i] ) { $nesting_block_flag = 1; $nesting_block_string .= '1'; } else { $nesting_block_flag = 0; $nesting_block_string .= '0'; } } # we will use continuation indentation within containers # which are not blocks and not logical expressions my $bit = 0; if ( !$routput_block_type->[$i] ) { # propagate flag down at nested open parens if ( $routput_container_type->[$i] eq '(' ) { $bit = 1 if $nesting_list_flag; } # use list continuation if not a logical grouping # /^(if|elsif|unless|while|and|or|not|&&|!|\|\||for|foreach)$/ else { $bit = 1 unless $is_logical_container{ $routput_container_type ->[$i] }; } } $nesting_list_string .= $bit; $nesting_list_flag = $bit; $ci_string_in_tokenizer .= ( $intervening_secondary_structure != 0 ) ? '1' : '0'; $ci_string_sum = ( my $str = $ci_string_in_tokenizer ) =~ tr/1/0/; $continuation_string_in_tokenizer .= ( $in_statement_continuation > 0 ) ? '1' : '0'; # Sometimes we want to give an opening brace # continuation indentation, and sometimes not. For code # blocks, we don't do it, so that the leading '{' gets # outdented, like this: # # if ( !$output_block_type[$i] # && ($in_statement_continuation) ) # { <--outdented # # For other types, we will give them continuation # indentation. 
For example, here is how a list looks # with the opening paren indented: # # @LoL = # ( [ "fred", "barney" ], [ "george", "jane", "elroy" ], # [ "homer", "marge", "bart" ], ); # # This looks best when 'ci' is one-half of the # indentation (i.e., 2 and 4) my $total_ci = $ci_string_sum; if ( !$routput_block_type->[$i] # patch: skip for BLOCK && ($in_statement_continuation) && !( $ternary_indentation_flag && $type_i eq ':' ) ) { $total_ci += $in_statement_continuation unless ( substr( $ci_string_in_tokenizer, -1 ) eq '1' ); } $ci_string_i = $total_ci; $in_statement_continuation = 0; } ## end if ( $type_i eq '{' ||...}) #------------------------------------------- # Section 2: handle a level-decreasing token #------------------------------------------- elsif ($type_i eq '}' || $type_i eq 'R' || $ternary_indentation_flag < 0 ) { # only a nesting error in the script would prevent # popping here if ( @{$rslevel_stack} > 1 ) { pop( @{$rslevel_stack} ); } $level_i = --$level_in_tokenizer; if ( $level_in_tokenizer < 0 ) { unless ( $self->[_saw_negative_indentation_] ) { $self->[_saw_negative_indentation_] = 1; warning("Starting negative indentation\n"); } } # restore previous level values if ( length($nesting_block_string) > 1 ) { # true for valid script chop $nesting_block_string; $nesting_block_flag = substr( $nesting_block_string, -1 ) eq '1'; chop $nesting_list_string; $nesting_list_flag = substr( $nesting_list_string, -1 ) eq '1'; chop $ci_string_in_tokenizer; $ci_string_sum = ( my $str = $ci_string_in_tokenizer ) =~ tr/1/0/; $in_statement_continuation = chop $continuation_string_in_tokenizer; # zero continuation flag at terminal BLOCK '}' which # ends a statement. 
my $block_type_i = $routput_block_type->[$i]; if ($block_type_i) { # ...These include non-anonymous subs # note: could be sub ::abc { or sub 'abc if ( substr( $block_type_i, 0, 3 ) eq 'sub' && $block_type_i =~ m/^sub\s*/gc ) { # note: older versions of perl require the /gc # modifier here or else the \G does not work. $in_statement_continuation = 0 if ( $block_type_i =~ /\G('|::|\w)/gc ); } # ...and include all block types except user subs # with block prototypes and these: # (sort|grep|map|do|eval) elsif ( $is_zero_continuation_block_type{$block_type_i} ) { $in_statement_continuation = 0; } # ..but these are not terminal types: # /^(sort|grep|map|do|eval)$/ ) elsif ($is_sort_map_grep_eval_do{$block_type_i} || $is_grep_alias{$block_type_i} ) { } # ..and a block introduced by a label # /^\w+\s*:$/gc ) { elsif ( $block_type_i =~ /:$/ ) { $in_statement_continuation = 0; } # user function with block prototype else { $in_statement_continuation = 0; } } ## end if ($block_type_i) # If we are in a list, then # we must set continuation indentation at the closing # paren of something like this (paren after $check): # assert( # __LINE__, # ( not defined $check ) # or ref $check # or $check eq "new" # or $check eq "old", # ); elsif ( $tok_i eq ')' ) { $in_statement_continuation = 1 if ( $is_list_end_type{ $routput_container_type->[$i] } ); ##if $routput_container_type->[$i] =~ /^[;,\{\}]$/; } } ## end if ( length($nesting_block_string...)) $ci_string_i = $ci_string_sum + $in_statement_continuation; } ## end elsif ( $type_i eq '}' ||...{) #----------------------------------------- # Section 3: handle a constant level token #----------------------------------------- else { # zero the continuation indentation at certain tokens so # that they will be at the same level as its container. For # commas, this simplifies the -lp indentation logic, which # counts commas. For ?: it makes them stand out. 
if ( $nesting_list_flag ## $type_i =~ /^[,\?\:]$/ && $is_comma_question_colon{$type_i} ) { $in_statement_continuation = 0; } # Be sure binary operators get continuation indentation. # Note: the check on $nesting_block_flag is only needed # to add ci to binary operators following a 'try' block, # or similar extended syntax block operator (see c158). if ( !$in_statement_continuation && ( $nesting_block_flag || $nesting_list_flag ) && ( $type_i eq 'k' && $is_binary_keyword{$tok_i} || $is_binary_type{$type_i} ) ) { $in_statement_continuation = 1; } # continuation indentation is sum of any open ci from # previous levels plus the current level $ci_string_i = $ci_string_sum + $in_statement_continuation; # update continuation flag ... # if we are in a BLOCK if ($nesting_block_flag) { # the next token after a ';' and label starts a new stmt if ( $type_i eq ';' || $type_i eq 'J' ) { $in_statement_continuation = 0; } # otherwise, we are continuing the current statement else { $in_statement_continuation = 1; } } # if we are not in a BLOCK.. else { # do not use continuation indentation if not list # environment (could be within if/elsif clause) if ( !$nesting_list_flag ) { $in_statement_continuation = 0; } # otherwise, the token after a ',' starts a new term # Patch FOR RT#99961; no continuation after a ';' # This is needed because perltidy currently marks # a block preceded by a type character like % or @ # as a non block, to simplify formatting. But these # are actually blocks and can have semicolons. # See code_block_type() and is_non_structural_brace(). 
elsif ( $type_i eq ',' || $type_i eq ';' ) { $in_statement_continuation = 0; } # otherwise, we are continuing the current term else { $in_statement_continuation = 1; } } ## end else [ if ($nesting_block_flag)] } ## end else [ if ( $type_i eq '{' ||...})] #------------------------------------------- # Section 4: operations common to all levels #------------------------------------------- # set secondary nesting levels based on all containment token # types Note: these are set so that the nesting depth is the # depth of the PREVIOUS TOKEN, which is convenient for setting # the strength of token bonds # /^[L\{\(\[]$/ if ( $is_opening_type{$type_i} ) { $slevel_in_tokenizer++; $nesting_token_string .= $tok_i; $nesting_type_string .= $type_i; } # /^[R\}\)\]]$/ elsif ( $is_closing_type{$type_i} ) { $slevel_in_tokenizer--; my $char = chop $nesting_token_string; if ( $char ne $matching_start_token{$tok_i} ) { $nesting_token_string .= $char . $tok_i; $nesting_type_string .= $type_i; } else { chop $nesting_type_string; } } # apply token type patch: # - output anonymous 'sub' as keyword (type 'k') # - output __END__, __DATA__, and format as type 'k' instead # of ';' to make html colors correct, etc. 
# The following hash tests are equivalent to these older tests: # if ( $type_i eq 't' && $is_sub{$tok_i} ) { $fix_type = 'k' } # if ( $type_i eq ';' && $tok_i =~ /\w/ ) { $fix_type = 'k' } if ( $is_END_DATA_format_sub{$tok_i} && $is_semicolon_or_t{$type_i} ) { $type_i = 'k'; } } ## end else [ if ( $type_i eq 'b' ||...)] #-------------------------------- # Store the values for this token #-------------------------------- push( @ci_string, $ci_string_i ); push( @levels, $level_i ); push( @block_type, $routput_block_type->[$i] ); push( @type_sequence, $routput_type_sequence->[$i] ); push( @token_type, $type_i ); # Form and store the PREVIOUS token if ( defined($rtoken_map_im) ) { my $numc = $rtoken_map->[$i] - $rtoken_map_im; # how many characters if ( $numc > 0 ) { push( @tokens, substr( $input_line, $rtoken_map_im, $numc ) ); } else { # Should not happen unless @{$rtoken_map} is corrupted DEVEL_MODE && Fault( "number of characters is '$numc' but should be >0\n"); } } # or grab some values for the leading token (needed for log output) else { $line_of_tokens->{_nesting_blocks_0} = $nesting_block_string; } $rtoken_map_im = $rtoken_map->[$i]; } ## end foreach my $i ( @{$routput_token_list...}) #------------------------ # End loop to over tokens #------------------------ # Form and store the final token of this line if ( defined($rtoken_map_im) ) { my $numc = length($input_line) - $rtoken_map_im; if ( $numc > 0 ) { push( @tokens, substr( $input_line, $rtoken_map_im, $numc ) ); } else { # Should not happen unless @{$rtoken_map} is corrupted DEVEL_MODE && Fault( "Number of Characters is '$numc' but should be >0\n"); } } #---------------------------------------------------------- # Wrap up this line of tokens for shipping to the Formatter #---------------------------------------------------------- $line_of_tokens->{_rtoken_type} = \@token_type; $line_of_tokens->{_rtokens} = \@tokens; $line_of_tokens->{_rblock_type} = \@block_type; $line_of_tokens->{_rtype_sequence} = 
\@type_sequence; $line_of_tokens->{_rlevels} = \@levels; $line_of_tokens->{_rci_levels} = \@ci_string; return; } ## end sub tokenizer_wrapup_line } ## end tokenize_this_line ####################################################################### # Tokenizer routines which assist in identifying token types ####################################################################### # hash lookup table of operator expected values my %op_expected_table; # exceptions to perl's weird parsing rules after type 'Z' my %is_weird_parsing_rule_exception; my %is_paren_dollar; my %is_n_v; BEGIN { # Always expecting TERM following these types: # note: this is identical to '@value_requestor_type' defined later. my @q = qw( ; ! + x & ? F J - p / Y : % f U ~ A G j L * . | ^ < = [ m { \ > t || >= != mm *= => .. !~ == && |= .= pp -= =~ += <= %= ^= x= ~~ ** << /= &= // >> ~. &. |. ^. ... **= <<= >>= &&= ||= //= <=> !~~ &.= |.= ^.= <<~ ); push @q, ','; push @q, '('; # for completeness, not currently a token type push @q, '->'; # was previously in UNKNOWN @{op_expected_table}{@q} = (TERM) x scalar(@q); # Always UNKNOWN following these types; # previously had '->' in this list for c030 @q = qw( w ); @{op_expected_table}{@q} = (UNKNOWN) x scalar(@q); # Always expecting OPERATOR ... # 'n' and 'v' are currently excluded because they might be VERSION numbers # 'i' is currently excluded because it might be a package # 'q' is currently excluded because it might be a prototype # Fix for c030: removed '->' from this list: @q = qw( -- C h R ++ ] Q <> ); ## n v q i ); push @q, ')'; @{op_expected_table}{@q} = (OPERATOR) x scalar(@q); # Fix for git #62: added '*' and '%' @q = qw( < ? 
* % ); @{is_weird_parsing_rule_exception}{@q} = (1) x scalar(@q); @q = qw<) $>; @{is_paren_dollar}{@q} = (1) x scalar(@q); @q = qw( n v ); @{is_n_v}{@q} = (1) x scalar(@q); } ## end BEGIN use constant DEBUG_OPERATOR_EXPECTED => 0; sub operator_expected { # Returns a parameter indicating what types of tokens can occur next # Call format: # $op_expected = operator_expected( [ $prev_type, $tok, $next_type ] ); # where # $prev_type is the type of the previous token (blank or not) # $tok is the current token # $next_type is the type of the next token (blank or not) # Many perl symbols have two or more meanings. For example, '<<' # can be a shift operator or a here-doc operator. The # interpretation of these symbols depends on the current state of # the tokenizer, which may either be expecting a term or an # operator. For this example, a << would be a shift if an OPERATOR # is expected, and a here-doc if a TERM is expected. This routine # is called to make this decision for any current token. It returns # one of three possible values: # # OPERATOR - operator expected (or at least, not a term) # UNKNOWN - can't tell # TERM - a term is expected (or at least, not an operator) # # The decision is based on what has been seen so far. This # information is stored in the "$last_nonblank_type" and # "$last_nonblank_token" variables. For example, if the # $last_nonblank_type is '=~', then we are expecting a TERM, whereas # if $last_nonblank_type is 'n' (numeric), we are expecting an # OPERATOR. # # If a UNKNOWN is returned, the calling routine must guess. A major # goal of this tokenizer is to minimize the possibility of returning # UNKNOWN, because a wrong guess can spoil the formatting of a # script. # # Adding NEW_TOKENS: it is critically important that this routine be # updated to allow it to determine if an operator or term is to be # expected after the new token. 
    # Doing this simply involves adding the new token character to one of
    # the regexes in this routine or to one of the hash lists that it uses,
    # which are initialized in the BEGIN section.
    # USES GLOBAL VARIABLES: $last_nonblank_type, $last_nonblank_token,
    # $statement_type

    # When possible, token types should be selected such that we can determine
    # the 'operator_expected' value by a simple hash lookup.  If there are
    # exceptions, that is an indication that a new type is needed.

    my ($rarg) = @_;

    #-------------
    # Table lookup
    #-------------

    # Many types can be obtained by a table lookup given the previous type.
    # This typically handles half or more of the calls.
    my $op_expected = $op_expected_table{$last_nonblank_type};
    if ( defined($op_expected) ) {
        DEBUG_OPERATOR_EXPECTED
          && print STDOUT
"OPERATOR_EXPECTED: Table Lookup; returns $op_expected for last type $last_nonblank_type token $last_nonblank_token\n";
        return $op_expected;
    }

    #---------------------
    # Handle special cases
    #---------------------

    $op_expected = UNKNOWN;
    my ( $prev_type, $tok, $next_type ) = @{$rarg};

    # Types 'k', '}' and 'Z' depend on context
    # Types 'i', 'n', 'v', 'q' currently also temporarily depend on context.

    # identifier...
    if ( $last_nonblank_type eq 'i' ) {
        $op_expected = OPERATOR;

        # TODO: it would be cleaner to make this a special type
        # expecting VERSION or {} after package NAMESPACE;
        # maybe mark these words as type 'Y'?
        if (   substr( $last_nonblank_token, 0, 7 ) eq 'package'
            && $statement_type      =~ /^package\b/
            && $last_nonblank_token =~ /^package\b/ )
        {
            $op_expected = TERM;
        }
    }

    # keyword...
elsif ( $last_nonblank_type eq 'k' ) { $op_expected = TERM; if ( $expecting_operator_token{$last_nonblank_token} ) { $op_expected = OPERATOR; } elsif ( $expecting_term_token{$last_nonblank_token} ) { # Exceptions from TERM: # // may follow perl functions which may be unary operators # see test file dor.t (defined or); if ( $tok eq '/' && $next_type eq '/' && $is_keyword_rejecting_slash_as_pattern_delimiter{ $last_nonblank_token} ) { $op_expected = OPERATOR; } # Patch to allow a ? following 'split' to be a deprecated pattern # delimiter. This patch is coordinated with the omission of split # from the list # %is_keyword_rejecting_question_as_pattern_delimiter. This patch # will force perltidy to guess. elsif ($tok eq '?' && $last_nonblank_token eq 'split' ) { $op_expected = UNKNOWN; } } } ## end type 'k' # closing container token... # Note that the actual token for type '}' may also be a ')'. # Also note that $last_nonblank_token is not the token corresponding to # $last_nonblank_type when the type is a closing container. In that # case it is the token before the corresponding opening container token. # So for example, for this snippet # $a = do { BLOCK } / 2; # the $last_nonblank_token is 'do' when $last_nonblank_type eq '}'. elsif ( $last_nonblank_type eq '}' ) { $op_expected = UNKNOWN; # handle something after 'do' and 'eval' if ( $is_block_operator{$last_nonblank_token} ) { # something like $a = do { BLOCK } / 2; $op_expected = OPERATOR; # block mode following } } # $last_nonblank_token =~ /^(\)|\$|\-\>)/ elsif ( $is_paren_dollar{ substr( $last_nonblank_token, 0, 1 ) } || substr( $last_nonblank_token, 0, 2 ) eq '->' ) { $op_expected = OPERATOR; if ( $last_nonblank_token eq '$' ) { $op_expected = UNKNOWN } } # Check for smartmatch operator before preceding brace or square # bracket. For example, at the ? after the ] in the following # expressions we are expecting an operator: # # qr/3/ ~~ ['1234'] ? 1 : 0; # map { $_ ~~ [ '0', '1' ] ? 
        #      'x' : 'o' } @a;
        elsif ( $last_nonblank_token eq '~~' ) {
            $op_expected = OPERATOR;
        }

        # A right brace here indicates the end of a simple block.  All
        # non-structural right braces have type 'R', and all braces associated
        # with block operator keywords have been given those keywords as
        # "last_nonblank_token" and caught above.  (This statement is order
        # dependent, and must come after checking $last_nonblank_token).
        else {

            # patch for dor.t (defined or).
            if (   $tok eq '/'
                && $next_type eq '/'
                && $last_nonblank_token eq ']' )
            {
                $op_expected = OPERATOR;
            }

            # Patch for RT #116344: misparse a ternary operator after an
            # anonymous hash, like this:
            #   return ref {} ? 1 : 0;
            # The right brace should really be marked type 'R' in this case,
            # and it is safest to return an UNKNOWN here. Expecting a TERM will
            # cause the '?' to always be interpreted as a pattern delimiter
            # rather than introducing a ternary operator.
            elsif ( $tok eq '?' ) {
                $op_expected = UNKNOWN;
            }
            else {
                $op_expected = TERM;
            }
        }
    } ## end type '}'

    # number or v-string...
    # An exception is for VERSION numbers in a 'use' statement. It has the
    # format
    #     use Module VERSION LIST
    # We could avoid this exception by writing a special sub to parse 'use'
    # statements and perhaps mark these numbers with a new type V (for VERSION)
    ##elsif ( $last_nonblank_type =~ /^[nv]$/ ) {
    elsif ( $is_n_v{$last_nonblank_type} ) {
        $op_expected = OPERATOR;
        if ( $statement_type eq 'use' ) {
            $op_expected = UNKNOWN;
        }
    }

    # quote...
    # TODO: labeled prototype words would better be given type 'A' or maybe
    # 'J'; not 'q'; or maybe mark as type 'Y'?
    elsif ( $last_nonblank_type eq 'q' ) {
        $op_expected = OPERATOR;
        if ( $last_nonblank_token eq 'prototype' ) {
            $op_expected = TERM;
        }

        # update for --use-feature=class (rt145706):
        # Look for class VERSION after possible attribute, as in
        #    class Example::Subclass : isa(Example::Base) 1.345 { ...
} elsif ( $statement_type =~ /^package\b/ ) { $op_expected = TERM; } } # file handle or similar elsif ( $last_nonblank_type eq 'Z' ) { $op_expected = UNKNOWN; # angle.t if ( $last_nonblank_token =~ /^\w/ ) { $op_expected = UNKNOWN; } # Exception to weird parsing rules for 'x(' ... see case b1205: # In something like 'print $vv x(...' the x is an operator; # Likewise in 'print $vv x$ww' the x is an operator (case b1207) # otherwise x follows the weird parsing rules. elsif ( $tok eq 'x' && $next_type =~ /^[\(\$\@\%]$/ ) { $op_expected = OPERATOR; } # The 'weird parsing rules' of next section do not work for '<' and '?' # It is best to mark them as unknown. Test case: # print $fh ; elsif ( $is_weird_parsing_rule_exception{$tok} ) { $op_expected = UNKNOWN; } # For possible file handle like "$a", Perl uses weird parsing rules. # For example: # print $a/2,"/hi"; - division # print $a / 2,"/hi"; - division # print $a/ 2,"/hi"; - division # print $a /2,"/hi"; - pattern (and error)! # Some examples where this logic works okay, for '&','*','+': # print $fh &xsi_protos(@mods); # my $x = new $CompressClass *FH; # print $OUT +( $count % 15 ? ", " : "\n\t" ); elsif ($prev_type eq 'b' && $next_type ne 'b' ) { $op_expected = TERM; } # Note that '?' and '<' have been moved above # ( $tok =~ /^([x\/\+\-\*\%\&\.\?\<]|\>\>)$/ ) { elsif ( $tok =~ /^([x\/\+\-\*\%\&\.]|\>\>)$/ ) { # Do not complain in 'use' statements, which have special syntax. # For example, from RT#130344: # use lib $FindBin::Bin . '/lib'; if ( $statement_type ne 'use' ) { complain( "operator in possible indirect object location not recommended\n" ); } $op_expected = OPERATOR; } } # anything else... 
else { $op_expected = UNKNOWN; } DEBUG_OPERATOR_EXPECTED && print STDOUT "OPERATOR_EXPECTED: returns $op_expected for last type $last_nonblank_type token $last_nonblank_token\n"; return $op_expected; } ## end sub operator_expected sub new_statement_ok { # return true if the current token can start a new statement # USES GLOBAL VARIABLES: $last_nonblank_type return label_ok() # a label would be ok here || $last_nonblank_type eq 'J'; # or we follow a label } ## end sub new_statement_ok sub label_ok { # Decide if a bare word followed by a colon here is a label # USES GLOBAL VARIABLES: $last_nonblank_token, $last_nonblank_type, # $brace_depth, @brace_type # if it follows an opening or closing code block curly brace.. if ( ( $last_nonblank_token eq '{' || $last_nonblank_token eq '}' ) && $last_nonblank_type eq $last_nonblank_token ) { # it is a label if and only if the curly encloses a code block return $brace_type[$brace_depth]; } # otherwise, it is a label if and only if it follows a ';' (real or fake) # or another label else { return ( $last_nonblank_type eq ';' || $last_nonblank_type eq 'J' ); } } ## end sub label_ok sub code_block_type { # Decide if this is a block of code, and its type. # Must be called only when $type = $token = '{' # The problem is to distinguish between the start of a block of code # and the start of an anonymous hash reference # Returns "" if not code block, otherwise returns 'last_nonblank_token' # to indicate the type of code block. (For example, 'last_nonblank_token' # might be 'if' for an if block, 'else' for an else block, etc). 
    # USES GLOBAL VARIABLES: $last_nonblank_token, $last_nonblank_type,
    # $last_nonblank_block_type, $brace_depth, @brace_type

    # handle case of multiple '{'s

    # print "BLOCK_TYPE EXAMINING: type=$last_nonblank_type tok=$last_nonblank_token\n";

    my ( $i, $rtokens, $rtoken_type, $max_token_index ) = @_;
    if (   $last_nonblank_token eq '{'
        && $last_nonblank_type eq $last_nonblank_token )
    {

        # opening brace where a statement may appear is probably
        # a code block but might be an anonymous hash reference
        if ( $brace_type[$brace_depth] ) {
            return decide_if_code_block( $i, $rtokens, $rtoken_type,
                $max_token_index );
        }

        # cannot start a code block within an anonymous hash
        else {
            return EMPTY_STRING;
        }
    }

    elsif ( $last_nonblank_token eq ';' ) {

        # an opening brace where a statement may appear is probably
        # a code block but might be an anonymous hash reference
        return decide_if_code_block( $i, $rtokens, $rtoken_type,
            $max_token_index );
    }

    # handle case of '}{'
    elsif ($last_nonblank_token eq '}'
        && $last_nonblank_type eq $last_nonblank_token )
    {

        # a } { situation ...
        # could be hash reference after code block..(blktype1.t)
        if ($last_nonblank_block_type) {
            return decide_if_code_block( $i, $rtokens, $rtoken_type,
                $max_token_index );
        }

        # must be a block if it follows a closing hash reference
        else {
            return $last_nonblank_token;
        }
    }

    #--------------------------------------------------------------
    # NOTE: braces after type characters start code blocks, but for
    # simplicity these are not identified as such.  See also
    # sub is_non_structural_brace.
    #--------------------------------------------------------------

    ##    elsif ( $last_nonblank_type eq 't' ) {
    ##       return $last_nonblank_token;
    ##    }

    # brace after label:
    elsif ( $last_nonblank_type eq 'J' ) {
        return $last_nonblank_token;
    }

    # otherwise, look at previous token.
This must be a code block if # it follows any of these: # /^(BEGIN|END|CHECK|INIT|AUTOLOAD|DESTROY|UNITCHECK|continue|if|elsif|else|unless|do|while|until|eval|for|foreach|map|grep|sort)$/ elsif ($is_code_block_token{$last_nonblank_token} || $is_grep_alias{$last_nonblank_token} ) { # Bug Patch: Note that the opening brace after the 'if' in the following # snippet is an anonymous hash ref and not a code block! # print 'hi' if { x => 1, }->{x}; # We can identify this situation because the last nonblank type # will be a keyword (instead of a closing paren) if ( $last_nonblank_type eq 'k' && ( $last_nonblank_token eq 'if' || $last_nonblank_token eq 'unless' ) ) { return EMPTY_STRING; } else { return $last_nonblank_token; } } # or a sub or package BLOCK elsif ( ( $last_nonblank_type eq 'i' || $last_nonblank_type eq 't' ) && $last_nonblank_token =~ /^(sub|package)\b/ ) { return $last_nonblank_token; } # or a sub alias elsif (( $last_nonblank_type eq 'i' || $last_nonblank_type eq 't' ) && ( $is_sub{$last_nonblank_token} ) ) { return 'sub'; } elsif ( $statement_type =~ /^(sub|package)\b/ ) { return $statement_type; } # user-defined subs with block parameters (like grep/map/eval) elsif ( $last_nonblank_type eq 'G' ) { return $last_nonblank_token; } # check bareword elsif ( $last_nonblank_type eq 'w' ) { # check for syntax 'use MODULE LIST' # This fixes b1022 b1025 b1027 b1028 b1029 b1030 b1031 return EMPTY_STRING if ( $statement_type eq 'use' ); return decide_if_code_block( $i, $rtokens, $rtoken_type, $max_token_index ); } # Patch for bug # RT #94338 reported by Daniel Trizen # for-loop in a parenthesized block-map triggering an error message: # map( { foreach my $item ( '0', '1' ) { print $item} } qw(a b c) ); # Check for a code block within a parenthesized function call elsif ( $last_nonblank_token eq '(' ) { my $paren_type = $paren_type[$paren_depth]; # /^(map|grep|sort)$/ if ( $paren_type && $is_sort_map_grep{$paren_type} ) { # We will mark this as a code block but use 
type 't' instead # of the name of the containing function. This will allow for # correct parsing but will usually produce better formatting. # Braces with block type 't' are not broken open automatically # in the formatter as are other code block types, and this usually # works best. return 't'; # (Not $paren_type) } else { return EMPTY_STRING; } } # handle unknown syntax ') {' # we previously appended a '()' to mark this case elsif ( $last_nonblank_token =~ /\(\)$/ ) { return $last_nonblank_token; } # anything else must be anonymous hash reference else { return EMPTY_STRING; } } ## end sub code_block_type sub decide_if_code_block { # USES GLOBAL VARIABLES: $last_nonblank_token my ( $i, $rtokens, $rtoken_type, $max_token_index ) = @_; my ( $next_nonblank_token, $i_next ) = find_next_nonblank_token( $i, $rtokens, $max_token_index ); # we are at a '{' where a statement may appear. # We must decide if this brace starts an anonymous hash or a code # block. # return "" if anonymous hash, and $last_nonblank_token otherwise # initialize to be code BLOCK my $code_block_type = $last_nonblank_token; # Check for the common case of an empty anonymous hash reference: # Maybe something like sub { { } } if ( $next_nonblank_token eq '}' ) { $code_block_type = EMPTY_STRING; } else { # To guess if this '{' is an anonymous hash reference, look ahead # and test as follows: # # it is a hash reference if next come: # - a string or digit followed by a comma or => # - bareword followed by => # otherwise it is a code block # # Examples of anonymous hash ref: # {'aa',}; # {1,2} # # Examples of code blocks: # {1; print "hello\n", 1;} # {$a,1}; # We are only going to look ahead one more (nonblank/comment) line. # Strange formatting could cause a bad guess, but that's unlikely. my @pre_types; my @pre_tokens; # Ignore the rest of this line if it is a side comment if ( $next_nonblank_token ne '#' ) { @pre_types = @{$rtoken_type}[ $i + 1 .. $max_token_index ]; @pre_tokens = @{$rtokens}[ $i + 1 .. 
$max_token_index ]; } my ( $rpre_tokens, $rpre_types ) = peek_ahead_for_n_nonblank_pre_tokens(20); # 20 is arbitrary but # generous, and prevents # wasting lots of # time in mangled files if ( defined($rpre_types) && @{$rpre_types} ) { push @pre_types, @{$rpre_types}; push @pre_tokens, @{$rpre_tokens}; } # put a sentinel token to simplify stopping the search push @pre_types, '}'; push @pre_types, '}'; my $jbeg = 0; $jbeg = 1 if $pre_types[0] eq 'b'; # first look for one of these # - bareword # - bareword with leading - # - digit # - quoted string my $j = $jbeg; if ( $pre_types[$j] =~ /^[\'\"]/ ) { # find the closing quote; don't worry about escapes my $quote_mark = $pre_types[$j]; foreach my $k ( $j + 1 .. @pre_types - 2 ) { if ( $pre_types[$k] eq $quote_mark ) { $j = $k + 1; ##my $next = $pre_types[$j]; last; } } } elsif ( $pre_types[$j] eq 'd' ) { $j++; } elsif ( $pre_types[$j] eq 'w' ) { $j++; } elsif ( $pre_types[$j] eq '-' && $pre_types[ ++$j ] eq 'w' ) { $j++; } if ( $j > $jbeg ) { $j++ if $pre_types[$j] eq 'b'; # Patched for RT #95708 if ( # it is a comma which is not a pattern delimiter except for qw ( $pre_types[$j] eq ',' ## !~ /^(s|m|y|tr|qr|q|qq|qx)$/ && !$is_q_qq_qx_qr_s_y_tr_m{ $pre_tokens[$jbeg] } ) # or a => || ( $pre_types[$j] eq '=' && $pre_types[ ++$j ] eq '>' ) ) { $code_block_type = EMPTY_STRING; } } if ($code_block_type) { # Patch for cases b1085 b1128: It is uncertain if this is a block. # If this brace follows a bareword, then append a space as a signal # to the formatter that this may not be a block brace. To find the # corresponding code in Formatter.pm search for 'b1085'. 
$code_block_type .= SPACE if ( $code_block_type =~ /^\w/ ); } } return $code_block_type; } ## end sub decide_if_code_block sub report_unexpected { # report unexpected token type and show where it is # USES GLOBAL VARIABLES: $tokenizer_self my ( $found, $expecting, $i_tok, $last_nonblank_i, $rpretoken_map, $rpretoken_type, $input_line ) = @_; if ( ++$tokenizer_self->[_unexpected_error_count_] <= MAX_NAG_MESSAGES ) { my $msg = "found $found where $expecting expected"; my $pos = $rpretoken_map->[$i_tok]; interrupt_logfile(); my $input_line_number = $tokenizer_self->[_last_line_number_]; my ( $offset, $numbered_line, $underline ) = make_numbered_line( $input_line_number, $input_line, $pos ); $underline = write_on_underline( $underline, $pos - $offset, '^' ); my $trailer = EMPTY_STRING; if ( ( $i_tok > 0 ) && ( $last_nonblank_i >= 0 ) ) { my $pos_prev = $rpretoken_map->[$last_nonblank_i]; my $num; if ( $rpretoken_type->[ $i_tok - 1 ] eq 'b' ) { $num = $rpretoken_map->[ $i_tok - 1 ] - $pos_prev; } else { $num = $pos - $pos_prev; } if ( $num > 40 ) { $num = 40; $pos_prev = $pos - 40; } $underline = write_on_underline( $underline, $pos_prev - $offset, '-' x $num ); $trailer = " (previous token underlined)"; } $underline =~ s/\s+$//; warning( $numbered_line . "\n" ); warning( $underline . "\n" ); warning( $msg . $trailer . "\n" ); resume_logfile(); } return; } ## end sub report_unexpected my %is_sigil_or_paren; my %is_R_closing_sb; BEGIN { my @q = qw< $ & % * @ ) >; @{is_sigil_or_paren}{@q} = (1) x scalar(@q); @q = qw(R ]); @{is_R_closing_sb}{@q} = (1) x scalar(@q); } ## end BEGIN sub is_non_structural_brace { # Decide if a brace or bracket is structural or non-structural # by looking at the previous token and type # USES GLOBAL VARIABLES: $last_nonblank_type, $last_nonblank_token # EXPERIMENTAL: Mark slices as structural; idea was to improve formatting. 
# Tentatively deactivated because it caused the wrong operator expectation # for this code: # $user = @vars[1] / 100; # Must update sub operator_expected before re-implementing. # if ( $last_nonblank_type eq 'i' && $last_nonblank_token =~ /^@/ ) { # return 0; # } #-------------------------------------------------------------- # NOTE: braces after type characters start code blocks, but for # simplicity these are not identified as such. See also # sub code_block_type #-------------------------------------------------------------- ##if ($last_nonblank_type eq 't') {return 0} # otherwise, it is non-structural if it is decorated # by type information. # For example, the '{' here is non-structural: ${xxx} # Removed '::' to fix c074 ## $last_nonblank_token =~ /^([\$\@\*\&\%\)]|->|::)/ return ( ## $last_nonblank_token =~ /^([\$\@\*\&\%\)]|->)/ $is_sigil_or_paren{ substr( $last_nonblank_token, 0, 1 ) } || substr( $last_nonblank_token, 0, 2 ) eq '->' # or if we follow a hash or array closing curly brace or bracket # For example, the second '{' in this is non-structural: $a{'x'}{'y'} # because the first '}' would have been given type 'R' ##|| $last_nonblank_type =~ /^([R\]])$/ || $is_R_closing_sb{$last_nonblank_type} ); } ## end sub is_non_structural_brace ####################################################################### # Tokenizer routines for tracking container nesting depths ####################################################################### # The following routines keep track of nesting depths of the nesting # types, ( [ { and ?. This is necessary for determining the indentation # level, and also for debugging programs. Not only do they keep track of # nesting depths of the individual brace types, but they check that each # of the other brace types is balanced within matching pairs. For # example, if the program sees this sequence: # # { ( ( ) } # # then it can determine that there is an extra left paren somewhere # between the { and the }. 
# And so on with every other possible
# combination of outer and inner brace types.  For another
# example:
#
#         ( [ ..... ]  ] )
#
# which has an extra ] within the parens.
#
# The brace types have indexes 0 .. 3 which are indexes into
# the matrices.
#
# The pair ? : is treated as just another nesting type, with ? acting
# as the opening brace and : acting as the closing brace.
#
# The matrix
#
#         $depth_array[$a][$b][ $current_depth[$a] ] = $current_depth[$b];
#
# saves the nesting depth of brace type $b (where $b is any of the other
# nesting types) when brace type $a enters a new depth.  When this depth
# decreases, a check is made that the current depth of brace type $b is
# unchanged, or otherwise there must have been an error.  This can
# be very useful for localizing errors, particularly when perl runs to
# the end of a large file (such as this one) and announces that there
# is a problem somewhere.
#
# A numerical sequence number is maintained for every nesting type,
# so that each matching pair can be uniquely identified in a simple
# way.

sub increase_nesting_depth {
    my ( $aa, $pos ) = @_;

    # USES GLOBAL VARIABLES: $tokenizer_self, @current_depth,
    # @current_sequence_number, @depth_array, @starting_line_of_current_depth,
    # $statement_type
    $current_depth[$aa]++;
    $total_depth++;
    $total_depth[$aa][ $current_depth[$aa] ] = $total_depth;
    my $input_line_number = $tokenizer_self->[_last_line_number_];
    my $input_line        = $tokenizer_self->[_line_of_text_];

    # Sequence numbers increment by number of items.  This keeps
    # a unique set of numbers but still allows the relative location
    # of any type to be determined.

    # make a new unique sequence number
    my $seqno = $next_sequence_number++;

    $current_sequence_number[$aa][ $current_depth[$aa] ] = $seqno;

    $starting_line_of_current_depth[$aa][ $current_depth[$aa] ] =
      [ $input_line_number, $input_line, $pos ];

    for my $bb ( 0 ..
@closing_brace_names - 1 ) { next if ( $bb == $aa ); $depth_array[$aa][$bb][ $current_depth[$aa] ] = $current_depth[$bb]; } # set a flag for indenting a nested ternary statement my $indent = 0; if ( $aa == QUESTION_COLON ) { $nested_ternary_flag[ $current_depth[$aa] ] = 0; if ( $current_depth[$aa] > 1 ) { if ( $nested_ternary_flag[ $current_depth[$aa] - 1 ] == 0 ) { my $pdepth = $total_depth[$aa][ $current_depth[$aa] - 1 ]; if ( $pdepth == $total_depth - 1 ) { $indent = 1; $nested_ternary_flag[ $current_depth[$aa] - 1 ] = -1; } } } } # Fix part #1 for git82: save last token type for propagation of type 'Z' $nested_statement_type[$aa][ $current_depth[$aa] ] = [ $statement_type, $last_nonblank_type, $last_nonblank_token ]; $statement_type = EMPTY_STRING; return ( $seqno, $indent ); } ## end sub increase_nesting_depth sub is_balanced_closing_container { # Return true if a closing container can go here without error # Return false if not my ($aa) = @_; # cannot close if there was no opening return unless ( $current_depth[$aa] > 0 ); # check that any other brace types $bb contained within would be balanced for my $bb ( 0 .. 
@closing_brace_names - 1 ) { next if ( $bb == $aa ); return unless ( $depth_array[$aa][$bb][ $current_depth[$aa] ] == $current_depth[$bb] ); } # OK, everything will be balanced return 1; } ## end sub is_balanced_closing_container sub decrease_nesting_depth { my ( $aa, $pos ) = @_; # USES GLOBAL VARIABLES: $tokenizer_self, @current_depth, # @current_sequence_number, @depth_array, @starting_line_of_current_depth # $statement_type my $seqno = 0; my $input_line_number = $tokenizer_self->[_last_line_number_]; my $input_line = $tokenizer_self->[_line_of_text_]; my $outdent = 0; $total_depth--; if ( $current_depth[$aa] > 0 ) { # set a flag for un-indenting after seeing a nested ternary statement $seqno = $current_sequence_number[$aa][ $current_depth[$aa] ]; if ( $aa == QUESTION_COLON ) { $outdent = $nested_ternary_flag[ $current_depth[$aa] ]; } # Fix part #2 for git82: use saved type for propagation of type 'Z' # through type L-R braces. Perl seems to allow ${bareword} # as an indirect object, but nothing much more complex than that. ( $statement_type, my $saved_type, my $saved_token ) = @{ $nested_statement_type[$aa][ $current_depth[$aa] ] }; if ( $aa == BRACE && $saved_type eq 'Z' && $last_nonblank_type eq 'w' && $brace_structural_type[$brace_depth] eq 'L' ) { $last_nonblank_type = $saved_type; } # check that any brace types $bb contained within are balanced for my $bb ( 0 .. 
@closing_brace_names - 1 ) { next if ( $bb == $aa ); unless ( $depth_array[$aa][$bb][ $current_depth[$aa] ] == $current_depth[$bb] ) { my $diff = $current_depth[$bb] - $depth_array[$aa][$bb][ $current_depth[$aa] ]; # don't whine too many times my $saw_brace_error = get_saw_brace_error(); if ( $saw_brace_error <= MAX_NAG_MESSAGES # if too many closing types have occurred, we probably # already caught this error && ( ( $diff > 0 ) || ( $saw_brace_error <= 0 ) ) ) { interrupt_logfile(); my $rsl = $starting_line_of_current_depth[$aa] [ $current_depth[$aa] ]; my $sl = $rsl->[0]; my $rel = [ $input_line_number, $input_line, $pos ]; my $el = $rel->[0]; my ($ess); if ( $diff == 1 || $diff == -1 ) { $ess = EMPTY_STRING; } else { $ess = 's'; } my $bname = ( $diff > 0 ) ? $opening_brace_names[$bb] : $closing_brace_names[$bb]; write_error_indicator_pair( @{$rsl}, '^' ); my $msg = <<"EOM"; Found $diff extra $bname$ess between $opening_brace_names[$aa] on line $sl and $closing_brace_names[$aa] on line $el EOM if ( $diff > 0 ) { my $rml = $starting_line_of_current_depth[$bb] [ $current_depth[$bb] ]; my $ml = $rml->[0]; $msg .= " The most recent un-matched $bname is on line $ml\n"; write_error_indicator_pair( @{$rml}, '^' ); } write_error_indicator_pair( @{$rel}, '^' ); warning($msg); resume_logfile(); } increment_brace_error(); } } $current_depth[$aa]--; } else { my $saw_brace_error = get_saw_brace_error(); if ( $saw_brace_error <= MAX_NAG_MESSAGES ) { my $msg = <<"EOM"; There is no previous $opening_brace_names[$aa] to match a $closing_brace_names[$aa] on line $input_line_number EOM indicate_error( $msg, $input_line_number, $input_line, $pos, '^' ); } increment_brace_error(); # keep track of errors in braces alone (ignoring ternary nesting errors) $tokenizer_self->[_true_brace_error_count_]++ if ( $closing_brace_names[$aa] ne "':'" ); } return ( $seqno, $outdent ); } ## end sub decrease_nesting_depth sub check_final_nesting_depths { # USES GLOBAL VARIABLES: @current_depth, 
@starting_line_of_current_depth for my $aa ( 0 .. @closing_brace_names - 1 ) { if ( $current_depth[$aa] ) { my $rsl = $starting_line_of_current_depth[$aa][ $current_depth[$aa] ]; my $sl = $rsl->[0]; my $msg = <<"EOM"; Final nesting depth of $opening_brace_names[$aa]s is $current_depth[$aa] The most recent un-matched $opening_brace_names[$aa] is on line $sl EOM indicate_error( $msg, @{$rsl}, '^' ); increment_brace_error(); } } return; } ## end sub check_final_nesting_depths ####################################################################### # Tokenizer routines for looking ahead in input stream ####################################################################### sub peek_ahead_for_n_nonblank_pre_tokens { # returns next n pretokens if they exist # returns undef's if hits eof without seeing any pretokens # USES GLOBAL VARIABLES: $tokenizer_self my $max_pretokens = shift; my $line; my $i = 0; my ( $rpre_tokens, $rmap, $rpre_types ); while ( $line = $tokenizer_self->[_line_buffer_object_]->peek_ahead( $i++ ) ) { $line =~ s/^\s*//; # trim leading blanks next if ( length($line) <= 0 ); # skip blank next if ( $line =~ /^#/ ); # skip comment ( $rpre_tokens, $rmap, $rpre_types ) = pre_tokenize( $line, $max_pretokens ); last; } return ( $rpre_tokens, $rpre_types ); } ## end sub peek_ahead_for_n_nonblank_pre_tokens # look ahead for next non-blank, non-comment line of code sub peek_ahead_for_nonblank_token { # USES GLOBAL VARIABLES: $tokenizer_self my ( $rtokens, $max_token_index ) = @_; my $line; my $i = 0; while ( $line = $tokenizer_self->[_line_buffer_object_]->peek_ahead( $i++ ) ) { $line =~ s/^\s*//; # trim leading blanks next if ( length($line) <= 0 ); # skip blank next if ( $line =~ /^#/ ); # skip comment # Updated from 2 to 3 to get trigraphs, added for case b1175 my ( $rtok, $rmap, $rtype ) = pre_tokenize( $line, 3 ); my $j = $max_token_index + 1; foreach my $tok ( @{$rtok} ) { last if ( $tok =~ "\n" ); $rtokens->[ ++$j ] = $tok; } last; } return; } ## end sub 
peek_ahead_for_nonblank_token ####################################################################### # Tokenizer guessing routines for ambiguous situations ####################################################################### sub guess_if_pattern_or_conditional { # this routine is called when we have encountered a ? following an # unknown bareword, and we must decide if it starts a pattern or not # input parameters: # $i - token index of the ? starting possible pattern # output parameters: # $is_pattern = 0 if probably not pattern, =1 if probably a pattern # msg = a warning or diagnostic message # USES GLOBAL VARIABLES: $last_nonblank_token my ( $i, $rtokens, $rtoken_map, $max_token_index ) = @_; my $is_pattern = 0; my $msg = "guessing that ? after $last_nonblank_token starts a "; if ( $i >= $max_token_index ) { $msg .= "conditional (no end to pattern found on the line)\n"; } else { my $ibeg = $i; $i = $ibeg + 1; my $next_token = $rtokens->[$i]; # first token after ? # look for a possible ending ? on this line.. my $in_quote = 1; my $quote_depth = 0; my $quote_character = EMPTY_STRING; my $quote_pos = 0; my $quoted_string; ( $i, $in_quote, $quote_character, $quote_pos, $quote_depth, $quoted_string, ) = follow_quoted_string( $ibeg, $in_quote, $rtokens, $quote_character, $quote_pos, $quote_depth, $max_token_index, ); if ($in_quote) { # we didn't find an ending ? on this line, # so we bias towards conditional $is_pattern = 0; $msg .= "conditional (no ending ? on this line)\n"; # we found an ending ?, so we bias towards a pattern } else { # Watch out for an ending ? in quotes, like this # my $case_flag = File::Spec->case_tolerant ? '(?i)' : ''; my $s_quote = 0; my $d_quote = 0; my $colons = 0; foreach my $ii ( $ibeg + 1 .. $i - 1 ) { my $tok = $rtokens->[$ii]; if ( $tok eq ":" ) { $colons++ } if ( $tok eq "'" ) { $s_quote++ } if ( $tok eq '"' ) { $d_quote++ } } if ( $s_quote % 2 || $d_quote % 2 || $colons ) { $is_pattern = 0; $msg .= "found ending ? 
but unbalanced quote chars\n"; } elsif ( pattern_expected( $i, $rtokens, $max_token_index ) >= 0 ) { $is_pattern = 1; $msg .= "pattern (found ending ? and pattern expected)\n"; } else { $msg .= "pattern (uncertain, but found ending ?)\n"; } } } return ( $is_pattern, $msg ); } ## end sub guess_if_pattern_or_conditional my %is_known_constant; my %is_known_function; BEGIN { # Constants like 'pi' in Trig.pm are common my @q = qw(pi pi2 pi4 pip2 pip4); @{is_known_constant}{@q} = (1) x scalar(@q); # parenless calls of 'ok' are common @q = qw( ok ); @{is_known_function}{@q} = (1) x scalar(@q); } ## end BEGIN sub guess_if_pattern_or_division { # this routine is called when we have encountered a / following an # unknown bareword, and we must decide if it starts a pattern or is a # division # input parameters: # $i - token index of the / starting possible pattern # output parameters: # $is_pattern = 0 if probably division, =1 if probably a pattern # msg = a warning or diagnostic message # USES GLOBAL VARIABLES: $last_nonblank_token my ( $i, $rtokens, $rtoken_map, $max_token_index ) = @_; my $is_pattern = 0; my $msg = "guessing that / after $last_nonblank_token starts a "; if ( $i >= $max_token_index ) { $msg .= "division (no end to pattern found on the line)\n"; } else { my $ibeg = $i; my $divide_possible = is_possible_numerator( $i, $rtokens, $max_token_index ); if ( $divide_possible < 0 ) { $msg = "pattern (division not possible here)\n"; $is_pattern = 1; return ( $is_pattern, $msg ); } $i = $ibeg + 1; my $next_token = $rtokens->[$i]; # first token after slash # One of the things we can look at is the spacing around the slash. # There # are four possible spacings around the first slash: # # return pi/two;#/; -/- # return pi/ two;#/; -/+ # return pi / two;#/; +/+ # return pi /two;#/; +/- <-- possible pattern # # Spacing rule: a space before the slash but not after the slash # usually indicates a pattern. We can use this to break ties. 
my $is_pattern_by_spacing = ( $i > 1 && $next_token !~ m/^\s/ && $rtokens->[ $i - 2 ] =~ m/^\s/ ); # look for a possible ending / on this line.. my $in_quote = 1; my $quote_depth = 0; my $quote_character = EMPTY_STRING; my $quote_pos = 0; my $quoted_string; ( $i, $in_quote, $quote_character, $quote_pos, $quote_depth, $quoted_string ) = follow_quoted_string( $ibeg, $in_quote, $rtokens, $quote_character, $quote_pos, $quote_depth, $max_token_index ); if ($in_quote) { # we didn't find an ending / on this line, so we bias towards # division if ( $divide_possible >= 0 ) { $is_pattern = 0; $msg .= "division (no ending / on this line)\n"; } else { # assuming a multi-line pattern ... this is risky, but division # does not seem possible. If this fails, it would either be due # to a syntax error in the code, or the division_expected logic # needs to be fixed. $msg = "multi-line pattern (division not possible)\n"; $is_pattern = 1; } } # we found an ending /, so we bias slightly towards a pattern else { my $pattern_expected = pattern_expected( $i, $rtokens, $max_token_index ); if ( $pattern_expected >= 0 ) { # pattern looks possible... if ( $divide_possible >= 0 ) { # Both pattern and divide can work here... 
# Increase weight of divide if a pure number follows $divide_possible += $next_token =~ /^\d+$/; # Check for known constants in the numerator, like 'pi' if ( $is_known_constant{$last_nonblank_token} ) { $msg .= "division (pattern works too but saw known constant '$last_nonblank_token')\n"; $is_pattern = 0; } # A very common bare word in pattern expressions is 'ok' elsif ( $is_known_function{$last_nonblank_token} ) { $msg .= "pattern (division works too but saw '$last_nonblank_token')\n"; $is_pattern = 1; } # If one rule is more definite, use it elsif ( $divide_possible > $pattern_expected ) { $msg .= "division (more likely based on following tokens)\n"; $is_pattern = 0; } # otherwise, use the spacing rule elsif ($is_pattern_by_spacing) { $msg .= "pattern (guess on spacing, but division possible too)\n"; $is_pattern = 1; } else { $msg .= "division (guess on spacing, but pattern is possible too)\n"; $is_pattern = 0; } } # divide_possible < 0 means divide can not work here else { $is_pattern = 1; $msg .= "pattern (division not possible)\n"; } } # pattern does not look possible... else { if ( $divide_possible >= 0 ) { $is_pattern = 0; $msg .= "division (pattern not possible)\n"; } # Neither pattern nor divide look possible...go by spacing else { if ($is_pattern_by_spacing) { $msg .= "pattern (guess on spacing)\n"; $is_pattern = 1; } else { $msg .= "division (guess on spacing)\n"; $is_pattern = 0; } } } } } return ( $is_pattern, $msg ); } ## end sub guess_if_pattern_or_division # try to resolve here-doc vs. shift by looking ahead for # non-code or the end token (currently only looks for end token) # returns 1 if it is probably a here doc, 0 if not sub guess_if_here_doc { # This is how many lines we will search for a target as part of the # guessing strategy. It is a constant because there is probably # little reason to change it. 
    # USES GLOBAL VARIABLES: $tokenizer_self, $current_package,
    # %is_constant
    my $HERE_DOC_WINDOW = 40;

    my $next_token        = shift;
    my $here_doc_expected = 0;
    my $line;
    my $k   = 0;
    my $msg = "checking <<";

    while ( $line =
        $tokenizer_self->[_line_buffer_object_]->peek_ahead( $k++ ) )
    {
        chomp $line;

        if ( $line =~ /^$next_token$/ ) {
            $msg .= " -- found target $next_token ahead $k lines\n";
            $here_doc_expected = 1;    # got it
            last;
        }
        last if ( $k >= $HERE_DOC_WINDOW );
    }

    unless ($here_doc_expected) {

        if ( !defined($line) ) {
            $here_doc_expected = -1;    # hit eof without seeing target
            $msg .= " -- must be shift; target $next_token not in file\n";

        }
        else {    # still unsure..taking a wild guess

            if ( !$is_constant{$current_package}{$next_token} ) {
                $here_doc_expected = 1;
                $msg .=
                  " -- guessing it's a here-doc ($next_token not a constant)\n";
            }
            else {
                $msg .=
                  " -- guessing it's a shift ($next_token is a constant)\n";
            }
        }
    }
    write_logfile_entry($msg);
    return $here_doc_expected;
} ## end sub guess_if_here_doc

#######################################################################
# Tokenizer Routines for scanning identifiers and related items
#######################################################################

sub scan_bare_identifier_do {

    # this routine is called to scan a token starting with an alphanumeric
    # variable or package separator, :: or '.
# USES GLOBAL VARIABLES: $current_package, $last_nonblank_token, # $last_nonblank_type,@paren_type, $paren_depth my ( $input_line, $i, $tok, $type, $prototype, $rtoken_map, $max_token_index ) = @_; my $i_begin = $i; my $package = undef; my $i_beg = $i; # we have to back up one pretoken at a :: since each : is one pretoken if ( $tok eq '::' ) { $i_beg-- } if ( $tok eq '->' ) { $i_beg-- } my $pos_beg = $rtoken_map->[$i_beg]; pos($input_line) = $pos_beg; # Examples: # A::B::C # A:: # ::A # A'B if ( $input_line =~ m/\G\s*((?:\w*(?:'|::)))*(?:(?:->)?(\w+))?/gc ) { my $pos = pos($input_line); my $numc = $pos - $pos_beg; $tok = substr( $input_line, $pos_beg, $numc ); # type 'w' includes anything without leading type info # ($,%,@,*) including something like abc::def::ghi $type = 'w'; my $sub_name = EMPTY_STRING; if ( defined($2) ) { $sub_name = $2; } if ( defined($1) ) { $package = $1; # patch: don't allow isolated package name which just ends # in the old style package separator (single quote). Example: # use CGI':all'; if ( !($sub_name) && substr( $package, -1, 1 ) eq '\'' ) { $pos--; } $package =~ s/\'/::/g; if ( $package =~ /^\:/ ) { $package = 'main' . $package } $package =~ s/::$//; } else { $package = $current_package; # patched for c043, part 1: keyword does not follow '->' if ( $is_keyword{$tok} && $last_nonblank_type ne '->' ) { $type = 'k'; } } # if it is a bareword.. 
patched for c043, part 2: not following '->' if ( $type eq 'w' && $last_nonblank_type ne '->' ) { # check for v-string with leading 'v' type character # (This seems to have precedence over filehandle, type 'Y') if ( $tok =~ /^v\d[_\d]*$/ ) { # we only have the first part - something like 'v101' - # look for more if ( $input_line =~ m/\G(\.\d[_\d]*)+/gc ) { $pos = pos($input_line); $numc = $pos - $pos_beg; $tok = substr( $input_line, $pos_beg, $numc ); } $type = 'v'; # warn if this version can't handle v-strings report_v_string($tok); } elsif ( $is_constant{$package}{$sub_name} ) { $type = 'C'; } # bareword after sort has implied empty prototype; for example: # @sorted = sort numerically ( 53, 29, 11, 32, 7 ); # This has priority over whatever the user has specified. elsif ($last_nonblank_token eq 'sort' && $last_nonblank_type eq 'k' ) { $type = 'Z'; } # Note: strangely, perl does not seem to really let you create # functions which act like eval and do, in the sense that eval # and do may have operators following the final }, but any operators # that you create with prototype (&) apparently do not allow # trailing operators, only terms. This seems strange. 
# If this ever changes, here is the update # to make perltidy behave accordingly: # elsif ( $is_block_function{$package}{$tok} ) { # $tok='eval'; # patch to do braces like eval - doesn't work # $type = 'k'; #} # TODO: This could become a separate type to allow for different # future behavior: elsif ( $is_block_function{$package}{$sub_name} ) { $type = 'G'; } elsif ( $is_block_list_function{$package}{$sub_name} ) { $type = 'G'; } elsif ( $is_user_function{$package}{$sub_name} ) { $type = 'U'; $prototype = $user_function_prototype{$package}{$sub_name}; } # check for indirect object elsif ( # added 2001-03-27: must not be followed immediately by '(' # see fhandle.t ( $input_line !~ m/\G\(/gc ) # and && ( # preceded by keyword like 'print', 'printf' and friends $is_indirect_object_taker{$last_nonblank_token} # or preceded by something like 'print(' or 'printf(' || ( ( $last_nonblank_token eq '(' ) && $is_indirect_object_taker{ $paren_type[$paren_depth] } ) ) ) { # may not be indirect object unless followed by a space; # updated 2021-01-16 to consider newline to be a space. # updated for case b990 to look for either ';' or space if ( pos($input_line) == length($input_line) || $input_line =~ m/\G[;\s]/gc ) { $type = 'Y'; # Abandon Hope ... # Perl's indirect object notation is a very bad # thing and can cause subtle bugs, especially for # beginning programmers. And I haven't even been # able to figure out a sane warning scheme which # doesn't get in the way of good scripts. # Complain if a filehandle has any lower case # letters. This is suggested good practice. 
# Use 'sub_name' because something like # main::MYHANDLE is ok for filehandle if ( $sub_name =~ /[a-z]/ ) { # could be bug caused by older perltidy if # followed by '(' if ( $input_line =~ m/\G\s*\(/gc ) { complain( "Caution: unknown word '$tok' in indirect object slot\n" ); } } } # bareword not followed by a space -- may not be filehandle # (may be function call defined in a 'use' statement) else { $type = 'Z'; } } } # Now we must convert back from character position # to pre_token index. # I don't think an error flag can occur here ..but who knows my $error; ( $i, $error ) = inverse_pretoken_map( $i, $pos, $rtoken_map, $max_token_index ); if ($error) { warning("scan_bare_identifier: Possibly invalid tokenization\n"); } } # no match but line not blank - could be syntax error # perl will take '::' alone without complaint else { $type = 'w'; # change this warning to log message if it becomes annoying warning("didn't find identifier after leading ::\n"); } return ( $i, $tok, $type, $prototype ); } ## end sub scan_bare_identifier_do sub scan_id_do { # This is the new scanner and will eventually replace scan_identifier. # Only type 'sub' and 'package' are implemented. # Token types $ * % @ & -> are not yet implemented. # # Scan identifier following a type token. # The type of call depends on $id_scan_state: $id_scan_state = '' # for starting call, in which case $tok must be the token defining # the type. # # If the type token is the last nonblank token on the line, a value # of $id_scan_state = $tok is returned, indicating that further # calls must be made to get the identifier. If the type token is # not the last nonblank token on the line, the identifier is # scanned and handled and a value of '' is returned. 
# USES GLOBAL VARIABLES: $current_package, $last_nonblank_token, $in_attribute_list, # $statement_type, $tokenizer_self my ( $input_line, $i, $tok, $rtokens, $rtoken_map, $id_scan_state, $max_token_index ) = @_; use constant DEBUG_NSCAN => 0; my $type = EMPTY_STRING; my ( $i_beg, $pos_beg ); #print "NSCAN:entering i=$i, tok=$tok, type=$type, state=$id_scan_state\n"; #my ($a,$b,$c) = caller; #print "NSCAN: scan_id called with tok=$tok $a $b $c\n"; # on re-entry, start scanning at first token on the line if ($id_scan_state) { $i_beg = $i; $type = EMPTY_STRING; } # on initial entry, start scanning just after type token else { $i_beg = $i + 1; $id_scan_state = $tok; $type = 't'; } # find $i_beg = index of next nonblank token, # and handle empty lines my $blank_line = 0; my $next_nonblank_token = $rtokens->[$i_beg]; if ( $i_beg > $max_token_index ) { $blank_line = 1; } else { # only a '#' immediately after a '$' is not a comment if ( $next_nonblank_token eq '#' ) { unless ( $tok eq '$' ) { $blank_line = 1; } } if ( $next_nonblank_token =~ /^\s/ ) { ( $next_nonblank_token, $i_beg ) = find_next_nonblank_token_on_this_line( $i_beg, $rtokens, $max_token_index ); if ( $next_nonblank_token =~ /(^#|^\s*$)/ ) { $blank_line = 1; } } } # handle non-blank line; identifier, if any, must follow unless ($blank_line) { if ( $is_sub{$id_scan_state} ) { ( $i, $tok, $type, $id_scan_state ) = do_scan_sub( { input_line => $input_line, i => $i, i_beg => $i_beg, tok => $tok, type => $type, rtokens => $rtokens, rtoken_map => $rtoken_map, id_scan_state => $id_scan_state, max_token_index => $max_token_index, } ); } elsif ( $is_package{$id_scan_state} ) { ( $i, $tok, $type ) = do_scan_package( $input_line, $i, $i_beg, $tok, $type, $rtokens, $rtoken_map, $max_token_index ); $id_scan_state = EMPTY_STRING; } else { warning("invalid token in scan_id: $tok\n"); $id_scan_state = EMPTY_STRING; } } if ( $id_scan_state && ( !defined($type) || !$type ) ) { # shouldn't happen: if (DEVEL_MODE) { 
Fault(<[$i_beg]; pos($input_line) = $pos_beg; # handle non-blank line; package name, if any, must follow if ( $input_line =~ m/\G\s*((?:\w*(?:'|::))*\w*)/gc ) { $package = $1; $package = ( defined($1) && $1 ) ? $1 : 'main'; $package =~ s/\'/::/g; if ( $package =~ /^\:/ ) { $package = 'main' . $package } $package =~ s/::$//; my $pos = pos($input_line); my $numc = $pos - $pos_beg; $tok = 'package ' . substr( $input_line, $pos_beg, $numc ); $type = 'i'; # Now we must convert back from character position # to pre_token index. # I don't think an error flag can occur here ..but ? my $error; ( $i, $error ) = inverse_pretoken_map( $i, $pos, $rtoken_map, $max_token_index ); if ($error) { warning("Possibly invalid package\n") } $current_package = $package; # we should now have package NAMESPACE # now expecting VERSION, BLOCK, or ; to follow ... # package NAMESPACE VERSION # package NAMESPACE BLOCK # package NAMESPACE VERSION BLOCK my ( $next_nonblank_token, $i_next ) = find_next_nonblank_token( $i, $rtokens, $max_token_index ); # check that something recognizable follows, but do not parse. # A VERSION number will be parsed later as a number or v-string in the # normal way. What is important is to set the statement type if # everything looks okay so that the operator_expected() routine # knows that the number is in a package statement. # Examples of valid primitive tokens that might follow are: # 1235 . ; { } v3 v # FIX: added a '#' since a side comment may also follow # Added ':' for class attributes (for --use-feature=class, rt145706) if ( $next_nonblank_token =~ /^([v\.\d;\{\}\#\:])|v\d|\d+$/ ) { $statement_type = $tok; } else { warning( "Unexpected '$next_nonblank_token' after package name '$tok'\n" ); } } # no match but line not blank -- # could be a label with name package, like package: , for example. 
else { $type = 'k'; } return ( $i, $tok, $type ); } ## end sub do_scan_package my %is_special_variable_char; BEGIN { # These are the only characters which can (currently) form special # variables, like $^W: (issue c066). my @q = qw{ ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ }; @{is_special_variable_char}{@q} = (1) x scalar(@q); } ## end BEGIN { ## begin closure for sub scan_complex_identifier use constant DEBUG_SCAN_ID => 0; # These are the possible states for this scanner: my $scan_state_SIGIL = '$'; my $scan_state_ALPHA = 'A'; my $scan_state_COLON = ':'; my $scan_state_LPAREN = '('; my $scan_state_RPAREN = ')'; my $scan_state_AMPERSAND = '&'; my $scan_state_SPLIT = '^'; # Only these non-blank states may be returned to caller: my %is_returnable_scan_state = ( $scan_state_SIGIL => 1, $scan_state_AMPERSAND => 1, ); # USES GLOBAL VARIABLES: # $context, $last_nonblank_token, $last_nonblank_type #----------- # call args: #----------- my ( $i, $id_scan_state, $identifier, $rtokens, $max_token_index, $expecting, $container_type ); #------------------------------------------- # my variables, re-initialized on each call: #------------------------------------------- my $i_begin; # starting index $i my $type; # returned identifier type my $tok_begin; # starting token my $tok; # returned token my $id_scan_state_begin; # starting scan state my $identifier_begin; # starting identifier my $i_save; # a last good index, in case of error my $message; # hold error message for log file my $tok_is_blank; my $last_tok_is_blank; my $in_prototype_or_signature; my $saw_alpha; my $saw_type; my $allow_tick; sub initialize_my_scan_id_vars { # Initialize all 'my' vars on entry $i_begin = $i; $type = EMPTY_STRING; $tok_begin = $rtokens->[$i_begin]; $tok = $tok_begin; if ( $tok_begin eq ':' ) { $tok_begin = '::' } $id_scan_state_begin = $id_scan_state; $identifier_begin = $identifier; $i_save = undef; $message = EMPTY_STRING; $tok_is_blank = undef; # a flag to speed things 
up $last_tok_is_blank = undef; $in_prototype_or_signature = $container_type && $container_type =~ /^sub\b/; # these flags will be used to help figure out the type: $saw_alpha = undef; $saw_type = undef; # allow old package separator (') except in 'use' statement $allow_tick = ( $last_nonblank_token ne 'use' ); return; } ## end sub initialize_my_scan_id_vars #---------------------------------- # Routines for handling scan states #---------------------------------- sub do_id_scan_state_dollar { # We saw a sigil, now looking to start a variable name if ( $tok eq '$' ) { $identifier .= $tok; # we've got a punctuation variable if end of line (punct.t) if ( $i == $max_token_index ) { $type = 'i'; $id_scan_state = EMPTY_STRING; } } elsif ( $tok =~ /^\w/ ) { # alphanumeric .. $saw_alpha = 1; $id_scan_state = $scan_state_COLON; # now need :: $identifier .= $tok; } elsif ( $tok eq '::' ) { $id_scan_state = $scan_state_ALPHA; $identifier .= $tok; } # POSTDEFREF ->@ ->% ->& ->* elsif ( ( $tok =~ /^[\@\%\&\*]$/ ) && $identifier =~ /\-\>$/ ) { $identifier .= $tok; } elsif ( $tok eq "'" && $allow_tick ) { # alphanumeric .. $saw_alpha = 1; $id_scan_state = $scan_state_COLON; # now need :: $identifier .= $tok; # Perl will accept leading digits in identifiers, # although they may not always produce useful results. # Something like $main::0 is ok. But this also works: # # sub howdy::123::bubba{ print "bubba $54321!\n" } # howdy::123::bubba(); # } elsif ( $tok eq '#' ) { my $is_punct_var = $identifier eq '$$'; # side comment or identifier? if ( # A '#' starts a comment if it follows a space. For example, # the following is equivalent to $ans=40. 
# my $ # # ans = 40; !$last_tok_is_blank # a # inside a prototype or signature can only start a # comment && !$in_prototype_or_signature # these are valid punctuation vars: *# %# @# $# # May also be '$#array' or POSTDEFREF ->$# && ( $identifier =~ /^[\%\@\$\*]$/ || $identifier =~ /\$$/ ) # but a '#' after '$$' is a side comment; see c147 && !$is_punct_var ) { $identifier .= $tok; # keep same state, a $ could follow } else { # otherwise it is a side comment if ( $identifier eq '->' ) { } elsif ($is_punct_var) { $type = 'i' } elsif ( $id_scan_state eq $scan_state_SIGIL ) { $type = 't' } else { $type = 'i' } $i = $i_save; $id_scan_state = EMPTY_STRING; } } elsif ( $tok eq '{' ) { # check for something like ${#} or ${?}, where ? is a special char if ( ( $identifier eq '$' || $identifier eq '@' || $identifier eq '$#' ) && $i + 2 <= $max_token_index && $rtokens->[ $i + 2 ] eq '}' && $rtokens->[ $i + 1 ] !~ /[\s\w]/ ) { my $next2 = $rtokens->[ $i + 2 ]; my $next1 = $rtokens->[ $i + 1 ]; $identifier .= $tok . $next1 . $next2; $i += 2; $id_scan_state = EMPTY_STRING; } else { # skip something like ${xxx} or ->{ $id_scan_state = EMPTY_STRING; # if this is the first token of a line, any tokens for this # identifier have already been accumulated if ( $identifier eq '$' || $i == 0 ) { $identifier = EMPTY_STRING; } $i = $i_save; } } # space ok after leading $ % * & @ elsif ( $tok =~ /^\s*$/ ) { $tok_is_blank = 1; # note: an id with a leading '&' does not actually come this way if ( $identifier =~ /^[\$\%\*\&\@]/ ) { if ( length($identifier) > 1 ) { $id_scan_state = EMPTY_STRING; $i = $i_save; $type = 'i'; # probably punctuation variable } else { # fix c139: trim line-ending type 't' if ( $i == $max_token_index ) { $i = $i_save; $type = 't'; } # spaces after $'s are common, and space after @ # is harmless, so only complain about space # after other type characters. Space after $ and # @ will be removed in formatting. 
Report space # after % and * because they might indicate a # parsing error. In other words '% ' might be a # modulo operator. Delete this warning if it # gets annoying. elsif ( $identifier !~ /^[\@\$]$/ ) { $message = "Space in identifier, following $identifier\n"; } else { ## ok: silently accept space after '$' and '@' sigils } } } elsif ( $identifier eq '->' ) { # space after '->' is ok except at line end .. # so trim line-ending in type '->' (fixes c139) if ( $i == $max_token_index ) { $i = $i_save; $type = '->'; } } # stop at space after something other than -> or sigil # Example of what can arrive here: # eval { $MyClass->$$ }; else { $id_scan_state = EMPTY_STRING; $i = $i_save; $type = 'i'; } } elsif ( $tok eq '^' ) { # check for some special variables like $^ $^W if ( $identifier =~ /^[\$\*\@\%]$/ ) { $identifier .= $tok; $type = 'i'; # There may be one more character, not a space, after the ^ my $next1 = $rtokens->[ $i + 1 ]; my $chr = substr( $next1, 0, 1 ); if ( $is_special_variable_char{$chr} ) { # It is something like $^W # Test case (c066) : $^Oeq'linux' $i++; $identifier .= $next1; # If pretoken $next1 is more than one character long, # set a flag indicating that it needs to be split. $id_scan_state = ( length($next1) > 1 ) ? $scan_state_SPLIT : EMPTY_STRING; } else { # it is just $^ # Simple test case (c065): '$aa=$^if($bb)'; $id_scan_state = EMPTY_STRING; } } else { $id_scan_state = EMPTY_STRING; $i = $i_save; } } else { # something else if ( $in_prototype_or_signature && $tok =~ /^[\),=#]/ ) { # We might be in an extrusion of # sub foo2 ( $first, $, $third ) { # looking at a line starting with a comma, like # $ # , # in this case the comma ends the signature variable # '$' which will have been previously marked type 't' # rather than 'i'. 
if ( $i == $i_begin ) { $identifier = EMPTY_STRING; $type = EMPTY_STRING; } # at a # we have to mark as type 't' because more may # follow, otherwise, in a signature we can let '$' be an # identifier here for better formatting. # See 'mangle4.in' for a test case. else { $type = 'i'; if ( $id_scan_state eq $scan_state_SIGIL && $tok eq '#' ) { $type = 't'; } $i = $i_save; } $id_scan_state = EMPTY_STRING; } # check for various punctuation variables elsif ( $identifier =~ /^[\$\*\@\%]$/ ) { $identifier .= $tok; } # POSTDEFREF: Postfix reference ->$* ->%* ->@* ->** ->&* ->$#* elsif ($tok eq '*' && $identifier =~ /\-\>([\@\%\$\*\&]|\$\#)$/ ) { $identifier .= $tok; } elsif ( $identifier eq '$#' ) { if ( $tok eq '{' ) { $type = 'i'; $i = $i_save } # perl seems to allow just these: $#: $#- $#+ elsif ( $tok =~ /^[\:\-\+]$/ ) { $type = 'i'; $identifier .= $tok; } else { $i = $i_save; write_logfile_entry( 'Use of $# is deprecated' . "\n" ); } } elsif ( $identifier eq '$$' ) { # perl does not allow references to punctuation # variables without braces. 
For example, this # won't work: # $:=\4; # $a = $$:; # You would have to use # $a = ${$:}; # '$$' alone is punctuation variable for PID $i = $i_save; if ( $tok eq '{' ) { $type = 't' } else { $type = 'i' } } elsif ( $identifier eq '->' ) { $i = $i_save; } else { $i = $i_save; if ( length($identifier) == 1 ) { $identifier = EMPTY_STRING; } } $id_scan_state = EMPTY_STRING; } return; } ## end sub do_id_scan_state_dollar sub do_id_scan_state_alpha { # looking for alphanumeric after :: $tok_is_blank = $tok =~ /^\s*$/; if ( $tok =~ /^\w/ ) { # found it $identifier .= $tok; $id_scan_state = $scan_state_COLON; # now need :: $saw_alpha = 1; } elsif ( $tok eq "'" && $allow_tick ) { $identifier .= $tok; $id_scan_state = $scan_state_COLON; # now need :: $saw_alpha = 1; } elsif ( $tok_is_blank && $identifier =~ /^sub / ) { $id_scan_state = $scan_state_LPAREN; $identifier .= $tok; } elsif ( $tok eq '(' && $identifier =~ /^sub / ) { $id_scan_state = $scan_state_RPAREN; $identifier .= $tok; } else { $id_scan_state = EMPTY_STRING; $i = $i_save; } return; } ## end sub do_id_scan_state_alpha sub do_id_scan_state_colon { # looking for possible :: after alphanumeric $tok_is_blank = $tok =~ /^\s*$/; if ( $tok eq '::' ) { # got it $identifier .= $tok; $id_scan_state = $scan_state_ALPHA; # now require alpha } elsif ( $tok =~ /^\w/ ) { # more alphanumeric is ok here $identifier .= $tok; $id_scan_state = $scan_state_COLON; # now need :: $saw_alpha = 1; } elsif ( $tok eq "'" && $allow_tick ) { # tick if ( $is_keyword{$identifier} ) { $id_scan_state = EMPTY_STRING; # that's all $i = $i_save; } else { $identifier .= $tok; } } elsif ( $tok_is_blank && $identifier =~ /^sub / ) { $id_scan_state = $scan_state_LPAREN; $identifier .= $tok; } elsif ( $tok eq '(' && $identifier =~ /^sub / ) { $id_scan_state = $scan_state_RPAREN; $identifier .= $tok; } else { $id_scan_state = EMPTY_STRING; # that's all $i = $i_save; } return; } ## end sub do_id_scan_state_colon sub do_id_scan_state_left_paren { # 
looking for possible '(' of a prototype if ( $tok eq '(' ) { # got it $identifier .= $tok; $id_scan_state = $scan_state_RPAREN; # now find the end of it } elsif ( $tok =~ /^\s*$/ ) { # blank - keep going $identifier .= $tok; $tok_is_blank = 1; } else { $id_scan_state = EMPTY_STRING; # that's all - no prototype $i = $i_save; } return; } ## end sub do_id_scan_state_left_paren sub do_id_scan_state_right_paren { # looking for a ')' of prototype to close a '(' $tok_is_blank = $tok =~ /^\s*$/; if ( $tok eq ')' ) { # got it $identifier .= $tok; $id_scan_state = EMPTY_STRING; # all done } elsif ( $tok =~ /^[\s\$\%\\\*\@\&\;]/ ) { $identifier .= $tok; } else { # probable error in script, but keep going warning("Unexpected '$tok' while seeking end of prototype\n"); $identifier .= $tok; } return; } ## end sub do_id_scan_state_right_paren sub do_id_scan_state_ampersand { # Starting sub call after seeing an '&' if ( $tok =~ /^[\$\w]/ ) { # alphanumeric .. $id_scan_state = $scan_state_COLON; # now need :: $saw_alpha = 1; $identifier .= $tok; } elsif ( $tok eq "'" && $allow_tick ) { # alphanumeric .. 
$id_scan_state = $scan_state_COLON; # now need :: $saw_alpha = 1; $identifier .= $tok; } elsif ( $tok =~ /^\s*$/ ) { # allow space $tok_is_blank = 1; # fix c139: trim line-ending type 't' if ( length($identifier) == 1 && $i == $max_token_index ) { $i = $i_save; $type = 't'; } } elsif ( $tok eq '::' ) { # leading :: $id_scan_state = $scan_state_ALPHA; # accept alpha next $identifier .= $tok; } elsif ( $tok eq '{' ) { if ( $identifier eq '&' || $i == 0 ) { $identifier = EMPTY_STRING; } $i = $i_save; $id_scan_state = EMPTY_STRING; } elsif ( $tok eq '^' ) { if ( $identifier eq '&' ) { # Special variable (c066) $identifier .= $tok; $type = '&'; # There may be one more character, not a space, after the ^ my $next1 = $rtokens->[ $i + 1 ]; my $chr = substr( $next1, 0, 1 ); if ( $is_special_variable_char{$chr} ) { # It is something like &^O $i++; $identifier .= $next1; # If pretoken $next1 is more than one character long, # set a flag indicating that it needs to be split. $id_scan_state = ( length($next1) > 1 ) ? $scan_state_SPLIT : EMPTY_STRING; } else { # it is &^ $id_scan_state = EMPTY_STRING; } } else { $identifier = EMPTY_STRING; $i = $i_save; } } else { # punctuation variable? # testfile: cunningham4.pl # # We have to be careful here. If we are in an unknown state, # we will reject the punctuation variable. In the following # example the '&' is a binary operator but we are in an unknown # state because there is no sigil on 'Prima', so we don't # know what it is. But it is a bad guess that # '&~' is a function variable. 
# $self->{text}->{colorMap}->[ # Prima::PodView::COLOR_CODE_FOREGROUND # & ~tb::COLOR_INDEX ] = # $sec->{ColorCode} # Fix for case c033: a '#' here starts a side comment if ( $identifier eq '&' && $expecting && $tok ne '#' ) { $identifier .= $tok; } else { $identifier = EMPTY_STRING; $i = $i_save; $type = '&'; } $id_scan_state = EMPTY_STRING; } return; } ## end sub do_id_scan_state_ampersand #------------------- # hash of scanner subs #------------------- my $scan_identifier_code = { $scan_state_SIGIL => \&do_id_scan_state_dollar, $scan_state_ALPHA => \&do_id_scan_state_alpha, $scan_state_COLON => \&do_id_scan_state_colon, $scan_state_LPAREN => \&do_id_scan_state_left_paren, $scan_state_RPAREN => \&do_id_scan_state_right_paren, $scan_state_AMPERSAND => \&do_id_scan_state_ampersand, }; sub scan_complex_identifier { # This routine assembles tokens into identifiers. It maintains a # scan state, id_scan_state. It updates id_scan_state based upon # current id_scan_state and token, and returns an updated # id_scan_state and the next index after the identifier. # This routine now serves as a backup for sub scan_simple_identifier # which handles most identifiers. ( $i, $id_scan_state, $identifier, $rtokens, $max_token_index, $expecting, $container_type ) = @_; # return flag telling caller to split the pretoken my $split_pretoken_flag; #------------------- # Initialize my vars #------------------- initialize_my_scan_id_vars(); #-------------------------------------------------------- # get started by defining a type and a state if necessary #-------------------------------------------------------- if ( !$id_scan_state ) { $context = UNKNOWN_CONTEXT; # fixup for digraph if ( $tok eq '>' ) { $tok = '->'; $tok_begin = $tok; } $identifier = $tok; if ( $last_nonblank_token eq '->' ) { $identifier = '->' .
$identifier; $id_scan_state = $scan_state_SIGIL; } elsif ( $tok eq '$' || $tok eq '*' ) { $id_scan_state = $scan_state_SIGIL; $context = SCALAR_CONTEXT; } elsif ( $tok eq '%' || $tok eq '@' ) { $id_scan_state = $scan_state_SIGIL; $context = LIST_CONTEXT; } elsif ( $tok eq '&' ) { $id_scan_state = $scan_state_AMPERSAND; } elsif ( $tok eq 'sub' or $tok eq 'package' ) { $saw_alpha = 0; # 'sub' is considered type info here $id_scan_state = $scan_state_SIGIL; $identifier .= SPACE; # need a space to separate sub from sub name } elsif ( $tok eq '::' ) { $id_scan_state = $scan_state_ALPHA; } elsif ( $tok =~ /^\w/ ) { $id_scan_state = $scan_state_COLON; $saw_alpha = 1; } elsif ( $tok eq '->' ) { $id_scan_state = $scan_state_SIGIL; } else { # shouldn't happen: bad call parameter my $msg = "Program bug detected: scan_identifier received bad starting token = '$tok'\n"; if (DEVEL_MODE) { Fault($msg) } if ( !$tokenizer_self->[_in_error_] ) { warning($msg); $tokenizer_self->[_in_error_] = 1; } $id_scan_state = EMPTY_STRING; # emergency return goto RETURN; } $saw_type = !$saw_alpha; } else { $i--; $saw_alpha = ( $tok =~ /^\w/ ); $saw_type = ( $tok =~ /([\$\%\@\*\&])/ ); # check for a valid starting state if ( DEVEL_MODE && !$is_returnable_scan_state{$id_scan_state} ) { Fault(<{$id_scan_state}; if ( !$code ) { if ( $id_scan_state eq $scan_state_SPLIT ) { ## OK: this is the signal to exit and split the pretoken } # unknown state - should not happen else { if (DEVEL_MODE) { Fault(<[ ++$i ]; # patch to make digraph :: if necessary if ( ( $tok eq ':' ) && ( $rtokens->[ $i + 1 ] eq ':' ) ) { $tok = '::'; $i++; } $code->(); # check for forward progress: a decrease in the index $i # implies that scanning has finished last if ( $i <= $i_start_loop ); } ## end of main loop #------------- # Check result #------------- # Be sure a valid state is returned if ($id_scan_state) { if ( !$is_returnable_scan_state{$id_scan_state} ) { if ( $id_scan_state eq $scan_state_SPLIT ) { $split_pretoken_flag 
= 1; } if ( $id_scan_state eq $scan_state_RPAREN ) { warning( "Hit end of line while seeking ) to end prototype\n"); } $id_scan_state = EMPTY_STRING; } # Patch: the deprecated variable $# does not combine with anything # on the next line. if ( $identifier eq '$#' ) { $id_scan_state = EMPTY_STRING } } # Be sure the token index is valid if ( $i < 0 ) { $i = 0 } # Be sure a token type is defined if ( !$type ) { if ($saw_type) { if ($saw_alpha) { # The type without the -> should be the same as with the -> so # that if they get separated we get the same bond strengths, # etc. See b1234 if ( $identifier =~ /^->/ && $last_nonblank_type eq 'w' && substr( $identifier, 2, 1 ) =~ /^\w/ ) { $type = 'w'; } else { $type = 'i' } } elsif ( $identifier eq '->' ) { $type = '->'; } elsif ( ( length($identifier) > 1 ) # In something like '@$=' we have an identifier '@$' # In something like '$${' we have type '$$' (and only # part of an identifier) && !( $identifier =~ /\$$/ && $tok eq '{' ) ## && ( $identifier !~ /^(sub |package )$/ ) && $identifier ne 'sub ' && $identifier ne 'package ' ) { $type = 'i'; } else { $type = 't' } } elsif ($saw_alpha) { # type 'w' includes anything without leading type info # ($,%,@,*) including something like abc::def::ghi $type = 'w'; # Fix for b1337, if restarting scan after line break between # '->' or sigil and identifier name, use type 'i' if ( $id_scan_state_begin && $identifier =~ /^([\$\%\@\*\&]|->)/ ) { $type = 'i'; } } else { $type = EMPTY_STRING; } # this can happen on a restart } # See if we formed an identifier... 
if ($identifier) { $tok = $identifier; if ($message) { write_logfile_entry($message) } } # did not find an identifier, back up else { $tok = $tok_begin; $i = $i_begin; } RETURN: DEBUG_SCAN_ID && do { my ( $a, $b, $c ) = caller; print STDOUT "SCANID: called from $a $b $c with tok, i, state, identifier =$tok_begin, $i_begin, $id_scan_state_begin, $identifier_begin\n"; print STDOUT "SCANID: returned with tok, i, state, identifier =$tok, $i, $id_scan_state, $identifier\n"; }; return ( $i, $tok, $type, $id_scan_state, $identifier, $split_pretoken_flag ); } ## end sub scan_complex_identifier } ## end closure for sub scan_complex_identifier { ## closure for sub do_scan_sub my %warn_if_lexical; BEGIN { # lexical subs with these names can cause parsing errors in this version my @q = qw( m q qq qr qw qx s tr y ); @{warn_if_lexical}{@q} = (1) x scalar(@q); } ## end BEGIN # saved package and subnames in case prototype is on separate line my ( $package_saved, $subname_saved ); # initialize subname each time a new 'sub' keyword is encountered sub initialize_subname { $package_saved = EMPTY_STRING; $subname_saved = EMPTY_STRING; return; } use constant { SUB_CALL => 1, PAREN_CALL => 2, PROTOTYPE_CALL => 3, }; sub do_scan_sub { # do_scan_sub parses a sub name and prototype. # At present there are three basic CALL TYPES which are # distinguished by the starting value of '$tok': # 1. $tok='sub', id_scan_state='sub' # it is called with $i_beg equal to the index of the first nonblank # token following a 'sub' token. # 2. $tok='(', id_scan_state='sub', # it is called with $i_beg equal to the index of a '(' which may # start a prototype. # 3. 
$tok='prototype', id_scan_state='prototype' # it is called with $i_beg equal to the index of a '(' which is # preceded by ': prototype' and has $id_scan_state eq 'prototype' # Examples: # A single type 1 call will get both the sub and prototype # sub foo1 ( $$ ) { } # ^ # The subname will be obtained with a 'sub' call # The prototype on line 2 will be obtained with a '(' call # sub foo1 # ^ <---call type 1 # ( $$ ) { } # ^ <---call type 2 # The subname will be obtained with a 'sub' call # The prototype will be obtained with a 'prototype' call # sub foo1 ( $x, $y ) : prototype ( $$ ) { } # ^ <---type 1 ^ <---type 3 # TODO: add future error checks to be sure we have a valid # sub name. For example, 'sub &doit' is wrong. Also, be sure # a name is given if and only if a non-anonymous sub is # appropriate. # USES GLOBAL VARS: $current_package, $last_nonblank_token, # $in_attribute_list, %saw_function_definition, # $statement_type my ($rinput_hash) = @_; my $input_line = $rinput_hash->{input_line}; my $i = $rinput_hash->{i}; my $i_beg = $rinput_hash->{i_beg}; my $tok = $rinput_hash->{tok}; my $type = $rinput_hash->{type}; my $rtokens = $rinput_hash->{rtokens}; my $rtoken_map = $rinput_hash->{rtoken_map}; my $id_scan_state = $rinput_hash->{id_scan_state}; my $max_token_index = $rinput_hash->{max_token_index}; my $i_entry = $i; # Determine the CALL TYPE # 1=sub # 2=( # 3=prototype my $call_type = $tok eq 'prototype' ? PROTOTYPE_CALL : $tok eq '(' ? 
PAREN_CALL : SUB_CALL; $id_scan_state = EMPTY_STRING; # normally we get everything in one call my $subname = $subname_saved; my $package = $package_saved; my $proto = undef; my $attrs = undef; my $match; my $pos_beg = $rtoken_map->[$i_beg]; pos($input_line) = $pos_beg; # Look for the sub NAME if this is a SUB call if ( $call_type == SUB_CALL && $input_line =~ m/\G\s* ((?:\w*(?:'|::))*) # package - something that ends in :: or ' (\w+) # NAME - required /gcx ) { $match = 1; $subname = $2; my $is_lexical_sub = $last_nonblank_type eq 'k' && $last_nonblank_token eq 'my'; if ( $is_lexical_sub && $1 ) { warning("'my' sub $subname cannot be in package '$1'\n"); $is_lexical_sub = 0; } if ($is_lexical_sub) { # lexical subs use the block sequence number as a package name my $seqno = $current_sequence_number[BRACE][ $current_depth[BRACE] ]; $seqno = 1 unless ( defined($seqno) ); $package = $seqno; if ( $warn_if_lexical{$subname} ) { warning( "'my' sub '$subname' matches a builtin name and may not be handled correctly in this perltidy version.\n" ); } } else { $package = ( defined($1) && $1 ) ? $1 : $current_package; $package =~ s/\'/::/g; if ( $package =~ /^\:/ ) { $package = 'main' . $package } $package =~ s/::$//; } my $pos = pos($input_line); my $numc = $pos - $pos_beg; $tok = 'sub ' . substr( $input_line, $pos_beg, $numc ); $type = 'i'; # remember the sub name in case another call is needed to # get the prototype $package_saved = $package; $subname_saved = $subname; } # Now look for PROTO ATTRS for all call types # Look for prototype/attributes which are usually on the same # line as the sub name but which might be on a separate line. # For example, we might have an anonymous sub with attributes, # or a prototype on a separate line from its sub name # NOTE: We only want to parse PROTOTYPES here. If we see anything that # does not look like a prototype, we assume it is a SIGNATURE and we # will stop and let the standard tokenizer handle it.
In # particular, we stop if we see any nested parens, braces, or commas. # Also note, a valid prototype cannot contain any alphabetic character # -- see https://perldoc.perl.org/perlsub # But it appears that an underscore is valid in a prototype, so the # regex below uses [A-Za-z] rather than \w # This is the old regex which has been replaced: # $input_line =~ m/\G(\s*\([^\)\(\}\{\,#]*\))? # PROTO my $saw_opening_paren = $input_line =~ /\G\s*\(/; if ( $input_line =~ m/\G(\s*\([^\)\(\}\{\,#A-Za-z]*\))? # PROTO (\s*:)? # ATTRS leading ':' /gcx && ( $1 || $2 ) ) { $proto = $1; $attrs = $2; # Append the prototype to the starting token if it is 'sub' or # 'prototype'. This is not strictly necessary but is kept for compatibility with # previous versions when the -csc flag is used: if ( $proto && ( $match || $call_type == PROTOTYPE_CALL ) ) { $tok .= $proto; } # If we just entered the sub at an opening paren on this call, not # a following :prototype, label it with the previous token. This is # necessary to propagate the sub name to its opening block. elsif ( $call_type == PAREN_CALL ) { $tok = $last_nonblank_token; } $match ||= 1; # Patch part #1 to fix cases b994 and b1053: # Mark an anonymous sub keyword without prototype as type 'k', i.e. # 'sub : lvalue { ...' $type = 'i'; if ( $tok eq 'sub' && !$proto ) { $type = 'k' } } if ($match) { # ATTRS: if there are attributes, back up and let the ':' be # found later by the scanner. my $pos = pos($input_line); if ($attrs) { $pos -= length($attrs); } my $next_nonblank_token = $tok; # catch case of line with leading ATTR ':' after anonymous sub if ( $pos == $pos_beg && $tok eq ':' ) { $type = 'A'; $in_attribute_list = 1; } # Otherwise, if we found a match we must convert back from # string position to the pre_token index for continued parsing. else { # I don't think an error flag can occur here ..but ?
                my $error;
                ( $i, $error ) = inverse_pretoken_map( $i, $pos, $rtoken_map,
                    $max_token_index );
                if ($error) { warning("Possibly invalid sub\n") }

                # Patch part #2 to fix cases b994 and b1053:
                # Do not let spaces be part of the token of an anonymous sub
                # keyword which we marked as type 'k' above...i.e. for
                # something like:
                #    'sub : lvalue { ...'
                # Back up and let it be parsed as a blank
                if (   $type eq 'k'
                    && $attrs
                    && $i > $i_entry
                    && substr( $rtokens->[$i], 0, 1 ) =~ m/\s/ )
                {
                    $i--;
                }

                # check for multiple definitions of a sub
                ( $next_nonblank_token, my $i_next ) =
                  find_next_nonblank_token_on_this_line( $i, $rtokens,
                    $max_token_index );
            }

            if ( $next_nonblank_token =~ /^(\s*|#)$/ )
            {    # skip blank or side comment
                my ( $rpre_tokens, $rpre_types ) =
                  peek_ahead_for_n_nonblank_pre_tokens(1);
                if ( defined($rpre_tokens) && @{$rpre_tokens} ) {
                    $next_nonblank_token = $rpre_tokens->[0];
                }
                else {
                    $next_nonblank_token = '}';
                }
            }

            # See what's next...
            if ( $next_nonblank_token eq '{' ) {
                if ($subname) {

                    # Check for multiple definitions of a sub, but
                    # it is ok to have multiple sub BEGIN, etc,
                    # so we do not complain if name is all caps
                    if (   $saw_function_definition{$subname}{$package}
                        && $subname !~ /^[A-Z]+$/ )
                    {
                        my $lno =
                          $saw_function_definition{$subname}{$package};
                        if ( $package =~ /^\d/ ) {
                            warning(
"already saw definition of lexical 'sub $subname' at line $lno\n"
                            );
                        }
                        else {
                            warning(
"already saw definition of 'sub $subname' in package '$package' at line $lno\n"
                            ) unless (DEVEL_MODE);
                        }
                    }
                    $saw_function_definition{$subname}{$package} =
                      $tokenizer_self->[_last_line_number_];
                }
            }
            elsif ( $next_nonblank_token eq ';' ) {
            }
            elsif ( $next_nonblank_token eq '}' ) {
            }

            # ATTRS - if an attribute list follows, remember the name
            # of the sub so the next opening brace can be labeled.
            # Setting 'statement_type' causes any ':'s to introduce
            # attributes.
            elsif ( $next_nonblank_token eq ':' ) {
                if ( $call_type == SUB_CALL ) {
                    $statement_type =
                      substr( $tok, 0, 3 ) eq 'sub' ?
$tok : 'sub'; } } # if we stopped before an open paren ... elsif ( $next_nonblank_token eq '(' ) { # If we DID NOT see this paren above then it must be on the # next line so we will set a flag to come back here and see if # it is a PROTOTYPE # Otherwise, we assume it is a SIGNATURE rather than a # PROTOTYPE and let the normal tokenizer handle it as a list if ( !$saw_opening_paren ) { $id_scan_state = 'sub'; # we must come back to get proto } if ( $call_type == SUB_CALL ) { $statement_type = substr( $tok, 0, 3 ) eq 'sub' ? $tok : 'sub'; } } elsif ($next_nonblank_token) { # EOF technically ok if ( $rinput_hash->{tok} eq 'method' && $call_type == SUB_CALL ) { # For a method call, silently ignore this error (rt145706) # to avoid needless warnings. Example which can produce it: # test(method Pack (), "method"); # TODO: scan for use feature 'class' and: # - if we saw 'use feature 'class' then issue the warning. # - if we did not see use feature 'class' then issue the # warning and suggest turning off --use-feature=class } else { $subname = EMPTY_STRING unless defined($subname); warning( "expecting ':' or ';' or '{' after definition or declaration of sub '$subname' but saw '$next_nonblank_token'\n" ); } } check_prototype( $proto, $package, $subname ); } # no match to either sub name or prototype, but line not blank else { } return ( $i, $tok, $type, $id_scan_state ); } ## end sub do_scan_sub } ######################################################################### # Tokenizer utility routines which may use CONSTANTS but no other GLOBALS ######################################################################### sub find_next_nonblank_token { my ( $i, $rtokens, $max_token_index ) = @_; # Returns the next nonblank token after the token at index $i # To skip past a side comment, and any subsequent block comments # and blank lines, call with i=$max_token_index if ( $i >= $max_token_index ) { if ( !peeked_ahead() ) { peeked_ahead(1); peek_ahead_for_nonblank_token( $rtokens, 
$max_token_index ); } } my $next_nonblank_token = $rtokens->[ ++$i ]; return ( SPACE, $i ) unless ( defined($next_nonblank_token) && length($next_nonblank_token) ); # Quick test for nonblank ascii char. Note that we just have to # examine the first character here. my $ord = ord( substr( $next_nonblank_token, 0, 1 ) ); if ( $ord >= ORD_PRINTABLE_MIN && $ord <= ORD_PRINTABLE_MAX ) { return ( $next_nonblank_token, $i ); } # Quick test to skip over an ascii space or tab elsif ( $ord == ORD_SPACE || $ord == ORD_TAB ) { $next_nonblank_token = $rtokens->[ ++$i ]; return ( SPACE, $i ) unless defined($next_nonblank_token); } # Slow test to skip over something else identified as whitespace elsif ( $next_nonblank_token =~ /^\s*$/ ) { $next_nonblank_token = $rtokens->[ ++$i ]; return ( SPACE, $i ) unless defined($next_nonblank_token); } # We should be at a nonblank now return ( $next_nonblank_token, $i ); } ## end sub find_next_nonblank_token sub find_next_noncomment_type { my ( $i, $rtokens, $max_token_index ) = @_; # Given the current character position, look ahead past any comments # and blank lines and return the next token, including digraphs and # trigraphs. my ( $next_nonblank_token, $i_next ) = find_next_nonblank_token( $i, $rtokens, $max_token_index ); # skip past any side comment if ( $next_nonblank_token eq '#' ) { ( $next_nonblank_token, $i_next ) = find_next_nonblank_token( $i_next, $rtokens, $max_token_index ); } # check for a digraph if ( $next_nonblank_token && $next_nonblank_token ne SPACE && defined( $rtokens->[ $i_next + 1 ] ) ) { my $test2 = $next_nonblank_token . $rtokens->[ $i_next + 1 ]; if ( $is_digraph{$test2} ) { $next_nonblank_token = $test2; $i_next = $i_next + 1; # check for a trigraph if ( defined( $rtokens->[ $i_next + 1 ] ) ) { my $test3 = $next_nonblank_token . 
                  $rtokens->[ $i_next + 1 ];
                if ( $is_trigraph{$test3} ) {
                    $next_nonblank_token = $test3;
                    $i_next              = $i_next + 1;
                }
            }
        }
    }
    return ( $next_nonblank_token, $i_next );
} ## end sub find_next_noncomment_type

sub is_possible_numerator {

    # Look at the next non-comment character and decide if it could be a
    # numerator.  Return
    #   1 - yes
    #   0 - can't tell
    #  -1 - no

    my ( $i, $rtokens, $max_token_index ) = @_;
    my $is_possible_numerator = 0;

    my $next_token = $rtokens->[ $i + 1 ];
    if ( $next_token eq '=' ) { $i++; }    # handle /=
    my ( $next_nonblank_token, $i_next ) =
      find_next_nonblank_token( $i, $rtokens, $max_token_index );

    if ( $next_nonblank_token eq '#' ) {
        ( $next_nonblank_token, $i_next ) =
          find_next_nonblank_token( $max_token_index, $rtokens,
            $max_token_index );
    }

    if ( $next_nonblank_token =~ /(\(|\$|\w|\.|\@)/ ) {
        $is_possible_numerator = 1;
    }
    elsif ( $next_nonblank_token =~ /^\s*$/ ) {
        $is_possible_numerator = 0;
    }
    else {
        $is_possible_numerator = -1;
    }

    return $is_possible_numerator;
} ## end sub is_possible_numerator

{    ## closure for sub pattern_expected
    my %pattern_test;

    BEGIN {

        # List of tokens which may follow a pattern.  Note that we will not
        # have formed digraphs at this point, so we will see '&' instead of
        # '&&' and '|' instead of '||'

        # /(\)|\}|\;|\&\&|\|\||and|or|while|if|unless)/
        my @q = qw( & && | || ? : + - * and or while if unless);
        push @q, ')', '}', ']', '>', ',', ';';
        @{pattern_test}{@q} = (1) x scalar(@q);
    } ## end BEGIN

    sub pattern_expected {

        # This is a filter for a possible pattern.
        # It looks at the token after a possible pattern and tries to
        # determine if that token could end a pattern.
        # returns -
        #   1 - yes
        #   0 - can't tell
        #  -1 - no
        my ( $i, $rtokens, $max_token_index ) = @_;
        my $is_pattern = 0;

        my $next_token = $rtokens->[ $i + 1 ];
        if ( $next_token =~ /^[msixpodualgc]/ ) {
            $i++;
        }    # skip possible modifier
        my ( $next_nonblank_token, $i_next ) =
          find_next_nonblank_token( $i, $rtokens, $max_token_index );

        if ( $pattern_test{$next_nonblank_token} ) {
            $is_pattern = 1;
        }
        else {

            # Added '#' to fix issue c044
            if (   $next_nonblank_token =~ /^\s*$/
                || $next_nonblank_token eq '#' )
            {
                $is_pattern = 0;
            }
            else {
                $is_pattern = -1;
            }
        }
        return $is_pattern;
    } ## end sub pattern_expected
}

sub find_next_nonblank_token_on_this_line {
    my ( $i, $rtokens, $max_token_index ) = @_;
    my $next_nonblank_token;

    if ( $i < $max_token_index ) {
        $next_nonblank_token = $rtokens->[ ++$i ];

        if ( $next_nonblank_token =~ /^\s*$/ ) {

            if ( $i < $max_token_index ) {
                $next_nonblank_token = $rtokens->[ ++$i ];
            }
        }
    }
    else {
        $next_nonblank_token = EMPTY_STRING;
    }
    return ( $next_nonblank_token, $i );
} ## end sub find_next_nonblank_token_on_this_line

sub find_angle_operator_termination {

    # We are looking at a '<' and want to know if it is an angle operator.
# We are to return: # $i = pretoken index of ending '>' if found, current $i otherwise # $type = 'Q' if found, '>' otherwise my ( $input_line, $i_beg, $rtoken_map, $expecting, $max_token_index ) = @_; my $i = $i_beg; my $type = '<'; pos($input_line) = 1 + $rtoken_map->[$i]; my $filter; # we just have to find the next '>' if a term is expected if ( $expecting == TERM ) { $filter = '[\>]' } # we have to guess if we don't know what is expected elsif ( $expecting == UNKNOWN ) { $filter = '[\>\;\=\#\|\<]' } # shouldn't happen - we shouldn't be here if operator is expected else { if (DEVEL_MODE) { Fault(< # # <$fh> # <*.c *.h> # <_> # ( glob.t) # <${PREFIX}*img*.$IMAGE_TYPE> # # # <$LATEX2HTMLVERSIONS${dd}html[1-9].[0-9].pl> # # Here are some examples of lines which do not have angle operators: # return unless $self->[2]++ < $#{$self->[1]}; # < 2 || @$t > # # the following line from dlister.pl caused trouble: # print'~'x79,"\n",$D<1024?"0.$D":$D>>10,"K, $C files\n\n\n"; # # If the '<' starts an angle operator, it must end on this line and # it must not have certain characters like ';' and '=' in it. I use # this to limit the testing. This filter should be improved if # possible. if ( $input_line =~ /($filter)/g ) { if ( $1 eq '>' ) { # We MAY have found an angle operator termination if we get # here, but we need to do more to be sure we haven't been # fooled. 
my $pos = pos($input_line); my $pos_beg = $rtoken_map->[$i]; my $str = substr( $input_line, $pos_beg, ( $pos - $pos_beg ) ); # Test for '<' after possible filehandle, issue c103 # print $fh <>; # syntax error # print $fh ; # ok # print $fh < DATA>; # syntax error at '>' # print STDERR < DATA>; # ok, prints word 'DATA' # print BLABLA ; # ok; does nothing unless BLABLA is defined if ( $last_nonblank_type eq 'Z' ) { # $str includes brackets; something like '' if ( substr( $last_nonblank_token, 0, 1 ) !~ /[A-Za-z_]/ && substr( $str, 1, 1 ) !~ /[A-Za-z_]/ ) { return ( $i, $type ); } } # Reject if the closing '>' follows a '-' as in: # if ( VERSION < 5.009 && $op-> name eq 'assign' ) { } if ( $expecting eq UNKNOWN ) { my $check = substr( $input_line, $pos - 2, 1 ); if ( $check eq '-' ) { return ( $i, $type ); } } ######################################debug##### #write_diagnostics( "ANGLE? :$str\n"); #print "ANGLE: found $1 at pos=$pos str=$str check=$check\n"; ######################################debug##### $type = 'Q'; my $error; ( $i, $error ) = inverse_pretoken_map( $i, $pos, $rtoken_map, $max_token_index ); # It may be possible that a quote ends midway in a pretoken. # If this happens, it may be necessary to split the pretoken. if ($error) { if (DEVEL_MODE) { Fault(</ ); # Now let's see where we stand.... # OK if math op not possible if ( $expecting == TERM ) { } # OK if there are no more than 2 non-blank pre-tokens inside # (not possible to write 2 token math between < and >) # This catches most common cases elsif ( $i <= $i_beg + 3 + $blank_count ) { # No longer any need to document this common case ## write_diagnostics("ANGLE(1 or 2 tokens): $str\n"); } # OK if there is some kind of identifier inside # print $fh ; elsif ( $str =~ /^<\s*\$?(\w|::|\s)+\s*>$/ ) { write_diagnostics("ANGLE (contains identifier): $str\n"); } # Not sure.. 
else { # Let's try a Brace Test: any braces inside must balance my $br = 0; while ( $str =~ /\{/g ) { $br++ } while ( $str =~ /\}/g ) { $br-- } my $sb = 0; while ( $str =~ /\[/g ) { $sb++ } while ( $str =~ /\]/g ) { $sb-- } my $pr = 0; while ( $str =~ /\(/g ) { $pr++ } while ( $str =~ /\)/g ) { $pr-- } # if braces do not balance - not angle operator if ( $br || $sb || $pr ) { $i = $i_beg; $type = '<'; write_diagnostics( "NOT ANGLE (BRACE={$br ($pr [$sb ):$str\n"); } # we should keep doing more checks here...to be continued # Tentatively accepting this as a valid angle operator. # There are lots more things that can be checked. else { write_diagnostics( "ANGLE-Guessing yes: $str expecting=$expecting\n"); write_logfile_entry("Guessing angle operator here: $str\n"); } } } # didn't find ending > else { if ( $expecting == TERM ) { warning("No ending > for angle operator\n"); } } } return ( $i, $type ); } ## end sub find_angle_operator_termination sub scan_number_do { # scan a number in any of the formats that Perl accepts # Underbars (_) are allowed in decimal numbers. # input parameters - # $input_line - the string to scan # $i - pre_token index to start scanning # $rtoken_map - reference to the pre_token map giving starting # character position in $input_line of token $i # output parameters - # $i - last pre_token index of the number just scanned # number - the number (characters); or undef if not a number my ( $input_line, $i, $rtoken_map, $input_type, $max_token_index ) = @_; my $pos_beg = $rtoken_map->[$i]; my $pos; my $i_begin = $i; my $number = undef; my $type = $input_type; my $first_char = substr( $input_line, $pos_beg, 1 ); # Look for bad starting characters; Shouldn't happen.. if ( $first_char !~ /[\d\.\+\-Ee]/ ) { if (DEVEL_MODE) { Fault(<[$i] ) { # Let the calling routine handle errors in which we do not # land on a pre-token boundary. It can happen by running # perltidy on some non-perl scripts, for example. 
if ( $pos < $rtoken_map->[$i] ) { $error = 1 } $i--; last; } } return ( $i, $error ); } ## end sub inverse_pretoken_map sub find_here_doc { # find the target of a here document, if any # input parameters: # $i - token index of the second < of << # ($i must be less than the last token index if this is called) # output parameters: # $found_target = 0 didn't find target; =1 found target # HERE_TARGET - the target string (may be empty string) # $i - unchanged if not here doc, # or index of the last token of the here target # $saw_error - flag noting unbalanced quote on here target my ( $expecting, $i, $rtokens, $rtoken_map, $max_token_index ) = @_; my $ibeg = $i; my $found_target = 0; my $here_doc_target = EMPTY_STRING; my $here_quote_character = EMPTY_STRING; my $saw_error = 0; my ( $next_nonblank_token, $i_next_nonblank, $next_token ); $next_token = $rtokens->[ $i + 1 ]; # perl allows a backslash before the target string (heredoc.t) my $backslash = 0; if ( $next_token eq '\\' ) { $backslash = 1; $next_token = $rtokens->[ $i + 2 ]; } ( $next_nonblank_token, $i_next_nonblank ) = find_next_nonblank_token_on_this_line( $i, $rtokens, $max_token_index ); if ( $next_nonblank_token =~ /[\'\"\`]/ ) { my $in_quote = 1; my $quote_depth = 0; my $quote_pos = 0; my $quoted_string; ( $i, $in_quote, $here_quote_character, $quote_pos, $quote_depth, $quoted_string ) = follow_quoted_string( $i_next_nonblank, $in_quote, $rtokens, $here_quote_character, $quote_pos, $quote_depth, $max_token_index ); if ($in_quote) { # didn't find end of quote, so no target found $i = $ibeg; if ( $expecting == TERM ) { warning( "Did not find here-doc string terminator ($here_quote_character) before end of line \n" ); $saw_error = 1; } } else { # found ending quote $found_target = 1; my $tokj; foreach my $j ( $i_next_nonblank + 1 .. 
$i - 1 ) { $tokj = $rtokens->[$j]; # we have to remove any backslash before the quote character # so that the here-doc-target exactly matches this string next if ( $tokj eq "\\" && $j < $i - 1 && $rtokens->[ $j + 1 ] eq $here_quote_character ); $here_doc_target .= $tokj; } } } elsif ( ( $next_token =~ /^\s*$/ ) and ( $expecting == TERM ) ) { $found_target = 1; write_logfile_entry( "found blank here-target after <<; suggest using \"\"\n"); $i = $ibeg; } elsif ( $next_token =~ /^\w/ ) { # simple bareword or integer after << my $here_doc_expected; if ( $expecting == UNKNOWN ) { $here_doc_expected = guess_if_here_doc($next_token); } else { $here_doc_expected = 1; } if ($here_doc_expected) { $found_target = 1; $here_doc_target = $next_token; $i = $ibeg + 1; } } else { if ( $expecting == TERM ) { $found_target = 1; write_logfile_entry("Note: bare here-doc operator <<\n"); } else { $i = $ibeg; } } # patch to neglect any prepended backslash if ( $found_target && $backslash ) { $i++ } return ( $found_target, $here_doc_target, $here_quote_character, $i, $saw_error ); } ## end sub find_here_doc sub do_quote { # follow (or continue following) quoted string(s) # $in_quote return code: # 0 - ok, found end # 1 - still must find end of quote whose target is $quote_character # 2 - still looking for end of first of two quotes # # Returns updated strings: # $quoted_string_1 = quoted string seen while in_quote=1 # $quoted_string_2 = quoted string seen while in_quote=2 my ( $i, $in_quote, $quote_character, $quote_pos, $quote_depth, $quoted_string_1, $quoted_string_2, $rtokens, $rtoken_map, $max_token_index, ) = @_; my $quoted_string; if ( $in_quote == 2 ) { # two quotes/quoted_string_1s to follow my $ibeg = $i; ( $i, $in_quote, $quote_character, $quote_pos, $quote_depth, $quoted_string ) = follow_quoted_string( $ibeg, $in_quote, $rtokens, $quote_character, $quote_pos, $quote_depth, $max_token_index ); $quoted_string_2 .= $quoted_string; if ( $in_quote == 1 ) { if ( $quote_character =~ 
/[\{\[\<\(]/ ) { $i++; } $quote_character = EMPTY_STRING; } else { $quoted_string_2 .= "\n"; } } if ( $in_quote == 1 ) { # one (more) quote to follow my $ibeg = $i; ( $i, $in_quote, $quote_character, $quote_pos, $quote_depth, $quoted_string ) = follow_quoted_string( $ibeg, $in_quote, $rtokens, $quote_character, $quote_pos, $quote_depth, $max_token_index ); $quoted_string_1 .= $quoted_string; if ( $in_quote == 1 ) { $quoted_string_1 .= "\n"; } } return ( $i, $in_quote, $quote_character, $quote_pos, $quote_depth, $quoted_string_1, $quoted_string_2, ); } ## end sub do_quote sub follow_quoted_string { # scan for a specific token, skipping escaped characters # if the quote character is blank, use the first non-blank character # input parameters: # $rtokens = reference to the array of tokens # $i = the token index of the first character to search # $in_quote = number of quoted strings being followed # $beginning_tok = the starting quote character # $quote_pos = index to check next for alphanumeric delimiter # output parameters: # $i = the token index of the ending quote character # $in_quote = decremented if found end, unchanged if not # $beginning_tok = the starting quote character # $quote_pos = index to check next for alphanumeric delimiter # $quote_depth = nesting depth, since delimiters '{ ( [ <' can be nested. # $quoted_string = the text of the quote (without quotation tokens) my ( $i_beg, $in_quote, $rtokens, $beginning_tok, $quote_pos, $quote_depth, $max_token_index, ) = @_; my ( $tok, $end_tok ); my $i = $i_beg - 1; my $quoted_string = EMPTY_STRING; 0 && do { print STDOUT "QUOTE entering with quote_pos = $quote_pos i=$i beginning_tok =$beginning_tok\n"; }; # get the corresponding end token if ( $beginning_tok !~ /^\s*$/ ) { $end_tok = matching_end_token($beginning_tok); } # a blank token means we must find and use the first non-blank one else { my $allow_quote_comments = ( $i < 0 ) ? 
          1 : 0;    # i<0 means we saw a <cr>

        while ( $i < $max_token_index ) {
            $tok = $rtokens->[ ++$i ];

            if ( $tok !~ /^\s*$/ ) {

                if ( ( $tok eq '#' ) && ($allow_quote_comments) ) {
                    $i = $max_token_index;
                }
                else {

                    if ( length($tok) > 1 ) {
                        if ( $quote_pos <= 0 ) { $quote_pos = 1 }
                        $beginning_tok = substr( $tok, $quote_pos - 1, 1 );
                    }
                    else {
                        $beginning_tok = $tok;
                        $quote_pos     = 0;
                    }
                    $end_tok     = matching_end_token($beginning_tok);
                    $quote_depth = 1;
                    last;
                }
            }
            else {
                $allow_quote_comments = 1;
            }
        }
    }

    # There are two different loops which search for the ending quote
    # character.  In the rare case of an alphanumeric quote delimiter, we
    # have to look through alphanumeric tokens character-by-character, since
    # the pre-tokenization process combines multiple alphanumeric
    # characters, whereas for a non-alphanumeric delimiter, only tokens of
    # length 1 can match.

    #----------------------------------------------------------------
    # Case 1 (rare): loop for case of alphanumeric quote delimiter..
    # "quote_pos" is the position in the current word to begin searching
    #----------------------------------------------------------------
    if ( $beginning_tok =~ /\w/ ) {

        # Note this because it is not recommended practice except
        # for obfuscated perl contests
        if ( $in_quote == 1 ) {
            write_logfile_entry(
                "Note: alphanumeric quote delimiter ($beginning_tok) \n");
        }

        # Note: changed < to <= here to fix c109. Relying on extra end blanks.
while ( $i <= $max_token_index ) { if ( $quote_pos == 0 || ( $i < 0 ) ) { $tok = $rtokens->[ ++$i ]; if ( $tok eq '\\' ) { # retain backslash unless it hides the end token $quoted_string .= $tok unless $rtokens->[ $i + 1 ] eq $end_tok; $quote_pos++; last if ( $i >= $max_token_index ); $tok = $rtokens->[ ++$i ]; } } my $old_pos = $quote_pos; unless ( defined($tok) && defined($end_tok) && defined($quote_pos) ) { } $quote_pos = 1 + index( $tok, $end_tok, $quote_pos ); if ( $quote_pos > 0 ) { $quoted_string .= substr( $tok, $old_pos, $quote_pos - $old_pos - 1 ); # NOTE: any quote modifiers will be at the end of '$tok'. If we # wanted to check them, this is the place to get them. But # this quote form is rarely used in practice, so it isn't # worthwhile. $quote_depth--; if ( $quote_depth == 0 ) { $in_quote--; last; } } else { if ( $old_pos <= length($tok) ) { $quoted_string .= substr( $tok, $old_pos ); } } } } #----------------------------------------------------------------------- # Case 2 (normal): loop for case of a non-alphanumeric quote delimiter.. 
#----------------------------------------------------------------------- else { while ( $i < $max_token_index ) { $tok = $rtokens->[ ++$i ]; if ( $tok eq $end_tok ) { $quote_depth--; if ( $quote_depth == 0 ) { $in_quote--; last; } } elsif ( $tok eq $beginning_tok ) { $quote_depth++; } elsif ( $tok eq '\\' ) { # retain backslash unless it hides the beginning or end token $tok = $rtokens->[ ++$i ]; $quoted_string .= '\\' unless ( $tok eq $end_tok || $tok eq $beginning_tok ); } $quoted_string .= $tok; } } if ( $i > $max_token_index ) { $i = $max_token_index } return ( $i, $in_quote, $beginning_tok, $quote_pos, $quote_depth, $quoted_string, ); } ## end sub follow_quoted_string sub indicate_error { my ( $msg, $line_number, $input_line, $pos, $carrat ) = @_; interrupt_logfile(); warning($msg); write_error_indicator_pair( $line_number, $input_line, $pos, $carrat ); resume_logfile(); return; } ## end sub indicate_error sub write_error_indicator_pair { my ( $line_number, $input_line, $pos, $carrat ) = @_; my ( $offset, $numbered_line, $underline ) = make_numbered_line( $line_number, $input_line, $pos ); $underline = write_on_underline( $underline, $pos - $offset, $carrat ); warning( $numbered_line . "\n" ); $underline =~ s/\s*$//; warning( $underline . "\n" ); return; } ## end sub write_error_indicator_pair sub make_numbered_line { # Given an input line, its line number, and a character position of # interest, create a string not longer than 80 characters of the form # $lineno: sub_string # such that the sub_string of $str contains the position of interest # # Here is an example of what we want, in this case we add trailing # '...' because the line is long. # # 2: (One of QAML 2.0's authors is a member of the World Wide Web Con ... # # Here is another example, this time in which we used leading '...' # because of excessive length: # # 2: ... 
er of the World Wide Web Consortium's # # input parameters are: # $lineno = line number # $str = the text of the line # $pos = position of interest (the error) : 0 = first character # # We return : # - $offset = an offset which corrects the position in case we only # display part of a line, such that $pos-$offset is the effective # position from the start of the displayed line. # - $numbered_line = the numbered line as above, # - $underline = a blank 'underline' which is all spaces with the same # number of characters as the numbered line. my ( $lineno, $str, $pos ) = @_; my $offset = ( $pos < 60 ) ? 0 : $pos - 40; my $excess = length($str) - $offset - 68; my $numc = ( $excess > 0 ) ? 68 : undef; if ( defined($numc) ) { if ( $offset == 0 ) { $str = substr( $str, $offset, $numc - 4 ) . " ..."; } else { $str = "... " . substr( $str, $offset + 4, $numc - 4 ) . " ..."; } } else { if ( $offset == 0 ) { } else { $str = "... " . substr( $str, $offset + 4 ); } } my $numbered_line = sprintf( "%d: ", $lineno ); $offset -= length($numbered_line); $numbered_line .= $str; my $underline = SPACE x length($numbered_line); return ( $offset, $numbered_line, $underline ); } ## end sub make_numbered_line sub write_on_underline { # The "underline" is a string that shows where an error is; it starts # out as a string of blanks with the same length as the numbered line of # code above it, and we have to add marking to show where an error is. # In the example below, we want to write the string '--^' just below # the line of bad code: # # 2: (One of QAML 2.0's authors is a member of the World Wide Web Con ... # ---^ # We are given the current underline string, plus a position and a # string to write on it. # # In the above example, there will be 2 calls to do this: # First call: $pos=19, pos_chr=^ # Second call: $pos=16, pos_chr=--- # # This is a trivial thing to do with substr, but there is some # checking to do. 
my ( $underline, $pos, $pos_chr ) = @_; # check for error..shouldn't happen unless ( ( $pos >= 0 ) && ( $pos <= length($underline) ) ) { return $underline; } my $excess = length($pos_chr) + $pos - length($underline); if ( $excess > 0 ) { $pos_chr = substr( $pos_chr, 0, length($pos_chr) - $excess ); } substr( $underline, $pos, length($pos_chr) ) = $pos_chr; return ($underline); } ## end sub write_on_underline sub pre_tokenize { my ( $str, $max_tokens_wanted ) = @_; # Input parameter: # $max_tokens_wanted > 0 to stop on reaching this many tokens. # = 0 means get all tokens # Break a string, $str, into a sequence of preliminary tokens. We # are interested in these types of tokens: # words (type='w'), example: 'max_tokens_wanted' # digits (type = 'd'), example: '0755' # whitespace (type = 'b'), example: ' ' # any other single character (i.e. punct; type = the character itself). # We cannot do better than this yet because we might be in a quoted # string or pattern. Caller sets $max_tokens_wanted to 0 to get all # tokens. # An advantage of doing this pre-tokenization step is that it keeps almost # all of the regex work highly localized. A disadvantage is that in some # very rare instances we will have to go back and split a pre-token. # Return parameters: my @tokens = (); # array of the tokens themselves my @token_map = (0); # string position of start of each token my @type = (); # 'b'=whitespace, 'd'=digits, 'w'=alpha, or punct do { # whitespace if ( $str =~ /\G(\s+)/gc ) { push @type, 'b'; } # numbers # note that this must come before words! elsif ( $str =~ /\G(\d+)/gc ) { push @type, 'd'; } # words elsif ( $str =~ /\G(\w+)/gc ) { push @type, 'w'; } # single-character punctuation elsif ( $str =~ /\G(\W)/gc ) { push @type, $1; } # that's all.. 
else { return ( \@tokens, \@token_map, \@type ); } push @tokens, $1; push @token_map, pos($str); } while ( --$max_tokens_wanted != 0 ); return ( \@tokens, \@token_map, \@type ); } ## end sub pre_tokenize sub show_tokens { # this is an old debug routine # not called, but saved for reference my ( $rtokens, $rtoken_map ) = @_; my $num = scalar( @{$rtokens} ); foreach my $i ( 0 .. $num - 1 ) { my $len = length( $rtokens->[$i] ); print STDOUT "$i:$len:$rtoken_map->[$i]:$rtokens->[$i]:\n"; } return; } ## end sub show_tokens { ## closure for sub matching end token my %matching_end_token; BEGIN { %matching_end_token = ( '{' => '}', '(' => ')', '[' => ']', '<' => '>', ); } ## end BEGIN sub matching_end_token { # return closing character for a pattern my $beginning_token = shift; if ( $matching_end_token{$beginning_token} ) { return $matching_end_token{$beginning_token}; } return ($beginning_token); } ## end sub matching_end_token } sub dump_token_types { my ( $class, $fh ) = @_; # This should be the latest list of token types in use # adding NEW_TOKENS: add a comment here $fh->print(<<'END_OF_LIST'); Here is a list of the token types currently used for lines of type 'CODE'. For the following tokens, the "type" of a token is just the token itself. .. :: << >> ** && .. || // -> => += -= .= %= &= |= ^= *= <> ( ) <= >= == =~ !~ != ++ -- /= x= ... **= <<= >>= &&= ||= //= <=> , + - / * | % ! x ~ = \ ? : . 
<   >   ^   &

The following additional token types are defined:

 type    meaning
    b    blank (white space)
    {    indent: opening structural curly brace or square bracket or paren
         (code block, anonymous hash reference, or anonymous array reference)
    }    outdent: right structural curly brace or square bracket or paren
    [    left non-structural square bracket (enclosing an array index)
    ]    right non-structural square bracket
    (    left non-structural paren (all but a list right of an =)
    )    right non-structural paren
    L    left non-structural curly brace (enclosing a key)
    R    right non-structural curly brace
    ;    terminal semicolon
    f    indicates a semicolon in a "for" statement
    h    here_doc operator <<
    #    a comment
    Q    indicates a quote or pattern
    q    indicates a qw quote block
    k    a perl keyword
    C    user-defined constant or constant function (with void prototype = ())
    U    user-defined function taking parameters
    G    user-defined function taking block parameter (like grep/map/eval)
    M    (unused, but reserved for subroutine definition name)
    P    (unused, but -html uses it to label pod text)
    t    type indicator such as %,$,@,*,&,sub
    w    bare word (perhaps a subroutine call)
    i    identifier of some type (with leading %, $, @, *, &, sub, -> )
    n    a number
    v    a v-string
    F    a file test operator (like -e)
    Y    File handle
    Z    identifier in indirect object slot: may be file handle, object
    J    LABEL:  code block label
    j    LABEL after next, last, redo, goto
    p    unary +
    m    unary -
    pp   pre-increment operator ++
    mm   pre-decrement operator --
    A    : used as attribute separator

    Here are the '_line_type' codes used internally:
    SYSTEM         - system-specific code before hash-bang line
    CODE           - line of perl code (including comments)
    POD_START      - line starting pod, such as '=head'
    POD            - pod documentation text
    POD_END        - last line of pod section, '=cut'
    HERE           - text of here-document
    HERE_END       - last line of here-doc (target word)
    FORMAT         - format section
    FORMAT_END     - last line of format section, '.'
SKIP - code skipping section SKIP_END - last line of code skipping section, '#>>V' DATA_START - __DATA__ line DATA - unidentified text following __DATA__ END_START - __END__ line END - unidentified text following __END__ ERROR - we are in big trouble, probably not a perl script END_OF_LIST return; } ## end sub dump_token_types BEGIN { # These names are used in error messages @opening_brace_names = qw# '{' '[' '(' '?' #; @closing_brace_names = qw# '}' ']' ')' ':' #; my @q; my @digraphs = qw( .. :: << >> ** && || // -> => += -= .= %= &= |= ^= *= <> <= >= == =~ !~ != ++ -- /= x= ~~ ~. |. &. ^. ); @is_digraph{@digraphs} = (1) x scalar(@digraphs); @q = qw( . : < > * & | / - = + - % ^ ! x ~ ); @can_start_digraph{@q} = (1) x scalar(@q); my @trigraphs = qw( ... **= <<= >>= &&= ||= //= <=> !~~ &.= |.= ^.= <<~); @is_trigraph{@trigraphs} = (1) x scalar(@trigraphs); my @tetragraphs = qw( <<>> ); @is_tetragraph{@tetragraphs} = (1) x scalar(@tetragraphs); # make a hash of all valid token types for self-checking the tokenizer # (adding NEW_TOKENS : select a new character and add to this list) my @valid_token_types = qw# A b C G L R f h Q k t w i q n p m F pp mm U j J Y Z v { } ( ) [ ] ; + - / * | % ! x ~ = \ ? : . < > ^ & #; push( @valid_token_types, @digraphs ); push( @valid_token_types, @trigraphs ); push( @valid_token_types, @tetragraphs ); push( @valid_token_types, ( '#', ',', 'CORE::' ) ); @is_valid_token_type{@valid_token_types} = (1) x scalar(@valid_token_types); # a list of file test letters, as in -e (Table 3-4 of 'camel 3') my @file_test_operators = qw( A B C M O R S T W X b c d e f g k l o p r s t u w x z); @is_file_test_operator{@file_test_operators} = (1) x scalar(@file_test_operators); # these functions have prototypes of the form (&), so when they are # followed by a block, that block MAY BE followed by an operator. 
# Smartmatch operator ~~ may be followed by anonymous hash or array ref @q = qw( do eval ); @is_block_operator{@q} = (1) x scalar(@q); # these functions allow an identifier in the indirect object slot @q = qw( print printf sort exec system say); @is_indirect_object_taker{@q} = (1) x scalar(@q); # Note: 'field' will be added by sub check_options if --use-feature=class @q = qw(my our state); @is_my_our_state{@q} = (1) x scalar(@q); # These tokens may precede a code block # patched for SWITCH/CASE/CATCH. Actually these could be removed # now and we could let the extended-syntax coding handle them. # Added 'default' for Switch::Plain. # Note: 'ADJUST' will be added by sub check_options if --use-feature=class @q = qw( BEGIN END CHECK INIT AUTOLOAD DESTROY UNITCHECK continue if elsif else unless do while until eval for foreach map grep sort switch case given when default catch try finally); @is_code_block_token{@q} = (1) x scalar(@q); # Note: this hash was formerly named '%is_not_zero_continuation_block_type' # to contrast it with the block types in '%is_zero_continuation_block_type' @q = qw( sort map grep eval do ); @is_sort_map_grep_eval_do{@q} = (1) x scalar(@q); @q = qw( sort map grep ); @is_sort_map_grep{@q} = (1) x scalar(@q); %is_grep_alias = (); # I'll build the list of keywords incrementally my @Keywords = (); # keywords and tokens after which a value or pattern is expected, # but not an operator. In other words, these should consume terms # to their right, or at least they are not expected to be followed # immediately by operators. 
my @value_requestor = qw( AUTOLOAD BEGIN CHECK DESTROY END EQ GE GT INIT LE LT NE UNITCHECK abs accept alarm and atan2 bind binmode bless break caller chdir chmod chomp chop chown chr chroot close closedir cmp connect continue cos crypt dbmclose dbmopen defined delete die dump each else elsif eof eq evalbytes exec exists exit exp fc fcntl fileno flock for foreach formline ge getc getgrgid getgrnam gethostbyaddr gethostbyname getnetbyaddr getnetbyname getpeername getpgrp getpriority getprotobyname getprotobynumber getpwnam getpwuid getservbyname getservbyport getsockname getsockopt glob gmtime goto grep gt hex if index int ioctl join keys kill last lc lcfirst le length link listen local localtime lock log lstat lt map mkdir msgctl msgget msgrcv msgsnd my ne next no not oct open opendir or ord our pack pipe pop pos print printf prototype push quotemeta rand read readdir readlink readline readpipe recv redo ref rename require reset return reverse rewinddir rindex rmdir scalar seek seekdir select semctl semget semop send sethostent setnetent setpgrp setpriority setprotoent setservent setsockopt shift shmctl shmget shmread shmwrite shutdown sin sleep socket socketpair sort splice split sprintf sqrt srand stat state study substr symlink syscall sysopen sysread sysseek system syswrite tell telldir tie tied truncate uc ucfirst umask undef unless unlink unpack unshift untie until use utime values vec waitpid warn while write xor switch case default given when err say isa catch ); # Note: 'ADJUST', 'field' are added by sub check_options # if --use-feature=class # patched above for SWITCH/CASE given/when err say # 'err' is a fairly safe addition. # Added 'default' for Switch::Plain. 
Note that we could also have # a separate set of keywords to include if we see 'use Switch::Plain' push( @Keywords, @value_requestor ); # These are treated the same but are not keywords: my @extra_vr = qw( constant vars ); push( @value_requestor, @extra_vr ); @expecting_term_token{@value_requestor} = (1) x scalar(@value_requestor); # this list contains keywords which do not look for arguments, # so that they might be followed by an operator, or at least # not a term. my @operator_requestor = qw( endgrent endhostent endnetent endprotoent endpwent endservent fork getgrent gethostent getlogin getnetent getppid getprotoent getpwent getservent setgrent setpwent time times wait wantarray ); push( @Keywords, @operator_requestor ); # These are treated the same but are not considered keywords: my @extra_or = qw( STDERR STDIN STDOUT ); push( @operator_requestor, @extra_or ); @expecting_operator_token{@operator_requestor} = (1) x scalar(@operator_requestor); # these token TYPES expect trailing operator but not a term # note: ++ and -- are post-increment and decrement, 'C' = constant my @operator_requestor_types = qw( ++ -- C <> q ); @expecting_operator_types{@operator_requestor_types} = (1) x scalar(@operator_requestor_types); # these token TYPES consume values (terms) # note: pp and mm are pre-increment and decrement # f=semicolon in for, F=file test operator my @value_requestor_type = qw# L { ( [ ~ !~ =~ ; . .. ... A : && ! || // = + - x **= += -= .= /= *= %= x= &= |= ^= <<= >>= &&= ||= //= <= >= == != => \ > < % * / ? & | ** <=> ~~ !~~ <<~ f F pp mm Y p m U J G j >> << ^ t ~. ^. |. &. 
^.= |.= &.= #; push( @value_requestor_type, ',' ) ; # (perl doesn't like a ',' in a qw block) @expecting_term_types{@value_requestor_type} = (1) x scalar(@value_requestor_type); # Note: the following valid token types are not assigned here to # hashes requesting to be followed by values or terms, but are # instead currently hard-coded into sub operator_expected: # ) -> :: Q R Z ] b h i k n v w } # # For simple syntax checking, it is nice to have a list of operators which # will really be unhappy if not followed by a term. This includes most # of the above... %really_want_term = %expecting_term_types; # with these exceptions... delete $really_want_term{'U'}; # user sub, depends on prototype delete $really_want_term{'F'}; # file test works on $_ if no following term delete $really_want_term{'Y'}; # indirect object, too risky to check syntax; # let perl do it @q = qw(q qq qx qr s y tr m); @is_q_qq_qx_qr_s_y_tr_m{@q} = (1) x scalar(@q); # Note added 'qw' here @q = qw(q qq qw qx qr s y tr m); @is_q_qq_qw_qx_qr_s_y_tr_m{@q} = (1) x scalar(@q); # Note: 'class' will be added by sub check_options if -use-feature=class @q = qw(package); @is_package{@q} = (1) x scalar(@q); @q = qw( ? : ); push @q, ','; @is_comma_question_colon{@q} = (1) x scalar(@q); @q = qw( if elsif unless ); @is_if_elsif_unless{@q} = (1) x scalar(@q); @q = qw( ; t ); @is_semicolon_or_t{@q} = (1) x scalar(@q); @q = qw( if elsif unless case when ); @is_if_elsif_unless_case_when{@q} = (1) x scalar(@q); # Hash of other possible line endings which may occur. # Keep these coordinated with the regex where this is used. # Note: chr(13) = chr(015)="\r". 
@q = ( chr(13), chr(29), chr(26) ); @other_line_endings{@q} = (1) x scalar(@q); # These keywords are handled specially in the tokenizer code: my @special_keywords = qw( do eval format m package q qq qr qw qx s sub tr y ); push( @Keywords, @special_keywords ); # Keywords after which list formatting may be used # WARNING: do not include |map|grep|eval or perl may die on # syntax errors (map1.t). my @keyword_taking_list = qw( and chmod chomp chop chown dbmopen die elsif exec fcntl for foreach formline getsockopt if index ioctl join kill local msgctl msgrcv msgsnd my open or our pack print printf push read readpipe recv return reverse rindex seek select semctl semget send setpriority setsockopt shmctl shmget shmread shmwrite socket socketpair sort splice split sprintf state substr syscall sysopen sysread sysseek system syswrite tie unless unlink unpack unshift until vec warn while given when ); @is_keyword_taking_list{@keyword_taking_list} = (1) x scalar(@keyword_taking_list); # perl functions which may be unary operators. # This list is used to decide if a pattern delimited by slashes, /pattern/, # can follow one of these keywords. @q = qw( chomp eof eval fc lc pop shift uc undef ); @is_keyword_rejecting_slash_as_pattern_delimiter{@q} = (1) x scalar(@q); # These are keywords for which an arg may optionally be omitted. They are # currently only used to disambiguate a ? used as a ternary from one used # as a (deprecated) pattern delimiter. In the future, they might be used # to give a warning about ambiguous syntax before a /. # Note: split has been omitted (see not below). 
my @keywords_taking_optional_arg = qw( abs alarm caller chdir chomp chop chr chroot close cos defined die eof eval evalbytes exit exp fc getc glob gmtime hex int last lc lcfirst length localtime log lstat mkdir next oct ord pop pos print printf prototype quotemeta rand readline readlink readpipe redo ref require reset reverse rmdir say select shift sin sleep sqrt srand stat study tell uc ucfirst umask undef unlink warn write ); @is_keyword_taking_optional_arg{@keywords_taking_optional_arg} = (1) x scalar(@keywords_taking_optional_arg); # This list is used to decide if a pattern delimited by question marks, # ?pattern?, can follow one of these keywords. Note that from perl 5.22 # on, a ?pattern? is not recognized, so we can be much more strict than # with a /pattern/. Note that 'split' is not in this list. In current # versions of perl a question following split must be a ternary, but # in older versions it could be a pattern. The guessing algorithm will # decide. We are combining two lists here to simplify the test. @q = ( @keywords_taking_optional_arg, @operator_requestor ); @is_keyword_rejecting_question_as_pattern_delimiter{@q} = (1) x scalar(@q); # These are not used in any way yet # my @unused_keywords = qw( # __FILE__ # __LINE__ # __PACKAGE__ # ); # The list of keywords was originally extracted from function 'keyword' in # perl file toke.c version 5.005.03, using this utility, plus a # little editing: (file getkwd.pl): # while (<>) { while (/\"(.*)\"/g) { print "$1\n"; } } # Add 'get' prefix where necessary, then split into the above lists. # This list should be updated as necessary. 
# The list should not contain these special variables: # ARGV DATA ENV SIG STDERR STDIN STDOUT # __DATA__ __END__ @is_keyword{@Keywords} = (1) x scalar(@Keywords); } ## end BEGIN 1; Perl-Tidy-20230309/lib/Perl/Tidy/LineSink.pm0000644000175000017500000000523414400733204017261 0ustar stevesteve##################################################################### # # the Perl::Tidy::LineSink class supplies a write_line method for # actual file writing # ##################################################################### package Perl::Tidy::LineSink; use strict; use warnings; our $VERSION = '20230309'; sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR < undef, line_separator => undef, is_encoded_data => undef, ); my %args = ( %defaults, @args ); my $output_file = $args{output_file}; my $line_separator = $args{line_separator}; my $is_encoded_data = $args{is_encoded_data}; my $fh = undef; my $output_file_open = 0; ( $fh, $output_file ) = Perl::Tidy::streamhandle( $output_file, 'w', $is_encoded_data ); unless ($fh) { Perl::Tidy::Die("Cannot write to output stream\n"); } $output_file_open = 1; return bless { _fh => $fh, _output_file => $output_file, _output_file_open => $output_file_open, _line_separator => $line_separator, _is_encoded_data => $is_encoded_data, }, $class; } sub set_line_separator { my ( $self, $val ) = @_; $self->{_line_separator} = $val; return; } sub write_line { my ( $self, $line ) = @_; my $fh = $self->{_fh}; my $line_separator = $self->{_line_separator}; if ( defined($line_separator) ) { chomp $line; $line .= $line_separator; } $fh->print($line) if ( $self->{_output_file_open} ); return; } sub close_output_file { my $self = shift; # Only close physical files, not STDOUT and other objects my 
$output_file = $self->{_output_file}; if ( $output_file ne '-' && !ref $output_file ) { $self->{_fh}->close() if $self->{_output_file_open}; } return; } 1; Perl-Tidy-20230309/lib/Perl/Tidy/Diagnostics.pm0000644000175000017500000000551214400733200020007 0ustar stevesteve##################################################################### # # The Perl::Tidy::Diagnostics class writes the DIAGNOSTICS file, which is # useful for program development. # # Only one such file is created regardless of the number of input # files processed. This allows the results of processing many files # to be summarized in a single file. # Output messages go to a file named DIAGNOSTICS, where # they are labeled by file and line. This allows many files to be # scanned at once for some particular condition of interest. It was # particularly useful for developing guessing strategies. # # NOTE: This feature is deactivated in final releases but can be # reactivated for debugging by un-commenting the 'I' options flag # ##################################################################### package Perl::Tidy::Diagnostics; use strict; use warnings; use English qw( -no_match_vars ); our $VERSION = '20230309'; use constant EMPTY_STRING => q{}; sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. 
our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR < 0, _last_diagnostic_file => EMPTY_STRING, _input_file => EMPTY_STRING, _fh => undef, }, $class; } sub set_input_file { my ( $self, $input_file ) = @_; $self->{_input_file} = $input_file; return; } sub write_diagnostics { my ( $self, $msg ) = @_; unless ( $self->{_write_diagnostics_count} ) { open( $self->{_fh}, ">", "DIAGNOSTICS" ) or Perl::Tidy::Die("couldn't open DIAGNOSTICS: $ERRNO\n"); } my $fh = $self->{_fh}; my $last_diagnostic_file = $self->{_last_diagnostic_file}; my $input_file = $self->{_input_file}; if ( $last_diagnostic_file ne $input_file ) { $fh->print("\nFILE:$input_file\n"); } $self->{_last_diagnostic_file} = $input_file; my $input_line_number = Perl::Tidy::Tokenizer::get_input_line_number(); $fh->print("$input_line_number:\t$msg"); $self->{_write_diagnostics_count}++; return; } 1; Perl-Tidy-20230309/lib/Perl/Tidy/IndentationItem.pm0000644000175000017500000002063114400733203020635 0ustar stevesteve##################################################################### # # The Perl::Tidy::IndentationItem class supplies items which contain # how much whitespace should be used at the start of a line # ##################################################################### package Perl::Tidy::IndentationItem; use strict; use warnings; our $VERSION = '20230309'; BEGIN { # Array index names # Do not combine with other BEGIN blocks (c101). 
my $i = 0; use constant { _spaces_ => $i++, _level_ => $i++, _ci_level_ => $i++, _available_spaces_ => $i++, _closed_ => $i++, _comma_count_ => $i++, _lp_item_index_ => $i++, _have_child_ => $i++, _recoverable_spaces_ => $i++, _align_seqno_ => $i++, _marked_ => $i++, _stack_depth_ => $i++, _K_begin_line_ => $i++, _arrow_count_ => $i++, _standard_spaces_ => $i++, _K_extra_space_ => $i++, }; } ## end BEGIN sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR < # total leading white spaces # level => # the indentation 'level' # ci_level => # the 'continuation level' # available_spaces => # how many left spaces available # # for this level # closed => # index where we saw closing '}' # comma_count => # how many commas at this level? # lp_item_index => # index in output batch list # have_child => # any dependents? 
# recoverable_spaces => # how many spaces to the right # # we would like to move to get # # alignment (negative if left) # align_seqno => # if we are aligning with an opening structure, # # this is its seqno # marked => # if visited by corrector logic # stack_depth => # indentation nesting depth # K_begin_line => # first token index K of this level # arrow_count => # how many =>'s my $self = []; $self->[_spaces_] = $input_hash{spaces}; $self->[_level_] = $input_hash{level}; $self->[_ci_level_] = $input_hash{ci_level}; $self->[_available_spaces_] = $input_hash{available_spaces}; $self->[_closed_] = -1; $self->[_comma_count_] = 0; $self->[_lp_item_index_] = $input_hash{lp_item_index}; $self->[_have_child_] = 0; $self->[_recoverable_spaces_] = 0; $self->[_align_seqno_] = $input_hash{align_seqno}; $self->[_marked_] = 0; $self->[_stack_depth_] = $input_hash{stack_depth}; $self->[_K_begin_line_] = $input_hash{K_begin_line}; $self->[_arrow_count_] = 0; $self->[_standard_spaces_] = $input_hash{standard_spaces}; $self->[_K_extra_space_] = $input_hash{K_extra_space}; bless $self, $class; return $self; } ## end sub new sub permanently_decrease_available_spaces { # make a permanent reduction in the available indentation spaces # at one indentation item. NOTE: if there are child nodes, their # total SPACES must be reduced by the caller. my ( $item, $spaces_needed ) = @_; my $available_spaces = $item->get_available_spaces(); my $deleted_spaces = ( $available_spaces > $spaces_needed ) ? $spaces_needed : $available_spaces; # Fixed for c085; a zero value must remain unchanged unless the closed # flag has been set. 
my $closed = $item->get_closed(); $item->decrease_available_spaces($deleted_spaces) unless ( $available_spaces == 0 && $closed < 0 ); $item->decrease_SPACES($deleted_spaces); $item->set_recoverable_spaces(0); return $deleted_spaces; } ## end sub permanently_decrease_available_spaces sub tentatively_decrease_available_spaces { # We are asked to tentatively delete $spaces_needed of indentation # for an indentation item. We may want to undo this later. NOTE: if # there are child nodes, their total SPACES must be reduced by the # caller. my ( $item, $spaces_needed ) = @_; my $available_spaces = $item->get_available_spaces(); my $deleted_spaces = ( $available_spaces > $spaces_needed ) ? $spaces_needed : $available_spaces; $item->decrease_available_spaces($deleted_spaces); $item->decrease_SPACES($deleted_spaces); $item->increase_recoverable_spaces($deleted_spaces); return $deleted_spaces; } ## end sub tentatively_decrease_available_spaces sub get_stack_depth { return $_[0]->[_stack_depth_]; } sub get_spaces { return $_[0]->[_spaces_]; } sub get_standard_spaces { return $_[0]->[_standard_spaces_]; } sub get_marked { return $_[0]->[_marked_]; } sub set_marked { my ( $self, $value ) = @_; if ( defined($value) ) { $self->[_marked_] = $value; } return $self->[_marked_]; } ## end sub set_marked sub get_available_spaces { return $_[0]->[_available_spaces_]; } sub decrease_SPACES { my ( $self, $value ) = @_; if ( defined($value) ) { $self->[_spaces_] -= $value; } return $self->[_spaces_]; } ## end sub decrease_SPACES sub decrease_available_spaces { my ( $self, $value ) = @_; if ( defined($value) ) { $self->[_available_spaces_] -= $value; } return $self->[_available_spaces_]; } ## end sub decrease_available_spaces sub get_align_seqno { return $_[0]->[_align_seqno_]; } sub get_recoverable_spaces { return $_[0]->[_recoverable_spaces_]; } sub set_recoverable_spaces { my ( $self, $value ) = @_; if ( defined($value) ) { $self->[_recoverable_spaces_] = $value; } return 
$self->[_recoverable_spaces_]; } ## end sub set_recoverable_spaces sub increase_recoverable_spaces { my ( $self, $value ) = @_; if ( defined($value) ) { $self->[_recoverable_spaces_] += $value; } return $self->[_recoverable_spaces_]; } ## end sub increase_recoverable_spaces sub get_ci_level { return $_[0]->[_ci_level_]; } sub get_level { return $_[0]->[_level_]; } sub get_spaces_level_ci { my $self = shift; return [ $self->[_spaces_], $self->[_level_], $self->[_ci_level_] ]; } sub get_lp_item_index { return $_[0]->[_lp_item_index_]; } sub get_K_begin_line { return $_[0]->[_K_begin_line_]; } sub get_K_extra_space { return $_[0]->[_K_extra_space_]; } sub set_have_child { my ( $self, $value ) = @_; if ( defined($value) ) { $self->[_have_child_] = $value; } return $self->[_have_child_]; } ## end sub set_have_child sub get_have_child { return $_[0]->[_have_child_]; } sub set_arrow_count { my ( $self, $value ) = @_; if ( defined($value) ) { $self->[_arrow_count_] = $value; } return $self->[_arrow_count_]; } ## end sub set_arrow_count sub get_arrow_count { return $_[0]->[_arrow_count_]; } sub set_comma_count { my ( $self, $value ) = @_; if ( defined($value) ) { $self->[_comma_count_] = $value; } return $self->[_comma_count_]; } ## end sub set_comma_count sub get_comma_count { return $_[0]->[_comma_count_]; } sub set_closed { my ( $self, $value ) = @_; if ( defined($value) ) { $self->[_closed_] = $value; } return $self->[_closed_]; } ## end sub set_closed sub get_closed { return $_[0]->[_closed_]; } 1; Perl-Tidy-20230309/lib/Perl/Tidy/HtmlWriter.pm0000644000175000017500000014455214400733201017652 0ustar stevesteve##################################################################### # # The Perl::Tidy::HtmlWriter class writes a copy of the input stream in html # ##################################################################### package Perl::Tidy::HtmlWriter; use strict; use warnings; our $VERSION = '20230309'; use English qw( -no_match_vars ); use File::Basename; use 
constant EMPTY_STRING => q{}; use constant SPACE => q{ }; # class variables use vars qw{ %html_color %html_bold %html_italic %token_short_names %short_to_long_names $rOpts $css_filename $css_linkname $missing_html_entities $missing_pod_html }; # replace unsafe characters with HTML entity representation if HTML::Entities # is available #{ eval "use HTML::Entities"; $missing_html_entities = $@; } BEGIN { if ( !eval { require HTML::Entities; 1 } ) { $missing_html_entities = $EVAL_ERROR ? $EVAL_ERROR : 1; } if ( !eval { require Pod::Html; 1 } ) { $missing_pod_html = $EVAL_ERROR ? $EVAL_ERROR : 1; } } ## end BEGIN sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR < undef, html_file => undef, extension => undef, html_toc_extension => undef, html_src_extension => undef, ); my %args = ( %defaults, @args ); my $input_file = $args{input_file}; my $html_file = $args{html_file}; my $extension = $args{extension}; my $html_toc_extension = $args{html_toc_extension}; my $html_src_extension = $args{html_src_extension}; my $html_file_opened = 0; my $html_fh; ( $html_fh, my $html_filename ) = Perl::Tidy::streamhandle( $html_file, 'w' ); unless ($html_fh) { Perl::Tidy::Warn("can't open $html_file: $ERRNO\n"); return; } $html_file_opened = 1; if ( !$input_file || $input_file eq '-' || ref($input_file) ) { $input_file = "NONAME"; } # write the table of contents to a string my $toc_string; my $html_toc_fh = Perl::Tidy::IOScalar->new( \$toc_string, 'w' ); my $html_pre_fh; my @pre_string_stack; if ( $rOpts->{'html-pre-only'} ) { # pre section goes directly to the output stream $html_pre_fh = $html_fh; $html_pre_fh->print( <<"PRE_END");
<pre>
PRE_END
    }
    else {

        # pre section goes out to a temporary string
        my $pre_string;
        $html_pre_fh = Perl::Tidy::IOScalar->new( \$pre_string, 'w' );
        push @pre_string_stack, \$pre_string;
    }

    # pod text gets diverted if the 'pod2html' option is used
    my $html_pod_fh;
    my $pod_string;
    if ( $rOpts->{'pod2html'} ) {
        if ( $rOpts->{'html-pre-only'} ) {
            undef $rOpts->{'pod2html'};
        }
        else {
            ##eval "use Pod::Html";
            #if ($@) {
            if ($missing_pod_html) {
                Perl::Tidy::Warn(
"unable to find Pod::Html; cannot use pod2html\n-npod disables this message\n"
                );
                undef $rOpts->{'pod2html'};
            }
            else {
                $html_pod_fh = Perl::Tidy::IOScalar->new( \$pod_string, 'w' );
            }
        }
    }

    my $toc_filename;
    my $src_filename;
    if ( $rOpts->{'frames'} ) {
        unless ($extension) {
            Perl::Tidy::Warn(
"cannot use frames without a specified output extension; ignoring -frm\n"
            );
            undef $rOpts->{'frames'};
        }
        else {
            $toc_filename = $input_file . $html_toc_extension . $extension;
            $src_filename = $input_file . $html_src_extension . $extension;
        }
    }

    # ----------------------------------------------------------
    # Output is now directed as follows:
    # html_toc_fh <-- table of contents items
    # html_pre_fh <-- the <pre> section of formatted code, except:
    # html_pod_fh <-- pod goes here with the pod2html option
    # ----------------------------------------------------------

    my $title = $rOpts->{'title'};
    unless ($title) {
        ( $title, my $path ) = fileparse($input_file);
    }
    my $toc_item_count = 0;
    my $in_toc_package = EMPTY_STRING;
    my $last_level     = 0;
    return bless {
        _input_file        => $input_file,          # name of input file
        _title             => $title,               # title, unescaped
        _html_file         => $html_file,           # name of .html output file
        _toc_filename      => $toc_filename,        # for frames option
        _src_filename      => $src_filename,        # for frames option
        _html_file_opened  => $html_file_opened,    # a flag
        _html_fh           => $html_fh,             # the output stream
        _html_pre_fh       => $html_pre_fh,         # pre section goes here
        _rpre_string_stack => \@pre_string_stack,   # stack of pre sections
        _html_pod_fh       => $html_pod_fh,         # pod goes here if pod2html
        _rpod_string       => \$pod_string,         # string holding pod
        _pod_cut_count     => 0,                    # how many =cut's?
        _html_toc_fh       => $html_toc_fh,         # fh for table of contents
        _rtoc_string       => \$toc_string,         # string holding toc
        _rtoc_item_count   => \$toc_item_count,     # how many toc items
        _rin_toc_package   => \$in_toc_package,     # package name
        _rtoc_name_count   => {},                   # hash to track unique names
        _rpackage_stack    => [],                   # stack to check for package
                                                    # name changes
        _rlast_level       => \$last_level,         # brace indentation level
    }, $class;
} ## end sub new
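
=pod

The constructor above merges the caller's key/value arguments over a table
of defaults with a single list assignment to a hash. The block below is a
minimal standalone sketch of that idiom, not code from this module: the
helper name merge_args is hypothetical, and only three of the keys are
shown for brevity.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl::Tidy): merge caller-supplied
# key/value pairs over a table of defaults. In a list assignment to a
# hash, later pairs override earlier ones, so the caller's values win.
sub merge_args {
    my (@args) = @_;
    my %defaults = (
        input_file => undef,
        html_file  => undef,
        extension  => undef,
    );
    my %args = ( %defaults, @args );
    return \%args;
}

my $rargs = merge_args( html_file => 'out.html' );

# A key supplied by the caller replaces the default
print defined $rargs->{html_file} ? "$rargs->{html_file}\n" : "undef\n";

# A key not supplied still exists, with the default value (undef here)
print exists $rargs->{extension} ? "extension key present\n" : "missing\n";
```

Because every expected key is always present after the merge, later code
can read the hash without guarding against missing keys.

=cut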

sub close_object {
    my ($object) = @_;

    # returns true if close works, false if not
    # failure probably means there is no close method
    return eval { $object->close(); 1 };
} ## end sub close_object
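
=pod

The eval-guarded close in close_object can be exercised in isolation. The
block below is a sketch under stated assumptions: try_close repeats the
same one-line idiom, and the classes My::Closable and My::NoClose are
hypothetical stand-ins for a file handle object and for an object with no
close method.

```perl
use strict;
use warnings;

# Same idiom as close_object: returns 1 if ->close() ran without dying,
# and a false value if the call died (for example, no close method).
sub try_close {
    my ($object) = @_;
    return eval { $object->close(); 1 };
}

# Hypothetical minimal class with a close method, for illustration only
package My::Closable;
sub new   { my ($class) = @_; return bless { closed => 0 }, $class }
sub close { my ($self) = @_; $self->{closed} = 1; return }

package main;

my $good = My::Closable->new();
my $bad  = bless {}, 'My::NoClose';    # this class has no close method

print try_close($good) ? "closed\n" : "no close method\n";
print try_close($bad)  ? "closed\n" : "no close method\n";
```

The trailing 1 inside the eval block matters: without it, a close method
that returns a false value would be indistinguishable from a failed call.

=cut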

sub add_toc_item {

    # Add an item to the html table of contents.
    # This is called even if no table of contents is written,
    # because we still want to put the anchors in the <pre> text.
    # We are given an anchor name and its type; types are:
    #      'package', 'sub', '__END__', '__DATA__', 'EOF'
    # There must be an 'EOF' call at the end to wrap things up.
    my ( $self, $name, $type ) = @_;
    my $html_toc_fh     = $self->{_html_toc_fh};
    my $html_pre_fh     = $self->{_html_pre_fh};
    my $rtoc_name_count = $self->{_rtoc_name_count};
    my $rtoc_item_count = $self->{_rtoc_item_count};
    my $rlast_level     = $self->{_rlast_level};
    my $rin_toc_package = $self->{_rin_toc_package};
    my $rpackage_stack  = $self->{_rpackage_stack};

    # packages contain sublists of subs, so to avoid errors all package
    # items are written and finished with the following routines
    my $end_package_list = sub {
        if ( ${$rin_toc_package} ) {
            $html_toc_fh->print("</ul>\n</li>\n");
            ${$rin_toc_package} = EMPTY_STRING;
        }
        return;
    };

    my $start_package_list = sub {
        my ( $unique_name, $package ) = @_;
        if ( ${$rin_toc_package} ) { $end_package_list->() }
        $html_toc_fh->print(<<EOM);
<li><a href="#$unique_name">package $package</a>
<ul>
EOM
        ${$rin_toc_package} = $package;
        return;
    };

    # start the table of contents on the first item
    unless ( ${$rtoc_item_count} ) {

        # but just quit if we hit EOF without any other entries
        # in this case, there will be no toc
        return if ( $type eq 'EOF' );
        $html_toc_fh->print( <<"TOC_END");
<!-- BEGIN CODE INDEX --><a name="code-index"></a>
<ul>
TOC_END
    }
    ${$rtoc_item_count}++;

    # make a unique anchor name for this location:
    #   - packages get a 'package-' prefix
    #   - subs use their names
    my $unique_name = $name;
    if ( $type eq 'package' ) { $unique_name = "package-$name" }

    # append '-1', '-2', etc if necessary to make unique; this will
    # be unique because subs and packages cannot have a '-'
    if ( my $count = $rtoc_name_count->{ lc $unique_name }++ ) {
        $unique_name .= "-$count";
    }

    #   - all names get terminal '-' if pod2html is used, to avoid
    #     conflicts with anchor names created by pod2html
    if ( $rOpts->{'pod2html'} ) { $unique_name .= '-' }

    # start/stop lists of subs
    if ( $type eq 'sub' ) {
        my $package = $rpackage_stack->[ ${$rlast_level} ];
        unless ($package) { $package = 'main' }

        # if we're already in a package/sub list, be sure it's the right
        # package or else close it
        if ( ${$rin_toc_package} && ${$rin_toc_package} ne $package ) {
            $end_package_list->();
        }

        # start a package/sub list if necessary
        unless ( ${$rin_toc_package} ) {
            $start_package_list->( $unique_name, $package );
        }
    }

    # now write an entry in the toc for this item
    if ( $type eq 'package' ) {
        $start_package_list->( $unique_name, $name );
    }
    elsif ( $type eq 'sub' ) {
        $html_toc_fh->print("<li><a href=\"#$unique_name\">$name</a></li>\n");
    }
    else {
        $end_package_list->();
        $html_toc_fh->print("<li><a href=\"#$unique_name\">$name</a></li>\n");
    }

    # write the anchor in the <pre> section
    $html_pre_fh->print("<a name=\"$unique_name\"></a>");

    # end the table of contents, if any, on the end of file
    if ( $type eq 'EOF' ) {
        $html_toc_fh->print( <<"TOC_END");
</ul>
<!-- END CODE INDEX -->
    TOC_END } return; } ## end sub add_toc_item BEGIN { # This is the official list of tokens which may be identified by the # user. Long names are used as getopt keys. Short names are # convenient short abbreviations for specifying input. Short names # somewhat resemble token type characters, but are often different # because they may only be alphanumeric, to allow command line # input. Also, note that because of case insensitivity of html, # this table must be in a single case only (I've chosen to use all # lower case). # When adding NEW_TOKENS: update this hash table # short names => long names %short_to_long_names = ( 'n' => 'numeric', 'p' => 'paren', 'q' => 'quote', 's' => 'structure', 'c' => 'comment', 'v' => 'v-string', 'cm' => 'comma', 'w' => 'bareword', 'co' => 'colon', 'pu' => 'punctuation', 'i' => 'identifier', 'j' => 'label', 'h' => 'here-doc-target', 'hh' => 'here-doc-text', 'k' => 'keyword', 'sc' => 'semicolon', 'm' => 'subroutine', 'pd' => 'pod-text', ); # Now we have to map actual token types into one of the above short # names; any token types not mapped will get 'punctuation' # properties. # The values of this hash table correspond to the keys of the # previous hash table. # The keys of this hash table are token types and can be seen # by running with --dump-token-types (-dtt). 
    # When adding NEW_TOKENS: update this hash table
    # $type => $short_name
    %token_short_names = (
        '#'  => 'c',
        'n'  => 'n',
        'v'  => 'v',
        'k'  => 'k',
        'F'  => 'k',
        'Q'  => 'q',
        'q'  => 'q',
        'J'  => 'j',
        'j'  => 'j',
        'h'  => 'h',
        'H'  => 'hh',
        'w'  => 'w',
        ','  => 'cm',
        '=>' => 'cm',
        ';'  => 'sc',
        ':'  => 'co',
        'f'  => 'sc',
        '('  => 'p',
        ')'  => 'p',
        'M'  => 'm',
        'P'  => 'pd',
        'A'  => 'co',
    );

    # These token types will all be called identifiers for now
    my @identifier = qw< i t U C Y Z G :: CORE::>;
    @token_short_names{@identifier} = ('i') x scalar(@identifier);

    # These token types will be called 'structure'
    my @structure = qw< { } >;
    @token_short_names{@structure} = ('s') x scalar(@structure);

    # OLD NOTES: save for reference
    # Any of these could be added later if it would be useful.
    # For now, they will by default become punctuation
    #    my @list = qw< L R [ ] >;
    #    @token_long_names{@list} = ('non-structure') x scalar(@list);
    #
    #    my @list = qw"
    #      / /= * *= ** **= + += - -= % %= = ++ -- << <<= >> >>= pp p m mm
    #      ";
    #    @token_long_names{@list} = ('math') x scalar(@list);
    #
    #    my @list = qw" & &= ~ ~= ^ ^= | |= ";
    #    @token_long_names{@list} = ('bit') x scalar(@list);
    #
    #    my @list = qw" == != < > <= <=> ";
    #    @token_long_names{@list} = ('numerical-comparison') x scalar(@list);
    #
    #    my @list = qw" && || ! &&= ||= //= ";
    #    @token_long_names{@list} = ('logical') x scalar(@list);
    #
    #    my @list = qw" . .= =~ !~ x x= ";
    #    @token_long_names{@list} = ('string-operators') x scalar(@list);
    #
    #    # Incomplete..
    #    my @list = qw" .. -> <> ... \ ? ";
    #    @token_long_names{@list} = ('misc-operators') x scalar(@list);

} ## end BEGIN

sub make_getopt_long_names {
    my ( $class, $rgetopt_names ) = @_;
    while ( my ( $short_name, $name ) = each %short_to_long_names ) {
        push @{$rgetopt_names}, "html-color-$name=s";
        push @{$rgetopt_names}, "html-italic-$name!";
        push @{$rgetopt_names}, "html-bold-$name!";
    }
    push @{$rgetopt_names}, "html-color-background=s";
    push @{$rgetopt_names}, "html-linked-style-sheet=s";
    push @{$rgetopt_names}, "nohtml-style-sheets";
    push @{$rgetopt_names}, "html-pre-only";
    push @{$rgetopt_names}, "html-line-numbers";
    push @{$rgetopt_names}, "html-entities!";
    push @{$rgetopt_names}, "stylesheet";
    push @{$rgetopt_names}, "html-table-of-contents!";
    push @{$rgetopt_names}, "pod2html!";
    push @{$rgetopt_names}, "frames!";
    push @{$rgetopt_names}, "html-toc-extension=s";
    push @{$rgetopt_names}, "html-src-extension=s";

    # Pod::Html parameters:
    push @{$rgetopt_names}, "backlink=s";
    push @{$rgetopt_names}, "cachedir=s";
    push @{$rgetopt_names}, "htmlroot=s";
    push @{$rgetopt_names}, "libpods=s";
    push @{$rgetopt_names}, "podpath=s";
    push @{$rgetopt_names}, "podroot=s";
    push @{$rgetopt_names}, "title=s";

    # Pod::Html parameters with leading 'pod' which will be removed
    # before the call to Pod::Html
    push @{$rgetopt_names}, "podquiet!";
    push @{$rgetopt_names}, "podverbose!";
    push @{$rgetopt_names}, "podrecurse!";
    push @{$rgetopt_names}, "podflush";
    push @{$rgetopt_names}, "podheader!";
    push @{$rgetopt_names}, "podindex!";
    return;
} ## end sub make_getopt_long_names

sub make_abbreviated_names {

    # We're appending things like this to the expansion list:
    #      'hcc'    => [qw(html-color-comment)],
    #      'hck'    => [qw(html-color-keyword)],
    #  etc
    my ( $class, $rexpansion ) = @_;

    # abbreviations for color/bold/italic properties
    while ( my ( $short_name, $long_name ) = each %short_to_long_names ) {
        ${$rexpansion}{"hc$short_name"}  = ["html-color-$long_name"];
        ${$rexpansion}{"hb$short_name"}  = ["html-bold-$long_name"];
        ${$rexpansion}{"hi$short_name"}  = ["html-italic-$long_name"];
        ${$rexpansion}{"nhb$short_name"} = ["nohtml-bold-$long_name"];
        ${$rexpansion}{"nhi$short_name"} = ["nohtml-italic-$long_name"];
    }

    # abbreviations for all other html options
    ${$rexpansion}{"hcbg"}  = ["html-color-background"];
    ${$rexpansion}{"pre"}   = ["html-pre-only"];
    ${$rexpansion}{"toc"}   = ["html-table-of-contents"];
    ${$rexpansion}{"ntoc"}  = ["nohtml-table-of-contents"];
    ${$rexpansion}{"nnn"}   = ["html-line-numbers"];
    ${$rexpansion}{"hent"}  = ["html-entities"];
    ${$rexpansion}{"nhent"} = ["nohtml-entities"];
    ${$rexpansion}{"css"}   = ["html-linked-style-sheet"];
    ${$rexpansion}{"nss"}   = ["nohtml-style-sheets"];
    ${$rexpansion}{"ss"}    = ["stylesheet"];
    ${$rexpansion}{"pod"}   = ["pod2html"];
    ${$rexpansion}{"npod"}  = ["nopod2html"];
    ${$rexpansion}{"frm"}   = ["frames"];
    ${$rexpansion}{"nfrm"}  = ["noframes"];
    ${$rexpansion}{"text"}  = ["html-toc-extension"];
    ${$rexpansion}{"sext"}  = ["html-src-extension"];
    return;
} ## end sub make_abbreviated_names

sub check_options {

    # This will be called once after options have been parsed
    # Note that we are defining the package variable $rOpts here:
    ( my $class, $rOpts ) = @_;

    # X11 color names for default settings that seemed to look ok
    # (these color names are only used for programming clarity; the hex
    # numbers are actually written)
    use constant ForestGreen   => "#228B22";
    use constant SaddleBrown   => "#8B4513";
    use constant magenta4      => "#8B008B";
    use constant IndianRed3    => "#CD5555";
    use constant DeepSkyBlue4  => "#00688B";
    use constant MediumOrchid3 => "#B452CD";
    use constant black         => "#000000";
    use constant white         => "#FFFFFF";
    use constant red           => "#FF0000";

    # set default color, bold, italic properties
    # anything not listed here will be given the default (punctuation) color --
    # these types currently not listed and get default: ws pu s sc cm co p
    # When adding NEW_TOKENS: add an entry here if you don't want defaults

    # set_default_properties( $short_name, default_color, bold?,
    #     italic? );
    set_default_properties( 'c',  ForestGreen,   0, 0 );
    set_default_properties( 'pd', ForestGreen,   0, 1 );
    set_default_properties( 'k',  magenta4,      1, 0 );    # was SaddleBrown
    set_default_properties( 'q',  IndianRed3,    0, 0 );
    set_default_properties( 'hh', IndianRed3,    0, 1 );
    set_default_properties( 'h',  IndianRed3,    1, 0 );
    set_default_properties( 'i',  DeepSkyBlue4,  0, 0 );
    set_default_properties( 'w',  black,         0, 0 );
    set_default_properties( 'n',  MediumOrchid3, 0, 0 );
    set_default_properties( 'v',  MediumOrchid3, 0, 0 );
    set_default_properties( 'j',  IndianRed3,    1, 0 );
    set_default_properties( 'm',  red,           1, 0 );

    set_default_color( 'html-color-background',  white );
    set_default_color( 'html-color-punctuation', black );

    # setup property lookup tables for tokens based on their short names
    # every token type has a short name, and will use these tables
    # to do the html markup
    while ( my ( $short_name, $long_name ) = each %short_to_long_names ) {
        $html_color{$short_name}  = $rOpts->{"html-color-$long_name"};
        $html_bold{$short_name}   = $rOpts->{"html-bold-$long_name"};
        $html_italic{$short_name} = $rOpts->{"html-italic-$long_name"};
    }

    # write style sheet to STDOUT and die if requested
    if ( defined( $rOpts->{'stylesheet'} ) ) {
        write_style_sheet_file('-');
        Perl::Tidy::Exit(0);
    }

    # make sure user gives a file name after -css
    if ( defined( $rOpts->{'html-linked-style-sheet'} ) ) {
        $css_linkname = $rOpts->{'html-linked-style-sheet'};
        if ( $css_linkname =~ /^-/ ) {
            Perl::Tidy::Die("You must specify a valid filename after -css\n");
        }
    }

    # check for conflict
    if ( $css_linkname && $rOpts->{'nohtml-style-sheets'} ) {
        $rOpts->{'nohtml-style-sheets'} = 0;
        Perl::Tidy::Warn(
            "You can't specify both -css and -nss; -nss ignored\n");
    }

    # write a style sheet file if necessary
    if ($css_linkname) {

        # if the selected filename exists, don't write, because user may
        # have done some work by hand to create it; use backup name instead
        # Also, this will avoid a potential disaster in which the user
        # forgets
        # to specify the style sheet, like this:
        #    perltidy -html -css myfile1.pl myfile2.pl
        # This would cause myfile1.pl to be parsed as the style sheet by GetOpts
        my $css_filename = $css_linkname;
        unless ( -e $css_filename ) {
            write_style_sheet_file($css_filename);
        }
    }
    $missing_html_entities = 1 unless $rOpts->{'html-entities'};
    return;
} ## end sub check_options

sub write_style_sheet_file {

    my $css_filename = shift;
    my $fh;
    unless ( $fh = IO::File->new("> $css_filename") ) {
        Perl::Tidy::Die("can't open $css_filename: $ERRNO\n");
    }
    write_style_sheet_data($fh);
    close_object($fh);
    return;
} ## end sub write_style_sheet_file

sub write_style_sheet_data {

    # write the style sheet data to an open file handle
    my $fh = shift;

    my $bg_color   = $rOpts->{'html-color-background'};
    my $text_color = $rOpts->{'html-color-punctuation'};

    # pre-bgcolor is new, and may not be defined
    my $pre_bg_color = $rOpts->{'html-pre-color-background'};
    $pre_bg_color = $bg_color unless $pre_bg_color;

    $fh->print(<<"EOM");
/* default style sheet generated by perltidy */
body {background: $bg_color; color: $text_color}
pre { color: $text_color;
      background: $pre_bg_color;
      font-family: courier;
    }

EOM

    foreach my $short_name ( sort keys %short_to_long_names ) {
        my $long_name = $short_to_long_names{$short_name};

        my $abbrev = '.' .
          $short_name;
        if ( length($short_name) == 1 ) { $abbrev .= SPACE }    # for alignment
        my $color = $html_color{$short_name};
        if ( !defined($color) ) { $color = $text_color }
        $fh->print("$abbrev \{ color: $color;");

        if ( $html_bold{$short_name} ) {
            $fh->print(" font-weight:bold;");
        }

        if ( $html_italic{$short_name} ) {
            $fh->print(" font-style:italic;");
        }
        $fh->print("} /* $long_name */\n");
    }
    return;
} ## end sub write_style_sheet_data

sub set_default_color {

    # make sure that options hash $rOpts->{$key} contains a valid color
    my ( $key, $color ) = @_;
    if ( $rOpts->{$key} ) { $color = $rOpts->{$key} }
    $rOpts->{$key} = check_RGB($color);
    return;
} ## end sub set_default_color

sub check_RGB {

    # if color is a 6 digit hex RGB value, prepend a #, otherwise
    # assume that it is a valid ascii color name
    my ($color) = @_;
    if ( $color =~ /^[0-9a-fA-F]{6,6}$/ ) { $color = "#$color" }
    return $color;
} ## end sub check_RGB

sub set_default_properties {
    my ( $short_name, $color, $bold, $italic ) = @_;

    set_default_color( "html-color-$short_to_long_names{$short_name}", $color );
    my $key;
    $key = "html-bold-$short_to_long_names{$short_name}";
    $rOpts->{$key} = ( defined $rOpts->{$key} ) ? $rOpts->{$key} : $bold;
    $key = "html-italic-$short_to_long_names{$short_name}";
    $rOpts->{$key} = ( defined $rOpts->{$key} ) ? $rOpts->{$key} : $italic;
    return;
} ## end sub set_default_properties

sub pod_to_html {

    # Use Pod::Html to process the pod and make the page
    # then merge the perltidy code sections into it.
# return 1 if success, 0 otherwise my ( $self, $pod_string, $css_string, $toc_string, $rpre_string_stack ) = @_; my $input_file = $self->{_input_file}; my $title = $self->{_title}; my $success_flag = 0; # don't try to use pod2html if no pod unless ($pod_string) { return $success_flag; } # Pod::Html requires a real temporary filename my ( $fh_tmp, $tmpfile ) = File::Temp::tempfile(); unless ($fh_tmp) { Perl::Tidy::Warn( "unable to open temporary file $tmpfile; cannot use pod2html\n"); return $success_flag; } #------------------------------------------------------------------ # Warning: a temporary file is open; we have to clean up if # things go bad. From here on all returns should be by going to # RETURN so that the temporary file gets unlinked. #------------------------------------------------------------------ # write the pod text to the temporary file $fh_tmp->print($pod_string); $fh_tmp->close(); # Hand off the pod to pod2html. # Note that we can use the same temporary filename for input and output # because of the way pod2html works. { my @args; push @args, "--infile=$tmpfile", "--outfile=$tmpfile", "--title=$title"; # Flags with string args: # "backlink=s", "cachedir=s", "htmlroot=s", "libpods=s", # "podpath=s", "podroot=s" # Note: -css=s is handled by perltidy itself foreach my $kw (qw(backlink cachedir htmlroot libpods podpath podroot)) { if ( $rOpts->{$kw} ) { push @args, "--$kw=$rOpts->{$kw}" } } # Toggle switches; these have extra leading 'pod' # "header!", "index!", "recurse!", "quiet!", "verbose!" 
foreach my $kw (qw(podheader podindex podrecurse podquiet podverbose)) { my $kwd = $kw; # allows us to strip 'pod' if ( $rOpts->{$kw} ) { $kwd =~ s/^pod//; push @args, "--$kwd" } elsif ( defined( $rOpts->{$kw} ) ) { $kwd =~ s/^pod//; push @args, "--no$kwd"; } } # "flush", my $kw = 'podflush'; if ( $rOpts->{$kw} ) { $kw =~ s/^pod//; push @args, "--$kw" } # Must clean up if pod2html dies (it can); # Be careful not to overwrite callers __DIE__ routine local $SIG{__DIE__} = sub { unlink $tmpfile if -e $tmpfile; Perl::Tidy::Die( $_[0] ); }; Pod::Html::pod2html(@args); } $fh_tmp = IO::File->new( $tmpfile, 'r' ); unless ($fh_tmp) { # this error shouldn't happen ... we just used this filename Perl::Tidy::Warn( "unable to open temporary file $tmpfile; cannot use pod2html\n"); return $success_flag; } my $html_fh = $self->{_html_fh}; my @toc; my $in_toc; my $ul_level = 0; my $no_print; # This routine will write the html selectively and store the toc my $html_print = sub { foreach my $line (@_) { $html_fh->print($line) unless ($no_print); if ($in_toc) { push @toc, $line } } return; }; # loop over lines of html output from pod2html and merge in # the necessary perltidy html sections my ( $saw_body, $saw_index, $saw_body_end ); my $timestamp = EMPTY_STRING; if ( $rOpts->{'timestamp'} ) { my $date = localtime; $timestamp = "on $date"; } while ( my $line = $fh_tmp->getline() ) { if ( $line =~ /^\s*\s*$/i ) { ##my $date = localtime; ##$html_print->("\n"); $html_print->("\n"); $html_print->($line); } # Copy the perltidy css, if any, after tag elsif ( $line =~ /^\s*\s*$/i ) { $saw_body = 1; $html_print->($css_string) if $css_string; $html_print->($line); # add a top anchor and heading $html_print->("\n"); $title = escape_html($title); $html_print->("

    $title

    \n"); } # check for start of index, old pod2html # before Pod::Html VERSION 1.15_02 it is delimited by comments as: # #
      # ... #
    # # elsif ( $line =~ /^\s*\s*$/i ) { $in_toc = 'INDEX'; # when frames are used, an extra table of contents in the # contents panel is confusing, so don't print it $no_print = $rOpts->{'frames'} || !$rOpts->{'html-table-of-contents'}; $html_print->("

    Doc Index:

    \n") if $rOpts->{'frames'}; $html_print->($line); } # check for start of index, new pod2html # After Pod::Html VERSION 1.15_02 it is delimited as: #
      # ... #
    elsif ( $line =~ /^\s*/i ) { $in_toc = 'UL'; $ul_level = 1; # when frames are used, an extra table of contents in the # contents panel is confusing, so don't print it $no_print = $rOpts->{'frames'} || !$rOpts->{'html-table-of-contents'}; $html_print->("

    Doc Index:

    \n") if $rOpts->{'frames'}; $html_print->($line); } # Check for end of index, old pod2html elsif ( $line =~ /^\s*\s*$/i ) { $saw_index = 1; $html_print->($line); # Copy the perltidy toc, if any, after the Pod::Html toc if ($toc_string) { $html_print->("
    \n") if $rOpts->{'frames'}; $html_print->("

    Code Index:

    \n"); ##my @toc = map { $_ .= "\n" } split /\n/, $toc_string; my @toc_st = map { $_ . "\n" } split /\n/, $toc_string; $html_print->(@toc_st); } $in_toc = EMPTY_STRING; $no_print = 0; } # must track
      depth level for new pod2html elsif ( $line =~ /\s*
        \s*$/i && $in_toc eq 'UL' ) { $ul_level++; $html_print->($line); } # Check for end of index, for new pod2html elsif ( $line =~ /\s*<\/ul>/i && $in_toc eq 'UL' ) { $ul_level--; $html_print->($line); # Copy the perltidy toc, if any, after the Pod::Html toc if ( $ul_level <= 0 ) { $saw_index = 1; if ($toc_string) { $html_print->("
        \n") if $rOpts->{'frames'}; $html_print->("

        Code Index:

        \n"); ##my @toc = map { $_ .= "\n" } split /\n/, $toc_string; my @toc_st = map { $_ . "\n" } split /\n/, $toc_string; $html_print->(@toc_st); } $in_toc = EMPTY_STRING; $ul_level = 0; $no_print = 0; } } # Copy one perltidy section after each marker elsif ( $line =~ /^(.*)(.*)$/ ) { $line = $2; $html_print->($1) if $1; # Intermingle code and pod sections if we saw multiple =cut's. if ( $self->{_pod_cut_count} > 1 ) { my $rpre_string = shift( @{$rpre_string_stack} ); if ( ${$rpre_string} ) { $html_print->('
        ');
                            $html_print->( ${$rpre_string} );
                            $html_print->('
        '); } else { # shouldn't happen: we stored a string before writing # each marker. Perl::Tidy::Warn( "Problem merging html stream with pod2html; order may be wrong\n" ); } $html_print->($line); } # If didn't see multiple =cut lines, we'll put the pod out first # and then the code, because it's less confusing. else { # since we are not intermixing code and pod, we don't need # or want any
        lines which separated pod and code $html_print->($line) unless ( $line =~ /^\s*
        \s*$/i ); } } # Copy any remaining code section before the tag elsif ( $line =~ /^\s*<\/body>\s*$/i ) { $saw_body_end = 1; if ( @{$rpre_string_stack} ) { unless ( $self->{_pod_cut_count} > 1 ) { $html_print->('
        '); } while ( my $rpre_string = shift( @{$rpre_string_stack} ) ) { $html_print->('
        ');
                            $html_print->( ${$rpre_string} );
                            $html_print->('
        '); } } $html_print->($line); } else { $html_print->($line); } } $success_flag = 1; unless ($saw_body) { Perl::Tidy::Warn("Did not see in pod2html output\n"); $success_flag = 0; } unless ($saw_body_end) { Perl::Tidy::Warn("Did not see in pod2html output\n"); $success_flag = 0; } unless ($saw_index) { Perl::Tidy::Warn("Did not find INDEX END in pod2html output\n"); $success_flag = 0; } close_object($html_fh); # note that we have to unlink tmpfile before making frames # because the tmpfile may be one of the names used for frames if ( -e $tmpfile ) { unless ( unlink($tmpfile) ) { Perl::Tidy::Warn( "couldn't unlink temporary file $tmpfile: $ERRNO\n"); $success_flag = 0; } } if ( $success_flag && $rOpts->{'frames'} ) { $self->make_frame( \@toc ); } return $success_flag; } ## end sub pod_to_html sub make_frame { # Make a frame with table of contents in the left panel # and the text in the right panel. # On entry: # $html_filename contains the no-frames html output # $rtoc is a reference to an array with the table of contents my ( $self, $rtoc ) = @_; my $input_file = $self->{_input_file}; my $html_filename = $self->{_html_file}; my $toc_filename = $self->{_toc_filename}; my $src_filename = $self->{_src_filename}; my $title = $self->{_title}; $title = escape_html($title); # FUTURE input parameter: my $top_basename = EMPTY_STRING; # We need to produce 3 html files: # 1. - the table of contents # 2. - the contents (source code) itself # 3. - the frame which contains them # get basenames for relative links my ( $toc_basename, $toc_path ) = fileparse($toc_filename); my ( $src_basename, $src_path ) = fileparse($src_filename); # 1. Make the table of contents panel, with appropriate changes # to the anchor names my $src_frame_name = 'SRC'; my $first_anchor = write_toc_html( $title, $toc_filename, $src_basename, $rtoc, $src_frame_name ); # 2. 
The current .html filename is renamed to be the contents panel rename( $html_filename, $src_filename ) or Perl::Tidy::Die( "Cannot rename $html_filename to $src_filename: $ERRNO\n"); # 3. Then use the original html filename for the frame write_frame_html( $title, $html_filename, $top_basename, $toc_basename, $src_basename, $src_frame_name ); return; } ## end sub make_frame sub write_toc_html { # write a separate html table of contents file for frames my ( $title, $toc_filename, $src_basename, $rtoc, $src_frame_name ) = @_; my $fh = IO::File->new( $toc_filename, 'w' ) or Perl::Tidy::Die("Cannot open $toc_filename: $ERRNO\n"); $fh->print(< $title

        $title

        EOM my $first_anchor = change_anchor_names( $rtoc, $src_basename, "$src_frame_name" ); $fh->print( join EMPTY_STRING, @{$rtoc} ); $fh->print(< EOM return; } ## end sub write_toc_html sub write_frame_html { # write an html file to be the table of contents frame my ( $title, $frame_filename, $top_basename, $toc_basename, $src_basename, $src_frame_name ) = @_; my $fh = IO::File->new( $frame_filename, 'w' ) or Perl::Tidy::Die("Cannot open $toc_basename: $ERRNO\n"); $fh->print(< $title EOM # two left panels, one right, if master index file if ($top_basename) { $fh->print(< EOM } # one left panels, one right, if no master index file else { $fh->print(< EOM } $fh->print(< <body> <p>If you see this message, you are using a non-frame-capable web client.</p> <p>This document contains:</p> <ul> <li><a href="$toc_basename">A table of contents</a></li> <li><a href="$src_basename">The source code</a></li> </ul> </body> EOM return; } ## end sub write_frame_html sub change_anchor_names { # add a filename and target to anchors # also return the first anchor my ( $rlines, $filename, $target ) = @_; my $first_anchor; foreach my $line ( @{$rlines} ) { # We're looking for lines like this: #
      • SYNOPSIS
      • # ---- - -------- ----------------- # $1 $4 $5 if ( $line =~ /^(.*)]*>(.*)$/i ) { my $pre = $1; my $name = $4; my $post = $5; my $href = "$filename#$name"; $line = "$pre$post\n"; unless ($first_anchor) { $first_anchor = $href } } } return $first_anchor; } ## end sub change_anchor_names sub close_html_file { my $self = shift; return unless $self->{_html_file_opened}; my $html_fh = $self->{_html_fh}; my $rtoc_string = $self->{_rtoc_string}; # There are 3 basic paths to html output... # --------------------------------- # Path 1: finish up if in -pre mode # --------------------------------- if ( $rOpts->{'html-pre-only'} ) { $html_fh->print( <<"PRE_END");
PRE_END close_object($html_fh); return; } # Finish the index $self->add_toc_item( 'EOF', 'EOF' ); my $rpre_string_stack = $self->{_rpre_string_stack}; # Patch to darken the
 background color in case of pod2html and
    # interleaved code/documentation.  Otherwise, the distinction
    # between code and documentation is blurred.
    if (   $rOpts->{pod2html}
        && $self->{_pod_cut_count} >= 1
        && $rOpts->{'html-color-background'} eq '#FFFFFF' )
    {
        $rOpts->{'html-pre-color-background'} = '#F0F0F0';
    }

    # put the css or its link into a string, if used
    my $css_string;
    my $fh_css = Perl::Tidy::IOScalar->new( \$css_string, 'w' );

    # use css linked to another file
    if ( $rOpts->{'html-linked-style-sheet'} ) {
        $fh_css->print(
            qq(<link rel="stylesheet" href="$css_linkname" type="text/css" />));
    }

    # use css embedded in this file
    elsif ( !$rOpts->{'nohtml-style-sheets'} ) {
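        $fh_css->print( <<'ENDCSS');
<style type="text/css">
<!--
ENDCSS
        write_style_sheet_data($fh_css);
        $fh_css->print( <<"ENDCSS");
-->
</style>
ENDCSS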
    }

    # -----------------------------------------------------------
    # path 2: use pod2html if requested
    #         If we fail for some reason, continue on to path 3
    # -----------------------------------------------------------
    if ( $rOpts->{'pod2html'} ) {
        my $rpod_string = $self->{_rpod_string};
        $self->pod_to_html(
            ${$rpod_string}, $css_string,
            ${$rtoc_string}, $rpre_string_stack
        ) && return;
    }

    # --------------------------------------------------
    # path 3: write code in html, with pod only in italics
    # --------------------------------------------------
    my $input_file = $self->{_input_file};
    my $title      = escape_html($input_file);
    my $timestamp  = EMPTY_STRING;
    if ( $rOpts->{'timestamp'} ) {
        my $date = localtime;
        $timestamp = "on $date";
    }
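    $html_fh->print( <<"HTML_START");
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Generated by perltidy $timestamp -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>$title</title>
HTML_START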

    # output the css, if used
    if ($css_string) {
        $html_fh->print($css_string);
        $html_fh->print( <<"ENDCSS");
</head>
<body>
ENDCSS
    }
    else {

        $html_fh->print( <<"HTML_START");
</head>
<body bgcolor=\"$rOpts->{'html-color-background'}\" text=\"$rOpts->{'html-color-punctuation'}\">
HTML_START
    }

    $html_fh->print("<a name=\"-top-\"></a>\n");
    $html_fh->print( <<"EOM");
<h1>$title</h1>
EOM

    # copy the table of contents
    if (   ${$rtoc_string}
        && !$rOpts->{'frames'}
        && $rOpts->{'html-table-of-contents'} )
    {
        $html_fh->print( ${$rtoc_string} );
    }

    # copy the pre section(s)
    my $fname_comment = $input_file;
    $fname_comment =~ s/--+/-/g;    # protect HTML comment tags
    $html_fh->print( <<"END_PRE");
<hr />
<!-- contents of filename: $fname_comment -->
<pre>
END_PRE

    foreach my $rpre_string ( @{$rpre_string_stack} ) {
        $html_fh->print( ${$rpre_string} );
    }

    # and finish the html page
    $html_fh->print( <<"HTML_END");
</pre>
</body>
</html>
HTML_END
    close_object($html_fh);

    if ( $rOpts->{'frames'} ) {
        ##my @toc = map { $_ .= "\n" } split /\n/, ${$rtoc_string};
        my @toc = map { $_ . "\n" } split /\n/, ${$rtoc_string};
        $self->make_frame( \@toc );
    }
    return;
} ## end sub close_html_file

sub markup_tokens {
    my ( $self, $rtokens, $rtoken_type, $rlevels ) = @_;
    my ( @colored_tokens, $type, $token, $level );
    my $rlast_level    = $self->{_rlast_level};
    my $rpackage_stack = $self->{_rpackage_stack};

    foreach my $j ( 0 .. @{$rtoken_type} - 1 ) {
        $type  = $rtoken_type->[$j];
        $token = $rtokens->[$j];
        $level = $rlevels->[$j];
        $level = 0 if ( $level < 0 );

        #-------------------------------------------------------
        # Update the package stack. The package stack is needed to keep
        # the toc correct because some packages may be declared within
        # blocks and go out of scope when we leave the block.
        #-------------------------------------------------------
        if ( $level > ${$rlast_level} ) {
            unless ( $rpackage_stack->[ $level - 1 ] ) {
                $rpackage_stack->[ $level - 1 ] = 'main';
            }
            $rpackage_stack->[$level] = $rpackage_stack->[ $level - 1 ];
        }
        elsif ( $level < ${$rlast_level} ) {
            my $package = $rpackage_stack->[$level];
            unless ($package) { $package = 'main' }

            # if we change packages due to a nesting change, we
            # have to make an entry in the toc
            if ( $package ne $rpackage_stack->[ $level + 1 ] ) {
                $self->add_toc_item( $package, 'package' );
            }
        }
        ${$rlast_level} = $level;

        #-------------------------------------------------------
        # Intercept a sub name here; split it
        # into keyword 'sub' and sub name; and add an
        # entry in the toc
        #-------------------------------------------------------
        if ( $type eq 'i' && $token =~ /^(sub\s+)(\w.*)$/ ) {
            $token = $self->markup_html_element( $1, 'k' );
            push @colored_tokens, $token;
            $token = $2;
            $type  = 'M';

            # but don't include sub declarations in the toc;
            # these will have leading token types 'i;'
            my $signature = join EMPTY_STRING, @{$rtoken_type};
            unless ( $signature =~ /^i;/ ) {
                my $subname = $token;
                $subname =~ s/[\s\(].*$//;  # remove any attributes and prototype
                $self->add_toc_item( $subname, 'sub' );
            }
        }

        #-------------------------------------------------------
        # Intercept a package name here; split it
        # into keyword 'package' and name; add to the toc,
        # and update the package stack
        #-------------------------------------------------------
        if ( $type eq 'i' && $token =~ /^(package\s+)(\w.*)$/ ) {
            $token = $self->markup_html_element( $1, 'k' );
            push @colored_tokens, $token;
            $token = $2;
            $type  = 'i';
            $self->add_toc_item( "$token", 'package' );
            $rpackage_stack->[$level] = $token;
        }

        $token = $self->markup_html_element( $token, $type );
        push @colored_tokens, $token;
    }
    return ( \@colored_tokens );
} ## end sub markup_tokens

sub markup_html_element {
    my ( $self, $token, $type ) = @_;

    return $token if ( $type eq 'b' );         # skip a blank token
    return $token if ( $token =~ /^\s*$/ );    # skip a blank line
    $token = escape_html($token);

    # get the short abbreviation for this token type
    my $short_name = $token_short_names{$type};
    if ( !defined($short_name) ) {
        $short_name = "pu";                    # punctuation is default
    }

    # handle style sheets..
    if ( !$rOpts->{'nohtml-style-sheets'} ) {
        if ( $short_name ne 'pu' ) {
            $token = qq(<span class="$short_name">) . $token . "</span>";
        }
    }

    # handle no style sheets..
    else {
        my $color = $html_color{$short_name};

        if ( $color && ( $color ne $rOpts->{'html-color-punctuation'} ) ) {
            $token = qq(<font color="$color">) . $token . "</font>";
        }
        if ( $html_italic{$short_name} ) { $token = "<i>$token</i>" }
        if ( $html_bold{$short_name} )   { $token = "<b>$token</b>" }
    }
    return $token;
} ## end sub markup_html_element

sub escape_html {

    my $token = shift;
    if ($missing_html_entities) {
        $token =~ s/\&/&amp;/g;
        $token =~ s/\</&lt;/g;
        $token =~ s/\>/&gt;/g;
        $token =~ s/\"/&quot;/g;
    }
    else {
        HTML::Entities::encode_entities($token);
    }
    return $token;
} ## end sub escape_html

sub finish_formatting {

    # called after last line
    my $self = shift;
    $self->close_html_file();
    return;
} ## end sub finish_formatting

sub write_line {

    my ( $self, $line_of_tokens ) = @_;
    return unless $self->{_html_file_opened};
    my $html_pre_fh = $self->{_html_pre_fh};
    my $line_type   = $line_of_tokens->{_line_type};
    my $input_line  = $line_of_tokens->{_line_text};
    my $line_number = $line_of_tokens->{_line_number};
    chomp $input_line;

    # markup line of code..
    my $html_line;
    if ( $line_type eq 'CODE' ) {
        my $rtoken_type = $line_of_tokens->{_rtoken_type};
        my $rtokens     = $line_of_tokens->{_rtokens};
        my $rlevels     = $line_of_tokens->{_rlevels};

        if ( $input_line =~ /(^\s*)/ ) {
            $html_line = $1;
        }
        else {
            $html_line = EMPTY_STRING;
        }
        my ($rcolored_tokens) =
          $self->markup_tokens( $rtokens, $rtoken_type, $rlevels );
        $html_line .= join EMPTY_STRING, @{$rcolored_tokens};
    }

    # markup line of non-code..
else { my $line_character; if ( $line_type eq 'HERE' ) { $line_character = 'H' } elsif ( $line_type eq 'HERE_END' ) { $line_character = 'h' } elsif ( $line_type eq 'FORMAT' ) { $line_character = 'H' } elsif ( $line_type eq 'FORMAT_END' ) { $line_character = 'h' } elsif ( $line_type eq 'SKIP' ) { $line_character = 'H' } elsif ( $line_type eq 'SKIP_END' ) { $line_character = 'h' } elsif ( $line_type eq 'SYSTEM' ) { $line_character = 'c' } elsif ( $line_type eq 'END_START' ) { $line_character = 'k'; $self->add_toc_item( '__END__', '__END__' ); } elsif ( $line_type eq 'DATA_START' ) { $line_character = 'k'; $self->add_toc_item( '__DATA__', '__DATA__' ); } elsif ( $line_type =~ /^POD/ ) { $line_character = 'P'; if ( $rOpts->{'pod2html'} ) { my $html_pod_fh = $self->{_html_pod_fh}; if ( $line_type eq 'POD_START' ) { my $rpre_string_stack = $self->{_rpre_string_stack}; my $rpre_string = $rpre_string_stack->[-1]; # if we have written any non-blank lines to the # current pre section, start writing to a new output # string if ( ${$rpre_string} =~ /\S/ ) { my $pre_string; $html_pre_fh = Perl::Tidy::IOScalar->new( \$pre_string, 'w' ); $self->{_html_pre_fh} = $html_pre_fh; push @{$rpre_string_stack}, \$pre_string; # leave a marker in the pod stream so we know # where to put the pre section we just # finished. my $for_html = '=for html'; # don't confuse pod utils $html_pod_fh->print(< EOM } # otherwise, just clear the current string and start # over else { ${$rpre_string} = EMPTY_STRING; $html_pod_fh->print("\n"); } } $html_pod_fh->print( $input_line . "\n" ); if ( $line_type eq 'POD_END' ) { $self->{_pod_cut_count}++; $html_pod_fh->print("\n"); } return; } } else { $line_character = 'Q' } $html_line = $self->markup_html_element( $input_line, $line_character ); } # add the line number if requested if ( $rOpts->{'html-line-numbers'} ) { my $extra_space = ( $line_number < 10 ) ? SPACE x 3 : ( $line_number < 100 ) ? SPACE x 2 : ( $line_number < 1000 ) ? 
SPACE : EMPTY_STRING; $html_line = $extra_space . $line_number . SPACE . $html_line; } # write the line $html_pre_fh->print("$html_line\n"); return; } ## end sub write_line 1; Perl-Tidy-20230309/lib/Perl/Tidy/LineBuffer.pm0000644000175000017500000000446214400733203017567 0ustar stevesteve##################################################################### # # The Perl::Tidy::LineBuffer class supplies a 'get_line()' # method for returning the next line to be parsed, as well as a # 'peek_ahead()' method # # The input parameter is an object with a 'get_line()' method # which returns the next line to be parsed # ##################################################################### package Perl::Tidy::LineBuffer; use strict; use warnings; our $VERSION = '20230309'; sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR < $line_source_object, _rlookahead_buffer => [], }, $class; } sub peek_ahead { my ( $self, $buffer_index ) = @_; my $line = undef; my $line_source_object = $self->{_line_source_object}; my $rlookahead_buffer = $self->{_rlookahead_buffer}; if ( $buffer_index < scalar( @{$rlookahead_buffer} ) ) { $line = $rlookahead_buffer->[$buffer_index]; } else { $line = $line_source_object->get_line(); push( @{$rlookahead_buffer}, $line ); } return $line; } sub get_line { my $self = shift; my $line = undef; my $line_source_object = $self->{_line_source_object}; my $rlookahead_buffer = $self->{_rlookahead_buffer}; if ( scalar( @{$rlookahead_buffer} ) ) { $line = shift @{$rlookahead_buffer}; } else { $line = $line_source_object->get_line(); } return $line; } 1; Perl-Tidy-20230309/lib/Perl/Tidy/LineSource.pm0000644000175000017500000000563314400733204017620 0ustar 
stevesteve##################################################################### # # the Perl::Tidy::LineSource class supplies an object with a 'get_line()' method # which returns the next line to be parsed # ##################################################################### package Perl::Tidy::LineSource; use strict; use warnings; use English qw( -no_match_vars ); our $VERSION = '20230309'; use constant DEVEL_MODE => 0; sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR < undef, rOpts => undef, ); my %args = ( %defaults, @args ); my $input_file = $args{input_file}; my $rOpts = $args{rOpts}; ( my $fh, $input_file ) = Perl::Tidy::streamhandle( $input_file, 'r' ); return unless $fh; return bless { _fh => $fh, _filename => $input_file, _rinput_buffer => [], _started => 0, }, $class; } sub close_input_file { my $self = shift; # Only close physical files, not STDIN and other objects my $filename = $self->{_filename}; if ( $filename ne '-' && !ref $filename ) { my $ok = eval { $self->{_fh}->close(); 1 }; if ( !$ok && DEVEL_MODE ) { Fault("Could not close file handle(): $EVAL_ERROR\n"); } } return; } sub get_line { my $self = shift; my $line = undef; my $fh = $self->{_fh}; my $rinput_buffer = $self->{_rinput_buffer}; if ( scalar( @{$rinput_buffer} ) ) { $line = shift @{$rinput_buffer}; } else { $line = $fh->getline(); # patch to read raw mac files under unix, dos # see if the first line has embedded \r's if ( $line && !$self->{_started} ) { if ( $line =~ /[\015][^\015\012]/ ) { # found one -- break the line up and store in a buffer @{$rinput_buffer} = map { $_ . 
"\n" } split /\015/, $line; my $count = @{$rinput_buffer}; $line = shift @{$rinput_buffer}; } $self->{_started}++; } } return $line; } 1; Perl-Tidy-20230309/lib/Perl/Tidy/Formatter.pm0000644000175000017500000434043614401430117017520 0ustar stevesteve#################################################################### # # The Perl::Tidy::Formatter package adds indentation, whitespace, and # line breaks to the token stream # ##################################################################### # Index... # CODE SECTION 1: Preliminary code, global definitions and sub new # sub new # CODE SECTION 2: Some Basic Utilities # CODE SECTION 3: Check and process options # sub check_options # CODE SECTION 4: Receive lines from the tokenizer # sub write_line # CODE SECTION 5: Pre-process the entire file # sub finish_formatting # CODE SECTION 6: Process line-by-line # sub process_all_lines # CODE SECTION 7: Process lines of code # process_line_of_CODE # CODE SECTION 8: Utilities for setting breakpoints # sub set_forced_breakpoint # CODE SECTION 9: Process batches of code # sub grind_batch_of_CODE # CODE SECTION 10: Code to break long statements # sub break_long_lines # CODE SECTION 11: Code to break long lists # sub break_lists # CODE SECTION 12: Code for setting indentation # CODE SECTION 13: Preparing batch of lines for vertical alignment # sub convey_batch_to_vertical_aligner # CODE SECTION 14: Code for creating closing side comments # sub add_closing_side_comment # CODE SECTION 15: Summarize # sub wrapup ####################################################################### # CODE SECTION 1: Preliminary code and global definitions up to sub new ####################################################################### package Perl::Tidy::Formatter; use strict; use warnings; # DEVEL_MODE gets switched on during automated testing for extra checking use constant DEVEL_MODE => 0; use constant EMPTY_STRING => q{}; use constant SPACE => q{ }; { #<<< A non-indenting brace to contain 
all lexical variables use Carp; use English qw( -no_match_vars ); use List::Util qw( min max ); # min, max are in Perl 5.8 our $VERSION = '20230309'; # The Tokenizer will be loaded with the Formatter ##use Perl::Tidy::Tokenizer; # for is_keyword() sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR <_decrement_count(); return; } sub Die { my ($msg) = @_; Perl::Tidy::Die($msg); croak "unexpected return from Perl::Tidy::Die"; } sub Warn { my ($msg) = @_; Perl::Tidy::Warn($msg); return; } sub Fault { my ($msg) = @_; # This routine is called for errors that really should not occur # except if there has been a bug introduced by a recent program change. # Please add comments at calls to Fault to explain why the call # should not occur, and where to look to fix it. my ( $package0, $filename0, $line0, $subroutine0 ) = caller(0); my ( $package1, $filename1, $line1, $subroutine1 ) = caller(1); my ( $package2, $filename2, $line2, $subroutine2 ) = caller(2); my $pkg = __PACKAGE__; my $input_stream_name = get_input_stream_name(); Die(< $i++, _CUMULATIVE_LENGTH_ => $i++, _LINE_INDEX_ => $i++, _KNEXT_SEQ_ITEM_ => $i++, _LEVEL_ => $i++, _TOKEN_ => $i++, _TOKEN_LENGTH_ => $i++, _TYPE_ => $i++, _TYPE_SEQUENCE_ => $i++, # Number of token variables; must be last in list: _NVARS => $i++, }; } ## end BEGIN BEGIN { # Index names for $self variables. # Do not combine with other BEGIN blocks (c101). 
my $i = 0; use constant { _rlines_ => $i++, _rLL_ => $i++, _Klimit_ => $i++, _rdepth_of_opening_seqno_ => $i++, _rSS_ => $i++, _Iss_opening_ => $i++, _Iss_closing_ => $i++, _rblock_type_of_seqno_ => $i++, _ris_asub_block_ => $i++, _ris_sub_block_ => $i++, _K_opening_container_ => $i++, _K_closing_container_ => $i++, _K_opening_ternary_ => $i++, _K_closing_ternary_ => $i++, _K_first_seq_item_ => $i++, _rtype_count_by_seqno_ => $i++, _ris_function_call_paren_ => $i++, _rlec_count_by_seqno_ => $i++, _ris_broken_container_ => $i++, _ris_permanently_broken_ => $i++, _rblank_and_comment_count_ => $i++, _rhas_list_ => $i++, _rhas_broken_list_ => $i++, _rhas_broken_list_with_lec_ => $i++, _rfirst_comma_line_index_ => $i++, _rhas_code_block_ => $i++, _rhas_broken_code_block_ => $i++, _rhas_ternary_ => $i++, _ris_excluded_lp_container_ => $i++, _rlp_object_by_seqno_ => $i++, _rwant_reduced_ci_ => $i++, _rno_xci_by_seqno_ => $i++, _rbrace_left_ => $i++, _ris_bli_container_ => $i++, _rparent_of_seqno_ => $i++, _rchildren_of_seqno_ => $i++, _ris_list_by_seqno_ => $i++, _ris_cuddled_closing_brace_ => $i++, _rbreak_container_ => $i++, _rshort_nested_ => $i++, _length_function_ => $i++, _is_encoded_data_ => $i++, _fh_tee_ => $i++, _sink_object_ => $i++, _file_writer_object_ => $i++, _vertical_aligner_object_ => $i++, _logger_object_ => $i++, _radjusted_levels_ => $i++, _this_batch_ => $i++, _ris_special_identifier_token_ => $i++, _last_output_short_opening_token_ => $i++, _last_line_leading_type_ => $i++, _last_line_leading_level_ => $i++, _added_semicolon_count_ => $i++, _first_added_semicolon_at_ => $i++, _last_added_semicolon_at_ => $i++, _deleted_semicolon_count_ => $i++, _first_deleted_semicolon_at_ => $i++, _last_deleted_semicolon_at_ => $i++, _embedded_tab_count_ => $i++, _first_embedded_tab_at_ => $i++, _last_embedded_tab_at_ => $i++, _first_tabbing_disagreement_ => $i++, _last_tabbing_disagreement_ => $i++, _tabbing_disagreement_count_ => $i++, _in_tabbing_disagreement_ 
=> $i++, _first_brace_tabbing_disagreement_ => $i++, _in_brace_tabbing_disagreement_ => $i++, _saw_VERSION_in_this_file_ => $i++, _saw_END_or_DATA_ => $i++, _rK_weld_left_ => $i++, _rK_weld_right_ => $i++, _rweld_len_right_at_K_ => $i++, _rspecial_side_comment_type_ => $i++, _rseqno_controlling_my_ci_ => $i++, _ris_seqno_controlling_ci_ => $i++, _save_logfile_ => $i++, _maximum_level_ => $i++, _maximum_level_at_line_ => $i++, _maximum_BLOCK_level_ => $i++, _maximum_BLOCK_level_at_line_ => $i++, _rKrange_code_without_comments_ => $i++, _rbreak_before_Kfirst_ => $i++, _rbreak_after_Klast_ => $i++, _converged_ => $i++, _rstarting_multiline_qw_seqno_by_K_ => $i++, _rending_multiline_qw_seqno_by_K_ => $i++, _rKrange_multiline_qw_by_seqno_ => $i++, _rmultiline_qw_has_extra_level_ => $i++, _rcollapsed_length_by_seqno_ => $i++, _rbreak_before_container_by_seqno_ => $i++, _roverride_cab3_ => $i++, _ris_assigned_structure_ => $i++, _ris_short_broken_eval_block_ => $i++, _ris_bare_trailing_comma_by_seqno_ => $i++, _rseqno_non_indenting_brace_by_ix_ => $i++, _rmax_vertical_tightness_ => $i++, _no_vertical_tightness_flags_ => $i++, _LAST_SELF_INDEX_ => $i - 1, }; } ## end BEGIN BEGIN { # Index names for batch variables. # Do not combine with other BEGIN blocks (c101). # These are stored in _this_batch_, which is a sub-array of $self. my $i = 0; use constant { _starting_in_quote_ => $i++, _ending_in_quote_ => $i++, _is_static_block_comment_ => $i++, _ri_first_ => $i++, _ri_last_ => $i++, _do_not_pad_ => $i++, _peak_batch_size_ => $i++, _batch_count_ => $i++, _rix_seqno_controlling_ci_ => $i++, _batch_CODE_type_ => $i++, _ri_starting_one_line_block_ => $i++, _runmatched_opening_indexes_ => $i++, }; } ## end BEGIN BEGIN { # Sequence number assigned to the root of sequence tree. 
    # The minimum of the actual sequence numbers is 4, so we can use 1
    use constant SEQ_ROOT => 1;

    # Codes for insertion and deletion of blanks
    use constant DELETE => 0;
    use constant STABLE => 1;
    use constant INSERT => 2;

    # whitespace codes
    use constant WS_YES => 1;
    use constant WS_OPTIONAL => 0;
    use constant WS_NO => -1;

    # Token bond strengths.
    use constant NO_BREAK => 10_000;
    use constant VERY_STRONG => 100;
    use constant STRONG => 2.1;
    use constant NOMINAL => 1.1;
    use constant WEAK => 0.8;
    use constant VERY_WEAK => 0.55;

    # values for testing indexes in output array
    use constant UNDEFINED_INDEX => -1;

    # Maximum number of little messages; probably need not be changed.
    use constant MAX_NAG_MESSAGES => 6;

    # This is the decimal range of printable characters in ASCII. It is used to
    # make quick preliminary checks before resorting to using a regex.
    use constant ORD_PRINTABLE_MIN => 33;
    use constant ORD_PRINTABLE_MAX => 126;

    # Initialize constant hashes ...
    my @q;

    @q = qw(
      = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^=
      x=
    );
    @is_assignment{@q} = (1) x scalar(@q);

    # a hash needed by break_lists for efficiency:
    push @q, qw{ ; < > ~ f };
    @is_non_list_type{@q} = (1) x scalar(@q);

    @q = qw(is if unless and or err last next redo return);
    @is_if_unless_and_or_last_next_redo_return{@q} = (1) x scalar(@q);

    # These block types may have text between the keyword and opening
    # curly. Note: 'else' does not, but must be included to allow trailing
    # if/elsif text to be appended.
    # patch for SWITCH/CASE: added 'case' and 'when'
    @q = qw(if elsif else unless while until for foreach case when catch);
    @is_if_elsif_else_unless_while_until_for_foreach{@q} = (1) x scalar(@q);

    @q = qw(if unless while until for foreach);
    @is_if_unless_while_until_for_foreach{@q} = (1) x scalar(@q);

    @q = qw(last next redo return);
    @is_last_next_redo_return{@q} = (1) x scalar(@q);

    # Map related block names into a common name to allow vertical alignment
    # used by sub make_alignment_patterns.
Note: this is normally unchanged, # but it contains 'grep' and can be re-initialized in # sub initialize_grep_and_friends in a testing mode. %block_type_map = ( 'unless' => 'if', 'else' => 'if', 'elsif' => 'if', 'when' => 'if', 'default' => 'if', 'case' => 'if', 'sort' => 'map', 'grep' => 'map', ); @q = qw(if unless); @is_if_unless{@q} = (1) x scalar(@q); @q = qw(if elsif); @is_if_elsif{@q} = (1) x scalar(@q); @q = qw(if unless elsif); @is_if_unless_elsif{@q} = (1) x scalar(@q); @q = qw(if unless elsif else); @is_if_unless_elsif_else{@q} = (1) x scalar(@q); @q = qw(elsif else); @is_elsif_else{@q} = (1) x scalar(@q); @q = qw(and or err); @is_and_or{@q} = (1) x scalar(@q); # Identify certain operators which often occur in chains. # Note: the minus (-) causes a side effect of padding of the first line in # something like this (by sub set_logical_padding): # Checkbutton => 'Transmission checked', # -variable => \$TRANS # This usually improves appearance so it seems ok. @q = qw(&& || and or : ? . + - * /); @is_chain_operator{@q} = (1) x scalar(@q); # Operators that the user can request break before or after. # Note that some are keywords @all_operators = qw(% + - * / x != == >= <= =~ !~ < > | & = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x= . : ? && || and or err xor ); # We can remove semicolons after blocks preceded by these keywords @q = qw(BEGIN END CHECK INIT AUTOLOAD DESTROY UNITCHECK continue if elsif else unless while until for foreach given when default); @is_block_without_semicolon{@q} = (1) x scalar(@q); # We will allow semicolons to be added within these block types # as well as sub and package blocks. # NOTES: # 1. Note that these keywords are omitted: # switch case given when default sort map grep # 2. It is also ok to add for sub and package blocks and a labeled block # 3. But not okay for other perltidy types including: # { } ; G t # 4. 
Test files: blktype.t, blktype1.t, semicolon.t @q = qw( BEGIN END CHECK INIT AUTOLOAD DESTROY UNITCHECK continue if elsif else unless do while until eval for foreach ); @ok_to_add_semicolon_for_block_type{@q} = (1) x scalar(@q); # 'L' is token for opening { at hash key @q = qw< L { ( [ >; @is_opening_type{@q} = (1) x scalar(@q); # 'R' is token for closing } at hash key @q = qw< R } ) ] >; @is_closing_type{@q} = (1) x scalar(@q); @q = qw< { ( [ >; @is_opening_token{@q} = (1) x scalar(@q); @q = qw< } ) ] >; @is_closing_token{@q} = (1) x scalar(@q); @q = qw( ? : ); @is_ternary{@q} = (1) x scalar(@q); @q = qw< { ( [ ? >; @is_opening_sequence_token{@q} = (1) x scalar(@q); @q = qw< } ) ] : >; @is_closing_sequence_token{@q} = (1) x scalar(@q); %matching_token = ( '{' => '}', '(' => ')', '[' => ']', '?' => ':', '}' => '{', ')' => '(', ']' => '[', ':' => '?', ); # a hash needed by sub break_lists for labeling containers @q = qw( k => && || ? : . ); @is_container_label_type{@q} = (1) x scalar(@q); @q = qw( die confess croak warn ); @is_die_confess_croak_warn{@q} = (1) x scalar(@q); @q = qw( my our local ); @is_my_our_local{@q} = (1) x scalar(@q); # Braces -bbht etc must follow these. Note: experimentation with # including a simple comma shows that it adds little and can lead # to poor formatting in complex lists. @q = qw( = => ); @is_equal_or_fat_comma{@q} = (1) x scalar(@q); @q = qw( => ; h f ); push @q, ','; @is_counted_type{@q} = (1) x scalar(@q); # Tokens where --keep-old-break-xxx flags make soft breaks instead # of hard breaks. See b1433 and b1436. # NOTE: $type is used as the hash key for now; if other container tokens # are added it might be necessary to use a token/type mixture. @q = qw# -> ? 
: && || + - / * #; @is_soft_keep_break_type{@q} = (1) x scalar(@q); # these functions allow an identifier in the indirect object slot @q = qw( print printf sort exec system say); @is_indirect_object_taker{@q} = (1) x scalar(@q); # Define here tokens which may follow the closing brace of a do statement # on the same line, as in: # } while ( $something); my @dof = qw(until while unless if ; : ); push @dof, ','; @is_do_follower{@dof} = (1) x scalar(@dof); # what can follow a multi-line anonymous sub definition closing curly: my @asf = qw# ; : => or and && || ~~ !~~ ) #; push @asf, ','; @is_anon_sub_brace_follower{@asf} = (1) x scalar(@asf); # what can follow a one-line anonymous sub closing curly: # one-line anonymous subs also have ']' here... # see tk3.t and PP.pm my @asf1 = qw# ; : => or and && || ) ] ~~ !~~ #; push @asf1, ','; @is_anon_sub_1_brace_follower{@asf1} = (1) x scalar(@asf1); # What can follow a closing curly of a block # which is not an if/elsif/else/do/sort/map/grep/eval/sub # Testfiles: 'Toolbar.pm', 'Menubar.pm', bless.t, '3rules.pl' my @obf = qw# ; : => or and && || ) #; push @obf, ','; @is_other_brace_follower{@obf} = (1) x scalar(@obf); } ## end BEGIN { ## begin closure to count instances # methods to count instances my $_count = 0; sub get_count { return $_count; } sub _increment_count { return ++$_count } sub _decrement_count { return --$_count } } ## end closure to count instances sub new { my ( $class, @args ) = @_; # we are given an object with a write_line() method to take lines my %defaults = ( sink_object => undef, diagnostics_object => undef, logger_object => undef, length_function => sub { return length( $_[0] ) }, is_encoded_data => EMPTY_STRING, fh_tee => undef, ); my %args = ( %defaults, @args ); my $length_function = $args{length_function}; my $is_encoded_data = $args{is_encoded_data}; my $fh_tee = $args{fh_tee}; my $logger_object = $args{logger_object}; my $diagnostics_object = $args{diagnostics_object}; # we create another object 
    # with a get_line() and peek_ahead() method
    my $sink_object = $args{sink_object};
    my $file_writer_object =
      Perl::Tidy::FileWriter->new( $sink_object, $rOpts, $logger_object );

    # initialize closure variables...
    set_logger_object($logger_object);
    set_diagnostics_object($diagnostics_object);
    initialize_lp_vars();
    initialize_csc_vars();
    initialize_break_lists();
    initialize_undo_ci();
    initialize_process_line_of_CODE();
    initialize_grind_batch_of_CODE();
    initialize_get_final_indentation();
    initialize_postponed_breakpoint();
    initialize_batch_variables();
    initialize_write_line();

    my $vertical_aligner_object = Perl::Tidy::VerticalAligner->new(
        rOpts => $rOpts,
        file_writer_object => $file_writer_object,
        logger_object => $logger_object,
        diagnostics_object => $diagnostics_object,
        length_function => $length_function,
    );

    write_logfile_entry("\nStarting tokenization pass...\n");

    if ( $rOpts->{'entab-leading-whitespace'} ) {
        write_logfile_entry(
"Leading whitespace will be entabbed with $rOpts->{'entab-leading-whitespace'} spaces per tab\n"
        );
    }
    elsif ( $rOpts->{'tabs'} ) {
        write_logfile_entry("Indentation will be with a tab character\n");
    }
    else {
        write_logfile_entry(
            "Indentation will be with $rOpts->{'indent-columns'} spaces\n");
    }

    # Initialize the $self array reference.
    # To add an item, first add a constant index in the BEGIN block above.
    my $self = [];

    # Basic data structures...
    $self->[_rlines_] = [];    # = ref to array of lines of the file

    # 'rLL' = reference to the continuous linear array of all tokens in a file.
    # 'LL' stands for 'Linked List'. Using a linked list was a disaster, but
    # 'LL' stuck because it is easy to type. The 'rLL' array is updated
    # by sub 'respace_tokens' during reformatting. The indexes in 'rLL' begin
    # with '$K' by convention.
    $self->[_rLL_] = [];
    $self->[_Klimit_] = undef;    # = maximum K index for rLL.
# Indexes into the rLL list $self->[_K_opening_container_] = {}; $self->[_K_closing_container_] = {}; $self->[_K_opening_ternary_] = {}; $self->[_K_closing_ternary_] = {}; $self->[_K_first_seq_item_] = undef; # K of first token with a sequence # # 'rSS' is the 'Signed Sequence' list, a continuous list of all sequence # numbers with + or - indicating opening or closing. This list represents # the entire container tree and is invariant under reformatting. It can be # used to quickly travel through the tree. Indexes in the rSS array begin # with '$I' by convention. The 'Iss' arrays give the indexes in this list # of opening and closing sequence numbers. $self->[_rSS_] = []; $self->[_Iss_opening_] = []; $self->[_Iss_closing_] = []; # Arrays to help traverse the tree $self->[_rdepth_of_opening_seqno_] = []; $self->[_rblock_type_of_seqno_] = {}; $self->[_ris_asub_block_] = {}; $self->[_ris_sub_block_] = {}; # Mostly list characteristics and processing flags $self->[_rtype_count_by_seqno_] = {}; $self->[_ris_function_call_paren_] = {}; $self->[_rlec_count_by_seqno_] = {}; $self->[_ris_broken_container_] = {}; $self->[_ris_permanently_broken_] = {}; $self->[_rblank_and_comment_count_] = {}; $self->[_rhas_list_] = {}; $self->[_rhas_broken_list_] = {}; $self->[_rhas_broken_list_with_lec_] = {}; $self->[_rfirst_comma_line_index_] = {}; $self->[_rhas_code_block_] = {}; $self->[_rhas_broken_code_block_] = {}; $self->[_rhas_ternary_] = {}; $self->[_ris_excluded_lp_container_] = {}; $self->[_rlp_object_by_seqno_] = {}; $self->[_rwant_reduced_ci_] = {}; $self->[_rno_xci_by_seqno_] = {}; $self->[_rbrace_left_] = {}; $self->[_ris_bli_container_] = {}; $self->[_rparent_of_seqno_] = {}; $self->[_rchildren_of_seqno_] = {}; $self->[_ris_list_by_seqno_] = {}; $self->[_ris_cuddled_closing_brace_] = {}; $self->[_rbreak_container_] = {}; # prevent one-line blocks $self->[_rshort_nested_] = {}; # blocks not forced open $self->[_length_function_] = $length_function; $self->[_is_encoded_data_] 
= $is_encoded_data; # Some objects... $self->[_fh_tee_] = $fh_tee; $self->[_sink_object_] = $sink_object; $self->[_file_writer_object_] = $file_writer_object; $self->[_vertical_aligner_object_] = $vertical_aligner_object; $self->[_logger_object_] = $logger_object; # Reference to the batch being processed $self->[_this_batch_] = []; # Memory of processed text... $self->[_ris_special_identifier_token_] = {}; $self->[_last_line_leading_level_] = 0; $self->[_last_line_leading_type_] = '#'; $self->[_last_output_short_opening_token_] = 0; $self->[_added_semicolon_count_] = 0; $self->[_first_added_semicolon_at_] = 0; $self->[_last_added_semicolon_at_] = 0; $self->[_deleted_semicolon_count_] = 0; $self->[_first_deleted_semicolon_at_] = 0; $self->[_last_deleted_semicolon_at_] = 0; $self->[_embedded_tab_count_] = 0; $self->[_first_embedded_tab_at_] = 0; $self->[_last_embedded_tab_at_] = 0; $self->[_first_tabbing_disagreement_] = 0; $self->[_last_tabbing_disagreement_] = 0; $self->[_tabbing_disagreement_count_] = 0; $self->[_in_tabbing_disagreement_] = 0; $self->[_saw_VERSION_in_this_file_] = !$rOpts->{'pass-version-line'}; $self->[_saw_END_or_DATA_] = 0; $self->[_first_brace_tabbing_disagreement_] = undef; $self->[_in_brace_tabbing_disagreement_] = undef; # Hashes related to container welding... 
$self->[_radjusted_levels_] = []; # Weld data structures $self->[_rK_weld_left_] = {}; $self->[_rK_weld_right_] = {}; $self->[_rweld_len_right_at_K_] = {}; # -xci stuff $self->[_rseqno_controlling_my_ci_] = {}; $self->[_ris_seqno_controlling_ci_] = {}; $self->[_rspecial_side_comment_type_] = {}; $self->[_maximum_level_] = 0; $self->[_maximum_level_at_line_] = 0; $self->[_maximum_BLOCK_level_] = 0; $self->[_maximum_BLOCK_level_at_line_] = 0; $self->[_rKrange_code_without_comments_] = []; $self->[_rbreak_before_Kfirst_] = {}; $self->[_rbreak_after_Klast_] = {}; $self->[_converged_] = 0; # qw stuff $self->[_rstarting_multiline_qw_seqno_by_K_] = {}; $self->[_rending_multiline_qw_seqno_by_K_] = {}; $self->[_rKrange_multiline_qw_by_seqno_] = {}; $self->[_rmultiline_qw_has_extra_level_] = {}; $self->[_rcollapsed_length_by_seqno_] = {}; $self->[_rbreak_before_container_by_seqno_] = {}; $self->[_roverride_cab3_] = {}; $self->[_ris_assigned_structure_] = {}; $self->[_ris_short_broken_eval_block_] = {}; $self->[_ris_bare_trailing_comma_by_seqno_] = {}; $self->[_rseqno_non_indenting_brace_by_ix_] = {}; $self->[_rmax_vertical_tightness_] = {}; $self->[_no_vertical_tightness_flags_] = 0; # This flag will be updated later by a call to get_save_logfile() $self->[_save_logfile_] = defined($logger_object); # Be sure all variables in $self have been initialized above. To find the # correspondence of index numbers and array names, copy a list to a file # and use the unix 'nl' command to number lines 1.. if (DEVEL_MODE) { my @non_existant; foreach ( 0 .. 
_LAST_SELF_INDEX_ ) { if ( !exists( $self->[$_] ) ) { push @non_existant, $_; } } if (@non_existant) { Fault("These indexes in self not initialized: (@non_existant)\n"); } } bless $self, $class; # Safety check..this is not a class yet if ( _increment_count() > 1 ) { confess "Attempt to create more than 1 object in $class, which is not a true class yet\n"; } return $self; } ## end sub new ###################################### # CODE SECTION 2: Some Basic Utilities ###################################### sub check_rLL { # Verify that the rLL array has not been auto-vivified my ( $self, $msg ) = @_; my $rLL = $self->[_rLL_]; my $Klimit = $self->[_Klimit_]; my $num = @{$rLL}; if ( ( defined($Klimit) && $Klimit != $num - 1 ) || ( !defined($Klimit) && $num > 0 ) ) { # This fault can occur if the array has been accessed for an index # greater than $Klimit, which is the last token index. Just accessing # the array above index $Klimit, not setting a value, can cause @rLL to # increase beyond $Klimit. If this occurs, the problem can be located # by making calls to this routine at different locations in # sub 'finish_formatting'. 
$Klimit = 'undef' if ( !defined($Klimit) ); $msg = EMPTY_STRING unless $msg; Fault("$msg ERROR: rLL has num=$num but Klimit='$Klimit'\n"); } return; } ## end sub check_rLL sub check_keys { my ( $rtest, $rvalid, $msg, $exact_match ) = @_; # Check the keys of a hash: # $rtest = ref to hash to test # $rvalid = ref to hash with valid keys # $msg = a message to write in case of error # $exact_match defines the type of check: # = false: test hash must not have unknown key # = true: test hash must have exactly same keys as known hash my @unknown_keys = grep { !exists $rvalid->{$_} } keys %{$rtest}; my @missing_keys = grep { !exists $rtest->{$_} } keys %{$rvalid}; my $error = @unknown_keys; if ($exact_match) { $error ||= @missing_keys } if ($error) { local $LIST_SEPARATOR = ')('; my @expected_keys = sort keys %{$rvalid}; @unknown_keys = sort @unknown_keys; Fault(<[_rLL_]; foreach my $KK ( 0 .. @{$rLL} - 1 ) { my $nvars = @{ $rLL->[$KK] }; if ( $nvars != _NVARS ) { my $NVARS = _NVARS; my $type = $rLL->[$KK]->[_TYPE_]; $type = '*' unless defined($type); # The number of variables per token node is _NVARS and was set when # the array indexes were generated. So if the number of variables # is different we have done something wrong, like not store all of # them in sub 'write_line' when they were received from the # tokenizer. Fault( "number of vars for node $KK, type '$type', is $nvars but should be $NVARS" ); } foreach my $var ( _TOKEN_, _TYPE_ ) { if ( !defined( $rLL->[$KK]->[$var] ) ) { my $iline = $rLL->[$KK]->[_LINE_INDEX_]; # This is a simple check that each token has some basic # variables. In other words, that there are no holes in the # array of tokens. Sub 'write_line' pushes tokens into the # $rLL array, so this should guarantee no gaps. 
Fault("Undefined variable $var for K=$KK, line=$iline\n"); } } } return; } ## end sub check_token_array { ## begin closure check_line_hashes # This code checks that no autovivification occurs in the 'line' hash my %valid_line_hash; BEGIN { # These keys are defined for each line in the formatter # Each line must have exactly these quantities my @valid_line_keys = qw( _curly_brace_depth _ending_in_quote _guessed_indentation_level _line_number _line_text _line_type _paren_depth _quote_character _rK_range _square_bracket_depth _starting_in_quote _ended_in_blank_token _code_type _ci_level_0 _level_0 _nesting_blocks_0 _nesting_tokens_0 ); @valid_line_hash{@valid_line_keys} = (1) x scalar(@valid_line_keys); } ## end BEGIN sub check_line_hashes { my $self = shift; my $rlines = $self->[_rlines_]; foreach my $rline ( @{$rlines} ) { my $iline = $rline->{_line_number}; my $line_type = $rline->{_line_type}; check_keys( $rline, \%valid_line_hash, "Checkpoint: line number =$iline, line_type=$line_type", 1 ); } return; } ## end sub check_line_hashes } ## end closure check_line_hashes { ## begin closure for logger routines my $logger_object; # Called once per file to initialize the logger object sub set_logger_object { $logger_object = shift; return; } sub get_logger_object { return $logger_object; } sub get_input_stream_name { my $input_stream_name = EMPTY_STRING; if ($logger_object) { $input_stream_name = $logger_object->get_input_stream_name(); } return $input_stream_name; } ## end sub get_input_stream_name # interface to Perl::Tidy::Logger routines sub warning { my ($msg) = @_; if ($logger_object) { $logger_object->warning($msg); } return; } sub complain { my ($msg) = @_; if ($logger_object) { $logger_object->complain($msg); } return; } ## end sub complain sub write_logfile_entry { my @msg = @_; if ($logger_object) { $logger_object->write_logfile_entry(@msg); } return; } ## end sub write_logfile_entry sub get_saw_brace_error { if ($logger_object) { return 
$logger_object->get_saw_brace_error(); } return; } ## end sub get_saw_brace_error sub we_are_at_the_last_line { if ($logger_object) { $logger_object->we_are_at_the_last_line(); } return; } ## end sub we_are_at_the_last_line } ## end closure for logger routines { ## begin closure for diagnostics routines my $diagnostics_object; # Called once per file to initialize the diagnostics object sub set_diagnostics_object { $diagnostics_object = shift; return; } sub write_diagnostics { my ($msg) = @_; if ($diagnostics_object) { $diagnostics_object->write_diagnostics($msg); } return; } ## end sub write_diagnostics } ## end closure for diagnostics routines sub get_convergence_check { my ($self) = @_; return $self->[_converged_]; } sub get_output_line_number { my ($self) = @_; my $vao = $self->[_vertical_aligner_object_]; return $vao->get_output_line_number(); } sub want_blank_line { my $self = shift; $self->flush(); my $file_writer_object = $self->[_file_writer_object_]; $file_writer_object->want_blank_line(); return; } ## end sub want_blank_line sub write_unindented_line { my ( $self, $line ) = @_; $self->flush(); my $file_writer_object = $self->[_file_writer_object_]; $file_writer_object->write_line($line); return; } ## end sub write_unindented_line sub consecutive_nonblank_lines { my ($self) = @_; my $file_writer_object = $self->[_file_writer_object_]; my $vao = $self->[_vertical_aligner_object_]; return $file_writer_object->get_consecutive_nonblank_lines() + $vao->get_cached_line_count(); } ## end sub consecutive_nonblank_lines sub split_words { # given a string containing words separated by whitespace, # return the list of words my ($str) = @_; return unless $str; $str =~ s/\s+$//; $str =~ s/^\s+//; return split( /\s+/, $str ); } ## end sub split_words ########################################### # CODE SECTION 3: Check and process options ########################################### sub check_options { # This routine is called to check the user-supplied run parameters # 
and to configure the control hashes to them. $rOpts = shift; $controlled_comma_style = 0; initialize_whitespace_hashes(); initialize_bond_strength_hashes(); # This function must be called early to get hashes with grep initialized initialize_grep_and_friends(); # Make needed regex patterns for matching text. # NOTE: sub_matching_patterns must be made first because later patterns use # them; see RT #133130. make_sub_matching_pattern(); # must be first pattern made make_static_block_comment_pattern(); make_static_side_comment_pattern(); make_closing_side_comment_prefix(); make_closing_side_comment_list_pattern(); $format_skipping_pattern_begin = make_format_skipping_pattern( 'format-skipping-begin', '#<<<' ); $format_skipping_pattern_end = make_format_skipping_pattern( 'format-skipping-end', '#>>>' ); make_non_indenting_brace_pattern(); # If closing side comments ARE selected, then we can safely # delete old closing side comments unless closing side comment # warnings are requested. This is a good idea because it will # eliminate any old csc's which fall below the line count threshold. # We cannot do this if warnings are turned on, though, because we # might delete some text which has been added. So that must # be handled when comments are created. And we cannot do this # with -io because -csc will be skipped altogether. if ( $rOpts->{'closing-side-comments'} ) { if ( !$rOpts->{'closing-side-comment-warnings'} && !$rOpts->{'indent-only'} ) { $rOpts->{'delete-closing-side-comments'} = 1; } } # If closing side comments ARE NOT selected, but warnings ARE # selected and we ARE DELETING csc's, then we will pretend to be # adding with a huge interval. This will force the comments to be # generated for comparison with the old comments, but not added. 
elsif ( $rOpts->{'closing-side-comment-warnings'} ) { if ( $rOpts->{'delete-closing-side-comments'} ) { $rOpts->{'delete-closing-side-comments'} = 0; $rOpts->{'closing-side-comments'} = 1; $rOpts->{'closing-side-comment-interval'} = 100_000_000; } } make_bli_pattern(); make_bl_pattern(); make_block_brace_vertical_tightness_pattern(); make_blank_line_pattern(); make_keyword_group_list_pattern(); prepare_cuddled_block_types(); if ( $rOpts->{'dump-cuddled-block-list'} ) { dump_cuddled_block_list(*STDOUT); Exit(0); } # -xlp implies -lp if ( $rOpts->{'extended-line-up-parentheses'} ) { $rOpts->{'line-up-parentheses'} ||= 1; } if ( $rOpts->{'line-up-parentheses'} ) { if ( $rOpts->{'indent-only'} || !$rOpts->{'add-newlines'} || !$rOpts->{'delete-old-newlines'} ) { Warn(<{'line-up-parentheses'} = 0; $rOpts->{'extended-line-up-parentheses'} = 0; } if ( $rOpts->{'whitespace-cycle'} ) { Warn(<{'whitespace-cycle'} = 0; } } # At present, tabs are not compatible with the line-up-parentheses style # (it would be possible to entab the total leading whitespace # just prior to writing the line, if desired). if ( $rOpts->{'line-up-parentheses'} && $rOpts->{'tabs'} ) { Warn(<{'tabs'} = 0; } # Likewise, tabs are not compatible with outdenting.. if ( $rOpts->{'outdent-keywords'} && $rOpts->{'tabs'} ) { Warn(<{'tabs'} = 0; } if ( $rOpts->{'outdent-labels'} && $rOpts->{'tabs'} ) { Warn(<{'tabs'} = 0; } if ( !$rOpts->{'space-for-semicolon'} ) { $want_left_space{'f'} = -1; } if ( $rOpts->{'space-terminal-semicolon'} ) { $want_left_space{';'} = 1; } # We should put an upper bound on any -sil=n value. Otherwise enormous # files could be created by mistake. 
for ( $rOpts->{'starting-indentation-level'} ) { if ( $_ && $_ > 100 ) { Warn(< 0 to avoid future parsing problems (issue c147) for ( $rOpts->{'minimum-space-to-comment'} ) { if ( !$_ || $_ <= 0 ) { $_ = 1 } } # implement outdenting preferences for keywords %outdent_keyword = (); my @okw = split_words( $rOpts->{'outdent-keyword-list'} ); unless (@okw) { @okw = qw(next last redo goto return); # defaults } # FUTURE: if not a keyword, assume that it is an identifier foreach (@okw) { if ( $Perl::Tidy::Tokenizer::is_keyword{$_} ) { $outdent_keyword{$_} = 1; } else { Warn("ignoring '$_' in -okwl list; not a perl keyword"); } } # setup hash for -kpit option %keyword_paren_inner_tightness = (); my $kpit_value = $rOpts->{'keyword-paren-inner-tightness'}; if ( defined($kpit_value) && $kpit_value != 1 ) { my @kpit = split_words( $rOpts->{'keyword-paren-inner-tightness-list'} ); unless (@kpit) { @kpit = qw(if elsif unless while until for foreach); # defaults } # we will allow keywords and user-defined identifiers foreach (@kpit) { $keyword_paren_inner_tightness{$_} = $kpit_value; } } # implement user whitespace preferences if ( my @q = split_words( $rOpts->{'want-left-space'} ) ) { @want_left_space{@q} = (1) x scalar(@q); } if ( my @q = split_words( $rOpts->{'want-right-space'} ) ) { @want_right_space{@q} = (1) x scalar(@q); } if ( my @q = split_words( $rOpts->{'nowant-left-space'} ) ) { @want_left_space{@q} = (-1) x scalar(@q); } if ( my @q = split_words( $rOpts->{'nowant-right-space'} ) ) { @want_right_space{@q} = (-1) x scalar(@q); } if ( $rOpts->{'dump-want-left-space'} ) { dump_want_left_space(*STDOUT); Exit(0); } if ( $rOpts->{'dump-want-right-space'} ) { dump_want_right_space(*STDOUT); Exit(0); } initialize_space_after_keyword(); initialize_token_break_preferences(); #-------------------------------------------------------------- # The combination -lp -iob -vmll -bbx=2 can be unstable (b1266) #-------------------------------------------------------------- # The -vmll 
and -lp parameters do not really work well together. # To avoid instabilities, we will change any -bbx=2 to -bbx=1 (stable). # NOTE: we could make this more precise by looking at any exclusion # flags for -lp, and allowing -bbx=2 for excluded types. if ( $rOpts->{'variable-maximum-line-length'} && $rOpts->{'ignore-old-breakpoints'} && $rOpts->{'line-up-parentheses'} ) { my @changed; foreach my $key ( keys %break_before_container_types ) { if ( $break_before_container_types{$key} == 2 ) { $break_before_container_types{$key} = 1; push @changed, $key; } } if (@changed) { # we could write a warning here } } #----------------------------------------------------------- # The combination -lp -vmll can be unstable if -ci<2 (b1267) #----------------------------------------------------------- # The -vmll and -lp parameters do not really work well together. # This is a very crude fix for an unusual parameter combination. if ( $rOpts->{'variable-maximum-line-length'} && $rOpts->{'line-up-parentheses'} && $rOpts->{'continuation-indentation'} < 2 ) { $rOpts->{'continuation-indentation'} = 2; ##Warn("Increased -ci=n to n=2 for stability with -lp and -vmll\n"); } #----------------------------------------------------------- # The combination -lp -vmll -atc -dtc can be unstable #----------------------------------------------------------- # This fixes b1386 b1387 b1388 which had -wtc='b' # Updated to include any -wtc to fix b1426 if ( $rOpts->{'variable-maximum-line-length'} && $rOpts->{'line-up-parentheses'} && $rOpts->{'add-trailing-commas'} && $rOpts->{'delete-trailing-commas'} && $rOpts->{'want-trailing-commas'} ) { $rOpts->{'delete-trailing-commas'} = 0; ## Issuing a warning message causes trouble with test cases, and this combo is ## so rare that it is unlikely to occur in practice. So skip warning.
## Warn( ##"The combination -vmll -lp -atc -dtc can be unstable; turning off -dtc\n" ## ); } %container_indentation_options = (); foreach my $pair ( [ 'break-before-hash-brace-and-indent', '{' ], [ 'break-before-square-bracket-and-indent', '[' ], [ 'break-before-paren-and-indent', '(' ], ) { my ( $key, $tok ) = @{$pair}; my $opt = $rOpts->{$key}; if ( defined($opt) && $opt > 0 && $break_before_container_types{$tok} ) { # (1) -lp is not compatible with opt=2, silently set to opt=0 # (2) opt=0 and 2 give same result if -i=-ci; but opt=0 is faster # (3) set opt=0 if -i < -ci (can be unstable, case b1355) if ( $opt == 2 ) { if ( $rOpts->{'line-up-parentheses'} || ( $rOpts->{'indent-columns'} <= $rOpts->{'continuation-indentation'} ) ) { $opt = 0; } } $container_indentation_options{$tok} = $opt; } } $right_bond_strength{'{'} = WEAK; $left_bond_strength{'{'} = VERY_STRONG; # make -l=0 equal to -l=infinite if ( !$rOpts->{'maximum-line-length'} ) { $rOpts->{'maximum-line-length'} = 1_000_000; } # make -lbl=0 equal to -lbl=infinite if ( !$rOpts->{'long-block-line-count'} ) { $rOpts->{'long-block-line-count'} = 1_000_000; } # hashes used to simplify setting whitespace %tightness = ( '{' => $rOpts->{'brace-tightness'}, '}' => $rOpts->{'brace-tightness'}, '(' => $rOpts->{'paren-tightness'}, ')' => $rOpts->{'paren-tightness'}, '[' => $rOpts->{'square-bracket-tightness'}, ']' => $rOpts->{'square-bracket-tightness'}, ); if ( $rOpts->{'ignore-old-breakpoints'} ) { my @conflicts; if ( $rOpts->{'break-at-old-method-breakpoints'} ) { $rOpts->{'break-at-old-method-breakpoints'} = 0; push @conflicts, '--break-at-old-method-breakpoints (-bom)'; } if ( $rOpts->{'break-at-old-comma-breakpoints'} ) { $rOpts->{'break-at-old-comma-breakpoints'} = 0; push @conflicts, '--break-at-old-comma-breakpoints (-boc)'; } if ( $rOpts->{'break-at-old-semicolon-breakpoints'} ) { $rOpts->{'break-at-old-semicolon-breakpoints'} = 0; push @conflicts, '--break-at-old-semicolon-breakpoints (-bos)'; } if ( 
$rOpts->{'keep-old-breakpoints-before'} ) { $rOpts->{'keep-old-breakpoints-before'} = EMPTY_STRING; push @conflicts, '--keep-old-breakpoints-before (-kbb)'; } if ( $rOpts->{'keep-old-breakpoints-after'} ) { $rOpts->{'keep-old-breakpoints-after'} = EMPTY_STRING; push @conflicts, '--keep-old-breakpoints-after (-kba)'; } if (@conflicts) { my $msg = join( "\n ", " Conflict: These conflicts with --ignore-old-breakpoints (-iob) will be turned off:", @conflicts ) . "\n"; Warn($msg); } # Note: These additional parameters are made inactive by -iob. # They are silently turned off here because they are on by default. # We would generate unexpected warnings if we issued a warning. $rOpts->{'break-at-old-keyword-breakpoints'} = 0; $rOpts->{'break-at-old-logical-breakpoints'} = 0; $rOpts->{'break-at-old-ternary-breakpoints'} = 0; $rOpts->{'break-at-old-attribute-breakpoints'} = 0; } %keep_break_before_type = (); initialize_keep_old_breakpoints( $rOpts->{'keep-old-breakpoints-before'}, 'kbb', \%keep_break_before_type ); %keep_break_after_type = (); initialize_keep_old_breakpoints( $rOpts->{'keep-old-breakpoints-after'}, 'kba', \%keep_break_after_type ); # Modify %keep_break_before_type and %keep_break_after_type to avoid conflicts # with %want_break_before; fixes b1436. # This became necessary after breaks for some tokens were converted # from hard to soft (see b1433). # We could do this for all tokens, but to minimize changes to existing # code we currently only do this for the soft break tokens.
foreach my $key ( keys %keep_break_before_type ) { if ( defined( $want_break_before{$key} ) && !$want_break_before{$key} && $is_soft_keep_break_type{$key} ) { $keep_break_after_type{$key} = $keep_break_before_type{$key}; delete $keep_break_before_type{$key}; } } foreach my $key ( keys %keep_break_after_type ) { if ( defined( $want_break_before{$key} ) && $want_break_before{$key} && $is_soft_keep_break_type{$key} ) { $keep_break_before_type{$key} = $keep_break_after_type{$key}; delete $keep_break_after_type{$key}; } } $controlled_comma_style ||= $keep_break_before_type{','}; $controlled_comma_style ||= $keep_break_after_type{','}; initialize_global_option_vars(); initialize_line_length_vars(); # after 'initialize_global_option_vars' initialize_trailing_comma_rules(); # after 'initialize_line_length_vars' initialize_weld_nested_exclusion_rules(); initialize_weld_fat_comma_rules(); %line_up_parentheses_control_hash = (); $line_up_parentheses_control_is_lxpl = 1; my $lpxl = $rOpts->{'line-up-parentheses-exclusion-list'}; my $lpil = $rOpts->{'line-up-parentheses-inclusion-list'}; if ( $lpxl && $lpil ) { Warn( <{'line-up-parentheses-exclusion-list'}, 'lpxl' ); } elsif ($lpil) { $line_up_parentheses_control_is_lxpl = 0; initialize_line_up_parentheses_control_hash( $rOpts->{'line-up-parentheses-inclusion-list'}, 'lpil' ); } return; } ## end sub check_options use constant ALIGN_GREP_ALIASES => 0; sub initialize_grep_and_friends { # Initialize or re-initialize hashes with 'grep' and grep aliases. This # must be done after each set of options because new grep aliases may be # used. # re-initialize the hashes ... this is critical! 
%is_sort_map_grep = (); my @q = qw(sort map grep); @is_sort_map_grep{@q} = (1) x scalar(@q); my $olbxl = $rOpts->{'one-line-block-exclusion-list'}; my %is_olb_exclusion_word; if ( defined($olbxl) ) { my @list = split_words($olbxl); if (@list) { @is_olb_exclusion_word{@list} = (1) x scalar(@list); } } # Make the list of block types which may be re-formed into one line. # They will be modified with the grep-alias-list below and # by sub 'prepare_cuddled_block_types'. # Note that it is essential to always re-initialize the hash here: %want_one_line_block = (); if ( !$is_olb_exclusion_word{'*'} ) { foreach (qw(sort map grep eval)) { if ( !$is_olb_exclusion_word{$_} ) { $want_one_line_block{$_} = 1 } } } # Note that any 'grep-alias-list' string has been preprocessed to be a # trimmed, space-separated list. my $str = $rOpts->{'grep-alias-list'}; my @grep_aliases = split /\s+/, $str; if (@grep_aliases) { @{is_sort_map_grep}{@grep_aliases} = (1) x scalar(@grep_aliases); if ( $want_one_line_block{'grep'} ) { @{want_one_line_block}{@grep_aliases} = (1) x scalar(@grep_aliases); } } ##@q = qw(sort map grep eval); %is_sort_map_grep_eval = %is_sort_map_grep; $is_sort_map_grep_eval{'eval'} = 1; ##@q = qw(sort map grep eval do); %is_sort_map_grep_eval_do = %is_sort_map_grep_eval; $is_sort_map_grep_eval_do{'do'} = 1; # These block types can take ci. This is used by the -xci option. # Note that the 'sub' in this list is an anonymous sub. To be more correct # we could remove sub and use ASUB pattern to also handle a # prototype/signature. But that would slow things down and would probably # never be useful. ##@q = qw( do sub eval sort map grep ); %is_block_with_ci = %is_sort_map_grep_eval_do; $is_block_with_ci{'sub'} = 1; %is_keyword_returning_list = (); @q = qw( grep keys map reverse sort split ); push @q, @grep_aliases; @is_keyword_returning_list{@q} = (1) x scalar(@q); # This code enables vertical alignment of grep aliases for testing. 
It has # not been found to be beneficial, so it is off by default. But it is # useful for precise testing of the grep alias coding. if (ALIGN_GREP_ALIASES) { %block_type_map = ( 'unless' => 'if', 'else' => 'if', 'elsif' => 'if', 'when' => 'if', 'default' => 'if', 'case' => 'if', 'sort' => 'map', 'grep' => 'map', ); foreach (@q) { $block_type_map{$_} = 'map' unless ( $_ eq 'map' ); } } return; } ## end sub initialize_grep_and_friends sub initialize_weld_nested_exclusion_rules { %weld_nested_exclusion_rules = (); my $opt_name = 'weld-nested-exclusion-list'; my $str = $rOpts->{$opt_name}; return unless ($str); $str =~ s/^\s+//; $str =~ s/\s+$//; return unless ($str); # There are four container tokens. my %token_keys = ( '(' => '(', '[' => '[', '{' => '{', 'q' => 'q', ); # We are parsing an exclusion list for nested welds. The list is a string # with spaces separating any number of items. Each item consists of three # pieces of information: # # < ^ or . > < k or K > < ( [ { > # The last character is the required container type and must be one of: # ( = paren # [ = square bracket # { = brace # An optional leading position indicator: # ^ means the leading token position in the weld # . 
means a secondary token position in the weld # no position indicator means all positions match # An optional alphanumeric character between the position and container # token selects to which the rule applies: # k = any keyword # K = any non-keyword # f = function call # F = not a function call # w = function or keyword # W = not a function or keyword # no letter means any preceding type matches # Examples: # ^( - the weld must not start with a paren # .( - the second and later tokens may not be parens # ( - no parens in weld # ^K( - exclude a leading paren not preceded by a keyword # .k( - exclude a secondary paren preceded by a keyword # [ { - exclude all brackets and braces my @items = split /\s+/, $str; my $msg1; my $msg2; foreach my $item (@items) { my $item_save = $item; my $tok = chop($item); my $key = $token_keys{$tok}; if ( !defined($key) ) { $msg1 .= " '$item_save'"; next; } if ( !defined( $weld_nested_exclusion_rules{$key} ) ) { $weld_nested_exclusion_rules{$key} = []; } my $rflags = $weld_nested_exclusion_rules{$key}; # A 'q' means do not weld quotes if ( $tok eq 'q' ) { $rflags->[0] = '*'; $rflags->[1] = '*'; next; } my $pos = '*'; my $select = '*'; if ($item) { if ( $item =~ /^([\^\.])?([kKfFwW])?$/ ) { $pos = $1 if ($1); $select = $2 if ($2); } else { $msg1 .= " '$item_save'"; next; } } my $err; if ( $pos eq '^' || $pos eq '*' ) { if ( defined( $rflags->[0] ) && $rflags->[0] ne $select ) { $err = 1; } $rflags->[0] = $select; } if ( $pos eq '.' 
|| $pos eq '*' ) { if ( defined( $rflags->[1] ) && $rflags->[1] ne $select ) { $err = 1; } $rflags->[1] = $select; } if ($err) { $msg2 .= " '$item_save'"; } } if ($msg1) { Warn(<' after an opening paren if ( $rOpts->{'weld-fat-comma'} ) { $weld_fat_comma_rules{'('} = 1 } # This could be generalized in the future by introducing a parameter # -weld-fat-comma-after=str (-wfca=str), where str contains any of: # * { [ ( # to indicate which opening parens may weld to a subsequent '=>' # The flag -wfc would then be equivalent to -wfca='(' # This has not been done because it is not yet clear how useful # this generalization would be. return; } ## end sub initialize_weld_fat_comma_rules sub initialize_line_up_parentheses_control_hash { my ( $str, $opt_name ) = @_; return unless ($str); $str =~ s/^\s+//; $str =~ s/\s+$//; return unless ($str); # The format is space separated items, where each item must consist of a # string with a token type preceded by an optional text token and followed # by an integer: # For example: # W(1 # = (flag1)(key)(flag2), where # flag1 = 'W' # key = '(' # flag2 = '1' my @items = split /\s+/, $str; my $msg1; my $msg2; foreach my $item (@items) { my $item_save = $item; my ( $flag1, $key, $flag2 ); if ( $item =~ /^([^\(\]\{]*)?([\(\{\[])(\d)?$/ ) { $flag1 = $1 if $1; $key = $2 if $2; $flag2 = $3 if $3; } else { $msg1 .= " '$item_save'"; next; } if ( !defined($key) ) { $msg1 .= " '$item_save'"; next; } # Check for valid flag1 if ( !defined($flag1) ) { $flag1 = '*' } elsif ( $flag1 !~ /^[kKfFwW\*]$/ ) { $msg1 .= " '$item_save'"; next; } # Check for valid flag2 # 0 or blank: ignore container contents # 1 all containers with sublists match # 2 all containers with sublists, code blocks or ternary operators match # ... 
this could be extended in the future if ( !defined($flag2) ) { $flag2 = 0 } elsif ( $flag2 !~ /^[012]$/ ) { $msg1 .= " '$item_save'"; next; } if ( !defined( $line_up_parentheses_control_hash{$key} ) ) { $line_up_parentheses_control_hash{$key} = [ $flag1, $flag2 ]; next; } # check for multiple conflicting specifications my $rflags = $line_up_parentheses_control_hash{$key}; my $err; if ( defined( $rflags->[0] ) && $rflags->[0] ne $flag1 ) { $err = 1; $rflags->[0] = $flag1; } if ( defined( $rflags->[1] ) && $rflags->[1] ne $flag2 ) { $err = 1; $rflags->[1] = $flag2; } $msg2 .= " '$item_save'" if ($err); next; } if ($msg1) { Warn(<{'line-up-parentheses'} = EMPTY_STRING; } } return; } ## end sub initialize_line_up_parentheses_control_hash sub initialize_space_after_keyword { # default keywords for which space is introduced before an opening paren # (at present, including them messes up vertical alignment) my @sak = qw(my local our and or xor err eq ne if else elsif until unless while for foreach return switch case given when catch); %space_after_keyword = map { $_ => 1 } @sak; # first remove any or all of these if desired if ( my @q = split_words( $rOpts->{'nospace-after-keyword'} ) ) { # -nsak='*' selects all the above keywords if ( @q == 1 && $q[0] eq '*' ) { @q = keys(%space_after_keyword) } @space_after_keyword{@q} = (0) x scalar(@q); } # then allow user to add to these defaults if ( my @q = split_words( $rOpts->{'space-after-keyword'} ) ) { @space_after_keyword{@q} = (1) x scalar(@q); } return; } ## end sub initialize_space_after_keyword sub initialize_token_break_preferences { # implement user break preferences my $break_after = sub { my @toks = @_; foreach my $tok (@toks) { if ( $tok eq '?' 
) { $tok = ':' } # patch to coordinate ?/: if ( $tok eq ',' ) { $controlled_comma_style = 1 } my $lbs = $left_bond_strength{$tok}; my $rbs = $right_bond_strength{$tok}; if ( defined($lbs) && defined($rbs) && $lbs < $rbs ) { ( $right_bond_strength{$tok}, $left_bond_strength{$tok} ) = ( $lbs, $rbs ); } } return; }; my $break_before = sub { my @toks = @_; foreach my $tok (@toks) { if ( $tok eq ',' ) { $controlled_comma_style = 1 } my $lbs = $left_bond_strength{$tok}; my $rbs = $right_bond_strength{$tok}; if ( defined($lbs) && defined($rbs) && $rbs < $lbs ) { ( $right_bond_strength{$tok}, $left_bond_strength{$tok} ) = ( $lbs, $rbs ); } } return; }; $break_after->(@all_operators) if ( $rOpts->{'break-after-all-operators'} ); $break_before->(@all_operators) if ( $rOpts->{'break-before-all-operators'} ); $break_after->( split_words( $rOpts->{'want-break-after'} ) ); $break_before->( split_words( $rOpts->{'want-break-before'} ) ); # make note if breaks are before certain key types %want_break_before = (); foreach my $tok ( @all_operators, ',' ) { $want_break_before{$tok} = $left_bond_strength{$tok} < $right_bond_strength{$tok}; } # Coordinate ?/: breaks, which must be similar # The small strength 0.01 which is added is 1% of the strength of one # indentation level and seems to work okay. if ( !$want_break_before{':'} ) { $want_break_before{'?'} = $want_break_before{':'}; $right_bond_strength{'?'} = $right_bond_strength{':'} + 0.01; $left_bond_strength{'?'} = NO_BREAK; } # Only make a hash entry for the next parameters if values are defined. # That allows a quick check to be made later. 
%break_before_container_types = (); for ( $rOpts->{'break-before-hash-brace'} ) { $break_before_container_types{'{'} = $_ if $_ && $_ > 0; } for ( $rOpts->{'break-before-square-bracket'} ) { $break_before_container_types{'['} = $_ if $_ && $_ > 0; } for ( $rOpts->{'break-before-paren'} ) { $break_before_container_types{'('} = $_ if $_ && $_ > 0; } return; } ## end sub initialize_token_break_preferences use constant DEBUG_KB => 0; sub initialize_keep_old_breakpoints { my ( $str, $short_name, $rkeep_break_hash ) = @_; return unless $str; my %flags = (); my @list = split_words($str); if ( DEBUG_KB && @list ) { local $LIST_SEPARATOR = SPACE; print < 'f' foreach my $item (@list) { if ( $item =~ /^( [ \w\* ] )( [ \{\(\[\}\)\] ] )$/x ) { $item = $2; $flags{$2} = $1; } } my @unknown_types; foreach my $type (@list) { if ( !Perl::Tidy::Tokenizer::is_valid_token_type($type) ) { push @unknown_types, $type; } } if (@unknown_types) { my $num = @unknown_types; local $LIST_SEPARATOR = SPACE; Warn(<{$key} = $flag; } if ( DEBUG_KB && @list ) { my @tmp = %flags; local $LIST_SEPARATOR = SPACE; print <{'add-newlines'}; $rOpts_add_trailing_commas = $rOpts->{'add-trailing-commas'}; $rOpts_add_whitespace = $rOpts->{'add-whitespace'}; $rOpts_blank_lines_after_opening_block = $rOpts->{'blank-lines-after-opening-block'}; $rOpts_block_brace_tightness = $rOpts->{'block-brace-tightness'}; $rOpts_block_brace_vertical_tightness = $rOpts->{'block-brace-vertical-tightness'}; $rOpts_brace_follower_vertical_tightness = $rOpts->{'brace-follower-vertical-tightness'}; $rOpts_break_after_labels = $rOpts->{'break-after-labels'}; $rOpts_break_at_old_attribute_breakpoints = $rOpts->{'break-at-old-attribute-breakpoints'}; $rOpts_break_at_old_comma_breakpoints = $rOpts->{'break-at-old-comma-breakpoints'}; $rOpts_break_at_old_keyword_breakpoints = $rOpts->{'break-at-old-keyword-breakpoints'}; $rOpts_break_at_old_logical_breakpoints = $rOpts->{'break-at-old-logical-breakpoints'}; 
$rOpts_break_at_old_semicolon_breakpoints = $rOpts->{'break-at-old-semicolon-breakpoints'}; $rOpts_break_at_old_ternary_breakpoints = $rOpts->{'break-at-old-ternary-breakpoints'}; $rOpts_break_open_compact_parens = $rOpts->{'break-open-compact-parens'}; $rOpts_closing_side_comments = $rOpts->{'closing-side-comments'}; $rOpts_closing_side_comment_else_flag = $rOpts->{'closing-side-comment-else-flag'}; $rOpts_closing_side_comment_maximum_text = $rOpts->{'closing-side-comment-maximum-text'}; $rOpts_comma_arrow_breakpoints = $rOpts->{'comma-arrow-breakpoints'}; $rOpts_continuation_indentation = $rOpts->{'continuation-indentation'}; $rOpts_cuddled_paren_brace = $rOpts->{'cuddled-paren-brace'}; $rOpts_delete_closing_side_comments = $rOpts->{'delete-closing-side-comments'}; $rOpts_delete_old_whitespace = $rOpts->{'delete-old-whitespace'}; $rOpts_extended_continuation_indentation = $rOpts->{'extended-continuation-indentation'}; $rOpts_delete_side_comments = $rOpts->{'delete-side-comments'}; $rOpts_delete_trailing_commas = $rOpts->{'delete-trailing-commas'}; $rOpts_delete_weld_interfering_commas = $rOpts->{'delete-weld-interfering-commas'}; $rOpts_format_skipping = $rOpts->{'format-skipping'}; $rOpts_freeze_whitespace = $rOpts->{'freeze-whitespace'}; $rOpts_function_paren_vertical_alignment = $rOpts->{'function-paren-vertical-alignment'}; $rOpts_fuzzy_line_length = $rOpts->{'fuzzy-line-length'}; $rOpts_ignore_old_breakpoints = $rOpts->{'ignore-old-breakpoints'}; $rOpts_ignore_side_comment_lengths = $rOpts->{'ignore-side-comment-lengths'}; $rOpts_indent_closing_brace = $rOpts->{'indent-closing-brace'}; $rOpts_indent_columns = $rOpts->{'indent-columns'}; $rOpts_indent_only = $rOpts->{'indent-only'}; $rOpts_keep_interior_semicolons = $rOpts->{'keep-interior-semicolons'}; $rOpts_line_up_parentheses = $rOpts->{'line-up-parentheses'}; $rOpts_extended_line_up_parentheses = $rOpts->{'extended-line-up-parentheses'}; $rOpts_logical_padding = $rOpts->{'logical-padding'}; 
$rOpts_maximum_consecutive_blank_lines = $rOpts->{'maximum-consecutive-blank-lines'}; $rOpts_maximum_fields_per_table = $rOpts->{'maximum-fields-per-table'}; $rOpts_maximum_line_length = $rOpts->{'maximum-line-length'}; $rOpts_one_line_block_semicolons = $rOpts->{'one-line-block-semicolons'}; $rOpts_opening_brace_always_on_right = $rOpts->{'opening-brace-always-on-right'}; $rOpts_outdent_keywords = $rOpts->{'outdent-keywords'}; $rOpts_outdent_labels = $rOpts->{'outdent-labels'}; $rOpts_outdent_long_comments = $rOpts->{'outdent-long-comments'}; $rOpts_outdent_long_quotes = $rOpts->{'outdent-long-quotes'}; $rOpts_outdent_static_block_comments = $rOpts->{'outdent-static-block-comments'}; $rOpts_recombine = $rOpts->{'recombine'}; $rOpts_short_concatenation_item_length = $rOpts->{'short-concatenation-item-length'}; $rOpts_space_prototype_paren = $rOpts->{'space-prototype-paren'}; $rOpts_stack_closing_block_brace = $rOpts->{'stack-closing-block-brace'}; $rOpts_static_block_comments = $rOpts->{'static-block-comments'}; $rOpts_tee_block_comments = $rOpts->{'tee-block-comments'}; $rOpts_tee_pod = $rOpts->{'tee-pod'}; $rOpts_tee_side_comments = $rOpts->{'tee-side-comments'}; $rOpts_valign_code = $rOpts->{'valign-code'}; $rOpts_valign_side_comments = $rOpts->{'valign-side-comments'}; $rOpts_variable_maximum_line_length = $rOpts->{'variable-maximum-line-length'}; # Note that both opening and closing tokens can access the opening # and closing flags of their container types. 
%opening_vertical_tightness = ( '(' => $rOpts->{'paren-vertical-tightness'}, '{' => $rOpts->{'brace-vertical-tightness'}, '[' => $rOpts->{'square-bracket-vertical-tightness'}, ')' => $rOpts->{'paren-vertical-tightness'}, '}' => $rOpts->{'brace-vertical-tightness'}, ']' => $rOpts->{'square-bracket-vertical-tightness'}, ); %closing_vertical_tightness = ( '(' => $rOpts->{'paren-vertical-tightness-closing'}, '{' => $rOpts->{'brace-vertical-tightness-closing'}, '[' => $rOpts->{'square-bracket-vertical-tightness-closing'}, ')' => $rOpts->{'paren-vertical-tightness-closing'}, '}' => $rOpts->{'brace-vertical-tightness-closing'}, ']' => $rOpts->{'square-bracket-vertical-tightness-closing'}, ); # assume flag for '>' same as ')' for closing qw quotes %closing_token_indentation = ( ')' => $rOpts->{'closing-paren-indentation'}, '}' => $rOpts->{'closing-brace-indentation'}, ']' => $rOpts->{'closing-square-bracket-indentation'}, '>' => $rOpts->{'closing-paren-indentation'}, ); # flag indicating if any closing tokens are indented $some_closing_token_indentation = $rOpts->{'closing-paren-indentation'} || $rOpts->{'closing-brace-indentation'} || $rOpts->{'closing-square-bracket-indentation'} || $rOpts->{'indent-closing-brace'}; %opening_token_right = ( '(' => $rOpts->{'opening-paren-right'}, '{' => $rOpts->{'opening-hash-brace-right'}, '[' => $rOpts->{'opening-square-bracket-right'}, ); %stack_opening_token = ( '(' => $rOpts->{'stack-opening-paren'}, '{' => $rOpts->{'stack-opening-hash-brace'}, '[' => $rOpts->{'stack-opening-square-bracket'}, ); %stack_closing_token = ( ')' => $rOpts->{'stack-closing-paren'}, '}' => $rOpts->{'stack-closing-hash-brace'}, ']' => $rOpts->{'stack-closing-square-bracket'}, ); return; } ## end sub initialize_global_option_vars sub initialize_line_length_vars { # Create a table of maximum line length vs level for later efficient use. # We will make the tables very long to be sure it will not be exceeded. # But we have to choose a fixed length. 
A check will be made at the start # of sub 'finish_formatting' to be sure it is not exceeded. Note, some of # my standard test problems have indentation levels of about 150, so this # should be fairly large. If the choice of a maximum level ever becomes # an issue then these table values could be returned in a sub with a simple # memoization scheme. # Also create a table of the maximum spaces available for text due to the # level only. If a line has continuation indentation, then that space must # be subtracted from the table value. This table is used for preliminary # estimates in welding, extended_ci, BBX, and marking short blocks. use constant LEVEL_TABLE_MAX => 1000; # The basic scheme: foreach my $level ( 0 .. LEVEL_TABLE_MAX ) { my $indent = $level * $rOpts_indent_columns; $maximum_line_length_at_level[$level] = $rOpts_maximum_line_length; $maximum_text_length_at_level[$level] = $rOpts_maximum_line_length - $indent; } # Correct the maximum_text_length table if the -wc=n flag is used $rOpts_whitespace_cycle = $rOpts->{'whitespace-cycle'}; if ($rOpts_whitespace_cycle) { if ( $rOpts_whitespace_cycle > 0 ) { foreach my $level ( 0 .. LEVEL_TABLE_MAX ) { my $level_mod = $level % $rOpts_whitespace_cycle; my $indent = $level_mod * $rOpts_indent_columns; $maximum_text_length_at_level[$level] = $rOpts_maximum_line_length - $indent; } } else { $rOpts_whitespace_cycle = $rOpts->{'whitespace-cycle'} = 0; } } # Correct the tables if the -vmll flag is used. These values override the # previous values. if ($rOpts_variable_maximum_line_length) { foreach my $level ( 0 .. LEVEL_TABLE_MAX ) { $maximum_text_length_at_level[$level] = $rOpts_maximum_line_length; $maximum_line_length_at_level[$level] = $rOpts_maximum_line_length + $level * $rOpts_indent_columns; } } # Define two measures of indentation level, alpha and beta, at which some # formatting features come under stress and need to start shutting down. 
# Some combination of the two will be used to shut down different # formatting features. # Put a reasonable upper limit on stress level (say 100) in case the # whitespace-cycle variable is used. my $stress_level_limit = min( 100, LEVEL_TABLE_MAX ); # Find stress_level_alpha, targeted at very short maximum line lengths. $stress_level_alpha = $stress_level_limit + 1; foreach my $level_test ( 0 .. $stress_level_limit ) { my $max_len = $maximum_text_length_at_level[ $level_test + 1 ]; my $excess_inside_space = $max_len - $rOpts_continuation_indentation - $rOpts_indent_columns - 8; if ( $excess_inside_space <= 0 ) { $stress_level_alpha = $level_test; last; } } # Find stress level beta, a stress level targeted at formatting # at deep levels near the maximum line length. We start increasing # from zero and stop at the first level which shows no more space. # 'const' is a fixed number of spaces for a typical variable. # Cases b1197-b1204 work ok with const=12 but not with const=8 my $const = 16; my $denom = max( 1, $rOpts_indent_columns ); $stress_level_beta = 0; foreach my $level ( 0 .. 
$stress_level_limit ) { my $remaining_cycles = max( 0, ( $maximum_text_length_at_level[$level] - $rOpts_continuation_indentation - $const ) / $denom ); last if ( $remaining_cycles <= 3 ); # 2 does not work $stress_level_beta = $level; } # This is a combined level which works well for turning off formatting # features in most cases: $high_stress_level = min( $stress_level_alpha, $stress_level_beta + 2 ); return; } ## end sub initialize_line_length_vars sub initialize_trailing_comma_rules { # Setup control hash for trailing commas # -wtc=s defines desired trailing comma policy: # # =" " stable # [ both -atc and -dtc ignored ] # =0 : none # [requires -dtc; -atc ignored] # =1 or * : all # [requires -atc; -dtc ignored] # =m : multiline lists require trailing comma # if -atc set => will add missing multiline trailing commas # if -dtc set => will delete trailing single line commas # =b or 'bare' (multiline) lists require trailing comma # if -atc set => will add missing bare trailing commas # if -dtc set => will delete non-bare trailing commas # =h or 'hash': single column stable bare lists require trailing comma # if -atc set will add these # if -dtc set will delete other trailing commas #------------------------------------------------------------------- # This routine must be called after the alpha and beta stress levels # have been defined in sub 'initialize_line_length_vars'. 
#------------------------------------------------------------------- %trailing_comma_rules = (); my $rvalid_flags = [qw(0 1 * m b h i)]; my $option = $rOpts->{'want-trailing-commas'}; if ($option) { $option =~ s/^\s+//; $option =~ s/\s+$//; } # We need to use length() here because '0' is a possible option if ( defined($option) && length($option) ) { my $error_message; my %rule_hash; my @q = @{$rvalid_flags}; my %is_valid_flag; @is_valid_flag{@q} = (1) x scalar(@q); # handle single character control, such as -wtc='b' if ( length($option) == 1 ) { foreach (qw< ) ] } >) { $rule_hash{$_} = [ $option, EMPTY_STRING ]; } } # handle multi-character control(s), such as -wtc='[m' or -wtc='k(m' else { my @parts = split /\s+/, $option; foreach my $part (@parts) { if ( length($part) >= 2 && length($part) <= 3 ) { my $val = substr( $part, -1, 1 ); my $key_o = substr( $part, -2, 1 ); if ( $is_opening_token{$key_o} ) { my $paren_flag = EMPTY_STRING; if ( length($part) == 3 ) { $paren_flag = substr( $part, 0, 1 ); } my $key = $matching_token{$key_o}; $rule_hash{$key} = [ $val, $paren_flag ]; } else { $error_message .= "Unrecognized term: '$part'\n"; } } else { $error_message .= "Unrecognized term: '$part'\n"; } } } # check for valid control characters if ( !$error_message ) { foreach my $key ( keys %rule_hash ) { my $item = $rule_hash{$key}; my ( $val, $paren_flag ) = @{$item}; if ( $val && !$is_valid_flag{$val} ) { my $valid_str = join( SPACE, @{$rvalid_flags} ); $error_message .= "Unexpected value '$val'; must be one of: $valid_str\n"; last; } if ($paren_flag) { if ( $paren_flag !~ /^[kKfFwW]$/ ) { $error_message .= "Unexpected paren flag '$paren_flag'; must be one of: k K f F w W\n"; last; } if ( $key ne ')' ) { $error_message .= "paren flag '$paren_flag' is only allowed before a '('\n"; last; } } } } if ($error_message) { Warn(<' is excluded because it never gets space # parentheses and brackets are excluded since they are handled specially # curly braces are included but may 
be overridden by logic, such as
    # newline logic.

    # NEW_TOKENS: create a whitespace rule here.  This can be as
    # simple as adding your new letter to @spaces_both_sides, for
    # example.

    my @spaces_both_sides = qw#
      + - * / % ? = . : x < > | & ^ .. << >> ** && .. || // => += -=
      .= %= x= &= |= ^= *= <> <= >= == =~ !~ /= != ... <<= >>= ~~ !~~
      &&= ||= //= <=> A k f w F n C Y U G v
      #;

    my @spaces_left_side = qw< t ! ~ m p { \ h pp mm Z j >;
    push( @spaces_left_side, '#' );    # avoids warning message

    my @spaces_right_side = qw< ; } ) ] R J ++ -- **= >;
    push( @spaces_right_side, ',' );    # avoids warning message

    %want_left_space  = ();
    %want_right_space = ();
    %binary_ws_rules  = ();

    # Note that we are setting defaults here.  Later in processing
    # the values of %want_left_space and %want_right_space
    # may be overridden by any user settings specified by the
    # -wls and -wrs parameters.  However, the %binary_ws_rules
    # are hardwired and have priority.
    @want_left_space{@spaces_both_sides} =
      (1) x scalar(@spaces_both_sides);
    @want_right_space{@spaces_both_sides} =
      (1) x scalar(@spaces_both_sides);
    @want_left_space{@spaces_left_side} =
      (1) x scalar(@spaces_left_side);
    @want_right_space{@spaces_left_side} =
      (-1) x scalar(@spaces_left_side);
    @want_left_space{@spaces_right_side} =
      (-1) x scalar(@spaces_right_side);
    @want_right_space{@spaces_right_side} =
      (1) x scalar(@spaces_right_side);
    $want_left_space{'->'}      = WS_NO;
    $want_right_space{'->'}     = WS_NO;
    $want_left_space{'**'}      = WS_NO;
    $want_right_space{'**'}     = WS_NO;
    $want_right_space{'CORE::'} = WS_NO;

    # These binary_ws_rules are hardwired and have priority over the above
    # settings.  It would be nice to allow adjustment by the user,
    # but it would be complicated to specify.
# # hash type information must stay tightly bound # as in : ${xxxx} $binary_ws_rules{'i'}{'L'} = WS_NO; $binary_ws_rules{'i'}{'{'} = WS_YES; $binary_ws_rules{'k'}{'{'} = WS_YES; $binary_ws_rules{'U'}{'{'} = WS_YES; $binary_ws_rules{'i'}{'['} = WS_NO; $binary_ws_rules{'R'}{'L'} = WS_NO; $binary_ws_rules{'R'}{'{'} = WS_NO; $binary_ws_rules{'t'}{'L'} = WS_NO; $binary_ws_rules{'t'}{'{'} = WS_NO; $binary_ws_rules{'t'}{'='} = WS_OPTIONAL; # for signatures; fixes b1123 $binary_ws_rules{'}'}{'L'} = WS_NO; $binary_ws_rules{'}'}{'{'} = WS_OPTIONAL; # RT#129850; was WS_NO $binary_ws_rules{'$'}{'L'} = WS_NO; $binary_ws_rules{'$'}{'{'} = WS_NO; $binary_ws_rules{'@'}{'L'} = WS_NO; $binary_ws_rules{'@'}{'{'} = WS_NO; $binary_ws_rules{'='}{'L'} = WS_YES; $binary_ws_rules{'J'}{'J'} = WS_YES; # the following includes ') {' # as in : if ( xxx ) { yyy } $binary_ws_rules{']'}{'L'} = WS_NO; $binary_ws_rules{']'}{'{'} = WS_NO; $binary_ws_rules{')'}{'{'} = WS_YES; $binary_ws_rules{')'}{'['} = WS_NO; $binary_ws_rules{']'}{'['} = WS_NO; $binary_ws_rules{']'}{'{'} = WS_NO; $binary_ws_rules{'}'}{'['} = WS_NO; $binary_ws_rules{'R'}{'['} = WS_NO; $binary_ws_rules{']'}{'++'} = WS_NO; $binary_ws_rules{']'}{'--'} = WS_NO; $binary_ws_rules{')'}{'++'} = WS_NO; $binary_ws_rules{')'}{'--'} = WS_NO; $binary_ws_rules{'R'}{'++'} = WS_NO; $binary_ws_rules{'R'}{'--'} = WS_NO; $binary_ws_rules{'i'}{'Q'} = WS_YES; $binary_ws_rules{'n'}{'('} = WS_YES; # occurs in 'use package n ()' $binary_ws_rules{'i'}{'('} = WS_NO; $binary_ws_rules{'w'}{'('} = WS_NO; $binary_ws_rules{'w'}{'{'} = WS_YES; return; } ## end sub initialize_whitespace_hashes { #<<< begin closure set_whitespace_flags my %is_special_ws_type; my %is_wCUG; my %is_wi; BEGIN { # The following hash is used to skip over needless if tests. # Be sure to update it when adding new checks in its block. 
my @q = qw(k w C m - Q); push @q, '#'; @is_special_ws_type{@q} = (1) x scalar(@q); # These hashes replace slower regex tests @q = qw( w C U G ); @is_wCUG{@q} = (1) x scalar(@q); @q = qw( w i ); @is_wi{@q} = (1) x scalar(@q); } ## end BEGIN use constant DEBUG_WHITE => 0; # Hashes to set spaces around container tokens according to their # sequence numbers. These are set as keywords are examined. # They are controlled by the -kpit and -kpitl flags. my %opening_container_inside_ws; my %closing_container_inside_ws; sub set_whitespace_flags { # This routine is called once per file to set whitespace flags for that # file. This routine examines each pair of nonblank tokens and sets a flag # indicating if white space is needed. # # $rwhitespace_flags->[$j] is a flag indicating whether a white space # BEFORE token $j is needed, with the following values: # # WS_NO = -1 do not want a space BEFORE token $j # WS_OPTIONAL= 0 optional space or $j is a whitespace # WS_YES = 1 want a space BEFORE token $j # my $self = shift; my $j_tight_closing_paren = -1; my $rLL = $self->[_rLL_]; my $jmax = @{$rLL} - 1; %opening_container_inside_ws = (); %closing_container_inside_ws = (); my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my $rOpts_space_keyword_paren = $rOpts->{'space-keyword-paren'}; my $rOpts_space_backslash_quote = $rOpts->{'space-backslash-quote'}; my $rOpts_space_function_paren = $rOpts->{'space-function-paren'}; my $rwhitespace_flags = []; my $ris_function_call_paren = {}; return $rwhitespace_flags if ( $jmax < 0 ); my %is_for_foreach = ( 'for' => 1, 'foreach' => 1 ); my $last_token = SPACE; my $last_type = 'b'; my $rtokh_last = [ @{ $rLL->[0] } ]; $rtokh_last->[_TOKEN_] = $last_token; $rtokh_last->[_TYPE_] = $last_type; $rtokh_last->[_TYPE_SEQUENCE_] = EMPTY_STRING; $rtokh_last->[_LINE_INDEX_] = 0; my $rtokh_last_last = $rtokh_last; my ( $ws_1, $ws_2, $ws_3, $ws_4 ); # main loop over all tokens to define the whitespace flags my $last_type_is_opening; my ( $token, 
$type ); my $j = -1; foreach my $rtokh ( @{$rLL} ) { $j++; $type = $rtokh->[_TYPE_]; if ( $type eq 'b' ) { $rwhitespace_flags->[$j] = WS_OPTIONAL; next; } $token = $rtokh->[_TOKEN_]; my $ws; #--------------------------------------------------------------- # Whitespace Rules Section 1: # Handle space on the inside of opening braces. #--------------------------------------------------------------- # /^[L\{\(\[]$/ if ($last_type_is_opening) { $last_type_is_opening = 0; my $seqno = $rtokh->[_TYPE_SEQUENCE_]; my $block_type = $rblock_type_of_seqno->{$seqno}; my $last_seqno = $rtokh_last->[_TYPE_SEQUENCE_]; my $last_block_type = $rblock_type_of_seqno->{$last_seqno}; $j_tight_closing_paren = -1; # let us keep empty matched braces together: () {} [] # except for BLOCKS if ( $token eq $matching_token{$last_token} ) { if ($block_type) { $ws = WS_YES; } else { $ws = WS_NO; } } else { # we're considering the right of an opening brace # tightness = 0 means always pad inside with space # tightness = 1 means pad inside if "complex" # tightness = 2 means never pad inside with space my $tightness; if ( $last_type eq '{' && $last_token eq '{' && $last_block_type ) { $tightness = $rOpts_block_brace_tightness; } else { $tightness = $tightness{$last_token} } #============================================================= # Patch for test problem <> # We must always avoid spaces around a bare word beginning # with ^ as in: # my $before = ${^PREMATCH}; # Because all of the following cause an error in perl: # my $before = ${ ^PREMATCH }; # my $before = ${ ^PREMATCH}; # my $before = ${^PREMATCH }; # So if brace tightness flag is -bt=0 we must temporarily reset # to bt=1. 
Note that here we must set tightness=1 and not 2 so # that the closing space is also avoided # (via the $j_tight_closing_paren flag in coding) if ( $type eq 'w' && $token =~ /^\^/ ) { $tightness = 1 } #============================================================= if ( $tightness <= 0 ) { $ws = WS_YES; } elsif ( $tightness > 1 ) { $ws = WS_NO; } else { # find the index of the closing token my $j_closing = $self->[_K_closing_container_]->{$last_seqno}; # If the closing token is less than five characters ahead # we must take a closer look if ( defined($j_closing) && $j_closing - $j < 5 && $rLL->[$j_closing]->[_TYPE_SEQUENCE_] eq $last_seqno ) { $ws = ws_in_container( $j, $j_closing, $rLL, $type, $token, $last_token ); if ( $ws == WS_NO ) { $j_tight_closing_paren = $j_closing; } } else { $ws = WS_YES; } } } # check for special cases which override the above rules if ( %opening_container_inside_ws && $last_seqno ) { my $ws_override = $opening_container_inside_ws{$last_seqno}; if ($ws_override) { $ws = $ws_override } } $ws_4 = $ws_3 = $ws_2 = $ws_1 = $ws if DEBUG_WHITE; } ## end setting space flag inside opening tokens #--------------------------------------------------------------- # Whitespace Rules Section 2: # Special checks for certain types ... #--------------------------------------------------------------- # The hash '%is_special_ws_type' significantly speeds up this routine, # but be sure to update it if a new check is added. # Currently has types: qw(k w C m - Q #) if ( $is_special_ws_type{$type} ) { if ( $type eq 'k' ) { # Keywords 'for', 'foreach' are special cases for -kpit since # the opening paren does not always immediately follow the # keyword. So we have to search forward for the paren in this # case. I have limited the search to 10 tokens ahead, just in # case somebody has a big file and no opening paren. This # should be enough for all normal code. Added the level check # to fix b1236. 
if ( $is_for_foreach{$token} && %keyword_paren_inner_tightness && defined( $keyword_paren_inner_tightness{$token} ) && $j < $jmax ) { my $level = $rLL->[$j]->[_LEVEL_]; my $jp = $j; ## NOTE: we might use the KNEXT variable to avoid this loop ## but profiling shows that little would be saved foreach my $inc ( 1 .. 9 ) { $jp++; last if ( $jp > $jmax ); last if ( $rLL->[$jp]->[_LEVEL_] != $level ); # b1236 next unless ( $rLL->[$jp]->[_TOKEN_] eq '(' ); my $seqno_p = $rLL->[$jp]->[_TYPE_SEQUENCE_]; set_container_ws_by_keyword( $token, $seqno_p ); last; } } } # handle a comment elsif ( $type eq '#' ) { # newline before block comment ($j==0), and # space before side comment ($j>0), so .. $ws = WS_YES; #--------------------------------- # Nothing more to do for a comment #--------------------------------- $rwhitespace_flags->[$j] = $ws; next; } # retain any space between '-' and bare word elsif ( $type eq 'w' || $type eq 'C' ) { $ws = WS_OPTIONAL if $last_type eq '-'; } # retain any space between '-' and bare word; for example # avoid space between 'USER' and '-' here: <> # $myhash{USER-NAME}='steve'; elsif ( $type eq 'm' || $type eq '-' ) { $ws = WS_OPTIONAL if ( $last_type eq 'w' ); } # space_backslash_quote; RT #123774 <> # allow a space between a backslash and single or double quote # to avoid fooling html formatters elsif ( $last_type eq '\\' && $type eq 'Q' && $token =~ /^[\"\']/ ) { if ($rOpts_space_backslash_quote) { if ( $rOpts_space_backslash_quote == 1 ) { $ws = WS_OPTIONAL; } elsif ( $rOpts_space_backslash_quote == 2 ) { $ws = WS_YES } else { } # shouldnt happen } else { $ws = WS_NO; } } } ## end elsif ( $is_special_ws_type{$type} ... #--------------------------------------------------------------- # Whitespace Rules Section 3: # Handle space on inside of closing brace pairs. 
        #---------------------------------------------------------------

        #   /[\}\)\]R]/
        elsif ( $is_closing_type{$type} ) {

            my $seqno = $rtokh->[_TYPE_SEQUENCE_];
            if ( $j == $j_tight_closing_paren ) {

                $j_tight_closing_paren = -1;
                $ws                    = WS_NO;
            }
            else {

                if ( !defined($ws) ) {

                    my $tightness;
                    my $block_type = $rblock_type_of_seqno->{$seqno};
                    if ( $type eq '}' && $token eq '}' && $block_type ) {
                        $tightness = $rOpts_block_brace_tightness;
                    }
                    else { $tightness = $tightness{$token} }

                    $ws = ( $tightness > 1 ) ? WS_NO : WS_YES;
                }
            }

            # check for special cases which override the above rules
            if ( %closing_container_inside_ws && $seqno ) {
                my $ws_override = $closing_container_inside_ws{$seqno};
                if ($ws_override) { $ws = $ws_override }
            }

            $ws_4 = $ws_3 = $ws_2 = $ws
              if DEBUG_WHITE;
        } ## end setting space flag inside closing tokens

        #---------------------------------------------------------------
        # Whitespace Rules Section 4:
        #---------------------------------------------------------------
        #    /^[L\{\(\[]$/
        elsif ( $is_opening_type{$type} ) {

            $last_type_is_opening = 1;

            if ( $token eq '(' ) {

                my $seqno = $rtokh->[_TYPE_SEQUENCE_];

                # This will have to be tweaked as tokenization changes.
                # We usually want a space at '} (', for example:
                # <>
                #     map { 1 * $_; } ( $y, $M, $w, $d, $h, $m, $s );
                #
                # But not others:
                #     &{ $_->[1] }( delete $_[$#_]{ $_->[0] } );
                # At present, the above & block is marked as type L/R so this
                # case won't go through here.
                if ( $last_type eq '}' && $last_token ne ')' ) { $ws = WS_YES }

                # NOTE: some older versions of Perl had occasional problems if
                # spaces are introduced between keywords or functions and
                # opening parens.  So the default is not to do this except in
                # certain cases.  The current Perl seems to tolerate spaces.
# Space between keyword and '(' elsif ( $last_type eq 'k' ) { $ws = WS_NO unless ( $rOpts_space_keyword_paren || $space_after_keyword{$last_token} ); # Set inside space flag if requested set_container_ws_by_keyword( $last_token, $seqno ); } # Space between function and '(' # ----------------------------------------------------- # 'w' and 'i' checks for something like: # myfun( &myfun( ->myfun( # ----------------------------------------------------- # Note that at this point an identifier may still have a # leading arrow, but the arrow will be split off during token # respacing. After that, the token may become a bare word # without leading arrow. The point is, it is best to mark # function call parens right here before that happens. # Patch: added 'C' to prevent blinker, case b934, i.e. 'pi()' # NOTE: this would be the place to allow spaces between # repeated parens, like () () (), as in case c017, but I # decided that would not be a good idea. # Updated to allow detached '->' from tokenizer (issue c140) elsif ( # /^[wCUG]$/ $is_wCUG{$last_type} || ( # /^[wi]$/ $is_wi{$last_type} && ( # with prefix '->' or '&' $last_token =~ /^([\&]|->)/ # or preceding token '->' (see b1337; c140) || $rtokh_last_last->[_TYPE_] eq '->' # or preceding sub call operator token '&' || ( $rtokh_last_last->[_TYPE_] eq 't' && $rtokh_last_last->[_TOKEN_] =~ /^\&\s*$/ ) ) ) ) { $ws = $rOpts_space_function_paren ? $self->ws_space_function_paren( $j, $rtokh_last_last ) : WS_NO; set_container_ws_by_keyword( $last_token, $seqno ); $ris_function_call_paren->{$seqno} = 1; } # space between something like $i and ( in 'snippets/space2.in' # for $i ( 0 .. 
20 ) { elsif ( $last_type eq 'i' && $last_token =~ /^[\$\%\@]/ ) { $ws = WS_YES; } # allow constant function followed by '()' to retain no space elsif ($last_type eq 'C' && $rLL->[ $j + 1 ]->[_TOKEN_] eq ')' ) { $ws = WS_NO; } } # patch for SWITCH/CASE: make space at ']{' optional # since the '{' might begin a case or when block elsif ( ( $token eq '{' && $type ne 'L' ) && $last_token eq ']' ) { $ws = WS_OPTIONAL; } # keep space between 'sub' and '{' for anonymous sub definition, # be sure type = 'k' (added for c140) if ( $type eq '{' ) { if ( $last_token eq 'sub' && $last_type eq 'k' ) { $ws = WS_YES; } # this is needed to avoid no space in '){' if ( $last_token eq ')' && $token eq '{' ) { $ws = WS_YES } # avoid any space before the brace or bracket in something like # @opts{'a','b',...} if ( $last_type eq 'i' && $last_token =~ /^\@/ ) { $ws = WS_NO; } } } ## end if ( $is_opening_type{$type} ) { # always preserve whatever space was used after a possible # filehandle (except _) or here doc operator if ( ( ( $last_type eq 'Z' && $last_token ne '_' ) || $last_type eq 'h' ) && $type ne '#' # no longer required due to early exit for '#' above ) { $ws = WS_OPTIONAL; } $ws_4 = $ws_3 = $ws if DEBUG_WHITE; if ( !defined($ws) ) { #--------------------------------------------------------------- # Whitespace Rules Section 4: # Use the binary rule table. #--------------------------------------------------------------- if ( defined( $binary_ws_rules{$last_type}{$type} ) ) { $ws = $binary_ws_rules{$last_type}{$type}; $ws_4 = $ws if DEBUG_WHITE; } #--------------------------------------------------------------- # Whitespace Rules Section 5: # Apply default rules not covered above. 
#--------------------------------------------------------------- # If we fall through to here, look at the pre-defined hash tables # for the two tokens, and: # if (they are equal) use the common value # if (either is zero or undef) use the other # if (either is -1) use it # That is, # left vs right # 1 vs 1 --> 1 # 0 vs 0 --> 0 # -1 vs -1 --> -1 # # 0 vs -1 --> -1 # 0 vs 1 --> 1 # 1 vs 0 --> 1 # -1 vs 0 --> -1 # # -1 vs 1 --> -1 # 1 vs -1 --> -1 else { my $wl = $want_left_space{$type}; my $wr = $want_right_space{$last_type}; if ( !defined($wl) ) { $ws = defined($wr) ? $wr : 0; } elsif ( !defined($wr) ) { $ws = $wl; } else { $ws = ( ( $wl == $wr ) || ( $wl == -1 ) || !$wr ) ? $wl : $wr; } } } # Treat newline as a whitespace. Otherwise, we might combine # 'Send' and '-recipients' here according to the above rules: # <> # my $msg = new Fax::Send # -recipients => $to, # -data => $data; if ( !$ws && $rtokh->[_LINE_INDEX_] != $rtokh_last->[_LINE_INDEX_] ) { $ws = WS_YES; } $rwhitespace_flags->[$j] = $ws; # remember non-blank, non-comment tokens $last_token = $token; $last_type = $type; $rtokh_last_last = $rtokh_last; $rtokh_last = $rtokh; next if ( !DEBUG_WHITE ); my $str = substr( $last_token, 0, 15 ); $str .= SPACE x ( 16 - length($str) ); if ( !defined($ws_1) ) { $ws_1 = "*" } if ( !defined($ws_2) ) { $ws_2 = "*" } if ( !defined($ws_3) ) { $ws_3 = "*" } if ( !defined($ws_4) ) { $ws_4 = "*" } print STDOUT "NEW WHITE: i=$j $str $last_type $type $ws_1 : $ws_2 : $ws_3 : $ws_4 : $ws \n"; # reset for next pass $ws_1 = $ws_2 = $ws_3 = $ws_4 = undef; } ## end main loop if ( $rOpts->{'tight-secret-operators'} ) { new_secret_operator_whitespace( $rLL, $rwhitespace_flags ); } $self->[_ris_function_call_paren_] = $ris_function_call_paren; return $rwhitespace_flags; } ## end sub set_whitespace_flags sub set_container_ws_by_keyword { my ( $word, $sequence_number ) = @_; return unless (%keyword_paren_inner_tightness); # We just saw a keyword (or other function name) followed by an 
opening # paren. Now check to see if the following paren should have special # treatment for its inside space. If so we set a hash value using the # sequence number as key. if ( $word && $sequence_number ) { my $tightness = $keyword_paren_inner_tightness{$word}; if ( defined($tightness) && $tightness != 1 ) { my $ws_flag = $tightness == 0 ? WS_YES : WS_NO; $opening_container_inside_ws{$sequence_number} = $ws_flag; $closing_container_inside_ws{$sequence_number} = $ws_flag; } } return; } ## end sub set_container_ws_by_keyword sub ws_in_container { my ( $j, $j_closing, $rLL, $type, $token, $last_token ) = @_; # Given: # $j = index of token following an opening container token # $type, $token = the type and token at index $j # $j_closing = closing token of the container # $last_token = the opening token of the container # Return: # WS_NO if there is just one token in the container (with exceptions) # WS_YES otherwise #------------------------------------ # Look forward for the closing token; #------------------------------------ if ( $j + 1 > $j_closing ) { return WS_NO } # Patch to count '-foo' as single token so that # each of $a{-foo} and $a{foo} and $a{'foo'} do # not get spaces with default formatting. my $j_here = $j; ++$j_here if ( $token eq '-' && $last_token eq '{' && $rLL->[ $j + 1 ]->[_TYPE_] eq 'w' ); # Patch to count a sign separated from a number as a single token, as # in the following line. Otherwise, it takes two steps to converge: # deg2rad(- 0.5) if ( ( $type eq 'm' || $type eq 'p' ) && $j < $j_closing + 1 && $rLL->[ $j + 1 ]->[_TYPE_] eq 'b' && $rLL->[ $j + 2 ]->[_TYPE_] eq 'n' && $rLL->[ $j + 2 ]->[_TOKEN_] =~ /^\d/ ) { $j_here = $j + 2; } # $j_next is where a closing token should be if the container has # just a "single" token if ( $j_here + 1 > $j_closing ) { return WS_NO } my $j_next = ( $rLL->[ $j_here + 1 ]->[_TYPE_] eq 'b' ) ? 
$j_here + 2 : $j_here + 1; #----------------------------------------------------------------- # Now decide: if we get to the closing token we will keep it tight #----------------------------------------------------------------- if ( $j_next == $j_closing # OLD PROBLEM: but watch out for this: [ [ ] (misc.t) # No longer necessary because of the previous check on sequence numbers ##&& $last_token ne $token # double diamond is usually spaced && $token ne '<<>>' ) { return WS_NO; } return WS_YES; } ## end sub ws_in_container sub ws_space_function_paren { my ( $self, $j, $rtokh_last_last ) = @_; # Called if --space-function-paren is set to see if it might cause # a problem. The manual warns the user about potential problems with # this flag. Here we just try to catch one common problem. # Given: # $j = index of '(' after function name # Return: # WS_NO if no space # WS_YES otherwise # This was added to fix for issue c166. Ignore -sfp at a possible indirect # object location. For example, do not convert this: # print header() ... # to this: # print header () ... # because in this latter form, header may be taken to be a file handle # instead of a function call. # Start with the normal value for -sfp: my $ws = WS_YES; # now check to be sure we don't cause a problem: my $type_ll = $rtokh_last_last->[_TYPE_]; my $tok_ll = $rtokh_last_last->[_TOKEN_]; # NOTE: this is just a minimal check. For example, we might also check # for something like this: # print ( header ( .. 
if ( $type_ll eq 'k' && $is_indirect_object_taker{$tok_ll} ) { $ws = WS_NO; } return $ws; } ## end sub ws_space_function_paren } ## end closure set_whitespace_flags sub dump_want_left_space { my $fh = shift; local $LIST_SEPARATOR = "\n"; $fh->print(<print("$key\t$want_left_space{$key}\n"); } return; } ## end sub dump_want_left_space sub dump_want_right_space { my $fh = shift; local $LIST_SEPARATOR = "\n"; $fh->print(<print("$key\t$want_right_space{$key}\n"); } return; } ## end sub dump_want_right_space { ## begin closure is_essential_whitespace my %is_sort_grep_map; my %is_for_foreach; my %is_digraph; my %is_trigraph; my %essential_whitespace_filter_l1; my %essential_whitespace_filter_r1; my %essential_whitespace_filter_l2; my %essential_whitespace_filter_r2; my %is_type_with_space_before_bareword; my %is_special_variable_char; BEGIN { my @q; # NOTE: This hash is like the global %is_sort_map_grep, but it ignores # grep aliases on purpose, since here we are looking parens, not braces @q = qw(sort grep map); @is_sort_grep_map{@q} = (1) x scalar(@q); @q = qw(for foreach); @is_for_foreach{@q} = (1) x scalar(@q); @q = qw( .. :: << >> ** && || // -> => += -= .= %= &= |= ^= *= <> <= >= == =~ !~ != ++ -- /= x= ~~ ~. |. &. ^. ); @is_digraph{@q} = (1) x scalar(@q); @q = qw( ... **= <<= >>= &&= ||= //= <=> !~~ &.= |.= ^.= <<~); @is_trigraph{@q} = (1) x scalar(@q); # These are used as a speedup filters for sub is_essential_whitespace. 
# Filter 1: # These left side token types USUALLY do not require a space: @q = qw( ; { } [ ] L R ); push @q, ','; push @q, ')'; push @q, '('; @essential_whitespace_filter_l1{@q} = (1) x scalar(@q); # BUT some might if followed by these right token types @q = qw( pp mm << <<= h ); @essential_whitespace_filter_r1{@q} = (1) x scalar(@q); # Filter 2: # These right side filters usually do not require a space @q = qw( ; ] R } ); push @q, ','; push @q, ')'; @essential_whitespace_filter_r2{@q} = (1) x scalar(@q); # BUT some might if followed by these left token types @q = qw( h Z ); @essential_whitespace_filter_l2{@q} = (1) x scalar(@q); # Keep a space between certain types and any bareword: # Q: keep a space between a quote and a bareword to prevent the # bareword from becoming a quote modifier. # &: do not remove space between an '&' and a bare word because # it may turn into a function evaluation, like here # between '&' and 'O_ACCMODE', producing a syntax error [File.pm] # $opts{rdonly} = (($opts{mode} & O_ACCMODE) == O_RDONLY); @q = qw( Q & ); @is_type_with_space_before_bareword{@q} = (1) x scalar(@q); # These are the only characters which can (currently) form special # variables, like $^W: (issue c066, c068). @q = qw{ ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ }; @{is_special_variable_char}{@q} = (1) x scalar(@q); } ## end BEGIN sub is_essential_whitespace { # Essential whitespace means whitespace which cannot be safely deleted # without risking the introduction of a syntax error. # We are given three tokens and their types: # ($tokenl, $typel) is the token to the left of the space in question # ($tokenr, $typer) is the token to the right of the space in question # ($tokenll, $typell) is previous nonblank token to the left of $tokenl # # Note1: This routine should almost never need to be changed. It is # for avoiding syntax problems rather than for formatting. 
        # Note2: The -mangle option causes large numbers of calls to this
        # routine and therefore is a good test. So if a change is made, be sure
        # to use nytprof to profile with both old and revised coding using the
        # -mangle option and check differences.

        my ( $tokenll, $typell, $tokenl, $typel, $tokenr, $typer ) = @_;

        # This is potentially a very slow routine but the following quick
        # filters typically catch and handle over 90% of the calls.

        # Filter 1: usually no space required after common types ; , [ ] { } ( )
        return
          if ( $essential_whitespace_filter_l1{$typel}
            && !$essential_whitespace_filter_r1{$typer} );

        # Filter 2: usually no space before common types ; ,
        return
          if ( $essential_whitespace_filter_r2{$typer}
            && !$essential_whitespace_filter_l2{$typel} );

        # Filter 3: Handle side comments: a space is only essential if the left
        # token ends in '$'. For example, we do not want to create $#foo below:

        #   sub t086
        #       ( #foo)))
        #       $ #foo)))
        #       a #foo)))
        #       ) #foo)))
        #       { ... }

        # Also, I prefer not to put a ? and # together because ? used to be
        # a pattern delimiter and spacing was used if guessing was needed.

        if ( $typer eq '#' ) {

            return 1
              if ( $tokenl
                && ( $typel eq '?' || substr( $tokenl, -1 ) eq '$' ) );
            return;
        }

        my $tokenr_is_bareword   = $tokenr =~ /^\w/ && $tokenr !~ /^\d/;
        my $tokenr_is_open_paren = $tokenr eq '(';
        my $token_joined         = $tokenl . $tokenr;
        my $tokenl_is_dash       = $tokenl eq '-';

        my $result =

          # never combine two bare words or numbers
          # examples:  and ::ok(1)
          #            return ::spw(...)
          #            for bla::bla:: abc
          # example is "%overload:: and" in files Dumpvalue.pm or colonbug.pl
          #            $input eq"quit" to make $inputeq"quit"
          #            my $size=-s::SINK if $file;  <==OK but we won't do it
          # don't join something like: for bla::bla:: abc
          # example is "%overload:: and" in files Dumpvalue.pm or colonbug.pl
          (      ( $tokenl =~ /([\'\w]|\:\:)$/ && $typel ne 'CORE::' )
              && ( $tokenr =~ /^([\'\w]|\:\:)/ ) )

          # do not combine a number with a concatenation dot
          # example: pom.caputo:
          # $vt100_compatible ?
"\e[0;0H" : ('-' x 78 . "\n"); || $typel eq 'n' && $tokenr eq '.' || $typer eq 'n' && $tokenl eq '.' # cases of a space before a bareword... || ( $tokenr_is_bareword && ( # do not join a minus with a bare word, because you might form # a file test operator. Example from Complex.pm: # if (CORE::abs($z - i) < $eps); # "z-i" would be taken as a file test. $tokenl_is_dash && length($tokenr) == 1 # and something like this could become ambiguous without space # after the '-': # use constant III=>1; # $a = $b - III; # and even this: # $a = - III; || $tokenl_is_dash && $typer =~ /^[wC]$/ # keep space between types Q & and a bareword || $is_type_with_space_before_bareword{$typel} # +-: binary plus and minus before a bareword could get # converted into unary plus and minus on next pass through the # tokenizer. This can lead to blinkers: cases b660 b670 b780 # b781 b787 b788 b790 So we keep a space unless the +/- clearly # follows an operator || ( ( $typel eq '+' || $typel eq '-' ) && $typell !~ /^[niC\)\}\]R]$/ ) # keep a space between a token ending in '$' and any word; # this caused trouble: "die @$ if $@" || $typel eq 'i' && substr( $tokenl, -1, 1 ) eq '$' # don't combine $$ or $# with any alphanumeric # (testfile mangle.t with --mangle) || $tokenl eq '$$' || $tokenl eq '$#' ) ) ## end $tokenr_is_bareword # OLD, not used # '= -' should not become =- or you will get a warning # about reversed -= # || ($tokenr eq '-') # do not join a bare word with a minus, like between 'Send' and # '-recipients' here <> # my $msg = new Fax::Send # -recipients => $to, # -data => $data; # This is the safest thing to do. If we had the token to the right of # the minus we could do a better check. 
# # And do not combine a bareword and a quote, like this: # oops "Your login, $Bad_Login, is not valid"; # It can cause a syntax error if oops is a sub || $typel eq 'w' && ( $tokenr eq '-' || $typer eq 'Q' ) # perl is very fussy about spaces before << || substr( $tokenr, 0, 2 ) eq '<<' # avoid combining tokens to create new meanings. Example: # $a+ +$b must not become $a++$b || ( $is_digraph{$token_joined} ) || $is_trigraph{$token_joined} # another example: do not combine these two &'s: # allow_options & &OPT_EXECCGI || $is_digraph{ $tokenl . substr( $tokenr, 0, 1 ) } # retain any space after possible filehandle # (testfiles prnterr1.t with --extrude and mangle.t with --mangle) || $typel eq 'Z' # Added 'Y' here 16 Jan 2021 to prevent -mangle option from removing # space after type Y. Otherwise, it will get parsed as type 'Z' later # and any space would have to be added back manually if desired. || $typel eq 'Y' # Perl is sensitive to whitespace after the + here: # $b = xvals $a + 0.1 * yvals $a; || $typell eq 'Z' && $typel =~ /^[\/\?\+\-\*]$/ || ( $tokenr_is_open_paren && ( # keep paren separate in 'use Foo::Bar ()' ( $typel eq 'w' && $typell eq 'k' && $tokenll eq 'use' ) # OLD: keep any space between filehandle and paren: # file mangle.t with --mangle: # NEW: this test is no longer necessary here (moved above) ## || $typel eq 'Y' # must have space between grep and left paren; "grep(" will fail || $is_sort_grep_map{$tokenl} # don't stick numbers next to left parens, as in: #use Mail::Internet 1.28 (); (see Entity.pm, Head.pm, Test.pm) || $typel eq 'n' ) ) ## end $tokenr_is_open_paren # retain any space after here doc operator ( hereerr.t) || $typel eq 'h' # be careful with a space around ++ and --, to avoid ambiguity as to # which token it applies || ( $typer eq 'pp' || $typer eq 'mm' ) && $tokenl !~ /^[\;\{\(\[]/ || ( $typel eq '++' || $typel eq '--' ) && $tokenr !~ /^[\;\}\)\]]/ # need space after foreach my; for example, this will fail in # older versions of 
Perl: # foreach my$ft(@filetypes)... || ( $tokenl eq 'my' && substr( $tokenr, 0, 1 ) eq '$' # /^(for|foreach)$/ && $is_for_foreach{$tokenll} ) # Keep space after like $^ if needed to avoid forming a different # special variable (issue c068). For example: # my $aa = $^ ? "none" : "ok"; || ( $typel eq 'i' && length($tokenl) == 2 && substr( $tokenl, 1, 1 ) eq '^' && $is_special_variable_char{ substr( $tokenr, 0, 1 ) } ) # We must be sure that a space between a ? and a quoted string # remains if the space before the ? remains. [Loca.pm, lockarea] # ie, # $b=join $comma ? ',' : ':', @_; # ok # $b=join $comma?',' : ':', @_; # ok! # $b=join $comma ?',' : ':', @_; # error! # Not really required: ## || ( ( $typel eq '?' ) && ( $typer eq 'Q' ) ) # Space stacked labels... # Not really required: Perl seems to accept non-spaced labels. ## || $typel eq 'J' && $typer eq 'J' ; # the value of this long logic sequence is the result we want return $result; } ## end sub is_essential_whitespace } ## end closure is_essential_whitespace { ## begin closure new_secret_operator_whitespace my %secret_operators; my %is_leading_secret_token; BEGIN { # token lists for perl secret operators as compiled by Philippe Bruhat # at: https://metacpan.org/module/perlsecret %secret_operators = ( 'Goatse' => [qw#= ( ) =#], #=( )= 'Venus1' => [qw#0 +#], # 0+ 'Venus2' => [qw#+ 0#], # +0 'Enterprise' => [qw#) x ! !#], # ()x!! 'Kite1' => [qw#~ ~ <>#], # ~~<> 'Kite2' => [qw#~~ <>#], # ~~<> 'Winking Fat Comma' => [ ( ',', '=>' ) ], # ,=> 'Bang bang ' => [qw#! !#], # !! ); # The following operators and constants are not included because they # are normally kept tight by perltidy: # ~~ <~> # # Make a lookup table indexed by the first token of each operator: # first token => [list, list, ...] 
foreach my $value ( values(%secret_operators) ) { my $tok = $value->[0]; push @{ $is_leading_secret_token{$tok} }, $value; } } ## end BEGIN sub new_secret_operator_whitespace { my ( $rlong_array, $rwhitespace_flags ) = @_; # Loop over all tokens in this line my ( $token, $type ); my $jmax = @{$rlong_array} - 1; foreach my $j ( 0 .. $jmax ) { $token = $rlong_array->[$j]->[_TOKEN_]; $type = $rlong_array->[$j]->[_TYPE_]; # Skip unless this token might start a secret operator next if ( $type eq 'b' ); next unless ( $is_leading_secret_token{$token} ); # Loop over all secret operators with this leading token foreach my $rpattern ( @{ $is_leading_secret_token{$token} } ) { my $jend = $j - 1; foreach my $tok ( @{$rpattern} ) { $jend++; $jend++ if ( $jend <= $jmax && $rlong_array->[$jend]->[_TYPE_] eq 'b' ); if ( $jend > $jmax || $tok ne $rlong_array->[$jend]->[_TOKEN_] ) { $jend = undef; last; } } if ($jend) { # set flags to prevent spaces within this operator foreach my $jj ( $j + 1 .. $jend ) { $rwhitespace_flags->[$jj] = WS_NO; } $j = $jend; last; } } ## End Loop over all operators } ## End loop over all tokens return; } ## end sub new_secret_operator_whitespace } ## end closure new_secret_operator_whitespace { ## begin closure set_bond_strengths # These routines and variables are involved in deciding where to break very # long lines. my %is_good_keyword_breakpoint; my %is_lt_gt_le_ge; my %is_container_token; my %binary_bond_strength_nospace; my %binary_bond_strength; my %nobreak_lhs; my %nobreak_rhs; my @bias_tokens; my %bias_hash; my %bias; my $delta_bias; sub initialize_bond_strength_hashes { my @q; @q = qw(if unless while until for foreach); @is_good_keyword_breakpoint{@q} = (1) x scalar(@q); @q = qw(lt gt le ge); @is_lt_gt_le_ge{@q} = (1) x scalar(@q); @q = qw/ ( [ { } ] ) /; @is_container_token{@q} = (1) x scalar(@q); # The decision about where to break a line depends upon a "bond # strength" between tokens. The LOWER the bond strength, the MORE # likely a break. 
A bond strength may be any value but to simplify # things there are several pre-defined strength levels: # NO_BREAK => 10000; # VERY_STRONG => 100; # STRONG => 2.1; # NOMINAL => 1.1; # WEAK => 0.8; # VERY_WEAK => 0.55; # The strength values are based on trial-and-error, and need to be # tweaked occasionally to get desired results. Some comments: # # 1. Only relative strengths are important. small differences # in strengths can make big formatting differences. # 2. Each indentation level adds one unit of bond strength. # 3. A value of NO_BREAK makes an unbreakable bond # 4. A value of VERY_WEAK is the strength of a ',' # 5. Values below NOMINAL are considered ok break points. # 6. Values above NOMINAL are considered poor break points. # # The bond strengths should roughly follow precedence order where # possible. If you make changes, please check the results very # carefully on a variety of scripts. Testing with the -extrude # options is particularly helpful in exercising all of the rules. # Wherever possible, bond strengths are defined in the following # tables. There are two main stages to setting bond strengths and # two types of tables: # # The first stage involves looking at each token individually and # defining left and right bond strengths, according to if we want # to break to the left or right side, and how good a break point it # is. For example tokens like =, ||, && make good break points and # will have low strengths, but one might want to break on either # side to put them at the end of one line or beginning of the next. # # The second stage involves looking at certain pairs of tokens and # defining a bond strength for that particular pair. This second # stage has priority. #--------------------------------------------------------------- # Bond Strength BEGIN Section 1. # Set left and right bond strengths of individual tokens. 
        #---------------------------------------------------------------

        # NOTE: NO_BREAK's set in this first section are HINTS which will
        # probably not be honored.  Essential NO_BREAK's should be set in
        # BEGIN Section 2 or hardwired in the NO_BREAK coding near the end
        # of this subroutine.

        # Note that we are setting defaults in this section.  The user
        # cannot change bond strengths but can cause the left and right
        # bond strengths of any token type to be swapped through the use of
        # the -wba and -wbb flags. In this way the user can determine if a
        # breakpoint token should appear at the end of one line or the
        # beginning of the next line.

        %right_bond_strength          = ();
        %left_bond_strength           = ();
        %binary_bond_strength_nospace = ();
        %binary_bond_strength         = ();
        %nobreak_lhs                  = ();
        %nobreak_rhs                  = ();

        # The hash keys in this section are token types, plus the text of
        # certain keywords like 'or', 'and'.

        # no break around possible filehandle
        $left_bond_strength{'Z'}  = NO_BREAK;
        $right_bond_strength{'Z'} = NO_BREAK;

        # never put a bare word on a new line:
        # example print (STDERR, "bla"); will fail with break after (
        $left_bond_strength{'w'} = NO_BREAK;

        # blanks always have infinite strength to force breaks after
        # real tokens
        $right_bond_strength{'b'} = NO_BREAK;

        # try not to break on exponentiation
        @q = qw# ** .. ...
<=> #; @left_bond_strength{@q} = (STRONG) x scalar(@q); @right_bond_strength{@q} = (STRONG) x scalar(@q); # The comma-arrow has very low precedence but not a good break point $left_bond_strength{'=>'} = NO_BREAK; $right_bond_strength{'=>'} = NOMINAL; # ok to break after label $left_bond_strength{'J'} = NO_BREAK; $right_bond_strength{'J'} = NOMINAL; $left_bond_strength{'j'} = STRONG; $right_bond_strength{'j'} = STRONG; $left_bond_strength{'A'} = STRONG; $right_bond_strength{'A'} = STRONG; $left_bond_strength{'->'} = STRONG; $right_bond_strength{'->'} = VERY_STRONG; $left_bond_strength{'CORE::'} = NOMINAL; $right_bond_strength{'CORE::'} = NO_BREAK; # breaking AFTER modulus operator is ok: @q = qw< % >; @left_bond_strength{@q} = (STRONG) x scalar(@q); @right_bond_strength{@q} = ( 0.1 * NOMINAL + 0.9 * STRONG ) x scalar(@q); # Break AFTER math operators * and / @q = qw< * / x >; @left_bond_strength{@q} = (STRONG) x scalar(@q); @right_bond_strength{@q} = (NOMINAL) x scalar(@q); # Break AFTER weakest math operators + and - # Make them weaker than * but a bit stronger than '.' @q = qw< + - >; @left_bond_strength{@q} = (STRONG) x scalar(@q); @right_bond_strength{@q} = ( 0.91 * NOMINAL + 0.09 * WEAK ) x scalar(@q); # Define left strength of unary plus and minus (fixes case b511) $left_bond_strength{p} = $left_bond_strength{'+'}; $left_bond_strength{m} = $left_bond_strength{'-'}; # And make right strength of unary plus and minus very high. 
# Fixes cases b670 b790 $right_bond_strength{p} = NO_BREAK; $right_bond_strength{m} = NO_BREAK; # breaking BEFORE these is just ok: @q = qw# >> << #; @right_bond_strength{@q} = (STRONG) x scalar(@q); @left_bond_strength{@q} = (NOMINAL) x scalar(@q); # breaking before the string concatenation operator seems best # because it can be hard to see at the end of a line $right_bond_strength{'.'} = STRONG; $left_bond_strength{'.'} = 0.9 * NOMINAL + 0.1 * WEAK; @q = qw< } ] ) R >; @left_bond_strength{@q} = (STRONG) x scalar(@q); @right_bond_strength{@q} = (NOMINAL) x scalar(@q); # make these a little weaker than nominal so that they get # favored for end-of-line characters @q = qw< != == =~ !~ ~~ !~~ >; @left_bond_strength{@q} = (STRONG) x scalar(@q); @right_bond_strength{@q} = ( 0.9 * NOMINAL + 0.1 * WEAK ) x scalar(@q); # break AFTER these @q = qw# < > | & >= <= #; @left_bond_strength{@q} = (VERY_STRONG) x scalar(@q); @right_bond_strength{@q} = ( 0.8 * NOMINAL + 0.2 * WEAK ) x scalar(@q); # breaking either before or after a quote is ok # but bias for breaking before a quote $left_bond_strength{'Q'} = NOMINAL; $right_bond_strength{'Q'} = NOMINAL + 0.02; $left_bond_strength{'q'} = NOMINAL; $right_bond_strength{'q'} = NOMINAL; # starting a line with a keyword is usually ok $left_bond_strength{'k'} = NOMINAL; # we usually want to bond a keyword strongly to what immediately # follows, rather than leaving it stranded at the end of a line $right_bond_strength{'k'} = STRONG; $left_bond_strength{'G'} = NOMINAL; $right_bond_strength{'G'} = STRONG; # assignment operators @q = qw( = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x= ); # Default is to break AFTER various assignment operators @left_bond_strength{@q} = (STRONG) x scalar(@q); @right_bond_strength{@q} = ( 0.4 * WEAK + 0.6 * VERY_WEAK ) x scalar(@q); # Default is to break BEFORE '&&' and '||' and '//' # set strength of '||' to same as '=' so that chains like # $a = $b || $c || $d will break before the first '||' 
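        #
        # Worked example (a sketch only, assuming the constants have the
        # values listed in the comment table near the top of this sub:
        # NOMINAL = 1.1, WEAK = 0.8 and VERY_WEAK = 0.55):
        #   right strength of '=' = 0.4 * WEAK + 0.6 * VERY_WEAK
        #                         = 0.32 + 0.33 = 0.65
        #   left strength of '||' = right strength of '=' = 0.65
        #   left strength of '//' = right strength of '=' = 0.65
        #   left strength of '&&' = left strength of '||' + 0.1 = 0.75
        # All of these are below NOMINAL, so a break BEFORE any of these
        # operators counts as an ok break point, while the right side of a
        # ',' (VERY_WEAK = 0.55) remains a slightly better one.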
$right_bond_strength{'||'} = NOMINAL; $left_bond_strength{'||'} = $right_bond_strength{'='}; # same thing for '//' $right_bond_strength{'//'} = NOMINAL; $left_bond_strength{'//'} = $right_bond_strength{'='}; # set strength of && a little higher than || $right_bond_strength{'&&'} = NOMINAL; $left_bond_strength{'&&'} = $left_bond_strength{'||'} + 0.1; $left_bond_strength{';'} = VERY_STRONG; $right_bond_strength{';'} = VERY_WEAK; $left_bond_strength{'f'} = VERY_STRONG; # make right strength of for ';' a little less than '=' # to make for contents break after the ';' to avoid this: # for ( $j = $number_of_fields - 1 ; $j < $item_count ; $j += # $number_of_fields ) # and make it weaker than ',' and 'and' too $right_bond_strength{'f'} = VERY_WEAK - 0.03; # The strengths of ?/: should be somewhere between # an '=' and a quote (NOMINAL), # make strength of ':' slightly less than '?' to help # break long chains of ? : after the colons $left_bond_strength{':'} = 0.4 * WEAK + 0.6 * NOMINAL; $right_bond_strength{':'} = NO_BREAK; $left_bond_strength{'?'} = $left_bond_strength{':'} + 0.01; $right_bond_strength{'?'} = NO_BREAK; $left_bond_strength{','} = VERY_STRONG; $right_bond_strength{','} = VERY_WEAK; # remaining digraphs and trigraphs not defined above @q = qw( :: <> ++ --); @left_bond_strength{@q} = (WEAK) x scalar(@q); @right_bond_strength{@q} = (STRONG) x scalar(@q); # Set bond strengths of certain keywords # make 'or', 'err', 'and' slightly weaker than a ',' $left_bond_strength{'and'} = VERY_WEAK - 0.01; $left_bond_strength{'or'} = VERY_WEAK - 0.02; $left_bond_strength{'err'} = VERY_WEAK - 0.02; $left_bond_strength{'xor'} = VERY_WEAK - 0.01; $right_bond_strength{'and'} = NOMINAL; $right_bond_strength{'or'} = NOMINAL; $right_bond_strength{'err'} = NOMINAL; $right_bond_strength{'xor'} = NOMINAL; #--------------------------------------------------------------- # Bond Strength BEGIN Section 2. # Set binary rules for bond strengths between certain token types. 
#--------------------------------------------------------------- # We have a little problem making tables which apply to the # container tokens. Here is a list of container tokens and # their types: # # type tokens // meaning # { {, [, ( // indent # } }, ], ) // outdent # [ [ // left non-structural [ (enclosing an array index) # ] ] // right non-structural square bracket # ( ( // left non-structural paren # ) ) // right non-structural paren # L { // left non-structural curly brace (enclosing a key) # R } // right non-structural curly brace # # Some rules apply to token types and some to just the token # itself. We solve the problem by combining type and token into a # new hash key for the container types. # # If a rule applies to a token 'type' then we need to make rules # for each of these 'type.token' combinations: # Type Type.Token # { {{, {[, {( # [ [[ # ( (( # L L{ # } }}, }], }) # ] ]] # ) )) # R R} # # If a rule applies to a token then we need to make rules for # these 'type.token' combinations: # Token Type.Token # { {{, L{ # [ {[, [[ # ( {(, (( # } }}, R} # ] }], ]] # ) }), )) # allow long lines before final { in an if statement, as in: # if (.......... # ..........) # { # # Otherwise, the line before the { tends to be too short. 
$binary_bond_strength{'))'}{'{{'} = VERY_WEAK + 0.03; $binary_bond_strength{'(('}{'{{'} = NOMINAL; # break on something like '} (', but keep this stronger than a ',' # example is in 'howe.pl' $binary_bond_strength{'R}'}{'(('} = 0.8 * VERY_WEAK + 0.2 * WEAK; $binary_bond_strength{'}}'}{'(('} = 0.8 * VERY_WEAK + 0.2 * WEAK; # keep matrix and hash indices together # but make them a little below STRONG to allow breaking open # something like {'some-word'}{'some-very-long-word'} at the }{ # (bracebrk.t) $binary_bond_strength{']]'}{'[['} = 0.9 * STRONG + 0.1 * NOMINAL; $binary_bond_strength{']]'}{'L{'} = 0.9 * STRONG + 0.1 * NOMINAL; $binary_bond_strength{'R}'}{'[['} = 0.9 * STRONG + 0.1 * NOMINAL; $binary_bond_strength{'R}'}{'L{'} = 0.9 * STRONG + 0.1 * NOMINAL; # increase strength to the point where a break in the following # will be after the opening paren rather than at the arrow: # $a->$b($c); $binary_bond_strength{'i'}{'->'} = 1.45 * STRONG; # Added for c140 to make 'w ->' and 'i ->' behave the same $binary_bond_strength{'w'}{'->'} = 1.45 * STRONG; # Note that the following alternative strength would make the break at the # '->' rather than opening the '('. Both have advantages and disadvantages. 
# $binary_bond_strength{'i'}{'->'} = 0.5*STRONG + 0.5 * NOMINAL; # $binary_bond_strength{'))'}{'->'} = 0.1 * STRONG + 0.9 * NOMINAL; $binary_bond_strength{']]'}{'->'} = 0.1 * STRONG + 0.9 * NOMINAL; $binary_bond_strength{'})'}{'->'} = 0.1 * STRONG + 0.9 * NOMINAL; $binary_bond_strength{'}]'}{'->'} = 0.1 * STRONG + 0.9 * NOMINAL; $binary_bond_strength{'}}'}{'->'} = 0.1 * STRONG + 0.9 * NOMINAL; $binary_bond_strength{'R}'}{'->'} = 0.1 * STRONG + 0.9 * NOMINAL; $binary_bond_strength{'))'}{'[['} = 0.2 * STRONG + 0.8 * NOMINAL; $binary_bond_strength{'})'}{'[['} = 0.2 * STRONG + 0.8 * NOMINAL; $binary_bond_strength{'))'}{'{['} = 0.2 * STRONG + 0.8 * NOMINAL; $binary_bond_strength{'})'}{'{['} = 0.2 * STRONG + 0.8 * NOMINAL; #--------------------------------------------------------------- # Binary NO_BREAK rules #--------------------------------------------------------------- # use strict requires that bare word and => not be separated $binary_bond_strength{'C'}{'=>'} = NO_BREAK; $binary_bond_strength{'U'}{'=>'} = NO_BREAK; # Never break between a bareword and a following paren because # perl may give an error. For example, if a break is placed # between 'to_filehandle' and its '(' the following line will # give a syntax error [Carp.pm]: my( $no) =fileno( # to_filehandle( $in)) ; $binary_bond_strength{'C'}{'(('} = NO_BREAK; $binary_bond_strength{'C'}{'{('} = NO_BREAK; $binary_bond_strength{'U'}{'(('} = NO_BREAK; $binary_bond_strength{'U'}{'{('} = NO_BREAK; # use strict requires that bare word within braces not start new # line $binary_bond_strength{'L{'}{'w'} = NO_BREAK; $binary_bond_strength{'w'}{'R}'} = NO_BREAK; # The following two rules prevent a syntax error caused by breaking up # a construction like '{-y}'. The '-' quotes the 'y' and prevents # it from being taken as a transliteration. We have to keep # token types 'L m w' together to prevent this error. 
$binary_bond_strength{'L{'}{'m'} = NO_BREAK; $binary_bond_strength_nospace{'m'}{'w'} = NO_BREAK; # keep 'bareword-' together, but only if there is no space between # the word and dash. Do not keep together if there is a space. # example 'use perl6-alpha' $binary_bond_strength_nospace{'w'}{'m'} = NO_BREAK; # use strict requires that bare word and => not be separated $binary_bond_strength{'w'}{'=>'} = NO_BREAK; # use strict does not allow separating type info from trailing { } # testfile is readmail.pl $binary_bond_strength{'t'}{'L{'} = NO_BREAK; $binary_bond_strength{'i'}{'L{'} = NO_BREAK; # As a defensive measure, do not break between a '(' and a # filehandle. In some cases, this can cause an error. For # example, the following program works: # my $msg="hi!\n"; # print # ( STDOUT # $msg # ); # # But this program fails: # my $msg="hi!\n"; # print # ( # STDOUT # $msg # ); # # This is normally only a problem with the 'extrude' option $binary_bond_strength{'(('}{'Y'} = NO_BREAK; $binary_bond_strength{'{('}{'Y'} = NO_BREAK; # never break between sub name and opening paren $binary_bond_strength{'w'}{'(('} = NO_BREAK; $binary_bond_strength{'w'}{'{('} = NO_BREAK; # keep '}' together with ';' $binary_bond_strength{'}}'}{';'} = NO_BREAK; # Breaking before a ++ can cause perl to guess wrong. For # example the following line will cause a syntax error # with -extrude if we break between '$i' and '++' [fixstyle2] # print( ( $i++ & 1 ) ? $_ : ( $change{$_} || $_ ) ); $nobreak_lhs{'++'} = NO_BREAK; # Do not break before a possible file handle $nobreak_lhs{'Z'} = NO_BREAK; # use strict hates bare words on any new line. For # example, a break before the underscore here provokes the # wrath of use strict: # if ( -r $fn && ( -s _ || $AllowZeroFilesize)) { $nobreak_rhs{'F'} = NO_BREAK; $nobreak_rhs{'CORE::'} = NO_BREAK; # To prevent the tokenizer from switching between types 'w' and 'G' we # need to avoid breaking between type 'G' and the following code block # brace. Fixes case b929. 
        $nobreak_rhs{G} = NO_BREAK;

        #---------------------------------------------------------------
        # Bond Strength BEGIN Section 3.
        # Define tables and values for applying a small bias to the above
        # values.
        #---------------------------------------------------------------
        # Adding a small 'bias' to strengths is a simple way to make a line
        # break at the first of a sequence of identical terms.  For
        # example, to force a long string of conditional operators to break
        # with each line ending in a ':', we can add a small number to the
        # bond strength of each ':' (colon.t)
        @bias_tokens = qw( : && || f and or . );    # tokens which get bias
        %bias_hash   = map { $_ => 0 } @bias_tokens;
        $delta_bias  = 0.0001;    # a very small strength level
        return;

    } ## end sub initialize_bond_strength_hashes

    use constant DEBUG_BOND => 0;

    sub set_bond_strengths {

        my ($self) = @_;

        #-----------------------------------------------------------------
        # Define a 'bond strength' for each token pair in an output batch.
        # See comments above for definition of bond strength.
        #-----------------------------------------------------------------

        my $rbond_strength_to_go = [];

        my $rLL               = $self->[_rLL_];
        my $rK_weld_right     = $self->[_rK_weld_right_];
        my $rK_weld_left      = $self->[_rK_weld_left_];
        my $ris_list_by_seqno = $self->[_ris_list_by_seqno_];

        # patch: it is always ok to break at the end of a line
        $nobreak_to_go[$max_index_to_go] = 0;

        # we start a new set of bias values for each line
        %bias = %bias_hash;

        my $code_bias = -.01;    # bias for closing block braces

        my $type         = 'b';
        my $token        = SPACE;
        my $token_length = 1;
        my $last_type;
        my $last_nonblank_type  = $type;
        my $last_nonblank_token = $token;
        my $list_str            = $left_bond_strength{'?'};

        my ( $bond_str_1, $bond_str_2, $bond_str_3, $bond_str_4 );

        my (
            $block_type,         $i_next,
            $i_next_nonblank,    $next_nonblank_token,
            $next_nonblank_type, $next_token,
            $next_type,          $total_nesting_depth,
        );

        # main loop to compute bond strengths between each pair of tokens
        foreach my $i ( 0 ..
$max_index_to_go ) { $last_type = $type; if ( $type ne 'b' ) { $last_nonblank_type = $type; $last_nonblank_token = $token; } $type = $types_to_go[$i]; # strength on both sides of a blank is the same if ( $type eq 'b' && $last_type ne 'b' ) { $rbond_strength_to_go->[$i] = $rbond_strength_to_go->[ $i - 1 ]; $nobreak_to_go[$i] ||= $nobreak_to_go[ $i - 1 ]; # fix for b1257 next; } $token = $tokens_to_go[$i]; $token_length = $token_lengths_to_go[$i]; $block_type = $block_type_to_go[$i]; $i_next = $i + 1; $next_type = $types_to_go[$i_next]; $next_token = $tokens_to_go[$i_next]; $total_nesting_depth = $nesting_depth_to_go[$i_next]; $i_next_nonblank = ( ( $next_type eq 'b' ) ? $i + 2 : $i + 1 ); $next_nonblank_type = $types_to_go[$i_next_nonblank]; $next_nonblank_token = $tokens_to_go[$i_next_nonblank]; my $seqno = $type_sequence_to_go[$i]; my $next_nonblank_seqno = $type_sequence_to_go[$i_next_nonblank]; # We are computing the strength of the bond between the current # token and the NEXT token. #--------------------------------------------------------------- # Bond Strength Section 1: # First Approximation. # Use minimum of individual left and right tabulated bond # strengths. #--------------------------------------------------------------- my $bsr = $right_bond_strength{$type}; my $bsl = $left_bond_strength{$next_nonblank_type}; # define right bond strengths of certain keywords if ( $type eq 'k' && defined( $right_bond_strength{$token} ) ) { $bsr = $right_bond_strength{$token}; } elsif ( $token eq 'ne' or $token eq 'eq' ) { $bsr = NOMINAL; } # set terminal bond strength to the nominal value # this will cause good preceding breaks to be retained if ( $i_next_nonblank > $max_index_to_go ) { $bsl = NOMINAL; # But weaken the bond at a 'missing terminal comma'. If an # optional comma is missing at the end of a broken list, use # the strength of a comma anyway to make formatting the same as # if it were there. Fixes issue c133. 
                if ( !defined($bsr) || $bsr > VERY_WEAK ) {
                    my $seqno_px = $parent_seqno_to_go[$max_index_to_go];
                    if ( $ris_list_by_seqno->{$seqno_px} ) {
                        my $KK      = $K_to_go[$max_index_to_go];
                        my $Kn      = $self->K_next_nonblank($KK);
                        my $seqno_n = $rLL->[$Kn]->[_TYPE_SEQUENCE_];
                        if ( $seqno_n && $seqno_n eq $seqno_px ) {
                            $bsl = VERY_WEAK;
                        }
                    }
                }
            }

            # define left bond strengths of certain keywords
            if ( $next_nonblank_type eq 'k'
                && defined( $left_bond_strength{$next_nonblank_token} ) )
            {
                $bsl = $left_bond_strength{$next_nonblank_token};
            }
            elsif ($next_nonblank_token eq 'ne'
                or $next_nonblank_token eq 'eq' )
            {
                $bsl = NOMINAL;
            }
            elsif ( $is_lt_gt_le_ge{$next_nonblank_token} ) {
                $bsl = 0.9 * NOMINAL + 0.1 * STRONG;
            }

            # Use the minimum of the left and right strengths.  Note: it might
            # seem that we would want to keep a NO_BREAK if either token has
            # this value.  This didn't work, for example because in an arrow
            # list, it prevents the comma from separating from the following
            # bare word (which is probably quoted by its arrow).  So necessary
            # NO_BREAK's have to be handled as special cases in the final
            # section.
            if ( !defined($bsr) ) { $bsr = VERY_STRONG }
            if ( !defined($bsl) ) { $bsl = VERY_STRONG }
            my $bond_str = ( $bsr < $bsl ) ? $bsr : $bsl;
            $bond_str_1 = $bond_str if (DEBUG_BOND);

            #---------------------------------------------------------------
            # Bond Strength Section 2:
            # Apply hardwired rules.
#--------------------------------------------------------------- # Patch to put terminal or clauses on a new line: Weaken the bond # at an || followed by die or similar keyword to make the terminal # or clause fall on a new line, like this: # # my $class = shift # || die "Cannot add broadcast: No class identifier found"; # # Otherwise the break will be at the previous '=' since the || and # = have the same starting strength and the or is biased, like # this: # # my $class = # shift || die "Cannot add broadcast: No class identifier found"; # # In any case if the user places a break at either the = or the || # it should remain there. if ( $type eq '||' || $type eq 'k' && $token eq 'or' ) { # /^(die|confess|croak|warn)$/ if ( $is_die_confess_croak_warn{$next_nonblank_token} ) { if ( $want_break_before{$token} && $i > 0 ) { $rbond_strength_to_go->[ $i - 1 ] -= $delta_bias; # keep bond strength of a token and its following blank # the same if ( $types_to_go[ $i - 1 ] eq 'b' && $i > 2 ) { $rbond_strength_to_go->[ $i - 2 ] -= $delta_bias; } } else { $bond_str -= $delta_bias; } } } # good to break after end of code blocks if ( $type eq '}' && $block_type && $next_nonblank_type ne ';' ) { $bond_str = 0.5 * WEAK + 0.5 * VERY_WEAK + $code_bias; $code_bias += $delta_bias; } if ( $type eq 'k' ) { # allow certain control keywords to stand out if ( $next_nonblank_type eq 'k' && $is_last_next_redo_return{$token} ) { $bond_str = 0.45 * WEAK + 0.55 * VERY_WEAK; } # Don't break after keyword my. This is a quick fix for a # rare problem with perl. 
An example is this line from file # Container.pm: # foreach my $question( Debian::DebConf::ConfigDb::gettree( # $this->{'question'} ) ) if ( $token eq 'my' ) { $bond_str = NO_BREAK; } } if ( $next_nonblank_type eq 'k' && $type ne 'CORE::' ) { if ( $is_keyword_returning_list{$next_nonblank_token} ) { $bond_str = $list_str if ( $bond_str > $list_str ); } # keywords like 'unless', 'if', etc, within statements # make good breaks if ( $is_good_keyword_breakpoint{$next_nonblank_token} ) { $bond_str = VERY_WEAK / 1.05; } } # try not to break before a comma-arrow elsif ( $next_nonblank_type eq '=>' ) { if ( $bond_str < STRONG ) { $bond_str = STRONG } } #--------------------------------------------------------------- # Additional hardwired NOBREAK rules #--------------------------------------------------------------- # map1.t -- correct for a quirk in perl if ( $token eq '(' && $next_nonblank_type eq 'i' && $last_nonblank_type eq 'k' && $is_sort_map_grep{$last_nonblank_token} ) # /^(sort|map|grep)$/ ) { $bond_str = NO_BREAK; } # extrude.t: do not break before paren at: # -l pid_filename( if ( $last_nonblank_type eq 'F' && $next_nonblank_token eq '(' ) { $bond_str = NO_BREAK; } # OLD COMMENT: In older version of perl, use strict can cause # problems with breaks before bare words following opening parens. # For example, this will fail under older versions if a break is # made between '(' and 'MAIL': # use strict; open( MAIL, "a long filename or command"); close MAIL; # NEW COMMENT: Third fix for b1213: # This option does not seem to be needed any longer, and it can # cause instabilities. It can be turned off, but to minimize # changes to existing formatting it is retained only in the case # where the previous token was 'open' and there was no line break. # Even this could eventually be removed if it causes instability. 
if ( $type eq '{' ) { if ( $token eq '(' && $next_nonblank_type eq 'w' && $last_nonblank_type eq 'k' && $last_nonblank_token eq 'open' && !$old_breakpoint_to_go[$i] ) { $bond_str = NO_BREAK; } } # Do not break between a possible filehandle and a ? or / and do # not introduce a break after it if there is no blank # (extrude.t) elsif ( $type eq 'Z' ) { # don't break.. if ( # if there is no blank and we do not want one. Examples: # print $x++ # do not break after $x # print HTML"HELLO" # break ok after HTML ( $next_type ne 'b' && defined( $want_left_space{$next_type} ) && $want_left_space{$next_type} == WS_NO ) # or we might be followed by the start of a quote, # and this is not an existing breakpoint; fixes c039. || !$old_breakpoint_to_go[$i] && substr( $next_nonblank_token, 0, 1 ) eq '/' ) { $bond_str = NO_BREAK; } } # Breaking before a ? before a quote can cause trouble if # they are not separated by a blank. # Example: a syntax error occurs if you break before the ? here # my$logic=join$all?' && ':' || ',@regexps; # From: Professional_Perl_Programming_Code/multifind.pl if ( $next_nonblank_type eq '?' ) { $bond_str = NO_BREAK if ( $types_to_go[ $i_next_nonblank + 1 ] eq 'Q' ); } # Breaking before a . followed by a number # can cause trouble if there is no intervening space # Example: a syntax error occurs if you break before the .2 here # $str .= pack($endian.2, ensurrogate($ord)); # From: perl58/Unicode.pm elsif ( $next_nonblank_type eq '.' 
) { $bond_str = NO_BREAK if ( $types_to_go[ $i_next_nonblank + 1 ] eq 'n' ); } # Fix for c039 elsif ( $type eq 'w' ) { $bond_str = NO_BREAK if ( !$old_breakpoint_to_go[$i] && substr( $next_nonblank_token, 0, 1 ) eq '/' && $next_nonblank_type ne '//' ); } $bond_str_2 = $bond_str if (DEBUG_BOND); #--------------------------------------------------------------- # End of hardwired rules #--------------------------------------------------------------- #--------------------------------------------------------------- # Bond Strength Section 3: # Apply table rules. These have priority over the above # hardwired rules. #--------------------------------------------------------------- my $tabulated_bond_str; my $ltype = $type; my $rtype = $next_nonblank_type; if ( $seqno && $is_container_token{$token} ) { $ltype = $type . $token; } if ( $next_nonblank_seqno && $is_container_token{$next_nonblank_token} ) { $rtype = $next_nonblank_type . $next_nonblank_token; # Alternate Fix #1 for issue b1299. This version makes the # decision as soon as possible. See Alternate Fix #2 also. # Do not separate a bareword identifier from its paren: b1299 # This is currently needed for stability because if the bareword # gets separated from a preceding '->' and following '(' then # the tokenizer may switch from type 'i' to type 'w'. This # patch will prevent this by keeping it adjacent to its '('. 
                ## if (   $next_nonblank_token eq '('
                ##     && $ltype eq 'i'
                ##     && substr( $token, 0, 1 ) =~ /^\w$/ )
                ## {
                ##     $ltype = 'w';
                ## }
            }

            # apply binary rules which apply regardless of space between tokens
            if ( $binary_bond_strength{$ltype}{$rtype} ) {
                $bond_str           = $binary_bond_strength{$ltype}{$rtype};
                $tabulated_bond_str = $bond_str;
            }

            # apply binary rules which apply only if no space between tokens;
            # the value must come from the same 'nospace' table that is tested
            if ( $binary_bond_strength_nospace{$ltype}{$next_type} ) {
                $bond_str = $binary_bond_strength_nospace{$ltype}{$next_type};
                $tabulated_bond_str = $bond_str;
            }

            if ( $nobreak_rhs{$ltype} || $nobreak_lhs{$rtype} ) {
                $bond_str           = NO_BREAK;
                $tabulated_bond_str = $bond_str;
            }

            $bond_str_3 = $bond_str if (DEBUG_BOND);

            # If the hardwired rules conflict with the tabulated bond
            # strength then there is an inconsistency that should be fixed
            DEBUG_BOND
              && $tabulated_bond_str
              && $bond_str_1
              && $bond_str_1 != $bond_str_2
              && $bond_str_2 != $tabulated_bond_str
              && do {
                print STDERR
"BOND_TABLES: ltype=$ltype rtype=$rtype $bond_str_1->$bond_str_2->$bond_str_3\n";
              };

            #-----------------------------------------------------------------
            # Bond Strength Section 4:
            # Modify strengths of certain tokens which often occur in sequence
            # by adding a small bias to each one in turn so that the breaks
            # occur from left to right.
            #
            # Note that we are only changing strengths by small amounts here,
            # and usually increasing, so we should not be altering any NO_BREAKs.
            # Other routines which check for NO_BREAKs will use a tolerance
            # of one to avoid any problem.
            #-----------------------------------------------------------------

            # The bias tables use special keys:
            #   $type - if not keyword
            #   $token - if keyword, but map some keywords together
            my $left_key =
              $type eq 'k' ? $token eq 'err' ? 'or' : $token : $type;
            my $right_key =
              $next_nonblank_type eq 'k'
              ? $next_nonblank_token eq 'err' ?
'or' : $next_nonblank_token : $next_nonblank_type; # bias left token if ( defined( $bias{$left_key} ) ) { if ( !$want_break_before{$left_key} ) { $bias{$left_key} += $delta_bias; $bond_str += $bias{$left_key}; } } # bias right token if ( defined( $bias{$right_key} ) ) { if ( $want_break_before{$right_key} ) { # for leading '.' align all but 'short' quotes; the idea # is to not place something like "\n" on a single line. if ( $right_key eq '.' ) { unless ( $last_nonblank_type eq '.' && ( $token_length <= $rOpts_short_concatenation_item_length ) && ( !$is_closing_token{$token} ) ) { $bias{$right_key} += $delta_bias; } } else { $bias{$right_key} += $delta_bias; } $bond_str += $bias{$right_key}; } } $bond_str_4 = $bond_str if (DEBUG_BOND); #--------------------------------------------------------------- # Bond Strength Section 5: # Fifth Approximation. # Take nesting depth into account by adding the nesting depth # to the bond strength. #--------------------------------------------------------------- my $strength; if ( defined($bond_str) && !$nobreak_to_go[$i] ) { if ( $total_nesting_depth > 0 ) { $strength = $bond_str + $total_nesting_depth; } else { $strength = $bond_str; } } else { $strength = NO_BREAK; # For critical code such as lines with here targets we must # be absolutely sure that we do not allow a break. So for # these the nobreak flag exceeds 1 as a signal. Otherwise we # can run into trouble when small tolerances are added. $strength += 1 if ( $nobreak_to_go[$i] && $nobreak_to_go[$i] > 1 ); } #--------------------------------------------------------------- # Bond Strength Section 6: # Sixth Approximation. Welds. 
#--------------------------------------------------------------- # Do not allow a break within welds if ( $total_weld_count && $seqno ) { my $KK = $K_to_go[$i]; if ( $rK_weld_right->{$KK} ) { $strength = NO_BREAK; } # But encourage breaking after opening welded tokens elsif ($rK_weld_left->{$KK} && $is_opening_token{$token} ) { $strength -= 1; } } # always break after side comment if ( $type eq '#' ) { $strength = 0 } $rbond_strength_to_go->[$i] = $strength; # Fix for case c001: be sure NO_BREAK's are enforced by later # routines, except at a '?' because '?' as quote delimiter is # deprecated. if ( $strength >= NO_BREAK && $next_nonblank_type ne '?' ) { $nobreak_to_go[$i] ||= 1; } DEBUG_BOND && do { my $str = substr( $token, 0, 15 ); $str .= SPACE x ( 16 - length($str) ); print STDOUT "BOND: i=$i $str $type $next_nonblank_type depth=$total_nesting_depth strength=$bond_str_1 -> $bond_str_2 -> $bond_str_3 -> $bond_str_4 $bond_str -> $strength \n"; # reset for next pass $bond_str_1 = $bond_str_2 = $bond_str_3 = $bond_str_4 = undef; }; } ## end main loop return $rbond_strength_to_go; } ## end sub set_bond_strengths } ## end closure set_bond_strengths sub bad_pattern { # See if a pattern will compile. We have to use a string eval here, # but it should be safe because the pattern has been constructed # by this program. 
my ($pattern) = @_; my $ok = eval "'##'=~/$pattern/"; return !defined($ok) || $EVAL_ERROR; } ## end sub bad_pattern { ## begin closure prepare_cuddled_block_types my %no_cuddle; # Add keywords here which really should not be cuddled BEGIN { my @q = qw(if unless for foreach while); @no_cuddle{@q} = (1) x scalar(@q); } sub prepare_cuddled_block_types { # the cuddled-else style, if used, is controlled by a hash that # we construct here # Include keywords here which should not be cuddled my $cuddled_string = EMPTY_STRING; if ( $rOpts->{'cuddled-else'} ) { # set the default $cuddled_string = 'elsif else continue catch finally' unless ( $rOpts->{'cuddled-block-list-exclusive'} ); # This is the old equivalent but more complex version # $cuddled_string = 'if-elsif-else unless-elsif-else -continue '; # Add the user's other blocks to be cuddled my $cuddled_block_list = $rOpts->{'cuddled-block-list'}; if ($cuddled_block_list) { $cuddled_string .= SPACE . $cuddled_block_list; } } # If we have a cuddled string of the form # 'try-catch-finally' # we want to prepare a hash of the form # $rcuddled_block_types = { # 'try' => { # 'catch' => 1, # 'finally' => 1 # }, # }; # use -dcbl to dump this hash # Multiple such strings are input as a space or comma separated list # If we get two lists with the same leading type, such as # -cbl = "-try-catch-finally -try-catch-otherwise" # then they will get merged as follows: # $rcuddled_block_types = { # 'try' => { # 'catch' => 1, # 'finally' => 2, # 'otherwise' => 1, # }, # }; # This will allow either type of chain to be followed. $cuddled_string =~ s/,/ /g; # allow space or comma separated lists my @cuddled_strings = split /\s+/, $cuddled_string; $rcuddled_block_types = {}; # process each dash-separated string... my $string_count = 0; foreach my $string (@cuddled_strings) { next unless $string; my @words = split /-+/, $string; # allow multiple dashes # we could look for and report possible errors here... 
next unless ( @words > 0 ); # allow either '-continue' or '*-continue' for arbitrary starting type my $start = '*'; # a single word without dashes is a secondary block type if ( @words > 1 ) { $start = shift @words; } # always make an entry for the leading word. If none follow, this # will still prevent a wildcard from matching this word. if ( !defined( $rcuddled_block_types->{$start} ) ) { $rcuddled_block_types->{$start} = {}; } # The count gives the original word order in case we ever want it. $string_count++; my $word_count = 0; foreach my $word (@words) { next unless $word; if ( $no_cuddle{$word} ) { Warn( "## Ignoring keyword '$word' in -cbl; does not seem right\n" ); next; } $word_count++; $rcuddled_block_types->{$start}->{$word} = 1; #"$string_count.$word_count"; # git#9: Remove this word from the list of desired one-line # blocks $want_one_line_block{$word} = 0; } } return; } ## end sub prepare_cuddled_block_types } ## end closure prepare_cuddled_block_types sub dump_cuddled_block_list { my ($fh) = @_; # ORIGINAL METHOD: Here is the format of the cuddled block type hash # which controls this routine # my $rcuddled_block_types = { # 'if' => { # 'else' => 1, # 'elsif' => 1 # }, # 'try' => { # 'catch' => 1, # 'finally' => 1 # }, # }; # SIMPLIFIED METHOD: the simplified method uses a wildcard for # the starting block type and puts all cuddled blocks together: # my $rcuddled_block_types = { # '*' => { # 'else' => 1, # 'elsif' => 1, # 'catch' => 1, # 'finally' => 1 # }, # }; # Both methods work, but the simplified method has proven to be adequate and # easier to manage. 
my $cuddled_string = $rOpts->{'cuddled-block-list'}; $cuddled_string = EMPTY_STRING unless $cuddled_string; my $flags = EMPTY_STRING; $flags .= "-ce" if ( $rOpts->{'cuddled-else'} ); $flags .= " -cbl='$cuddled_string'"; unless ( $rOpts->{'cuddled-else'} ) { $flags .= "\nNote: You must specify -ce to generate a cuddled hash"; } $fh->print(<print( Dumper($rcuddled_block_types) ); $fh->print(<{'static-block-comment-prefix'} ) { my $prefix = $rOpts->{'static-block-comment-prefix'}; $prefix =~ s/^\s*//; my $pattern = $prefix; # user may give leading caret to force matching left comments only if ( $prefix !~ /^\^#/ ) { if ( $prefix !~ /^#/ ) { Die( "ERROR: the -sbcp prefix is '$prefix' but must begin with '#' or '^#'\n" ); } $pattern = '^\s*' . $prefix; } if ( bad_pattern($pattern) ) { Die( "ERROR: the -sbc prefix '$prefix' causes the invalid regex '$pattern'\n" ); } $static_block_comment_pattern = $pattern; } return; } ## end sub make_static_block_comment_pattern sub make_format_skipping_pattern { my ( $opt_name, $default ) = @_; my $param = $rOpts->{$opt_name}; unless ($param) { $param = $default } $param =~ s/^\s*//; if ( $param !~ /^#/ ) { Die("ERROR: the $opt_name parameter '$param' must begin with '#'\n"); } my $pattern = '^' . $param . '\s'; if ( bad_pattern($pattern) ) { Die( "ERROR: the $opt_name parameter '$param' causes the invalid regex '$pattern'\n" ); } return $pattern; } ## end sub make_format_skipping_pattern sub make_non_indenting_brace_pattern { # Create the pattern used to identify static side comments. # Note that we are ending the pattern in a \s. This will allow # the pattern to be followed by a space and some text, or a newline. 
# The pattern is used in sub 'non_indenting_braces' $non_indenting_brace_pattern = '^#<<<\s'; # allow the user to change it if ( $rOpts->{'non-indenting-brace-prefix'} ) { my $prefix = $rOpts->{'non-indenting-brace-prefix'}; $prefix =~ s/^\s*//; if ( $prefix !~ /^#/ ) { Die("ERROR: the -nibp parameter '$prefix' must begin with '#'\n"); } my $pattern = '^' . $prefix . '\s'; if ( bad_pattern($pattern) ) { Die( "ERROR: the -nibp prefix '$prefix' causes the invalid regex '$pattern'\n" ); } $non_indenting_brace_pattern = $pattern; } return; } ## end sub make_non_indenting_brace_pattern sub make_closing_side_comment_list_pattern { # turn any input list into a regex for recognizing selected block types $closing_side_comment_list_pattern = '^\w+'; if ( defined( $rOpts->{'closing-side-comment-list'} ) && $rOpts->{'closing-side-comment-list'} ) { $closing_side_comment_list_pattern = make_block_pattern( '-cscl', $rOpts->{'closing-side-comment-list'} ); } return; } ## end sub make_closing_side_comment_list_pattern sub make_sub_matching_pattern { # Patterns for standardizing matches to block types for regular subs and # anonymous subs. Examples # 'sub process' is a named sub # 'sub ::m' is a named sub # 'sub' is an anonymous sub # 'sub:' is a label, not a sub # 'sub :' is a label, not a sub ( block type will be ) # sub'_ is a named sub ( block type will be ) # 'substr' is a keyword # So note that named subs always have a space after 'sub' $SUB_PATTERN = '^sub\s'; # match normal sub $ASUB_PATTERN = '^sub$'; # match anonymous sub # Note (see also RT #133130): These patterns are used by # sub make_block_pattern, which is used for making most patterns. # So this sub needs to be called before other pattern-making routines. 
if ( $rOpts->{'sub-alias-list'} ) { # Note that any 'sub-alias-list' has been preprocessed to # be a trimmed, space-separated list which includes 'sub' # for example, it might be 'sub method fun' my $sub_alias_list = $rOpts->{'sub-alias-list'}; $sub_alias_list =~ s/\s+/\|/g; $SUB_PATTERN =~ s/sub/\($sub_alias_list\)/; $ASUB_PATTERN =~ s/sub/\($sub_alias_list\)/; } return; } ## end sub make_sub_matching_pattern sub make_bl_pattern { # Set defaults lists to retain historical default behavior for -bl: my $bl_list_string = '*'; my $bl_exclusion_list_string = 'sort map grep eval asub'; if ( defined( $rOpts->{'brace-left-list'} ) && $rOpts->{'brace-left-list'} ) { $bl_list_string = $rOpts->{'brace-left-list'}; } if ( $bl_list_string =~ /\bsub\b/ ) { $rOpts->{'opening-sub-brace-on-new-line'} ||= $rOpts->{'opening-brace-on-new-line'}; } if ( $bl_list_string =~ /\basub\b/ ) { $rOpts->{'opening-anonymous-sub-brace-on-new-line'} ||= $rOpts->{'opening-brace-on-new-line'}; } $bl_pattern = make_block_pattern( '-bll', $bl_list_string ); # for -bl, a list with '*' turns on -sbl and -asbl if ( $bl_pattern =~ /\.\*/ ) { $rOpts->{'opening-sub-brace-on-new-line'} ||= $rOpts->{'opening-brace-on-new-line'}; $rOpts->{'opening-anonymous-sub-brace-on-new-line'} ||= $rOpts->{'opening-anonymous-brace-on-new-line'}; } if ( defined( $rOpts->{'brace-left-exclusion-list'} ) && $rOpts->{'brace-left-exclusion-list'} ) { $bl_exclusion_list_string = $rOpts->{'brace-left-exclusion-list'}; if ( $bl_exclusion_list_string =~ /\bsub\b/ ) { $rOpts->{'opening-sub-brace-on-new-line'} = 0; } if ( $bl_exclusion_list_string =~ /\basub\b/ ) { $rOpts->{'opening-anonymous-sub-brace-on-new-line'} = 0; } } $bl_exclusion_pattern = make_block_pattern( '-blxl', $bl_exclusion_list_string ); return; } ## end sub make_bl_pattern sub make_bli_pattern { # default list of block types for which -bli would apply my $bli_list_string = 'if else elsif unless while for foreach do : sub'; my $bli_exclusion_list_string = SPACE; if 
( defined( $rOpts->{'brace-left-and-indent-list'} ) && $rOpts->{'brace-left-and-indent-list'} ) { $bli_list_string = $rOpts->{'brace-left-and-indent-list'}; } $bli_pattern = make_block_pattern( '-blil', $bli_list_string ); if ( defined( $rOpts->{'brace-left-and-indent-exclusion-list'} ) && $rOpts->{'brace-left-and-indent-exclusion-list'} ) { $bli_exclusion_list_string = $rOpts->{'brace-left-and-indent-exclusion-list'}; } $bli_exclusion_pattern = make_block_pattern( '-blixl', $bli_exclusion_list_string ); return; } ## end sub make_bli_pattern sub make_keyword_group_list_pattern { # turn any input list into a regex for recognizing selected block types. # Here are the defaults: $keyword_group_list_pattern = '^(our|local|my|use|require|)$'; $keyword_group_list_comment_pattern = EMPTY_STRING; if ( defined( $rOpts->{'keyword-group-blanks-list'} ) && $rOpts->{'keyword-group-blanks-list'} ) { my @words = split /\s+/, $rOpts->{'keyword-group-blanks-list'}; my @keyword_list; my @comment_list; foreach my $word (@words) { if ( $word eq 'BC' || $word eq 'SBC' ) { push @comment_list, $word; if ( $word eq 'SBC' ) { push @comment_list, 'SBCX' } } else { push @keyword_list, $word; } } $keyword_group_list_pattern = make_block_pattern( '-kgbl', $rOpts->{'keyword-group-blanks-list'} ); $keyword_group_list_comment_pattern = make_block_pattern( '-kgbl', join( SPACE, @comment_list ) ); } return; } ## end sub make_keyword_group_list_pattern sub make_block_brace_vertical_tightness_pattern { # turn any input list into a regex for recognizing selected block types $block_brace_vertical_tightness_pattern = '^((if|else|elsif|unless|while|for|foreach|do|\w+:)$|sub)'; if ( defined( $rOpts->{'block-brace-vertical-tightness-list'} ) && $rOpts->{'block-brace-vertical-tightness-list'} ) { $block_brace_vertical_tightness_pattern = make_block_pattern( '-bbvtl', $rOpts->{'block-brace-vertical-tightness-list'} ); } return; } ## end sub make_block_brace_vertical_tightness_pattern sub 
make_blank_line_pattern { $blank_lines_before_closing_block_pattern = $SUB_PATTERN; my $key = 'blank-lines-before-closing-block-list'; if ( defined( $rOpts->{$key} ) && $rOpts->{$key} ) { $blank_lines_before_closing_block_pattern = make_block_pattern( '-blbcl', $rOpts->{$key} ); } $blank_lines_after_opening_block_pattern = $SUB_PATTERN; $key = 'blank-lines-after-opening-block-list'; if ( defined( $rOpts->{$key} ) && $rOpts->{$key} ) { $blank_lines_after_opening_block_pattern = make_block_pattern( '-blaol', $rOpts->{$key} ); } return; } ## end sub make_blank_line_pattern sub make_block_pattern { # given a string of block-type keywords, return a regex to match them # The only tricky part is that labels are indicated with a single ':' # and the 'sub' token text may have additional text after it (name of # sub). # # Example: # # input string: "if else elsif unless while for foreach do : sub"; # pattern: '^((if|else|elsif|unless|while|for|foreach|do|\w+:)$|sub)'; # Minor Update: # # To distinguish between anonymous subs and named subs, use 'sub' to # indicate a named sub, and 'asub' to indicate an anonymous sub my ( $abbrev, $string ) = @_; my @list = split_words($string); my @words = (); my %seen; for my $i (@list) { if ( $i eq '*' ) { my $pattern = '^.*'; return $pattern } next if $seen{$i}; $seen{$i} = 1; if ( $i eq 'sub' ) { } elsif ( $i eq 'asub' ) { } elsif ( $i eq ';' ) { push @words, ';'; } elsif ( $i eq '{' ) { push @words, '\{'; } elsif ( $i eq ':' ) { push @words, '\w+:'; } elsif ( $i =~ /^\w/ ) { push @words, $i; } else { Warn("unrecognized block type $i after $abbrev, ignoring\n"); } } # Fix 2 for c091, prevent the pattern from matching an empty string # '1 ' is an impossible block name. if ( !@words ) { push @words, "1 " } my $pattern = '(' . join( '|', @words ) . ')$'; my $sub_patterns = EMPTY_STRING; if ( $seen{'sub'} ) { $sub_patterns .= '|' . $SUB_PATTERN; } if ( $seen{'asub'} ) { $sub_patterns .= '|' . 
$ASUB_PATTERN; } if ($sub_patterns) { $pattern = '(' . $pattern . $sub_patterns . ')'; } $pattern = '^' . $pattern; return $pattern; } ## end sub make_block_pattern sub make_static_side_comment_pattern { # create the pattern used to identify static side comments $static_side_comment_pattern = '^##'; # allow the user to change it if ( $rOpts->{'static-side-comment-prefix'} ) { my $prefix = $rOpts->{'static-side-comment-prefix'}; $prefix =~ s/^\s*//; my $pattern = '^' . $prefix; if ( bad_pattern($pattern) ) { Die( "ERROR: the -sscp prefix '$prefix' causes the invalid regex '$pattern'\n" ); } $static_side_comment_pattern = $pattern; } return; } ## end sub make_static_side_comment_pattern sub make_closing_side_comment_prefix { # Be sure we have a valid closing side comment prefix my $csc_prefix = $rOpts->{'closing-side-comment-prefix'}; my $csc_prefix_pattern; if ( !defined($csc_prefix) ) { $csc_prefix = '## end'; $csc_prefix_pattern = '^##\s+end'; } else { my $test_csc_prefix = $csc_prefix; if ( $test_csc_prefix !~ /^#/ ) { $test_csc_prefix = '#' . $test_csc_prefix; } # make a regex to recognize the prefix my $test_csc_prefix_pattern = $test_csc_prefix; # escape any special characters $test_csc_prefix_pattern =~ s/([^#\s\w])/\\$1/g; $test_csc_prefix_pattern = '^' . $test_csc_prefix_pattern; # allow exact number of intermediate spaces to vary $test_csc_prefix_pattern =~ s/\s+/\\s\+/g; # make sure we have a good pattern # if we fail this we probably have an error in escaping # characters. 
if ( bad_pattern($test_csc_prefix_pattern) ) { # shouldn't happen..must have screwed up escaping, above if (DEVEL_MODE) { Fault(<{'closing-side-comment-prefix'} = $csc_prefix; $closing_side_comment_prefix_pattern = $csc_prefix_pattern; return; } ## end sub make_closing_side_comment_prefix ################################################## # CODE SECTION 4: receive lines from the tokenizer ################################################## { ## begin closure write_line my $nesting_depth; # Variables used by sub check_sequence_numbers: my $last_seqno; my %saw_opening_seqno; my %saw_closing_seqno; my $initial_seqno; sub initialize_write_line { $nesting_depth = undef; $last_seqno = SEQ_ROOT; %saw_opening_seqno = (); %saw_closing_seqno = (); return; } ## end sub initialize_write_line sub check_sequence_numbers { # Routine for checking sequence numbers. This only needs to be # done occasionally in DEVEL_MODE to be sure everything is working # correctly. my ( $rtokens, $rtoken_type, $rtype_sequence, $input_line_no ) = @_; my $jmax = @{$rtokens} - 1; return unless ( $jmax >= 0 ); foreach my $j ( 0 .. $jmax ) { my $seqno = $rtype_sequence->[$j]; my $token = $rtokens->[$j]; my $type = $rtoken_type->[$j]; $seqno = EMPTY_STRING unless ( defined($seqno) ); my $err_msg = "Error at j=$j, line number $input_line_no, seqno='$seqno', type='$type', tok='$token':\n"; if ( !$seqno ) { # Sequence numbers are generated for opening tokens, so every opening # token should be sequenced. Closing tokens will be unsequenced # if they do not have a matching opening token. 
if ( $is_opening_sequence_token{$token} && $type ne 'q' && $type ne 'Q' ) { Fault( <2 and may have gaps if ( !defined($initial_seqno) ) { $initial_seqno = $seqno } if ( $is_opening_sequence_token{$token} ) { # New method should have continuous numbering if ( $initial_seqno == 2 && $seqno != $last_seqno + 1 ) { Fault( <[_rblock_type_of_seqno_]->{$seqno} = $block_type; if ( $block_type =~ /$ASUB_PATTERN/ ) { $self->[_ris_asub_block_]->{$seqno} = 1; } elsif ( $block_type =~ /$SUB_PATTERN/ ) { $self->[_ris_sub_block_]->{$seqno} = 1; } return; } ## end sub store_block_type sub write_line { # This routine receives lines one-by-one from the tokenizer and stores # them in a format suitable for further processing. After the last # line has been sent, the tokenizer will call sub 'finish_formatting' # to do the actual formatting. my ( $self, $line_of_tokens_old ) = @_; my $rLL = $self->[_rLL_]; my $line_of_tokens = {}; foreach ( qw( _curly_brace_depth _ending_in_quote _guessed_indentation_level _line_number _line_text _line_type _paren_depth _quote_character _square_bracket_depth _starting_in_quote ) ) { $line_of_tokens->{$_} = $line_of_tokens_old->{$_}; } my $line_type = $line_of_tokens_old->{_line_type}; my $tee_output; my $Klimit = $self->[_Klimit_]; my $Kfirst; # Handle line of non-code if ( $line_type ne 'CODE' ) { $tee_output ||= $rOpts_tee_pod && substr( $line_type, 0, 3 ) eq 'POD'; $line_of_tokens->{_level_0} = 0; $line_of_tokens->{_ci_level_0} = 0; $line_of_tokens->{_nesting_blocks_0} = EMPTY_STRING; $line_of_tokens->{_nesting_tokens_0} = EMPTY_STRING; $line_of_tokens->{_ended_in_blank_token} = undef; } # Handle line of code else { my $rtokens = $line_of_tokens_old->{_rtokens}; my $jmax = @{$rtokens} - 1; if ( $jmax >= 0 ) { $Kfirst = defined($Klimit) ? 
$Klimit + 1 : 0; #---------------------------- # get the tokens on this line #---------------------------- $self->write_line_inner_loop( $line_of_tokens_old, $line_of_tokens ); # update Klimit for added tokens $Klimit = @{$rLL} - 1; } ## end if ( $jmax >= 0 ) else { # blank line $line_of_tokens->{_level_0} = 0; $line_of_tokens->{_ci_level_0} = 0; $line_of_tokens->{_nesting_blocks_0} = EMPTY_STRING; $line_of_tokens->{_nesting_tokens_0} = EMPTY_STRING; $line_of_tokens->{_ended_in_blank_token} = undef; } $tee_output ||= $rOpts_tee_block_comments && $jmax == 0 && $rLL->[$Kfirst]->[_TYPE_] eq '#'; $tee_output ||= $rOpts_tee_side_comments && defined($Kfirst) && $Klimit > $Kfirst && $rLL->[$Klimit]->[_TYPE_] eq '#'; } ## end if ( $line_type eq 'CODE') # Finish storing line variables $line_of_tokens->{_rK_range} = [ $Kfirst, $Klimit ]; $self->[_Klimit_] = $Klimit; my $rlines = $self->[_rlines_]; push @{$rlines}, $line_of_tokens; if ($tee_output) { my $fh_tee = $self->[_fh_tee_]; my $line_text = $line_of_tokens_old->{_line_text}; $fh_tee->print($line_text) if ($fh_tee); } return; } ## end sub write_line sub write_line_inner_loop { my ( $self, $line_of_tokens_old, $line_of_tokens ) = @_; #--------------------------------------------------------------------- # Copy the tokens on one line received from the tokenizer to their new # storage locations. 
#--------------------------------------------------------------------- # Input parameters: # $line_of_tokens_old = line received from tokenizer # $line_of_tokens = line of tokens being formed for formatter my $rtokens = $line_of_tokens_old->{_rtokens}; my $jmax = @{$rtokens} - 1; if ( $jmax < 0 ) { # safety check; shouldn't happen DEVEL_MODE && Fault("unexpected jmax=$jmax\n"); return; } my $line_index = $line_of_tokens_old->{_line_number} - 1; my $rtoken_type = $line_of_tokens_old->{_rtoken_type}; my $rblock_type = $line_of_tokens_old->{_rblock_type}; my $rtype_sequence = $line_of_tokens_old->{_rtype_sequence}; my $rlevels = $line_of_tokens_old->{_rlevels}; my $rci_levels = $line_of_tokens_old->{_rci_levels}; my $rLL = $self->[_rLL_]; my $rSS = $self->[_rSS_]; my $rdepth_of_opening_seqno = $self->[_rdepth_of_opening_seqno_]; DEVEL_MODE && check_sequence_numbers( $rtokens, $rtoken_type, $rtype_sequence, $line_index + 1 ); # Find the starting nesting depth ... # It must be the value of variable 'level' of the first token # because the nesting depth is used as a token tag in the # vertical aligner and is compared to actual levels. # So vertical alignment problems will occur with any other # starting value. if ( !defined($nesting_depth) ) { $nesting_depth = $rlevels->[0]; $nesting_depth = 0 if ( $nesting_depth < 0 ); $rdepth_of_opening_seqno->[SEQ_ROOT] = $nesting_depth - 1; } my $j = -1; # NOTE: coding efficiency is critical in this loop over all tokens foreach my $token ( @{$rtokens} ) { # Do not clip the 'level' variable yet. We will do this # later, in sub 'store_token_to_go'. The reason is that in # files with level errors, the logic in 'weld_cuddled_else' # uses a stack logic that will give bad welds if we clip # levels here. ## $j++; ## if ( $rlevels->[$j] < 0 ) { $rlevels->[$j] = 0 } my $seqno = EMPTY_STRING; # Handle tokens with sequence numbers ... 
# note the ++ increment hidden here for efficiency if ( $rtype_sequence->[ ++$j ] ) { $seqno = $rtype_sequence->[$j]; my $sign = 1; if ( $is_opening_token{$token} ) { $self->[_K_opening_container_]->{$seqno} = @{$rLL}; $rdepth_of_opening_seqno->[$seqno] = $nesting_depth; $nesting_depth++; # Save a sequenced block type at its opening token. # Note that unsequenced block types can occur in # unbalanced code with errors but are ignored here. $self->store_block_type( $rblock_type->[$j], $seqno ) if ( $rblock_type->[$j] ); } elsif ( $is_closing_token{$token} ) { # The opening depth should always be defined, and # it should equal $nesting_depth-1. To protect # against unforeseen error conditions, however, we # will check this and fix things if necessary. For # a test case see issue c055. my $opening_depth = $rdepth_of_opening_seqno->[$seqno]; if ( !defined($opening_depth) ) { $opening_depth = $nesting_depth - 1; $opening_depth = 0 if ( $opening_depth < 0 ); $rdepth_of_opening_seqno->[$seqno] = $opening_depth; # This is not fatal but should not happen. The # tokenizer generates sequence numbers # incrementally upon encountering each new # opening token, so every positive sequence # number should correspond to an opening token. DEVEL_MODE && Fault(<[_K_closing_container_]->{$seqno} = @{$rLL}; $nesting_depth = $opening_depth; $sign = -1; } elsif ( $token eq '?' ) { } elsif ( $token eq ':' ) { $sign = -1; } # The only sequenced types output by the tokenizer are # the opening & closing containers and the ternary # types. So we would only get here if the tokenizer has # been changed to mark some other tokens with sequence # numbers, or if an error has been introduced in a # hash such as %is_opening_container else { DEVEL_MODE && Fault(<[$j]', sequence=$seqno arrived from tokenizer. Expecting only opening or closing container tokens or ternary tokens with sequence numbers. 
EOM } if ( $sign > 0 ) { $self->[_Iss_opening_]->[$seqno] = @{$rSS}; # For efficiency, we find the maximum level of # opening tokens of any type. The actual maximum # level will be that of their contents which is 1 # greater. That will be fixed in sub # 'finish_formatting'. my $level = $rlevels->[$j]; if ( $level > $self->[_maximum_level_] ) { $self->[_maximum_level_] = $level; $self->[_maximum_level_at_line_] = $line_index + 1; } } else { $self->[_Iss_closing_]->[$seqno] = @{$rSS} } push @{$rSS}, $sign * $seqno; } my @tokary; @tokary[ _TOKEN_, _TYPE_, _TYPE_SEQUENCE_, _LEVEL_, _CI_LEVEL_, _LINE_INDEX_, ] = ( $token, $rtoken_type->[$j], $seqno, $rlevels->[$j], $rci_levels->[$j], $line_index, ); push @{$rLL}, \@tokary; } ## end token loop # Need to remember if we can trim the input line $line_of_tokens->{_ended_in_blank_token} = $rtoken_type->[$jmax] eq 'b'; # Values needed by Logger $line_of_tokens->{_level_0} = $rlevels->[0]; $line_of_tokens->{_ci_level_0} = $rci_levels->[0]; $line_of_tokens->{_nesting_blocks_0} = $line_of_tokens_old->{_nesting_blocks_0}; $line_of_tokens->{_nesting_tokens_0} = $line_of_tokens_old->{_nesting_tokens_0}; return; } ## end sub write_line_inner_loop } ## end closure write_line ############################################# # CODE SECTION 5: Pre-process the entire file ############################################# sub finish_formatting { my ( $self, $severe_error ) = @_; # The file has been tokenized and is ready to be formatted. # All of the relevant data is stored in $self, ready to go. # Returns: # true if input file was copied verbatim due to errors # false otherwise # Some of the code in sub break_lists is not robust enough to process code # with arbitrary brace errors. The simplest fix is to just return the file # verbatim if there are brace errors. This fixes issue c160. $severe_error ||= get_saw_brace_error(); # Check the maximum level. If it is extremely large we will give up and # output the file verbatim. 
Note that the actual maximum level is 1 # greater than the saved value, so we fix that here. $self->[_maximum_level_] += 1; my $maximum_level = $self->[_maximum_level_]; my $maximum_table_index = $#maximum_line_length_at_level; if ( !$severe_error && $maximum_level >= $maximum_table_index ) { $severe_error ||= 1; Warn(<{'dump-block-summary'} ) { if ($severe_error) { Exit(1) } $self->dump_block_summary(); Exit(0); } # output file verbatim if severe error or no formatting requested if ( $severe_error || $rOpts->{notidy} ) { $self->dump_verbatim(); $self->wrapup($severe_error); return 1; } # Update the 'save_logfile' flag to include any tokenization errors. # We can save time by skipping logfile calls if it is not going to be saved. my $logger_object = $self->[_logger_object_]; if ($logger_object) { my $save_logfile = $logger_object->get_save_logfile(); $self->[_save_logfile_] = $save_logfile; my $file_writer_object = $self->[_file_writer_object_]; $file_writer_object->set_save_logfile($save_logfile); } { my $rix_side_comments = $self->set_CODE_type(); $self->find_non_indenting_braces($rix_side_comments); # Handle any requested side comment deletions. It is easier to get # this done here rather than farther down the pipeline because IO # lines take a different route, and because lines with deleted HSC # become BL lines. We have already handled any tee requests in sub # getline, so it is safe to delete side comments now. $self->delete_side_comments($rix_side_comments) if ( $rOpts_delete_side_comments || $rOpts_delete_closing_side_comments ); } # Verify that the line hash does not have any unknown keys. $self->check_line_hashes() if (DEVEL_MODE); { # Make a pass through all tokens, adding or deleting any whitespace as # required. Also make any other changes, such as adding semicolons. # All token changes must be made here so that the token data structure # remains fixed for the rest of this iteration. 
my ( $error, $rqw_lines ) = $self->respace_tokens(); if ($error) { $self->dump_verbatim(); $self->wrapup(); return 1; } $self->find_multiline_qw($rqw_lines); } $self->examine_vertical_tightness_flags(); $self->set_excluded_lp_containers(); $self->keep_old_line_breaks(); # Implement any welding needed for the -wn or -cb options $self->weld_containers(); # Collect info needed to implement the -xlp style $self->xlp_collapsed_lengths() if ( $rOpts_line_up_parentheses && $rOpts_extended_line_up_parentheses ); # Locate small nested blocks which should not be broken $self->mark_short_nested_blocks(); $self->special_indentation_adjustments(); # Verify that the main token array looks OK. If this ever causes a fault # then place similar checks before the sub calls above to localize the # problem. $self->check_rLL("Before 'process_all_lines'") if (DEVEL_MODE); # Finish formatting and write the result to the line sink. # Eventually this call should just change the 'rlines' data according to the # new line breaks and then return so that we can do an internal iteration # before continuing with the next stages of formatting. $self->process_all_lines(); # A final routine to tie up any loose ends $self->wrapup(); return; } ## end sub finish_formatting my %is_loop_type; BEGIN { my @q = qw( for foreach while do until ); @{is_loop_type}{@q} = (1) x scalar(@q); } sub find_level_info { # Find level ranges and total variations of all code blocks in this file. # Returns: # ref to hash with block info, with seqno as key (see below) my ($self) = @_; # The array _rSS_ has the complete container tree for this file. my $rSS = $self->[_rSS_]; # We will be ignoring everything except code block containers my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my @stack; my %level_info; # TREE_LOOP: foreach my $sseq ( @{$rSS} ) { my $stack_depth = @stack; my $seq_next = $sseq > 0 ? 
$sseq : -$sseq; next if ( !$rblock_type_of_seqno->{$seq_next} ); if ( $sseq > 0 ) { # STACK_LOOP: my $item; foreach my $seq (@stack) { $item = $level_info{$seq}; if ( $item->{maximum_depth} < $stack_depth ) { $item->{maximum_depth} = $stack_depth; } $item->{block_count}++; } ## end STACK LOOP push @stack, $seq_next; my $block_type = $rblock_type_of_seqno->{$seq_next}; # If this block is a loop nested within a loop, then we # will mark it as an 'inner_loop'. This is a useful # complexity measure. my $is_inner_loop = 0; if ( $is_loop_type{$block_type} && defined($item) ) { $is_inner_loop = $is_loop_type{ $item->{block_type} }; } $level_info{$seq_next} = { starting_depth => $stack_depth, maximum_depth => $stack_depth, block_count => 1, block_type => $block_type, is_inner_loop => $is_inner_loop, }; } else { my $seq_test = pop @stack; # error check if ( $seq_test != $seq_next ) { # Shouldn't happen - the $rSS array must have an error DEVEL_MODE && Fault("stack error finding total depths\n"); %level_info = (); last; } } } ## end TREE_LOOP return \%level_info; } ## end sub find_level_info sub find_loop_label { my ( $self, $seqno ) = @_; # Given: # $seqno = sequence number of a block of code for a loop # Return: # $label = the loop label text, if any, or an empty string my $rLL = $self->[_rLL_]; my $rlines = $self->[_rlines_]; my $K_opening_container = $self->[_K_opening_container_]; my $label = EMPTY_STRING; my $K_opening = $K_opening_container->{$seqno}; # back up to the line with the opening paren, if any, in case the # keyword is on a different line my $Kp = $self->K_previous_code($K_opening); return $label unless ( defined($Kp) ); if ( $rLL->[$Kp]->[_TOKEN_] eq ')' ) { $seqno = $rLL->[$Kp]->[_TYPE_SEQUENCE_]; $K_opening = $K_opening_container->{$seqno}; } return $label unless ( defined($K_opening) ); my $lx_open = $rLL->[$K_opening]->[_LINE_INDEX_]; # look for a label within a few lines; allow a couple of blank lines foreach my $lx ( reverse( $lx_open - 3 .. 
      $lx_open ) ) {
        last if ( $lx < 0 );
        my $line_of_tokens = $rlines->[$lx];
        my $line_type      = $line_of_tokens->{_line_type};

        # stop search on a non-code line
        last if ( $line_type ne 'CODE' );

        my $rK_range = $line_of_tokens->{_rK_range};
        my ( $Kfirst, $Klast ) = @{$rK_range};

        # skip a blank line
        next if ( !defined($Kfirst) );

        # check for a label
        if ( $rLL->[$Kfirst]->[_TYPE_] eq 'J' ) {
            $label = $rLL->[$Kfirst]->[_TOKEN_];
            last;
        }

        # quit the search if we are above the starting line
        last if ( $lx < $lx_open );
    }

    return $label;
} ## end sub find_loop_label

{    ## closure find_mccabe_count

    my %is_mccabe_logic_keyword;
    my %is_mccabe_logic_operator;

    BEGIN {
        my @q = (qw( && || ||= &&= ? <<= >>= ));
        @is_mccabe_logic_operator{@q} = (1) x scalar(@q);

        @q = (qw( and or xor if else elsif unless until while for foreach ));
        @is_mccabe_logic_keyword{@q} = (1) x scalar(@q);
    } ## end BEGIN

    sub find_mccabe_count {
        my ($self) = @_;

        # Find the cumulative mccabe count to each token
        # Return '$rmccabe_count_sum' = ref to hash with cumulative
        #   mccabe count to each token $K

        # NOTE: This sub currently follows the definitions in Perl::Critic

        my $rmccabe_count_sum;
        my $rLL    = $self->[_rLL_];
        my $count  = 0;
        my $Klimit = $self->[_Klimit_];
        foreach my $KK ( 0 .. $Klimit ) {
            $rmccabe_count_sum->{$KK} = $count;
            my $type = $rLL->[$KK]->[_TYPE_];
            if ( $type eq 'k' ) {
                my $token = $rLL->[$KK]->[_TOKEN_];
                if ( $is_mccabe_logic_keyword{$token} ) { $count++ }
            }
            elsif ( $is_mccabe_logic_operator{$type} ) {
                $count++;
            }
        }
        $rmccabe_count_sum->{ $Klimit + 1 } = $count;
        return $rmccabe_count_sum;
    } ## end sub find_mccabe_count
} ## end closure find_mccabe_count

sub find_code_line_count {
    my ($self) = @_;

    # Find the cumulative number of lines of code, excluding blanks,
    # comments and pod.
    # Return '$rcode_line_count' = ref to array with cumulative
    #   code line count for each input line number.
my $rcode_line_count; my $rLL = $self->[_rLL_]; my $rlines = $self->[_rlines_]; my $ix_line = -1; my $code_line_count = 0; # loop over all lines foreach my $line_of_tokens ( @{$rlines} ) { $ix_line++; # what type of line? my $line_type = $line_of_tokens->{_line_type}; # if 'CODE' it must be non-blank and non-comment if ( $line_type eq 'CODE' ) { my $rK_range = $line_of_tokens->{_rK_range}; my ( $Kfirst, $Klast ) = @{$rK_range}; if ( defined($Kfirst) ) { # it is non-blank my $jmax = defined($Kfirst) ? $Klast - $Kfirst : -1; if ( $jmax > 0 || $rLL->[$Klast]->[_TYPE_] ne '#' ) { # ok, it is a non-comment $code_line_count++; } } } # Count all other special line types except pod; # For a list of line types see sub 'process_all_lines' elsif ( $line_type !~ /^POD/ ) { $code_line_count++ } # Store the cumulative count using the input line index $rcode_line_count->[$ix_line] = $code_line_count; } return $rcode_line_count; } ## end sub find_code_line_count sub find_selected_packages { my ( $self, $rdump_block_types ) = @_; # returns a list of all package statements in a file if requested unless ( $rdump_block_types->{'*'} || $rdump_block_types->{'package'} || $rdump_block_types->{'class'} ) { return; } my $rLL = $self->[_rLL_]; my $Klimit = $self->[_Klimit_]; my $rlines = $self->[_rlines_]; my $K_closing_container = $self->[_K_closing_container_]; my @package_list; my @package_sweep; foreach my $KK ( 0 .. 
      $Klimit ) {
        my $item = $rLL->[$KK];
        my $type = $item->[_TYPE_];
        if ( $type ne 'i' ) {
            next;
        }
        my $token = $item->[_TOKEN_];
        if (   substr( $token, 0, 7 ) eq 'package' && $token =~ /^package\s/
            || substr( $token, 0, 5 ) eq 'class' && $token =~ /^class\s/ )
        {

            $token =~ s/\s+/ /g;
            my ( $keyword, $name ) = split /\s+/, $token, 2;

            my $lx_start     = $item->[_LINE_INDEX_];
            my $level        = $item->[_LEVEL_];
            my $parent_seqno = $self->parent_seqno_by_K($KK);

            # Skip a class BLOCK because it will be handled as a block
            if ( $keyword eq 'class' ) {
                my $line_of_tokens = $rlines->[$lx_start];
                my $rK_range       = $line_of_tokens->{_rK_range};
                my ( $K_first, $K_last ) = @{$rK_range};
                if ( $rLL->[$K_last]->[_TYPE_] eq '#' ) {
                    $K_last = $self->K_previous_code($K_last);
                }
                if ( defined($K_last) ) {
                    my $seqno_class = $rLL->[$K_last]->[_TYPE_SEQUENCE_];
                    my $block_type_next =
                      $self->[_rblock_type_of_seqno_]->{$seqno_class};

                    # these block types are currently marked 'package'
                    # but may be 'class' in the future, so allow both.
                    if ( defined($block_type_next)
                        && $block_type_next =~ /^(class|package)\b/ )
                    {
                        next;
                    }
                }
            }

            # By default a package runs to the end of the file; if it is
            # inside a container, it ends where that container closes.
            my $K_closing = $Klimit;
            if ( $parent_seqno != SEQ_ROOT ) {
                my $Kc = $K_closing_container->{$parent_seqno};
                if ( defined($Kc) ) {
                    $K_closing = $Kc;
                }
            }

            # This package ends any previous package at this level
            if ( defined( my $ix = $package_sweep[$level] ) ) {
                my $rpk = $package_list[$ix];
                my $Kc  = $rpk->{K_closing};
                if ( $Kc > $KK ) {
                    $rpk->{K_closing} = $KK - 1;
                }
            }
            $package_sweep[$level] = @package_list;

            # max_change and block_count are not currently reported for
            # type 'package'
            push @package_list,
              {
                line_start  => $lx_start + 1,
                K_opening   => $KK,
                K_closing   => $K_closing,
                name        => $name,
                type        => $keyword,
                level       => $level,
                max_change  => 0,
                block_count => 0,
              };
        }
    }

    return \@package_list;
} ## end sub find_selected_packages

sub find_selected_blocks {

    my ( $self, $rdump_block_types ) = @_;

    # Find blocks needed for --dump-block-summary
    # Returns:
    #  $rselected_blocks = ref to a list of information on the selected
blocks my $rLL = $self->[_rLL_]; my $rlines = $self->[_rlines_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my $K_opening_container = $self->[_K_opening_container_]; my $K_closing_container = $self->[_K_closing_container_]; my $ris_asub_block = $self->[_ris_asub_block_]; my $ris_sub_block = $self->[_ris_sub_block_]; my $dump_all_types = $rdump_block_types->{'*'}; # Get level variation info for code blocks my $rlevel_info = $self->find_level_info(); my @selected_blocks; #--------------------------------------------------- # BEGIN loop over all blocks to find selected blocks #--------------------------------------------------- foreach my $seqno ( keys %{$rblock_type_of_seqno} ) { my $type; my $name = EMPTY_STRING; my $block_type = $rblock_type_of_seqno->{$seqno}; my $K_opening = $K_opening_container->{$seqno}; my $K_closing = $K_closing_container->{$seqno}; my $level = $rLL->[$K_opening]->[_LEVEL_]; my $lx_open = $rLL->[$K_opening]->[_LINE_INDEX_]; my $line_of_tokens = $rlines->[$lx_open]; my $rK_range = $line_of_tokens->{_rK_range}; my ( $Kfirst, $Klast ) = @{$rK_range}; if ( !defined($Kfirst) || !defined($Klast) || $Kfirst > $K_opening ) { my $line_type = $line_of_tokens->{_line_type}; # shouldn't happen my $CODE_type = $line_of_tokens->{_code_type}; DEVEL_MODE && Fault(<{$seqno}; if ( defined($item) ) { my $starting_depth = $item->{starting_depth}; my $maximum_depth = $item->{maximum_depth}; $block_count = $item->{block_count}; $max_change = $maximum_depth - $starting_depth + 1; # this is a '+' character if this block is an inner loops $inner_loop_plus = $item->{is_inner_loop} ? '+' : EMPTY_STRING; } # Skip closures unless type 'closure' is explicitely requested if ( ( $block_type eq '}' || $block_type eq ';' ) && $rdump_block_types->{'closure'} ) { $type = 'closure'; } # Both 'sub' and 'asub' select an anonymous sub. 
        # This allows anonymous subs to be explicitly selected
        elsif (
            $ris_asub_block->{$seqno}
            && (   $dump_all_types
                || $rdump_block_types->{'sub'}
                || $rdump_block_types->{'asub'} )
          )
        {
            $type = 'asub';

            # Look back to try to find some kind of name, such as
            #   my $var = sub {        - var is type 'i'
            #       var => sub {       - var is type 'w'
            #      -var => sub {       - var is type 'w'
            #     'var' => sub {       - var is type 'Q'
            my ( $saw_equals, $saw_fat_comma, $blank_count );
            foreach my $KK ( reverse( $Kfirst .. $K_opening - 1 ) ) {
                my $token_type = $rLL->[$KK]->[_TYPE_];
                if ( $token_type eq 'b' )  { $blank_count++;   next }
                if ( $token_type eq '=>' ) { $saw_fat_comma++; next }
                if ( $token_type eq '=' )  { $saw_equals++;    next }
                if ( $token_type eq 'i' && $saw_equals
                    || ( $token_type eq 'w' || $token_type eq 'Q' )
                    && $saw_fat_comma )
                {
                    $name = $rLL->[$KK]->[_TOKEN_];
                    last;
                }
            }
        }
        elsif ( $ris_sub_block->{$seqno}
            && ( $dump_all_types || $rdump_block_types->{'sub'} ) )
        {
            $type = 'sub';

            # what we want:
            #      $block_type               $name
            # 'sub setidentifier($)'    => 'setidentifier'
            # 'method setidentifier($)' => 'setidentifier'
            my @parts = split /\s+/, $block_type;
            $name = $parts[1];
            $name =~ s/\(.*$//;
        }
        elsif (
            $block_type =~ /^(package|class)\b/
            && (   $dump_all_types
                || $rdump_block_types->{'package'}
                || $rdump_block_types->{'class'} )
          )
        {
            $type = 'class';
            my @parts = split /\s+/, $block_type;
            $name = $parts[1];
            $name =~ s/\(.*$//;
        }
        elsif (
            $is_loop_type{$block_type}
            && (   $dump_all_types
                || $rdump_block_types->{$block_type}
                || $rdump_block_types->{ $block_type . $inner_loop_plus }
                || $rdump_block_types->{$inner_loop_plus} )
          )
        {
            $type = $block_type .
$inner_loop_plus; } elsif ( $dump_all_types || $rdump_block_types->{$block_type} ) { if ( $is_loop_type{$block_type} ) { $name = $self->find_loop_label($seqno); } $type = $block_type; } else { next; } push @selected_blocks, { K_opening => $K_opening, K_closing => $K_closing, line_start => $lx_open + 1, name => $name, type => $type, level => $level, max_change => $max_change, block_count => $block_count, }; } ## END loop to get info for selected blocks return \@selected_blocks; } ## end sub find_selected_blocks sub dump_block_summary { my ($self) = @_; # Dump information about selected code blocks to STDOUT # This sub is called when # --dump-block-summary (-dbs) is set. # The following controls are available: # --dump-block-types=s (-dbt=s), where s is a list of block types # (if else elsif for foreach while do ... sub) ; default is 'sub' # --dump-block-minimum-lines=n (-dbml=n), where n is the minimum # number of lines for a block to be included; default is 20. my $rOpts_dump_block_types = $rOpts->{'dump-block-types'}; if ( !defined($rOpts_dump_block_types) ) { $rOpts_dump_block_types = 'sub' } $rOpts_dump_block_types =~ s/^\s+//; $rOpts_dump_block_types =~ s/\s+$//; my @list = split /\s+/, $rOpts_dump_block_types; my %dump_block_types; @{dump_block_types}{@list} = (1) x scalar(@list); # Get block info my $rselected_blocks = $self->find_selected_blocks( \%dump_block_types ); # Get package info my $rpackage_list = $self->find_selected_packages( \%dump_block_types ); return if ( !@{$rselected_blocks} && !@{$rpackage_list} ); my $input_stream_name = get_input_stream_name(); # Get code line count my $rcode_line_count = $self->find_code_line_count(); # Get mccabe count my $rmccabe_count_sum = $self->find_mccabe_count(); my $rOpts_dump_block_minimum_lines = $rOpts->{'dump-block-minimum-lines'}; if ( !defined($rOpts_dump_block_minimum_lines) ) { $rOpts_dump_block_minimum_lines = 20; } my $rLL = $self->[_rLL_]; # merge blocks and packages, add various counts, filter and 
print to STDOUT my $routput_lines = []; foreach my $item ( @{$rselected_blocks}, @{$rpackage_list} ) { my $K_opening = $item->{K_opening}; my $K_closing = $item->{K_closing}; # define total number of lines my $lx_open = $rLL->[$K_opening]->[_LINE_INDEX_]; my $lx_close = $rLL->[$K_closing]->[_LINE_INDEX_]; my $line_count = $lx_close - $lx_open + 1; # define total number of lines of code excluding blanks, comments, pod my $code_lines_open = $rcode_line_count->[$lx_open]; my $code_lines_close = $rcode_line_count->[$lx_close]; my $code_lines = 0; if ( defined($code_lines_open) && defined($code_lines_close) ) { $code_lines = $code_lines_close - $code_lines_open + 1; } # filter out blocks below the selected code line limit if ( $code_lines < $rOpts_dump_block_minimum_lines ) { next; } # add mccabe_count for this block my $mccabe_closing = $rmccabe_count_sum->{ $K_closing + 1 }; my $mccabe_opening = $rmccabe_count_sum->{$K_opening}; my $mccabe_count = 1; # add 1 to match Perl::Critic if ( defined($mccabe_opening) && defined($mccabe_closing) ) { $mccabe_count += $mccabe_closing - $mccabe_opening; } # Store the final set of print variables push @{$routput_lines}, [ $input_stream_name, $item->{line_start}, $line_count, $code_lines, $item->{type}, $item->{name}, $item->{level}, $item->{max_change}, $item->{block_count}, $mccabe_count, ]; } return unless @{$routput_lines}; # Sort blocks and packages on starting line number my @sorted_lines = sort { $a->[1] <=> $b->[1] } @{$routput_lines}; print STDOUT "file,line,line_count,code_lines,type,name,level,max_change,block_count,mccabe_count\n"; foreach my $rline_vars (@sorted_lines) { my $line = join( ",", @{$rline_vars} ) . "\n"; print STDOUT $line; } return; } ## end sub dump_block_summary sub set_CODE_type { my ($self) = @_; # Examine each line of code and set a flag '$CODE_type' to describe it. # Also return a list of lines with side comments. 
my $rLL = $self->[_rLL_]; my $rlines = $self->[_rlines_]; my $rOpts_format_skipping_begin = $rOpts->{'format-skipping-begin'}; my $rOpts_format_skipping_end = $rOpts->{'format-skipping-end'}; my $rOpts_static_block_comment_prefix = $rOpts->{'static-block-comment-prefix'}; # Remember indexes of lines with side comments my @ix_side_comments; my $In_format_skipping_section = 0; my $Saw_VERSION_in_this_file = 0; my $has_side_comment = 0; my ( $Kfirst, $Klast ); my $CODE_type; # Loop to set CODE_type # Possible CODE_types # 'VB' = Verbatim - line goes out verbatim (a quote) # 'FS' = Format Skipping - line goes out verbatim # 'BL' = Blank Line # 'HSC' = Hanging Side Comment - fix this hanging side comment # 'SBCX'= Static Block Comment Without Leading Space # 'SBC' = Static Block Comment # 'BC' = Block Comment - an ordinary full line comment # 'IO' = Indent Only - line goes out unchanged except for indentation # 'NIN' = No Internal Newlines - line does not get broken # 'VER' = VERSION statement # '' = ordinary line of code with no restrictions my $ix_line = -1; foreach my $line_of_tokens ( @{$rlines} ) { $ix_line++; my $line_type = $line_of_tokens->{_line_type}; my $Last_line_had_side_comment = $has_side_comment; if ($has_side_comment) { push @ix_side_comments, $ix_line - 1; $has_side_comment = 0; } my $last_CODE_type = $CODE_type; $CODE_type = EMPTY_STRING; if ( $line_type ne 'CODE' ) { next; } my $Klast_prev = $Klast; my $rK_range = $line_of_tokens->{_rK_range}; ( $Kfirst, $Klast ) = @{$rK_range}; my $input_line = $line_of_tokens->{_line_text}; my $jmax = defined($Kfirst) ? 
$Klast - $Kfirst : -1; my $is_block_comment = 0; if ( $jmax >= 0 && $rLL->[$Klast]->[_TYPE_] eq '#' ) { if ( $jmax == 0 ) { $is_block_comment = 1; } else { $has_side_comment = 1 } } # Write line verbatim if we are in a formatting skip section if ($In_format_skipping_section) { # Note: extra space appended to comment simplifies pattern matching if ( $is_block_comment # optional fast pre-check && ( substr( $rLL->[$Kfirst]->[_TOKEN_], 0, 4 ) eq '#>>>' || $rOpts_format_skipping_end ) && ( $rLL->[$Kfirst]->[_TOKEN_] . SPACE ) =~ /$format_skipping_pattern_end/ ) { $In_format_skipping_section = 0; my $input_line_no = $line_of_tokens->{_line_number}; write_logfile_entry( "Line $input_line_no: Exiting format-skipping section\n"); } $CODE_type = 'FS'; next; } # Check for a continued quote.. if ( $line_of_tokens->{_starting_in_quote} ) { # A line which is entirely a quote or pattern must go out # verbatim. Note: the \n is contained in $input_line. if ( $jmax <= 0 ) { if ( $self->[_save_logfile_] && $input_line =~ /\t/ ) { my $input_line_number = $line_of_tokens->{_line_number}; $self->note_embedded_tab($input_line_number); } $CODE_type = 'VB'; next; } } # See if we are entering a formatting skip section if ( $is_block_comment # optional fast pre-check && ( substr( $rLL->[$Kfirst]->[_TOKEN_], 0, 4 ) eq '#<<<' || $rOpts_format_skipping_begin ) && $rOpts_format_skipping && ( $rLL->[$Kfirst]->[_TOKEN_] . SPACE ) =~ /$format_skipping_pattern_begin/ ) { $In_format_skipping_section = 1; my $input_line_no = $line_of_tokens->{_line_number}; write_logfile_entry( "Line $input_line_no: Entering format-skipping section\n"); $CODE_type = 'FS'; next; } # ignore trailing blank tokens (they will get deleted later) if ( $jmax > 0 && $rLL->[$Klast]->[_TYPE_] eq 'b' ) { $jmax--; } # blank line.. 
if ( $jmax < 0 ) { $CODE_type = 'BL'; next; } # Handle comments if ($is_block_comment) { # see if this is a static block comment (starts with ## by default) my $is_static_block_comment = 0; my $no_leading_space = substr( $input_line, 0, 1 ) eq '#'; if ( # optional fast pre-check ( substr( $rLL->[$Kfirst]->[_TOKEN_], 0, 2 ) eq '##' || $rOpts_static_block_comment_prefix ) && $rOpts_static_block_comments && $input_line =~ /$static_block_comment_pattern/ ) { $is_static_block_comment = 1; } # Check for comments which are line directives # Treat exactly as static block comments without leading space # reference: perlsyn, near end, section Plain Old Comments (Not!) # example: '# line 42 "new_filename.plx"' if ( $no_leading_space && $input_line =~ /^\# \s* line \s+ (\d+) \s* (?:\s("?)([^"]+)\2)? \s* $/x ) { $is_static_block_comment = 1; } # look for hanging side comment ... if ( $Last_line_had_side_comment # last line had side comment && !$no_leading_space # there is some leading space && ! $is_static_block_comment # do not make static comment hanging ) { # continuing an existing HSC chain? if ( $last_CODE_type eq 'HSC' ) { $has_side_comment = 1; $CODE_type = 'HSC'; next; } # starting a new HSC chain? elsif ( $rOpts->{'hanging-side-comments'} # user is allowing # hanging side comments # like this && ( defined($Klast_prev) && $Klast_prev > 1 ) # and the previous side comment was not static (issue c070) && !( $rOpts->{'static-side-comments'} && $rLL->[$Klast_prev]->[_TOKEN_] =~ /$static_side_comment_pattern/ ) ) { # and it is not a closing side comment (issue c070). my $K_penult = $Klast_prev - 1; $K_penult -= 1 if ( $rLL->[$K_penult]->[_TYPE_] eq 'b' ); my $follows_csc = ( $rLL->[$K_penult]->[_TOKEN_] eq '}' && $rLL->[$K_penult]->[_TYPE_] eq '}' && $rLL->[$Klast_prev]->[_TOKEN_] =~ /$closing_side_comment_prefix_pattern/ ); if ( !$follows_csc ) { $has_side_comment = 1; $CODE_type = 'HSC'; next; } } } if ($is_static_block_comment) { $CODE_type = $no_leading_space ? 
'SBCX' : 'SBC'; next; } elsif ($Last_line_had_side_comment && !$rOpts_maximum_consecutive_blank_lines && $rLL->[$Kfirst]->[_LEVEL_] > 0 ) { # Emergency fix to keep a block comment from becoming a hanging # side comment. This fix is for the case that blank lines # cannot be inserted. There is related code in sub # 'process_line_of_CODE' $CODE_type = 'SBCX'; next; } else { $CODE_type = 'BC'; next; } } # End of comments. Handle a line of normal code: if ($rOpts_indent_only) { $CODE_type = 'IO'; next; } if ( !$rOpts_add_newlines ) { $CODE_type = 'NIN'; next; } # Patch needed for MakeMaker. Do not break a statement # in which $VERSION may be calculated. See MakeMaker.pm; # this is based on the coding in it. # The first line of a file that matches this will be eval'd: # /([\$*])(([\w\:\']*)\bVERSION)\b.*\=/ # Examples: # *VERSION = \'1.01'; # ( $VERSION ) = '$Revision: 1.74 $ ' =~ /\$Revision:\s+([^\s]+)/; # We will pass such a line straight through without breaking # it unless -npvl is used. # Patch for problem reported in RT #81866, where files # had been flattened into a single line and couldn't be # tidied without -npvl. There are two parts to this patch: # First, it is not done for a really long line (80 tokens for now). # Second, we will only allow up to one semicolon # before the VERSION. 
We need to allow at least one semicolon # for statements like this: # require Exporter; our $VERSION = $Exporter::VERSION; # where both statements must be on a single line for MakeMaker if ( !$Saw_VERSION_in_this_file && $jmax < 80 && $input_line =~ /^[^;]*;?[^;]*([\$*])(([\w\:\']*)\bVERSION)\b.*\=/ ) { $Saw_VERSION_in_this_file = 1; write_logfile_entry("passing VERSION line; -npvl deactivates\n"); # This code type has lower priority than others $CODE_type = 'VER'; next; } } continue { $line_of_tokens->{_code_type} = $CODE_type; } if ($has_side_comment) { push @ix_side_comments, $ix_line; } return \@ix_side_comments; } ## end sub set_CODE_type sub find_non_indenting_braces { my ( $self, $rix_side_comments ) = @_; return unless ( $rOpts->{'non-indenting-braces'} ); my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); my $rlines = $self->[_rlines_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my $rseqno_non_indenting_brace_by_ix = $self->[_rseqno_non_indenting_brace_by_ix_]; foreach my $ix ( @{$rix_side_comments} ) { my $line_of_tokens = $rlines->[$ix]; my $line_type = $line_of_tokens->{_line_type}; if ( $line_type ne 'CODE' ) { # shouldn't happen DEVEL_MODE && Fault("unexpected line_type=$line_type\n"); next; } my $rK_range = $line_of_tokens->{_rK_range}; my ( $Kfirst, $Klast ) = @{$rK_range}; unless ( defined($Kfirst) && $rLL->[$Klast]->[_TYPE_] eq '#' ) { # shouldn't happen DEVEL_MODE && Fault("did not get a comment\n"); next; } next unless ( $Klast > $Kfirst ); # maybe HSC my $token_sc = $rLL->[$Klast]->[_TOKEN_]; my $K_m = $Klast - 1; my $type_m = $rLL->[$K_m]->[_TYPE_]; if ( $type_m eq 'b' && $K_m > $Kfirst ) { $K_m--; $type_m = $rLL->[$K_m]->[_TYPE_]; } my $seqno_m = $rLL->[$K_m]->[_TYPE_SEQUENCE_]; if ($seqno_m) { my $block_type_m = $rblock_type_of_seqno->{$seqno_m}; # The pattern ends in \s but we have removed the newline, so # we added it back for the match. 
That way we require an exact # match to the special string and also allow additional text. $token_sc .= "\n"; if ( $block_type_m && $is_opening_type{$type_m} && $token_sc =~ /$non_indenting_brace_pattern/ ) { $rseqno_non_indenting_brace_by_ix->{$ix} = $seqno_m; } } } return; } ## end sub find_non_indenting_braces sub delete_side_comments { my ( $self, $rix_side_comments ) = @_; # Given a list of indexes of lines with side comments, handle any # requested side comment deletions. my $rLL = $self->[_rLL_]; my $rlines = $self->[_rlines_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my $rseqno_non_indenting_brace_by_ix = $self->[_rseqno_non_indenting_brace_by_ix_]; foreach my $ix ( @{$rix_side_comments} ) { my $line_of_tokens = $rlines->[$ix]; my $line_type = $line_of_tokens->{_line_type}; # This fault shouldn't happen because we only saved CODE lines with # side comments in the TASK 1 loop above. if ( $line_type ne 'CODE' ) { if (DEVEL_MODE) { my $lno = $ix + 1; Fault(<{_code_type}; my $rK_range = $line_of_tokens->{_rK_range}; my ( $Kfirst, $Klast ) = @{$rK_range}; unless ( defined($Kfirst) && $rLL->[$Klast]->[_TYPE_] eq '#' ) { if (DEVEL_MODE) { my $lno = $ix + 1; Fault(< $Kfirst || $CODE_type eq 'HSC' ) && (!$CODE_type || $CODE_type eq 'HSC' || $CODE_type eq 'IO' || $CODE_type eq 'NIN' ); # Do not delete special control side comments if ( $rseqno_non_indenting_brace_by_ix->{$ix} ) { $delete_side_comment = 0; } if ( $rOpts_delete_closing_side_comments && !$delete_side_comment && $Klast > $Kfirst && ( !$CODE_type || $CODE_type eq 'HSC' || $CODE_type eq 'IO' || $CODE_type eq 'NIN' ) ) { my $token = $rLL->[$Klast]->[_TOKEN_]; my $K_m = $Klast - 1; my $type_m = $rLL->[$K_m]->[_TYPE_]; if ( $type_m eq 'b' && $K_m > $Kfirst ) { $K_m-- } my $seqno_m = $rLL->[$K_m]->[_TYPE_SEQUENCE_]; if ($seqno_m) { my $block_type_m = $rblock_type_of_seqno->{$seqno_m}; if ( $block_type_m && $token =~ /$closing_side_comment_prefix_pattern/ && $block_type_m =~ 
/$closing_side_comment_list_pattern/ ) { $delete_side_comment = 1; } } } ## end if ( $rOpts_delete_closing_side_comments...) if ($delete_side_comment) { # We are actually just changing the side comment to a blank. # This may produce multiple blanks in a row, but sub respace_tokens # will check for this and fix it. $rLL->[$Klast]->[_TYPE_] = 'b'; $rLL->[$Klast]->[_TOKEN_] = SPACE; # The -io option outputs the line text, so we have to update # the line text so that the comment does not reappear. if ( $CODE_type eq 'IO' ) { my $line = EMPTY_STRING; foreach my $KK ( $Kfirst .. $Klast - 1 ) { $line .= $rLL->[$KK]->[_TOKEN_]; } $line =~ s/\s+$//; $line_of_tokens->{_line_text} = $line . "\n"; } # If we delete a hanging side comment the line becomes blank. if ( $CODE_type eq 'HSC' ) { $line_of_tokens->{_code_type} = 'BL' } } } return; } ## end sub delete_side_comments sub dump_verbatim { my $self = shift; my $rlines = $self->[_rlines_]; foreach my $line ( @{$rlines} ) { my $input_line = $line->{_line_text}; $self->write_unindented_line($input_line); } return; } ## end sub dump_verbatim my %wU; my %wiq; my %is_wit; my %is_sigil; my %is_nonlist_keyword; my %is_nonlist_type; my %is_s_y_m_slash; my %is_unexpected_equals; BEGIN { # added 'U' to fix cases b1125 b1126 b1127 my @q = qw(w U); @{wU}{@q} = (1) x scalar(@q); @q = qw(w i q Q G C Z); @{wiq}{@q} = (1) x scalar(@q); @q = qw(w i t); @{is_wit}{@q} = (1) x scalar(@q); @q = qw($ & % * @); @{is_sigil}{@q} = (1) x scalar(@q); # Parens following these keywords will not be marked as lists. Note that # 'for' is not included and is handled separately, by including 'f' in the # hash %is_counted_type, since it may or may not be a c-style for loop. 
    @q = qw( if elsif unless and or );
    @is_nonlist_keyword{@q} = (1) x scalar(@q);

    # Parens following these types will not be marked as lists
    @q = qw( && || );
    @is_nonlist_type{@q} = (1) x scalar(@q);

    @q = qw( s y m / );
    @is_s_y_m_slash{@q} = (1) x scalar(@q);

    @q = qw( = == != );
    @is_unexpected_equals{@q} = (1) x scalar(@q);

} ## end BEGIN

{    #<<< begin closure respace_tokens

my $rLL_new;    # This will be the new array of tokens

# These are variables in $self
my $rLL;
my $length_function;
my $is_encoded_data;

my $K_closing_ternary;
my $K_opening_ternary;
my $rchildren_of_seqno;
my $rhas_broken_code_block;
my $rhas_broken_list;
my $rhas_broken_list_with_lec;
my $rhas_code_block;
my $rhas_list;
my $rhas_ternary;
my $ris_assigned_structure;
my $ris_broken_container;
my $ris_excluded_lp_container;
my $ris_list_by_seqno;
my $ris_permanently_broken;
my $rlec_count_by_seqno;
my $roverride_cab3;
my $rparent_of_seqno;
my $rtype_count_by_seqno;
my $rblock_type_of_seqno;

my $K_opening_container;
my $K_closing_container;

my %K_first_here_doc_by_seqno;

my $last_nonblank_code_type;
my $last_nonblank_code_token;
my $last_nonblank_block_type;
my $last_last_nonblank_code_type;
my $last_last_nonblank_code_token;

my %seqno_stack;
my %K_old_opening_by_seqno;
my $depth_next;
my $depth_next_max;

my $cumulative_length;

# Variables holding the current line info
my $Ktoken_vars;
my $Kfirst_old;
my $Klast_old;
my $Klast_old_code;
my $CODE_type;

my $rwhitespace_flags;

sub initialize_respace_tokens_closure {

    my ($self) = @_;

    $rLL_new = [];    # This is the new array

    $rLL = $self->[_rLL_];

    $length_function = $self->[_length_function_];
    $is_encoded_data = $self->[_is_encoded_data_];

    $K_closing_ternary         = $self->[_K_closing_ternary_];
    $K_opening_ternary         = $self->[_K_opening_ternary_];
    $rchildren_of_seqno        = $self->[_rchildren_of_seqno_];
    $rhas_broken_code_block    = $self->[_rhas_broken_code_block_];
    $rhas_broken_list          = $self->[_rhas_broken_list_];
    $rhas_broken_list_with_lec =
$self->[_rhas_broken_list_with_lec_]; $rhas_code_block = $self->[_rhas_code_block_]; $rhas_list = $self->[_rhas_list_]; $rhas_ternary = $self->[_rhas_ternary_]; $ris_assigned_structure = $self->[_ris_assigned_structure_]; $ris_broken_container = $self->[_ris_broken_container_]; $ris_excluded_lp_container = $self->[_ris_excluded_lp_container_]; $ris_list_by_seqno = $self->[_ris_list_by_seqno_]; $ris_permanently_broken = $self->[_ris_permanently_broken_]; $rlec_count_by_seqno = $self->[_rlec_count_by_seqno_]; $roverride_cab3 = $self->[_roverride_cab3_]; $rparent_of_seqno = $self->[_rparent_of_seqno_]; $rtype_count_by_seqno = $self->[_rtype_count_by_seqno_]; $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; %K_first_here_doc_by_seqno = (); $last_nonblank_code_type = ';'; $last_nonblank_code_token = ';'; $last_nonblank_block_type = EMPTY_STRING; $last_last_nonblank_code_type = ';'; $last_last_nonblank_code_token = ';'; %seqno_stack = (); %K_old_opening_by_seqno = (); # Note: old K index $depth_next = 0; $depth_next_max = 0; # we will be setting token lengths as we go $cumulative_length = 0; $Ktoken_vars = undef; # the old K value of $rtoken_vars $Kfirst_old = undef; # min K of old line $Klast_old = undef; # max K of old line $Klast_old_code = undef; # K of last token if side comment $CODE_type = EMPTY_STRING; # Set the whitespace flags, which indicate the token spacing preference. $rwhitespace_flags = $self->set_whitespace_flags(); # Note that $K_opening_container and $K_closing_container have values # defined in sub get_line() for the previous K indexes. They were needed # in case option 'indent-only' was set, and we didn't get here. We no # longer need those and will eliminate them now to avoid any possible # mixing of old and new values. This must be done AFTER the call to # set_whitespace_flags, which needs these. 
    $K_opening_container = $self->[_K_opening_container_] = {};
    $K_closing_container = $self->[_K_closing_container_] = {};

    return;

} ## end sub initialize_respace_tokens_closure

sub respace_tokens {

    my $self = shift;

    #--------------------------------------------------------------------------
    # This routine is called once per file to do as much formatting as possible
    # before new line breaks are set.
    #--------------------------------------------------------------------------

    # Return parameters:
    # Set $severe_error=true if processing must terminate immediately
    my ( $severe_error, $rqw_lines );

    # We do not change any spaces in --indent-only mode
    if ( $rOpts->{'indent-only'} ) {

        # We need to define lengths for --indent-only to avoid undefs, even
        # though these values are not actually needed for option --indent-only.

        $rLL               = $self->[_rLL_];
        $length_function   = $self->[_length_function_];
        $cumulative_length = 0;

        foreach my $item ( @{$rLL} ) {
            my $token        = $item->[_TOKEN_];
            my $token_length = $length_function->($token);
            $cumulative_length += $token_length;
            $item->[_TOKEN_LENGTH_]      = $token_length;
            $item->[_CUMULATIVE_LENGTH_] = $cumulative_length;
        }

        return ( $severe_error, $rqw_lines );
    }

    # This routine makes all necessary and possible changes to the tokenization
    # after the initial tokenization of the file. This is a tedious routine,
    # but basically it consists of inserting and deleting whitespace between
    # nonblank tokens according to the selected parameters. In a few cases
    # non-space characters are added, deleted or modified.

    # The goal of this routine is to create a new token array which only needs
    # the definition of new line breaks and padding to complete formatting.  In
    # a few cases we have to cheat a little to achieve this goal.  In
    # particular, we may not know if a semicolon will be needed, because it
    # depends on how the line breaks go.  To handle this, we include the
    # semicolon as a 'phantom' which can be displayed as normal or as an empty
    # string.
    # Method: The old tokens are copied one-by-one, with changes, from the old
    # linear storage array $rLL to a new array $rLL_new.

    # (re-)initialize closure variables for this problem
    $self->initialize_respace_tokens_closure();

    #--------------------------------------
    # Main loop over all lines of the file
    #--------------------------------------
    my $rlines    = $self->[_rlines_];
    my $line_type = EMPTY_STRING;
    my $last_K_out;

    foreach my $line_of_tokens ( @{$rlines} ) {

        my $input_line_number = $line_of_tokens->{_line_number};
        my $last_line_type    = $line_type;
        $line_type = $line_of_tokens->{_line_type};
        next unless ( $line_type eq 'CODE' );
        $CODE_type = $line_of_tokens->{_code_type};

        if ( $CODE_type eq 'BL' ) {
            my $seqno = $seqno_stack{ $depth_next - 1 };
            if ( defined($seqno) ) {
                $self->[_rblank_and_comment_count_]->{$seqno} += 1;
                $self->set_permanently_broken($seqno)
                  if (!$ris_permanently_broken->{$seqno}
                    && $rOpts_maximum_consecutive_blank_lines );
            }
        }

        my $rK_range = $line_of_tokens->{_rK_range};
        my ( $Kfirst, $Klast ) = @{$rK_range};
        next unless defined($Kfirst);
        ( $Kfirst_old, $Klast_old ) = ( $Kfirst, $Klast );
        $Klast_old_code = $Klast_old;

        # Be sure an old K value is defined for sub store_token
        $Ktoken_vars = $Kfirst;

        # Check for correct sequence of token indexes...
        # An error here means that sub write_line() did not correctly
        # package the tokenized lines as it received them.  If we
        # get a fault here it has not output a continuous sequence
        # of K values.  Or a line of CODE may have been mis-marked as
        # something else.  There is no good way to continue after such an
        # error.
        if ( defined($last_K_out) ) {
            if ( $Kfirst != $last_K_out + 1 ) {
                Fault_Warn(
                    "Program Bug: last K out was $last_K_out but Kfirst=$Kfirst"
                );
                $severe_error = 1;
                return ( $severe_error, $rqw_lines );
            }
        }
        else {

            # The first token should always have been given index 0 by sub
            # write_line()
            if ( $Kfirst != 0 ) {
                Fault("Program Bug: first K is $Kfirst but should be 0");
            }
        }
        $last_K_out = $Klast;

        # Handle special lines of code
        if ( $CODE_type && $CODE_type ne 'NIN' && $CODE_type ne 'VER' ) {

            # CODE_types are as follows.
            # 'BL' = Blank Line
            # 'VB' = Verbatim - line goes out verbatim
            # 'FS' = Format Skipping - line goes out verbatim, no blanks
            # 'IO' = Indent Only - only indentation may be changed
            # 'NIN' = No Internal Newlines - line does not get broken
            # 'HSC'=Hanging Side Comment - fix this hanging side comment
            # 'BC'=Block Comment - an ordinary full line comment
            # 'SBC'=Static Block Comment - a block comment which does not get
            #      indented
            # 'SBCX'=Static Block Comment Without Leading Space
            # 'VER'=VERSION statement
            # '' or (undefined) - no restrictions

            # For a hanging side comment we insert an empty quote before
            # the comment so that it becomes a normal side comment and
            # will be aligned by the vertical aligner
            if ( $CODE_type eq 'HSC' ) {

                # Safety Check: This must be a line with one token (a comment)
                my $rvars_Kfirst = $rLL->[$Kfirst];
                if ( $Kfirst == $Klast && $rvars_Kfirst->[_TYPE_] eq '#' ) {

                    # Note that even if the flag 'noadd-whitespace' is set, we
                    # will make an exception here and allow a blank to be
                    # inserted to push the comment to the right. We can think
                    # of this as an adjustment of indentation rather than
                    # whitespace between tokens. This will also prevent the
                    # hanging side comment from getting converted to a block
                    # comment if whitespace gets deleted, as for example with
                    # the -extrude and -mangle options.
my $rcopy = copy_token_as_type( $rvars_Kfirst, 'q', EMPTY_STRING ); $self->store_token($rcopy); $rcopy = copy_token_as_type( $rvars_Kfirst, 'b', SPACE ); $self->store_token($rcopy); $self->store_token($rvars_Kfirst); next; } else { # This line was mis-marked by sub scan_comment. Catch in # DEVEL_MODE, otherwise try to repair and keep going. Fault( "Program bug. A hanging side comment has been mismarked" ) if (DEVEL_MODE); $CODE_type = EMPTY_STRING; $line_of_tokens->{_code_type} = $CODE_type; } } # Copy tokens unchanged foreach my $KK ( $Kfirst .. $Klast ) { $Ktoken_vars = $KK; $self->store_token( $rLL->[$KK] ); } next; } # Handle normal line.. # Define index of last token before any side comment for comma counts my $type_end = $rLL->[$Klast_old_code]->[_TYPE_]; if ( ( $type_end eq '#' || $type_end eq 'b' ) && $Klast_old_code > $Kfirst_old ) { $Klast_old_code--; if ( $rLL->[$Klast_old_code]->[_TYPE_] eq 'b' && $Klast_old_code > $Kfirst_old ) { $Klast_old_code--; } } # Insert any essential whitespace between lines # if last line was normal CODE. # Patch for rt #125012: use K_previous_code rather than '_nonblank' # because comments may disappear. 
# Note that we must do this even if --noadd-whitespace is set if ( $last_line_type eq 'CODE' ) { my $type_next = $rLL->[$Kfirst]->[_TYPE_]; my $token_next = $rLL->[$Kfirst]->[_TOKEN_]; if ( is_essential_whitespace( $last_last_nonblank_code_token, $last_last_nonblank_code_type, $last_nonblank_code_token, $last_nonblank_code_type, $token_next, $type_next, ) ) { $self->store_space(); } } #----------------------------------------------- # Inner loop to respace tokens on a line of code #----------------------------------------------- # The inner loop is in a separate sub for clarity $self->respace_tokens_inner_loop( $Kfirst, $Klast, $input_line_number ); } # End line loop # finalize data structures $self->respace_post_loop_ops(); # Reset memory to be the new array $self->[_rLL_] = $rLL_new; my $Klimit; if ( @{$rLL_new} ) { $Klimit = @{$rLL_new} - 1 } $self->[_Klimit_] = $Klimit; # During development, verify that the new array still looks okay. DEVEL_MODE && $self->check_token_array(); # update the token limits of each line ( $severe_error, $rqw_lines ) = $self->resync_lines_and_tokens(); return ( $severe_error, $rqw_lines ); } ## end sub respace_tokens sub respace_tokens_inner_loop { my ( $self, $Kfirst, $Klast, $input_line_number ) = @_; #----------------------------------------------------------------- # Loop to copy all tokens on one line, making any spacing changes, # while also collecting information needed by later subs. #----------------------------------------------------------------- foreach my $KK ( $Kfirst .. $Klast ) { # TODO: consider eliminating this closure var by passing directly to # store_token following pattern of store_tokens_to_go. $Ktoken_vars = $KK; my $rtoken_vars = $rLL->[$KK]; my $type = $rtoken_vars->[_TYPE_]; # Handle a blank space ... 
if ( $type eq 'b' ) { # Delete it if not wanted by whitespace rules # or we are deleting all whitespace # Note that whitespace flag is a flag indicating whether a # white space BEFORE the token is needed next if ( $KK >= $Klast ); # skip terminal blank my $Knext = $KK + 1; if ($rOpts_freeze_whitespace) { $self->store_token($rtoken_vars); next; } my $ws = $rwhitespace_flags->[$Knext]; if ( $ws == -1 || $rOpts_delete_old_whitespace ) { my $token_next = $rLL->[$Knext]->[_TOKEN_]; my $type_next = $rLL->[$Knext]->[_TYPE_]; my $do_not_delete = is_essential_whitespace( $last_last_nonblank_code_token, $last_last_nonblank_code_type, $last_nonblank_code_token, $last_nonblank_code_type, $token_next, $type_next, ); # Note that repeated blanks will get filtered out here next unless ($do_not_delete); } # make it just one character $rtoken_vars->[_TOKEN_] = SPACE; $self->store_token($rtoken_vars); next; } my $token = $rtoken_vars->[_TOKEN_]; # Handle a sequenced token ... i.e. one of ( ) { } [ ] ? : if ( $rtoken_vars->[_TYPE_SEQUENCE_] ) { # One of ) ] } ... if ( $is_closing_token{$token} ) { my $type_sequence = $rtoken_vars->[_TYPE_SEQUENCE_]; my $block_type = $rblock_type_of_seqno->{$type_sequence}; #--------------------------------------------- # check for semicolon addition in a code block #--------------------------------------------- if ($block_type) { # if not preceded by a ';' .. if ( $last_nonblank_code_type ne ';' ) { # tentatively insert a semicolon if appropriate $self->add_phantom_semicolon($KK) if $rOpts->{'add-semicolons'}; } } #---------------------------------------------------------- # check for addition/deletion of a trailing comma in a list #---------------------------------------------------------- else { # if this is a list .. my $rtype_count = $rtype_count_by_seqno->{$type_sequence}; if ( $rtype_count && $rtype_count->{','} && !$rtype_count->{';'} && !$rtype_count->{'f'} ) { # if NOT preceded by a comma.. 
if ( $last_nonblank_code_type ne ',' ) { # insert a comma if requested if ( $rOpts_add_trailing_commas && %trailing_comma_rules ) { $self->add_trailing_comma( $KK, $Kfirst, $trailing_comma_rules{$token} ); } } # if preceded by a comma .. else { # delete a trailing comma if requested my $deleted; if ( $rOpts_delete_trailing_commas && %trailing_comma_rules ) { $deleted = $self->delete_trailing_comma( $KK, $Kfirst, $trailing_comma_rules{$token} ); } # delete a weld-interfering comma if requested if ( !$deleted && $rOpts_delete_weld_interfering_commas && $is_closing_type{ $last_last_nonblank_code_type} ) { $self->delete_weld_interfering_comma($KK); } } } } } } # Modify certain tokens here for whitespace # The following is not yet done, but could be: # sub (x x x) # ( $type =~ /^[wit]$/ ) elsif ( $is_wit{$type} ) { # change '$ var' to '$var' etc # change '@ ' to '@' # Examples: <> my $ord = ord( substr( $token, 1, 1 ) ); if ( # quick test for possible blank at second char $ord > 0 && ( $ord < ORD_PRINTABLE_MIN || $ord > ORD_PRINTABLE_MAX ) ) { my ( $sigil, $word ) = split /\s+/, $token, 2; # $sigil =~ /^[\$\&\%\*\@]$/ ) if ( $is_sigil{$sigil} ) { $token = $sigil; $token .= $word if ( defined($word) ); # fix c104 $rtoken_vars->[_TOKEN_] = $token; } } # Trim certain spaces in identifiers if ( $type eq 'i' ) { if ( $token =~ /$SUB_PATTERN/ ) { # -spp = 0 : no space before opening prototype paren # -spp = 1 : stable (follow input spacing) # -spp = 2 : always space before opening prototype paren if ( !defined($rOpts_space_prototype_paren) || $rOpts_space_prototype_paren == 1 ) { ## default: stable } elsif ( $rOpts_space_prototype_paren == 0 ) { $token =~ s/\s+\(/\(/; } elsif ( $rOpts_space_prototype_paren == 2 ) { $token =~ s/\(/ (/; } # one space max, and no tabs $token =~ s/\s+/ /g; $rtoken_vars->[_TOKEN_] = $token; $self->[_ris_special_identifier_token_]->{$token} = 'sub'; } # clean up spaces in package identifiers, like # "package Bob::Dog;" elsif ( substr( $token, 0, 7 
) eq 'package' && $token =~ /^package\s/ ) { $token =~ s/\s+/ /g; $rtoken_vars->[_TOKEN_] = $token; $self->[_ris_special_identifier_token_]->{$token} = 'package'; } # trim identifiers of trailing blanks which can occur # under some unusual circumstances, such as if the # identifier 'witch' has trailing blanks on input here: # # sub # witch # () # prototype may be on new line ... # ... my $ord_ch = ord( substr( $token, -1, 1 ) ); if ( # quick check for possible ending space $ord_ch > 0 && ( $ord_ch < ORD_PRINTABLE_MIN || $ord_ch > ORD_PRINTABLE_MAX ) ) { $token =~ s/\s+$//g; $rtoken_vars->[_TOKEN_] = $token; } } } # handle semicolons elsif ( $type eq ';' ) { # Remove unnecessary semicolons, but not after bare # blocks, where it could be unsafe if the brace is # mis-tokenized. if ( $rOpts->{'delete-semicolons'} && ( ( $last_nonblank_block_type && $last_nonblank_code_type eq '}' && ( $is_block_without_semicolon{ $last_nonblank_block_type} || $last_nonblank_block_type =~ /$SUB_PATTERN/ || $last_nonblank_block_type =~ /^\w+:$/ ) ) || $last_nonblank_code_type eq ';' ) ) { # This looks like a deletable semicolon, but even if a # semicolon can be deleted it is not necessarily best to do # so. We apply these additional rules for deletion: # - Always ok to delete a ';' at the end of a line # - Never delete a ';' before a '#' because it would # promote it to a block comment. # - If a semicolon is not at the end of line, then only # delete if it is followed by another semicolon or closing # token. This includes the comment rule. It may take # two passes to get to a final state, but it is a little # safer. For example, keep the first semicolon here: # eval { sub bubba { ok(0) }; ok(0) } || ok(1); # It is not required but adds some clarity. 
my $ok_to_delete = 1; if ( $KK < $Klast ) { my $Kn = $self->K_next_nonblank($KK); if ( defined($Kn) && $Kn <= $Klast ) { my $next_nonblank_token_type = $rLL->[$Kn]->[_TYPE_]; $ok_to_delete = $next_nonblank_token_type eq ';' || $next_nonblank_token_type eq '}'; } } # do not delete only nonblank token in a file else { my $Kp = $self->K_previous_code( undef, $rLL_new ); my $Kn = $self->K_next_nonblank($KK); $ok_to_delete = defined($Kn) || defined($Kp); } if ($ok_to_delete) { $self->note_deleted_semicolon($input_line_number); next; } else { write_logfile_entry("Extra ';'\n"); } } } # Old patch to add space to something like "x10". # Note: This is now done in the Tokenizer, but this code remains # for reference. elsif ( $type eq 'n' ) { if ( substr( $token, 0, 1 ) eq 'x' && $token =~ /^x\d+/ ) { $token =~ s/x/x /; $rtoken_vars->[_TOKEN_] = $token; if (DEVEL_MODE) { Fault(<[_TOKEN_] = $token; if ( $self->[_save_logfile_] && $token =~ /\t/ ) { $self->note_embedded_tab($input_line_number); } if ( $rwhitespace_flags->[$KK] == WS_YES && @{$rLL_new} && $rLL_new->[-1]->[_TYPE_] ne 'b' && $rOpts_add_whitespace ) { $self->store_space(); } $self->store_token($rtoken_vars); next; } ## end if ( $type eq 'q' ) # delete repeated commas if requested elsif ( $type eq ',' ) { if ( $last_nonblank_code_type eq ',' && $rOpts->{'delete-repeated-commas'} ) { # Could note this deletion as a possible future update: ## $self->note_deleted_comma($input_line_number); next; } # remember input line index of first comma if -wtc is used if (%trailing_comma_rules) { my $seqno = $seqno_stack{ $depth_next - 1 }; if ( defined($seqno) && !defined( $self->[_rfirst_comma_line_index_]->{$seqno} ) ) { $self->[_rfirst_comma_line_index_]->{$seqno} = $rtoken_vars->[_LINE_INDEX_]; } } } # change 'LABEL :' to 'LABEL:' elsif ( $type eq 'J' ) { $token =~ s/\s+//g; $rtoken_vars->[_TOKEN_] = $token; } # check a quote for problems elsif ( $type eq 'Q' ) { $self->check_Q( $KK, $Kfirst, $input_line_number ) if ( 
$self->[_save_logfile_] ); } # Store this token with possible previous blank if ( $rwhitespace_flags->[$KK] == WS_YES && @{$rLL_new} && $rLL_new->[-1]->[_TYPE_] ne 'b' && $rOpts_add_whitespace ) { $self->store_space(); } $self->store_token($rtoken_vars); } # End token loop return; } ## end sub respace_tokens_inner_loop sub respace_post_loop_ops { my ($self) = @_; # Walk backwards through the tokens, making forward links to sequence items. if ( @{$rLL_new} ) { my $KNEXT; foreach my $KK ( reverse( 0 .. @{$rLL_new} - 1 ) ) { $rLL_new->[$KK]->[_KNEXT_SEQ_ITEM_] = $KNEXT; if ( $rLL_new->[$KK]->[_TYPE_SEQUENCE_] ) { $KNEXT = $KK } } $self->[_K_first_seq_item_] = $KNEXT; } # Find and remember lists by sequence number my %is_C_style_for; foreach my $seqno ( keys %{$K_opening_container} ) { my $K_opening = $K_opening_container->{$seqno}; next unless defined($K_opening); # code errors may leave undefined closing tokens my $K_closing = $K_closing_container->{$seqno}; next unless defined($K_closing); my $lx_open = $rLL_new->[$K_opening]->[_LINE_INDEX_]; my $lx_close = $rLL_new->[$K_closing]->[_LINE_INDEX_]; my $line_diff = $lx_close - $lx_open; $ris_broken_container->{$seqno} = $line_diff; # See if this is a list my $is_list; my $rtype_count = $rtype_count_by_seqno->{$seqno}; if ($rtype_count) { my $comma_count = $rtype_count->{','}; my $fat_comma_count = $rtype_count->{'=>'}; my $semicolon_count = $rtype_count->{';'}; if ( $rtype_count->{'f'} ) { $semicolon_count += $rtype_count->{'f'}; $is_C_style_for{$seqno} = 1; } # We will define a list to be a container with one or more commas # and no semicolons. Note that we have included the semicolons # in a 'for' container in the semicolon count to keep c-style for # statements from being formatted as lists. 
if ( ( $comma_count || $fat_comma_count ) && !$semicolon_count ) { $is_list = 1; # We need to do one more check for a parenthesized list: # At an opening paren following certain tokens, such as 'if', # we do not want to format the contents as a list. if ( $rLL_new->[$K_opening]->[_TOKEN_] eq '(' ) { my $Kp = $self->K_previous_code( $K_opening, $rLL_new ); if ( defined($Kp) ) { my $type_p = $rLL_new->[$Kp]->[_TYPE_]; my $token_p = $rLL_new->[$Kp]->[_TOKEN_]; $is_list = $type_p eq 'k' ? !$is_nonlist_keyword{$token_p} : !$is_nonlist_type{$type_p}; } } } } # Look for a block brace marked as uncertain. If the tokenizer thinks # its guess is uncertain for the type of a brace following an unknown # bareword then it adds a trailing space as a signal. We can fix the # type here now that we have had a better look at the contents of the # container. This fixes case b1085. To find the corresponding code in # Tokenizer.pm search for 'b1085' with an editor. my $block_type = $rblock_type_of_seqno->{$seqno}; if ( $block_type && substr( $block_type, -1, 1 ) eq SPACE ) { # Always remove the trailing space $block_type =~ s/\s+$//; # Try to filter out parenless sub calls my $Knn1 = $self->K_next_nonblank( $K_opening, $rLL_new ); my $Knn2; if ( defined($Knn1) ) { $Knn2 = $self->K_next_nonblank( $Knn1, $rLL_new ); } my $type_nn1 = defined($Knn1) ? $rLL_new->[$Knn1]->[_TYPE_] : 'b'; my $type_nn2 = defined($Knn2) ? 
$rLL_new->[$Knn2]->[_TYPE_] : 'b'; # if ( $type_nn1 =~ /^[wU]$/ && $type_nn2 =~ /^[wiqQGCZ]$/ ) { if ( $wU{$type_nn1} && $wiq{$type_nn2} ) { $is_list = 0; } # Convert to a hash brace if it looks like it holds a list if ($is_list) { $block_type = EMPTY_STRING; $rLL_new->[$K_opening]->[_CI_LEVEL_] = 1; $rLL_new->[$K_closing]->[_CI_LEVEL_] = 1; } $rblock_type_of_seqno->{$seqno} = $block_type; } # Handle a list container if ( $is_list && !$block_type ) { $ris_list_by_seqno->{$seqno} = $seqno; my $seqno_parent = $rparent_of_seqno->{$seqno}; my $depth = 0; while ( defined($seqno_parent) && $seqno_parent ne SEQ_ROOT ) { $depth++; # for $rhas_list we need to save the minimum depth if ( !$rhas_list->{$seqno_parent} || $rhas_list->{$seqno_parent} > $depth ) { $rhas_list->{$seqno_parent} = $depth; } if ($line_diff) { $rhas_broken_list->{$seqno_parent} = 1; # Patch1: We need to mark broken lists with non-terminal # line-ending commas for the -bbx=2 parameter. This insures # that the list will stay broken. Otherwise the flag # -bbx=2 can be unstable. This fixes case b789 and b938. # Patch2: Updated to also require either one fat comma or # one more line-ending comma. Fixes cases b1069 b1070 # b1072 b1076. if ( $rlec_count_by_seqno->{$seqno} && ( $rlec_count_by_seqno->{$seqno} > 1 || $rtype_count_by_seqno->{$seqno}->{'=>'} ) ) { $rhas_broken_list_with_lec->{$seqno_parent} = 1; } } $seqno_parent = $rparent_of_seqno->{$seqno_parent}; } } # Handle code blocks ... # The -lp option needs to know if a container holds a code block elsif ( $block_type && $rOpts_line_up_parentheses ) { my $seqno_parent = $rparent_of_seqno->{$seqno}; while ( defined($seqno_parent) && $seqno_parent ne SEQ_ROOT ) { $rhas_code_block->{$seqno_parent} = 1; $rhas_broken_code_block->{$seqno_parent} = $line_diff; $seqno_parent = $rparent_of_seqno->{$seqno_parent}; } } } # Find containers with ternaries, needed for -lp formatting. 
foreach my $seqno ( keys %{$K_opening_ternary} ) { my $seqno_parent = $rparent_of_seqno->{$seqno}; while ( defined($seqno_parent) && $seqno_parent ne SEQ_ROOT ) { $rhas_ternary->{$seqno_parent} = 1; $seqno_parent = $rparent_of_seqno->{$seqno_parent}; } } # Turn off -lp for containers with here-docs with text within a container, # since they have their own fixed indentation. Fixes case b1081. if ($rOpts_line_up_parentheses) { foreach my $seqno ( keys %K_first_here_doc_by_seqno ) { my $Kh = $K_first_here_doc_by_seqno{$seqno}; my $Kc = $K_closing_container->{$seqno}; my $line_Kh = $rLL_new->[$Kh]->[_LINE_INDEX_]; my $line_Kc = $rLL_new->[$Kc]->[_LINE_INDEX_]; next if ( $line_Kh == $line_Kc ); $ris_excluded_lp_container->{$seqno} = 1; } } # Set a flag to turn off -cab=3 in complex structures. Otherwise, # instability can occur. When it is overridden the behavior of the closest # match, -cab=2, will be used instead. This fixes cases b1096 b1113. if ( $rOpts_comma_arrow_breakpoints == 3 ) { foreach my $seqno ( keys %{$K_opening_container} ) { my $rtype_count = $rtype_count_by_seqno->{$seqno}; next unless ( $rtype_count && $rtype_count->{'=>'} ); # override -cab=3 if this contains a sub-list if ( !defined( $roverride_cab3->{$seqno} ) ) { if ( $rhas_list->{$seqno} ) { $roverride_cab3->{$seqno} = 2; } # or if this is a sub-list of its parent container else { my $seqno_parent = $rparent_of_seqno->{$seqno}; if ( defined($seqno_parent) && $ris_list_by_seqno->{$seqno_parent} ) { $roverride_cab3->{$seqno} = 2; } } } } } # Add -ci to C-style for loops (issue c154) # This is much easier to do here than in the tokenizer. foreach my $seqno ( keys %is_C_style_for ) { my $K_opening = $K_opening_container->{$seqno}; my $K_closing = $K_closing_container->{$seqno}; my $type_last = 'f'; for my $KK ( $K_opening + 1 .. $K_closing - 1 ) { $rLL_new->[$KK]->[_CI_LEVEL_] = $type_last eq 'f' ? 
0 : 1; my $type = $rLL_new->[$KK]->[_TYPE_]; if ( $type ne 'b' && $type ne '#' ) { $type_last = $type } } } return; } ## end sub respace_post_loop_ops sub set_permanently_broken { my ( $self, $seqno ) = @_; while ( defined($seqno) ) { $ris_permanently_broken->{$seqno} = 1; $seqno = $rparent_of_seqno->{$seqno}; } return; } ## end sub set_permanently_broken sub store_token { my ( $self, $item ) = @_; #------------------------------------------ # Store one token during respace operations #------------------------------------------ # Input parameter: # $item = ref to a token # NOTE: this sub is called once per token so coding efficiency is critical. # The next multiple assignment statements are significantly faster than # doing them one-by-one. my ( $type, $token, $type_sequence, ) = @{$item}[ _TYPE_, _TOKEN_, _TYPE_SEQUENCE_, ]; # Set the token length. Later it may be adjusted again if phantom or # ignoring side comment lengths. my $token_length = $is_encoded_data ? $length_function->($token) : length($token); # handle blanks if ( $type eq 'b' ) { # Do not output consecutive blanks. This situation should have been # prevented earlier, but it is worth checking because later routines # make this assumption. 
if ( @{$rLL_new} && $rLL_new->[-1]->[_TYPE_] eq 'b' ) { return; } } # handle comments elsif ( $type eq '#' ) { # trim comments if necessary my $ord = ord( substr( $token, -1, 1 ) ); if ( $ord > 0 && ( $ord < ORD_PRINTABLE_MIN || $ord > ORD_PRINTABLE_MAX ) && $token =~ s/\s+$// ) { $token_length = $length_function->($token); $item->[_TOKEN_] = $token; } # Mark length of side comments as just 1 if sc lengths are ignored if ( $rOpts_ignore_side_comment_lengths && ( !$CODE_type || $CODE_type eq 'HSC' ) ) { $token_length = 1; } my $seqno = $seqno_stack{ $depth_next - 1 }; if ( defined($seqno) ) { $self->[_rblank_and_comment_count_]->{$seqno} += 1 if ( $CODE_type eq 'BC' ); $self->set_permanently_broken($seqno) if !$ris_permanently_broken->{$seqno}; } } # handle non-blanks and non-comments else { my $block_type; # check for a sequenced item (i.e., container or ?/:) if ($type_sequence) { # This will be the index of this item in the new array my $KK_new = @{$rLL_new}; if ( $is_opening_token{$token} ) { $K_opening_container->{$type_sequence} = $KK_new; $block_type = $rblock_type_of_seqno->{$type_sequence}; # Fix for case b1100: Count a line ending in ', [' as having # a line-ending comma. 
Otherwise, these commas can be hidden # with something like --opening-square-bracket-right if ( $last_nonblank_code_type eq ',' && $Ktoken_vars == $Klast_old_code && $Ktoken_vars > $Kfirst_old ) { $rlec_count_by_seqno->{$type_sequence}++; } if ( $last_nonblank_code_type eq '=' || $last_nonblank_code_type eq '=>' ) { $ris_assigned_structure->{$type_sequence} = $last_nonblank_code_type; } my $seqno_parent = $seqno_stack{ $depth_next - 1 }; $seqno_parent = SEQ_ROOT unless defined($seqno_parent); push @{ $rchildren_of_seqno->{$seqno_parent} }, $type_sequence; $rparent_of_seqno->{$type_sequence} = $seqno_parent; $seqno_stack{$depth_next} = $type_sequence; $K_old_opening_by_seqno{$type_sequence} = $Ktoken_vars; $depth_next++; if ( $depth_next > $depth_next_max ) { $depth_next_max = $depth_next; } } elsif ( $is_closing_token{$token} ) { $K_closing_container->{$type_sequence} = $KK_new; $block_type = $rblock_type_of_seqno->{$type_sequence}; # Do not include terminal commas in counts if ( $last_nonblank_code_type eq ',' || $last_nonblank_code_type eq '=>' ) { $rtype_count_by_seqno->{$type_sequence} ->{$last_nonblank_code_type}--; if ( $Ktoken_vars == $Kfirst_old && $last_nonblank_code_type eq ',' && $rlec_count_by_seqno->{$type_sequence} ) { $rlec_count_by_seqno->{$type_sequence}--; } } # Update the stack... $depth_next--; } else { # For ternary, note parent but do not include as child my $seqno_parent = $seqno_stack{ $depth_next - 1 }; $seqno_parent = SEQ_ROOT unless defined($seqno_parent); $rparent_of_seqno->{$type_sequence} = $seqno_parent; # These are not yet used but could be useful if ( $token eq '?' ) { $K_opening_ternary->{$type_sequence} = $KK_new; } elsif ( $token eq ':' ) { $K_closing_ternary->{$type_sequence} = $KK_new; } else { # We really shouldn't arrive here, just being cautious: # The only sequenced types output by the tokenizer are the # opening & closing containers and the ternary types. Each # of those was checked above. 
So we would only get here # if the tokenizer has been changed to mark some other # tokens with sequence numbers. if (DEVEL_MODE) { Fault( "Unexpected token type with sequence number: type='$type', seqno='$type_sequence'" ); } } } } # Remember the most recent two non-blank, non-comment tokens. # NOTE: the phantom semicolon code may change the output stack # without updating these values. Phantom semicolons are considered # the same as blanks for now, but future needs might change that. # See the related note in sub 'add_phantom_semicolon'. $last_last_nonblank_code_type = $last_nonblank_code_type; $last_last_nonblank_code_token = $last_nonblank_code_token; $last_nonblank_code_type = $type; $last_nonblank_code_token = $token; $last_nonblank_block_type = $block_type; # count selected types if ( $is_counted_type{$type} ) { my $seqno = $seqno_stack{ $depth_next - 1 }; if ( defined($seqno) ) { $rtype_count_by_seqno->{$seqno}->{$type}++; # Count line-ending commas for -bbx if ( $type eq ',' && $Ktoken_vars == $Klast_old_code ) { $rlec_count_by_seqno->{$seqno}++; } # Remember index of first here doc target if ( $type eq 'h' && !$K_first_here_doc_by_seqno{$seqno} ) { my $KK_new = @{$rLL_new}; $K_first_here_doc_by_seqno{$seqno} = $KK_new; } } } } # cumulative length is the length sum including this token $cumulative_length += $token_length; $item->[_CUMULATIVE_LENGTH_] = $cumulative_length; $item->[_TOKEN_LENGTH_] = $token_length; # For reference, here is how to get the parent sequence number. # This is not used because it is slower than finding it on the fly # in sub parent_seqno_by_K: # my $seqno_parent = # $type_sequence && $is_opening_token{$token} # ? 
    #         $seqno_stack{ $depth_next - 2 }
    #       : $seqno_stack{ $depth_next - 1 };
    #      my $KK = @{$rLL_new};
    #      $rseqno_of_parent_by_K->{$KK} = $seqno_parent;

    # and finally, add this item to the new array
    push @{$rLL_new}, $item;
    return;
} ## end sub store_token

sub store_space {
    my ($self) = @_;

    # Store a blank space in the new array
    #  - but never start the array with a space
    #  - and never store two consecutive spaces
    if ( @{$rLL_new}
        && $rLL_new->[-1]->[_TYPE_] ne 'b' )
    {
        my $ritem = [];
        $ritem->[_TYPE_]          = 'b';
        $ritem->[_TOKEN_]         = SPACE;
        $ritem->[_TYPE_SEQUENCE_] = EMPTY_STRING;

        $ritem->[_LINE_INDEX_] =
          $rLL_new->[-1]->[_LINE_INDEX_];

        # The level and ci_level of newly created spaces should be the same
        # as the previous token. Otherwise the coding for the -lp option
        # can create a blinking state in some rare cases (see b1109, b1110).
        $ritem->[_LEVEL_] =
          $rLL_new->[-1]->[_LEVEL_];
        $ritem->[_CI_LEVEL_] =
          $rLL_new->[-1]->[_CI_LEVEL_];

        $self->store_token($ritem);
    }

    return;
} ## end sub store_space

sub add_phantom_semicolon {

    my ( $self, $KK ) = @_;

    # The token at old index $KK is a closing block brace, and not preceded
    # by a semicolon. Before we push it onto the new token list, we may
    # want to add a phantom semicolon which can be activated if the
    # block is broken on output.

    # We are only adding semicolons for certain block types
    my $type_sequence = $rLL->[$KK]->[_TYPE_SEQUENCE_];
    return unless ($type_sequence);
    my $block_type = $rblock_type_of_seqno->{$type_sequence};
    return unless ($block_type);
    return
      unless ( $ok_to_add_semicolon_for_block_type{$block_type}
        || $block_type =~ /^(sub|package)/
        || $block_type =~ /^\w+\:$/ );

    # Find the most recent token in the new token list
    my $Kp = $self->K_previous_nonblank( undef, $rLL_new );
    return unless ( defined($Kp) );    # shouldn't happen except for bad input

    my $type_p          = $rLL_new->[$Kp]->[_TYPE_];
    my $token_p         = $rLL_new->[$Kp]->[_TOKEN_];
    my $type_sequence_p = $rLL_new->[$Kp]->[_TYPE_SEQUENCE_];

    # Do not add a semicolon if...
return if ( # it would follow a comment (and be isolated) $type_p eq '#' # it follows a code block ( because they are not always wanted # there and may add clutter) || $type_sequence_p && $rblock_type_of_seqno->{$type_sequence_p} # it would follow a label || $type_p eq 'J' # it would be inside a 'format' statement (and cause syntax error) || ( $type_p eq 'k' && $token_p =~ /format/ ) ); # Do not add a semicolon if it would impede a weld with an immediately # following closing token...like this # { ( some code ) } # ^--No semicolon can go here # look at the previous token... note use of the _NEW rLL array here, # but sequence numbers are invariant. my $seqno_inner = $rLL_new->[$Kp]->[_TYPE_SEQUENCE_]; # If it is also a CLOSING token we have to look closer... if ( $seqno_inner && $is_closing_token{$token_p} # we only need to look if there is just one inner container.. && defined( $rchildren_of_seqno->{$type_sequence} ) && @{ $rchildren_of_seqno->{$type_sequence} } == 1 ) { # Go back and see if the corresponding two OPENING tokens are also # together. Note that we are using the OLD K indexing here: my $K_outer_opening = $K_old_opening_by_seqno{$type_sequence}; if ( defined($K_outer_opening) ) { my $K_nxt = $self->K_next_nonblank($K_outer_opening); if ( defined($K_nxt) ) { my $seqno_nxt = $rLL->[$K_nxt]->[_TYPE_SEQUENCE_]; # Is the next token after the outer opening the same as # our inner closing (i.e. same sequence number)? # If so, do not insert a semicolon here. return if ( $seqno_nxt && $seqno_nxt == $seqno_inner ); } } } # We will insert an empty semicolon here as a placeholder. Later, if # it becomes the last token on a line, we will bring it to life. The # advantage of doing this is that (1) we just have to check line # endings, and (2) the phantom semicolon has zero width and therefore # won't cause needless breaks of one-line blocks. 
my $Ktop = -1; if ( $rLL_new->[$Ktop]->[_TYPE_] eq 'b' && $want_left_space{';'} == WS_NO ) { # convert the blank into a semicolon.. # be careful: we are working on the new stack top # on a token which has been stored. my $rcopy = copy_token_as_type( $rLL_new->[$Ktop], 'b', SPACE ); # Convert the existing blank to: # a phantom semicolon for one_line_block option = 0 or 1 # a real semicolon for one_line_block option = 2 my $tok = EMPTY_STRING; my $len_tok = 0; if ( $rOpts_one_line_block_semicolons == 2 ) { $tok = ';'; $len_tok = 1; } $rLL_new->[$Ktop]->[_TOKEN_] = $tok; $rLL_new->[$Ktop]->[_TOKEN_LENGTH_] = $len_tok; $rLL_new->[$Ktop]->[_TYPE_] = ';'; $self->[_rtype_count_by_seqno_]->{$type_sequence}->{';'}++; # NOTE: we are changing the output stack without updating variables # $last_nonblank_code_type, etc. Future needs might require that # those variables be updated here. For now, it seems ok to skip # this. # Then store a new blank $self->store_token($rcopy); } else { # Patch for issue c078: keep line indexes in order. If the top # token is a space that we are keeping (due to '-wls=';') then # we have to check that old line indexes stay in order. # In very rare # instances in which side comments have been deleted and converted # into blanks, we may have filtered down multiple blanks into just # one. In that case the top blank may have a higher line number # than the previous nonblank token. Although the line indexes of # blanks are not really significant, we need to keep them in order # in order to pass error checks. 
        if ( $rLL_new->[$Ktop]->[_TYPE_] eq 'b' ) {
            my $old_top_ix = $rLL_new->[$Ktop]->[_LINE_INDEX_];
            my $new_top_ix = $rLL_new->[$Kp]->[_LINE_INDEX_];
            if ( $new_top_ix < $old_top_ix ) {
                $rLL_new->[$Ktop]->[_LINE_INDEX_] = $new_top_ix;
            }
        }

        my $rcopy = copy_token_as_type( $rLL_new->[$Kp], ';', EMPTY_STRING );
        $self->store_token($rcopy);
    }
    return;
} ## end sub add_phantom_semicolon

sub add_trailing_comma {

    # Implement the --add-trailing-commas flag to the line end before index $KK:

    my ( $self, $KK, $Kfirst, $trailing_comma_rule ) = @_;

    # Input parameter:
    #  $KK = index of closing token in old ($rLL) token list
    #        which starts a new line and is not preceded by a comma
    #  $Kfirst = index of first token on the current line of input tokens
    #  $trailing_comma_rule = user control flags

    # For example, we might want to add a comma here:

    #   bless {
    #           _name   => $name,
    #           _price  => $price,
    #           _rebate => $rebate  <------ location of possible bare comma
    #          }, $pkg;
    #          ^-------------------closing token at index $KK on new line

    # Do not add a comma if it would follow a comment
    my $Kp = $self->K_previous_nonblank( undef, $rLL_new );
    return unless ( defined($Kp) );
    my $type_p = $rLL_new->[$Kp]->[_TYPE_];
    return if ( $type_p eq '#' );

    # see if the user wants a trailing comma here
    my $match =
      $self->match_trailing_comma_rule( $KK, $Kfirst, $Kp,
        $trailing_comma_rule, 1 );

    # if so, add a comma
    if ($match) {
        my $Knew = $self->store_new_token( ',', ',', $Kp );
    }

    return;

} ## end sub add_trailing_comma

sub delete_trailing_comma {

    my ( $self, $KK, $Kfirst, $trailing_comma_rule ) = @_;

    # Apply the --delete-trailing-commas flag to the comma before index $KK

    # Input parameter:
    #  $KK = index of a closing token in OLD ($rLL) token list
    #        which is preceded by a comma on the same line.
# $Kfirst = index of first token on the current line of input tokens # $delete_option = user control flag # Returns true if the comma was deleted # For example, we might want to delete this comma: # my @asset = ("FASMX", "FASGX", "FASIX",); # | |^--------token at index $KK # | ^------comma of interest # ^-------------token at $Kfirst # Verify that the previous token is a comma. Note that we are working in # the new token list $rLL_new. my $Kp = $self->K_previous_nonblank( undef, $rLL_new ); return unless ( defined($Kp) ); if ( $rLL_new->[$Kp]->[_TYPE_] ne ',' ) { # there must be a '#' between the ',' and closing token; give up. return; } # Do not delete commas when formatting under stress to avoid instability. # This fixes b1389, b1390, b1391, b1392. The $high_stress_level has # been found to work well for trailing commas. if ( $rLL_new->[$Kp]->[_LEVEL_] >= $high_stress_level ) { return; } # See if the user wants this trailing comma my $match = $self->match_trailing_comma_rule( $KK, $Kfirst, $Kp, $trailing_comma_rule, 0 ); # Patch: the --noadd-whitespace flag can cause instability in complex # structures. In this case do not delete the comma. Fixes b1409. if ( !$match && !$rOpts_add_whitespace ) { my $Kn = $self->K_next_nonblank($KK); if ( defined($Kn) ) { my $type_n = $rLL->[$Kn]->[_TYPE_]; if ( $type_n ne ';' && $type_n ne '#' ) { return } } } # If no match, delete it if ( !$match ) { return $self->unstore_last_nonblank_token(','); } return; } ## end sub delete_trailing_comma sub delete_weld_interfering_comma { my ( $self, $KK ) = @_; # Apply the flag '--delete-weld-interfering-commas' to the comma # before index $KK # Input parameter: # $KK = index of a closing token in OLD ($rLL) token list # which is preceded by a comma on the same line. 
# Returns true if the comma was deleted # For example, we might want to delete this comma: # my $tmpl = { foo => {no_override => 1, default => 42}, }; # || ^------$KK # |^---$Kp # $Kpp---^ # # Note that: # index $KK is in the old $rLL array, but # indexes $Kp and $Kpp are in the new $rLL_new array. my $type_sequence = $rLL->[$KK]->[_TYPE_SEQUENCE_]; return unless ($type_sequence); # Find the previous token and verify that it is a comma. my $Kp = $self->K_previous_nonblank( undef, $rLL_new ); return unless ( defined($Kp) ); if ( $rLL_new->[$Kp]->[_TYPE_] ne ',' ) { # it is not a comma, so give up ( it is probably a '#' ) return; } # This must be the only comma in this list my $rtype_count = $self->[_rtype_count_by_seqno_]->{$type_sequence}; return unless ( defined($rtype_count) && $rtype_count->{','} && $rtype_count->{','} == 1 ); # Back up to the previous closing token my $Kpp = $self->K_previous_nonblank( $Kp, $rLL_new ); return unless ( defined($Kpp) ); my $seqno_pp = $rLL_new->[$Kpp]->[_TYPE_SEQUENCE_]; my $type_pp = $rLL_new->[$Kpp]->[_TYPE_]; # The containers must be nesting (i.e., sequence numbers must differ by 1 ) if ( $seqno_pp && $is_closing_type{$type_pp} ) { if ( $seqno_pp == $type_sequence + 1 ) { # remove the ',' from the top of the new token list return $self->unstore_last_nonblank_token(','); } } return; } ## end sub delete_weld_interfering_comma sub unstore_last_nonblank_token { my ( $self, $type ) = @_; # remove the most recent nonblank token from the new token list # Input parameter: # $type = type to be removed (for safety check) # Returns true if success # false if error # This was written and is used for removing commas, but might # be useful for other tokens. If it is ever used for other tokens # then the issue of what to do about the other variables, such # as token counts and the '$last...' vars needs to be considered. 
# Safety check, shouldn't happen if ( @{$rLL_new} < 3 ) { DEVEL_MODE && Fault("not enough tokens on stack to remove '$type'\n"); return; } my ( $rcomma, $rblank ); # case 1: pop comma from top of stack if ( $rLL_new->[-1]->[_TYPE_] eq $type ) { $rcomma = pop @{$rLL_new}; } # case 2: pop blank and then comma from top of stack elsif ($rLL_new->[-1]->[_TYPE_] eq 'b' && $rLL_new->[-2]->[_TYPE_] eq $type ) { $rblank = pop @{$rLL_new}; $rcomma = pop @{$rLL_new}; } # case 3: error, shouldn't happen unless bad call else { DEVEL_MODE && Fault("Could not find token type '$type' to remove\n"); return; } # A note on updating vars set by sub store_token for this comma: If we # reduce the comma count by 1 then we also have to change the variable # $last_nonblank_code_type to be $last_last_nonblank_code_type because # otherwise sub store_token is going to ALSO reduce the comma count. # Alternatively, we can leave the count alone and the # $last_nonblank_code_type alone. Then sub store_token will produce # the correct result. This is simpler and is done here. # Now add a blank space after the comma if appropriate. # Some unusual spacing controls might need another iteration to # reach a final state. if ( $rLL_new->[-1]->[_TYPE_] ne 'b' ) { if ( defined($rblank) ) { $rblank->[_CUMULATIVE_LENGTH_] -= 1; # fix for deleted comma push @{$rLL_new}, $rblank; } } return 1; } ## end sub unstore_last_nonblank_token sub match_trailing_comma_rule { my ( $self, $KK, $Kfirst, $Kp, $trailing_comma_rule, $if_add ) = @_; # Decide if a trailing comma rule is matched. # Input parameter: # $KK = index of closing token in old ($rLL) token list which follows # the location of a possible trailing comma. See diagram below. 
    #   $Kfirst = (old) index of first token on the current line of input tokens
    #   $Kp = index of previous nonblank token in new ($rLL_new) array
    #   $trailing_comma_rule = packed user control flags
    #   $if_add = true if adding comma, false if deleting comma

    # Returns:
    #   false if no match
    #   true  if match

    # For example, we might be checking for addition of a comma here:

    #   bless {
    #           _name   => $name,
    #           _price  => $price,
    #           _rebate => $rebate  <------ location of possible trailing comma
    #          }, $pkg;
    #          ^-------------------closing token at index $KK

    return unless ($trailing_comma_rule);
    my ( $trailing_comma_style, $paren_flag ) = @{$trailing_comma_rule};

    # List of $trailing_comma_style values:
    #   undef  stable: do not change
    #   '0' : no list should have a trailing comma
    #   '1' or '*' : every list should have a trailing comma
    #   'm' a multi-line list should have a trailing comma
    #   'b' trailing commas should be 'bare' (comma followed by newline)
    #   'h' lists of key=>value pairs with a bare trailing comma
    #   'i' same as 'h' but also include any list with no more than about one
    #       comma per line
    #   ' ' or -wtc not defined : leave trailing commas unchanged [DEFAULT].

    # Note: an interesting generalization would be to let an upper case
    # letter denote the negation of styles 'm', 'b', 'h', 'i'. This might
    # be useful for undoing operations. It would be implemented as a wrapper
    # around this routine.
    #-----------------------------------------
    #  No style defined : do not add or delete
    #-----------------------------------------
    if ( !defined($trailing_comma_style) ) { return !$if_add }

    #----------------------------------------
    # Set some flags describing this location
    #----------------------------------------
    my $type_sequence = $rLL->[$KK]->[_TYPE_SEQUENCE_];
    return unless ($type_sequence);
    my $closing_token = $rLL->[$KK]->[_TOKEN_];
    my $rtype_count = $self->[_rtype_count_by_seqno_]->{$type_sequence};
    return unless ( defined($rtype_count) && $rtype_count->{','} );
    my $is_permanently_broken =
      $self->[_ris_permanently_broken_]->{$type_sequence};

    # Note that _ris_broken_container_ also stores the line diff
    # but it is not available at this early stage.
    my $K_opening = $self->[_K_opening_container_]->{$type_sequence};
    return if ( !defined($K_opening) );

    # multiline definition 1: opening and closing tokens on different lines
    my $iline_o = $rLL_new->[$K_opening]->[_LINE_INDEX_];
    my $iline_c = $rLL->[$KK]->[_LINE_INDEX_];
    my $line_diff_containers = $iline_c - $iline_o;
    my $has_multiline_containers = $line_diff_containers > 0;

    # multiline definition 2: first and last commas on different lines
    my $iline_first = $self->[_rfirst_comma_line_index_]->{$type_sequence};
    my $iline_last = $rLL_new->[$Kp]->[_LINE_INDEX_];
    my $has_multiline_commas;
    my $line_diff_commas = 0;
    if ( !defined($iline_first) ) {

        # shouldn't happen if caller checked comma count
        my $type_kp = $rLL_new->[$Kp]->[_TYPE_];
        Fault(
"at line $iline_last but line of first comma not defined, at Kp=$Kp, type=$type_kp\n"
        ) if (DEVEL_MODE);
    }
    else {
        $line_diff_commas = $iline_last - $iline_first;
        $has_multiline_commas = $line_diff_commas > 0;
    }

    # To avoid instability in edge cases, when adding commas we use the
    # multiline_commas definition, but when deleting we use multiline
    # containers. This fixes b1384, b1396, b1397, b1398, b1400.
    my $is_multiline = $if_add ?
$has_multiline_commas : $has_multiline_containers; my $is_bare_multiline_comma = $is_multiline && $KK == $Kfirst; my $match; #---------------------------- # 0 : does not match any list #---------------------------- if ( $trailing_comma_style eq '0' ) { $match = 0; } #------------------------------ # '*' or '1' : matches any list #------------------------------ elsif ( $trailing_comma_style eq '*' || $trailing_comma_style eq '1' ) { $match = 1; } #----------------------------- # 'm' matches a Multiline list #----------------------------- elsif ( $trailing_comma_style eq 'm' ) { $match = $is_multiline; } #---------------------------------- # 'b' matches a Bare trailing comma #---------------------------------- elsif ( $trailing_comma_style eq 'b' ) { $match = $is_bare_multiline_comma; } #-------------------------------------------------------------------------- # 'h' matches a bare hash list with about 1 comma and 1 fat comma per line. # 'i' matches a bare stable list with about 1 comma per line. #-------------------------------------------------------------------------- elsif ( $trailing_comma_style eq 'h' || $trailing_comma_style eq 'i' ) { # We can treat these together because they are similar. # The set of 'i' matches includes the set of 'h' matches. # the trailing comma must be bare for both 'h' and 'i' return if ( !$is_bare_multiline_comma ); # There must be no more than one comma per line for both 'h' and 'i' # The new_comma_count here will include the trailing comma. my $new_comma_count = $rtype_count->{','}; $new_comma_count += 1 if ($if_add); my $excess_commas = $new_comma_count - $line_diff_commas - 1; if ( $excess_commas > 0 ) { # Exception for a special edge case for option 'i': if the trailing # comma is followed by a blank line or comment, then it cannot be # covered. Then we can safely accept a small list to avoid # instability (issue b1443). 
            if (   $trailing_comma_style eq 'i'
                && $iline_c - $rLL_new->[$Kp]->[_LINE_INDEX_] > 1
                && $new_comma_count <= 2 )
            {
                $match = 1;
            }
            else {
                return;
            }
        }

        # a list of key=>value pairs with at least 2 fat commas is a match
        # for both 'h' and 'i'
        my $fat_comma_count = $rtype_count->{'=>'};
        if ( !$match && $fat_comma_count && $fat_comma_count >= 2 ) {

            # comma count (including trailer) and fat comma count must differ
            # by no more than 1. This allows for some small variations.
            my $comma_diff = $new_comma_count - $fat_comma_count;
            $match = ( $comma_diff >= -1 && $comma_diff <= 1 );
        }

        # For 'i' only, a list that can be shown to be stable is a match
        if ( !$match && $trailing_comma_style eq 'i' ) {
            $match = (
                $is_permanently_broken
                  || ( $rOpts_break_at_old_comma_breakpoints
                    && !$rOpts_ignore_old_breakpoints )
            );
        }
    }

    #-------------------------------------------------------------------------
    # Unrecognized parameter. This should have been caught in the input check.
    #-------------------------------------------------------------------------
    else {

        DEVEL_MODE && Fault("Unrecognized parameter '$trailing_comma_style'\n");

        # do not add or delete
        return !$if_add;
    }

    # Now do any special paren check
    if (   $match
        && $paren_flag
        && $paren_flag ne '1'
        && $paren_flag ne '*'
        && $closing_token eq ')' )
    {
        $match &&=
          $self->match_paren_control_flag( $type_sequence, $paren_flag,
            $rLL_new );
    }

    # Fix for b1379, b1380, b1381, b1382, b1384 part 1. Mark trailing commas
    # for use by -vtc logic to avoid instability when -dtc and -atc are both
    # active.
    if ($match) {
        if (   $if_add && $rOpts_delete_trailing_commas
            || !$if_add && $rOpts_add_trailing_commas )
        {
            $self->[_ris_bare_trailing_comma_by_seqno_]->{$type_sequence} = 1;

            # The combination of -atc and -dtc and -cab=3 can be unstable
            # (b1394). So we deactivate -cab=3 in this case.
            # A value of '0' or '4' is required for stability of case b1451.
if ( $rOpts_comma_arrow_breakpoints == 3 ) { $self->[_roverride_cab3_]->{$type_sequence} = 0; } } } return $match; } ## end sub match_trailing_comma_rule sub store_new_token { my ( $self, $type, $token, $Kp ) = @_; # Create and insert a completely new token into the output stream # Input parameters: # $type = the token type # $token = the token text # $Kp = index of the previous token in the new list, $rLL_new # Returns: # $Knew = index in $rLL_new of the new token # This operation is a little tricky because we are creating a new token and # we have to take care to follow the requested whitespace rules. my $Ktop = @{$rLL_new} - 1; my $top_is_space = $Ktop >= 0 && $rLL_new->[$Ktop]->[_TYPE_] eq 'b'; my $Knew; if ( $top_is_space && $want_left_space{$type} == WS_NO ) { #---------------------------------------------------- # Method 1: Convert the top blank into the new token. #---------------------------------------------------- # Be Careful: we are working on the top of the new stack, on a token # which has been stored. my $rcopy = copy_token_as_type( $rLL_new->[$Ktop], 'b', SPACE ); $Knew = $Ktop; $rLL_new->[$Knew]->[_TOKEN_] = $token; $rLL_new->[$Knew]->[_TOKEN_LENGTH_] = length($token); $rLL_new->[$Knew]->[_TYPE_] = $type; # NOTE: we are changing the output stack without updating variables # $last_nonblank_code_type, etc. Future needs might require that # those variables be updated here. For now, we just update the # type counts as necessary. if ( $is_counted_type{$type} ) { my $seqno = $seqno_stack{ $depth_next - 1 }; if ($seqno) { $self->[_rtype_count_by_seqno_]->{$seqno}->{$type}++; } } # Then store a new blank $self->store_token($rcopy); } else { #---------------------------------------- # Method 2: Use the normal storage method #---------------------------------------- # Patch for issue c078: keep line indexes in order. If the top # token is a space that we are keeping (due to '-wls=...) then # we have to check that old line indexes stay in order. 
# In very rare # instances in which side comments have been deleted and converted # into blanks, we may have filtered down multiple blanks into just # one. In that case the top blank may have a higher line number # than the previous nonblank token. Although the line indexes of # blanks are not really significant, we need to keep them in order # in order to pass error checks. if ($top_is_space) { my $old_top_ix = $rLL_new->[$Ktop]->[_LINE_INDEX_]; my $new_top_ix = $rLL_new->[$Kp]->[_LINE_INDEX_]; if ( $new_top_ix < $old_top_ix ) { $rLL_new->[$Ktop]->[_LINE_INDEX_] = $new_top_ix; } } my $rcopy = copy_token_as_type( $rLL_new->[$Kp], $type, $token ); $self->store_token($rcopy); $Knew = @{$rLL_new} - 1; } return $Knew; } ## end sub store_new_token sub check_Q { # Check that a quote looks okay, and report possible problems # to the logfile. my ( $self, $KK, $Kfirst, $line_number ) = @_; my $token = $rLL->[$KK]->[_TOKEN_]; if ( $token =~ /\t/ ) { $self->note_embedded_tab($line_number); } # The remainder of this routine looks for something like # '$var = s/xxx/yyy/;' # in case it should have been '$var =~ s/xxx/yyy/;' # Start by looking for a token beginning with one of: s y m / tr return unless ( $is_s_y_m_slash{ substr( $token, 0, 1 ) } || substr( $token, 0, 2 ) eq 'tr' ); # ... 
and preceded by one of: = == != my $Kp = $self->K_previous_nonblank( undef, $rLL_new ); return unless ( defined($Kp) ); my $previous_nonblank_type = $rLL_new->[$Kp]->[_TYPE_]; return unless ( $is_unexpected_equals{$previous_nonblank_type} ); my $previous_nonblank_token = $rLL_new->[$Kp]->[_TOKEN_]; my $previous_nonblank_type_2 = 'b'; my $previous_nonblank_token_2 = EMPTY_STRING; my $Kpp = $self->K_previous_nonblank( $Kp, $rLL_new ); if ( defined($Kpp) ) { $previous_nonblank_type_2 = $rLL_new->[$Kpp]->[_TYPE_]; $previous_nonblank_token_2 = $rLL_new->[$Kpp]->[_TOKEN_]; } my $next_nonblank_token = EMPTY_STRING; my $Kn = $KK + 1; my $Kmax = @{$rLL} - 1; if ( $Kn <= $Kmax && $rLL->[$Kn]->[_TYPE_] eq 'b' ) { $Kn += 1 } if ( $Kn <= $Kmax ) { $next_nonblank_token = $rLL->[$Kn]->[_TOKEN_]; } my $token_0 = $rLL->[$Kfirst]->[_TOKEN_]; my $type_0 = $rLL->[$Kfirst]->[_TYPE_]; if ( # preceded by simple scalar $previous_nonblank_type_2 eq 'i' && $previous_nonblank_token_2 =~ /^\$/ # followed by some kind of termination # (but give complaint if we can not see far enough ahead) && $next_nonblank_token =~ /^[; \)\}]$/ # scalar is not declared ## =~ /^(my|our|local)$/ && !( $type_0 eq 'k' && $is_my_our_local{$token_0} ) ) { my $lno = 1 + $rLL_new->[$Kp]->[_LINE_INDEX_]; my $guess = substr( $previous_nonblank_token, 0, 1 ) . '~'; complain( "Line $lno: Note: be sure you want '$previous_nonblank_token' instead of '$guess' here\n" ); } return; } ## end sub check_Q } ## end closure respace_tokens sub copy_token_as_type { # This provides a quick way to create a new token by # slightly modifying an existing token. my ( $rold_token, $type, $token ) = @_; if ( !defined($token) ) { if ( $type eq 'b' ) { $token = SPACE; } elsif ( $type eq 'q' ) { $token = EMPTY_STRING; } elsif ( $type eq '->' ) { $token = '->'; } elsif ( $type eq ';' ) { $token = ';'; } elsif ( $type eq ',' ) { $token = ','; } else { # Unexpected type ... 
this sub will work as long as both $token and # $type are defined, but we should catch any unexpected types during # development. if (DEVEL_MODE) { Fault(<' or ';' EOM } # Shouldn't get here $token = $type; } } my @rnew_token = @{$rold_token}; $rnew_token[_TYPE_] = $type; $rnew_token[_TOKEN_] = $token; $rnew_token[_TYPE_SEQUENCE_] = EMPTY_STRING; return \@rnew_token; } ## end sub copy_token_as_type sub K_next_code { my ( $self, $KK, $rLL ) = @_; # return the index K of the next nonblank, non-comment token return unless ( defined($KK) && $KK >= 0 ); # use the standard array unless given otherwise $rLL = $self->[_rLL_] unless ( defined($rLL) ); my $Num = @{$rLL}; my $Knnb = $KK + 1; while ( $Knnb < $Num ) { if ( !defined( $rLL->[$Knnb] ) ) { # We seem to have encountered a gap in our array. # This shouldn't happen because sub write_line() pushed # items into the $rLL array. Fault("Undefined entry for k=$Knnb") if (DEVEL_MODE); return; } if ( $rLL->[$Knnb]->[_TYPE_] ne 'b' && $rLL->[$Knnb]->[_TYPE_] ne '#' ) { return $Knnb; } $Knnb++; } return; } ## end sub K_next_code sub K_next_nonblank { my ( $self, $KK, $rLL ) = @_; # return the index K of the next nonblank token, or # return undef if none return unless ( defined($KK) && $KK >= 0 ); # The third arg allows this routine to be used on any array. This is # useful in sub respace_tokens when we are copying tokens from an old $rLL # to a new $rLL array. But usually the third arg will not be given and we # will just use the $rLL array in $self. $rLL = $self->[_rLL_] unless ( defined($rLL) ); my $Num = @{$rLL}; my $Knnb = $KK + 1; return unless ( $Knnb < $Num ); return $Knnb if ( $rLL->[$Knnb]->[_TYPE_] ne 'b' ); return unless ( ++$Knnb < $Num ); return $Knnb if ( $rLL->[$Knnb]->[_TYPE_] ne 'b' ); # Backup loop. Very unlikely to get here; it means we have neighboring # blanks in the token stream. 
$Knnb++; while ( $Knnb < $Num ) { # Safety check, this fault shouldn't happen: The $rLL array is the # main array of tokens, so all entries should be used. It is # initialized in sub write_line, and then re-initialized by sub # store_token() within sub respace_tokens. Tokens are pushed on # so there shouldn't be any gaps. if ( !defined( $rLL->[$Knnb] ) ) { Fault("Undefined entry for k=$Knnb") if (DEVEL_MODE); return; } if ( $rLL->[$Knnb]->[_TYPE_] ne 'b' ) { return $Knnb } $Knnb++; } return; } ## end sub K_next_nonblank sub K_previous_code { # return the index K of the previous nonblank, non-comment token # Call with $KK=undef to start search at the top of the array my ( $self, $KK, $rLL ) = @_; # use the standard array unless given otherwise $rLL = $self->[_rLL_] unless ( defined($rLL) ); my $Num = @{$rLL}; if ( !defined($KK) ) { $KK = $Num } elsif ( $KK > $Num ) { # This fault can be caused by a programming error in which a bad $KK is # given. The caller should make the first call with KK_new=undef to # avoid this error. Fault( "Program Bug: K_previous_nonblank_new called with K=$KK which exceeds $Num" ) if (DEVEL_MODE); return; } my $Kpnb = $KK - 1; while ( $Kpnb >= 0 ) { if ( $rLL->[$Kpnb]->[_TYPE_] ne 'b' && $rLL->[$Kpnb]->[_TYPE_] ne '#' ) { return $Kpnb; } $Kpnb--; } return; } ## end sub K_previous_code sub K_previous_nonblank { # return index of previous nonblank token before item K; # Call with $KK=undef to start search at the top of the array my ( $self, $KK, $rLL ) = @_; # use the standard array unless given otherwise $rLL = $self->[_rLL_] unless ( defined($rLL) ); my $Num = @{$rLL}; if ( !defined($KK) ) { $KK = $Num } elsif ( $KK > $Num ) { # This fault can be caused by a programming error in which a bad $KK is # given. The caller should make the first call with KK_new=undef to # avoid this error. 
Fault( "Program Bug: K_previous_nonblank_new called with K=$KK which exceeds $Num" ) if (DEVEL_MODE); return; } my $Kpnb = $KK - 1; return unless ( $Kpnb >= 0 ); return $Kpnb if ( $rLL->[$Kpnb]->[_TYPE_] ne 'b' ); return unless ( --$Kpnb >= 0 ); return $Kpnb if ( $rLL->[$Kpnb]->[_TYPE_] ne 'b' ); # Backup loop. We should not get here unless some routine # slipped repeated blanks into the token stream. return unless ( --$Kpnb >= 0 ); while ( $Kpnb >= 0 ) { if ( $rLL->[$Kpnb]->[_TYPE_] ne 'b' ) { return $Kpnb } $Kpnb--; } return; } ## end sub K_previous_nonblank sub parent_seqno_by_K { # Return the sequence number of the parent container of token K, if any. my ( $self, $KK ) = @_; my $rLL = $self->[_rLL_]; # The task is to jump forward to the next container token # and use the sequence number of either it or its parent. # For example, consider the following with seqno=5 of the '[' and ']' # being called with index K of the first token of each line: # # result # push @tests, # - # [ # - # sub { 99 }, 'do {&{%s} for 1,2}', # 5 # '(&{})(&{})', undef, # 5 # [ 2, 2, 0 ], 0 # 5 # ]; # - # NOTE: The ending parent will be SEQ_ROOT for a balanced file. For # unbalanced files, last sequence number will either be undefined or it may # be at a deeper level. In either case we will just return SEQ_ROOT to # have a defined value and allow formatting to proceed. 
my $parent_seqno = SEQ_ROOT; my $type_sequence = $rLL->[$KK]->[_TYPE_SEQUENCE_]; if ($type_sequence) { $parent_seqno = $self->[_rparent_of_seqno_]->{$type_sequence}; } else { my $Kt = $rLL->[$KK]->[_KNEXT_SEQ_ITEM_]; if ( defined($Kt) ) { $type_sequence = $rLL->[$Kt]->[_TYPE_SEQUENCE_]; my $type = $rLL->[$Kt]->[_TYPE_]; # if next container token is closing, it is the parent seqno if ( $is_closing_type{$type} ) { $parent_seqno = $type_sequence; } # otherwise we want its parent container else { $parent_seqno = $self->[_rparent_of_seqno_]->{$type_sequence}; } } } $parent_seqno = SEQ_ROOT unless ( defined($parent_seqno) ); return $parent_seqno; } ## end sub parent_seqno_by_K sub is_in_block_by_i { my ( $self, $i ) = @_; # returns true if # token at i is contained in a BLOCK # or is at root level # or there is some kind of error (i.e. unbalanced file) # returns false otherwise if ( $i < 0 ) { DEVEL_MODE && Fault("Bad call, i='$i'\n"); return 1; } my $seqno = $parent_seqno_to_go[$i]; return 1 if ( !$seqno || $seqno eq SEQ_ROOT ); return 1 if ( $self->[_rblock_type_of_seqno_]->{$seqno} ); return; } ## end sub is_in_block_by_i sub is_in_list_by_i { my ( $self, $i ) = @_; # returns true if token at i is contained in a LIST # returns false otherwise my $seqno = $parent_seqno_to_go[$i]; return unless ( $seqno && $seqno ne SEQ_ROOT ); if ( $self->[_ris_list_by_seqno_]->{$seqno} ) { return 1; } return; } ## end sub is_in_list_by_i sub is_list_by_K { # Return true if token K is in a list my ( $self, $KK ) = @_; my $parent_seqno = $self->parent_seqno_by_K($KK); return unless defined($parent_seqno); return $self->[_ris_list_by_seqno_]->{$parent_seqno}; } ## end sub is_list_by_K sub is_list_by_seqno { # Return true if the immediate contents of a container appears to be a # list. 
    my ( $self, $seqno ) = @_;
    return unless defined($seqno);
    return $self->[_ris_list_by_seqno_]->{$seqno};
} ## end sub is_list_by_seqno

sub resync_lines_and_tokens {

    my $self = shift;

    # Re-construct the arrays of tokens associated with the original input
    # lines since they have probably changed due to inserting and deleting
    # blanks and a few other tokens.

    # Return parameters:
    #   set severe_error = true if processing needs to terminate
    my $severe_error;
    my $rqw_lines = [];

    my $rLL = $self->[_rLL_];
    my $Klimit = $self->[_Klimit_];
    my $rlines = $self->[_rlines_];
    my @Krange_code_without_comments;
    my @Klast_valign_code;

    # This is the next token and its line index:
    my $Knext = 0;
    my $Kmax = defined($Klimit) ? $Klimit : -1;

    # Verify that old line indexes are still in order. If this error occurs,
    # check locations where sub 'respace_tokens' creates new tokens (like
    # blank spaces). It must have set a bad old line index.
    if ( DEVEL_MODE && defined($Klimit) ) {
        my $iline = $rLL->[0]->[_LINE_INDEX_];
        foreach my $KK ( 1 .. $Klimit ) {
            my $iline_last = $iline;
            $iline = $rLL->[$KK]->[_LINE_INDEX_];
            if ( $iline < $iline_last ) {
                my $KK_m = $KK - 1;
                my $token_m = $rLL->[$KK_m]->[_TOKEN_];
                my $token = $rLL->[$KK]->[_TOKEN_];
                my $type_m = $rLL->[$KK_m]->[_TYPE_];
                my $type = $rLL->[$KK]->[_TYPE_];
                Fault(<{_line_type};
        if ( $line_type eq 'CODE' ) {

            # Get the old number of tokens on this line
            my $rK_range_old = $line_of_tokens->{_rK_range};
            my ( $Kfirst_old, $Klast_old ) = @{$rK_range_old};
            my $Kdiff_old = 0;
            if ( defined($Kfirst_old) ) {
                $Kdiff_old = $Klast_old - $Kfirst_old;
            }

            # Find the range of NEW K indexes for the line:
            # $Kfirst = index of first token on line
            # $Klast = index of last token on line
            my ( $Kfirst, $Klast );

            my $Knext_beg = $Knext;    # this will be $Kfirst if we find tokens

            # Optimization: Although the actual K indexes may be completely
            # changed after respacing, the number of tokens on any given line
            # will often be nearly unchanged.
So we will see if we can start # our search by guessing that the new line has the same number # of tokens as the old line. my $Knext_guess = $Knext + $Kdiff_old; if ( $Knext_guess > $Knext && $Knext_guess < $Kmax && $rLL->[$Knext_guess]->[_LINE_INDEX_] <= $iline ) { # the guess is good, so we can start our search here $Knext = $Knext_guess + 1; } while ($Knext <= $Kmax && $rLL->[$Knext]->[_LINE_INDEX_] <= $iline ) { $Knext++; } if ( $Knext > $Knext_beg ) { $Klast = $Knext - 1; # Delete any terminal blank token if ( $rLL->[$Klast]->[_TYPE_] eq 'b' ) { $Klast -= 1 } if ( $Klast < $Knext_beg ) { $Klast = undef; } else { $Kfirst = $Knext_beg; # Save ranges of non-comment code. This will be used by # sub keep_old_line_breaks. if ( $rLL->[$Kfirst]->[_TYPE_] ne '#' ) { push @Krange_code_without_comments, [ $Kfirst, $Klast ]; } # Only save ending K indexes of code types which are blank # or 'VER'. These will be used for a convergence check. # See related code in sub 'convey_batch_to_vertical_aligner' my $CODE_type = $line_of_tokens->{_code_type}; if ( !$CODE_type || $CODE_type eq 'VER' ) { push @Klast_valign_code, $Klast; } } } # It is only safe to trim the actual line text if the input # line had a terminal blank token. Otherwise, we may be # in a quote. 
            if ( $line_of_tokens->{_ended_in_blank_token} ) {
                $line_of_tokens->{_line_text} =~ s/\s+$//;
            }
            $line_of_tokens->{_rK_range} = [ $Kfirst, $Klast ];

            # Deleting semicolons can create new empty code lines
            # which should be marked as blank
            if ( !defined($Kfirst) ) {
                my $CODE_type = $line_of_tokens->{_code_type};
                if ( !$CODE_type ) {
                    $line_of_tokens->{_code_type} = 'BL';
                }
            }
            else {

                #---------------------------------------------------
                # save indexes of all lines with a 'q' at either end
                # for later use by sub find_multiline_qw
                #---------------------------------------------------
                if (   $rLL->[$Kfirst]->[_TYPE_] eq 'q'
                    || $rLL->[$Klast]->[_TYPE_] eq 'q' )
                {
                    push @{$rqw_lines}, $iline;
                }
            }
        }
    }

    # There shouldn't be any nodes beyond the last one. This routine is
    # relinking lines and tokens after the tokens have been respaced. A fault
    # here indicates some kind of bug has been introduced into the above loops.
    # There is no good way to keep going; we had better stop here.
    if ( $Knext <= $Kmax ) {
        Fault_Warn(
            "unexpected tokens at end of file when reconstructing lines");
        $severe_error = 1;
        return ( $severe_error, $rqw_lines );
    }
    $self->[_rKrange_code_without_comments_] = \@Krange_code_without_comments;

    # Setup the convergence test in the FileWriter based on line-ending indexes
    my $file_writer_object = $self->[_file_writer_object_];
    $file_writer_object->setup_convergence_test( \@Klast_valign_code );

    return ( $severe_error, $rqw_lines );

} ## end sub resync_lines_and_tokens

sub check_for_old_break {
    my ( $self, $KK, $rkeep_break_hash, $rbreak_hash ) = @_;

    # This sub is called to help implement flags:
    # --keep-old-breakpoints-before and --keep-old-breakpoints-after
    # Given:
    #   $KK = index of a token,
    #   $rkeep_break_hash = user control for --keep-old-...
# $rbreak_hash = hash of tokens where breaks are requested # Set $rbreak_hash as follows if a user break is requested: # = 1 make a hard break (flush the current batch) # best for something like leading commas (-kbb=',') # = 2 make a soft break (keep building current batch) # best for something like leading -> my $rLL = $self->[_rLL_]; my $seqno = $rLL->[$KK]->[_TYPE_SEQUENCE_]; # non-container tokens use the type as the key if ( !$seqno ) { my $type = $rLL->[$KK]->[_TYPE_]; if ( $rkeep_break_hash->{$type} ) { $rbreak_hash->{$KK} = $is_soft_keep_break_type{$type} ? 2 : 1; } } # container tokens use the token as the key else { my $token = $rLL->[$KK]->[_TOKEN_]; my $flag = $rkeep_break_hash->{$token}; if ($flag) { my $match = $flag eq '1' || $flag eq '*'; # check for special matching codes if ( !$match ) { if ( $token eq '(' || $token eq ')' ) { $match = $self->match_paren_control_flag( $seqno, $flag ); } elsif ( $token eq '{' || $token eq '}' ) { # These tentative codes 'b' and 'B' for brace types are # placeholders for possible future brace types. They # are not documented and may be changed. my $block_type = $self->[_rblock_type_of_seqno_]->{$seqno}; if ( $flag eq 'b' ) { $match = $block_type } elsif ( $flag eq 'B' ) { $match = !$block_type } else { # unknown code - no match } } } if ($match) { my $type = $rLL->[$KK]->[_TYPE_]; $rbreak_hash->{$KK} = $is_soft_keep_break_type{$type} ? 2 : 1; } } } return; } ## end sub check_for_old_break sub keep_old_line_breaks { # Called once per file to find and mark any old line breaks which # should be kept. We will be translating the input hashes into # token indexes. 
# A flag is set as follows: # = 1 make a hard break (flush the current batch) # best for something like leading commas (-kbb=',') # = 2 make a soft break (keep building current batch) # best for something like leading -> my ($self) = @_; my $rLL = $self->[_rLL_]; my $rKrange_code_without_comments = $self->[_rKrange_code_without_comments_]; my $rbreak_before_Kfirst = $self->[_rbreak_before_Kfirst_]; my $rbreak_after_Klast = $self->[_rbreak_after_Klast_]; my $rbreak_container = $self->[_rbreak_container_]; #---------------------------------------- # Apply --break-at-old-method-breakpoints #---------------------------------------- # This code moved here from sub break_lists to fix b1120 if ( $rOpts->{'break-at-old-method-breakpoints'} ) { foreach my $item ( @{$rKrange_code_without_comments} ) { my ( $Kfirst, $Klast ) = @{$item}; my $type = $rLL->[$Kfirst]->[_TYPE_]; my $token = $rLL->[$Kfirst]->[_TOKEN_]; # leading '->' use a value of 2 which causes a soft # break rather than a hard break if ( $type eq '->' ) { $rbreak_before_Kfirst->{$Kfirst} = 2; } # leading ')->' use a special flag to insure that both # opening and closing parens get opened # Fix for b1120: only for parens, not braces elsif ( $token eq ')' ) { my $Kn = $self->K_next_nonblank($Kfirst); next unless ( defined($Kn) && $Kn <= $Klast && $rLL->[$Kn]->[_TYPE_] eq '->' ); my $seqno = $rLL->[$Kfirst]->[_TYPE_SEQUENCE_]; next unless ($seqno); # Note: in previous versions there was a fix here to avoid # instability between conflicting -bom and -pvt or -pvtc flags. # The fix skipped -bom for a small line difference. But this # was troublesome, and instead the fix has been moved to # sub set_vertical_tightness_flags where priority is given to # the -bom flag over -pvt and -pvtc flags. Both opening and # closing paren flags are involved because even though -bom only # requests breaking before the closing paren, automated logic # opens the opening paren when the closing paren opens. 
# Relevant cases are b977, b1215, b1270, b1303 $rbreak_container->{$seqno} = 1; } } } #--------------------------------------------------------------------- # Apply --keep-old-breakpoints-before and --keep-old-breakpoints-after #--------------------------------------------------------------------- return unless ( %keep_break_before_type || %keep_break_after_type ); foreach my $item ( @{$rKrange_code_without_comments} ) { my ( $Kfirst, $Klast ) = @{$item}; $self->check_for_old_break( $Kfirst, \%keep_break_before_type, $rbreak_before_Kfirst ); $self->check_for_old_break( $Klast, \%keep_break_after_type, $rbreak_after_Klast ); } return; } ## end sub keep_old_line_breaks sub weld_containers { # Called once per file to do any welding operations requested by --weld* # flags. my ($self) = @_; # This count is used to eliminate needless calls for weld checks elsewhere $total_weld_count = 0; return if ( $rOpts->{'indent-only'} ); return unless ($rOpts_add_newlines); # Important: sub 'weld_cuddled_blocks' must be called before # sub 'weld_nested_containers'. This is because the cuddled option needs to # use the original _LEVEL_ values of containers, but the weld nested # containers changes _LEVEL_ of welded containers. # Here is a good test case to be sure that both cuddling and welding # are working and not interfering with each other: <> # perltidy -wn -ce # if ($BOLD_MATH) { ( # $labels, $comment, # join( '', '', &make_math( $mode, '', '', $_ ), '' ) # ) } else { ( # &process_math_in_latex( $mode, $math_style, $slevel, "\\mbox{$text}" ), # $after # ) } $self->weld_cuddled_blocks() if ( %{$rcuddled_block_types} ); if ( $rOpts->{'weld-nested-containers'} ) { $self->weld_nested_containers(); $self->weld_nested_quotes(); } #------------------------------------------------------------- # All welding is done. Finish setting up weld data structures. 
#------------------------------------------------------------- my $rLL = $self->[_rLL_]; my $rK_weld_left = $self->[_rK_weld_left_]; my $rK_weld_right = $self->[_rK_weld_right_]; my $rweld_len_right_at_K = $self->[_rweld_len_right_at_K_]; my @K_multi_weld; my @keys = keys %{$rK_weld_right}; $total_weld_count = @keys; # First pass to process binary welds. # This loop is processed in unsorted order for efficiency. foreach my $Kstart (@keys) { my $Kend = $rK_weld_right->{$Kstart}; # An error here would be due to an incorrect initialization introduced # in one of the above weld routines, like sub weld_nested. if ( $Kend <= $Kstart ) { Fault("Bad weld link: Kend=$Kend <= Kstart=$Kstart\n") if (DEVEL_MODE); next; } # Set weld values for all tokens in this welded pair foreach ( $Kstart + 1 .. $Kend ) { $rK_weld_left->{$_} = $Kstart; } foreach my $Kx ( $Kstart .. $Kend - 1 ) { $rK_weld_right->{$Kx} = $Kend; $rweld_len_right_at_K->{$Kx} = $rLL->[$Kend]->[_CUMULATIVE_LENGTH_] - $rLL->[$Kx]->[_CUMULATIVE_LENGTH_]; } # Remember the leftmost index of welds which continue to the right if ( defined( $rK_weld_right->{$Kend} ) && !defined( $rK_weld_left->{$Kstart} ) ) { push @K_multi_weld, $Kstart; } } # Second pass to process chains of welds (these are rare). # This has to be processed in sorted order. if (@K_multi_weld) { my $Kend = -1; foreach my $Kstart ( sort { $a <=> $b } @K_multi_weld ) { # Skip any interior K which was originally missing a left link next if ( $Kstart <= $Kend ); # Find the end of this chain $Kend = $rK_weld_right->{$Kstart}; my $Knext = $rK_weld_right->{$Kend}; while ( defined($Knext) ) { $Kend = $Knext; $Knext = $rK_weld_right->{$Kend}; } # Set weld values for this chain foreach ( $Kstart + 1 .. $Kend ) { $rK_weld_left->{$_} = $Kstart; } foreach my $Kx ( $Kstart ..
$Kend - 1 ) { $rK_weld_right->{$Kx} = $Kend; $rweld_len_right_at_K->{$Kx} = $rLL->[$Kend]->[_CUMULATIVE_LENGTH_] - $rLL->[$Kx]->[_CUMULATIVE_LENGTH_]; } } } return; } ## end sub weld_containers sub cumulative_length_before_K { my ( $self, $KK ) = @_; my $rLL = $self->[_rLL_]; return ( $KK <= 0 ) ? 0 : $rLL->[ $KK - 1 ]->[_CUMULATIVE_LENGTH_]; } sub weld_cuddled_blocks { my ($self) = @_; # Called once per file to handle cuddled formatting my $rK_weld_left = $self->[_rK_weld_left_]; my $rK_weld_right = $self->[_rK_weld_right_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; # This routine implements the -cb flag by finding the appropriate # closing and opening block braces and welding them together. return unless ( %{$rcuddled_block_types} ); my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); my $rbreak_container = $self->[_rbreak_container_]; my $ris_broken_container = $self->[_ris_broken_container_]; my $ris_cuddled_closing_brace = $self->[_ris_cuddled_closing_brace_]; my $K_closing_container = $self->[_K_closing_container_]; # A stack to remember open chains at all levels: This is a hash rather than # an array for safety because negative levels can occur in files with # errors. This allows us to keep processing with negative levels. # $in_chain{$level} = [$chain_type, $type_sequence]; my %in_chain; my $CBO = $rOpts->{'cuddled-break-option'}; # loop over structure items to find cuddled pairs my $level = 0; my $KNEXT = $self->[_K_first_seq_item_]; while ( defined($KNEXT) ) { my $KK = $KNEXT; $KNEXT = $rLL->[$KNEXT]->[_KNEXT_SEQ_ITEM_]; my $rtoken_vars = $rLL->[$KK]; my $type_sequence = $rtoken_vars->[_TYPE_SEQUENCE_]; if ( !$type_sequence ) { next if ( $KK == 0 ); # first token in file may not be container # A fault here implies that an error was made in the little loop at # the bottom of sub 'respace_tokens' which set the values of # _KNEXT_SEQ_ITEM_. Or an error has been introduced in the # loop control lines above. 
Fault("sequence = $type_sequence not defined at K=$KK") if (DEVEL_MODE); next; } # NOTE: we must use the original levels here. They can get changed # by sub 'weld_nested_containers', so this routine must be called # before sub 'weld_nested_containers'. my $last_level = $level; $level = $rtoken_vars->[_LEVEL_]; if ( $level < $last_level ) { $in_chain{$last_level} = undef } elsif ( $level > $last_level ) { $in_chain{$level} = undef } # We are only looking at code blocks my $token = $rtoken_vars->[_TOKEN_]; my $type = $rtoken_vars->[_TYPE_]; next unless ( $type eq $token ); if ( $token eq '{' ) { my $block_type = $rblock_type_of_seqno->{$type_sequence}; if ( !$block_type ) { # patch for unrecognized block types which may not be labeled my $Kp = $self->K_previous_nonblank($KK); while ( $Kp && $rLL->[$Kp]->[_TYPE_] eq '#' ) { $Kp = $self->K_previous_nonblank($Kp); } next unless $Kp; $block_type = $rLL->[$Kp]->[_TOKEN_]; } if ( $in_chain{$level} ) { # we are in a chain and are at an opening block brace. # See if we are welding this opening brace with the previous # block brace. Get their identification numbers: my $closing_seqno = $in_chain{$level}->[1]; my $opening_seqno = $type_sequence; # The preceding block must be on multiple lines so that its # closing brace will start a new line. if ( !$ris_broken_container->{$closing_seqno} && !$rbreak_container->{$closing_seqno} ) { next unless ( $CBO == 2 ); $rbreak_container->{$closing_seqno} = 1; } # We can weld the closing brace to its following word .. my $Ko = $K_closing_container->{$closing_seqno}; my $Kon; if ( defined($Ko) ) { $Kon = $self->K_next_nonblank($Ko); } # ..unless it is a comment if ( defined($Kon) && $rLL->[$Kon]->[_TYPE_] ne '#' ) { # OK to weld these two tokens... $rK_weld_right->{$Ko} = $Kon; $rK_weld_left->{$Kon} = $Ko; # Set flag that we want to break the next container # so that the cuddled line is balanced. $rbreak_container->{$opening_seqno} = 1 if ($CBO); # Remember which braces are cuddled. 
# The closing brace is used to set adjusted indentations. # The opening brace is not yet used but might eventually # be needed in setting adjusted indentation. $ris_cuddled_closing_brace->{$closing_seqno} = 1; } } else { # We are not in a chain. Start a new chain if we see the # starting block type. if ( $rcuddled_block_types->{$block_type} ) { $in_chain{$level} = [ $block_type, $type_sequence ]; } else { $block_type = '*'; $in_chain{$level} = [ $block_type, $type_sequence ]; } } } elsif ( $token eq '}' ) { if ( $in_chain{$level} ) { # We are in a chain at a closing brace. See if this chain # continues.. my $Knn = $self->K_next_code($KK); next unless $Knn; my $chain_type = $in_chain{$level}->[0]; my $next_nonblank_token = $rLL->[$Knn]->[_TOKEN_]; if ( $rcuddled_block_types->{$chain_type}->{$next_nonblank_token} ) { # Note that we do not weld yet because we must wait until # we are sure that an opening brace for this follows. $in_chain{$level}->[1] = $type_sequence; } else { $in_chain{$level} = undef } } } } return; } ## end sub weld_cuddled_blocks sub find_nested_pairs { my $self = shift; # This routine is called once per file to do preliminary work needed for # the --weld-nested option. This information is also needed for adding # semicolons. my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); my $Num = @{$rLL}; my $K_opening_container = $self->[_K_opening_container_]; my $K_closing_container = $self->[_K_closing_container_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; # We define an array of pairs of nested containers my @nested_pairs; # Names of calling routines can either be marked as 'i' or 'w', # and they may invoke a sub call with an '->'. We will consider # any consecutive string of such types as a single unit when making # weld decisions. We also allow a leading ! my $is_name_type = { 'i' => 1, 'w' => 1, 'U' => 1, '->' => 1, '!'
=> 1, }; # Loop over all closing container tokens foreach my $inner_seqno ( keys %{$K_closing_container} ) { my $K_inner_closing = $K_closing_container->{$inner_seqno}; # See if it is immediately followed by another, outer closing token my $K_outer_closing = $K_inner_closing + 1; $K_outer_closing += 1 if ( $K_outer_closing < $Num && $rLL->[$K_outer_closing]->[_TYPE_] eq 'b' ); next unless ( $K_outer_closing < $Num ); my $outer_seqno = $rLL->[$K_outer_closing]->[_TYPE_SEQUENCE_]; next unless ($outer_seqno); my $token_outer_closing = $rLL->[$K_outer_closing]->[_TOKEN_]; next unless ( $is_closing_token{$token_outer_closing} ); # Simple filter: No commas or semicolons in the outer container my $rtype_count = $self->[_rtype_count_by_seqno_]->{$outer_seqno}; if ($rtype_count) { next if ( $rtype_count->{','} || $rtype_count->{';'} ); } # Now we have to check the opening tokens. my $K_outer_opening = $K_opening_container->{$outer_seqno}; my $K_inner_opening = $K_opening_container->{$inner_seqno}; next unless defined($K_outer_opening) && defined($K_inner_opening); my $inner_blocktype = $rblock_type_of_seqno->{$inner_seqno}; my $outer_blocktype = $rblock_type_of_seqno->{$outer_seqno}; # Verify that the inner opening token is the next container after the # outer opening token. my $K_io_check = $rLL->[$K_outer_opening]->[_KNEXT_SEQ_ITEM_]; next unless defined($K_io_check); if ( $K_io_check != $K_inner_opening ) { # The inner opening container does not immediately follow the outer # opening container, but we may still allow a weld if they are # separated by a sub signature. For example, we may have something # like this, where $K_io_check may be at the first 'x' instead of # 'io'. So we need to hop over the signature and see if we arrive # at 'io'. # oo io # | x x | # $obj->then( sub ( $code ) { # ... 
# return $c->render(text => '', status => $code); # } ); # | | # ic oc next if ( !$inner_blocktype || $inner_blocktype ne 'sub' ); next if $rLL->[$K_io_check]->[_TOKEN_] ne '('; my $seqno_signature = $rLL->[$K_io_check]->[_TYPE_SEQUENCE_]; next unless defined($seqno_signature); my $K_signature_closing = $K_closing_container->{$seqno_signature}; next unless defined($K_signature_closing); my $K_test = $rLL->[$K_signature_closing]->[_KNEXT_SEQ_ITEM_]; next unless ( defined($K_test) && $K_test == $K_inner_opening ); # OK, we have arrived at 'io' in the above diagram. We should put # a limit on the length or complexity of the signature here. There # is no perfect way to do this, one way is to put a limit on token # count. For consistency with older versions, we should allow a # signature with a single variable to weld, but not with # multiple variables. A single variable as in 'sub ($code) {' can # have a $Kdiff of 2 to 4, depending on spacing. # But two variables like 'sub ($v1,$v2) {' can have a diff of 4 to # 7, depending on spacing. So to keep formatting consistent with # previous versions, we will also avoid welding if there is a comma # in the signature. my $Kdiff = $K_signature_closing - $K_io_check; next if ( $Kdiff > 4 ); # backup comma count test; but we cannot get here with Kdiff<=4 my $rtc = $self->[_rtype_count_by_seqno_]->{$seqno_signature}; next if ( $rtc && $rtc->{','} ); } # Yes .. this is a possible nesting pair. # They can be separated by a small amount. my $K_diff = $K_inner_opening - $K_outer_opening; # Count the number of nonblank characters separating them. # Note: the $nonblank_count includes the inner opening container # but not the outer opening container, so it will be >= 1. 
if ( $K_diff < 0 ) { next } # Shouldn't happen my $nonblank_count = 0; my $type; my $is_name; # Here is an example of a long identifier chain which counts as a # single nonblank here (this spans about 10 K indexes): # if ( !Boucherot::SetOfConnections->new->handler->execute( # ^--K_o_o ^--K_i_o # @array) ) my $Kn_first = $K_outer_opening; my $Kn_last_nonblank; my $saw_comment; foreach my $Kn ( $K_outer_opening + 1 .. $K_inner_opening ) { next if ( $rLL->[$Kn]->[_TYPE_] eq 'b' ); if ( !$nonblank_count ) { $Kn_first = $Kn } if ( $Kn eq $K_inner_opening ) { $nonblank_count++; last; } $Kn_last_nonblank = $Kn; # skip chain of identifier tokens my $last_type = $type; my $last_is_name = $is_name; $type = $rLL->[$Kn]->[_TYPE_]; if ( $type eq '#' ) { $saw_comment = 1; last } $is_name = $is_name_type->{$type}; next if ( $is_name && $last_is_name ); # do not count a possible leading - of bareword hash key next if ( $type eq 'm' && !$last_type ); $nonblank_count++; last if ( $nonblank_count > 2 ); } # Do not weld across a comment .. fix for c058. next if ($saw_comment); # Patch for b1104: do not weld to a paren preceded by sort/map/grep # because the special line break rules may cause a blinking state if ( defined($Kn_last_nonblank) && $rLL->[$K_inner_opening]->[_TOKEN_] eq '(' && $rLL->[$Kn_last_nonblank]->[_TYPE_] eq 'k' ) { my $token = $rLL->[$Kn_last_nonblank]->[_TOKEN_]; # Turn off welding at sort/map/grep ( if ( $is_sort_map_grep{$token} ) { $nonblank_count = 10 } } my $token_oo = $rLL->[$K_outer_opening]->[_TOKEN_]; if ( # 1: adjacent opening containers, like: do {{ $nonblank_count == 1 # 2. anonymous sub + prototype or sig: )->then( sub ($code) { # ... but it seems best not to stack two structural blocks, like # this # sub make_anon_with_my_sub { sub { # because it probably hides the structure a little too much. || ( $inner_blocktype && $inner_blocktype eq 'sub' && $rLL->[$Kn_first]->[_TOKEN_] eq 'sub' && !$outer_blocktype ) # 3. 
short item following opening paren, like: fun( yyy ( || $nonblank_count == 2 && $token_oo eq '(' # 4. weld around fat commas, if requested (git #108), such as # elf->call_method( method_name_foo => { || ( $type eq '=>' && $nonblank_count <= 3 && %weld_fat_comma_rules && $weld_fat_comma_rules{$token_oo} ) ) { push @nested_pairs, [ $inner_seqno, $outer_seqno, $K_inner_closing ]; } next; } # The weld routine expects the pairs in order in the form # [$seqno_inner, $seqno_outer] # And they must be in the same order as the inner closing tokens # (otherwise, welds of three or more adjacent tokens will not work). The K # value of this inner closing token has temporarily been stored for # sorting. @nested_pairs = # Drop the K index after sorting (it would cause trouble downstream) map { [ $_->[0], $_->[1] ] } # Sort on the K values sort { $a->[2] <=> $b->[2] } @nested_pairs; return \@nested_pairs; } ## end sub find_nested_pairs sub match_paren_control_flag { # Decide if this paren is excluded by user request: # undef matches no parens # '*' matches all parens # 'k' matches only if the previous nonblank token is a perl builtin # keyword (such as 'if', 'while'), # 'K' matches if 'k' does not, meaning if the previous token is not a # keyword. # 'f' matches if the previous token is a function other than a keyword. # 'F' matches if 'f' does not. # 'w' matches if either 'k' or 'f' match. # 'W' matches if 'w' does not. 
my ( $self, $seqno, $flag, $rLL ) = @_; # Input parameters: # $seqno = sequence number of the container (should be paren) # $flag = the flag which defines what matches # $rLL = an optional alternate token list needed for respace operations $rLL = $self->[_rLL_] unless ( defined($rLL) ); return 0 unless ( defined($flag) ); return 0 if $flag eq '0'; return 1 if $flag eq '1'; return 1 if $flag eq '*'; return 0 unless ($seqno); my $K_opening = $self->[_K_opening_container_]->{$seqno}; return unless ( defined($K_opening) ); my ( $is_f, $is_k, $is_w ); my $Kp = $self->K_previous_nonblank( $K_opening, $rLL ); if ( defined($Kp) ) { my $type_p = $rLL->[$Kp]->[_TYPE_]; # keyword? $is_k = $type_p eq 'k'; # function call? $is_f = $self->[_ris_function_call_paren_]->{$seqno}; # either keyword or function call? $is_w = $is_k || $is_f; } my $match; if ( $flag eq 'k' ) { $match = $is_k } elsif ( $flag eq 'K' ) { $match = !$is_k } elsif ( $flag eq 'f' ) { $match = $is_f } elsif ( $flag eq 'F' ) { $match = !$is_f } elsif ( $flag eq 'w' ) { $match = $is_w } elsif ( $flag eq 'W' ) { $match = !$is_w } return $match; } ## end sub match_paren_control_flag sub is_excluded_weld { # decide if this weld is excluded by user request my ( $self, $KK, $is_leading ) = @_; my $rLL = $self->[_rLL_]; my $rtoken_vars = $rLL->[$KK]; my $token = $rtoken_vars->[_TOKEN_]; my $rflags = $weld_nested_exclusion_rules{$token}; return 0 unless ( defined($rflags) ); my $flag = $is_leading ? 
$rflags->[0] : $rflags->[1]; return 0 unless ( defined($flag) ); return 1 if $flag eq '*'; my $seqno = $rtoken_vars->[_TYPE_SEQUENCE_]; return $self->match_paren_control_flag( $seqno, $flag ); } ## end sub is_excluded_weld # hashes to simplify welding logic my %type_ok_after_bareword; my %has_tight_paren; BEGIN { # types needed for welding RULE 6 my @q = qw# => -> { ( [ #; @type_ok_after_bareword{@q} = (1) x scalar(@q); # these types do not 'like' to be separated from a following paren @q = qw(w i q Q G C Z U); @{has_tight_paren}{@q} = (1) x scalar(@q); } ## end BEGIN use constant DEBUG_WELD => 0; sub setup_new_weld_measurements { # Define quantities to check for excess line lengths when welded. # Called by sub 'weld_nested_containers' and sub 'weld_nested_quotes' my ( $self, $Kouter_opening, $Kinner_opening ) = @_; # Given indexes of outer and inner opening containers to be welded: # $Kouter_opening, $Kinner_opening # Returns these variables: # $new_weld_ok = true (new weld ok) or false (do not start new weld) # $maximum_text_length = maximum text length # $starting_lentot = starting cumulative length # $msg = diagnostic message for debugging my $rLL = $self->[_rLL_]; my $rlines = $self->[_rlines_]; my $starting_level; my $starting_ci; my $starting_lentot; my $maximum_text_length; my $msg = EMPTY_STRING; my $iline_oo = $rLL->[$Kouter_opening]->[_LINE_INDEX_]; my $rK_range = $rlines->[$iline_oo]->{_rK_range}; my ( $Kfirst, $Klast ) = @{$rK_range}; #------------------------------------------------------------------------- # We now define a reference index, '$Kref', from which to start measuring. # This choice turns out to be critical for keeping welds stable during # iterations, so we go through a number of STEPS... #------------------------------------------------------------------------- # STEP 1: Our starting guess is to measure from the first token of the # current line. This is usually a good guess.
my $Kref = $Kfirst; # STEP 2: See if we should go back a little farther my $Kprev = $self->K_previous_nonblank($Kfirst); if ( defined($Kprev) ) { # Avoid measuring from between an opening paren and a previous token # which should stay close to it ... fixes b1185 my $token_oo = $rLL->[$Kouter_opening]->[_TOKEN_]; my $type_prev = $rLL->[$Kprev]->[_TYPE_]; if ( $Kouter_opening == $Kfirst && $token_oo eq '(' && $has_tight_paren{$type_prev} ) { $Kref = $Kprev; } # Back up and count length from a token like '=' or '=>' if -lp # is used (this fixes b520) # ...or if a break is wanted before there elsif ($rOpts_line_up_parentheses || $want_break_before{$type_prev} ) { # If there are other sequence items between the start of this line # and the opening token in question, then do not include tokens on # the previous line in length calculations. This check added to # fix case b1174 which had a '?' on the line my $no_previous_seq_item = $Kref == $Kouter_opening || $rLL->[$Kref]->[_KNEXT_SEQ_ITEM_] == $Kouter_opening; if ( $no_previous_seq_item && substr( $type_prev, 0, 1 ) eq '=' ) { $Kref = $Kprev; # Fix for b1144 and b1112: backup to the first nonblank # character before the =>, or to the start of its line. if ( $type_prev eq '=>' ) { my $iline_prev = $rLL->[$Kprev]->[_LINE_INDEX_]; my $rK_range_prev = $rlines->[$iline_prev]->{_rK_range}; my ( $Kfirst_prev, $Klast_prev ) = @{$rK_range_prev}; foreach my $KK ( reverse( $Kfirst_prev .. $Kref - 1 ) ) { next if ( $rLL->[$KK]->[_TYPE_] eq 'b' ); $Kref = $KK; last; } } } } } # STEP 3: Now look ahead for a ternary and, if found, use it. # This fixes case b1182. # Also look for a ')' at the same level and, if found, use it. # This fixes case b1224. 
if ( $Kref < $Kouter_opening ) { my $Knext = $rLL->[$Kref]->[_KNEXT_SEQ_ITEM_]; my $level_oo = $rLL->[$Kouter_opening]->[_LEVEL_]; while ( $Knext < $Kouter_opening ) { if ( $rLL->[$Knext]->[_LEVEL_] == $level_oo ) { if ( $is_ternary{ $rLL->[$Knext]->[_TYPE_] } || $rLL->[$Knext]->[_TOKEN_] eq ')' ) { $Kref = $Knext; last; } } $Knext = $rLL->[$Knext]->[_KNEXT_SEQ_ITEM_]; } } # Define the starting measurements we will need $starting_lentot = $Kref <= 0 ? 0 : $rLL->[ $Kref - 1 ]->[_CUMULATIVE_LENGTH_]; $starting_level = $rLL->[$Kref]->[_LEVEL_]; $starting_ci = $rLL->[$Kref]->[_CI_LEVEL_]; $maximum_text_length = $maximum_text_length_at_level[$starting_level] - $starting_ci * $rOpts_continuation_indentation; # STEP 4: Switch to using the outer opening token as the reference # point if a line break before it would make a longer line. # Fixes case b1055 and is also an alternate fix for b1065. my $starting_level_oo = $rLL->[$Kouter_opening]->[_LEVEL_]; if ( $Kref < $Kouter_opening ) { my $starting_ci_oo = $rLL->[$Kouter_opening]->[_CI_LEVEL_]; my $lentot_oo = $rLL->[ $Kouter_opening - 1 ]->[_CUMULATIVE_LENGTH_]; my $maximum_text_length_oo = $maximum_text_length_at_level[$starting_level_oo] - $starting_ci_oo * $rOpts_continuation_indentation; # The excess length to any cumulative length K = lenK is either # $excess = $lenk - ($lentot + $maximum_text_length), or # $excess = $lenk - ($lentot_oo + $maximum_text_length_oo), # so the worst case (maximum excess) corresponds to the configuration # with minimum value of the sum: $lentot + $maximum_text_length if ( $lentot_oo + $maximum_text_length_oo < $starting_lentot + $maximum_text_length ) { $Kref = $Kouter_opening; $starting_level = $starting_level_oo; $starting_ci = $starting_ci_oo; $starting_lentot = $lentot_oo; $maximum_text_length = $maximum_text_length_oo; } } my $new_weld_ok = 1; # STEP 5, fix b1020: Avoid problem areas with the -wn -lp combination. 
The # combination -wn -lp -dws -naws does not work well and can cause blinkers. # It will probably only occur in stress testing. For this situation we # will only start a new weld if we start at a 'good' location. # - Added 'if' to fix case b1032. # - Require blank before certain previous characters to fix b1111. # - Add ';' to fix case b1139 # - Convert from '$ok_to_weld' to '$new_weld_ok' to fix b1162. # - relaxed constraints for b1227 # - added skip if type is 'q' for b1349 and b1350 b1351 b1352 b1353 # - added skip if type is 'Q' for b1447 if ( $starting_ci && $rOpts_line_up_parentheses && $rOpts_delete_old_whitespace && !$rOpts_add_whitespace && $rLL->[$Kinner_opening]->[_TYPE_] ne 'q' && $rLL->[$Kinner_opening]->[_TYPE_] ne 'Q' && defined($Kprev) ) { my $type_first = $rLL->[$Kfirst]->[_TYPE_]; my $token_first = $rLL->[$Kfirst]->[_TOKEN_]; my $type_prev = $rLL->[$Kprev]->[_TYPE_]; my $type_pp = 'b'; if ( $Kprev >= 0 ) { $type_pp = $rLL->[ $Kprev - 1 ]->[_TYPE_] } unless ( $type_prev =~ /^[\,\.\;]/ || $type_prev =~ /^[=\{\[\(\L]/ && ( $type_pp eq 'b' || $type_pp eq '}' || $type_first eq 'k' ) || $type_first =~ /^[=\,\.\;\{\[\(\L]/ || $type_first eq '||' || ( $type_first eq 'k' && ( $token_first eq 'if' || $token_first eq 'or' ) ) ) { $msg = "Skipping weld: poor break with -lp and ci at type_first='$type_first' type_prev='$type_prev' type_pp=$type_pp\n"; $new_weld_ok = 0; } } return ( $new_weld_ok, $maximum_text_length, $starting_lentot, $msg ); } ## end sub setup_new_weld_measurements sub excess_line_length_for_Krange { my ( $self, $Kfirst, $Klast ) = @_; # returns $excess_length = # by how many characters a line composed of tokens $Kfirst .. $Klast will # exceed the allowed line length my $rLL = $self->[_rLL_]; my $length_before_Kfirst = $Kfirst <= 0 ? 
0 : $rLL->[ $Kfirst - 1 ]->[_CUMULATIVE_LENGTH_]; # backup before a side comment if necessary my $Kend = $Klast; if ( $rOpts_ignore_side_comment_lengths && $rLL->[$Klast]->[_TYPE_] eq '#' ) { my $Kprev = $self->K_previous_nonblank($Klast); if ( defined($Kprev) && $Kprev >= $Kfirst ) { $Kend = $Kprev } } # get the length of the text my $length = $rLL->[$Kend]->[_CUMULATIVE_LENGTH_] - $length_before_Kfirst; # get the size of the text window my $level = $rLL->[$Kfirst]->[_LEVEL_]; my $ci_level = $rLL->[$Kfirst]->[_CI_LEVEL_]; my $max_text_length = $maximum_text_length_at_level[$level] - $ci_level * $rOpts_continuation_indentation; my $excess_length = $length - $max_text_length; DEBUG_WELD && print "Kfirst=$Kfirst, Klast=$Klast, Kend=$Kend, level=$level, ci=$ci_level, max_text_length=$max_text_length, length=$length\n"; return ($excess_length); } ## end sub excess_line_length_for_Krange sub weld_nested_containers { my ($self) = @_; # Called once per file for option '--weld-nested-containers' my $rK_weld_left = $self->[_rK_weld_left_]; my $rK_weld_right = $self->[_rK_weld_right_]; # This routine implements the -wn flag by "welding together" # the nested closing and opening tokens which were previously # identified by sub 'find_nested_pairs'. "welding" simply # involves setting certain hash values which will be checked # later during formatting. my $rLL = $self->[_rLL_]; my $rlines = $self->[_rlines_]; my $K_opening_container = $self->[_K_opening_container_]; my $K_closing_container = $self->[_K_closing_container_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my $ris_asub_block = $self->[_ris_asub_block_]; my $rmax_vertical_tightness = $self->[_rmax_vertical_tightness_]; my $rOpts_asbl = $rOpts->{'opening-anonymous-sub-brace-on-new-line'}; # Find nested pairs of container tokens for any welding. 
my $rnested_pairs = $self->find_nested_pairs(); # Return unless there are nested pairs to weld return unless defined($rnested_pairs) && @{$rnested_pairs}; # NOTE: It would be nice to apply RULE 5 right here by deleting unwanted # pairs. But it isn't clear if this is possible because we don't know # which sequences might actually start a weld. my $rOpts_break_at_old_method_breakpoints = $rOpts->{'break-at-old-method-breakpoints'}; # This array will hold the sequence numbers of the tokens to be welded. my @welds; # Variables needed for estimating line lengths my $maximum_text_length; # maximum spaces available for text my $starting_lentot; # cumulative text to start of current line my $iline_outer_opening = -1; my $weld_count_this_start = 0; # OLD: $single_line_tol added to fix cases b1180 b1181 # = $rOpts_continuation_indentation > $rOpts_indent_columns ? 1 : 0; # NEW: $single_line_tol=0; fixes b1212 and b1180-1181 work now my $single_line_tol = 0; my $multiline_tol = $single_line_tol + 1 + max( $rOpts_indent_columns, $rOpts_continuation_indentation ); # Define a welding cutoff level: do not start a weld if the inside # container level equals or exceeds this level. # We use the minimum of two criteria, either of which may be more # restrictive. The 'alpha' value is more restrictive in (b1206, b1252) and # the 'beta' value is more restrictive in other cases (b1243). # Reduced beta term from beta+3 to beta+2 to fix b1401. Previously: # my $weld_cutoff_level = min($stress_level_alpha, $stress_level_beta + 2); # This is now '$high_stress_level'. # The vertical tightness flags can throw off line length calculations. # This patch was added to fix instability issue b1284. # It works to always use a tol of 1 for 1 line block length tests, but # this restricted value keeps test case wn6.wn working as before. # It may be necessary to include '[' and '{' here in the future. my $one_line_tol = $opening_vertical_tightness{'('} ? 1 : 0; # Abbreviations: # _oo=outer opening, i.e. 
first of { { # _io=inner opening, i.e. second of { { # _oc=outer closing, i.e. second of } } # _ic=inner closing, i.e. first of } } my $previous_pair; # Main loop over nested pairs... # We are working from outermost to innermost pairs so that # level changes will be complete when we arrive at the inner pairs. while ( my $item = pop( @{$rnested_pairs} ) ) { my ( $inner_seqno, $outer_seqno ) = @{$item}; my $Kouter_opening = $K_opening_container->{$outer_seqno}; my $Kinner_opening = $K_opening_container->{$inner_seqno}; my $Kouter_closing = $K_closing_container->{$outer_seqno}; my $Kinner_closing = $K_closing_container->{$inner_seqno}; # RULE: do not weld if inner container has <= 3 tokens unless the next # token is a heredoc (so we know there will be multiple lines) if ( $Kinner_closing - $Kinner_opening <= 4 ) { my $Knext_nonblank = $self->K_next_nonblank($Kinner_opening); next unless defined($Knext_nonblank); my $type = $rLL->[$Knext_nonblank]->[_TYPE_]; next unless ( $type eq 'h' ); } my $outer_opening = $rLL->[$Kouter_opening]; my $inner_opening = $rLL->[$Kinner_opening]; my $outer_closing = $rLL->[$Kouter_closing]; my $inner_closing = $rLL->[$Kinner_closing]; # RULE: do not weld to a hash brace. The reason is that it has a very # strong bond strength to the next token, so a line break after it # may not work. Previously we allowed welding to something like @{ # but that caused blinking states (cases b751, b779). if ( $inner_opening->[_TYPE_] eq 'L' ) { next; } # RULE: do not weld to a square bracket which does not contain commas if ( $inner_opening->[_TYPE_] eq '[' ) { my $rtype_count = $self->[_rtype_count_by_seqno_]->{$inner_seqno}; next unless ( $rtype_count && $rtype_count->{','} ); # Do not weld if there is text before a '[' such as here: # curr_opt ( @beg [2,5] ) # It will not break into the desired sandwich structure. # This fixes case b109, 110.
my $Kdiff = $Kinner_opening - $Kouter_opening; next if ( $Kdiff > 2 ); next if ( $Kdiff == 2 && $rLL->[ $Kouter_opening + 1 ]->[_TYPE_] ne 'b' ); } # RULE: Avoid welding under stress. The idea is that we need to have a # little space* within a welded container to avoid instability. Note # that after each weld the level values are reduced, so long multiple # welds can still be made. This rule will seldom be a limiting factor # in actual working code. Fixes b1206, b1243. my $inner_level = $inner_opening->[_LEVEL_]; if ( $inner_level >= $high_stress_level ) { next } # Set flag saying if this pair starts a new weld my $starting_new_weld = !( @welds && $outer_seqno == $welds[-1]->[0] ); # Set flag saying if this pair is adjacent to the previous nesting pair # (even if previous pair was rejected as a weld) my $touch_previous_pair = defined($previous_pair) && $outer_seqno == $previous_pair->[0]; $previous_pair = $item; my $do_not_weld_rule = 0; my $Msg = EMPTY_STRING; my $is_one_line_weld; my $iline_oo = $outer_opening->[_LINE_INDEX_]; my $iline_io = $inner_opening->[_LINE_INDEX_]; my $iline_ic = $inner_closing->[_LINE_INDEX_]; my $iline_oc = $outer_closing->[_LINE_INDEX_]; my $token_oo = $outer_opening->[_TOKEN_]; my $token_io = $inner_opening->[_TOKEN_]; # DO-NOT-WELD RULE 7: Do not weld if this conflicts with -bom # Added for case b973. Moved here from below to fix b1423. if ( !$do_not_weld_rule && $rOpts_break_at_old_method_breakpoints && $iline_io > $iline_oo ) { foreach my $iline ( $iline_oo + 1 .. $iline_io ) { my $rK_range = $rlines->[$iline]->{_rK_range}; next unless defined($rK_range); my ( $Kfirst, $Klast ) = @{$rK_range}; next unless defined($Kfirst); if ( $rLL->[$Kfirst]->[_TYPE_] eq '->' ) { $do_not_weld_rule = 7; last; } } } next if ($do_not_weld_rule); # Turn off vertical tightness at possible one-line welds. Fixes b1402, # b1419, b1421, b1424, b1425. This also fixes issues b1338, b1339, # b1340, b1341, b1342, b1343, which previously used a separate fix. 
# Issue c161 is the latest and simplest check, using # $iline_ic==$iline_io as the test. if ( %opening_vertical_tightness && $iline_ic == $iline_io && $opening_vertical_tightness{$token_oo} ) { $rmax_vertical_tightness->{$outer_seqno} = 0; } my $is_multiline_weld = $iline_oo == $iline_io && $iline_ic == $iline_oc && $iline_io != $iline_ic; if (DEBUG_WELD) { my $len_oo = $rLL->[$Kouter_opening]->[_CUMULATIVE_LENGTH_]; my $len_io = $rLL->[$Kinner_opening]->[_CUMULATIVE_LENGTH_]; $Msg .= < $iline_outer_opening ) ) { # Remember the line we are using as a reference $iline_outer_opening = $iline_oo; $weld_count_this_start = 0; ( my $new_weld_ok, $maximum_text_length, $starting_lentot, my $msg ) = $self->setup_new_weld_measurements( $Kouter_opening, $Kinner_opening ); if ( !$new_weld_ok && ( $iline_oo != $iline_io || $iline_ic != $iline_oc ) ) { if (DEBUG_WELD) { print $msg} next; } my $rK_range = $rlines->[$iline_oo]->{_rK_range}; my ( $Kfirst, $Klast ) = @{$rK_range}; # An existing one-line weld is a line in which # (1) the containers are all on one line, and # (2) the line does not exceed the allowable length if ( $iline_oo == $iline_oc ) { # All the tokens are on one line, now check their length. # Start with the full line index range. We will reduce this # in the coding below in some cases. my $Kstart = $Kfirst; my $Kstop = $Klast; # Note that the following minimal choice for measuring will # work and will not cause any instabilities because it is # invariant: ## my $Kstart = $Kouter_opening; ## my $Kstop = $Kouter_closing; # But that can lead to some undesirable welds. So a little # more complicated method has been developed. 
# We are trying to avoid creating bad two-line welds when we are # working on long, previously un-welded input text, such as # INPUT (example of a long input line weld candidate): ## $mutation->transpos( $self->RNA->position($mutation->label, $atg_label)); # GOOD two-line break: (not welded; result marked too long): ## $mutation->transpos( ## $self->RNA->position($mutation->label, $atg_label)); # BAD two-line break: (welded; result if we weld): ## $mutation->transpos($self->RNA->position( ## $mutation->label, $atg_label)); # We can only get an approximate estimate of the final length, # since the line breaks may change, and for -lp mode because # even the indentation is not yet known. my $level_first = $rLL->[$Kfirst]->[_LEVEL_]; my $level_last = $rLL->[$Klast]->[_LEVEL_]; my $level_oo = $rLL->[$Kouter_opening]->[_LEVEL_]; my $level_oc = $rLL->[$Kouter_closing]->[_LEVEL_]; # - measure to the end of the original line if balanced # - measure to the closing container if unbalanced (fixes b1230) #if ( $level_first != $level_last ) { $Kstop = $Kouter_closing } if ( $level_oc > $level_last ) { $Kstop = $Kouter_closing } # - measure from the start of the original line if balanced # - measure from the most previous token with same level # if unbalanced (b1232) if ( $Kouter_opening > $Kfirst && $level_oo > $level_first ) { $Kstart = $Kouter_opening; foreach my $KK ( reverse( $Kfirst + 1 .. $Kouter_opening - 1 ) ) { next if ( $rLL->[$KK]->[_TYPE_] eq 'b' ); last if ( $rLL->[$KK]->[_LEVEL_] < $level_oo ); $Kstart = $KK; } } my $excess = $self->excess_line_length_for_Krange( $Kstart, $Kstop ); # Coding simplified here for case b1219. # Increased tol from 0 to 1 when pvt>0 to fix b1284. 
                $is_one_line_weld = $excess <= $one_line_tol;
            }

            # DO-NOT-WELD RULE 1:
            # Do not weld something that looks like the start of a two-line
            # function call, like this:
            #    $trans->add_transformation(
            #        PDL::Graphics::TriD::Scale->new( $sx, $sy, $sz ) );
            # We will look for a semicolon after the closing paren.

            # We want to weld something complex, like this though
            #  my $compass = uc( opposite_direction( line_to_canvas_direction(
            #      @{ $coords[0] }, @{ $coords[1] } ) ) );
            # Otherwise we will get a 'blinker'. For example, the following
            # would become a blinker without this rule:
            #        $Self->_Add( $SortOrderDisplay{ $Field
            #              ->GenerateFieldForSelectSQL() } );
            # But it is okay to weld a two-line statement if it looks like
            # it was already welded, meaning that the two opening containers are
            # on a different line than the two closing containers.  This is
            # necessary to prevent blinking of something like this with
            # perltidy -wn -pbp (starting indentation two levels deep):

            # $top_label->set_text( gettext(
            #    "Unable to create personal directory - check permissions.") );
            if (   $iline_oc == $iline_oo + 1
                && $iline_io == $iline_ic
                && $token_oo eq '(' )
            {

                # Look for following semicolon...
                my $Knext_nonblank = $self->K_next_nonblank($Kouter_closing);
                my $next_nonblank_type =
                  defined($Knext_nonblank)
                  ? $rLL->[$Knext_nonblank]->[_TYPE_]
                  : 'b';
                if ( $next_nonblank_type eq ';' ) {

                    # Then do not weld if no other containers between inner
                    # opening and closing.
my $Knext_seq_item = $inner_opening->[_KNEXT_SEQ_ITEM_]; if ( $Knext_seq_item == $Kinner_closing ) { $do_not_weld_rule = 1; } } } } ## end starting new weld sequence else { # set the 1-line flag if continuing a weld sequence; fixes b1239 $is_one_line_weld = ( $iline_oo == $iline_oc ); } # DO-NOT-WELD RULE 2: # Do not weld an opening paren to an inner one line brace block # We will just use old line numbers for this test and require # iterations if necessary for convergence # For example, otherwise we could cause the opening paren # in the following example to separate from the caller name # as here: # $_[0]->code_handler # ( sub { $more .= $_[1] . ":" . $_[0] . "\n" } ); # Here is another example where we do not want to weld: # $wrapped->add_around_modifier( # sub { push @tracelog => 'around 1'; $_[0]->(); } ); # If the one line sub block gets broken due to length or by the # user, then we can weld. The result will then be: # $wrapped->add_around_modifier( sub { # push @tracelog => 'around 1'; # $_[0]->(); # } ); # Updated to fix cases b1082 b1102 b1106 b1115: # Also, do not weld to an intact inner block if the outer opening token # is on a different line. For example, this prevents oscillation # between these two states in case b1106: # return map{ # ($_,[$self->$_(@_[1..$#_])]) # }@every; # return map { ( # $_, [ $self->$_( @_[ 1 .. $#_ ] ) ] # ) } @every; # The effect of this change on typical code is very minimal. Sometimes # it may take a second iteration to converge, but this gives protection # against blinking. if ( !$do_not_weld_rule && !$is_one_line_weld && $iline_ic == $iline_io ) { $do_not_weld_rule = 2 if ( $token_oo eq '(' || $iline_oo != $iline_io ); } # DO-NOT-WELD RULE 2A: # Do not weld an opening asub brace in -lp mode if -asbl is set. This # helps avoid instabilities in one-line block formation, and fixes # b1241. Previously, the '$is_one_line_weld' flag was tested here # instead of -asbl, and this fixed most cases. 
        # But it turns out that
        # the real problem was the -asbl flag, and switching to this was
        # necessary to fix b1268.  This also fixes b1269, b1277, b1278.
        if (  !$do_not_weld_rule
            && $rOpts_line_up_parentheses
            && $rOpts_asbl
            && $ris_asub_block->{$outer_seqno} )
        {
            $do_not_weld_rule = '2A';
        }

        # DO-NOT-WELD RULE 3:
        # Do not weld if this makes our line too long.
        # Use a tolerance which depends on if the old tokens were welded
        # (fixes cases b746 b748 b749 b750 b752 b753 b754 b755 b756 b758 b759)
        if ( !$do_not_weld_rule ) {

            # Measure to a little beyond the inner opening token if it is
            # followed by a bare word, which may have unusual line break rules.

            # NOTE: Originally this was OLD RULE 6: do not weld to a container
            # which is followed on the same line by an unknown bareword token.
            # This can cause blinkers (cases b626, b611).  But OK to weld one
            # line welds to fix cases b1057 b1064.  For generality, OLD RULE 6
            # has been merged into RULE 3 here to also fix cases b1078 b1091.

            my $K_for_length = $Kinner_opening;
            my $Knext_io     = $self->K_next_nonblank($Kinner_opening);
            next unless ( defined($Knext_io) );    # shouldn't happen
            my $type_io_next = $rLL->[$Knext_io]->[_TYPE_];

            # Note: may need to eventually also include other types here,
            # such as 'Z' and 'Y':   if ($type_io_next =~ /^[ZYw]$/) {
            if ( $type_io_next eq 'w' ) {
                my $Knext_io2 = $self->K_next_nonblank($Knext_io);
                next unless ( defined($Knext_io2) );
                my $type_io_next2 = $rLL->[$Knext_io2]->[_TYPE_];
                if ( !$type_ok_after_bareword{$type_io_next2} ) {
                    $K_for_length = $Knext_io2;
                }
            }

            # Use a tolerance for welds over multiple lines to avoid blinkers.
            # We can use zero tolerance if it looks like we are working on an
            # existing weld.
            my $tol =
                $is_one_line_weld || $is_multiline_weld
              ? $single_line_tol
              : $multiline_tol;

            # By how many characters does this exceed the text window?
my $excess = $self->cumulative_length_before_K($K_for_length) - $starting_lentot + 1 + $tol - $maximum_text_length; # Old patch: Use '>=0' instead of '> 0' here to fix cases b995 b998 # b1000 b1001 b1007 b1008 b1009 b1010 b1011 b1012 b1016 b1017 b1018 # Revised patch: New tolerance definition allows going back to '> 0' # here. This fixes case b1124. See also cases b1087 and b1087a. if ( $excess > 0 ) { $do_not_weld_rule = 3 } if (DEBUG_WELD) { $Msg .= "RULE 3 test: excess length to K=$Kinner_opening is $excess > 0 with tol= $tol ?) \n"; } } # DO-NOT-WELD RULE 4; implemented for git#10: # Do not weld an opening -ce brace if the next container is on a single # line, different from the opening brace. (This is very rare). For # example, given the following with -ce, we will avoid joining the { # and [ # } else { # [ $_, length($_) ] # } # because this would produce a terminal one-line block: # } else { [ $_, length($_) ] } # which may not be what is desired. But given this input: # } else { [ $_, length($_) ] } # then we will do the weld and retain the one-line block if ( !$do_not_weld_rule && $rOpts->{'cuddled-else'} ) { my $block_type = $rblock_type_of_seqno->{$outer_seqno}; if ( $block_type && $rcuddled_block_types->{'*'}->{$block_type} ) { my $io_line = $inner_opening->[_LINE_INDEX_]; my $ic_line = $inner_closing->[_LINE_INDEX_]; my $oo_line = $outer_opening->[_LINE_INDEX_]; if ( $oo_line < $io_line && $ic_line == $io_line ) { $do_not_weld_rule = 4; } } } # DO-NOT-WELD RULE 5: do not include welds excluded by user if ( !$do_not_weld_rule && %weld_nested_exclusion_rules && ( $self->is_excluded_weld( $Kouter_opening, $starting_new_weld ) || $self->is_excluded_weld( $Kinner_opening, 0 ) ) ) { $do_not_weld_rule = 5; } # DO-NOT-WELD RULE 6: This has been merged into RULE 3 above. if ($do_not_weld_rule) { # After neglecting a pair, we start measuring from start of point # io ... 
but not if previous type does not like to be separated # from its container (fixes case b1184) my $Kprev = $self->K_previous_nonblank($Kinner_opening); my $type_prev = defined($Kprev) ? $rLL->[$Kprev]->[_TYPE_] : 'w'; if ( !$has_tight_paren{$type_prev} ) { my $starting_level = $inner_opening->[_LEVEL_]; my $starting_ci_level = $inner_opening->[_CI_LEVEL_]; $starting_lentot = $self->cumulative_length_before_K($Kinner_opening); $maximum_text_length = $maximum_text_length_at_level[$starting_level] - $starting_ci_level * $rOpts_continuation_indentation; } if (DEBUG_WELD) { $Msg .= "Not welding due to RULE $do_not_weld_rule\n"; print $Msg; } # Normally, a broken pair should not decrease indentation of # intermediate tokens: ## if ( $last_pair_broken ) { next } # However, for long strings of welded tokens, such as '{{{{{{...' # we will allow broken pairs to also remove indentation. # This will keep very long strings of opening and closing # braces from marching off to the right. We will do this if the # number of tokens in a weld before the broken weld is 4 or more. # This rule will mainly be needed for test scripts, since typical # welds have fewer than about 4 welded tokens. if ( !@welds || @{ $welds[-1] } < 4 ) { next } } # otherwise start new weld ... elsif ($starting_new_weld) { $weld_count_this_start++; if (DEBUG_WELD) { $Msg .= "Starting new weld\n"; print $Msg; } push @welds, $item; $rK_weld_right->{$Kouter_opening} = $Kinner_opening; $rK_weld_left->{$Kinner_opening} = $Kouter_opening; $rK_weld_right->{$Kinner_closing} = $Kouter_closing; $rK_weld_left->{$Kouter_closing} = $Kinner_closing; } # ... 
        # or extend current weld
        else {
            $weld_count_this_start++;
            if (DEBUG_WELD) {
                $Msg .= "Extending current weld\n";
                print $Msg;
            }
            unshift @{ $welds[-1] }, $inner_seqno;
            $rK_weld_right->{$Kouter_opening} = $Kinner_opening;
            $rK_weld_left->{$Kinner_opening}  = $Kouter_opening;

            $rK_weld_right->{$Kinner_closing} = $Kouter_closing;
            $rK_weld_left->{$Kouter_closing}  = $Kinner_closing;

            # Keep a broken container broken at multiple welds.  This might
            # also be useful for simple welds, but for now it is restricted
            # to multiple welds to minimize changes to existing coding.  This
            # fixes b1429, b1430.  Updated for issue c198: but allow a
            # line difference of 1 (simple shear) so that a simple shear
            # can remain or become a single line.
            if ( $iline_ic - $iline_io > 1 ) {

                # Only set this break if it is the last possible weld in this
                # chain.  This will keep some extreme test cases unchanged.
                my $is_chain_end = !@{$rnested_pairs}
                  || $rnested_pairs->[-1]->[1] != $inner_seqno;
                if ($is_chain_end) {
                    $self->[_rbreak_container_]->{$inner_seqno} = 1;
                }
            }
        }

        # After welding, reduce the indentation level of all intermediate tokens
        my $dlevel = $outer_opening->[_LEVEL_] - $inner_opening->[_LEVEL_];
        if ( $dlevel != 0 ) {
            my $Kstart = $Kinner_opening;
            my $Kstop  = $Kinner_closing;
            foreach my $KK ( $Kstart .. $Kstop ) {
                $rLL->[$KK]->[_LEVEL_] += $dlevel;
            }

            # Copy opening ci level to help break at = for -lp mode (case b1124)
            $rLL->[$Kinner_opening]->[_CI_LEVEL_] =
              $rLL->[$Kouter_opening]->[_CI_LEVEL_];

            # But do not copy the closing ci level ... it can give poor results
            ## $rLL->[$Kinner_closing]->[_CI_LEVEL_] =
            ##  $rLL->[$Kouter_closing]->[_CI_LEVEL_];
        }
    }

    return;
} ## end sub weld_nested_containers

sub weld_nested_quotes {

    # Called once per file for option '--weld-nested-containers'. This
    # does welding on qw quotes.
my $self = shift; # See if quotes are excluded from welding my $rflags = $weld_nested_exclusion_rules{'q'}; return if ( defined($rflags) && defined( $rflags->[1] ) ); my $rK_weld_left = $self->[_rK_weld_left_]; my $rK_weld_right = $self->[_rK_weld_right_]; my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); my $Num = @{$rLL}; my $K_opening_container = $self->[_K_opening_container_]; my $K_closing_container = $self->[_K_closing_container_]; my $rlines = $self->[_rlines_]; my $starting_lentot; my $maximum_text_length; my $is_single_quote = sub { my ( $Kbeg, $Kend, $quote_type ) = @_; foreach my $K ( $Kbeg .. $Kend ) { my $test_type = $rLL->[$K]->[_TYPE_]; next if ( $test_type eq 'b' ); return if ( $test_type ne $quote_type ); } return 1; }; # Length tolerance - same as previously used for sub weld_nested my $multiline_tol = 1 + max( $rOpts_indent_columns, $rOpts_continuation_indentation ); # look for single qw quotes nested in containers my $KNEXT = $self->[_K_first_seq_item_]; while ( defined($KNEXT) ) { my $KK = $KNEXT; $KNEXT = $rLL->[$KNEXT]->[_KNEXT_SEQ_ITEM_]; my $rtoken_vars = $rLL->[$KK]; my $outer_seqno = $rtoken_vars->[_TYPE_SEQUENCE_]; if ( !$outer_seqno ) { next if ( $KK == 0 ); # first token in file may not be container # A fault here implies that an error was made in the little loop at # the bottom of sub 'respace_tokens' which set the values of # _KNEXT_SEQ_ITEM_. Or an error has been introduced in the # loop control lines above. 
Fault("sequence = $outer_seqno not defined at K=$KK") if (DEVEL_MODE); next; } my $token = $rtoken_vars->[_TOKEN_]; if ( $is_opening_token{$token} ) { # see if the next token is a quote of some type my $Kn = $KK + 1; $Kn += 1 if ( $Kn < $Num && $rLL->[$Kn]->[_TYPE_] eq 'b' ); next unless ( $Kn < $Num ); my $next_token = $rLL->[$Kn]->[_TOKEN_]; my $next_type = $rLL->[$Kn]->[_TYPE_]; next unless ( ( $next_type eq 'q' || $next_type eq 'Q' ) && substr( $next_token, 0, 1 ) eq 'q' ); # The token before the closing container must also be a quote my $Kouter_closing = $K_closing_container->{$outer_seqno}; my $Kinner_closing = $self->K_previous_nonblank($Kouter_closing); next unless $rLL->[$Kinner_closing]->[_TYPE_] eq $next_type; # This is an inner opening container my $Kinner_opening = $Kn; # Do not weld to single-line quotes. Nothing is gained, and it may # look bad. next if ( $Kinner_closing == $Kinner_opening ); # Only weld to quotes delimited with container tokens. This is # because welding to arbitrary quote delimiters can produce code # which is less readable than without welding. my $closing_delimiter = substr( $rLL->[$Kinner_closing]->[_TOKEN_], -1, 1 ); next unless ( $is_closing_token{$closing_delimiter} || $closing_delimiter eq '>' ); # Now make sure that there is just a single quote in the container next unless ( $is_single_quote->( $Kinner_opening + 1, $Kinner_closing - 1, $next_type ) ); # OK: This is a candidate for welding my $Msg = EMPTY_STRING; my $do_not_weld; my $Kouter_opening = $K_opening_container->{$outer_seqno}; my $iline_oo = $rLL->[$Kouter_opening]->[_LINE_INDEX_]; my $iline_io = $rLL->[$Kinner_opening]->[_LINE_INDEX_]; my $iline_oc = $rLL->[$Kouter_closing]->[_LINE_INDEX_]; my $iline_ic = $rLL->[$Kinner_closing]->[_LINE_INDEX_]; my $is_old_weld = ( $iline_oo == $iline_io && $iline_ic == $iline_oc ); # Fix for case b1189. If quote is marked as type 'Q' then only weld # if the two closing tokens are on the same input line. 
Otherwise, # the closing line will be output earlier in the pipeline than # other CODE lines and welding will not actually occur. This will # leave a half-welded structure with potential formatting # instability. This might be fixed by adding a check for a weld on # a closing Q token and sending it down the normal channel, but it # would complicate the code and is potentially risky. next if (!$is_old_weld && $next_type eq 'Q' && $iline_ic != $iline_oc ); # If welded, the line must not exceed allowed line length ( my $ok_to_weld, $maximum_text_length, $starting_lentot, my $msg ) = $self->setup_new_weld_measurements( $Kouter_opening, $Kinner_opening ); if ( !$ok_to_weld ) { if (DEBUG_WELD) { print $msg} next; } my $length = $rLL->[$Kinner_opening]->[_CUMULATIVE_LENGTH_] - $starting_lentot; my $excess = $length + $multiline_tol - $maximum_text_length; my $excess_max = ( $is_old_weld ? $multiline_tol : 0 ); if ( $excess >= $excess_max ) { $do_not_weld = 1; } if (DEBUG_WELD) { if ( !$is_old_weld ) { $is_old_weld = EMPTY_STRING } $Msg .= "excess=$excess>=$excess_max, multiline_tol=$multiline_tol, is_old_weld='$is_old_weld'\n"; } # Check weld exclusion rules for outer container if ( !$do_not_weld ) { my $is_leading = !defined( $rK_weld_left->{$Kouter_opening} ); if ( $self->is_excluded_weld( $KK, $is_leading ) ) { if (DEBUG_WELD) { $Msg .= "No qw weld due to weld exclusion rules for outer container\n"; } $do_not_weld = 1; } } # Check the length of the last line (fixes case b1039) if ( !$do_not_weld ) { my $rK_range_ic = $rlines->[$iline_ic]->{_rK_range}; my ( $Kfirst_ic, $Klast_ic ) = @{$rK_range_ic}; my $excess_ic = $self->excess_line_length_for_Krange( $Kfirst_ic, $Kouter_closing ); # Allow extra space for additional welded closing container(s) # and a space and comma or semicolon. # NOTE: weld len has not been computed yet. Use 2 spaces # for now, correct for a single weld. This estimate could # be made more accurate if necessary. 
my $weld_len = defined( $rK_weld_right->{$Kouter_closing} ) ? 2 : 0; if ( $excess_ic + $weld_len + 2 > 0 ) { if (DEBUG_WELD) { $Msg .= "No qw weld due to excess ending line length=$excess_ic + $weld_len + 2 > 0\n"; } $do_not_weld = 1; } } if ($do_not_weld) { if (DEBUG_WELD) { $Msg .= "Not Welding QW\n"; print $Msg; } next; } # OK to weld if (DEBUG_WELD) { $Msg .= "Welding QW\n"; print $Msg; } $rK_weld_right->{$Kouter_opening} = $Kinner_opening; $rK_weld_left->{$Kinner_opening} = $Kouter_opening; $rK_weld_right->{$Kinner_closing} = $Kouter_closing; $rK_weld_left->{$Kouter_closing} = $Kinner_closing; # Undo one indentation level if an extra level was added to this # multiline quote my $qw_seqno = $self->[_rstarting_multiline_qw_seqno_by_K_]->{$Kinner_opening}; if ( $qw_seqno && $self->[_rmultiline_qw_has_extra_level_]->{$qw_seqno} ) { foreach my $K ( $Kinner_opening + 1 .. $Kinner_closing - 1 ) { $rLL->[$K]->[_LEVEL_] -= 1; } $rLL->[$Kinner_opening]->[_CI_LEVEL_] = 0; $rLL->[$Kinner_closing]->[_CI_LEVEL_] = 0; } # undo CI for other welded quotes else { foreach my $K ( $Kinner_opening .. $Kinner_closing ) { $rLL->[$K]->[_CI_LEVEL_] = 0; } } # Change the level of a closing qw token to be that of the outer # containing token. This will allow -lp indentation to function # correctly in the vertical aligner. 
# Patch to fix c002: but not if it contains text if ( length( $rLL->[$Kinner_closing]->[_TOKEN_] ) == 1 ) { $rLL->[$Kinner_closing]->[_LEVEL_] = $rLL->[$Kouter_closing]->[_LEVEL_]; } } } return; } ## end sub weld_nested_quotes sub is_welded_at_seqno { my ( $self, $seqno ) = @_; # given a sequence number: # return true if it is welded either left or right # return false otherwise return unless ( $total_weld_count && defined($seqno) ); my $KK_o = $self->[_K_opening_container_]->{$seqno}; return unless defined($KK_o); return defined( $self->[_rK_weld_left_]->{$KK_o} ) || defined( $self->[_rK_weld_right_]->{$KK_o} ); } ## end sub is_welded_at_seqno sub mark_short_nested_blocks { # This routine looks at the entire file and marks any short nested blocks # which should not be broken. The results are stored in the hash # $rshort_nested->{$type_sequence} # which will be true if the container should remain intact. # # For example, consider the following line: # sub cxt_two { sort { $a <=> $b } test_if_list() } # The 'sort' block is short and nested within an outer sub block. # Normally, the existence of the 'sort' block will force the sub block to # break open, but this is not always desirable. Here we will set a flag for # the sort block to prevent this. To give the user control, we will # follow the input file formatting. If either of the blocks is broken in # the input file then we will allow it to remain broken. Otherwise we will # set a flag to keep it together in later formatting steps. 
# The flag which is set here will be checked in two places: # 'sub process_line_of_CODE' and 'sub starting_one_line_block' my $self = shift; return if $rOpts->{'indent-only'}; my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); return unless ( $rOpts->{'one-line-block-nesting'} ); my $K_opening_container = $self->[_K_opening_container_]; my $K_closing_container = $self->[_K_closing_container_]; my $rbreak_container = $self->[_rbreak_container_]; my $ris_broken_container = $self->[_ris_broken_container_]; my $rshort_nested = $self->[_rshort_nested_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; # Variables needed for estimating line lengths my $maximum_text_length; my $starting_lentot; my $length_tol = 1; my $excess_length_to_K = sub { my ($K) = @_; # Estimate the length from the line start to a given token my $length = $self->cumulative_length_before_K($K) - $starting_lentot; my $excess_length = $length + $length_tol - $maximum_text_length; return ($excess_length); }; # loop over all containers my @open_block_stack; my $iline = -1; my $KNEXT = $self->[_K_first_seq_item_]; while ( defined($KNEXT) ) { my $KK = $KNEXT; $KNEXT = $rLL->[$KNEXT]->[_KNEXT_SEQ_ITEM_]; my $rtoken_vars = $rLL->[$KK]; my $type_sequence = $rtoken_vars->[_TYPE_SEQUENCE_]; if ( !$type_sequence ) { next if ( $KK == 0 ); # first token in file may not be container # A fault here implies that an error was made in the little loop at # the bottom of sub 'respace_tokens' which set the values of # _KNEXT_SEQ_ITEM_. Or an error has been introduced in the # loop control lines above. Fault("sequence = $type_sequence not defined at K=$KK") if (DEVEL_MODE); next; } # Patch: do not mark short blocks with welds. # In some cases blinkers can form (case b690). 
if ( $total_weld_count && $self->is_welded_at_seqno($type_sequence) ) { next; } # We are just looking at code blocks my $token = $rtoken_vars->[_TOKEN_]; my $type = $rtoken_vars->[_TYPE_]; next unless ( $type eq $token ); next unless ( $rblock_type_of_seqno->{$type_sequence} ); # Keep a stack of all acceptable block braces seen. # Only consider blocks entirely on one line so dump the stack when line # changes. my $iline_last = $iline; $iline = $rLL->[$KK]->[_LINE_INDEX_]; if ( $iline != $iline_last ) { @open_block_stack = () } if ( $token eq '}' ) { if (@open_block_stack) { pop @open_block_stack } } next unless ( $token eq '{' ); # block must be balanced (bad scripts may be unbalanced) my $K_opening = $K_opening_container->{$type_sequence}; my $K_closing = $K_closing_container->{$type_sequence}; next unless ( defined($K_opening) && defined($K_closing) ); # require that this block be entirely on one line next if ( $ris_broken_container->{$type_sequence} || $rbreak_container->{$type_sequence} ); # See if this block fits on one line of allowed length (which may # be different from the input script) $starting_lentot = $KK <= 0 ? 0 : $rLL->[ $KK - 1 ]->[_CUMULATIVE_LENGTH_]; my $level = $rLL->[$KK]->[_LEVEL_]; my $ci_level = $rLL->[$KK]->[_CI_LEVEL_]; $maximum_text_length = $maximum_text_length_at_level[$level] - $ci_level * $rOpts_continuation_indentation; # Dump the stack if block is too long and skip this block if ( $excess_length_to_K->($K_closing) > 0 ) { @open_block_stack = (); next; } # OK, Block passes tests, remember it push @open_block_stack, $type_sequence; # We are only marking nested code blocks, # so check for a previous block on the stack next unless ( @open_block_stack > 1 ); # Looks OK, mark this as a short nested block $rshort_nested->{$type_sequence} = 1; } return; } ## end sub mark_short_nested_blocks sub special_indentation_adjustments { my ($self) = @_; # Called once per file to do special indentation adjustments. 
# These routines adjust levels either by changing _CI_LEVEL_ directly or # by setting modified levels in the array $self->[_radjusted_levels_]. # Initialize the adjusted levels. These will be the levels actually used # for computing indentation. # NOTE: This routine is called after the weld routines, which may have # already adjusted _LEVEL_, so we are making adjustments on top of those # levels. It would be much nicer to have the weld routines also use this # adjustment, but that gets complicated when we combine -gnu -wn and have # some welded quotes. my $Klimit = $self->[_Klimit_]; my $rLL = $self->[_rLL_]; my $radjusted_levels = $self->[_radjusted_levels_]; return unless ( defined($Klimit) ); foreach my $KK ( 0 .. $Klimit ) { $radjusted_levels->[$KK] = $rLL->[$KK]->[_LEVEL_]; } # First set adjusted levels for any non-indenting braces. $self->do_non_indenting_braces(); # Adjust breaks and indentation list containers $self->break_before_list_opening_containers(); # Set adjusted levels for the whitespace cycle option. $self->whitespace_cycle_adjustment(); $self->braces_left_setup(); # Adjust continuation indentation if -bli is set $self->bli_adjustment(); $self->extended_ci() if ($rOpts_extended_continuation_indentation); # Now clip any adjusted levels to be non-negative $self->clip_adjusted_levels(); return; } ## end sub special_indentation_adjustments sub clip_adjusted_levels { # Replace any negative adjusted levels with zero. # Negative levels can occur in files with brace errors. my ($self) = @_; my $radjusted_levels = $self->[_radjusted_levels_]; return unless defined($radjusted_levels) && @{$radjusted_levels}; my $min = min( @{$radjusted_levels} ); # fast check for min if ( $min < 0 ) { # slow loop, but rarely needed foreach ( @{$radjusted_levels} ) { $_ = 0 if ( $_ < 0 ) } } return; } ## end sub clip_adjusted_levels sub do_non_indenting_braces { # Called once per file to handle the --non-indenting-braces parameter. 
# Remove indentation within marked braces if requested my ($self) = @_; # Any non-indenting braces have been found by sub find_non_indenting_braces # and are defined by the following hash: my $rseqno_non_indenting_brace_by_ix = $self->[_rseqno_non_indenting_brace_by_ix_]; return unless ( %{$rseqno_non_indenting_brace_by_ix} ); my $rlines = $self->[_rlines_]; my $K_opening_container = $self->[_K_opening_container_]; my $K_closing_container = $self->[_K_closing_container_]; my $rspecial_side_comment_type = $self->[_rspecial_side_comment_type_]; my $radjusted_levels = $self->[_radjusted_levels_]; # First locate all of the marked blocks my @K_stack; foreach my $ix ( keys %{$rseqno_non_indenting_brace_by_ix} ) { my $seqno = $rseqno_non_indenting_brace_by_ix->{$ix}; my $KK = $K_opening_container->{$seqno}; my $line_of_tokens = $rlines->[$ix]; my $rK_range = $line_of_tokens->{_rK_range}; my ( $Kfirst, $Klast ) = @{$rK_range}; $rspecial_side_comment_type->{$Klast} = 'NIB'; push @K_stack, [ $KK, 1 ]; my $Kc = $K_closing_container->{$seqno}; push @K_stack, [ $Kc, -1 ] if ( defined($Kc) ); } return unless (@K_stack); @K_stack = sort { $a->[0] <=> $b->[0] } @K_stack; # Then loop to remove indentation within marked blocks my $KK_last = 0; my $ndeep = 0; foreach my $item (@K_stack) { my ( $KK, $inc ) = @{$item}; if ( $ndeep > 0 ) { foreach ( $KK_last + 1 .. $KK ) { $radjusted_levels->[$_] -= $ndeep; } # We just subtracted the old $ndeep value, which only applies to a # '{'. The new $ndeep applies to a '}', so we undo the error. 
if ( $inc < 0 ) { $radjusted_levels->[$KK] += 1 } } $ndeep += $inc; $KK_last = $KK; } return; } ## end sub do_non_indenting_braces sub whitespace_cycle_adjustment { my $self = shift; # Called once per file to implement the --whitespace-cycle option my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); my $radjusted_levels = $self->[_radjusted_levels_]; my $maximum_level = $self->[_maximum_level_]; if ( $rOpts_whitespace_cycle && $rOpts_whitespace_cycle > 0 && $rOpts_whitespace_cycle < $maximum_level ) { my $Kmax = @{$rLL} - 1; my $whitespace_last_level = -1; my @whitespace_level_stack = (); my $last_nonblank_type = 'b'; my $last_nonblank_token = EMPTY_STRING; foreach my $KK ( 0 .. $Kmax ) { my $level_abs = $radjusted_levels->[$KK]; my $level = $level_abs; if ( $level_abs < $whitespace_last_level ) { pop(@whitespace_level_stack); } if ( !@whitespace_level_stack ) { push @whitespace_level_stack, $level_abs; } elsif ( $level_abs > $whitespace_last_level ) { $level = $whitespace_level_stack[-1] + ( $level_abs - $whitespace_last_level ); if ( # 1 Try to break at a block brace ( $level > $rOpts_whitespace_cycle && $last_nonblank_type eq '{' && $last_nonblank_token eq '{' ) # 2 Then either a brace or bracket || ( $level > $rOpts_whitespace_cycle + 1 && $last_nonblank_token =~ /^[\{\[]$/ ) # 3 Then a paren too || $level > $rOpts_whitespace_cycle + 2 ) { $level = 1; } push @whitespace_level_stack, $level; } $level = $whitespace_level_stack[-1]; $radjusted_levels->[$KK] = $level; $whitespace_last_level = $level_abs; my $type = $rLL->[$KK]->[_TYPE_]; my $token = $rLL->[$KK]->[_TOKEN_]; if ( $type ne 'b' ) { $last_nonblank_type = $type; $last_nonblank_token = $token; } } } return; } ## end sub whitespace_cycle_adjustment use constant DEBUG_BBX => 0; sub break_before_list_opening_containers { my ($self) = @_; # This routine is called once per batch to implement parameters # --break-before-hash-brace=n and similar -bbx=n flags # and their associated indentation 
flags: # --break-before-hash-brace-and-indent and similar -bbxi=n # Nothing to do if none of the -bbx=n parameters has been set return unless %break_before_container_types; my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); # Loop over all opening container tokens my $K_opening_container = $self->[_K_opening_container_]; my $K_closing_container = $self->[_K_closing_container_]; my $ris_broken_container = $self->[_ris_broken_container_]; my $ris_permanently_broken = $self->[_ris_permanently_broken_]; my $rhas_list = $self->[_rhas_list_]; my $rhas_broken_list_with_lec = $self->[_rhas_broken_list_with_lec_]; my $radjusted_levels = $self->[_radjusted_levels_]; my $rparent_of_seqno = $self->[_rparent_of_seqno_]; my $rlines = $self->[_rlines_]; my $rtype_count_by_seqno = $self->[_rtype_count_by_seqno_]; my $rlec_count_by_seqno = $self->[_rlec_count_by_seqno_]; my $rno_xci_by_seqno = $self->[_rno_xci_by_seqno_]; my $rK_weld_right = $self->[_rK_weld_right_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my $length_tol = max( 1, $rOpts_continuation_indentation, $rOpts_indent_columns ); if ($rOpts_ignore_old_breakpoints) { # Patch suggested by b1231; the old tol was excessive. ## $length_tol += $rOpts_maximum_line_length; $length_tol *= 2; } my $rbreak_before_container_by_seqno = {}; my $rwant_reduced_ci = {}; foreach my $seqno ( keys %{$K_opening_container} ) { #---------------------------------------------------------------- # Part 1: Examine any -bbx=n flags #---------------------------------------------------------------- next if ( $rblock_type_of_seqno->{$seqno} ); my $KK = $K_opening_container->{$seqno}; # This must be a list or contain a list. # Note1: switched from 'has_broken_list' to 'has_list' to fix b1024. # Note2: 'has_list' holds the depth to the sub-list. 
We will require # a depth of just 1 my $is_list = $self->is_list_by_seqno($seqno); my $has_list = $rhas_list->{$seqno}; # Fix for b1173: if welded opening container, use flag of innermost # seqno. Otherwise, the restriction $has_list==1 prevents triple and # higher welds from following the -BBX parameters. if ($total_weld_count) { my $KK_test = $rK_weld_right->{$KK}; if ( defined($KK_test) ) { my $seqno_inner = $rLL->[$KK_test]->[_TYPE_SEQUENCE_]; $is_list ||= $self->is_list_by_seqno($seqno_inner); $has_list = $rhas_list->{$seqno_inner}; } } next unless ( $is_list || $has_list && $has_list == 1 ); my $has_list_with_lec = $rhas_broken_list_with_lec->{$seqno}; # Only for types of container tokens with a non-default break option my $token = $rLL->[$KK]->[_TOKEN_]; my $break_option = $break_before_container_types{$token}; next unless ($break_option); # Do not use -bbx under stress for stability ... fixes b1300 # TODO: review this; do we also need to look at stress_level_alpha? my $level = $rLL->[$KK]->[_LEVEL_]; if ( $level >= $stress_level_beta ) { DEBUG_BBX && print "BBX: Switching off at $seqno: level=$level exceeds beta stress level=$stress_level_beta\n"; next; } # Require previous nonblank to be '=' or '=>' my $Kprev = $KK - 1; next if ( $Kprev < 0 ); my $prev_type = $rLL->[$Kprev]->[_TYPE_]; if ( $prev_type eq 'b' ) { $Kprev--; next if ( $Kprev < 0 ); $prev_type = $rLL->[$Kprev]->[_TYPE_]; } next unless ( $is_equal_or_fat_comma{$prev_type} ); my $ci = $rLL->[$KK]->[_CI_LEVEL_]; #-------------------------------------------- # New coding for option 2 (break if complex). #-------------------------------------------- # This new coding uses clues which are invariant under formatting to # decide if a list is complex. For now it is only applied when -lp # or -vmll is used, but eventually it may become the standard method. # Fixes b1274, b1275, and others, including b1099.
if ( $break_option == 2 ) { if ( $rOpts_line_up_parentheses || $rOpts_variable_maximum_line_length ) { # Start with the basic definition of a complex list... my $is_complex = $is_list && $has_list; # and it is also complex if the parent is a list if ( !$is_complex ) { my $parent = $rparent_of_seqno->{$seqno}; if ( $self->is_list_by_seqno($parent) ) { $is_complex = 1; } } # finally, we will call it complex if there are inner opening # and closing container tokens, not parens, within the outer # container tokens. if ( !$is_complex ) { my $Kp = $self->K_next_nonblank($KK); my $token_p = defined($Kp) ? $rLL->[$Kp]->[_TOKEN_] : 'b'; if ( $is_opening_token{$token_p} && $token_p ne '(' ) { my $Kc = $K_closing_container->{$seqno}; my $Km = $self->K_previous_nonblank($Kc); my $token_m = defined($Km) ? $rLL->[$Km]->[_TOKEN_] : 'b'; # ignore any optional ending comma if ( $token_m eq ',' ) { $Km = $self->K_previous_nonblank($Km); $token_m = defined($Km) ? $rLL->[$Km]->[_TOKEN_] : 'b'; } $is_complex ||= $is_closing_token{$token_m} && $token_m ne ')'; } } # Convert to option 3 (always break) if complex next unless ($is_complex); $break_option = 3; } } # Fix for b1231: the has_list_with_lec does not cover all cases. # A broken container containing a list and with line-ending commas # will stay broken, so can be treated as if it had a list with lec. 
$has_list_with_lec ||= $has_list && $ris_broken_container->{$seqno} && $rlec_count_by_seqno->{$seqno}; DEBUG_BBX && print STDOUT "BBX: Looking at seqno=$seqno, token = $token with option=$break_option\n"; # -bbx=1 = stable, try to follow input if ( $break_option == 1 ) { my $iline = $rLL->[$KK]->[_LINE_INDEX_]; my $rK_range = $rlines->[$iline]->{_rK_range}; my ( $Kfirst, $Klast ) = @{$rK_range}; next unless ( $KK == $Kfirst ); } # -bbx=2 => apply this style only for a 'complex' list elsif ( $break_option == 2 ) { # break if this list contains a broken list with line-ending comma my $ok_to_break; my $Msg = EMPTY_STRING; if ($has_list_with_lec) { $ok_to_break = 1; DEBUG_BBX && do { $Msg = "has list with lec;" }; } if ( !$ok_to_break ) { # Turn off -xci if -bbx=2 and this container has a sublist but # not a broken sublist. This avoids creating blinkers. The # problem is that -xci can cause one-line lists to break open, # and thereby creating formatting instability. # This fixes cases b1033 b1036 b1037 b1038 b1042 b1043 b1044 # b1045 b1046 b1047 b1051 b1052 b1061. if ($has_list) { $rno_xci_by_seqno->{$seqno} = 1 } my $parent = $rparent_of_seqno->{$seqno}; if ( $self->is_list_by_seqno($parent) ) { DEBUG_BBX && do { $Msg = "parent is list" }; $ok_to_break = 1; } } if ( !$ok_to_break ) { DEBUG_BBX && print STDOUT "Not breaking at seqno=$seqno: $Msg\n"; next; } DEBUG_BBX && print STDOUT "OK to break at seqno=$seqno: $Msg\n"; # Patch: turn off -xci if -bbx=2 and -lp # This fixes cases b1090 b1095 b1101 b1116 b1118 b1121 b1122 $rno_xci_by_seqno->{$seqno} = 1 if ($rOpts_line_up_parentheses); } # -bbx=3 = always break elsif ( $break_option == 3 ) { # ok to break } # Shouldn't happen! 
Bad flag, but make behavior same as 3 else { # ok to break } # Set a flag for actual implementation later in # sub insert_breaks_before_list_opening_containers $rbreak_before_container_by_seqno->{$seqno} = 1; DEBUG_BBX && print STDOUT "BBX: ok to break at seqno=$seqno\n"; # -bbxi=0: Nothing more to do if the ci value remains unchanged my $ci_flag = $container_indentation_options{$token}; next unless ($ci_flag); # -bbxi=1: This option removes ci and is handled in # later sub get_final_indentation if ( $ci_flag == 1 ) { $rwant_reduced_ci->{$seqno} = 1; next; } # -bbxi=2: This option changes the level ... # This option can conflict with -xci in some cases. We can turn off # -xci for this container to avoid blinking. For now, only do this if # -vmll is set. ( fixes b1335, b1336 ) if ($rOpts_variable_maximum_line_length) { $rno_xci_by_seqno->{$seqno} = 1; } #---------------------------------------------------------------- # Part 2: Perform tests before committing to changing ci and level #---------------------------------------------------------------- # Before changing the ci level of the opening container, we need # to be sure that the container will be broken in the later stages of # formatting. We have to do this because we are working early in the # formatting pipeline. A problem can occur if we change the ci or # level of the opening token but do not actually break the container # open as expected. In most cases it wouldn't make any difference if # we changed ci or not, but there are some edge cases where this # can cause blinking states, so we need to try to only change ci if # the container will really be broken. # Only consider containers already broken next if ( !$ris_broken_container->{$seqno} ); # Patch to fix issue b1305: the combination of -naws and ci>i appears # to cause an instability. It should almost never occur in practice. 
next if (!$rOpts_add_whitespace && $rOpts_continuation_indentation > $rOpts_indent_columns ); # Always ok to change ci for permanently broken containers if ( $ris_permanently_broken->{$seqno} ) { } # Always OK if this list contains a broken sub-container with # a non-terminal line-ending comma elsif ($has_list_with_lec) { } # Otherwise, we are considering a single container... else { # A single container must have at least 1 line-ending comma: next unless ( $rlec_count_by_seqno->{$seqno} ); my $OK; # Since it has a line-ending comma, it will stay broken if the # -boc flag is set if ($rOpts_break_at_old_comma_breakpoints) { $OK = 1 } # OK if the container contains multiple fat commas # Better: multiple lines with fat commas if ( !$OK && !$rOpts_ignore_old_breakpoints ) { my $rtype_count = $rtype_count_by_seqno->{$seqno}; next unless ($rtype_count); my $fat_comma_count = $rtype_count->{'=>'}; DEBUG_BBX && print STDOUT "BBX: fat comma count=$fat_comma_count\n"; if ( $fat_comma_count && $fat_comma_count >= 2 ) { $OK = 1 } } # The last check we can make is to see if this container could # fit on a single line. Use the least possible indentation # estimate, ci=0, so we are not subtracting $ci * # $rOpts_continuation_indentation from tabulated # $maximum_text_length value. 
if ( !$OK ) { my $maximum_text_length = $maximum_text_length_at_level[$level]; my $K_closing = $K_closing_container->{$seqno}; my $length = $self->cumulative_length_before_K($K_closing) - $self->cumulative_length_before_K($KK); my $excess_length = $length - $maximum_text_length; DEBUG_BBX && print STDOUT "BBX: excess=$excess_length: maximum_text_length=$maximum_text_length, length=$length, ci=$ci\n"; # OK if the net container definitely breaks on length if ( $excess_length > $length_tol ) { $OK = 1; DEBUG_BBX && print STDOUT "BBX: excess_length=$excess_length\n"; } # Otherwise skip it else { next } } } #------------------------------------------------------------ # Part 3: Looks OK: apply -bbx=n and any related -bbxi=n flag #------------------------------------------------------------ DEBUG_BBX && print STDOUT "BBX: OK to break\n"; # -bbhbi=n # -bbsbi=n # -bbpi=n # where: # n=0 default indentation (usually one ci) # n=1 outdent one ci # n=2 indent one level (minus one ci) # n=3 indent one extra ci [This may be dropped] # NOTE: We are adjusting indentation of the opening container. The # closing container will normally follow the indentation of the opening # container automatically, so this is not currently done. next unless ($ci); # option 1: outdent if ( $ci_flag == 1 ) { $ci -= 1; } # option 2: indent one level elsif ( $ci_flag == 2 ) { $ci -= 1; $radjusted_levels->[$KK] += 1; } # unknown option else { # Shouldn't happen - leave ci unchanged } $rLL->[$KK]->[_CI_LEVEL_] = $ci if ( $ci >= 0 ); } $self->[_rbreak_before_container_by_seqno_] = $rbreak_before_container_by_seqno; $self->[_rwant_reduced_ci_] = $rwant_reduced_ci; return; } ## end sub break_before_list_opening_containers use constant DEBUG_XCI => 0; sub extended_ci { # This routine implements the -xci (--extended-continuation-indentation) # flag. We add CI to interior tokens of a container which itself has CI but # only if a token does not already have CI. 
# To do this, we will locate opening tokens which themselves have # continuation indentation (CI). We track them with their sequence # numbers. These sequence numbers are called 'controlling sequence # numbers'. They apply continuation indentation to the tokens that they # contain. These inner tokens remember their controlling sequence numbers. # Later, when these inner tokens are output, they have to see if the output # lines with their controlling tokens were output with CI or not. If not, # then they must remove their CI too. # The controlling CI concept works hierarchically. But CI itself is not # hierarchical; it is either on or off. There are some rare instances where # it would be best to have hierarchical CI too, but not enough to be worth # the programming effort. # The operations to remove unwanted CI are done in sub 'undo_ci'. my ($self) = @_; my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); my $ris_list_by_seqno = $self->[_ris_list_by_seqno_]; my $ris_seqno_controlling_ci = $self->[_ris_seqno_controlling_ci_]; my $rseqno_controlling_my_ci = $self->[_rseqno_controlling_my_ci_]; my $rno_xci_by_seqno = $self->[_rno_xci_by_seqno_]; my $ris_bli_container = $self->[_ris_bli_container_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my %available_space; # Loop over all opening container tokens my $K_opening_container = $self->[_K_opening_container_]; my $K_closing_container = $self->[_K_closing_container_]; my @seqno_stack; my $seqno_top; my $KLAST; my $KNEXT = $self->[_K_first_seq_item_]; # The following variable can be used to allow a little extra space to # avoid blinkers. A value $len_tol = 20 fixed the following # cases: b1025 b1026 b1027 b1028 b1029 b1030 but NOT b1031. # It turned out that the real problem was mis-parsing a list brace as # a code block in a 'use' statement when the line length was extremely # small.
A value of 0 works now, but a slightly larger value can # be used to minimize the chance of a blinker. my $len_tol = 0; while ( defined($KNEXT) ) { # Fix all tokens up to the next sequence item if we are changing CI if ($seqno_top) { my $is_list = $ris_list_by_seqno->{$seqno_top}; my $space = $available_space{$seqno_top}; my $count = 0; foreach my $Kt ( $KLAST + 1 .. $KNEXT - 1 ) { next if ( $rLL->[$Kt]->[_CI_LEVEL_] ); # But do not include tokens which might exceed the line length # and are not in a list. # ... This fixes case b1031 if ( $is_list || $rLL->[$Kt]->[_TOKEN_LENGTH_] < $space || $rLL->[$Kt]->[_TYPE_] eq '#' ) { $rLL->[$Kt]->[_CI_LEVEL_] = 1; $rseqno_controlling_my_ci->{$Kt} = $seqno_top; $count++; } } $ris_seqno_controlling_ci->{$seqno_top} += $count; } $KLAST = $KNEXT; my $KK = $KNEXT; $KNEXT = $rLL->[$KNEXT]->[_KNEXT_SEQ_ITEM_]; my $seqno = $rLL->[$KK]->[_TYPE_SEQUENCE_]; # see if we have reached the end of the current controlling container if ( $seqno_top && $seqno == $seqno_top ) { $seqno_top = pop @seqno_stack; } # Patch to fix some block types... # Certain block types arrive from the tokenizer without CI but should # have it for this option. 
These include anonymous subs and # do sort map grep eval my $block_type = $rblock_type_of_seqno->{$seqno}; if ( $block_type && $is_block_with_ci{$block_type} ) { $rLL->[$KK]->[_CI_LEVEL_] = 1; if ($seqno_top) { $rseqno_controlling_my_ci->{$KK} = $seqno_top; $ris_seqno_controlling_ci->{$seqno_top}++; } } # If this does not have ci, update ci if necessary and continue looking elsif ( !$rLL->[$KK]->[_CI_LEVEL_] ) { if ($seqno_top) { $rLL->[$KK]->[_CI_LEVEL_] = 1; $rseqno_controlling_my_ci->{$KK} = $seqno_top; $ris_seqno_controlling_ci->{$seqno_top}++; } next; } # We are looking for opening container tokens with ci my $K_opening = $K_opening_container->{$seqno}; next unless ( defined($K_opening) && $KK == $K_opening ); # Make sure there is a corresponding closing container # (could be missing if the script has a brace error) my $K_closing = $K_closing_container->{$seqno}; next unless defined($K_closing); # Skip if requested by -bbx to avoid blinkers next if ( $rno_xci_by_seqno->{$seqno} ); # Skip if this is a -bli container (this fixes case b1065) Note: case # b1065 is also fixed by the update for b1055, so this update is not # essential now. But there does not seem to be a good reason to add # xci and bli together, so the update is retained. next if ( $ris_bli_container->{$seqno} ); # Require different input lines. This will filter out a large number # of small hash braces and array brackets. If we accidentally filter # out an important container, it will get fixed on the next pass. 
if ( $rLL->[$K_opening]->[_LINE_INDEX_] == $rLL->[$K_closing]->[_LINE_INDEX_] && ( $rLL->[$K_closing]->[_CUMULATIVE_LENGTH_] - $rLL->[$K_opening]->[_CUMULATIVE_LENGTH_] > $rOpts_maximum_line_length ) ) { DEBUG_XCI && print "XCI: Skipping seqno=$seqno, require different lines\n"; next; } # Do not apply -xci if adding extra ci will put the container contents # beyond the line length limit (fixes cases b899 b935) my $level = $rLL->[$K_opening]->[_LEVEL_]; my $ci_level = $rLL->[$K_opening]->[_CI_LEVEL_]; my $maximum_text_length = $maximum_text_length_at_level[$level] - $ci_level * $rOpts_continuation_indentation; # Fix for b1197 b1198 b1199 b1200 b1201 b1202 # Do not apply -xci if we are running out of space # TODO: review this; do we also need to look at stress_level_alpha? if ( $level >= $stress_level_beta ) { DEBUG_XCI && print "XCI: Skipping seqno=$seqno, level=$level exceeds stress level=$stress_level_beta\n"; next; } # remember how much space is available for patch b1031 above my $space = $maximum_text_length - $len_tol - $rOpts_continuation_indentation; if ( $space < 0 ) { DEBUG_XCI && print "XCI: Skipping seqno=$seqno, space=$space\n"; next; } DEBUG_XCI && print "XCI: OK seqno=$seqno, space=$space\n"; $available_space{$seqno} = $space; # This becomes the next controlling container push @seqno_stack, $seqno_top if ($seqno_top); $seqno_top = $seqno; } return; } ## end sub extended_ci sub braces_left_setup { # Called once per file to mark all -bl, -sbl, and -asbl containers my $self = shift; my $rOpts_bl = $rOpts->{'opening-brace-on-new-line'}; my $rOpts_sbl = $rOpts->{'opening-sub-brace-on-new-line'}; my $rOpts_asbl = $rOpts->{'opening-anonymous-sub-brace-on-new-line'}; return unless ( $rOpts_bl || $rOpts_sbl || $rOpts_asbl ); my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); # We will turn on this hash for braces controlled by these flags: my $rbrace_left = $self->[_rbrace_left_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my 
$ris_asub_block = $self->[_ris_asub_block_]; my $ris_sub_block = $self->[_ris_sub_block_]; foreach my $seqno ( keys %{$rblock_type_of_seqno} ) { my $block_type = $rblock_type_of_seqno->{$seqno}; # use -asbl flag for an anonymous sub block if ( $ris_asub_block->{$seqno} ) { if ($rOpts_asbl) { $rbrace_left->{$seqno} = 1; } } # use -sbl flag for a named sub elsif ( $ris_sub_block->{$seqno} ) { if ($rOpts_sbl) { $rbrace_left->{$seqno} = 1; } } # use -bl flag if not a sub block of any type else { if ( $rOpts_bl && $block_type =~ /$bl_pattern/ && $block_type !~ /$bl_exclusion_pattern/ ) { $rbrace_left->{$seqno} = 1; } } } return; } ## end sub braces_left_setup sub bli_adjustment { # Called once per file to implement the --brace-left-and-indent option. # If -bli is set, adds one continuation indentation for certain braces my $self = shift; return unless ( $rOpts->{'brace-left-and-indent'} ); my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my $ris_bli_container = $self->[_ris_bli_container_]; my $rbrace_left = $self->[_rbrace_left_]; my $K_opening_container = $self->[_K_opening_container_]; my $K_closing_container = $self->[_K_closing_container_]; foreach my $seqno ( keys %{$rblock_type_of_seqno} ) { my $block_type = $rblock_type_of_seqno->{$seqno}; if ( $block_type && $block_type =~ /$bli_pattern/ && $block_type !~ /$bli_exclusion_pattern/ ) { $ris_bli_container->{$seqno} = 1; $rbrace_left->{$seqno} = 1; my $Ko = $K_opening_container->{$seqno}; my $Kc = $K_closing_container->{$seqno}; if ( defined($Ko) && defined($Kc) ) { $rLL->[$Kc]->[_CI_LEVEL_] = ++$rLL->[$Ko]->[_CI_LEVEL_]; } } } return; } ## end sub bli_adjustment sub find_multiline_qw { my ( $self, $rqw_lines ) = @_; # Multiline qw quotes are not sequenced items like containers { [ ( # but behave in some respects in a similar way. So this routine finds them # and creates a separate sequence number system for later use. 
# This is straightforward because they always begin at the end of one line # and end at the beginning of a later line. This is true no matter how we # finally make our line breaks, so we can find them before deciding on new # line breaks. # Input parameter: # if $rqw_lines is defined it is a ref to array of all line index numbers # for which there is a type 'q' qw quote at either end of the line. This # was defined by sub resync_lines_and_tokens for efficiency. # my $rlines = $self->[_rlines_]; # if $rqw_lines is not defined (this will occur with -io option) then we # will have to scan all lines. if ( !defined($rqw_lines) ) { $rqw_lines = [ 0 .. @{$rlines} - 1 ]; } # if $rqw_lines is defined but empty, just return because there are no # multiline qw's else { if ( !@{$rqw_lines} ) { return } } my $rstarting_multiline_qw_seqno_by_K = {}; my $rending_multiline_qw_seqno_by_K = {}; my $rKrange_multiline_qw_by_seqno = {}; my $rmultiline_qw_has_extra_level = {}; my $ris_excluded_lp_container = $self->[_ris_excluded_lp_container_]; my $rLL = $self->[_rLL_]; my $qw_seqno; my $num_qw_seqno = 0; my $K_start_multiline_qw; # For reference, here is the old loop, before $rqw_lines became available: ## foreach my $line_of_tokens ( @{$rlines} ) { foreach my $iline ( @{$rqw_lines} ) { my $line_of_tokens = $rlines->[$iline]; # Note that these first checks are required in case we have to scan # all lines, not just lines with type 'q' at the ends. my $line_type = $line_of_tokens->{_line_type}; next unless ( $line_type eq 'CODE' ); my $rK_range = $line_of_tokens->{_rK_range}; my ( $Kfirst, $Klast ) = @{$rK_range}; next unless ( defined($Kfirst) && defined($Klast) ); # skip blank line # Continuing a sequence of qw lines ... if ( defined($K_start_multiline_qw) ) { my $type = $rLL->[$Kfirst]->[_TYPE_]; # shouldn't happen if ( $type ne 'q' ) { DEVEL_MODE && print STDERR <K_previous_nonblank($Kfirst); my $Knext = $self->K_next_nonblank($Kfirst); my $type_m = defined($Kprev) ? 
$rLL->[$Kprev]->[_TYPE_] : 'b'; my $type_p = defined($Knext) ? $rLL->[$Knext]->[_TYPE_] : 'b'; if ( $type_m eq 'q' && $type_p ne 'q' ) { $rending_multiline_qw_seqno_by_K->{$Kfirst} = $qw_seqno; $rKrange_multiline_qw_by_seqno->{$qw_seqno} = [ $K_start_multiline_qw, $Kfirst ]; $K_start_multiline_qw = undef; $qw_seqno = undef; } } # Starting a new sequence of qw lines ? if ( !defined($K_start_multiline_qw) && $rLL->[$Klast]->[_TYPE_] eq 'q' ) { my $Kprev = $self->K_previous_nonblank($Klast); my $Knext = $self->K_next_nonblank($Klast); my $type_m = defined($Kprev) ? $rLL->[$Kprev]->[_TYPE_] : 'b'; my $type_p = defined($Knext) ? $rLL->[$Knext]->[_TYPE_] : 'b'; if ( $type_m ne 'q' && $type_p eq 'q' ) { $num_qw_seqno++; $qw_seqno = 'q' . $num_qw_seqno; $K_start_multiline_qw = $Klast; $rstarting_multiline_qw_seqno_by_K->{$Klast} = $qw_seqno; } } } # Give multiline qw lists extra indentation instead of CI. This option # works well but is currently only activated when the -xci flag is set. # The reason is to avoid unexpected changes in formatting. if ($rOpts_extended_continuation_indentation) { while ( my ( $qw_seqno_x, $rKrange ) = each %{$rKrange_multiline_qw_by_seqno} ) { my ( $Kbeg, $Kend ) = @{$rKrange}; # require isolated closing token my $token_end = $rLL->[$Kend]->[_TOKEN_]; next unless ( length($token_end) == 1 && ( $is_closing_token{$token_end} || $token_end eq '>' ) ); # require isolated opening token my $token_beg = $rLL->[$Kbeg]->[_TOKEN_]; # allow space(s) after the qw if ( length($token_beg) > 3 && substr( $token_beg, 2, 1 ) =~ m/\s/ ) { $token_beg =~ s/\s+//; } next unless ( length($token_beg) == 3 ); foreach my $KK ( $Kbeg + 1 ..
$Kend - 1 ) { $rLL->[$KK]->[_LEVEL_]++; $rLL->[$KK]->[_CI_LEVEL_] = 0; } # set flag for -wn option, which will remove the level $rmultiline_qw_has_extra_level->{$qw_seqno_x} = 1; } } # For the -lp option we need to mark all parent containers of # multiline quotes if ( $rOpts_line_up_parentheses && !$rOpts_extended_line_up_parentheses ) { while ( my ( $qw_seqno_x, $rKrange ) = each %{$rKrange_multiline_qw_by_seqno} ) { my ( $Kbeg, $Kend ) = @{$rKrange}; my $parent_seqno = $self->parent_seqno_by_K($Kend); next unless ($parent_seqno); # If the parent container exactly surrounds this qw, then -lp # formatting seems to work so we will not mark it. my $is_tightly_contained; my $Kn = $self->K_next_nonblank($Kend); my $seqno_n = defined($Kn) ? $rLL->[$Kn]->[_TYPE_SEQUENCE_] : undef; if ( defined($seqno_n) && $seqno_n eq $parent_seqno ) { my $Kp = $self->K_previous_nonblank($Kbeg); my $seqno_p = defined($Kp) ? $rLL->[$Kp]->[_TYPE_SEQUENCE_] : undef; if ( defined($seqno_p) && $seqno_p eq $parent_seqno ) { $is_tightly_contained = 1; } } $ris_excluded_lp_container->{$parent_seqno} = 1 unless ($is_tightly_contained); # continue up the tree marking parent containers while (1) { $parent_seqno = $self->[_rparent_of_seqno_]->{$parent_seqno}; last unless ( defined($parent_seqno) && $parent_seqno ne SEQ_ROOT ); $ris_excluded_lp_container->{$parent_seqno} = 1; } } } $self->[_rstarting_multiline_qw_seqno_by_K_] = $rstarting_multiline_qw_seqno_by_K; $self->[_rending_multiline_qw_seqno_by_K_] = $rending_multiline_qw_seqno_by_K; $self->[_rKrange_multiline_qw_by_seqno_] = $rKrange_multiline_qw_by_seqno; $self->[_rmultiline_qw_has_extra_level_] = $rmultiline_qw_has_extra_level; return; } ## end sub find_multiline_qw use constant DEBUG_COLLAPSED_LENGTHS => 0; # Minimum space reserved for contents of a code block. A value of 40 has given # reasonable results. With a large line length, say -l=120, this will not # normally be noticeable but it will prevent making a mess in some edge cases. 
use constant MIN_BLOCK_LEN => 40; my %is_handle_type; BEGIN { my @q = qw( w C U G i k => ); @is_handle_type{@q} = (1) x scalar(@q); my $i = 0; use constant { _max_prong_len_ => $i++, _handle_len_ => $i++, _seqno_o_ => $i++, _iline_o_ => $i++, _K_o_ => $i++, _K_c_ => $i++, _interrupted_list_rule_ => $i++, }; } ## end BEGIN sub is_fragile_block_type { my ( $self, $block_type, $seqno ) = @_; # Given: # $block_type = the block type of a token, and # $seqno = its sequence number # Return: # true if this block type stays broken after being broken, # false otherwise # This sub has been added to isolate a tricky decision needed # to fix issue b1428. # The coding here needs to agree with: # - sub process_line where variable '$rbrace_follower' is set # - sub process_line_inner_loop where variable '$is_opening_BLOCK' is set, if ( $is_sort_map_grep_eval{$block_type} || $block_type eq 't' || $self->[_rshort_nested_]->{$seqno} ) { return 0; } return 1; } ## end sub is_fragile_block_type { ## closure xlp_collapsed_lengths my $max_prong_len; my $len; my $last_nonblank_type; my @stack; sub xlp_collapsed_lengths_initialize { $max_prong_len = 0; $len = 0; $last_nonblank_type = 'b'; @stack = (); push @stack, [ 0, # $max_prong_len, 0, # $handle_len, SEQ_ROOT, # $seqno, undef, # $iline, undef, # $KK, undef, # $K_c, undef, # $interrupted_list_rule ]; return; } ## end sub xlp_collapsed_lengths_initialize sub cumulative_length_to_comma { my ( $self, $KK, $K_comma, $K_closing ) = @_; # Given: # $KK = index of starting token, or blank before start # $K_comma = index of line-ending comma # $K_closing = index of the container closing token # Return: # $length = cumulative length of the term my $rLL = $self->[_rLL_]; if ( $rLL->[$KK]->[_TYPE_] eq 'b' ) { $KK++ } my $length = 0; if ( $KK < $K_comma && $rLL->[$K_comma]->[_TYPE_] eq ',' # should be true # Ignore if terminal comma, causes instability (b1297, # b1330) && ( $K_closing - $K_comma > 2 || ( $K_closing - $K_comma == 2 && $rLL->[ $K_comma 
+ 1 ]->[_TYPE_] ne 'b' ) ) # The comma should be in this container && ( $rLL->[$K_comma]->[_LEVEL_] - 1 == $rLL->[$K_closing]->[_LEVEL_] ) ) { # An additional check: if line ends in ), and the ) has vtc then # skip this estimate. Otherwise, vtc can give oscillating results. # Fixes b1448. For example, this could be unstable: # ( $os ne 'win' ? ( -selectcolor => "red" ) : () ), # | |^--K_comma # | ^-- K_prev # ^--- KK # An alternative, possibly better strategy would be to try to turn # off -vtc locally, but it turns out to be difficult to locate the # appropriate closing token when it is not on the same line as its # opening token. my $K_prev = $self->K_previous_nonblank($K_comma); if ( defined($K_prev) && $K_prev >= $KK && $rLL->[$K_prev]->[_TYPE_SEQUENCE_] ) { my $token = $rLL->[$K_prev]->[_TOKEN_]; my $type = $rLL->[$K_prev]->[_TYPE_]; if ( $closing_vertical_tightness{$token} && $type ne 'R' ) { ## type 'R' does not normally get broken, so ignore ## skip length calculation return 0; } } my $starting_len = $KK >= 0 ? $rLL->[ $KK - 1 ]->[_CUMULATIVE_LENGTH_] : 0; $length = $rLL->[$K_comma]->[_CUMULATIVE_LENGTH_] - $starting_len; } return $length; } ## end sub cumulative_length_to_comma sub xlp_collapsed_lengths { my $self = shift; #---------------------------------------------------------------- # Define the collapsed lengths of containers for -xlp indentation #---------------------------------------------------------------- # We need an estimate of the minimum required line length starting at # any opening container for the -xlp style. This is needed to avoid # using too much indentation space for lower level containers and # thereby running out of space for outer container tokens due to the # maximum line length limit. 
# The basic idea is that at each node in the tree we imagine that we # have a fork with a handle and collapsible prongs: # # |------------ # |-------- # ------------|------- # handle |------------ # |-------- # prongs # # Each prong has a minimum collapsed length. The collapsed length at a # node is the maximum of these minimum lengths, plus the handle length. # Each of the prongs may itself be a tree node. # This is just a rough calculation to get an approximate starting point # for indentation. Later routines will be more precise. It is # important that these estimates be independent of the line breaks of # the input stream in order to avoid instabilities. my $rLL = $self->[_rLL_]; my $rlines = $self->[_rlines_]; my $rcollapsed_length_by_seqno = $self->[_rcollapsed_length_by_seqno_]; my $rtype_count_by_seqno = $self->[_rtype_count_by_seqno_]; my $K_start_multiline_qw; my $level_start_multiline_qw = 0; xlp_collapsed_lengths_initialize(); #-------------------------------- # Loop over all lines in the file #-------------------------------- my $iline = -1; my $skip_next_line; foreach my $line_of_tokens ( @{$rlines} ) { $iline++; if ($skip_next_line) { $skip_next_line = 0; next; } my $line_type = $line_of_tokens->{_line_type}; next if ( $line_type ne 'CODE' ); my $CODE_type = $line_of_tokens->{_code_type}; # Always skip blank lines next if ( $CODE_type eq 'BL' ); # Note on other line types: # 'FS' (Format Skipping) lines may contain opening/closing tokens so # we have to process them to keep the stack correctly sequenced # 'VB' (Verbatim) lines could be skipped, but testing shows that # results look better if we include their lengths. # Also note that we could exclude -xlp formatting of containers with # 'FS' and 'VB' lines, but in testing that was not really beneficial # So we process tokens in 'FS' and 'VB' lines like all the rest... 
my $rK_range = $line_of_tokens->{_rK_range}; my ( $K_first, $K_last ) = @{$rK_range}; next unless ( defined($K_first) && defined($K_last) ); my $has_comment = $rLL->[$K_last]->[_TYPE_] eq '#'; # Always ignore block comments next if ( $has_comment && $K_first == $K_last ); # Handle an intermediate line of a multiline qw quote. These may # require including some -ci or -i spaces. See cases c098/x063. # Updated to check all lines (not just $K_first==$K_last) to fix # b1316 my $K_begin_loop = $K_first; if ( $rLL->[$K_first]->[_TYPE_] eq 'q' ) { my $KK = $K_first; my $level = $rLL->[$KK]->[_LEVEL_]; my $ci_level = $rLL->[$KK]->[_CI_LEVEL_]; # remember the level of the start if ( !defined($K_start_multiline_qw) ) { $K_start_multiline_qw = $K_first; $level_start_multiline_qw = $level; my $seqno_qw = $self->[_rstarting_multiline_qw_seqno_by_K_] ->{$K_start_multiline_qw}; if ( !$seqno_qw ) { my $Kp = $self->K_previous_nonblank($K_first); if ( defined($Kp) && $rLL->[$Kp]->[_TYPE_] eq 'q' ) { $K_start_multiline_qw = $Kp; $level_start_multiline_qw = $rLL->[$K_start_multiline_qw]->[_LEVEL_]; } else { # Fix for b1319, b1320 $K_start_multiline_qw = undef; } } } if ( defined($K_start_multiline_qw) ) { $len = $rLL->[$KK]->[_CUMULATIVE_LENGTH_] - $rLL->[ $KK - 1 ]->[_CUMULATIVE_LENGTH_]; # We may have to add the spaces of one level or ci level # ... it depends on the -xci flag, the -wn flag, # and if the qw uses a container token as the quote # delimiter. # First rule: add ci if there is a $ci_level if ($ci_level) { $len += $rOpts_continuation_indentation; } # Second rule: otherwise, look for an extra indentation # level from the start and add one indentation level if # found.
elsif ( $level > $level_start_multiline_qw ) { $len += $rOpts_indent_columns; } if ( $len > $max_prong_len ) { $max_prong_len = $len } $last_nonblank_type = 'q'; $K_begin_loop = $K_first + 1; # We can skip to the next line if there are no more tokens next if ( $K_begin_loop > $K_last ); } } $K_start_multiline_qw = undef; # Find the terminal token, before any side comment my $K_terminal = $K_last; if ($has_comment) { $K_terminal -= 1; $K_terminal -= 1 if ( $rLL->[$K_terminal]->[_TYPE_] eq 'b' && $K_terminal > $K_first ); } # Use length to terminal comma if interrupted list rule applies if ( @stack && $stack[-1]->[_interrupted_list_rule_] ) { my $K_c = $stack[-1]->[_K_c_]; if ( defined($K_c) ) { #---------------------------------------------------------- # BEGIN patch for issue b1408: If this line ends in an # opening token, look for the closing token and comma at # the end of the next line. If so, combine the two lines to # get the correct sums. This problem seems to require -xlp # -vtc=2 and blank lines to occur. Use %is_opening_type to # fix b1431.
#---------------------------------------------------------- if ( $is_opening_type{ $rLL->[$K_terminal]->[_TYPE_] } && !$has_comment ) { my $seqno_end = $rLL->[$K_terminal]->[_TYPE_SEQUENCE_]; my $Kc_test = $rLL->[$K_terminal]->[_KNEXT_SEQ_ITEM_]; # We are looking for a short broken remnant on the next # line; something like the third line here (b1408): # parent => # Moose::Util::TypeConstraints::find_type_constraint( # 'RefXX' ), # or this # # Help::WorkSubmitter->_filter_chores_and_maybe_warn_user( # $story_set_all_chores), # or this (b1431): # $issue->{ # 'borrowernumber'}, # borrowernumber if ( defined($Kc_test) && $seqno_end == $rLL->[$Kc_test]->[_TYPE_SEQUENCE_] && $rLL->[$Kc_test]->[_LINE_INDEX_] == $iline + 1 ) { my $line_of_tokens_next = $rlines->[ $iline + 1 ]; my $rtype_count = $rtype_count_by_seqno->{$seqno_end}; my ( $K_first_next, $K_terminal_next ) = @{ $line_of_tokens_next->{_rK_range} }; # backup at a side comment if ( defined($K_terminal_next) && $rLL->[$K_terminal_next]->[_TYPE_] eq '#' ) { my $Kprev = $self->K_previous_nonblank($K_terminal_next); if ( defined($Kprev) && $Kprev >= $K_first_next ) { $K_terminal_next = $Kprev; } } if ( defined($K_terminal_next) # next line ends with a comma && $rLL->[$K_terminal_next]->[_TYPE_] eq ',' # which follows the closing container token && ( $K_terminal_next - $Kc_test == 1 || ( $K_terminal_next - $Kc_test == 2 && $rLL->[ $K_terminal_next - 1 ] ->[_TYPE_] eq 'b' ) ) # no commas in the container && ( !defined($rtype_count) || !$rtype_count->{','} ) # for now, restrict this to a container with # just 1 or two tokens && $K_terminal_next - $K_terminal <= 5 ) { # combine the next line with the current line $K_terminal = $K_terminal_next; $skip_next_line = 1; if (DEBUG_COLLAPSED_LENGTHS) { print "Combining lines at line $iline\n"; } } } } #-------------------------- # END patch for issue b1408 #-------------------------- if ( $rLL->[$K_terminal]->[_TYPE_] eq ',' ) { my $length = $self->cumulative_length_to_comma( 
$K_first, $K_terminal, $K_c ); # Fix for b1331: at a broken => item, include the # length of the previous half of the item plus one for # the missing space if ( $last_nonblank_type eq '=>' ) { $length += $len + 1; } if ( $length > $max_prong_len ) { $max_prong_len = $length; } } } } #---------------------------------- # Loop over all tokens on this line #---------------------------------- $self->xlp_collapse_lengths_inner_loop( $iline, $K_begin_loop, $K_terminal, $K_last ); # Now take care of any side comment; if ($has_comment) { if ($rOpts_ignore_side_comment_lengths) { $len = 0; } else { # For a side comment when -iscl is not set, measure length from # the start of the previous nonblank token my $len0 = $K_terminal > 0 ? $rLL->[ $K_terminal - 1 ]->[_CUMULATIVE_LENGTH_] : 0; $len = $rLL->[$K_last]->[_CUMULATIVE_LENGTH_] - $len0; if ( $len > $max_prong_len ) { $max_prong_len = $len } } } } ## end loop over lines if (DEBUG_COLLAPSED_LENGTHS) { print "\nCollapsed lengths--\n"; foreach my $key ( sort { $a <=> $b } keys %{$rcollapsed_length_by_seqno} ) { my $clen = $rcollapsed_length_by_seqno->{$key}; print "$key -> $clen\n"; } } return; } ## end sub xlp_collapsed_lengths sub xlp_collapse_lengths_inner_loop { my ( $self, $iline, $K_begin_loop, $K_terminal, $K_last ) = @_; my $rLL = $self->[_rLL_]; my $K_closing_container = $self->[_K_closing_container_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my $rcollapsed_length_by_seqno = $self->[_rcollapsed_length_by_seqno_]; my $ris_permanently_broken = $self->[_ris_permanently_broken_]; my $ris_list_by_seqno = $self->[_ris_list_by_seqno_]; my $rhas_broken_list = $self->[_rhas_broken_list_]; my $rtype_count_by_seqno = $self->[_rtype_count_by_seqno_]; #---------------------------------- # Loop over tokens on this line ... #---------------------------------- foreach my $KK ( $K_begin_loop .. 
$K_terminal ) { my $type = $rLL->[$KK]->[_TYPE_]; next if ( $type eq 'b' ); #------------------------ # Handle sequenced tokens #------------------------ my $seqno = $rLL->[$KK]->[_TYPE_SEQUENCE_]; if ($seqno) { my $token = $rLL->[$KK]->[_TOKEN_]; #---------------------------- # Entering a new container... #---------------------------- if ( $is_opening_token{$token} && defined( $K_closing_container->{$seqno} ) ) { # save current prong length $stack[-1]->[_max_prong_len_] = $max_prong_len; $max_prong_len = 0; # Start new prong one level deeper my $handle_len = 0; if ( $rblock_type_of_seqno->{$seqno} ) { # code blocks do not use -lp indentation, but behave as # if they had a handle of one indentation length $handle_len = $rOpts_indent_columns; } elsif ( $is_handle_type{$last_nonblank_type} ) { $handle_len = $len; $handle_len += 1 if ( $KK > 0 && $rLL->[ $KK - 1 ]->[_TYPE_] eq 'b' ); } # Set a flag if the 'Interrupted List Rule' will be applied # (see sub copy_old_breakpoints). # - Added check on has_broken_list to fix issue b1298 my $interrupted_list_rule = $ris_permanently_broken->{$seqno} && $ris_list_by_seqno->{$seqno} && !$rhas_broken_list->{$seqno} && !$rOpts_ignore_old_breakpoints; # NOTES: Since we are looking at old line numbers we have # to be very careful not to introduce an instability. # This following causes instability (b1288-b1296): # $interrupted_list_rule ||= # $rOpts_break_at_old_comma_breakpoints; # - We could turn off the interrupted list rule if there is # a broken sublist, to follow 'Compound List Rule 1'. # - We could use the _rhas_broken_list_ flag for this. # - But it seems safer not to do this, to avoid # instability, since the broken sublist could be # temporary. It seems better to let the formatting # stabilize by itself after one or two iterations. # - So, not doing this for now # Turn off the interrupted list rule if -vmll is set and a # list has '=>' characters. 
This avoids instabilities due # to dependence on old line breaks; issue b1325. if ( $interrupted_list_rule && $rOpts_variable_maximum_line_length ) { my $rtype_count = $rtype_count_by_seqno->{$seqno}; if ( $rtype_count && $rtype_count->{'=>'} ) { $interrupted_list_rule = 0; } } my $K_c = $K_closing_container->{$seqno}; # Add length of any terminal list item if interrupted # so that the result is the same as if the term is # in the next line (b1446). if ( $interrupted_list_rule && $KK < $K_terminal # The line should end in a comma # NOTE: this currently assumes break after comma. # As long as the other call to cumulative_length.. # makes the same assumption we should remain stable. && $rLL->[$K_terminal]->[_TYPE_] eq ',' ) { $max_prong_len = $self->cumulative_length_to_comma( $KK + 1, $K_terminal, $K_c ); } push @stack, [ $max_prong_len, $handle_len, $seqno, $iline, $KK, $K_c, $interrupted_list_rule ]; } #-------------------- # Exiting a container #-------------------- elsif ( $is_closing_token{$token} && @stack ) { # The current prong ends - get its handle my $item = pop @stack; my $handle_len = $item->[_handle_len_]; my $seqno_o = $item->[_seqno_o_]; my $iline_o = $item->[_iline_o_]; my $K_o = $item->[_K_o_]; my $K_c_expect = $item->[_K_c_]; my $collapsed_len = $max_prong_len; if ( $seqno_o ne $seqno ) { # This can happen if input file has brace errors. # Otherwise it shouldn't happen. Not fatal but -lp # formatting could get messed up. if ( DEVEL_MODE && !get_saw_brace_error() ) { Fault(<{$seqno}; if ($block_type) { my $K_c = $KK; my $block_length = MIN_BLOCK_LEN; my $is_one_line_block; my $level = $rLL->[$K_o]->[_LEVEL_]; if ( defined($K_o) && defined($K_c) ) { # note: fixed 3 May 2022 (removed 'my') $block_length = $rLL->[ $K_c - 1 ]->[_CUMULATIVE_LENGTH_] - $rLL->[$K_o]->[_CUMULATIVE_LENGTH_]; $is_one_line_block = $iline == $iline_o; } # Code block rule 1: Use the total block length if # it is less than the minimum. 
if ( $block_length < MIN_BLOCK_LEN ) { $collapsed_len = $block_length; } # Code block rule 2: Use the full length of a # one-line block to avoid breaking it, unless # extremely long. We do not need to do a precise # check here, because if it breaks then it will # stay broken on later iterations. elsif ( $is_one_line_block && $block_length < $maximum_line_length_at_level[$level] # But skip this for blocks types which can reform, # like sort/map/grep/eval blocks, to avoid # instability (b1345, b1428) && $self->is_fragile_block_type( $block_type, $seqno ) ) { $collapsed_len = $block_length; } # Code block rule 3: Otherwise the length should be # at least MIN_BLOCK_LEN to avoid scrunching code # blocks. elsif ( $collapsed_len < MIN_BLOCK_LEN ) { $collapsed_len = MIN_BLOCK_LEN; } } # Store the result. Some extra space, '2', allows for # length of an opening token, inside space, comma, ... # This constant has been tuned to give good overall # results. $collapsed_len += 2; $rcollapsed_length_by_seqno->{$seqno} = $collapsed_len; # Restart scanning the lower level prong if (@stack) { $max_prong_len = $stack[-1]->[_max_prong_len_]; $collapsed_len += $handle_len; if ( $collapsed_len > $max_prong_len ) { $max_prong_len = $collapsed_len; } } } # it is a ternary - no special processing for these yet else { } $len = 0; $last_nonblank_type = $type; next; } #---------------------------- # Handle non-container tokens #---------------------------- my $token_length = $rLL->[$KK]->[_TOKEN_LENGTH_]; # Count lengths of things like 'xx => yy' as a single item if ( $type eq '=>' ) { $len += $token_length + 1; if ( $len > $max_prong_len ) { $max_prong_len = $len } } elsif ( $last_nonblank_type eq '=>' ) { $len += $token_length; if ( $len > $max_prong_len ) { $max_prong_len = $len } # but only include one => per item $len = $token_length; } # include everything to end of line after a here target elsif ( $type eq 'h' ) { $len = $rLL->[$K_last]->[_CUMULATIVE_LENGTH_] - $rLL->[ $KK - 1 
]->[_CUMULATIVE_LENGTH_]; if ( $len > $max_prong_len ) { $max_prong_len = $len } } # for everything else just use the token length else { $len = $token_length; if ( $len > $max_prong_len ) { $max_prong_len = $len } } $last_nonblank_type = $type; } ## end loop over tokens on this line return; } ## end sub xlp_collapse_lengths_inner_loop } ## end closure xlp_collapsed_lengths sub is_excluded_lp { # Decide if this container is excluded by user request: # returns true if this token is excluded (i.e., may not use -lp) # returns false otherwise # The control hash can either describe: # what to exclude: $line_up_parentheses_control_is_lxpl = 1, or # what to include: $line_up_parentheses_control_is_lxpl = 0 # Input parameter: # $KK = index of the container opening token my ( $self, $KK ) = @_; my $rLL = $self->[_rLL_]; my $rtoken_vars = $rLL->[$KK]; my $token = $rtoken_vars->[_TOKEN_]; my $rflags = $line_up_parentheses_control_hash{$token}; #----------------------------------------------- # TEST #1: check match to listed container types #----------------------------------------------- if ( !defined($rflags) ) { # There is no entry for this container, so we are done return !$line_up_parentheses_control_is_lxpl; } my ( $flag1, $flag2 ) = @{$rflags}; #----------------------------------------------------------- # TEST #2: check match to flag1, the preceding nonblank word #----------------------------------------------------------- my $match_flag1 = !defined($flag1) || $flag1 eq '*'; if ( !$match_flag1 ) { # Find the previous token my ( $is_f, $is_k, $is_w ); my $Kp = $self->K_previous_nonblank($KK); if ( defined($Kp) ) { my $type_p = $rLL->[$Kp]->[_TYPE_]; my $seqno = $rtoken_vars->[_TYPE_SEQUENCE_]; # keyword? $is_k = $type_p eq 'k'; # function call? $is_f = $self->[_ris_function_call_paren_]->{$seqno}; # either keyword or function call? 
$is_w = $is_k || $is_f; } # Check for match based on flag1 and the previous token: if ( $flag1 eq 'k' ) { $match_flag1 = $is_k } elsif ( $flag1 eq 'K' ) { $match_flag1 = !$is_k } elsif ( $flag1 eq 'f' ) { $match_flag1 = $is_f } elsif ( $flag1 eq 'F' ) { $match_flag1 = !$is_f } elsif ( $flag1 eq 'w' ) { $match_flag1 = $is_w } elsif ( $flag1 eq 'W' ) { $match_flag1 = !$is_w } ## else { no match found } } # See if we can exclude this based on the flag1 test... if ($line_up_parentheses_control_is_lxpl) { return 1 if ($match_flag1); } else { return 1 if ( !$match_flag1 ); } #------------------------------------------------------------- # TEST #3: exclusion based on flag2 and the container contents #------------------------------------------------------------- # Note that this is an exclusion test for both -lpxl or -lpil input methods # The options are: # 0 or blank: ignore container contents # 1 exclude non-lists or lists with sublists # 2 same as 1 but also exclude lists with code blocks my $match_flag2; if ($flag2) { my $seqno = $rtoken_vars->[_TYPE_SEQUENCE_]; my $is_list = $self->[_ris_list_by_seqno_]->{$seqno}; my $has_list = $self->[_rhas_list_]->{$seqno}; my $has_code_block = $self->[_rhas_code_block_]->{$seqno}; my $has_ternary = $self->[_rhas_ternary_]->{$seqno}; if ( !$is_list || $has_list || $flag2 eq '2' && ( $has_code_block || $has_ternary ) ) { $match_flag2 = 1; } } return $match_flag2; } ## end sub is_excluded_lp sub set_excluded_lp_containers { my ($self) = @_; return unless ($rOpts_line_up_parentheses); my $rLL = $self->[_rLL_]; return unless ( defined($rLL) && @{$rLL} ); my $K_opening_container = $self->[_K_opening_container_]; my $ris_excluded_lp_container = $self->[_ris_excluded_lp_container_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; foreach my $seqno ( keys %{$K_opening_container} ) { # code blocks are always excluded by the -lp coding so we can skip them next if ( $rblock_type_of_seqno->{$seqno} ); my $KK = 
$K_opening_container->{$seqno}; next unless defined($KK); # see if a user exclusion rule turns off -lp for this container if ( $self->is_excluded_lp($KK) ) { $ris_excluded_lp_container->{$seqno} = 1; } } return; } ## end sub set_excluded_lp_containers ###################################### # CODE SECTION 6: Process line-by-line ###################################### sub process_all_lines { #---------------------------------------------------------- # Main loop to format all lines of a file according to type #---------------------------------------------------------- my $self = shift; my $rlines = $self->[_rlines_]; my $rOpts_keep_old_blank_lines = $rOpts->{'keep-old-blank-lines'}; my $file_writer_object = $self->[_file_writer_object_]; my $logger_object = $self->[_logger_object_]; my $vertical_aligner_object = $self->[_vertical_aligner_object_]; my $save_logfile = $self->[_save_logfile_]; # Flag to prevent blank lines when POD occurs in a format skipping sect. my $in_format_skipping_section; # set locations for blanks around long runs of keywords my $rwant_blank_line_after = $self->keyword_group_scan(); my $line_type = EMPTY_STRING; my $i_last_POD_END = -10; my $i = -1; foreach my $line_of_tokens ( @{$rlines} ) { # insert blank lines requested for keyword sequences if ( defined( $rwant_blank_line_after->{$i} ) && $rwant_blank_line_after->{$i} == 1 ) { $self->want_blank_line(); } $i++; my $last_line_type = $line_type; $line_type = $line_of_tokens->{_line_type}; my $input_line = $line_of_tokens->{_line_text}; # _line_type codes are: # SYSTEM - system-specific code before hash-bang line # CODE - line of perl code (including comments) # POD_START - line starting pod, such as '=head' # POD - pod documentation text # POD_END - last line of pod section, '=cut' # HERE - text of here-document # HERE_END - last line of here-doc (target word) # FORMAT - format section # FORMAT_END - last line of format section, '.' 
# SKIP - code skipping section # SKIP_END - last line of code skipping section, '#>>V' # DATA_START - __DATA__ line # DATA - unidentified text following __DATA__ # END_START - __END__ line # END - unidentified text following __END__ # ERROR - we are in big trouble, probably not a perl script # put a blank line after an =cut which comes before __END__ and __DATA__ # (required by podchecker) if ( $last_line_type eq 'POD_END' && !$self->[_saw_END_or_DATA_] ) { $i_last_POD_END = $i; $file_writer_object->reset_consecutive_blank_lines(); if ( !$in_format_skipping_section && $input_line !~ /^\s*$/ ) { $self->want_blank_line(); } } # handle line of code.. if ( $line_type eq 'CODE' ) { my $CODE_type = $line_of_tokens->{_code_type}; $in_format_skipping_section = $CODE_type eq 'FS'; # Handle blank lines if ( $CODE_type eq 'BL' ) { # Keep this blank? Start with the flag -kbl=n, where # n=0 ignore all old blank lines # n=1 stable: keep old blanks, but limited by -mbl=n # n=2 keep all old blank lines, regardless of -mbl=n # If n=0 we delete all old blank lines and let blank line # rules generate any needed blank lines. my $kgb_keep = $rOpts_keep_old_blank_lines; # Then delete lines requested by the keyword-group logic if # allowed if ( $kgb_keep == 1 && defined( $rwant_blank_line_after->{$i} ) && $rwant_blank_line_after->{$i} == 2 ) { $kgb_keep = 0; } # But always keep a blank line following an =cut if ( $i - $i_last_POD_END < 3 && !$kgb_keep ) { $kgb_keep = 1; } if ($kgb_keep) { $self->flush($CODE_type); $file_writer_object->write_blank_code_line( $rOpts_keep_old_blank_lines == 2 ); $self->[_last_line_leading_type_] = 'b'; } next; } else { # Let logger see all non-blank lines of code. This is a slow # operation so we avoid it if it is not going to be saved. 
if ( $save_logfile && $logger_object ) { $logger_object->black_box( $line_of_tokens, $vertical_aligner_object->get_output_line_number ); } } # Handle Format Skipping (FS) and Verbatim (VB) Lines if ( $CODE_type eq 'VB' || $CODE_type eq 'FS' ) { $self->write_unindented_line("$input_line"); $file_writer_object->reset_consecutive_blank_lines(); next; } # Handle all other lines of code $self->process_line_of_CODE($line_of_tokens); } # handle line of non-code.. else { # set special flags my $skip_line = 0; if ( substr( $line_type, 0, 3 ) eq 'POD' ) { # Pod docs should have a preceding blank line. But stay # out of __END__ and __DATA__ sections, because # the user may be using this section for any purpose whatsoever if ( $rOpts->{'delete-pod'} ) { $skip_line = 1; } if ( $rOpts->{'trim-pod'} ) { $input_line =~ s/\s+$// } if ( !$skip_line && !$in_format_skipping_section && $line_type eq 'POD_START' && !$self->[_saw_END_or_DATA_] ) { $self->want_blank_line(); } } # leave the blank counters in a predictable state # after __END__ or __DATA__ elsif ( $line_type eq 'END_START' || $line_type eq 'DATA_START' ) { $file_writer_object->reset_consecutive_blank_lines(); $self->[_saw_END_or_DATA_] = 1; } # Patch to avoid losing blank lines after a code-skipping block; # fixes case c047. 
elsif ( $line_type eq 'SKIP_END' ) { $file_writer_object->reset_consecutive_blank_lines(); } # write unindented non-code line if ( !$skip_line ) { $self->write_unindented_line($input_line); } } } return; } ## end sub process_all_lines { ## closure keyword_group_scan # this is the return var my $rhash_of_desires; # user option variables for -kgb my ( $rOpts_kgb_after, $rOpts_kgb_before, $rOpts_kgb_delete, $rOpts_kgb_inside, $rOpts_kgb_size_max, $rOpts_kgb_size_min, ); # group variables, initialized by kgb_initialize_group_vars my ( $ibeg, $iend, $count, $level_beg, $K_closing ); my ( @iblanks, @group, @subgroup ); # line variables, updated by sub keyword_group_scan my ( $line_type, $CODE_type, $K_first, $K_last ); my $number_of_groups_seen; #------------------------ # -kgb helper subroutines #------------------------ sub kgb_initialize_options { # check and initialize user options for -kgb # return error flag: # true for some input error, do not continue # false if ok # Local copies of the various control parameters $rOpts_kgb_after = $rOpts->{'keyword-group-blanks-after'}; # '-kgba' $rOpts_kgb_before = $rOpts->{'keyword-group-blanks-before'}; # '-kgbb' $rOpts_kgb_delete = $rOpts->{'keyword-group-blanks-delete'}; # '-kgbd' $rOpts_kgb_inside = $rOpts->{'keyword-group-blanks-inside'}; # '-kgbi' # A range of sizes can be input with decimal notation like 'min.max' # with any number of dots between the two numbers. Examples: # string => min max matches # 1.1 1 1 exactly 1 # 1.3 1 3 1,2, or 3 # 1..3 1 3 1,2, or 3 # 5 5 - 5 or more # 6. 
6 - 6 or more # .2 - 2 up to 2 # 1.0 1 0 nothing my $rOpts_kgb_size = $rOpts->{'keyword-group-blanks-size'}; # '-kgbs' ( $rOpts_kgb_size_min, $rOpts_kgb_size_max ) = split /\.+/, $rOpts_kgb_size; if ( $rOpts_kgb_size_min && $rOpts_kgb_size_min !~ /^\d+$/ || $rOpts_kgb_size_max && $rOpts_kgb_size_max !~ /^\d+$/ ) { Warn(<{'keyword-group-blanks-size'} = EMPTY_STRING; return $rhash_of_desires; } $rOpts_kgb_size_min = 1 unless ($rOpts_kgb_size_min); if ( $rOpts_kgb_size_max && $rOpts_kgb_size_max < $rOpts_kgb_size_min ) { return $rhash_of_desires; } # check codes for $rOpts_kgb_before and # $rOpts_kgb_after: # 0 = never (delete if exist) # 1 = stable (keep unchanged) # 2 = always (insert if missing) return $rhash_of_desires unless $rOpts_kgb_size_min > 0 && ( $rOpts_kgb_before != 1 || $rOpts_kgb_after != 1 || $rOpts_kgb_inside || $rOpts_kgb_delete ); return; } ## end sub kgb_initialize_options sub kgb_initialize_group_vars { # Definitions: # $ibeg = first line index of this entire group # $iend = last line index of this entire group # $count = total number of keywords seen in this entire group # $level_beg = indentation level of this group # @group = [ $i, $token, $count ] =list of all keywords & blanks # @subgroup = $j, index of group where token changes # @iblanks = line indexes of blank lines in input stream in this group # where i=starting line index # token (the keyword) # count = number of this token in this subgroup # j = index in group where token changes $ibeg = -1; $iend = undef; $level_beg = -1; $K_closing = undef; $count = 0; @group = (); @subgroup = (); @iblanks = (); return; } ## end sub kgb_initialize_group_vars sub kgb_initialize_line_vars { $CODE_type = EMPTY_STRING; $K_first = undef; $K_last = undef; $line_type = EMPTY_STRING; return; } ## end sub kgb_initialize_line_vars sub kgb_initialize { # initialize all closure variables for -kgb # return: # true to cause immediate exit (something is wrong) # false to continue ... 
all is okay # This is the return variable: $rhash_of_desires = {}; # initialize and check user options; my $quit = kgb_initialize_options(); if ($quit) { return $quit } # initialize variables for the current group and subgroups: kgb_initialize_group_vars(); # initialize variables for the most recently seen line: kgb_initialize_line_vars(); $number_of_groups_seen = 0; # all okay return; } ## end sub kgb_initialize sub kgb_insert_blank_after { my ($i) = @_; $rhash_of_desires->{$i} = 1; my $ip = $i + 1; if ( defined( $rhash_of_desires->{$ip} ) && $rhash_of_desires->{$ip} == 2 ) { $rhash_of_desires->{$ip} = 0; } return; } ## end sub kgb_insert_blank_after sub kgb_split_into_sub_groups { # place blanks around long sub-groups of keywords # ...if requested return unless ($rOpts_kgb_inside); # loop over sub-groups, index k push @subgroup, scalar @group; my $kbeg = 1; my $kend = @subgroup - 1; foreach my $k ( $kbeg .. $kend ) { # index j runs through all keywords found my $j_b = $subgroup[ $k - 1 ]; my $j_e = $subgroup[$k] - 1; # index i is the actual line number of a keyword my ( $i_b, $tok_b, $count_b ) = @{ $group[$j_b] }; my ( $i_e, $tok_e, $count_e ) = @{ $group[$j_e] }; my $num = $count_e - $count_b + 1; # This subgroup runs from line $ib to line $ie-1, but may contain # blank lines if ( $num >= $rOpts_kgb_size_min ) { # if there are blank lines, we require that at least $num lines # be non-blank up to the boundary with the next subgroup. 
                my $nog_b = my $nog_e = 1;
                if ( @iblanks && !$rOpts_kgb_delete ) {
                    my $j_bb = $j_b + $num - 1;
                    my ( $i_bb, $tok_bb, $count_bb ) = @{ $group[$j_bb] };
                    $nog_b = $count_bb - $count_b + 1 == $num;

                    my $j_ee = $j_e - ( $num - 1 );
                    my ( $i_ee, $tok_ee, $count_ee ) = @{ $group[$j_ee] };
                    $nog_e = $count_e - $count_ee + 1 == $num;
                }
                if ( $nog_b && $k > $kbeg ) {
                    kgb_insert_blank_after( $i_b - 1 );
                }
                if ( $nog_e && $k < $kend ) {
                    my ( $i_ep, $tok_ep, $count_ep ) =
                      @{ $group[ $j_e + 1 ] };
                    kgb_insert_blank_after( $i_ep - 1 );
                }
            }
        }
        return;
    } ## end sub kgb_split_into_sub_groups

    sub kgb_delete_if_blank {
        my ( $self, $i ) = @_;

        # delete line $i if it is blank
        my $rlines = $self->[_rlines_];
        return unless ( $i >= 0 && $i < @{$rlines} );
        return if ( $rlines->[$i]->{_line_type} ne 'CODE' );
        my $code_type = $rlines->[$i]->{_code_type};
        if ( $code_type eq 'BL' ) { $rhash_of_desires->{$i} = 2; }
        return;
    } ## end sub kgb_delete_if_blank

    sub kgb_delete_inner_blank_lines {

        # always remove unwanted trailing blank lines from our list
        return unless (@iblanks);
        while ( my $ibl = pop(@iblanks) ) {
            if ( $ibl < $iend ) { push @iblanks, $ibl; last }
            $iend = $ibl;
        }

        # now mark interior blank lines for deletion if requested
        return unless ($rOpts_kgb_delete);

        while ( my $ibl = pop(@iblanks) ) { $rhash_of_desires->{$ibl} = 2 }

        return;
    } ## end sub kgb_delete_inner_blank_lines

    sub kgb_end_group {

        # end a group of keywords
        my ( $self, $bad_ending ) = @_;
        if ( defined($ibeg) && $ibeg >= 0 ) {

            # then handle sufficiently large groups
            if ( $count >= $rOpts_kgb_size_min ) {

                $number_of_groups_seen++;

                # do any blank deletions regardless of the count
                kgb_delete_inner_blank_lines();

                my $rlines = $self->[_rlines_];
                if ( $ibeg > 0 ) {
                    my $code_type = $rlines->[ $ibeg - 1 ]->{_code_type};

                    # patch for hash bang line which is not currently marked as
                    # a comment; mark it as a comment
                    if ( $ibeg == 1 && !$code_type ) {
                        my $line_text = $rlines->[ $ibeg - 1 ]->{_line_text};
                        $code_type = 'BC'
                          if ( $line_text &&
$line_text =~ /^#/ ); } # Do not insert a blank after a comment # (this could be subject to a flag in the future) if ( $code_type !~ /(BC|SBC|SBCX)/ ) { if ( $rOpts_kgb_before == INSERT ) { kgb_insert_blank_after( $ibeg - 1 ); } elsif ( $rOpts_kgb_before == DELETE ) { $self->kgb_delete_if_blank( $ibeg - 1 ); } } } # We will only put blanks before code lines. We could loosen # this rule a little, but we have to be very careful because # for example we certainly don't want to drop a blank line # after a line like this: # my $var = <[_rLL_]; my $level = $rLL->[$K_first]->[_LEVEL_]; my $ci_level = $rLL->[$K_first]->[_CI_LEVEL_]; if ( $level == $level_beg && $ci_level == 0 && !$bad_ending && $iend < @{$rlines} && $CODE_type ne 'HSC' ) { if ( $rOpts_kgb_after == INSERT ) { kgb_insert_blank_after($iend); } elsif ( $rOpts_kgb_after == DELETE ) { $self->kgb_delete_if_blank( $iend + 1 ); } } } } kgb_split_into_sub_groups(); } # reset for another group kgb_initialize_group_vars(); return; } ## end sub kgb_end_group sub kgb_find_container_end { # If the keyword line is continued onto subsequent lines, find the # closing token '$K_closing' so that we can easily skip past the # contents of the container. 
# We only set this value if we find a simple list, meaning # -contents only one level deep # -not welded my ($self) = @_; # First check: skip if next line is not one deeper my $Knext_nonblank = $self->K_next_nonblank($K_last); return if ( !defined($Knext_nonblank) ); my $rLL = $self->[_rLL_]; my $level_next = $rLL->[$Knext_nonblank]->[_LEVEL_]; return if ( $level_next != $level_beg + 1 ); # Find the parent container of the first token on the next line my $parent_seqno = $self->parent_seqno_by_K($Knext_nonblank); return unless ( defined($parent_seqno) ); # Must not be a weld (can be unstable) return if ( $total_weld_count && $self->is_welded_at_seqno($parent_seqno) ); # Opening container must exist and be on this line my $Ko = $self->[_K_opening_container_]->{$parent_seqno}; return unless ( defined($Ko) && $Ko > $K_first && $Ko <= $K_last ); # Verify that the closing container exists and is on a later line my $Kc = $self->[_K_closing_container_]->{$parent_seqno}; return unless ( defined($Kc) && $Kc > $K_last ); # That's it $K_closing = $Kc; return; } ## end sub kgb_find_container_end sub kgb_add_to_group { my ( $self, $i, $token, $level ) = @_; # End the previous group if we have reached the maximum # group size if ( $rOpts_kgb_size_max && @group >= $rOpts_kgb_size_max ) { $self->kgb_end_group(); } if ( @group == 0 ) { $ibeg = $i; $level_beg = $level; $count = 0; } $count++; $iend = $i; # New sub-group? if ( !@group || $token ne $group[-1]->[1] ) { push @subgroup, scalar(@group); } push @group, [ $i, $token, $count ]; # remember if this line ends in an open container $self->kgb_find_container_end(); return; } ## end sub kgb_add_to_group #--------------------- # -kgb main subroutine #--------------------- sub keyword_group_scan { my $self = shift; # Called once per file to process --keyword-group-blanks-* parameters. 
# Task: # Manipulate blank lines around keyword groups (kgb* flags) # Scan all lines looking for runs of consecutive lines beginning with # selected keywords. Example keywords are 'my', 'our', 'local', ... but # they may be anything. We will set flags requesting that blanks be # inserted around and within them according to input parameters. Note # that we are scanning the lines as they came in in the input stream, so # they are not necessarily well formatted. # Returns: # The output of this sub is a return hash ref whose keys are the indexes # of lines after which we desire a blank line. For line index $i: # $rhash_of_desires->{$i} = 1 means we want a blank line AFTER line $i # $rhash_of_desires->{$i} = 2 means we want blank line $i removed # Nothing to do if no blanks can be output. This test added to fix # case b760. if ( !$rOpts_maximum_consecutive_blank_lines ) { return $rhash_of_desires; } #--------------- # initialization #--------------- my $quit = kgb_initialize(); if ($quit) { return $rhash_of_desires } my $rLL = $self->[_rLL_]; my $rlines = $self->[_rlines_]; $self->kgb_end_group(); my $i = -1; my $Opt_repeat_count = $rOpts->{'keyword-group-blanks-repeat-count'}; # '-kgbr' #---------------------------------- # loop over all lines of the source #---------------------------------- foreach my $line_of_tokens ( @{$rlines} ) { $i++; last if ( $Opt_repeat_count > 0 && $number_of_groups_seen >= $Opt_repeat_count ); kgb_initialize_line_vars(); $line_type = $line_of_tokens->{_line_type}; # always end a group at non-CODE if ( $line_type ne 'CODE' ) { $self->kgb_end_group(); next } $CODE_type = $line_of_tokens->{_code_type}; # end any group at a format skipping line if ( $CODE_type && $CODE_type eq 'FS' ) { $self->kgb_end_group(); next; } # continue in a verbatim (VB) type; it may be quoted text if ( $CODE_type eq 'VB' ) { if ( $ibeg >= 0 ) { $iend = $i; } next; } # and continue in blank (BL) types if ( $CODE_type eq 'BL' ) { if ( $ibeg >= 0 ) { $iend = $i; push 
@{iblanks}, $i; # propagate current subgroup token my $tok = $group[-1]->[1]; push @group, [ $i, $tok, $count ]; } next; } # examine the first token of this line my $rK_range = $line_of_tokens->{_rK_range}; ( $K_first, $K_last ) = @{$rK_range}; if ( !defined($K_first) ) { # Somewhat unexpected blank line.. # $rK_range is normally defined for line type CODE, but this can # happen for example if the input line was a single semicolon # which is being deleted. In that case there was code in the # input file but it is not being retained. So we can silently # return. return $rhash_of_desires; } my $level = $rLL->[$K_first]->[_LEVEL_]; my $type = $rLL->[$K_first]->[_TYPE_]; my $token = $rLL->[$K_first]->[_TOKEN_]; my $ci_level = $rLL->[$K_first]->[_CI_LEVEL_]; # End a group 'badly' at an unexpected level. This will prevent # blank lines being incorrectly placed after the end of the group. # We are looking for any deviation from two acceptable patterns: # PATTERN 1: a simple list; secondary lines are at level+1 # PATTERN 2: a long statement; all secondary lines same level # This was added as a fix for case b1177, in which a complex # structure got incorrectly inserted blank lines. if ( $ibeg >= 0 ) { # Check for deviation from PATTERN 1, simple list: if ( defined($K_closing) && $K_first < $K_closing ) { $self->kgb_end_group(1) if ( $level != $level_beg + 1 ); } # Check for deviation from PATTERN 2, single statement: elsif ( $level != $level_beg ) { $self->kgb_end_group(1) } } # Do not look for keywords in lists ( keyword 'my' can occur in # lists, see case b760); fixed for c048. if ( $self->is_list_by_K($K_first) ) { if ( $ibeg >= 0 ) { $iend = $i } next; } # see if this is a code type we seek (i.e. 
comment)
            if (   $CODE_type
                && $keyword_group_list_comment_pattern
                && $CODE_type =~ /$keyword_group_list_comment_pattern/ )
            {

                my $tok = $CODE_type;

                # Continuing a group
                if ( $ibeg >= 0 && $level == $level_beg ) {
                    $self->kgb_add_to_group( $i, $tok, $level );
                }

                # Start new group
                else {

                    # first end old group if any; we might be starting new
                    # keywords at different level
                    if ( $ibeg >= 0 ) { $self->kgb_end_group(); }
                    $self->kgb_add_to_group( $i, $tok, $level );
                }
                next;
            }

            # See if it is a keyword we seek, but never start a group in a
            # continuation line; the code may be badly formatted.
            if (   $ci_level == 0
                && $type eq 'k'
                && $token =~ /$keyword_group_list_pattern/ )
            {

                # Continuing a keyword group
                if ( $ibeg >= 0 && $level == $level_beg ) {
                    $self->kgb_add_to_group( $i, $token, $level );
                }

                # Start new keyword group
                else {

                    # first end old group if any; we might be starting new
                    # keywords at different level
                    if ( $ibeg >= 0 ) { $self->kgb_end_group(); }
                    $self->kgb_add_to_group( $i, $token, $level );
                }
                next;
            }

            # This is not one of our keywords, but we are in a keyword group
            # so see if we should continue or quit
            elsif ( $ibeg >= 0 ) {

                # - bail out on a large level change; we may have walked into a
                #   data structure or anonymous sub code.
                if ( $level > $level_beg + 1 || $level < $level_beg ) {
                    $self->kgb_end_group(1);
                    next;
                }

                # - keep going on a continuation line of the same level, since
                #   it is probably a continuation of our previous keyword,
                # - and keep going past hanging side comments because we never
                #   want to interrupt them.
                if ( ( ( $level == $level_beg ) && $ci_level > 0 )
                    || $CODE_type eq 'HSC' )
                {
                    $iend = $i;
                    next;
                }

                # - continue if we are within a container which started
                #   with the line of the previous keyword.
                if ( defined($K_closing) && $K_first <= $K_closing ) {

                    # continue if entire line is within container
                    if ( $K_last <= $K_closing ) { $iend = $i; next }

                    # continue at ); or }; or ];
                    my $KK = $K_closing + 1;
                    if ( $rLL->[$KK]->[_TYPE_] eq ';' ) {
                        if ( $KK < $K_last ) {
                            if ( $rLL->[ ++$KK ]->[_TYPE_] eq 'b' ) { ++$KK }
                            if (   $KK > $K_last
                                || $rLL->[$KK]->[_TYPE_] ne '#' )
                            {
                                $self->kgb_end_group(1);
                                next;
                            }
                        }
                        $iend = $i;
                        next;
                    }

                    $self->kgb_end_group(1);
                    next;
                }

                # - end the group if none of the above
                $self->kgb_end_group();
                next;
            }

            # not in a keyword group; continue
            else { next }
        } ## end of loop over all lines

        $self->kgb_end_group();
        return $rhash_of_desires;

    } ## end sub keyword_group_scan
} ## end closure keyword_group_scan

#######################################
# CODE SECTION 7: Process lines of code
#######################################

{    ## begin closure process_line_of_CODE

    # The routines in this closure receive lines of code and combine them into
    # 'batches' and send them along. A 'batch' is the unit of code which can be
    # processed further as a unit. It has the property that it is the largest
    # amount of code into which perltidy is free to place one or more
    # line breaks without violating any constraints.

    # When a new batch is formed it is sent to sub 'grind_batch_of_CODE'.
    # flags needed by the store routine
    my $line_of_tokens;
    my $no_internal_newlines;
    my $CODE_type;

    # range of K of tokens for the current line
    my ( $K_first, $K_last );

    my ( $rLL, $radjusted_levels, $rparent_of_seqno, $rdepth_of_opening_seqno,
        $rblock_type_of_seqno, $ri_starting_one_line_block );

    # past stored nonblank tokens and flags
    my (
        $K_last_nonblank_code,       $looking_for_else,
        $is_static_block_comment,    $last_CODE_type,
        $last_line_had_side_comment, $next_parent_seqno,
        $next_slevel,
    );

    # Called once at the start of a new file
    sub initialize_process_line_of_CODE {
        $K_last_nonblank_code       = undef;
        $looking_for_else           = 0;
        $is_static_block_comment    = 0;
        $last_line_had_side_comment = 0;
        $next_parent_seqno          = SEQ_ROOT;
        $next_slevel                = undef;
        return;
    } ## end sub initialize_process_line_of_CODE

    # Batch variables: these describe the current batch of code being formed
    # and sent down the pipeline. They are initialized in the next
    # sub.
    my (
        $rbrace_follower,   $index_start_one_line_block,
        $starting_in_quote, $ending_in_quote,
    );

    # Called before the start of each new batch
    sub initialize_batch_variables {

        # Initialize array values for a new batch. Any changes here must be
        # carefully coordinated with sub store_token_to_go.

        $max_index_to_go            = UNDEFINED_INDEX;
        $summed_lengths_to_go[0]    = 0;
        $nesting_depth_to_go[0]     = 0;
        $ri_starting_one_line_block = [];

        # Redefine some sparse arrays.
        # It is more efficient to redefine these sparse arrays and rely on
        # undef's instead of initializing to 0's. Testing showed that using
        # @array=() is more efficient than $#array=-1

        @old_breakpoint_to_go    = ();
        @forced_breakpoint_to_go = ();
        @block_type_to_go        = ();
        @mate_index_to_go        = ();
        @type_sequence_to_go     = ();

        # NOTE: @nobreak_to_go is sparse and could be treated this way, but
        # testing showed that there would be very little efficiency gain
        # because an 'if' test must be added in store_token_to_go.
# The initialization code for the remaining batch arrays is as follows # and can be activated for testing. But profiling shows that it is # time-consuming to re-initialize the batch arrays and is not necessary # because the maximum valid token, $max_index_to_go, is carefully # controlled. This means however that it is not possible to do any # type of filter or map operation directly on these arrays. And it is # not possible to use negative indexes. As a precaution against program # changes which might do this, sub pad_array_to_go adds some undefs at # the end of the current batch of data. ## 0 && do { #<<< ## @nobreak_to_go = (); ## @token_lengths_to_go = (); ## @levels_to_go = (); ## @ci_levels_to_go = (); ## @tokens_to_go = (); ## @K_to_go = (); ## @types_to_go = (); ## @leading_spaces_to_go = (); ## @reduced_spaces_to_go = (); ## @inext_to_go = (); ## @parent_seqno_to_go = (); ## }; $rbrace_follower = undef; $ending_in_quote = 0; $index_start_one_line_block = undef; # initialize forced breakpoint vars associated with each output batch $forced_breakpoint_count = 0; $index_max_forced_break = UNDEFINED_INDEX; $forced_breakpoint_undo_count = 0; return; } ## end sub initialize_batch_variables sub leading_spaces_to_go { # return the number of indentation spaces for a token in the output # stream my ($ii) = @_; return 0 if ( $ii < 0 ); my $indentation = $leading_spaces_to_go[$ii]; return ref($indentation) ? $indentation->get_spaces() : $indentation; } ## end sub leading_spaces_to_go sub create_one_line_block { # set index starting next one-line block # call with no args to delete the current one-line block ($index_start_one_line_block) = @_; return; } ## end sub create_one_line_block # Routine to place the current token into the output stream. # Called once per output token. 
use constant DEBUG_STORE => 0; sub store_token_to_go { my ( $self, $Ktoken_vars, $rtoken_vars ) = @_; #------------------------------------------------------- # Token storage utility for sub process_line_of_CODE. # Add one token to the next batch of '_to_go' variables. #------------------------------------------------------- # Input parameters: # $Ktoken_vars = the index K in the global token array # $rtoken_vars = $rLL->[$Ktoken_vars] = the corresponding token values # unless they are temporarily being overridden #------------------------------------------------------------------ # NOTE: called once per token so coding efficiency is critical here. # All changes need to be benchmarked with Devel::NYTProf. #------------------------------------------------------------------ my ( $type, $token, $ci_level, $level, $seqno, $length, ) = @{$rtoken_vars}[ _TYPE_, _TOKEN_, _CI_LEVEL_, _LEVEL_, _TYPE_SEQUENCE_, _TOKEN_LENGTH_, ]; # Check for emergency flush... # The K indexes in the batch must always be a continuous sequence of # the global token array. The batch process programming assumes this. # If storing this token would cause this relation to fail we must dump # the current batch before storing the new token. It is extremely rare # for this to happen. One known example is the following two-line # snippet when run with parameters # --noadd-newlines --space-terminal-semicolon: # if ( $_ =~ /PENCIL/ ) { $pencil_flag= 1 } ; ; # $yy=1; if ( $max_index_to_go >= 0 ) { if ( $Ktoken_vars != $K_to_go[$max_index_to_go] + 1 ) { $self->flush_batch_of_CODE(); } # Do not output consecutive blank tokens ... this should not # happen, but it is worth checking. Later code can then make the # simplifying assumption that blank tokens are not consecutive. 
            elsif ( $type eq 'b' && $types_to_go[$max_index_to_go] eq 'b' ) {

                if (DEVEL_MODE) {

                    # if this happens, it may be that consecutive blanks
                    # were inserted into the token stream in 'respace_tokens'
                    my $lno = $rLL->[$Ktoken_vars]->[_LINE_INDEX_] + 1;
                    Fault("consecutive blanks near line $lno; please fix");
                }
                return;
            }
        }

        # Do not start a batch with a blank token.
        # Fixes cases b149 b888 b984 b985 b986 b987
        else {
            if ( $type eq 'b' ) { return }
        }

        # Update counter and do initializations if first token of new batch
        if ( !++$max_index_to_go ) {

            # Reset flag '$starting_in_quote' for a new batch. It must be set
            # to the value of '$in_continued_quote', but here for efficiency we
            # set it to zero, which is its normal value. Then in coding below
            # we will change it if we find we are actually in a continued quote.
            $starting_in_quote = 0;

            # Update the next parent sequence number for each new batch.

            #----------------------------------------
            # Begin coding from sub parent_seqno_by_K
            #----------------------------------------

            # The following is equivalent to this call but much faster:
            #    $next_parent_seqno = $self->parent_seqno_by_K($Ktoken_vars);

            $next_parent_seqno = SEQ_ROOT;
            if ($seqno) {
                $next_parent_seqno = $rparent_of_seqno->{$seqno};
            }
            else {
                my $Kt = $rLL->[$Ktoken_vars]->[_KNEXT_SEQ_ITEM_];
                if ( defined($Kt) ) {
                    my $type_sequence_t = $rLL->[$Kt]->[_TYPE_SEQUENCE_];
                    my $type_t          = $rLL->[$Kt]->[_TYPE_];

                    # if next container token is closing, it is the parent seqno
                    if ( $is_closing_type{$type_t} ) {
                        $next_parent_seqno = $type_sequence_t;
                    }

                    # otherwise we want its parent container
                    else {
                        $next_parent_seqno =
                          $rparent_of_seqno->{$type_sequence_t};
                    }
                }
            }
            $next_parent_seqno = SEQ_ROOT
              unless ( defined($next_parent_seqno) );

            #--------------------------------------
            # End coding from sub parent_seqno_by_K
            #--------------------------------------

            $next_slevel =
              $rdepth_of_opening_seqno->[$next_parent_seqno] + 1;
        }

        # Clip levels to zero if there are level errors in the file.
# We had to wait until now for reasons explained in sub 'write_line'. if ( $level < 0 ) { $level = 0 } # Safety check that length is defined. This is slow and should not be # needed now, so just do it in DEVEL_MODE to check programming changes. # Formerly needed for --indent-only, in which the entire set of tokens # is normally turned into type 'q'. Lengths are now defined in sub # 'respace_tokens' so this check is no longer needed. if ( DEVEL_MODE && !defined($length) ) { my $lno = $rLL->[$Ktoken_vars]->[_LINE_INDEX_] + 1; $length = length($token); Fault(<{$seqno}; if ( $is_opening_token{$token} ) { my $slevel = $rdepth_of_opening_seqno->[$seqno]; $nesting_depth_to_go[$max_index_to_go] = $slevel; $next_slevel = $slevel + 1; $next_parent_seqno = $seqno; } elsif ( $is_closing_token{$token} ) { $next_slevel = $rdepth_of_opening_seqno->[$seqno]; my $slevel = $next_slevel + 1; $nesting_depth_to_go[$max_index_to_go] = $slevel; my $parent_seqno = $rparent_of_seqno->{$seqno}; $parent_seqno = SEQ_ROOT unless defined($parent_seqno); $parent_seqno_to_go[$max_index_to_go] = $parent_seqno; $next_parent_seqno = $parent_seqno; } else { # ternary token: nothing to do } } # Define the indentation that this token will have in two cases: # Without CI = reduced_spaces_to_go # With CI = leading_spaces_to_go if ( ( $Ktoken_vars == $K_first ) && $line_of_tokens->{_starting_in_quote} ) { # in a continued quote - correct value set above if first token if ( $max_index_to_go == 0 ) { $starting_in_quote = 1 } $leading_spaces_to_go[$max_index_to_go] = 0; $reduced_spaces_to_go[$max_index_to_go] = 0; } else { $leading_spaces_to_go[$max_index_to_go] = $reduced_spaces_to_go[$max_index_to_go] = $rOpts_indent_columns * $radjusted_levels->[$Ktoken_vars]; $leading_spaces_to_go[$max_index_to_go] += $rOpts_continuation_indentation * $ci_level if ($ci_level); } DEBUG_STORE && do { my ( $a, $b, $c ) = caller(); print STDOUT "STORE: from $a $c: storing token $token type $type lev=$level at 
$max_index_to_go\n"; }; return; } ## end sub store_token_to_go sub flush_batch_of_CODE { # Finish and process the current batch. # This must be the only call to grind_batch_of_CODE() my ($self) = @_; # If a batch has been started ... if ( $max_index_to_go >= 0 ) { # Create an array to hold variables for this batch my $this_batch = []; $this_batch->[_starting_in_quote_] = 1 if ($starting_in_quote); $this_batch->[_ending_in_quote_] = 1 if ($ending_in_quote); if ( $CODE_type || $last_CODE_type ) { $this_batch->[_batch_CODE_type_] = $K_to_go[$max_index_to_go] >= $K_first ? $CODE_type : $last_CODE_type; } $last_line_had_side_comment = ( $max_index_to_go > 0 && $types_to_go[$max_index_to_go] eq '#' ); # The flag $is_static_block_comment applies to the line which just # arrived. So it only applies if we are outputting that line. if ( $is_static_block_comment && !$last_line_had_side_comment ) { $this_batch->[_is_static_block_comment_] = $K_to_go[0] == $K_first; } $this_batch->[_ri_starting_one_line_block_] = $ri_starting_one_line_block; $self->[_this_batch_] = $this_batch; #------------------- # process this batch #------------------- $self->grind_batch_of_CODE(); # Done .. this batch is history $self->[_this_batch_] = undef; initialize_batch_variables(); } return; } ## end sub flush_batch_of_CODE sub end_batch { # End the current batch, EXCEPT for a few special cases my ($self) = @_; if ( $max_index_to_go < 0 ) { # nothing to do .. this is harmless but wastes time. if (DEVEL_MODE) { Fault("sub end_batch called with nothing to do; please fix\n"); } return; } # Exceptions when a line does not end with a comment... 
(fixes c058) if ( $types_to_go[$max_index_to_go] ne '#' ) { # Exception 1: Do not end line in a weld return if ( $total_weld_count && $self->[_rK_weld_right_]->{ $K_to_go[$max_index_to_go] } ); # Exception 2: just set a tentative breakpoint if we might be in a # one-line block if ( defined($index_start_one_line_block) ) { $self->set_forced_breakpoint($max_index_to_go); return; } } $self->flush_batch_of_CODE(); return; } ## end sub end_batch sub flush_vertical_aligner { my ($self) = @_; my $vao = $self->[_vertical_aligner_object_]; $vao->flush(); return; } ## end sub flush_vertical_aligner # flush is called to output any tokens in the pipeline, so that # an alternate source of lines can be written in the correct order sub flush { my ( $self, $CODE_type_flush ) = @_; # end the current batch with 1 exception $index_start_one_line_block = undef; # Exception: if we are flushing within the code stream only to insert # blank line(s), then we can keep the batch intact at a weld. This # improves formatting of -ce. See test 'ce1.ce' if ( $CODE_type_flush && $CODE_type_flush eq 'BL' ) { $self->end_batch() if ( $max_index_to_go >= 0 ); } # otherwise, we have to shut things down completely. else { $self->flush_batch_of_CODE() } $self->flush_vertical_aligner(); return; } ## end sub flush my %is_assignment_or_fat_comma; BEGIN { %is_assignment_or_fat_comma = %is_assignment; $is_assignment_or_fat_comma{'=>'} = 1; } sub process_line_of_CODE { my ( $self, $my_line_of_tokens ) = @_; #---------------------------------------------------------------- # This routine is called once per INPUT line to format all of the # tokens on that line. #---------------------------------------------------------------- # It outputs full-line comments and blank lines immediately. # For lines of code: # - Tokens are copied one-by-one from the global token # array $rLL to a set of '_to_go' arrays which collect batches of # tokens. This is done with calls to 'store_token_to_go'. 
# - A batch is closed and processed upon reaching a well defined # structural break point (i.e. code block boundary) or forced # breakpoint (i.e. side comment or special user controls). # - Subsequent stages of formatting make additional line breaks # appropriate for lists and logical structures, and as necessary to # keep line lengths below the requested maximum line length. #----------------------------------- # begin initialize closure variables #----------------------------------- $line_of_tokens = $my_line_of_tokens; my $rK_range = $line_of_tokens->{_rK_range}; if ( !defined( $rK_range->[0] ) ) { # Empty line: This can happen if tokens are deleted, for example # with the -mangle parameter return; } ( $K_first, $K_last ) = @{$rK_range}; $last_CODE_type = $CODE_type; $CODE_type = $line_of_tokens->{_code_type}; $rLL = $self->[_rLL_]; $radjusted_levels = $self->[_radjusted_levels_]; $rparent_of_seqno = $self->[_rparent_of_seqno_]; $rdepth_of_opening_seqno = $self->[_rdepth_of_opening_seqno_]; $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; #--------------------------------- # end initialize closure variables #--------------------------------- # This flag will become nobreak_to_go and should be set to 2 to prevent # a line break AFTER the current token. 
$no_internal_newlines = 0; if ( !$rOpts_add_newlines || $CODE_type eq 'NIN' ) { $no_internal_newlines = 2; } my $input_line = $line_of_tokens->{_line_text}; my ( $is_block_comment, $has_side_comment ); if ( $rLL->[$K_last]->[_TYPE_] eq '#' ) { if ( $K_last == $K_first ) { $is_block_comment = 1 } else { $has_side_comment = 1 } } my $is_static_block_comment_without_leading_space = $CODE_type eq 'SBCX'; $is_static_block_comment = $CODE_type eq 'SBC' || $is_static_block_comment_without_leading_space; # check for a $VERSION statement if ( $CODE_type eq 'VER' ) { $self->[_saw_VERSION_in_this_file_] = 1; $no_internal_newlines = 2; } # Add interline blank if any my $last_old_nonblank_type = "b"; my $first_new_nonblank_token = EMPTY_STRING; my $K_first_true = $K_first; if ( $max_index_to_go >= 0 ) { $last_old_nonblank_type = $types_to_go[$max_index_to_go]; $first_new_nonblank_token = $rLL->[$K_first]->[_TOKEN_]; if ( !$is_block_comment && $types_to_go[$max_index_to_go] ne 'b' && $K_first > 0 && $rLL->[ $K_first - 1 ]->[_TYPE_] eq 'b' ) { $K_first -= 1; } } my $rtok_first = $rLL->[$K_first]; my $in_quote = $line_of_tokens->{_ending_in_quote}; $ending_in_quote = $in_quote; #------------------------------------ # Handle a block (full-line) comment. #------------------------------------ if ($is_block_comment) { if ( $rOpts->{'delete-block-comments'} ) { $self->flush(); return; } $index_start_one_line_block = undef; $self->end_batch() if ( $max_index_to_go >= 0 ); # output a blank line before block comments if ( # unless we follow a blank or comment line $self->[_last_line_leading_type_] ne '#' && $self->[_last_line_leading_type_] ne 'b' # only if allowed && $rOpts->{'blanks-before-comments'} # if this is NOT an empty comment, unless it follows a side # comment and could become a hanging side comment. 
&& ( $rtok_first->[_TOKEN_] ne '#' || ( $last_line_had_side_comment && $rLL->[$K_first]->[_LEVEL_] > 0 ) ) # not after a short line ending in an opening token # because we already have space above this comment. # Note that the first comment in this if block, after # the 'if (', does not get a blank line because of this. && !$self->[_last_output_short_opening_token_] # never before static block comments && !$is_static_block_comment ) { $self->flush(); # switching to new output stream my $file_writer_object = $self->[_file_writer_object_]; $file_writer_object->write_blank_code_line(); $self->[_last_line_leading_type_] = 'b'; } if ( $rOpts->{'indent-block-comments'} && ( !$rOpts->{'indent-spaced-block-comments'} || $input_line =~ /^\s+/ ) && !$is_static_block_comment_without_leading_space ) { my $Ktoken_vars = $K_first; my $rtoken_vars = $rLL->[$Ktoken_vars]; $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); $self->end_batch(); } else { # switching to new output stream $self->flush(); # Note that last arg in call here is 'undef' for comments my $file_writer_object = $self->[_file_writer_object_]; $file_writer_object->write_code_line( $rtok_first->[_TOKEN_] . 
"\n", undef ); $self->[_last_line_leading_type_] = '#'; } return; } #-------------------------------------------- # Compare input/output indentation in logfile #-------------------------------------------- if ( $self->[_save_logfile_] ) { # Compare input/output indentation except for: # - hanging side comments # - continuation lines (have unknown leading blank space) # - and lines which are quotes (they may have been outdented) my $guessed_indentation_level = $line_of_tokens->{_guessed_indentation_level}; unless ( $CODE_type eq 'HSC' || $rtok_first->[_CI_LEVEL_] > 0 || $guessed_indentation_level == 0 && $rtok_first->[_TYPE_] eq 'Q' ) { my $input_line_number = $line_of_tokens->{_line_number}; $self->compare_indentation_levels( $K_first, $guessed_indentation_level, $input_line_number ); } } #----------------------------------------- # Handle a line marked as indentation-only #----------------------------------------- if ( $CODE_type eq 'IO' ) { $self->flush(); my $line = $input_line; # Fix for rt #125506 Unexpected string formating # in which leading space of a terminal quote was removed $line =~ s/\s+$//; $line =~ s/^\s+// unless ( $line_of_tokens->{_starting_in_quote} ); my $Ktoken_vars = $K_first; # We work with a copy of the token variables and change the # first token to be the entire line as a quote variable my $rtoken_vars = $rLL->[$Ktoken_vars]; $rtoken_vars = copy_token_as_type( $rtoken_vars, 'q', $line ); # Patch: length is not really important here but must be defined $rtoken_vars->[_TOKEN_LENGTH_] = length($line); $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); $self->end_batch(); return; } #--------------------------- # Handle all other lines ... #--------------------------- # If we just saw the end of an elsif block, write nag message # if we do not see another elseif or an else. 
        if ($looking_for_else) {

            ## /^(elsif|else)$/
            if ( !$is_elsif_else{ $rLL->[$K_first_true]->[_TOKEN_] } ) {
                write_logfile_entry("(No else block)\n");
            }
            $looking_for_else = 0;
        }

        # This is a good place to kill incomplete one-line blocks
        if ( $max_index_to_go >= 0 ) {

            # For -iob and -lp, mark essential old breakpoints.
            # Fixes b1021 b1023 b1034 b1048 b1049 b1050 b1056 b1058
            # See related code below.
            if ( $rOpts_ignore_old_breakpoints && $rOpts_line_up_parentheses ) {
                my $type_first = $rLL->[$K_first_true]->[_TYPE_];
                if ( $is_assignment_or_fat_comma{$type_first} ) {
                    $old_breakpoint_to_go[$max_index_to_go] = 1;
                }
            }

            if (

                # this check is needed for -mangle (for example rt125012)
                (
                       ( !$index_start_one_line_block )
                    && ( $last_old_nonblank_type eq ';' )
                    && ( $first_new_nonblank_token ne '}' )
                )

                # Patch for RT #98902. Honor request to break at old commas.
                || (   $rOpts_break_at_old_comma_breakpoints
                    && $last_old_nonblank_type eq ',' )
              )
            {
                $forced_breakpoint_to_go[$max_index_to_go] = 1
                  if ($rOpts_break_at_old_comma_breakpoints);
                $index_start_one_line_block = undef;
                $self->end_batch();
            }

            # Keep any requested breaks before this line. Note that we have to
            # use the original K_first because it may have been reduced above
            # to add a blank. The value of the flag is as follows:
            #   1 => hard break, flush the batch
            #   2 => soft break, set breakpoint and continue building the batch
            # added check on max_index_to_go for c177
            if (   $max_index_to_go >= 0
                && $self->[_rbreak_before_Kfirst_]->{$K_first_true} )
            {
                $index_start_one_line_block = undef;
                if ( $self->[_rbreak_before_Kfirst_]->{$K_first_true} == 2 ) {
                    $self->set_forced_breakpoint($max_index_to_go);
                }
                else {
                    $self->end_batch();
                }
            }
        }

        #--------------------------------------
        # loop to process the tokens one-by-one
        #--------------------------------------

        $self->process_line_inner_loop($has_side_comment);

        # if there is anything left in the output buffer ...
        if ( $max_index_to_go >= 0 ) {

            my $type       = $rLL->[$K_last]->[_TYPE_];
            my $break_flag = $self->[_rbreak_after_Klast_]->{$K_last};

            # we have to flush ..
            if (

                # if there is a side comment...
                $type eq '#'

                # if this line ends in a quote
                # NOTE: This is critically important for ensuring that quoted
                # lines do not get processed by things like -sot and -sct
                || $in_quote

                # if this is a VERSION statement
                || $CODE_type eq 'VER'

                # to keep a label at the end of a line
                || ( $type eq 'J' && $rOpts_break_after_labels != 2 )

                # if we have a hard break request
                || $break_flag && $break_flag != 2

                # if we are instructed to keep all old line breaks
                || !$rOpts->{'delete-old-newlines'}

                # if this is a line of the form 'use overload'. A break here in
                # the input file is a good break because it will allow the
                # operators which follow to be formatted well. Without this
                # break the formatting with -ci=4 -xci is poor, for example.

                #   use overload
                #     '+' => sub {
                #       print length $_[2], "\n";
                #       my ( $x, $y ) = _order(@_);
                #       Number::Roman->new( int $x + $y );
                #     },
                #     '-' => sub {
                #       my ( $x, $y ) = _order(@_);
                #       Number::Roman->new( int $x - $y );
                #     };
                || (   $max_index_to_go == 2
                    && $types_to_go[0] eq 'k'
                    && $tokens_to_go[0] eq 'use'
                    && $tokens_to_go[$max_index_to_go] eq 'overload' )
              )
            {
                $index_start_one_line_block = undef;
                $self->end_batch();
            }

            else {

                # Check for a soft break request
                if ( $break_flag && $break_flag == 2 ) {
                    $self->set_forced_breakpoint($max_index_to_go);
                }

                # mark old line breakpoints in current output stream
                if (
                    !$rOpts_ignore_old_breakpoints

                    # Mark essential old breakpoints if combination -iob -lp is
                    # used. These two options do not work well together, but
                    # we can avoid turning -iob off by ignoring -iob at certain
                    # essential line breaks. See also related code above.
# Fixes b1021 b1023 b1034 b1048 b1049 b1050 b1056 b1058 || ( $rOpts_line_up_parentheses && $is_assignment_or_fat_comma{$type} ) ) { $old_breakpoint_to_go[$max_index_to_go] = 1; } } } return; } ## end sub process_line_of_CODE sub process_line_inner_loop { my ( $self, $has_side_comment ) = @_; #-------------------------------------------------------------------- # Loop to move all tokens from one input line to a newly forming batch #-------------------------------------------------------------------- # Do not start a new batch with a blank space if ( $max_index_to_go < 0 && $rLL->[$K_first]->[_TYPE_] eq 'b' ) { $K_first++; } foreach my $Ktoken_vars ( $K_first .. $K_last ) { my $rtoken_vars = $rLL->[$Ktoken_vars]; #-------------- # handle blanks #-------------- if ( $rtoken_vars->[_TYPE_] eq 'b' ) { $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); next; } #------------------ # handle non-blanks #------------------ my $type = $rtoken_vars->[_TYPE_]; # If we are continuing after seeing a right curly brace, flush # buffer unless we see what we are looking for, as in # } else ... if ($rbrace_follower) { my $token = $rtoken_vars->[_TOKEN_]; unless ( $rbrace_follower->{$token} ) { $self->end_batch() if ( $max_index_to_go >= 0 ); } $rbrace_follower = undef; } my ( $block_type, $type_sequence, $is_opening_BLOCK, $is_closing_BLOCK, $nobreak_BEFORE_BLOCK ); if ( $rtoken_vars->[_TYPE_SEQUENCE_] ) { my $token = $rtoken_vars->[_TOKEN_]; $type_sequence = $rtoken_vars->[_TYPE_SEQUENCE_]; $block_type = $rblock_type_of_seqno->{$type_sequence}; if ( $block_type && $token eq $type && $block_type ne 't' && !$self->[_rshort_nested_]->{$type_sequence} ) { if ( $type eq '{' ) { $is_opening_BLOCK = 1; $nobreak_BEFORE_BLOCK = $no_internal_newlines; } elsif ( $type eq '}' ) { $is_closing_BLOCK = 1; $nobreak_BEFORE_BLOCK = $no_internal_newlines; } } } #--------------------- # handle side comments #--------------------- if ($has_side_comment) { # if at last token ... 
if ( $Ktoken_vars == $K_last ) { $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); next; } # if before last token ... do not allow breaks which would # promote a side comment to a block comment elsif ($Ktoken_vars == $K_last - 1 || $Ktoken_vars == $K_last - 2 && $rLL->[ $K_last - 1 ]->[_TYPE_] eq 'b' ) { $no_internal_newlines = 2; } } # Process non-blank and non-comment tokens ... #----------------- # handle semicolon #----------------- if ( $type eq ';' ) { my $next_nonblank_token_type = 'b'; my $next_nonblank_token = EMPTY_STRING; if ( $Ktoken_vars < $K_last ) { my $Knnb = $Ktoken_vars + 1; $Knnb++ if ( $rLL->[$Knnb]->[_TYPE_] eq 'b' ); $next_nonblank_token = $rLL->[$Knnb]->[_TOKEN_]; $next_nonblank_token_type = $rLL->[$Knnb]->[_TYPE_]; } if ( $rOpts_break_at_old_semicolon_breakpoints && ( $Ktoken_vars == $K_first ) && $max_index_to_go >= 0 && !defined($index_start_one_line_block) ) { $self->end_batch(); } $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); $self->end_batch() unless ( $no_internal_newlines || ( $rOpts_keep_interior_semicolons && $Ktoken_vars < $K_last ) || ( $next_nonblank_token eq '}' ) ); } #----------- # handle '{' #----------- elsif ($is_opening_BLOCK) { # Tentatively output this token. This is required before # calling starting_one_line_block. We may have to unstore # it, though, if we have to break before it. $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); # Look ahead to see if we might form a one-line block.. 
my $too_long = $self->starting_one_line_block( $Ktoken_vars, $K_last_nonblank_code, $K_last ); $self->clear_breakpoint_undo_stack(); # to simplify the logic below, set a flag to indicate if # this opening brace is far from the keyword which introduces it my $keyword_on_same_line = 1; if ( $max_index_to_go >= 0 && defined($K_last_nonblank_code) && $rLL->[$K_last_nonblank_code]->[_TYPE_] eq ')' && ( ( $rtoken_vars->[_LEVEL_] < $levels_to_go[0] ) || $too_long ) ) { $keyword_on_same_line = 0; } # Break before '{' if requested with -bl or -bli flag my $want_break = $self->[_rbrace_left_]->{$type_sequence}; # But do not break if this token is welded to the left if ( $total_weld_count && defined( $self->[_rK_weld_left_]->{$Ktoken_vars} ) ) { $want_break = 0; } # Break BEFORE an opening '{' ... if ( # if requested $want_break # and we were unable to start looking for a block, && !defined($index_start_one_line_block) # or if it will not be on same line as its keyword, so that # it will be outdented (eval.t, overload.t), and the user # has not insisted on keeping it on the right || ( !$keyword_on_same_line && !$rOpts_opening_brace_always_on_right ) ) { # but only if allowed unless ($nobreak_BEFORE_BLOCK) { # since we already stored this token, we must unstore it $self->unstore_token_to_go(); # then output the line $self->end_batch() if ( $max_index_to_go >= 0 ); # and now store this token at the start of a new line $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); } } # now output this line $self->end_batch() if ( $max_index_to_go >= 0 && !$no_internal_newlines ); } #----------- # handle '}' #----------- elsif ($is_closing_BLOCK) { my $next_nonblank_token_type = 'b'; my $next_nonblank_token = EMPTY_STRING; my $Knnb; if ( $Ktoken_vars < $K_last ) { $Knnb = $Ktoken_vars + 1; $Knnb++ if ( $rLL->[$Knnb]->[_TYPE_] eq 'b' ); $next_nonblank_token = $rLL->[$Knnb]->[_TOKEN_]; $next_nonblank_token_type = $rLL->[$Knnb]->[_TYPE_]; } # If there is a pending one-line block .. 
if ( defined($index_start_one_line_block) ) { # Fix for b1208: if a side comment follows this closing # brace then we must include its length in the length test # ... unless the -issl flag is set (fixes b1307-1309). # Assume a minimum of 1 blank space to the comment. my $added_length = 0; if ( $has_side_comment && !$rOpts_ignore_side_comment_lengths && $next_nonblank_token_type eq '#' ) { $added_length = 1 + $rLL->[$K_last]->[_TOKEN_LENGTH_]; } # we have to terminate it if.. if ( # it is too long (final length may be different from # initial estimate). note: must allow 1 space for this # token $self->excess_line_length( $index_start_one_line_block, $max_index_to_go ) + $added_length >= 0 ) { $index_start_one_line_block = undef; } } # put a break before this closing curly brace if appropriate $self->end_batch() if ( $max_index_to_go >= 0 && !$nobreak_BEFORE_BLOCK && !defined($index_start_one_line_block) ); # store the closing curly brace $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); # ok, we just stored a closing curly brace. Often, but # not always, we want to end the line immediately. # So now we have to check for special cases. # if this '}' successfully ends a one-line block.. my $one_line_block_type = EMPTY_STRING; my $keep_going; if ( defined($index_start_one_line_block) ) { # Remember the type of token just before the # opening brace. It would be more general to use # a stack, but this will work for one-line blocks. $one_line_block_type = $types_to_go[$index_start_one_line_block]; # we have to actually make it by removing tentative # breaks that were set within it $self->undo_forced_breakpoint_stack(0); # For -lp, extend the nobreak to include a trailing # terminal ','. This is because the -lp indentation was # not known when making one-line blocks, so we may be able # to move the line back to fit. Otherwise we may create a # needlessly stranded comma on the next line. 
my $iend_nobreak = $max_index_to_go - 1; if ( $rOpts_line_up_parentheses && $next_nonblank_token_type eq ',' && $Knnb eq $K_last ) { my $p_seqno = $parent_seqno_to_go[$max_index_to_go]; my $is_excluded = $self->[_ris_excluded_lp_container_]->{$p_seqno}; $iend_nobreak = $max_index_to_go if ( !$is_excluded ); } $self->set_nobreaks( $index_start_one_line_block, $iend_nobreak ); # save starting block indexes so that sub correct_lp can # check and adjust -lp indentation (c098) push @{$ri_starting_one_line_block}, $index_start_one_line_block; # then re-initialize for the next one-line block $index_start_one_line_block = undef; # then decide if we want to break after the '}' .. # We will keep going to allow certain brace followers as in: # do { $ifclosed = 1; last } unless $losing; # # But make a line break if the curly ends a # significant block: if ( ( $is_block_without_semicolon{$block_type} # Follow users break point for # one line block types U & G, such as a 'try' block || $one_line_block_type =~ /^[UG]$/ && $Ktoken_vars == $K_last ) # if needless semicolon follows we handle it later && $next_nonblank_token ne ';' ) { $self->end_batch() unless ($no_internal_newlines); } } # set string indicating what we need to look for brace follower # tokens if ( $is_if_unless_elsif_else{$block_type} ) { $rbrace_follower = undef; } elsif ( $block_type eq 'do' ) { $rbrace_follower = \%is_do_follower; if ( $self->tight_paren_follows( $K_to_go[0], $Ktoken_vars ) ) { $rbrace_follower = { ')' => 1 }; } } # added eval for borris.t elsif ($is_sort_map_grep_eval{$block_type} || $one_line_block_type eq 'G' ) { $rbrace_follower = undef; $keep_going = 1; } # anonymous sub elsif ( $self->[_ris_asub_block_]->{$type_sequence} ) { if ($one_line_block_type) { $rbrace_follower = \%is_anon_sub_1_brace_follower; # Exceptions to help keep -lp intact, see git #74 ... 
# Exception 1: followed by '}' on this line if ( $Ktoken_vars < $K_last && $next_nonblank_token eq '}' ) { $rbrace_follower = undef; $keep_going = 1; } # Exception 2: followed by '}' on next line if -lp set. # The -lp requirement allows the formatting to follow # old breaks when -lp is not used, minimizing changes. # Fixes issue c087. elsif ($Ktoken_vars == $K_last && $rOpts_line_up_parentheses ) { my $K_closing_container = $self->[_K_closing_container_]; my $p_seqno = $parent_seqno_to_go[$max_index_to_go]; my $Kc = $K_closing_container->{$p_seqno}; my $is_excluded = $self->[_ris_excluded_lp_container_]->{$p_seqno}; $keep_going = ( defined($Kc) && $rLL->[$Kc]->[_TOKEN_] eq '}' && !$is_excluded && $Kc - $Ktoken_vars <= 2 ); $rbrace_follower = undef if ($keep_going); } } else { $rbrace_follower = \%is_anon_sub_brace_follower; } } # None of the above: specify what can follow a closing # brace of a block which is not an # if/elsif/else/do/sort/map/grep/eval # Testfiles: # 'Toolbar.pm', 'Menubar.pm', bless.t, '3rules.pl', 'break1.t else { $rbrace_follower = \%is_other_brace_follower; } # See if an elsif block is followed by another elsif or else; # complain if not. if ( $block_type eq 'elsif' ) { if ( $next_nonblank_token_type eq 'b' ) { # end of line? $looking_for_else = 1; # ok, check on next line } else { ## /^(elsif|else)$/ if ( !$is_elsif_else{$next_nonblank_token} ) { write_logfile_entry("No else block :(\n"); } } } # keep going after certain block types (map,sort,grep,eval) # added eval for borris.t if ($keep_going) { # keep going $rbrace_follower = undef; } # if no more tokens, postpone decision until re-entering elsif ( ( $next_nonblank_token_type eq 'b' ) && $rOpts_add_newlines ) { unless ($rbrace_follower) { $self->end_batch() unless ( $no_internal_newlines || $max_index_to_go < 0 ); } } elsif ($rbrace_follower) { if ( $rbrace_follower->{$next_nonblank_token} ) { # Fix for b1385: keep break after a comma following a # 'do' block. 
This could also be used for other block # types, but that would cause a significant change in # existing formatting without much benefit. if ( $next_nonblank_token eq ',' && $Knnb eq $K_last && $block_type eq 'do' && $rOpts_add_newlines && $self->is_trailing_comma($Knnb) ) { $self->[_rbreak_after_Klast_]->{$K_last} = 1; } } else { $self->end_batch() unless ( $no_internal_newlines || $max_index_to_go < 0 ); } $rbrace_follower = undef; } else { $self->end_batch() unless ( $no_internal_newlines || $max_index_to_go < 0 ); } } ## end treatment of closing block token #------------------------------ # handle here_doc target string #------------------------------ elsif ( $type eq 'h' ) { # no newlines after seeing here-target $no_internal_newlines = 2; $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); } #----------------------------- # handle all other token types #----------------------------- else { $self->store_token_to_go( $Ktoken_vars, $rtoken_vars ); # break after a label if requested if ( $rOpts_break_after_labels && $type eq 'J' && $rOpts_break_after_labels == 1 ) { $self->end_batch() unless ($no_internal_newlines); } } # remember previous nonblank, non-comment OUTPUT token $K_last_nonblank_code = $Ktoken_vars; } ## end of loop over all tokens in this line return; } ## end sub process_line_inner_loop } ## end closure process_line_of_CODE sub is_trailing_comma { my ( $self, $KK ) = @_; # Given: # $KK - index of a comma in token list # Return: # true if the comma at index $KK is a trailing comma # false if not my $rLL = $self->[_rLL_]; my $type_KK = $rLL->[$KK]->[_TYPE_]; if ( $type_KK ne ',' ) { DEVEL_MODE && Fault("Bad call: expected type ',' but received '$type_KK'\n"); return; } my $Knnb = $self->K_next_nonblank($KK); if ( defined($Knnb) ) { my $type_sequence = $rLL->[$Knnb]->[_TYPE_SEQUENCE_]; my $type_Knnb = $rLL->[$Knnb]->[_TYPE_]; if ( $type_sequence && $is_closing_type{$type_Knnb} ) { return 1; } } return; } ## end sub is_trailing_comma sub 
tight_paren_follows { my ( $self, $K_to_go_0, $K_ic ) = @_; # Input parameters: # $K_to_go_0 = first token index K of this output batch (=K_to_go[0]) # $K_ic = index of the closing do brace (=K_to_go[$max_index_to_go]) # Return parameter: # false if we want a break after the closing do brace # true if we do not want a break after the closing do brace # We are at the closing brace of a 'do' block. See if this brace is # followed by a closing paren, and if so, set a flag which indicates # that we do not want a line break between the '}' and ')'. # xxxxx ( ...... do { ... } ) { # ^-------looking at this brace, K_ic # Subscript notation: # _i = inner container (braces in this case) # _o = outer container (parens in this case) # _io = inner opening = '{' # _ic = inner closing = '}' # _oo = outer opening = '(' # _oc = outer closing = ')' # |--K_oo |--K_oc = outer container # xxxxx ( ...... do { ...... } ) { # |--K_io |--K_ic = inner container # In general, the safe thing to do is return a 'false' value # if the statement appears to be complex. This will have # the downstream side-effect of opening up outer containers # to help make complex code readable. But for simpler # do blocks it can be preferable to keep the code compact # by returning a 'true' value. return unless defined($K_ic); my $rLL = $self->[_rLL_]; # we should only be called at a closing block my $seqno_i = $rLL->[$K_ic]->[_TYPE_SEQUENCE_]; return unless ($seqno_i); # shouldn't happen; # This only applies if the next nonblank is a ')' my $K_oc = $self->K_next_nonblank($K_ic); return unless defined($K_oc); my $token_next = $rLL->[$K_oc]->[_TOKEN_]; return unless ( $token_next eq ')' ); my $seqno_o = $rLL->[$K_oc]->[_TYPE_SEQUENCE_]; my $K_io = $self->[_K_opening_container_]->{$seqno_i}; my $K_oo = $self->[_K_opening_container_]->{$seqno_o}; return unless ( defined($K_io) && defined($K_oo) ); # RULE 1: Do not break before a closing signature paren # (regardless of complexity). This is a fix for issue git#22. 
# Looking for something like: # sub xxx ( ... do { ... } ) { # ^----- next block_type my $K_test = $self->K_next_nonblank($K_oc); if ( defined($K_test) && $rLL->[$K_test]->[_TYPE_] eq '{' ) { my $seqno_test = $rLL->[$K_test]->[_TYPE_SEQUENCE_]; if ($seqno_test) { if ( $self->[_ris_asub_block_]->{$seqno_test} || $self->[_ris_sub_block_]->{$seqno_test} ) { return 1; } } } # RULE 2: Break if the contents within braces appears to be 'complex'. We # base this decision on the number of tokens between braces. # xxxxx ( ... do { ... } ) { # ^^^^^^ # Although very simple, it has the advantages of (1) being insensitive to # changes in lengths of identifier names, (2) easy to understand, implement # and test. A test case for this is 't/snippets/long_line.in'. # Example: $K_ic - $K_oo = 9 [Pass Rule 2] # if ( do { $2 !~ /&/ } ) { ... } # Example: $K_ic - $K_oo = 10 [Pass Rule 2] # for ( split /\s*={70,}\s*/, do { local $/; }) { ... } # Example: $K_ic - $K_oo = 20 [Fail Rule 2] # test_zero_args( "do-returned list slice", do { ( 10, 11 )[ 2, 3 ]; }); return if ( $K_ic - $K_io > 16 ); # RULE 3: break if the code between the opening '(' and the '{' is 'complex' # As with the previous rule, we decide based on the token count # xxxxx ( ... do { ... } ) { # ^^^^^^^^ # Example: $K_ic - $K_oo = 9 [Pass Rule 2] # $K_io - $K_oo = 4 [Pass Rule 3] # if ( do { $2 !~ /&/ } ) { ... } # Example: $K_ic - $K_oo = 10 [Pass rule 2] # $K_io - $K_oo = 9 [Pass rule 3] # for ( split /\s*={70,}\s*/, do { local $/; }) { ... } return if ( $K_io - $K_oo > 9 ); # RULE 4: Break if we have already broken this batch of output tokens return if ( $K_oo < $K_to_go_0 ); # RULE 5: Break if input is not on one line # For example, we will set the flag for the following expression # written in one line: # This has: $K_ic - $K_oo = 10 [Pass rule 2] # $K_io - $K_oo = 8 [Pass rule 3] # $self->debug( 'Error: ' . 
do { local $/; <$err> } ); # but we break after the brace if it is on multiple lines on input, since # the user may prefer it on multiple lines: # [Fail rule 5] # $self->debug( # 'Error: ' . do { local $/; <$err> } # ); if ( !$rOpts_ignore_old_breakpoints ) { my $iline_oo = $rLL->[$K_oo]->[_LINE_INDEX_]; my $iline_oc = $rLL->[$K_oc]->[_LINE_INDEX_]; return if ( $iline_oo != $iline_oc ); } # OK to keep the paren tight return 1; } ## end sub tight_paren_follows my %is_brace_semicolon_colon; BEGIN { my @q = qw( { } ; : ); @is_brace_semicolon_colon{@q} = (1) x scalar(@q); } sub starting_one_line_block { # After seeing an opening curly brace, look for the closing brace and see # if the entire block will fit on a line. This routine is not always right # so a check is made later (at the closing brace) to make sure we really # have a one-line block. We have to do this preliminary check, though, # because otherwise we would always break at a semicolon within a one-line # block if the block contains multiple statements. # Given: # $Kj = index of opening brace # $K_last_nonblank = index of previous nonblank code token # $K_last = index of last token of input line # Calls 'create_one_line_block' if one-line block might be formed. # Also returns a flag '$too_long': # true = distance from opening keyword to OPENING brace exceeds # the maximum line length. # false (simple return) => not too long # Note that this flag is for distance from the statement start to the # OPENING brace, not the closing brace. 
my ( $self, $Kj, $K_last_nonblank, $K_last ) = @_; my $rbreak_container = $self->[_rbreak_container_]; my $rshort_nested = $self->[_rshort_nested_]; my $rLL = $self->[_rLL_]; my $K_opening_container = $self->[_K_opening_container_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; # kill any current block - we can only go 1 deep create_one_line_block(); my $i_start = 0; # This routine should not have been called if there are no tokens in the # 'to_go' arrays of previously stored tokens. A previous call to # 'store_token_to_go' should have stored an opening brace. An error here # indicates that a programming change may have caused a flush operation to # clean out the previously stored tokens. if ( !defined($max_index_to_go) || $max_index_to_go < 0 ) { Fault("program bug: store_token_to_go called incorrectly\n") if (DEVEL_MODE); return; } # Return if block should be broken my $type_sequence_j = $rLL->[$Kj]->[_TYPE_SEQUENCE_]; if ( $rbreak_container->{$type_sequence_j} ) { return; } my $ris_bli_container = $self->[_ris_bli_container_]; my $is_bli = $ris_bli_container->{$type_sequence_j}; my $block_type = $rblock_type_of_seqno->{$type_sequence_j}; $block_type = EMPTY_STRING unless ( defined($block_type) ); my $previous_nonblank_token = EMPTY_STRING; my $i_last_nonblank = -1; if ( defined($K_last_nonblank) ) { $i_last_nonblank = $K_last_nonblank - $K_to_go[0]; if ( $i_last_nonblank >= 0 ) { $previous_nonblank_token = $rLL->[$K_last_nonblank]->[_TOKEN_]; } } #--------------------------------------------------------------------- # find the starting keyword for this block (such as 'if', 'else', ...) 
#--------------------------------------------------------------------- if ( $max_index_to_go == 0 ##|| $block_type =~ /^[\{\}\;\:]$/ || $is_brace_semicolon_colon{$block_type} || substr( $block_type, 0, 7 ) eq 'package' ) { $i_start = $max_index_to_go; } # the previous nonblank token should start these block types elsif ( $i_last_nonblank >= 0 && ( $previous_nonblank_token eq $block_type || $self->[_ris_asub_block_]->{$type_sequence_j} || $self->[_ris_sub_block_]->{$type_sequence_j} || substr( $block_type, -2, 2 ) eq '()' ) ) { $i_start = $i_last_nonblank; # For signatures and extended syntax ... # If this brace follows a parenthesized list, we should look back to # find the keyword before the opening paren because otherwise we might # form a one line block which stays intact, and cause the parenthesized # expression to break open. That looks bad. if ( $tokens_to_go[$i_start] eq ')' ) { # Find the opening paren my $K_start = $K_to_go[$i_start]; return unless defined($K_start); my $seqno = $type_sequence_to_go[$i_start]; return unless ($seqno); my $K_opening = $K_opening_container->{$seqno}; return unless defined($K_opening); my $i_opening = $i_start + ( $K_opening - $K_start ); # give up if not on this line return unless ( $i_opening >= 0 ); $i_start = $i_opening; # go back one token before the opening paren if ( $i_start > 0 ) { $i_start-- } if ( $types_to_go[$i_start] eq 'b' && $i_start > 0 ) { $i_start--; } my $lev = $levels_to_go[$i_start]; if ( $lev > $rLL->[$Kj]->[_LEVEL_] ) { return } } } elsif ( $previous_nonblank_token eq ')' ) { # For something like "if (xxx) {", the keyword "if" will be # just after the most recent break. This will be 0 unless # we have just killed a one-line block and are starting another. # (doif.t) # Note: cannot use inext_index_to_go[] here because that array # is still being constructed. 
        $i_start = $index_max_forced_break + 1;
        if ( $types_to_go[$i_start] eq 'b' ) {
            $i_start++;
        }

        # Patch to avoid breaking short blocks defined with extended_syntax:
        # Strip off any trailing () which was added in the parser to mark
        # the opening keyword.  For example, in the following
        #    create( TypeFoo $e) {$bubba}
        # the blocktype would be marked as create()
        my $stripped_block_type = $block_type;
        if ( substr( $block_type, -2, 2 ) eq '()' ) {
            $stripped_block_type = substr( $block_type, 0, -2 );
        }
        unless ( $tokens_to_go[$i_start] eq $stripped_block_type ) {
            return;
        }
    }

    # patch for SWITCH/CASE to retain one-line case/when blocks
    elsif ( $block_type eq 'case' || $block_type eq 'when' ) {

        # Note: cannot use inext_index_to_go[] here because that array
        # is still being constructed.
        $i_start = $index_max_forced_break + 1;
        if ( $types_to_go[$i_start] eq 'b' ) {
            $i_start++;
        }
        unless ( $tokens_to_go[$i_start] eq $block_type ) {
            return;
        }
    }
    else {

        #-------------------------------------------
        # Couldn't find start - return too_long flag
        #-------------------------------------------
        return 1;
    }

    my $pos = total_line_length( $i_start, $max_index_to_go ) - 1;

    my $maximum_line_length =
      $maximum_line_length_at_level[ $levels_to_go[$i_start] ];

    # see if distance to the opening container is too great to even start
    if ( $pos > $maximum_line_length ) {

        #------------------------------
        # too long to the opening token
        #------------------------------
        return 1;
    }

    #-----------------------------------------------------------------------
    # OK so far: the statement is not too long just to the OPENING token. Now
    # see if everything to the closing token will fit on one line
    #-----------------------------------------------------------------------

    # This is part of an update to fix cases b562 ..
b983 my $K_closing = $self->[_K_closing_container_]->{$type_sequence_j}; return unless ( defined($K_closing) ); my $container_length = $rLL->[$K_closing]->[_CUMULATIVE_LENGTH_] - $rLL->[$Kj]->[_CUMULATIVE_LENGTH_]; my $excess = $pos + 1 + $container_length - $maximum_line_length; # Add a small tolerance for welded tokens (case b901) if ( $total_weld_count && $self->is_welded_at_seqno($type_sequence_j) ) { $excess += 2; } if ( $excess > 0 ) { # line is too long... there is no chance of forming a one line block # if the excess is more than 1 char return if ( $excess > 1 ); # ... and give up if it is not a one-line block on input. # note: for a one-line block on input, it may be possible to keep # it as a one-line block (by removing a needless semicolon ). my $K_start = $K_to_go[$i_start]; my $ldiff = $rLL->[$K_closing]->[_LINE_INDEX_] - $rLL->[$K_start]->[_LINE_INDEX_]; return if ($ldiff); } #------------------------------------------------------------------ # Loop to check contents and length of the potential one-line block #------------------------------------------------------------------ foreach my $Ki ( $Kj + 1 .. $K_last ) { # old whitespace could be arbitrarily large, so don't use it if ( $rLL->[$Ki]->[_TYPE_] eq 'b' ) { $pos += 1 } else { $pos += $rLL->[$Ki]->[_TOKEN_LENGTH_] } # ignore some small blocks my $type_sequence_i = $rLL->[$Ki]->[_TYPE_SEQUENCE_]; my $nobreak = $rshort_nested->{$type_sequence_i}; # Return false result if we exceed the maximum line length, if ( $pos > $maximum_line_length ) { return; } # keep going for non-containers elsif ( !$type_sequence_i ) { } # return if we encounter another opening brace before finding the # closing brace. elsif ($rLL->[$Ki]->[_TOKEN_] eq '{' && $rLL->[$Ki]->[_TYPE_] eq '{' && $rblock_type_of_seqno->{$type_sequence_i} && !$nobreak ) { return; } # if we find our closing brace.. 
        elsif ($rLL->[$Ki]->[_TOKEN_] eq '}'
            && $rLL->[$Ki]->[_TYPE_] eq '}'
            && $rblock_type_of_seqno->{$type_sequence_i}
            && !$nobreak )
        {

            # be sure any trailing comment also fits on the line
            my $Ki_nonblank = $Ki;
            if ( $Ki_nonblank < $K_last ) {
                $Ki_nonblank++;
                if (   $rLL->[$Ki_nonblank]->[_TYPE_] eq 'b'
                    && $Ki_nonblank < $K_last )
                {
                    $Ki_nonblank++;
                }
            }

            # Patch for one-line sort/map/grep/eval blocks with side comments:
            # We will ignore the side comment length for sort/map/grep/eval
            # because this can lead to statements which change every time
            # perltidy is run.  Here is an example from Denis Moskowitz which
            # oscillates between these two states without this patch:

            ## --------
            ## grep { $_->foo ne 'bar' }    # asdfa asdf asdf asdf asdf asdf asdf asdf asdf asdf asdf
            ##   @baz;
            ##
            ## grep {
            ##     $_->foo ne 'bar'
            ##   }    # asdfa asdf asdf asdf asdf asdf asdf asdf asdf asdf asdf
            ##   @baz;
            ## --------

            # When the first line is input it gets broken apart by the main
            # line break logic in sub process_line_of_CODE.
            # When the second line is input it gets recombined by
            # process_line_of_CODE and passed to the output routines.  The
            # output routines (break_long_lines) do not break it apart
            # because the bond strengths are set to the highest possible value
            # for grep/map/eval/sort blocks, so the first version gets output.
            # It would be possible to fix this by changing bond strengths,
            # but they are high to prevent errors in older versions of perl.
            # See c100 for eval test.
            if (   $Ki < $K_last
                && $rLL->[$K_last]->[_TYPE_] eq '#'
                && $rLL->[$K_last]->[_LEVEL_] == $rLL->[$Ki]->[_LEVEL_]
                && !$rOpts_ignore_side_comment_lengths
                && !$is_sort_map_grep_eval{$block_type}
                && $K_last - $Ki_nonblank <= 2 )
            {

                # Only include the side comment for if/else/elsif/unless if it
                # immediately follows (because the current '$rbrace_follower'
                # logic for these will give an immediate break after these
                # closing braces).  So for example a line like this
                #     if (...) { ... } ; # very long comment......
# will already break like this: # if (...) { ... } # ; # very long comment...... # so we do not need to include the length of the comment, which # would break the block. Project 'bioperl' has coding like this. ## !~ /^(if|else|elsif|unless)$/ if ( !$is_if_unless_elsif_else{$block_type} || $K_last == $Ki_nonblank ) { $Ki_nonblank = $K_last; $pos += $rLL->[$Ki_nonblank]->[_TOKEN_LENGTH_]; if ( $Ki_nonblank > $Ki + 1 ) { # source whitespace could be anything, assume # at least one space before the hash on output if ( $rLL->[ $Ki + 1 ]->[_TYPE_] eq 'b' ) { $pos += 1; } else { $pos += $rLL->[ $Ki + 1 ]->[_TOKEN_LENGTH_] } } if ( $pos >= $maximum_line_length ) { return; } } } #-------------------------- # ok, it's a one-line block #-------------------------- create_one_line_block($i_start); return; } # just keep going for other characters else { } } #-------------------------------------------------- # End Loop to examine tokens in potential one-block #-------------------------------------------------- # We haven't hit the closing brace, but there is still space. So the # question here is, should we keep going to look at more lines in hopes of # forming a new one-line block, or should we stop right now. The problem # with continuing is that we will not be able to honor breaks before the # opening brace if we continue. # Typically we will want to keep trying to make one-line blocks for things # like sort/map/grep/eval. But it is not always a good idea to make as # many one-line blocks as possible, so other types are not done. The user # can always use -mangle. # If we want to keep going, we will create a new one-line block. # The blocks which we can keep going are in a hash, but we never want # to continue if we are at a '-bli' block. if ( $want_one_line_block{$block_type} && !$is_bli ) { my $rtype_count = $self->[_rtype_count_by_seqno_]->{$type_sequence_j}; my $semicolon_count = $rtype_count && $rtype_count->{';'} ? 
$rtype_count->{';'} : 0; # Ignore a terminal semicolon in the count if ( $semicolon_count <= 2 ) { my $K_closing_container = $self->[_K_closing_container_]; my $K_closing_j = $K_closing_container->{$type_sequence_j}; my $Kp = $self->K_previous_nonblank($K_closing_j); if ( defined($Kp) && $rLL->[$Kp]->[_TYPE_] eq ';' ) { $semicolon_count -= 1; } } if ( $semicolon_count <= 0 ) { create_one_line_block($i_start); } elsif ( $semicolon_count == 1 && $block_type eq 'eval' ) { # Mark short broken eval blocks for possible later use in # avoiding adding spaces before a 'package' line. This is not # essential but helps keep newer and older formatting the same. $self->[_ris_short_broken_eval_block_]->{$type_sequence_j} = 1; } } return; } ## end sub starting_one_line_block sub unstore_token_to_go { # remove most recent token from output stream my $self = shift; if ( $max_index_to_go > 0 ) { $max_index_to_go--; } else { $max_index_to_go = UNDEFINED_INDEX; } return; } ## end sub unstore_token_to_go sub compare_indentation_levels { # Check to see if output line tabbing agrees with input line # this can be very useful for debugging a script which has an extra # or missing brace. 
my ( $self, $K_first, $guessed_indentation_level, $line_number ) = @_; return unless ( defined($K_first) ); my $rLL = $self->[_rLL_]; # ignore a line with a leading blank token - issue c195 my $type = $rLL->[$K_first]->[_TYPE_]; return if ( $type eq 'b' ); my $structural_indentation_level = $self->[_radjusted_levels_]->[$K_first]; # record max structural depth for log file if ( $structural_indentation_level > $self->[_maximum_BLOCK_level_] ) { $self->[_maximum_BLOCK_level_] = $structural_indentation_level; $self->[_maximum_BLOCK_level_at_line_] = $line_number; } my $type_sequence = $rLL->[$K_first]->[_TYPE_SEQUENCE_]; my $is_closing_block = $type_sequence && $self->[_rblock_type_of_seqno_]->{$type_sequence} && $type eq '}'; if ( $guessed_indentation_level ne $structural_indentation_level ) { $self->[_last_tabbing_disagreement_] = $line_number; if ($is_closing_block) { if ( !$self->[_in_brace_tabbing_disagreement_] ) { $self->[_in_brace_tabbing_disagreement_] = $line_number; } if ( !$self->[_first_brace_tabbing_disagreement_] ) { $self->[_first_brace_tabbing_disagreement_] = $line_number; } } if ( !$self->[_in_tabbing_disagreement_] ) { $self->[_tabbing_disagreement_count_]++; if ( $self->[_tabbing_disagreement_count_] <= MAX_NAG_MESSAGES ) { write_logfile_entry( "Start indentation disagreement: input=$guessed_indentation_level; output=$structural_indentation_level\n" ); } $self->[_in_tabbing_disagreement_] = $line_number; $self->[_first_tabbing_disagreement_] = $line_number unless ( $self->[_first_tabbing_disagreement_] ); } } else { $self->[_in_brace_tabbing_disagreement_] = 0 if ($is_closing_block); my $in_tabbing_disagreement = $self->[_in_tabbing_disagreement_]; if ($in_tabbing_disagreement) { if ( $self->[_tabbing_disagreement_count_] <= MAX_NAG_MESSAGES ) { write_logfile_entry( "End indentation disagreement from input line $in_tabbing_disagreement\n" ); if ( $self->[_tabbing_disagreement_count_] == MAX_NAG_MESSAGES ) { write_logfile_entry( "No further tabbing 
disagreements will be noted\n"); } } $self->[_in_tabbing_disagreement_] = 0; } } return; } ## end sub compare_indentation_levels ################################################### # CODE SECTION 8: Utilities for setting breakpoints ################################################### { ## begin closure set_forced_breakpoint my @forced_breakpoint_undo_stack; # These are global vars for efficiency: # my $forced_breakpoint_count; # my $forced_breakpoint_undo_count; # my $index_max_forced_break; # Break before or after certain tokens based on user settings my %break_before_or_after_token; BEGIN { # Updated to use all operators. This fixes case b1054 # Here is the previous simplified version: ## my @q = qw( . : ? and or xor && || ); my @q = @all_operators; push @q, ','; @break_before_or_after_token{@q} = (1) x scalar(@q); } ## end BEGIN sub set_fake_breakpoint { # Just bump up the breakpoint count as a signal that there are breaks. # This is useful if we have breaks but may want to postpone deciding # where to make them. $forced_breakpoint_count++; return; } ## end sub set_fake_breakpoint use constant DEBUG_FORCE => 0; sub set_forced_breakpoint { my ( $self, $i ) = @_; # Set a breakpoint AFTER the token at index $i in the _to_go arrays. # Exceptions: # - If the token at index $i is a blank, backup to $i-1 to # get to the previous nonblank token. # - For certain tokens, the break may be placed BEFORE the token # at index $i, depending on user break preference settings. # - If a break is made after an opening token, then a break will # also be made before the corresponding closing token. # Returns '$i_nonblank': # = index of the token after which the breakpoint was actually placed # = undef if breakpoint was not set. my $i_nonblank; if ( !defined($i) || $i < 0 ) { # Calls with bad index $i are harmless but waste time and should # be caught and eliminated during code development. 
if (DEVEL_MODE) { my ( $a, $b, $c ) = caller(); Fault( "Bad call to forced breakpoint from $a $b $c ; called with i=$i; please fix\n" ); } return; } # Break after token $i $i_nonblank = $self->set_forced_breakpoint_AFTER($i); # If we break at an opening container..break at the closing my $set_closing; if ( defined($i_nonblank) && $is_opening_sequence_token{ $tokens_to_go[$i_nonblank] } ) { $set_closing = 1; $self->set_closing_breakpoint($i_nonblank); } DEBUG_FORCE && do { my ( $a, $b, $c ) = caller(); my $msg = "FORCE $forced_breakpoint_count after call from $a $c with i=$i max=$max_index_to_go"; if ( !defined($i_nonblank) ) { $i = EMPTY_STRING unless defined($i); $msg .= " but could not set break after i='$i'\n"; } else { my $nobr = $nobreak_to_go[$i_nonblank]; $nobr = 0 if ( !defined($nobr) ); $msg .= <= 0 ); # Back up at a blank so we have a token to examine. # This was added to fix for cases like b932 involving an '=' break. if ( $i > 0 && $types_to_go[$i] eq 'b' ) { $i-- } # Never break between welded tokens return if ( $total_weld_count && $self->[_rK_weld_right_]->{ $K_to_go[$i] } ); my $token = $tokens_to_go[$i]; my $type = $types_to_go[$i]; # For certain tokens, use user settings to decide if we break before or # after it if ( $break_before_or_after_token{$token} && ( $type eq $token || $type eq 'k' ) ) { if ( $want_break_before{$token} && $i >= 0 ) { $i-- } } # breaks are forced before 'if' and 'unless' elsif ( $is_if_unless{$token} && $type eq 'k' ) { $i-- } if ( $i >= 0 && $i <= $max_index_to_go ) { my $i_nonblank = ( $types_to_go[$i] ne 'b' ) ? 
              $i : $i - 1;

            if (   $i_nonblank >= 0
                && !$nobreak_to_go[$i_nonblank]
                && !$forced_breakpoint_to_go[$i_nonblank] )
            {
                $forced_breakpoint_to_go[$i_nonblank] = 1;

                if ( $i_nonblank > $index_max_forced_break ) {
                    $index_max_forced_break = $i_nonblank;
                }
                $forced_breakpoint_count++;
                $forced_breakpoint_undo_stack[ $forced_breakpoint_undo_count++ ]
                  = $i_nonblank;

                # success
                return $i_nonblank;
            }
        }
        return;
    } ## end sub set_forced_breakpoint_AFTER

    sub clear_breakpoint_undo_stack {
        my ($self) = @_;
        $forced_breakpoint_undo_count = 0;
        return;
    }

    use constant DEBUG_UNDOBP => 0;

    sub undo_forced_breakpoint_stack {

        my ( $self, $i_start ) = @_;

        # Given $i_start, a non-negative index into the 'undo stack' of
        # breakpoints, remove all breakpoints from the top of the 'undo stack'
        # down to and including index $i_start.

        # The 'undo stack' is a stack of all breakpoints made for a batch of
        # code.

        if ( $i_start < 0 ) {
            $i_start = 0;
            my ( $a, $b, $c ) = caller();

            # Bad call, can only be due to a recent programming change.
            Fault(
"Program Bug: undo_forced_breakpoint_stack from $a $c has bad i=$i_start "
            ) if (DEVEL_MODE);
            return;
        }

        while ( $forced_breakpoint_undo_count > $i_start ) {
            my $i =
              $forced_breakpoint_undo_stack[ --$forced_breakpoint_undo_count ];
            if ( $i >= 0 && $i <= $max_index_to_go ) {
                $forced_breakpoint_to_go[$i] = 0;
                $forced_breakpoint_count--;

                DEBUG_UNDOBP && do {
                    my ( $a, $b, $c ) = caller();
                    print STDOUT
"UNDOBP: undo forced_breakpoint i=$i $forced_breakpoint_undo_count from $a $c max=$max_index_to_go\n";
                };
            }

            # shouldn't happen, but not a critical error
            else {
                if (DEVEL_MODE) {
                    my ( $a, $b, $c ) = caller();
                    Fault(< $i_break + 2 ) {

                # break before } ] and ), but sub set_forced_breakpoint will decide
                # to break before or after a ? and :
                my $inc = ( $tokens_to_go[$i_break] eq '?' ) ?
0 : 1; $self->set_forced_breakpoint_AFTER( $mate_index_to_go[$i_break] - $inc ); } } else { my $type_sequence = $type_sequence_to_go[$i_break]; if ($type_sequence) { $postponed_breakpoint{$type_sequence} = 1; } } return; } ## end sub set_closing_breakpoint } ## end closure set_closing_breakpoint ######################################### # CODE SECTION 9: Process batches of code ######################################### { ## begin closure grind_batch_of_CODE # The routines in this closure begin the processing of a 'batch' of code. # A variable to keep track of consecutive nonblank lines so that we can # insert occasional blanks my @nonblank_lines_at_depth; # A variable to remember maximum size of previous batches; this is needed # by the logical padding routine my $peak_batch_size; my $batch_count; # variables to keep track of indentation of unmatched containers. my %saved_opening_indentation; sub initialize_grind_batch_of_CODE { @nonblank_lines_at_depth = (); $peak_batch_size = 0; $batch_count = 0; %saved_opening_indentation = (); return; } ## end sub initialize_grind_batch_of_CODE # sub grind_batch_of_CODE receives sections of code which are the longest # possible lines without a break. In other words, it receives what is left # after applying all breaks forced by blank lines, block comments, side # comments, pod text, and structural braces. Its job is to break this code # down into smaller pieces, if necessary, which fit within the maximum # allowed line length. Then it sends the resulting lines of code on down # the pipeline to the VerticalAligner package, breaking the code into # continuation lines as necessary. The batch of tokens are in the "to_go" # arrays. The name 'grind' is slightly suggestive of a machine continually # breaking down long lines of code, but mainly it is unique and easy to # remember and find with an editor search. 
# The two routines 'process_line_of_CODE' and 'grind_batch_of_CODE' work # together in the following way: # - 'process_line_of_CODE' receives the original INPUT lines one-by-one and # combines them into the largest sequences of tokens which might form a new # line. # - 'grind_batch_of_CODE' determines which tokens will form the OUTPUT # lines. # So sub 'process_line_of_CODE' builds up the longest possible continuous # sequences of tokens, regardless of line length, and then # grind_batch_of_CODE breaks these sequences back down into the new output # lines. # Sub 'grind_batch_of_CODE' ships its output lines to the vertical aligner. use constant DEBUG_GRIND => 0; sub check_grind_input { # Check for valid input to sub grind_batch_of_CODE. An error here # would most likely be due to an error in 'sub store_token_to_go'. my ($self) = @_; # Be sure there are tokens in the batch if ( $max_index_to_go < 0 ) { Fault(<[_Klimit_]; # The local batch tokens must be a continuous part of the global token # array. my $KK; foreach my $ii ( 0 .. $max_index_to_go ) { my $Km = $KK; $KK = $K_to_go[$ii]; if ( !defined($KK) || $KK < 0 || $KK > $Klimit ) { $KK = '(undef)' unless defined($KK); Fault(< 0 && $KK != $Km + 1 ) { my $im = $ii - 1; Fault(< #; push @q, ','; @quick_filter{@q} = (1) x scalar(@q); } sub grind_batch_of_CODE { my ($self) = @_; #----------------------------------------------------------------- # This sub directs the formatting of one complete batch of tokens. # The tokens of the batch are in the '_to_go' arrays. #----------------------------------------------------------------- my $this_batch = $self->[_this_batch_]; $this_batch->[_peak_batch_size_] = $peak_batch_size; $this_batch->[_batch_count_] = ++$batch_count; $self->check_grind_input() if (DEVEL_MODE); # This routine is only called from sub flush_batch_of_code, so that # routine is a better spot for debugging. 
DEBUG_GRIND && do { my $token = my $type = EMPTY_STRING; if ( $max_index_to_go >= 0 ) { $token = $tokens_to_go[$max_index_to_go]; $type = $types_to_go[$max_index_to_go]; } my $output_str = EMPTY_STRING; if ( $max_index_to_go > 20 ) { my $mm = $max_index_to_go - 10; $output_str = join( EMPTY_STRING, @tokens_to_go[ 0 .. 10 ] ) . " ... " . join( EMPTY_STRING, @tokens_to_go[ $mm .. $max_index_to_go ] ); } else { $output_str = join EMPTY_STRING, @tokens_to_go[ 0 .. $max_index_to_go ]; } print STDERR <= 0 && $types_to_go[$max_index_to_go] eq 'b' ) { $max_index_to_go -= 1; } return if ( $max_index_to_go < 0 ); if ($rOpts_line_up_parentheses) { $self->set_lp_indentation(); } #-------------------------------------------------- # Shortcut for block comments # Note that this shortcut does not work for -lp yet #-------------------------------------------------- elsif ( !$max_index_to_go && $types_to_go[0] eq '#' ) { my $ibeg = 0; $this_batch->[_ri_first_] = [$ibeg]; $this_batch->[_ri_last_] = [$ibeg]; $self->convey_batch_to_vertical_aligner(); my $level = $levels_to_go[$ibeg]; $self->[_last_line_leading_type_] = $types_to_go[$ibeg]; $self->[_last_line_leading_level_] = $level; $nonblank_lines_at_depth[$level] = 1; return; } #------------- # Normal route #------------- my $rLL = $self->[_rLL_]; #------------------------------------------------------- # Loop over the batch to initialize some batch variables #------------------------------------------------------- my $comma_count_in_batch = 0; my @colon_list; my @ix_seqno_controlling_ci; my %comma_arrow_count; my $comma_arrow_count_contained = 0; my @unmatched_closing_indexes_in_this_batch; my @unmatched_opening_indexes_in_this_batch; my @i_for_semicolon; foreach my $i ( 0 .. $max_index_to_go ) { if ( $types_to_go[$i] eq 'b' ) { $inext_to_go[$i] = $inext_to_go[ $i - 1 ] = $i + 1; next; } $inext_to_go[$i] = $i + 1; # This is an optional shortcut to save a bit of time by skipping # most tokens. 
Note: the filter may need to be updated if the # next 'if' tests are ever changed to include more token types. next if ( !$quick_filter{ $types_to_go[$i] } ); my $type = $types_to_go[$i]; # gather info needed by sub break_long_lines if ( $type_sequence_to_go[$i] ) { my $seqno = $type_sequence_to_go[$i]; my $token = $tokens_to_go[$i]; # remember indexes of any tokens controlling xci # in this batch. This list is needed by sub undo_ci. if ( $self->[_ris_seqno_controlling_ci_]->{$seqno} ) { push @ix_seqno_controlling_ci, $i; } if ( $is_opening_sequence_token{$token} ) { if ( $self->[_rbreak_container_]->{$seqno} ) { $self->set_forced_breakpoint($i); } push @unmatched_opening_indexes_in_this_batch, $i; if ( $type eq '?' ) { push @colon_list, $type; } } elsif ( $is_closing_sequence_token{$token} ) { if ( $i > 0 && $self->[_rbreak_container_]->{$seqno} ) { $self->set_forced_breakpoint( $i - 1 ); } my $i_mate = pop @unmatched_opening_indexes_in_this_batch; if ( defined($i_mate) && $i_mate >= 0 ) { if ( $type_sequence_to_go[$i_mate] == $type_sequence_to_go[$i] ) { $mate_index_to_go[$i] = $i_mate; $mate_index_to_go[$i_mate] = $i; my $cac = $comma_arrow_count{$seqno}; $comma_arrow_count_contained += $cac if ($cac); } else { push @unmatched_opening_indexes_in_this_batch, $i_mate; push @unmatched_closing_indexes_in_this_batch, $i; } } else { push @unmatched_closing_indexes_in_this_batch, $i; } if ( $type eq ':' ) { push @colon_list, $type; } } ## end elsif ( $is_closing_sequence_token...) } ## end if ($seqno) elsif ( $type eq ',' ) { $comma_count_in_batch++; } elsif ( $type eq '=>' ) { if (@unmatched_opening_indexes_in_this_batch) { my $j = $unmatched_opening_indexes_in_this_batch[-1]; my $seqno = $type_sequence_to_go[$j]; $comma_arrow_count{$seqno}++; } } elsif ( $type eq 'f' ) { push @i_for_semicolon, $i; } } ## end for ( my $i = 0 ; $i <=...) 
# Break at a single interior C-style for semicolon in this batch (c154) if ( @i_for_semicolon && @i_for_semicolon == 1 ) { my $i = $i_for_semicolon[0]; my $inext = $inext_to_go[$i]; if ( $inext <= $max_index_to_go && $types_to_go[$inext] ne '#' ) { $self->set_forced_breakpoint($i); } } my $is_unbalanced_batch = @unmatched_opening_indexes_in_this_batch + @unmatched_closing_indexes_in_this_batch; if (@unmatched_opening_indexes_in_this_batch) { $this_batch->[_runmatched_opening_indexes_] = \@unmatched_opening_indexes_in_this_batch; } if (@ix_seqno_controlling_ci) { $this_batch->[_rix_seqno_controlling_ci_] = \@ix_seqno_controlling_ci; } #------------------------ # Set special breakpoints #------------------------ # If this line ends in a code block brace, set breaks at any # previous closing code block braces to breakup a chain of code # blocks on one line. This is very rare but can happen for # user-defined subs. For example we might be looking at this: # BOOL { $server_data{uptime} > 0; } NUM { $server_data{load}; } STR { my $saw_good_break; # flag to force breaks even if short line if ( # looking for opening or closing block brace $block_type_to_go[$max_index_to_go] # never any good breaks if just one token && $max_index_to_go > 0 # but not one of these which are never duplicated on a line: # until|while|for|if|elsif|else && !$is_block_without_semicolon{ $block_type_to_go[$max_index_to_go] } ) { my $lev = $nesting_depth_to_go[$max_index_to_go]; # Walk backwards from the end and # set break at any closing block braces at the same level. # But quit if we are not in a chain of blocks. foreach my $i ( reverse( 0 .. 
$max_index_to_go - 1 ) ) { last if ( $levels_to_go[$i] < $lev ); # stop at a lower level next if ( $levels_to_go[$i] > $lev ); # skip past higher level if ( $block_type_to_go[$i] ) { if ( $tokens_to_go[$i] eq '}' ) { $self->set_forced_breakpoint($i); $saw_good_break = 1; } } # quit if we see anything besides words, function, blanks # at this level elsif ( $types_to_go[$i] !~ /^[\(\)Gwib]$/ ) { last } } } #----------------------------------------------- # insertion of any blank lines before this batch #----------------------------------------------- my $imin = 0; my $imax = $max_index_to_go; # trim any blank tokens - for safety, but should not be necessary if ( $types_to_go[$imin] eq 'b' ) { $imin++ } if ( $types_to_go[$imax] eq 'b' ) { $imax-- } if ( $imin > $imax ) { if (DEVEL_MODE) { my $K0 = $K_to_go[0]; my $lno = EMPTY_STRING; if ( defined($K0) ) { $lno = $rLL->[$K0]->[_LINE_INDEX_] + 1 } Fault(<[_last_line_leading_type_]; my $last_line_leading_level = $self->[_last_line_leading_level_]; my $leading_type = $types_to_go[0]; my $leading_level = $levels_to_go[0]; # add blank line(s) before certain key types but not after a comment if ( $last_line_leading_type ne '#' ) { my $blank_count = 0; my $leading_token = $tokens_to_go[0]; # break before certain key blocks except one-liners if ( $leading_type eq 'k' ) { if ( $leading_token eq 'BEGIN' || $leading_token eq 'END' ) { $blank_count = $rOpts->{'blank-lines-before-subs'} if ( terminal_type_i( 0, $max_index_to_go ) ne '}' ); } # Break before certain block types if we haven't had a # break at this level for a while. This is the # difficult decision.. 
elsif ($last_line_leading_type ne 'b' && $is_if_unless_while_until_for_foreach{$leading_token} ) { my $lc = $nonblank_lines_at_depth[$last_line_leading_level]; if ( !defined($lc) ) { $lc = 0 } # patch for RT #128216: no blank line inserted at a level # change if ( $levels_to_go[0] != $last_line_leading_level ) { $lc = 0; } if ( $rOpts->{'blanks-before-blocks'} && $lc >= $rOpts->{'long-block-line-count'} && $self->consecutive_nonblank_lines() >= $rOpts->{'long-block-line-count'} && terminal_type_i( 0, $max_index_to_go ) ne '}' ) { $blank_count = 1; } } } # blank lines before subs except declarations and one-liners elsif ( $leading_type eq 'i' ) { my $special_identifier = $self->[_ris_special_identifier_token_]->{$leading_token}; if ($special_identifier) { ## $leading_token =~ /$SUB_PATTERN/ if ( $special_identifier eq 'sub' ) { $blank_count = $rOpts->{'blank-lines-before-subs'} if ( terminal_type_i( 0, $max_index_to_go ) !~ /^[\;\}\,]$/ ); } # break before all package declarations ## substr( $leading_token, 0, 8 ) eq 'package ' elsif ( $special_identifier eq 'package' ) { # ... except in a very short eval block my $pseqno = $parent_seqno_to_go[0]; $blank_count = $rOpts->{'blank-lines-before-packages'} if ( !$self->[_ris_short_broken_eval_block_]->{$pseqno} ); } } } # Check for blank lines wanted before a closing brace elsif ( $leading_token eq '}' ) { if ( $rOpts->{'blank-lines-before-closing-block'} && $block_type_to_go[0] && $block_type_to_go[0] =~ /$blank_lines_before_closing_block_pattern/ ) { my $nblanks = $rOpts->{'blank-lines-before-closing-block'}; if ( $nblanks > $blank_count ) { $blank_count = $nblanks; } } } if ($blank_count) { # future: send blank line down normal path to VerticalAligner? 
$self->flush_vertical_aligner(); my $file_writer_object = $self->[_file_writer_object_]; $file_writer_object->require_blank_code_lines($blank_count); } } # update blank line variables and count number of consecutive # non-blank, non-comment lines at this level if ( $leading_level == $last_line_leading_level && $leading_type ne '#' && defined( $nonblank_lines_at_depth[$leading_level] ) ) { $nonblank_lines_at_depth[$leading_level]++; } else { $nonblank_lines_at_depth[$leading_level] = 1; } $self->[_last_line_leading_type_] = $leading_type; $self->[_last_line_leading_level_] = $leading_level; #-------------------------- # scan lists and long lines #-------------------------- # Flag to remember if we called sub 'pad_array_to_go'. # Some routines (break_lists(), break_long_lines() ) need some # extra tokens added at the end of the batch. Most batches do not # use these routines, so we will avoid calling 'pad_array_to_go' # unless it is needed. my $called_pad_array_to_go; # set all forced breakpoints for good list formatting my $is_long_line; my $multiple_old_lines_in_batch; if ( $max_index_to_go > 0 ) { $is_long_line = $self->excess_line_length( $imin, $max_index_to_go ) > 0; my $Kbeg = $K_to_go[0]; my $Kend = $K_to_go[$max_index_to_go]; $multiple_old_lines_in_batch = $rLL->[$Kend]->[_LINE_INDEX_] - $rLL->[$Kbeg]->[_LINE_INDEX_]; } my $rbond_strength_bias = []; if ( $is_long_line || $multiple_old_lines_in_batch # must always call break_lists() with unbalanced batches because # it is maintaining some stacks || $is_unbalanced_batch # call break_lists if we might want to break at commas || ( $comma_count_in_batch && ( $rOpts_maximum_fields_per_table > 0 && $rOpts_maximum_fields_per_table <= $comma_count_in_batch || $rOpts_comma_arrow_breakpoints == 0 ) ) # call break_lists if user may want to break open some one-line # hash references || ( $comma_arrow_count_contained && $rOpts_comma_arrow_breakpoints != 3 ) ) { # add a couple of extra terminal blank tokens 
$self->pad_array_to_go(); $called_pad_array_to_go = 1; my $sgb = $self->break_lists( $is_long_line, $rbond_strength_bias ); $saw_good_break ||= $sgb; } # let $ri_first and $ri_last be references to lists of # first and last tokens of line fragments to output.. my ( $ri_first, $ri_last ); #----------------------------- # a single token uses one line #----------------------------- if ( !$max_index_to_go ) { $ri_first = [$imin]; $ri_last = [$imax]; } # for multiple tokens else { #------------------------- # write a single line if.. #------------------------- if ( ( # this line is 'short' !$is_long_line # and we didn't see a good breakpoint && !$saw_good_break # and we don't already have an interior breakpoint && !$forced_breakpoint_count ) # or, we aren't allowed to add any newlines || !$rOpts_add_newlines ) { $ri_first = [$imin]; $ri_last = [$imax]; } #----------------------------- # otherwise use multiple lines #----------------------------- else { # add a couple of extra terminal blank tokens if we haven't # already done so $self->pad_array_to_go() unless ($called_pad_array_to_go); ( $ri_first, $ri_last, my $rbond_strength_to_go ) = $self->break_long_lines( $saw_good_break, \@colon_list, $rbond_strength_bias ); $self->break_all_chain_tokens( $ri_first, $ri_last ); $self->break_equals( $ri_first, $ri_last ) if @{$ri_first} >= 3; # now we do a correction step to clean this up a bit # (The only time we would not do this is for debugging) $self->recombine_breakpoints( $ri_first, $ri_last, $rbond_strength_to_go ) if ( $rOpts_recombine && @{$ri_first} > 1 ); $self->insert_final_ternary_breaks( $ri_first, $ri_last ) if (@colon_list); } $self->insert_breaks_before_list_opening_containers( $ri_first, $ri_last ) if ( %break_before_container_types && $max_index_to_go > 0 ); # Check for a phantom semicolon at the end of the batch if ( !$token_lengths_to_go[$imax] && $types_to_go[$imax] eq ';' ) { $self->unmask_phantom_token($imax); } if ( $rOpts_one_line_block_semicolons == 0 
            ) {
                $self->delete_one_line_semicolons( $ri_first, $ri_last );
            }

            # Remember the largest batch size processed. This is needed by the
            # logical padding routine to avoid padding the first nonblank token
            if ( $max_index_to_go > $peak_batch_size ) {
                $peak_batch_size = $max_index_to_go;
            }
        }

        #-------------------
        # -lp corrector step
        #-------------------
        if ($rOpts_line_up_parentheses) {
            $self->correct_lp_indentation( $ri_first, $ri_last );
        }

        #--------------------
        # ship this batch out
        #--------------------
        $this_batch->[_ri_first_] = $ri_first;
        $this_batch->[_ri_last_] = $ri_last;

        $self->convey_batch_to_vertical_aligner();

        #-------------------------------------------------------------------
        # Write requested number of blank lines after an opening block brace
        #-------------------------------------------------------------------
        if ($rOpts_blank_lines_after_opening_block) {
            my $iterm = $imax;
            if ( $types_to_go[$iterm] eq '#' && $iterm > $imin ) {
                $iterm -= 1;
                if ( $types_to_go[$iterm] eq 'b' && $iterm > $imin ) {
                    $iterm -= 1;
                }
            }

            if ( $types_to_go[$iterm] eq '{'
                && $block_type_to_go[$iterm]
                && $block_type_to_go[$iterm] =~
                /$blank_lines_after_opening_block_pattern/ )
            {
                my $nblanks = $rOpts_blank_lines_after_opening_block;
                $self->flush_vertical_aligner();
                my $file_writer_object = $self->[_file_writer_object_];
                $file_writer_object->require_blank_code_lines($nblanks);
            }
        }

        return;
    } ## end sub grind_batch_of_CODE

    sub iprev_to_go {
        my ($i) = @_;
        return $i - 1 > 0
          && $types_to_go[ $i - 1 ] eq 'b' ? $i - 2 : $i - 1;
    }

    sub unmask_phantom_token {
        my ( $self, $iend ) = @_;

        # Turn a phantom token into a real token.

        # Input parameter:
        #   $iend = the index in the output batch array of this token.

        # Phantom tokens are specially marked token types (such as ';') with
        # no token text which only become real tokens if they occur at the end
        # of an output line. At one time phantom ',' tokens were handled
        # here, but now they are processed elsewhere.
        my $rLL = $self->[_rLL_];
        my $KK = $K_to_go[$iend];
        my $line_number = 1 + $rLL->[$KK]->[_LINE_INDEX_];

        my $type = $types_to_go[$iend];
        return unless ( $type eq ';' );
        my $tok = $type;
        my $tok_len = length($tok);
        if ( $want_left_space{$type} != WS_NO ) {
            $tok = SPACE . $tok;
            $tok_len += 1;
        }

        $tokens_to_go[$iend] = $tok;
        $token_lengths_to_go[$iend] = $tok_len;

        $rLL->[$KK]->[_TOKEN_] = $tok;
        $rLL->[$KK]->[_TOKEN_LENGTH_] = $tok_len;

        $self->note_added_semicolon($line_number);

        # This changes the summed lengths of the rest of this batch
        foreach ( $iend .. $max_index_to_go ) {
            $summed_lengths_to_go[ $_ + 1 ] += $tok_len;
        }
        return;
    } ## end sub unmask_phantom_token

    sub save_opening_indentation {

        # This should be called after each batch of tokens is output. It
        # saves indentations of lines of all unmatched opening tokens.
        # These will be used by sub get_opening_indentation.

        my ( $self, $ri_first, $ri_last, $rindentation_list,
            $runmatched_opening_indexes )
          = @_;

        $runmatched_opening_indexes = []
          if ( !defined($runmatched_opening_indexes) );

        # QW INDENTATION PATCH 1:
        # Also save indentation for multiline qw quotes
        my @i_qw;
        my $seqno_qw_opening;
        if ( $types_to_go[$max_index_to_go] eq 'q' ) {
            my $KK = $K_to_go[$max_index_to_go];
            $seqno_qw_opening =
              $self->[_rstarting_multiline_qw_seqno_by_K_]->{$KK};
            if ($seqno_qw_opening) {
                push @i_qw, $max_index_to_go;
            }
        }

        # we need to save indentations of any unmatched opening tokens
        # in this batch because we may need them in a subsequent batch.
        foreach ( @{$runmatched_opening_indexes}, @i_qw ) {

            my $seqno = $type_sequence_to_go[$_];

            if ( !$seqno ) {
                if ( $seqno_qw_opening && $_ == $max_index_to_go ) {
                    $seqno = $seqno_qw_opening;
                }
                else {

                    # shouldn't happen
                    $seqno = 'UNKNOWN';
                    DEVEL_MODE && Fault("unable to find sequence number\n");
                }
            }

            $saved_opening_indentation{$seqno} = [
                lookup_opening_indentation(
                    $_, $ri_first, $ri_last, $rindentation_list
                )
            ];
        }
        return;
    } ## end sub save_opening_indentation

    sub get_saved_opening_indentation {
        my ($seqno) = @_;
        my ( $indent, $offset, $is_leading, $exists ) = ( 0, 0, 0, 0 );

        if ($seqno) {
            if ( $saved_opening_indentation{$seqno} ) {
                ( $indent, $offset, $is_leading ) =
                  @{ $saved_opening_indentation{$seqno} };
                $exists = 1;
            }
        }

        # It is some kind of serious error if it doesn't exist
        # (example is badfile.t)

        return ( $indent, $offset, $is_leading, $exists );
    } ## end sub get_saved_opening_indentation
} ## end closure grind_batch_of_CODE

sub lookup_opening_indentation {

    # get the indentation of the line in the current output batch
    # which contains a selected opening token
    #
    # given:
    #   $i_opening - index of an opening token in the current output batch
    #                whose line indentation we need
    #   $ri_start - reference to list of the first index $i for each output
    #               line in this batch
    #   $ri_last - reference to list of the last index $i for each output line
    #              in this batch
    #   $rindentation_list - reference to a list containing the indentation
    #            used for each line. (NOTE: the first slot in
    #            this list is the last returned line number, and this is
    #            followed by the list of indentations).
    #
    # return
    #   -the indentation of the line which contained token $i_opening
    #   -and its offset (number of columns) from the start of the line
    #   -and a flag which is true if the token is the first on its line

    my ( $i_opening, $ri_start, $ri_last, $rindentation_list ) = @_;

    if ( !@{$ri_last} ) {

        # An error here implies a bug introduced by a recent program change.
        # Every batch of code has lines, so this should never happen.
if (DEVEL_MODE) { Fault("Error in opening_indentation: no lines"); } return ( 0, 0, 0 ); } my $nline = $rindentation_list->[0]; # line number of previous lookup # reset line location if necessary $nline = 0 if ( $i_opening < $ri_start->[$nline] ); # find the correct line unless ( $i_opening > $ri_last->[-1] ) { while ( $i_opening > $ri_last->[$nline] ) { $nline++; } } # Error - token index is out of bounds - shouldn't happen # A program bug has been introduced in one of the calling routines. # We better stop here. else { my $i_last_line = $ri_last->[-1]; if (DEVEL_MODE) { Fault(< $i_last_line = max index of last line This batch has max index = $max_index_to_go, EOM } $nline = $#{$ri_last}; } $rindentation_list->[0] = $nline; # save line number to start looking next call my $ibeg = $ri_start->[$nline]; my $offset = token_sequence_length( $ibeg, $i_opening ) - 1; my $is_leading = ( $ibeg == $i_opening ); return ( $rindentation_list->[ $nline + 1 ], $offset, $is_leading ); } ## end sub lookup_opening_indentation sub terminal_type_i { # returns type of last token on this line (terminal token), as follows: # returns # for a full-line comment # returns ' ' for a blank line # otherwise returns final token type my ( $ibeg, $iend ) = @_; # Start at the end and work backwards my $i = $iend; my $type_i = $types_to_go[$i]; # Check for side comment if ( $type_i eq '#' ) { $i--; if ( $i < $ibeg ) { return wantarray ? ( $type_i, $ibeg ) : $type_i; } $type_i = $types_to_go[$i]; } # Skip past a blank if ( $type_i eq 'b' ) { $i--; if ( $i < $ibeg ) { return wantarray ? ( $type_i, $ibeg ) : $type_i; } $type_i = $types_to_go[$i]; } # Found it..make sure it is a BLOCK termination, # but hide a terminal } after sort/map/grep/eval/do because it is not # necessarily the end of the line. (terminal.t) my $block_type = $block_type_to_go[$i]; if ( $type_i eq '}' && ( !$block_type || $is_sort_map_grep_eval_do{$block_type} ) ) { $type_i = 'b'; } return wantarray ? 
( $type_i, $i ) : $type_i; } ## end sub terminal_type_i sub pad_array_to_go { # To simplify coding in break_lists and set_bond_strengths, it helps to # create some extra blank tokens at the end of the arrays. We also add # some undef's to help guard against using invalid data. my ($self) = @_; $K_to_go[ $max_index_to_go + 1 ] = undef; $tokens_to_go[ $max_index_to_go + 1 ] = EMPTY_STRING; $tokens_to_go[ $max_index_to_go + 2 ] = EMPTY_STRING; $tokens_to_go[ $max_index_to_go + 3 ] = undef; $types_to_go[ $max_index_to_go + 1 ] = 'b'; $types_to_go[ $max_index_to_go + 2 ] = 'b'; $types_to_go[ $max_index_to_go + 3 ] = undef; $nesting_depth_to_go[ $max_index_to_go + 2 ] = undef; $nesting_depth_to_go[ $max_index_to_go + 1 ] = $nesting_depth_to_go[$max_index_to_go]; # /^[R\}\)\]]$/ if ( $is_closing_type{ $types_to_go[$max_index_to_go] } ) { if ( $nesting_depth_to_go[$max_index_to_go] <= 0 ) { # Nesting depths are set to be >=0 in sub write_line, so it should # not be possible to get here unless the code has a bracing error # which leaves a closing brace with zero nesting depth. unless ( get_saw_brace_error() ) { if (DEVEL_MODE) { Fault(<[$n]; my $ir = $ri_right->[$n]; my $typel = $types_to_go[$il]; my $typer = $types_to_go[$ir]; $typel = '+' if ( $typel eq '-' ); # treat + and - the same $typer = '+' if ( $typer eq '-' ); $typel = '*' if ( $typel eq '/' ); # treat * and / the same $typer = '*' if ( $typer eq '/' ); my $keyl = $typel eq 'k' ? $tokens_to_go[$il] : $typel; my $keyr = $typer eq 'k' ? $tokens_to_go[$ir] : $typer; if ( $is_chain_operator{$keyl} && $want_break_before{$typel} ) { next if ( $typel eq '?' ); push @{ $left_chain_type{$keyl} }, $il; $saw_chain_type{$keyl} = 1; $count++; } if ( $is_chain_operator{$keyr} && !$want_break_before{$typer} ) { next if ( $typer eq '?' 
); push @{ $right_chain_type{$keyr} }, $ir; $saw_chain_type{$keyr} = 1; $count++; } } return unless $count; # now look for any interior tokens of the same types $count = 0; my $has_interior_dot_or_plus; for my $n ( 0 .. $nmax ) { my $il = $ri_left->[$n]; my $ir = $ri_right->[$n]; foreach my $i ( $il + 1 .. $ir - 1 ) { my $type = $types_to_go[$i]; my $key = $type eq 'k' ? $tokens_to_go[$i] : $type; $key = '+' if ( $key eq '-' ); $key = '*' if ( $key eq '/' ); if ( $saw_chain_type{$key} ) { push @{ $interior_chain_type{$key} }, $i; $count++; $has_interior_dot_or_plus ||= ( $key eq '.' || $key eq '+' ); } } } return unless $count; my @keys = keys %saw_chain_type; # quit if just ONE continuation line with leading . For example-- # print LATEXFILE '\framebox{\parbox[c][' . $h . '][t]{' . $w . '}{' # . $contents; # Fixed for b1399. if ( $has_interior_dot_or_plus && $nmax == 1 && @keys == 1 ) { return; } # now make a list of all new break points my @insert_list; # loop over all chain types foreach my $key (@keys) { # loop over all interior chain tokens foreach my $itest ( @{ $interior_chain_type{$key} } ) { # loop over all left end tokens of same type if ( $left_chain_type{$key} ) { next if $nobreak_to_go[ $itest - 1 ]; foreach my $i ( @{ $left_chain_type{$key} } ) { next unless $self->in_same_container_i( $i, $itest ); push @insert_list, $itest - 1; # Break at matching ? if this : is at a different level. # For example, the ? before $THRf_DEAD in the following # should get a break if its : gets a break. # # my $flags = # ( $_ & 1 ) ? ( $_ & 4 ) ? $THRf_DEAD : $THRf_ZOMBIE # : ( $_ & 4 ) ? 
$THRf_R_DETACHED # : $THRf_R_JOINABLE; if ( $key eq ':' && $levels_to_go[$i] != $levels_to_go[$itest] ) { my $i_question = $mate_index_to_go[$itest]; if ( defined($i_question) && $i_question > 0 ) { push @insert_list, $i_question - 1; } } last; } } # loop over all right end tokens of same type if ( $right_chain_type{$key} ) { next if $nobreak_to_go[$itest]; foreach my $i ( @{ $right_chain_type{$key} } ) { next unless $self->in_same_container_i( $i, $itest ); push @insert_list, $itest; # break at matching ? if this : is at a different level if ( $key eq ':' && $levels_to_go[$i] != $levels_to_go[$itest] ) { my $i_question = $mate_index_to_go[$itest]; if ( defined($i_question) ) { push @insert_list, $i_question; } } last; } } } } # insert any new break points if (@insert_list) { $self->insert_additional_breaks( \@insert_list, $ri_left, $ri_right ); } return; } ## end sub break_all_chain_tokens sub insert_additional_breaks { # this routine will add line breaks at requested locations after # sub break_long_lines has made preliminary breaks. 
my ( $self, $ri_break_list, $ri_first, $ri_last ) = @_; my $i_f; my $i_l; my $line_number = 0; foreach my $i_break_left ( sort { $a <=> $b } @{$ri_break_list} ) { next if ( $nobreak_to_go[$i_break_left] ); $i_f = $ri_first->[$line_number]; $i_l = $ri_last->[$line_number]; while ( $i_break_left >= $i_l ) { $line_number++; # shouldn't happen unless caller passes bad indexes if ( $line_number >= @{$ri_last} ) { if (DEVEL_MODE) { Fault(<[$line_number]; $i_l = $ri_last->[$line_number]; } # Do not leave a blank at the end of a line; back up if necessary if ( $types_to_go[$i_break_left] eq 'b' ) { $i_break_left-- } my $i_break_right = $inext_to_go[$i_break_left]; if ( $i_break_left >= $i_f && $i_break_left < $i_l && $i_break_right > $i_f && $i_break_right <= $i_l ) { splice( @{$ri_first}, $line_number, 1, ( $i_f, $i_break_right ) ); splice( @{$ri_last}, $line_number, 1, ( $i_break_left, $i_l ) ); } } return; } ## end sub insert_additional_breaks { ## begin closure in_same_container_i my $ris_break_token; my $ris_comma_token; BEGIN { # all cases break on seeing commas at same level my @q = qw( => ); push @q, ','; @{$ris_comma_token}{@q} = (1) x scalar(@q); # Non-ternary text also breaks on seeing any of qw(? : || or ) # Example: we would not want to break at any of these .'s # : "$str" push @q, qw( or || ? : ); @{$ris_break_token}{@q} = (1) x scalar(@q); } ## end BEGIN sub in_same_container_i { # Check to see if tokens at i1 and i2 are in the same container, and # not separated by certain characters: => , ? 
: || or # This is an interface between the _to_go arrays to the rLL array my ( $self, $i1, $i2 ) = @_; # quick check my $parent_seqno_1 = $parent_seqno_to_go[$i1]; return if ( $parent_seqno_to_go[$i2] ne $parent_seqno_1 ); if ( $i2 < $i1 ) { ( $i1, $i2 ) = ( $i2, $i1 ) } my $K1 = $K_to_go[$i1]; my $K2 = $K_to_go[$i2]; my $rLL = $self->[_rLL_]; my $depth_1 = $nesting_depth_to_go[$i1]; return if ( $depth_1 < 0 ); # Shouldn't happen since i1 and i2 have same parent: return unless ( $nesting_depth_to_go[$i2] == $depth_1 ); # Select character set to scan for my $type_1 = $types_to_go[$i1]; my $rbreak = ( $type_1 ne ':' ) ? $ris_break_token : $ris_comma_token; # Fast preliminary loop to verify that tokens are in the same container my $KK = $K1; while (1) { $KK = $rLL->[$KK]->[_KNEXT_SEQ_ITEM_]; last if !defined($KK); last if ( $KK >= $K2 ); my $ii = $i1 + $KK - $K1; my $depth_i = $nesting_depth_to_go[$ii]; return if ( $depth_i < $depth_1 ); next if ( $depth_i > $depth_1 ); if ( $type_1 ne ':' ) { my $tok_i = $tokens_to_go[$ii]; return if ( $tok_i eq '?' || $tok_i eq ':' ); } } # Slow loop checking for certain characters #----------------------------------------------------- # This is potentially a slow routine and not critical. # For safety just give up for large differences. # See test file 'infinite_loop.txt' #----------------------------------------------------- return if ( $i2 - $i1 > 200 ); foreach my $ii ( $i1 + 1 .. $i2 - 1 ) { my $depth_i = $nesting_depth_to_go[$ii]; next if ( $depth_i > $depth_1 ); return if ( $depth_i < $depth_1 ); my $tok_i = $tokens_to_go[$ii]; return if ( $rbreak->{$tok_i} ); } return 1; } ## end sub in_same_container_i } ## end closure in_same_container_i sub break_equals { # Look for assignment operators that could use a breakpoint. 
    # For example, in the following snippet
    #
    #    $HOME = $ENV{HOME}
    #      || $ENV{LOGDIR}
    #      || $pw[7]
    #      || die "no home directory for user $<";
    #
    # we could break at the = to get this, which is a little nicer:
    #    $HOME =
    #         $ENV{HOME}
    #      || $ENV{LOGDIR}
    #      || $pw[7]
    #      || die "no home directory for user $<";
    #
    # The logic here follows the logic in set_logical_padding, which
    # will add the padding in the second line to improve alignment.
    #
    my ( $self, $ri_left, $ri_right ) = @_;
    my $nmax = @{$ri_right} - 1;
    return unless ( $nmax >= 2 );

    # scan the left ends of the first two continuation lines (lines 1 and 2)
    my $tokbeg = EMPTY_STRING;
    my $depth_beg;
    for my $n ( 1 .. 2 ) {
        my $il = $ri_left->[$n];
        my $typel = $types_to_go[$il];
        my $tokenl = $tokens_to_go[$il];
        my $keyl = $typel eq 'k' ? $tokenl : $typel;

        my $has_leading_op = $is_chain_operator{$keyl};
        return unless ($has_leading_op);
        if ( $n > 1 ) {
            return
              unless ( $tokenl eq $tokbeg
                && $nesting_depth_to_go[$il] eq $depth_beg );
        }
        $tokbeg = $tokenl;
        $depth_beg = $nesting_depth_to_go[$il];
    }

    # now look for any interior tokens of the same types
    my $il = $ri_left->[0];
    my $ir = $ri_right->[0];

    # now make a list of all new break points
    my @insert_list;
    foreach my $i ( reverse( $il + 1 .. $ir - 1 ) ) {
        my $type = $types_to_go[$i];
        if ( $is_assignment{$type}
            && $nesting_depth_to_go[$i] eq $depth_beg )
        {
            if ( $want_break_before{$type} ) {
                push @insert_list, $i - 1;
            }
            else {
                push @insert_list, $i;
            }
        }
    }

    # Break after a 'return' followed by a chain of operators
    #  return ( $^O !~ /win32|dos/i )
    #    && ( $^O ne 'VMS' )
    #    && ( $^O ne 'OS2' )
    #    && ( $^O ne 'MacOS' );
    # To give:
    #  return
    #       ( $^O !~ /win32|dos/i )
    #    && ( $^O ne 'VMS' )
    #    && ( $^O ne 'OS2' )
    #    && ( $^O ne 'MacOS' );
    my $i = 0;
    if ( $types_to_go[$i] eq 'k'
        && $tokens_to_go[$i] eq 'return'
        && $ir > $il
        && $nesting_depth_to_go[$i] eq $depth_beg )
    {
        push @insert_list, $i;
    }

    return unless (@insert_list);

    # One final check...
# scan second and third lines and be sure there are no assignments # we want to avoid breaking at an = to make something like this: # unless ( $icon = # $html_icons{"$type-$state"} # or $icon = $html_icons{$type} # or $icon = $html_icons{$state} ) for my $n ( 1 .. 2 ) { my $il_n = $ri_left->[$n]; my $ir_n = $ri_right->[$n]; foreach my $i ( $il_n + 1 .. $ir_n ) { my $type = $types_to_go[$i]; return if ( $is_assignment{$type} && $nesting_depth_to_go[$i] eq $depth_beg ); } } # ok, insert any new break point if (@insert_list) { $self->insert_additional_breaks( \@insert_list, $ri_left, $ri_right ); } return; } ## end sub break_equals { ## begin closure recombine_breakpoints # This routine is called once per batch to see if it would be better # to combine some of the lines into which the batch has been broken. my %is_amp_amp; my %is_math_op; my %is_plus_minus; my %is_mult_div; BEGIN { my @q; @q = qw( && || ); @is_amp_amp{@q} = (1) x scalar(@q); @q = qw( + - * / ); @is_math_op{@q} = (1) x scalar(@q); @q = qw( + - ); @is_plus_minus{@q} = (1) x scalar(@q); @q = qw( * / ); @is_mult_div{@q} = (1) x scalar(@q); } ## end BEGIN sub Debug_dump_breakpoints { # Debug routine to dump current breakpoints...not normally called # We are given indexes to the current lines: # $ri_beg = ref to array of BEGinning indexes of each line # $ri_end = ref to array of ENDing indexes of each line my ( $self, $ri_beg, $ri_end, $msg ) = @_; print STDERR "----Dumping breakpoints from: $msg----\n"; for my $n ( 0 .. @{$ri_end} - 1 ) { my $ibeg = $ri_beg->[$n]; my $iend = $ri_end->[$n]; my $text = EMPTY_STRING; foreach my $i ( $ibeg .. 
$iend ) { $text .= $tokens_to_go[$i]; } print STDERR "$n ($ibeg:$iend) $text\n"; } print STDERR "----\n"; return; } ## end sub Debug_dump_breakpoints sub delete_one_line_semicolons { my ( $self, $ri_beg, $ri_end ) = @_; my $rLL = $self->[_rLL_]; my $K_opening_container = $self->[_K_opening_container_]; # Walk down the lines of this batch and delete any semicolons # terminating one-line blocks; my $nmax = @{$ri_end} - 1; foreach my $n ( 0 .. $nmax ) { my $i_beg = $ri_beg->[$n]; my $i_e = $ri_end->[$n]; my $K_beg = $K_to_go[$i_beg]; my $K_e = $K_to_go[$i_e]; my $K_end = $K_e; my $type_end = $rLL->[$K_end]->[_TYPE_]; if ( $type_end eq '#' ) { $K_end = $self->K_previous_nonblank($K_end); if ( defined($K_end) ) { $type_end = $rLL->[$K_end]->[_TYPE_]; } } # we are looking for a line ending in closing brace next unless ( $type_end eq '}' && $rLL->[$K_end]->[_TOKEN_] eq '}' ); # ...and preceded by a semicolon on the same line my $K_semicolon = $self->K_previous_nonblank($K_end); next unless defined($K_semicolon); my $i_semicolon = $i_beg + ( $K_semicolon - $K_beg ); next if ( $i_semicolon <= $i_beg ); next unless ( $rLL->[$K_semicolon]->[_TYPE_] eq ';' ); # Safety check - shouldn't happen - not critical # This is not worth throwing a Fault, except in DEVEL_MODE if ( $types_to_go[$i_semicolon] ne ';' ) { DEVEL_MODE && Fault("unexpected type looking for semicolon"); next; } # ... with the corresponding opening brace on the same line my $type_sequence = $rLL->[$K_end]->[_TYPE_SEQUENCE_]; my $K_opening = $K_opening_container->{$type_sequence}; next unless ( defined($K_opening) ); my $i_opening = $i_beg + ( $K_opening - $K_beg ); next if ( $i_opening < $i_beg ); # ... and only one semicolon between these braces my $semicolon_count = 0; foreach my $K ( $K_opening + 1 .. 
$K_semicolon - 1 ) { if ( $rLL->[$K]->[_TYPE_] eq ';' ) { $semicolon_count++; last; } } next if ($semicolon_count); # ...ok, then make the semicolon invisible my $len = $token_lengths_to_go[$i_semicolon]; $tokens_to_go[$i_semicolon] = EMPTY_STRING; $token_lengths_to_go[$i_semicolon] = 0; $rLL->[$K_semicolon]->[_TOKEN_] = EMPTY_STRING; $rLL->[$K_semicolon]->[_TOKEN_LENGTH_] = 0; foreach ( $i_semicolon .. $max_index_to_go ) { $summed_lengths_to_go[ $_ + 1 ] -= $len; } } return; } ## end sub delete_one_line_semicolons use constant DEBUG_RECOMBINE => 0; sub recombine_breakpoints { my ( $self, $ri_beg, $ri_end, $rbond_strength_to_go ) = @_; # This sub implements the 'recombine' operation on a batch. # Its task is to combine some of these lines back together to # improve formatting. The need for this arises because # sub 'break_long_lines' is very liberal in setting line breaks # for long lines, always setting breaks at good breakpoints, even # when that creates small lines. Sometimes small line fragments # are produced which would look better if they were combined. # Input parameters: # $ri_beg = ref to array of BEGinning indexes of each line # $ri_end = ref to array of ENDing indexes of each line # $rbond_strength_to_go = array of bond strengths pulling # tokens together, used to decide where best to recombine lines. #------------------------------------------------------------------- # Do nothing under extreme stress; use <= 2 for c171. # (NOTE: New optimizations make this unnecessary. But removing this # check is not really useful because this condition only occurs in # test runs, and another formatting pass will fix things anyway.) # This routine has a long history of improvements. Some past # relevant issues are : c118, c167, c171, c186, c187, c193, c200. 
#------------------------------------------------------------------- return if ( $high_stress_level <= 2 ); my $nmax_start = @{$ri_end} - 1; return if ( $nmax_start <= 0 ); my $iend_max = $ri_end->[$nmax_start]; if ( $types_to_go[$iend_max] eq '#' ) { $iend_max = iprev_to_go($iend_max); } my $has_terminal_semicolon = $iend_max >= 0 && $types_to_go[$iend_max] eq ';'; #-------------------------------------------------------------------- # Break into the smallest possible sub-sections to improve efficiency #-------------------------------------------------------------------- # Also make a list of all good joining tokens between the lines # n-1 and n. my @joint; my $rsections = []; my $nbeg_sec = 0; my $nend_sec; my $nmax_section = 0; foreach my $nn ( 1 .. $nmax_start ) { my $ibeg_1 = $ri_beg->[ $nn - 1 ]; my $iend_1 = $ri_end->[ $nn - 1 ]; my $iend_2 = $ri_end->[$nn]; my $ibeg_2 = $ri_beg->[$nn]; # Define certain good joint tokens my ( $itok, $itokp, $itokm ); foreach my $itest ( $iend_1, $ibeg_2 ) { my $type = $types_to_go[$itest]; if ( $is_math_op{$type} || $is_amp_amp{$type} || $is_assignment{$type} || $type eq ':' ) { $itok = $itest; } } # joint[$nn] = index of joint character $joint[$nn] = $itok; # Update the section list my $excess = $self->excess_line_length( $ibeg_1, $iend_2, 1 ); if ( $excess <= 1 # The number 5 here is an arbitrary small number intended # to keep most small matches in one sub-section. 
|| ( defined($nend_sec) && ( $nn < 5 || $nmax_start - $nn < 5 ) ) ) { $nend_sec = $nn; } else { if ( defined($nend_sec) ) { push @{$rsections}, [ $nbeg_sec, $nend_sec ]; my $num = $nend_sec - $nbeg_sec; if ( $num > $nmax_section ) { $nmax_section = $num } $nbeg_sec = $nn; $nend_sec = undef; } $nbeg_sec = $nn; } } if ( defined($nend_sec) ) { push @{$rsections}, [ $nbeg_sec, $nend_sec ]; my $num = $nend_sec - $nbeg_sec; if ( $num > $nmax_section ) { $nmax_section = $num } } my $num_sections = @{$rsections}; if ( DEBUG_RECOMBINE > 1 ) { print STDERR < 0 ) { my $max = 0; print STDERR "-----\n$num_sections sections found for nmax=$nmax_start\n"; foreach my $sect ( @{$rsections} ) { my ( $nbeg, $nend ) = @{$sect}; my $num = $nend - $nbeg; if ( $num > $max ) { $max = $num } print STDERR "$nbeg $nend\n"; } print STDERR "max size=$max of $nmax_start lines\n"; } # Loop over all sub-sections. Note that we have to work backwards # from the end of the batch since the sections use original line # numbers, and the line numbers change as we go. 
while ( my $section = pop @{$rsections} ) { my ( $nbeg, $nend ) = @{$section}; $self->recombine_section_loop( { _ri_beg => $ri_beg, _ri_end => $ri_end, _nbeg => $nbeg, _nend => $nend, _rjoint => \@joint, _rbond_strength_to_go => $rbond_strength_to_go, _has_terminal_semicolon => $has_terminal_semicolon, } ); } return; } ## end sub recombine_breakpoints sub recombine_section_loop { my ( $self, $rhash ) = @_; # Recombine breakpoints for one section of lines in the current batch # Given: # $ri_beg, $ri_end = ref to arrays with token indexes of the first # and last line # $nbeg, $nend = line numbers bounding this section # $rjoint = ref to array of good joining tokens per line # Update: $ri_beg, $ri_end, $rjoint if lines are joined # Returns: # nothing #------------- # Definitions: #------------- # $rhash = { # _ri_beg = ref to array with starting token index by line # _ri_end = ref to array with ending token index by line # _nbeg = first line number of this section # _nend = last line number of this section # _rjoint = ref to array of good joining tokens for each line # _rbond_strength_to_go = array of bond strengths # _has_terminal_semicolon = true if last line of batch has ';' # _num_freeze = fixed number of lines at end of this batch # _optimization_on = true during final optimization loop # _num_compares = total number of line compares made so far # _pair_list = list of line pairs in optimal search order # }; my $ri_beg = $rhash->{_ri_beg}; my $ri_end = $rhash->{_ri_end}; # Line index range of this section: my $nbeg = $rhash->{_nbeg}; # stays constant my $nend = $rhash->{_nend}; # will decrease # $nmax_batch = starting number of lines in the full batch # $num_freeze = number of lines following this section to leave alone my $nmax_batch = @{$ri_end} - 1; $rhash->{_num_freeze} = $nmax_batch - $nend; # Setup the list of line pairs to test. 
This stores the following # values for each line pair: # [ $n=index of the second line of the pair, $bs=bond strength] my @pair_list; my $rbond_strength_to_go = $rhash->{_rbond_strength_to_go}; foreach my $n ( $nbeg + 1 .. $nend ) { my $iend_1 = $ri_end->[ $n - 1 ]; my $ibeg_2 = $ri_beg->[$n]; my $bs_tweak = 0; if ( $is_amp_amp{ $types_to_go[$ibeg_2] } ) { $bs_tweak = 0.25 } my $bs = $rbond_strength_to_go->[$iend_1] + $bs_tweak; push @pair_list, [ $n, $bs ]; } # Any order for testing is possible, but optimization is only possible # if we sort the line pairs on decreasing joint strength. @pair_list = sort { $b->[1] <=> $a->[1] || $a->[0] <=> $b->[0] } @pair_list; $rhash->{_rpair_list} = \@pair_list; #---------------- # Iteration limit #---------------- # This was originally an O(n-squared) loop which required a check on # the maximum number of iterations for safety. It is now a very fast # loop which runs in O(n) time, but a check on total number of # iterations is retained to guard against future programming errors. # Most cases require roughly 1 comparison per line pair (1 full pass). # The upper bound is estimated to be about 3 comparisons per line pair # unless optimization is deactivated. The approximate breakdown is: # 1 pass with 1 compare per joint to do any special cases, plus # 1 pass with up to 2 compares per joint in optimization mode # The most extreme cases in my collection are: # camel1.t - needs 2.7 compares per line (12 without optimization) # ternary.t - needs 2.8 compares per line (12 without optimization) # So a value of MAX_COMPARE_RATIO = 3 looks like an upper bound as # long as optimization is used. A value of 20 should allow all code to # pass even if optimization is turned off for testing. # The OPTIMIZE_OK flag should be true except for testing. 
        use constant MAX_COMPARE_RATIO => 20;
        use constant OPTIMIZE_OK       => 1;

        my $num_pairs    = $nend - $nbeg + 1;
        my $max_compares = MAX_COMPARE_RATIO * $num_pairs;

        # Always start with optimization off
        $rhash->{_num_compares}    = 0;
        $rhash->{_optimization_on} = 0;
        $rhash->{_ix_best_last}    = 0;

        #--------------------------------------------
        # loop until there are no more recombinations
        #--------------------------------------------
        my $nmax_last = $nmax_batch + 1;
        while (1) {

            # Stop when the number of lines in the batch does not decrease
            $nmax_batch = @{$ri_end} - 1;
            if ( $nmax_batch >= $nmax_last ) {
                last;
            }
            $nmax_last = $nmax_batch;

            #-----------------------------------------
            # inner loop to find next best combination
            #-----------------------------------------
            $self->recombine_inner_loop($rhash);

            # Iteration limit check:
            if ( $rhash->{_num_compares} > $max_compares ) {

                # See note above; should only get here on a programming error
                if (DEVEL_MODE) {
                    my $ibeg = $ri_beg->[$nbeg];
                    my $Kbeg = $K_to_go[$ibeg];
                    my $lno  = $self->[_rLL_]->[$Kbeg]->[_LINE_INDEX_];
                    Fault(<<EOM);
$rhash->{_num_compares} exceeds max=$max_compares, near line $lno
EOM
                }
                last;
            }

        } ## end iteration loop

        if (DEBUG_RECOMBINE) {
            my $ratio = sprintf "%0.3f", $rhash->{_num_compares} / $num_pairs;
            print STDERR
"exiting recombine_inner_loop with $nmax_last lines, opt=$rhash->{_optimization_on}, starting pairs=$num_pairs, num_compares=$rhash->{_num_compares}, ratio=$ratio\n";
        }

        return;
    } ## end sub recombine_section_loop

    sub recombine_inner_loop {
        my ( $self, $rhash ) = @_;

        # This is the inner loop of the recombine operation. We look at all of
        # the remaining joints in this section and select the best joint to be
        # recombined. If a recombination is made, the number of lines
        # in this section will be reduced by one.
# Returns: nothing my $rK_weld_right = $self->[_rK_weld_right_]; my $rK_weld_left = $self->[_rK_weld_left_]; my $ri_beg = $rhash->{_ri_beg}; my $ri_end = $rhash->{_ri_end}; my $nbeg = $rhash->{_nbeg}; my $rjoint = $rhash->{_rjoint}; my $rbond_strength_to_go = $rhash->{_rbond_strength_to_go}; my $rpair_list = $rhash->{_rpair_list}; # This will remember the best joint: my $n_best = 0; my $bs_best = 0.; my $ix_best = 0; my $num_bs = 0; # The range of lines in this group is $nbeg to $nstop my $nmax = @{$ri_end} - 1; my $nstop = $nmax - $rhash->{_num_freeze}; my $num_joints = $nstop - $nbeg; # Turn off optimization if just two joints remain to allow # special two-line logic to be checked (c193) if ( $rhash->{_optimization_on} && $num_joints <= 2 ) { $rhash->{_optimization_on} = 0; } # Start where we ended the last search my $ix_start = $rhash->{_ix_best_last}; # Keep the starting index in bounds $ix_start = max( 0, $ix_start ); # Make a search order list which cycles around to visit # all line pairs. my $ix_max = @{$rpair_list} - 1; my @ix_list = ( $ix_start .. $ix_max, 0 .. $ix_start - 1 ); my $ix_last = $ix_list[-1]; #------------------------- # loop over all line pairs #------------------------- my $incomplete_loop; foreach my $ix (@ix_list) { my $item = $rpair_list->[$ix]; my ( $n, $bs ) = @{$item}; # This flag will be true if we 'last' out of this loop early. # We cannot turn on optimization if this is true. 
$incomplete_loop = $ix != $ix_last; # Update the count of the number of times through this inner loop $rhash->{_num_compares}++; #---------------------------------------------------------- # If we join the current pair of lines, # line $n-1 will become the left part of the joined line # line $n will become the right part of the joined line # # Here are Indexes of the endpoint tokens of the two lines: # # -----line $n-1--- | -----line $n----- # $ibeg_1 $iend_1 | $ibeg_2 $iend_2 # ^ # | # We want to decide if we should remove the line break # between the tokens at $iend_1 and $ibeg_2 # # We will apply a number of ad-hoc tests to see if joining # here will look ok. The code will just move to the next # pair if the join doesn't look good. If we get through # the gauntlet of tests, the lines will be recombined. #---------------------------------------------------------- # # beginning and ending tokens of the lines we are working on my $ibeg_1 = $ri_beg->[ $n - 1 ]; my $iend_1 = $ri_end->[ $n - 1 ]; my $iend_2 = $ri_end->[$n]; my $ibeg_2 = $ri_beg->[$n]; # The combined line cannot be too long my $excess = $self->excess_line_length( $ibeg_1, $iend_2, 1 ); next if ( $excess > 0 ); my $type_iend_1 = $types_to_go[$iend_1]; my $type_iend_2 = $types_to_go[$iend_2]; my $type_ibeg_1 = $types_to_go[$ibeg_1]; my $type_ibeg_2 = $types_to_go[$ibeg_2]; DEBUG_RECOMBINE > 1 && do { print STDERR "RECOMBINE: ix=$ix iend1=$iend_1 iend2=$iend_2 n=$n nmax=$nmax if=$ibeg_1 type=$type_ibeg_1 =$tokens_to_go[$ibeg_1] next_type=$type_ibeg_2 next_tok=$tokens_to_go[$ibeg_2]\n"; }; # If line $n is the last line, we set some flags and # do any special checks for it my $this_line_is_semicolon_terminated; if ( $n == $nmax ) { if ( $type_ibeg_2 eq '{' ) { # join isolated ')' and '{' if requested (git #110) if ( $rOpts_cuddled_paren_brace && $type_iend_1 eq '}' && $iend_1 == $ibeg_1 && $ibeg_2 == $iend_2 ) { if ( $tokens_to_go[$iend_1] eq ')' && $tokens_to_go[$ibeg_2] eq '{' ) { $n_best = $n; $ix_best = 
$ix; last; } } # otherwise, a terminal '{' should stay where it is # unless preceded by a fat comma next if ( $type_iend_1 ne '=>' ); } $this_line_is_semicolon_terminated = $rhash->{_has_terminal_semicolon}; } #---------------------------------------------------------- # Recombine Section 0: # Examine the special token joining this line pair, if any. # Put as many tests in this section to avoid duplicate code # and to make formatting independent of whether breaks are # to the left or right of an operator. #---------------------------------------------------------- my $itok = $rjoint->[$n]; if ($itok) { my $ok_0 = recombine_section_0( $itok, $ri_beg, $ri_end, $n ); next if ( !$ok_0 ); } #---------------------------------------------------------- # Recombine Section 1: # Join welded nested containers immediately #---------------------------------------------------------- if ( $total_weld_count && ( $type_sequence_to_go[$iend_1] && defined( $rK_weld_right->{ $K_to_go[$iend_1] } ) || $type_sequence_to_go[$ibeg_2] && defined( $rK_weld_left->{ $K_to_go[$ibeg_2] } ) ) ) { $n_best = $n; $ix_best = $ix; last; } #---------------------------------------------------------- # Recombine Section 2: # Examine token at $iend_1 (right end of first line of pair) #---------------------------------------------------------- my ( $ok_2, $skip_Section_3 ) = recombine_section_2( $ri_beg, $ri_end, $n, $this_line_is_semicolon_terminated ); next if ( !$ok_2 ); #---------------------------------------------------------- # Recombine Section 3: # Examine token at $ibeg_2 (left end of second line of pair) #---------------------------------------------------------- # Join lines identified above as capable of # causing an outdented line with leading closing paren. # Note that we are skipping the rest of this section # and the rest of the loop to do the join. 
if ($skip_Section_3) { $forced_breakpoint_to_go[$iend_1] = 0; $n_best = $n; $ix_best = $ix; $incomplete_loop = 1; last; } my ( $ok_3, $bs_tweak ) = recombine_section_3( $ri_beg, $ri_end, $n, $this_line_is_semicolon_terminated ); next if ( !$ok_3 ); #---------------------------------------------------------- # Recombine Section 4: # Combine the lines if we arrive here and it is possible #---------------------------------------------------------- # honor hard breakpoints next if ( $forced_breakpoint_to_go[$iend_1] ); if (DEVEL_MODE) { # This fault can only occur if an array index error has been # introduced by a recent programming change. my $bs_check = $rbond_strength_to_go->[$iend_1] + $bs_tweak; if ( $bs_check != $bs ) { Fault(< "[" . ( join ',', map { "\"$_\"" } split "\n", $_ ) . "]" ## }, ## $type; next if ( $old_breakpoint_to_go[$iend_1] && !$this_line_is_semicolon_terminated && $n < $nmax && $excess + 4 > 0 && $type_iend_2 ne ',' ); # do not recombine if we would skip in indentation levels if ( $n < $nmax ) { my $if_next = $ri_beg->[ $n + 1 ]; next if ( $levels_to_go[$ibeg_1] < $levels_to_go[$ibeg_2] && $levels_to_go[$ibeg_2] < $levels_to_go[$if_next] # but an isolated 'if (' is undesirable && !( $n == 1 && $iend_1 - $ibeg_1 <= 2 && $type_ibeg_1 eq 'k' && $tokens_to_go[$ibeg_1] eq 'if' && $tokens_to_go[$iend_1] ne '(' ) ); } ## OLD: honor no-break's ## next if ( $bs >= NO_BREAK - 1 ); # removed for b1257 # remember the pair with the greatest bond strength if ( !$n_best ) { # First good joint ... $n_best = $n; $ix_best = $ix; $bs_best = $bs; $num_bs = 1; # In optimization mode: stop on the first acceptable joint # because we already know it has the highest strength if ( $rhash->{_optimization_on} == 1 ) { last; } } else { # Second and later joints .. 
$num_bs++; # save maximum strength; in case of a tie select min $n if ( $bs > $bs_best || $bs == $bs_best && $n < $n_best ) { $n_best = $n; $ix_best = $ix; $bs_best = $bs; } } } ## end loop over all line pairs #--------------------------------------------------- # recombine the pair with the greatest bond strength #--------------------------------------------------- if ($n_best) { DEBUG_RECOMBINE > 1 && print "BEST: nb=$n_best nbeg=$nbeg stop=$nstop bs=$bs_best\n"; splice @{$ri_beg}, $n_best, 1; splice @{$ri_end}, $n_best - 1, 1; splice @{$rjoint}, $n_best, 1; splice @{$rpair_list}, $ix_best, 1; # Update the line indexes in the pair list: # Old $n values greater than the best $n decrease by 1 # because of the splice we just did. foreach my $item ( @{$rpair_list} ) { my $n_old = $item->[0]; if ( $n_old > $n_best ) { $item->[0] -= 1 } } # Store the index of this location for starting the next search. # We must subtract 1 to get an updated index because the splice # above just removed the best pair. # BUT CAUTION: if this is the first pair in the pair list, then # this produces an invalid index. So this index must be tested # before use in the next pass through the outer loop. $rhash->{_ix_best_last} = $ix_best - 1; # Turn on optimization if ... 
if ( # it is not already on, and !$rhash->{_optimization_on} # we have not taken a shortcut to get here, and && !$incomplete_loop # we have seen a good break on strength, and && $num_bs # we are allowed to optimize && OPTIMIZE_OK ) { $rhash->{_optimization_on} = 1; if (DEBUG_RECOMBINE) { my $num_compares = $rhash->{_num_compares}; my $pair_count = @ix_list; print STDERR "Entering optimization phase at $num_compares compares, pair count = $pair_count\n"; } } } return; } ## end sub recombine_inner_loop sub recombine_section_0 { my ( $itok, $ri_beg, $ri_end, $n ) = @_; # Recombine Section 0: # Examine special candidate joining token $itok # Given: # $itok = index of token at a possible join of lines $n-1 and $n # Return: # true => ok to combine # false => do not combine lines # Here are Indexes of the endpoint tokens of the two lines: # # -----line $n-1--- | -----line $n----- # $ibeg_1 $iend_1 | $ibeg_2 $iend_2 # ^ ^ # | | # ------------$itok is one of these tokens # Put as many tests in this section to avoid duplicate code # and to make formatting independent of whether breaks are # to the left or right of an operator. my $nmax = @{$ri_end} - 1; my $ibeg_1 = $ri_beg->[ $n - 1 ]; my $iend_1 = $ri_end->[ $n - 1 ]; my $ibeg_2 = $ri_beg->[$n]; my $iend_2 = $ri_end->[$n]; if ($itok) { my $type = $types_to_go[$itok]; if ( $type eq ':' ) { # do not join at a colon unless it disobeys the # break request if ( $itok eq $iend_1 ) { return unless $want_break_before{$type}; } else { return if $want_break_before{$type}; } } ## end if ':' # handle math operators + - * / elsif ( $is_math_op{$type} ) { # Combine these lines if this line is a single # number, or if it is a short term with same # operator as the previous line. For example, in # the following code we will combine all of the # short terms $A, $B, $C, $D, $E, $F, together # instead of leaving them one per line: # my $time = # $A * $B * $C * $D * $E * $F * # ( 2. * $eps * $sigma * $area ) * # ( 1. / $tcold**3 - 1. 
/ $thot**3 ); # This can be important in math-intensive code. my $good_combo; my $itokp = min( $inext_to_go[$itok], $iend_2 ); my $itokpp = min( $inext_to_go[$itokp], $iend_2 ); my $itokm = max( iprev_to_go($itok), $ibeg_1 ); my $itokmm = max( iprev_to_go($itokm), $ibeg_1 ); # check for a number on the right if ( $types_to_go[$itokp] eq 'n' ) { # ok if nothing else on right if ( $itokp == $iend_2 ) { $good_combo = 1; } else { # look one more token to right.. # okay if math operator or some termination $good_combo = ( ( $itokpp == $iend_2 ) && $is_math_op{ $types_to_go[$itokpp] } ) || $types_to_go[$itokpp] =~ /^[#,;]$/; } } # check for a number on the left if ( !$good_combo && $types_to_go[$itokm] eq 'n' ) { # okay if nothing else to left if ( $itokm == $ibeg_1 ) { $good_combo = 1; } # otherwise look one more token to left else { # okay if math operator, comma, or assignment $good_combo = ( $itokmm == $ibeg_1 ) && ( $is_math_op{ $types_to_go[$itokmm] } || $types_to_go[$itokmm] =~ /^[,]$/ || $is_assignment{ $types_to_go[$itokmm] } ); } } # look for a single short token either side of the # operator if ( !$good_combo ) { # Slight adjustment factor to make results # independent of break before or after operator # in long summed lists. (An operator and a # space make two spaces). my $two = ( $itok eq $iend_1 ) ? 
2 : 0; $good_combo = # numbers or id's on both sides of this joint $types_to_go[$itokp] =~ /^[in]$/ && $types_to_go[$itokm] =~ /^[in]$/ # one of the two lines must be short: && ( ( # no more than 2 nonblank tokens right # of joint $itokpp == $iend_2 # short && token_sequence_length( $itokp, $iend_2 ) < $two + $rOpts_short_concatenation_item_length ) || ( # no more than 2 nonblank tokens left of # joint $itokmm == $ibeg_1 # short && token_sequence_length( $ibeg_1, $itokm ) < 2 - $two + $rOpts_short_concatenation_item_length ) ) # keep pure terms; don't mix +- with */ && !( $is_plus_minus{$type} && ( $is_mult_div{ $types_to_go[$itokmm] } || $is_mult_div{ $types_to_go[$itokpp] } ) ) && !( $is_mult_div{$type} && ( $is_plus_minus{ $types_to_go[$itokmm] } || $is_plus_minus{ $types_to_go[$itokpp] } ) ) ; } # it is also good to combine if we can reduce to 2 # lines if ( !$good_combo ) { # index on other line where same token would be # in a long chain. my $iother = ( $itok == $iend_1 ) ? $iend_2 : $ibeg_1; $good_combo = $n == 2 && $n == $nmax && $types_to_go[$iother] ne $type; } return unless ($good_combo); } ## end math elsif ( $is_amp_amp{$type} ) { ##TBD } ## end &&, || elsif ( $is_assignment{$type} ) { ##TBD } ## end assignment } # ok to combine lines return 1; } ## end sub recombine_section_0 sub recombine_section_2 { my ( $ri_beg, $ri_end, $n, $this_line_is_semicolon_terminated ) = @_; # Recombine Section 2: # Examine token at $iend_1 (right end of first line of pair) # Here are Indexes of the endpoint tokens of the two lines: # # -----line $n-1--- | -----line $n----- # $ibeg_1 $iend_1 | $ibeg_2 $iend_2 # ^ # | # -----Section 2 looks at this token # Returns: # (nothing) => do not join lines # 1, skip_Section_3 => ok to join lines # $skip_Section_3 is a flag for skipping the next section my $skip_Section_3 = 0; my $nmax = @{$ri_end} - 1; my $ibeg_1 = $ri_beg->[ $n - 1 ]; my $iend_1 = $ri_end->[ $n - 1 ]; my $iend_2 = $ri_end->[$n]; my $ibeg_2 = $ri_beg->[$n]; my 
          $ibeg_3 = $n < $nmax ? $ri_beg->[ $n + 1 ] : -1;
        my $ibeg_nmax = $ri_beg->[$nmax];

        my $type_iend_1 = $types_to_go[$iend_1];
        my $type_iend_2 = $types_to_go[$iend_2];
        my $type_ibeg_1 = $types_to_go[$ibeg_1];
        my $type_ibeg_2 = $types_to_go[$ibeg_2];

        # an isolated '}' may join with a ';' terminated segment
        if ( $type_iend_1 eq '}' ) {

            # Check for cases where combining a semicolon terminated
            # statement with a previous isolated closing paren will
            # allow the combined line to be outdented. This is
            # generally a good move. For example, we can join up
            # the last two lines here:
            #  (
            #      $dev, $ino, $mode, $nlink, $uid, $gid, $rdev,
            #      $size, $atime, $mtime, $ctime, $blksize, $blocks
            #    )
            #    = stat($file);
            #
            # to get:
            #  (
            #      $dev, $ino, $mode, $nlink, $uid, $gid, $rdev,
            #      $size, $atime, $mtime, $ctime, $blksize, $blocks
            #  ) = stat($file);
            #
            # which makes the parens line up.
            #
            # Another example, from Joe Matarazzo, probably looks best
            # with the 'or' clause appended to the trailing paren:
            #  $self->some_method(
            #      PARAM1 => 'foo',
            #      PARAM2 => 'bar'
            #  ) or die "Some_method didn't work";
            #
            # But we do not want to do this for something like the -lp
            # option where the paren is not outdentable because the
            # trailing clause will be far to the right.
            #
            # The logic here is synchronized with the logic in
            # sub get_final_indentation, which actually does
            # the outdenting.
            #
            my $combine_ok = $this_line_is_semicolon_terminated

              # only one token on last line
              && $ibeg_1 == $iend_1

              # must be structural paren
              && $tokens_to_go[$iend_1] eq ')'

              # style must allow outdenting,
              && !$closing_token_indentation{')'}

              # but leading colons probably line up with a
              # previous colon or question (count could be wrong).
              && $type_ibeg_2 ne ':'

              # only one step in depth allowed. this line must not
              # begin with a ')' itself.
              && ( $nesting_depth_to_go[$iend_1] ==
                $nesting_depth_to_go[$iend_2] + 1 );

            # But only combine leading '&&', '||', if no previous && || :
            # seen. This count includes these tokens at all levels.
            # The idea is that seeing these at any level can make it hard to
            # read formatting if we recombine.
            if ( $is_amp_amp{$type_ibeg_2} ) {
                foreach my $n_t ( reverse( 0 .. $n - 2 ) ) {
                    my $ibeg_t = $ri_beg->[$n_t];
                    my $type_t = $types_to_go[$ibeg_t];
                    if ( $is_amp_amp{$type_t} || $type_t eq ':' ) {
                        $combine_ok = 0;
                        last;
                    }
                }
            }

            $skip_Section_3 ||= $combine_ok;

            # YVES patch 2 of 2:
            # Allow cuddled eval chains, like this:
            #   eval {
            #       #STUFF;
            #       1; # return true
            #   } or do {
            #       #handle error
            #   };
            # This patch works together with a patch in
            # setting adjusted indentation (where the closing eval
            # brace is outdented if possible).
            # The problem is that an 'eval' block has continuation
            # indentation and it looks better to undo it in some
            # cases. If we do not use this patch we would get:
            #   eval {
            #       #STUFF;
            #       1; # return true
            #       }
            #       or do {
            #       #handle error
            #     };
            # The alternative, for uncuddled style, is to create
            # a patch in get_final_indentation which undoes
            # the indentation of a leading line like 'or do {'.
            # This doesn't work well with -icb though
            if (
                   $block_type_to_go[$iend_1]
                && $rOpts_brace_follower_vertical_tightness > 0
                && (

                    # -bfvt=1, allow cuddled eval chains [default]
                    (
                           $tokens_to_go[$iend_2] eq '{'
                        && $block_type_to_go[$iend_1] eq 'eval'
                        && !ref( $leading_spaces_to_go[$iend_1] )
                        && !$rOpts_indent_closing_brace
                    )

                    # -bfvt=2, allow most brace followers [part of git #110]
                    || (   $rOpts_brace_follower_vertical_tightness > 1
                        && $ibeg_1 == $iend_1 )

                )

                && (
                       ( $type_ibeg_2 =~ /^(\&\&|\|\|)$/ )
                    || (   $type_ibeg_2 eq 'k'
                        && $is_and_or{ $tokens_to_go[$ibeg_2] } )
                    || $is_if_unless{ $tokens_to_go[$ibeg_2] }
                )
              )
            {
                $skip_Section_3 ||= 1;
            }

            return
              unless (
                $skip_Section_3

                # handle '.' and '?'
specially below || ( $type_ibeg_2 =~ /^[\.\?]$/ ) # fix for c054 (unusual -pbp case) || $type_ibeg_2 eq '==' ); } elsif ( $type_iend_1 eq '{' ) { # YVES # honor breaks at opening brace # Added to prevent recombining something like this: # } || eval { package main; return if ( $forced_breakpoint_to_go[$iend_1] ); } # do not recombine lines with ending &&, ||, elsif ( $is_amp_amp{$type_iend_1} ) { return unless ( $want_break_before{$type_iend_1} ); } # Identify and recombine a broken ?/: chain elsif ( $type_iend_1 eq '?' ) { # Do not recombine different levels return if ( $levels_to_go[$ibeg_1] ne $levels_to_go[$ibeg_2] ); # do not recombine unless next line ends in : return unless $type_iend_2 eq ':'; } # for lines ending in a comma... elsif ( $type_iend_1 eq ',' ) { # Do not recombine at comma which is following the # input bias. # NOTE: this could be controlled by a special flag, # but it seems to work okay. return if ( $old_breakpoint_to_go[$iend_1] ); # An isolated '},' may join with an identifier + ';' # This is useful for the class of a 'bless' statement # (bless.t) if ( $type_ibeg_1 eq '}' && $type_ibeg_2 eq 'i' ) { return unless ( ( $ibeg_1 == ( $iend_1 - 1 ) ) && ( $iend_2 == ( $ibeg_2 + 1 ) ) && $this_line_is_semicolon_terminated ); # override breakpoint $forced_breakpoint_to_go[$iend_1] = 0; } # but otherwise .. else { # do not recombine after a comma unless this will # leave just 1 more line return unless ( $n + 1 >= $nmax ); # do not recombine if there is a change in # indentation depth return if ( $levels_to_go[$iend_1] != $levels_to_go[$iend_2] ); # do not recombine a "complex expression" after a # comma. "complex" means no parens. my $saw_paren; foreach my $ii ( $ibeg_2 .. $iend_2 ) { if ( $tokens_to_go[$ii] eq '(' ) { $saw_paren = 1; last; } } return if $saw_paren; } } # opening paren.. 
elsif ( $type_iend_1 eq '(' ) { # No longer doing this } elsif ( $type_iend_1 eq ')' ) { # No longer doing this } # keep a terminal for-semicolon elsif ( $type_iend_1 eq 'f' ) { return; } # if '=' at end of line ... elsif ( $is_assignment{$type_iend_1} ) { # keep break after = if it was in input stream # this helps prevent 'blinkers' return if ( $old_breakpoint_to_go[$iend_1] # don't strand an isolated '=' && $iend_1 != $ibeg_1 ); my $is_short_quote = ( $type_ibeg_2 eq 'Q' && $ibeg_2 == $iend_2 && token_sequence_length( $ibeg_2, $ibeg_2 ) < $rOpts_short_concatenation_item_length ); my $is_ternary = ( $type_ibeg_1 eq '?' && ( $ibeg_3 >= 0 && $types_to_go[$ibeg_3] eq ':' ) ); # always join an isolated '=', a short quote, or if this # will put ?/: at start of adjacent lines if ( $ibeg_1 != $iend_1 && !$is_short_quote && !$is_ternary ) { return unless ( ( # unless we can reduce this to two lines $nmax < $n + 2 # or three lines, the last with a leading # semicolon || ( $nmax == $n + 2 && $types_to_go[$ibeg_nmax] eq ';' ) # or the next line ends with a here doc || $type_iend_2 eq 'h' # or the next line ends in an open paren or # brace and the break hasn't been forced # [dima.t] || ( !$forced_breakpoint_to_go[$iend_1] && $type_iend_2 eq '{' ) ) # do not recombine if the two lines might align # well this is a very approximate test for this && ( # RT#127633 - the leading tokens are not # operators ( $type_ibeg_2 ne $tokens_to_go[$ibeg_2] ) # or they are different || ( $ibeg_3 >= 0 && $type_ibeg_2 ne $types_to_go[$ibeg_3] ) ) ); if ( # Recombine if we can make two lines $nmax >= $n + 2 # -lp users often prefer this: # my $title = function($env, $env, $sysarea, # "bubba Borrower Entry"); # so we will recombine if -lp is used we have # ending comma && !( $ibeg_3 > 0 && ref( $leading_spaces_to_go[$ibeg_3] ) && $type_iend_2 eq ',' ) ) { # otherwise, scan the rhs line up to last token for # complexity. Note that we are not counting the last token # in case it is an opening paren. 
my $ok = simple_rhs( $ri_end, $n, $nmax, $ibeg_2, $iend_2 ); return if ( !$ok ); } } unless ( $tokens_to_go[$ibeg_2] =~ /^[\{\(\[]$/ ) { $forced_breakpoint_to_go[$iend_1] = 0; } } # for keywords.. elsif ( $type_iend_1 eq 'k' ) { # make major control keywords stand out # (recombine.t) return if ( #/^(last|next|redo|return)$/ $is_last_next_redo_return{ $tokens_to_go[$iend_1] } # but only if followed by multiple lines && $n < $nmax ); if ( $is_and_or{ $tokens_to_go[$iend_1] } ) { return unless $want_break_before{ $tokens_to_go[$iend_1] }; } } elsif ( $type_iend_1 eq '.' ) { # NOTE: the logic here should match that of section 3 so that # line breaks are independent of choice of break before or after. # It would be nice to combine them in section 0, but the # special junction case ') .' makes that difficult. # This section added to fix issues c172, c174. my $i_next_nonblank = $ibeg_2; my $summed_len_1 = $summed_lengths_to_go[ $iend_1 + 1 ] - $summed_lengths_to_go[$ibeg_1]; my $summed_len_2 = $summed_lengths_to_go[ $iend_2 + 1 ] - $summed_lengths_to_go[$ibeg_2]; my $iend_1_minus = max( $ibeg_1, iprev_to_go($iend_1) ); return unless ( # ... unless there is just one and we can reduce # this to two lines if we do. For example, this # # # $bodyA .= # '($dummy, $pat) = &get_next_tex_cmd;' . '$args .= $pat;' # # looks better than this: # $bodyA .= '($dummy, $pat) = &get_next_tex_cmd;' . # '$args .= $pat;' # check for 2 lines, not in a long broken '.' chain ( $n == 2 && $n == $nmax && $type_iend_1 ne $type_iend_2 ) # ... or this would strand a short quote , like this # "some long quote" . 
# "\n"; || ( $types_to_go[$i_next_nonblank] eq 'Q' && $i_next_nonblank >= $iend_2 - 2 && $token_lengths_to_go[$i_next_nonblank] < $rOpts_short_concatenation_item_length # additional constraints to fix c167 && ( $types_to_go[$iend_1_minus] ne 'Q' || $summed_len_2 < $summed_len_1 ) ) ); } return ( 1, $skip_Section_3 ); } ## end sub recombine_section_2 sub simple_rhs { my ( $ri_end, $n, $nmax, $ibeg_2, $iend_2 ) = @_; # Scan line ibeg_2 to $iend_2 up to last token for complexity. # We are not counting the last token in case it is an opening paren. # Return: # true if rhs is simple, ok to recombine # false otherwise my $tv = 0; my $depth = $nesting_depth_to_go[$ibeg_2]; foreach my $i ( $ibeg_2 + 1 .. $iend_2 - 1 ) { if ( $nesting_depth_to_go[$i] != $depth ) { $tv++; last if ( $tv > 1 ); } $depth = $nesting_depth_to_go[$i]; } # ok to recombine if no level changes before # last token if ( $tv > 0 ) { # otherwise, do not recombine if more than # two level changes. return if ( $tv > 1 ); # check total complexity of the two # adjacent lines that will occur if we do # this join my $istop = ( $n < $nmax ) ? $ri_end->[ $n + 1 ] : $iend_2; foreach my $i ( $iend_2 .. 
              $istop ) {
                if ( $nesting_depth_to_go[$i] != $depth ) {
                    $tv++;
                    last if ( $tv > 2 );
                }
                $depth = $nesting_depth_to_go[$i];
            }

            # do not recombine if total is more than 2
            # level changes
            return if ( $tv > 2 );
        }
        return 1;
    } ## end sub simple_rhs

    sub recombine_section_3 {

        my ( $ri_beg, $ri_end, $n, $this_line_is_semicolon_terminated ) = @_;

        # Recombine Section 3:
        # Examine token at $ibeg_2 (left end of second line of pair)

        # Here are the indexes of the endpoint tokens of the two lines:
        #
        #  -----line $n-1--- | -----line $n-----
        #  $ibeg_1   $iend_1 | $ibeg_2   $iend_2
        #                      ^
        #                      |
        #  -----Section 3 looks at this token

        # Returns:
        #   (nothing)         => do not join lines
        #   1, bs_tweak       => ok to join lines

        # $bs_tweak is a small tolerance to add to bond strengths
        my $bs_tweak = 0;

        my $nmax   = @{$ri_end} - 1;
        my $ibeg_1 = $ri_beg->[ $n - 1 ];
        my $iend_1 = $ri_end->[ $n - 1 ];
        my $iend_2 = $ri_end->[$n];
        my $ibeg_2 = $ri_beg->[$n];

        my $ibeg_0    = $n > 1          ? $ri_beg->[ $n - 2 ] : -1;
        my $ibeg_3    = $n < $nmax      ? $ri_beg->[ $n + 1 ] : -1;
        my $ibeg_4    = $n + 2 <= $nmax ? $ri_beg->[ $n + 2 ] : -1;
        my $ibeg_nmax = $ri_beg->[$nmax];

        my $type_iend_1 = $types_to_go[$iend_1];
        my $type_iend_2 = $types_to_go[$iend_2];
        my $type_ibeg_1 = $types_to_go[$ibeg_1];
        my $type_ibeg_2 = $types_to_go[$ibeg_2];

        # handle lines with leading &&, ||
        if ( $is_amp_amp{$type_ibeg_2} ) {

            # ok to recombine if it follows a ? or :
            # and is followed by an open paren..
            my $ok =
              ( $is_ternary{$type_ibeg_1} && $tokens_to_go[$iend_2] eq '(' )

              # or is followed by a ? or : at same depth
              #
              # We are looking for something like this. We can
              # recombine the && line with the line above to make the
              # structure more clear:
              #  return
              #    exists $G->{Attr}->{V}
              #    && exists $G->{Attr}->{V}->{$u}
              #    ? %{ $G->{Attr}->{V}->{$u} }
              #    : ();
              #
              # We should probably leave something like this alone:
              #  return
              #       exists $G->{Attr}->{E}
              #    && exists $G->{Attr}->{E}->{$u}
              #    && exists $G->{Attr}->{E}->{$u}->{$v}
              #    ? %{ $G->{Attr}->{E}->{$u}->{$v} }
              #    : ();
              # so that we either have all of the &&'s (or ||'s)
              # on one line, as in the first example, or break at
              # each one as in the second example.  However, it
              # sometimes makes things worse to check for this because
              # it prevents multiple recombinations.  So this is not done.
              || ( $ibeg_3 >= 0
                && $is_ternary{ $types_to_go[$ibeg_3] }
                && $nesting_depth_to_go[$ibeg_3] ==
                $nesting_depth_to_go[$ibeg_2] );

            # Combine a trailing && term with an || term: fix for c060.
            # This is rare but can happen.
            $ok ||= 1
              if ( $ibeg_3 < 0
                && $type_ibeg_2 eq '&&'
                && $type_ibeg_1 eq '||'
                && $nesting_depth_to_go[$ibeg_2] ==
                $nesting_depth_to_go[$ibeg_1] );

            return if !$ok && $want_break_before{$type_ibeg_2};
            $forced_breakpoint_to_go[$iend_1] = 0;

            # tweak the bond strength to give this joint priority
            # over ? and :
            $bs_tweak = 0.25;
        }

        # Identify and recombine a broken ?/: chain
        elsif ( $type_ibeg_2 eq '?' ) {

            # Do not recombine different levels
            my $lev = $levels_to_go[$ibeg_2];
            return if ( $lev ne $levels_to_go[$ibeg_1] );

            # Do not recombine a '?' unless either the next line or the
            # previous line starts with a ':'.  The reasons
            # are that (1) no alignment of the ? will be possible
            # and (2) the expression is somewhat complex, so the
            # '?' is harder to see in the interior of the line.
            my $follows_colon  = $ibeg_1 >= 0 && $type_ibeg_1 eq ':';
            my $precedes_colon = $ibeg_3 >= 0 && $types_to_go[$ibeg_3] eq ':';
            return unless ( $follows_colon || $precedes_colon );

            # we will always combine a ? line following a : line
            if ( !$follows_colon ) {

                # ...otherwise recombine only if it looks like a
                # chain.  We will just look at a few nearby lines
                # to see if this looks like a chain.
                my $local_count = 0;
                foreach my $ii ( $ibeg_0, $ibeg_1, $ibeg_3, $ibeg_4 ) {
                    $local_count++
                      if $ii >= 0
                      && $types_to_go[$ii] eq ':'
                      && $levels_to_go[$ii] == $lev;
                }
                return unless ( $local_count > 1 );
            }
            $forced_breakpoint_to_go[$iend_1] = 0;
        }

        # do not recombine lines with leading '.'
elsif ( $type_ibeg_2 eq '.' ) { my $i_next_nonblank = min( $inext_to_go[$ibeg_2], $iend_2 ); my $summed_len_1 = $summed_lengths_to_go[ $iend_1 + 1 ] - $summed_lengths_to_go[$ibeg_1]; my $summed_len_2 = $summed_lengths_to_go[ $iend_2 + 1 ] - $summed_lengths_to_go[$ibeg_2]; return unless ( # ... unless there is just one and we can reduce # this to two lines if we do. For example, this # # # $bodyA .= # '($dummy, $pat) = &get_next_tex_cmd;' . '$args .= $pat;' # # looks better than this: # $bodyA .= '($dummy, $pat) = &get_next_tex_cmd;' # . '$args .= $pat;' ( $n == 2 && $n == $nmax && $type_ibeg_1 ne $type_ibeg_2 ) # ... or this would strand a short quote , like this # . "some long quote" # . "\n"; || ( $types_to_go[$i_next_nonblank] eq 'Q' && $i_next_nonblank >= $iend_2 - 1 && $token_lengths_to_go[$i_next_nonblank] < $rOpts_short_concatenation_item_length # additional constraints to fix c167 && ( $types_to_go[$iend_1] ne 'Q' # allow a term shorter than the previous term || $summed_len_2 < $summed_len_1 # or allow a short semicolon-terminated term if this # makes two lines (see c169) || ( $n == 2 && $n == $nmax && $this_line_is_semicolon_terminated ) ) ) ); } # handle leading keyword.. elsif ( $type_ibeg_2 eq 'k' ) { # handle leading "or" if ( $tokens_to_go[$ibeg_2] eq 'or' ) { return unless ( $this_line_is_semicolon_terminated && ( $type_ibeg_1 eq '}' || ( # following 'if' or 'unless' or 'or' $type_ibeg_1 eq 'k' && $is_if_unless{ $tokens_to_go[$ibeg_1] } # important: only combine a very simple # or statement because the step below # may have combined a trailing 'and' # with this or, and we do not want to # then combine everything together && ( $iend_2 - $ibeg_2 <= 7 ) ) ) ); #X: RT #81854 $forced_breakpoint_to_go[$iend_1] = 0 unless ( $old_breakpoint_to_go[$iend_1] ); } # handle leading 'and' and 'xor' elsif ($tokens_to_go[$ibeg_2] eq 'and' || $tokens_to_go[$ibeg_2] eq 'xor' ) { # Decide if we will combine a single terminal 'and' # after an 'if' or 'unless'. 
# This looks best with the 'and' on the same # line as the 'if': # # $a = 1 # if $seconds and $nu < 2; # # But this looks better as shown: # # $a = 1 # if !$this->{Parents}{$_} # or $this->{Parents}{$_} eq $_; # return unless ( $this_line_is_semicolon_terminated && ( # following 'if' or 'unless' or 'or' $type_ibeg_1 eq 'k' && ( $is_if_unless{ $tokens_to_go[$ibeg_1] } || $tokens_to_go[$ibeg_1] eq 'or' ) ) ); } # handle leading "if" and "unless" elsif ( $is_if_unless{ $tokens_to_go[$ibeg_2] } ) { # Combine something like: # next # if ( $lang !~ /${l}$/i ); # into: # next if ( $lang !~ /${l}$/i ); return unless ( $this_line_is_semicolon_terminated # previous line begins with 'and' or 'or' && $type_ibeg_1 eq 'k' && $is_and_or{ $tokens_to_go[$ibeg_1] } ); } # handle all other leading keywords else { # keywords look best at start of lines, # but combine things like "1 while" unless ( $is_assignment{$type_iend_1} ) { return if ( ( $type_iend_1 ne 'k' ) && ( $tokens_to_go[$ibeg_2] ne 'while' ) ); } } } # similar treatment of && and || as above for 'and' and # 'or': NOTE: This block of code is currently bypassed # because of a previous block but is retained for possible # future use. 
        elsif ( $is_amp_amp{$type_ibeg_2} ) {

            # maybe looking at something like:
            # unless $TEXTONLY || $item =~ m%</?(hr>|p>|a|img)%i;

            return
              unless (
                $this_line_is_semicolon_terminated

                # previous line begins with an 'if' or 'unless'
                # keyword
                && $type_ibeg_1 eq 'k'
                && $is_if_unless{ $tokens_to_go[$ibeg_1] }

              );
        }

        # handle line with leading = or similar
        elsif ( $is_assignment{$type_ibeg_2} ) {
            return unless ( $n == 1 || $n == $nmax );
            return if ( $old_breakpoint_to_go[$iend_1] );
            return
              unless (

                # unless we can reduce this to two lines
                $nmax == 2

                # or three lines, the last with a leading semicolon
                || ( $nmax == 3 && $types_to_go[$ibeg_nmax] eq ';' )

                # or the next line ends with a here doc
                || $type_iend_2 eq 'h'

                # or this is a short line ending in ;
                || (   $n == $nmax
                    && $this_line_is_semicolon_terminated )
              );
            $forced_breakpoint_to_go[$iend_1] = 0;
        }
        return ( 1, $bs_tweak );
    } ## end sub recombine_section_3

} ## end closure recombine_breakpoints

sub insert_final_ternary_breaks {

    my ( $self, $ri_left, $ri_right ) = @_;

    # Called once per batch to look for and do any final line breaks for
    # long ternary chains

    my $nmax = @{$ri_right} - 1;

    # scan the left and right end tokens of all lines
    my $i_first_colon = -1;
    for my $n ( 0 .. $nmax ) {
        my $il    = $ri_left->[$n];
        my $ir    = $ri_right->[$n];
        my $typel = $types_to_go[$il];
        my $typer = $types_to_go[$ir];
        return if ( $typel eq '?' );
        return if ( $typer eq '?' );
        if    ( $typel eq ':' ) { $i_first_colon = $il; last; }
        elsif ( $typer eq ':' ) { $i_first_colon = $ir; last; }
    }

    # For long ternary chains,
    # if the first : we see has its ? in the interior
    # of a preceding line, then see if there are any good
    # breakpoints before the ?.
    if ( $i_first_colon > 0 ) {
        my $i_question = $mate_index_to_go[$i_first_colon];
        if ( defined($i_question) && $i_question > 0 ) {
            my @insert_list;
            foreach my $ii ( reverse( 0 ..
$i_question - 1 ) ) { my $token = $tokens_to_go[$ii]; my $type = $types_to_go[$ii]; # For now, a good break is either a comma or, # in a long chain, a 'return'. # Patch for RT #126633: added the $nmax>1 check to avoid # breaking after a return for a simple ternary. For longer # chains the break after return allows vertical alignment, so # it is still done. So perltidy -wba='?' will not break # immediately after the return in the following statement: # sub x { # return 0 ? 'aaaaaaaaaaaaaaaaaaaaa' : # 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'; # } if ( ( $type eq ',' || $type eq 'k' && ( $nmax > 1 && $token eq 'return' ) ) && $self->in_same_container_i( $ii, $i_question ) ) { push @insert_list, $ii; last; } } # insert any new break points if (@insert_list) { $self->insert_additional_breaks( \@insert_list, $ri_left, $ri_right ); } } } return; } ## end sub insert_final_ternary_breaks sub insert_breaks_before_list_opening_containers { my ( $self, $ri_left, $ri_right ) = @_; # This routine is called once per batch to implement the parameters # --break-before-hash-brace, etc. # Nothing to do if none of these parameters has been set return unless %break_before_container_types; my $nmax = @{$ri_right} - 1; return unless ( $nmax >= 0 ); my $rLL = $self->[_rLL_]; my $rbreak_before_container_by_seqno = $self->[_rbreak_before_container_by_seqno_]; my $rK_weld_left = $self->[_rK_weld_left_]; # scan the ends of all lines my @insert_list; for my $n ( 0 .. $nmax ) { my $il = $ri_left->[$n]; my $ir = $ri_right->[$n]; next unless ( $ir > $il ); my $Kl = $K_to_go[$il]; my $Kr = $K_to_go[$ir]; my $Kend = $Kr; my $type_end = $rLL->[$Kr]->[_TYPE_]; # Backup before any side comment if ( $type_end eq '#' ) { $Kend = $self->K_previous_nonblank($Kr); next unless defined($Kend); $type_end = $rLL->[$Kend]->[_TYPE_]; } # Backup to the start of any weld; fix for b1173. 
if ($total_weld_count) { my $Kend_test = $rK_weld_left->{$Kend}; if ( defined($Kend_test) && $Kend_test > $Kl ) { $Kend = $Kend_test; $Kend_test = $rK_weld_left->{$Kend}; } # Do not break if we did not back up to the start of a weld # (shouldn't happen) next if ( defined($Kend_test) ); } my $token = $rLL->[$Kend]->[_TOKEN_]; next unless ( $is_opening_token{$token} ); next unless ( $Kl < $Kend - 1 ); my $seqno = $rLL->[$Kend]->[_TYPE_SEQUENCE_]; next unless ( defined($seqno) ); # Use the flag which was previously set next unless ( $rbreak_before_container_by_seqno->{$seqno} ); # Install a break before this opening token. my $Kbreak = $self->K_previous_nonblank($Kend); my $ibreak = $Kbreak - $Kl + $il; next if ( $ibreak < $il ); next if ( $nobreak_to_go[$ibreak] ); push @insert_list, $ibreak; } # insert any new break points if (@insert_list) { $self->insert_additional_breaks( \@insert_list, $ri_left, $ri_right ); } return; } ## end sub insert_breaks_before_list_opening_containers sub note_added_semicolon { my ( $self, $line_number ) = @_; $self->[_last_added_semicolon_at_] = $line_number; if ( $self->[_added_semicolon_count_] == 0 ) { $self->[_first_added_semicolon_at_] = $line_number; } $self->[_added_semicolon_count_]++; write_logfile_entry("Added ';' here\n"); return; } ## end sub note_added_semicolon sub note_deleted_semicolon { my ( $self, $line_number ) = @_; $self->[_last_deleted_semicolon_at_] = $line_number; if ( $self->[_deleted_semicolon_count_] == 0 ) { $self->[_first_deleted_semicolon_at_] = $line_number; } $self->[_deleted_semicolon_count_]++; write_logfile_entry("Deleted unnecessary ';' at line $line_number\n"); return; } ## end sub note_deleted_semicolon sub note_embedded_tab { my ( $self, $line_number ) = @_; $self->[_embedded_tab_count_]++; $self->[_last_embedded_tab_at_] = $line_number; if ( !$self->[_first_embedded_tab_at_] ) { $self->[_first_embedded_tab_at_] = $line_number; } if ( $self->[_embedded_tab_count_] <= MAX_NAG_MESSAGES ) { 
write_logfile_entry("Embedded tabs in quote or pattern\n"); } return; } ## end sub note_embedded_tab use constant DEBUG_CORRECT_LP => 0; sub correct_lp_indentation { # When the -lp option is used, we need to make a last pass through # each line to correct the indentation positions in case they differ # from the predictions. This is necessary because perltidy uses a # predictor/corrector method for aligning with opening parens. The # predictor is usually good, but sometimes stumbles. The corrector # tries to patch things up once the actual opening paren locations # are known. my ( $self, $ri_first, $ri_last ) = @_; # first remove continuation indentation if appropriate my $max_line = @{$ri_first} - 1; #--------------------------------------------------------------------------- # PASS 1: reduce indentation if necessary at any long one-line blocks (c098) #--------------------------------------------------------------------------- # The point is that sub 'starting_one_line_block' made one-line blocks based # on default indentation, not -lp indentation. So some of the one-line # blocks may be too long when given -lp indentation. We will fix that now # if possible, using the list of these closing block indexes. my $ri_starting_one_line_block = $self->[_this_batch_]->[_ri_starting_one_line_block_]; if ( @{$ri_starting_one_line_block} ) { $self->correct_lp_indentation_pass_1( $ri_first, $ri_last, $ri_starting_one_line_block ); } #------------------------------------------------------------------- # PASS 2: look for and fix other problems in each line of this batch #------------------------------------------------------------------- # look at each output line ... foreach my $line ( 0 .. $max_line ) { my $ibeg = $ri_first->[$line]; my $iend = $ri_last->[$line]; # looking at each token in this output line ... foreach my $i ( $ibeg .. $iend ) { # How many space characters to place before this token # for special alignment. Actual padding is done in the # continue block. 
            # looking for next unvisited indentation item ...
            my $indentation = $leading_spaces_to_go[$i];

            # This is just for indentation objects (c098)
            next unless ( ref($indentation) );

            # Visit each indentation object just once
            next if ( $indentation->get_marked() );

            # Mark first visit
            $indentation->set_marked(1);

            # Skip indentation objects which do not align with container tokens
            my $align_seqno = $indentation->get_align_seqno();
            next unless ($align_seqno);

            # Skip a container which is entirely on this line
            my $Ko = $self->[_K_opening_container_]->{$align_seqno};
            my $Kc = $self->[_K_closing_container_]->{$align_seqno};
            if ( defined($Ko) && defined($Kc) ) {
                next if ( $Ko >= $K_to_go[$ibeg] && $Kc <= $K_to_go[$iend] );
            }

            # Note on flag '$do_not_pad':
            # We want to avoid a situation where the aligner inserts
            # whitespace before the '=' to align it with a previous '='.
            # That padding pushes the opening '(' forward beyond where we
            # want it, so the parens become mis-aligned, as in this example:
            #
            #  $mkFloor::currentRoom = '';
            #  $mkFloor::c_entry       = $c->Entry(
            #                                 -width        => '10',
            #                                 -relief       => 'sunken',
            #                                 ...
            #                                 );
            #
            #  We leave it to the aligner to decide how to do this.
            if ( $line == 1 && $i == $ibeg ) {
                $self->[_this_batch_]->[_do_not_pad_] = 1;
            }

            #--------------------------------------------
            # Now see what the error is and try to fix it
            #--------------------------------------------
            my $closing_index = $indentation->get_closed();
            my $predicted_pos = $indentation->get_spaces();

            # Find actual position:
            my $actual_pos;

            if ( $i == $ibeg ) {

                # Case 1: token is first character of batch - table lookup
                if ( $line == 0 ) {

                    $actual_pos = $predicted_pos;

                    my ( $indent, $offset, $is_leading, $exists ) =
                      get_saved_opening_indentation($align_seqno);
                    if ( defined($indent) ) {

                        # NOTE: we could use '1' here if no space after
                        # opening and '2' if want space; it is hardwired at 1
                        # like -gnu-style. But it is probably best to leave
                        # this alone because changing it would change
                        # formatting of much existing code without any
                        # significant benefit.
                        $actual_pos =
                          get_spaces($indent) + $offset + 1;
                    }
                }

                # Case 2: token starts a new line - use length of previous line
                else {

                    my $ibegm = $ri_first->[ $line - 1 ];
                    my $iendm = $ri_last->[ $line - 1 ];
                    $actual_pos = total_line_length( $ibegm, $iendm );

                    # follow -pt style
                    ++$actual_pos
                      if ( $types_to_go[ $iendm + 1 ] eq 'b' );
                }
            }

            # Case 3: $i>$ibeg: token is mid-line - use length to previous token
            else {

                $actual_pos = total_line_length( $ibeg, $i - 1 );

                # for mid-line token, we must check to see if all
                # additional lines have continuation indentation,
                # and remove it if so.  Otherwise, we do not get
                # good alignment.
                if ( $closing_index > $iend ) {
                    my $ibeg_next = $ri_first->[ $line + 1 ];
                    if ( $ci_levels_to_go[$ibeg_next] > 0 ) {
                        $self->undo_lp_ci( $line, $i, $closing_index,
                            $ri_first, $ri_last );
                    }
                }
            }

            # By how many spaces (plus or minus) would we need to increase the
            # indentation to get alignment with the opening token?
            my $move_right = $actual_pos - $predicted_pos;

            if (DEBUG_CORRECT_LP) {
                my $tok   = substr( $tokens_to_go[$i], 0, 8 );
                my $avail = $self->get_available_spaces_to_go($ibeg);
                print
"CORRECT_LP for seq=$align_seqno, predicted pos=$predicted_pos actual=$actual_pos => move right=$move_right available=$avail i=$i max=$max_index_to_go tok=$tok\n";
            }

            # nothing more to do if no error to correct (gnu2.t)
            if ( $move_right == 0 ) {
                $indentation->set_recoverable_spaces($move_right);
                next;
            }

            # Get any collapsed length defined for -xlp
            my $collapsed_length =
              $self->[_rcollapsed_length_by_seqno_]->{$align_seqno};
            $collapsed_length = 0 unless ( defined($collapsed_length) );

            if (DEBUG_CORRECT_LP) {
                print
"CORRECT_LP for seq=$align_seqno, collapsed length is $collapsed_length\n";
            }

            # if we have not seen closure for this indentation in this batch,
            # and do not have a collapsed length estimate, we can only pass on
            # a request to the vertical aligner
            if ( $closing_index < 0 && !$collapsed_length ) {
                $indentation->set_recoverable_spaces($move_right);
                next;
            }

            # If necessary, look ahead to see if there is really any leading
            # whitespace dependent on this whitespace, and also find the
            # longest line using this whitespace.  Since it is always safe to
            # move left if there are no dependents, we only need to do this if
            # we may have dependent nodes or need to move right.

            my $have_child = $indentation->get_have_child();
            my %saw_indentation;
            my $line_count = 1;
            $saw_indentation{$indentation} = $indentation;

            # How far can we move right before we hit the limit?
            # let $right_margin = the number of spaces that we can increase
            # the current indentation before hitting the maximum line length.
my $right_margin = 0; if ( $have_child || $move_right > 0 ) { $have_child = 0; # include estimated collapsed length for incomplete containers my $max_length = 0; if ( $Kc > $K_to_go[$max_index_to_go] ) { $max_length = $collapsed_length + $predicted_pos; } if ( $i == $ibeg ) { my $length = total_line_length( $ibeg, $iend ); if ( $length > $max_length ) { $max_length = $length } } # look ahead at the rest of the lines of this batch.. foreach my $line_t ( $line + 1 .. $max_line ) { my $ibeg_t = $ri_first->[$line_t]; my $iend_t = $ri_last->[$line_t]; last if ( $closing_index <= $ibeg_t ); # remember all different indentation objects my $indentation_t = $leading_spaces_to_go[$ibeg_t]; $saw_indentation{$indentation_t} = $indentation_t; $line_count++; # remember longest line in the group my $length_t = total_line_length( $ibeg_t, $iend_t ); if ( $length_t > $max_length ) { $max_length = $length_t; } } $right_margin = $maximum_line_length_at_level[ $levels_to_go[$ibeg] ] - $max_length; if ( $right_margin < 0 ) { $right_margin = 0 } } my $first_line_comma_count = grep { $_ eq ',' } @types_to_go[ $ibeg .. $iend ]; my $comma_count = $indentation->get_comma_count(); my $arrow_count = $indentation->get_arrow_count(); # This is a simple approximate test for vertical alignment: # if we broke just after an opening paren, brace, bracket, # and there are 2 or more commas in the first line, # and there are no '=>'s, # then we are probably vertically aligned. We could set # an exact flag in sub break_lists, but this is good # enough. my $indentation_count = keys %saw_indentation; my $is_vertically_aligned = ( $i == $ibeg && $first_line_comma_count > 1 && $indentation_count == 1 && ( $arrow_count == 0 || $arrow_count == $line_count ) ); # Make the move if possible .. 
if ( # we can always move left $move_right < 0 # -xlp # incomplete container || ( $rOpts_extended_line_up_parentheses && $Kc > $K_to_go[$max_index_to_go] ) || $closing_index < 0 # but we should only move right if we are sure it will # not spoil vertical alignment || ( $comma_count == 0 ) || ( $comma_count > 0 && !$is_vertically_aligned ) ) { my $move = ( $move_right <= $right_margin ) ? $move_right : $right_margin; if (DEBUG_CORRECT_LP) { print "CORRECT_LP for seq=$align_seqno, moving $move spaces\n"; } foreach ( keys %saw_indentation ) { $saw_indentation{$_} ->permanently_decrease_available_spaces( -$move ); } } # Otherwise, record what we want and the vertical aligner # will try to recover it. else { $indentation->set_recoverable_spaces($move_right); } } ## end loop over tokens in a line } ## end loop over lines return; } ## end sub correct_lp_indentation sub correct_lp_indentation_pass_1 { my ( $self, $ri_first, $ri_last, $ri_starting_one_line_block ) = @_; # So some of the one-line blocks may be too long when given -lp # indentation. We will fix that now if possible, using the list of these # closing block indexes. my @ilist = @{$ri_starting_one_line_block}; return unless (@ilist); my $max_line = @{$ri_first} - 1; my $inext = shift(@ilist); # loop over lines, checking length of each with a one-line block my ( $ibeg, $iend ); foreach my $line ( 0 .. $max_line ) { $iend = $ri_last->[$line]; next if ( $inext > $iend ); $ibeg = $ri_first->[$line]; # This is just for lines with indentation objects (c098) my $excess = ref( $leading_spaces_to_go[$ibeg] ) ? 
$self->excess_line_length( $ibeg, $iend ) : 0; if ( $excess > 0 ) { my $available_spaces = $self->get_available_spaces_to_go($ibeg); if ( $available_spaces > 0 ) { my $delete_want = min( $available_spaces, $excess ); my $deleted_spaces = $self->reduce_lp_indentation( $ibeg, $delete_want ); $available_spaces = $self->get_available_spaces_to_go($ibeg); } } # skip forward to next one-line block to check while (@ilist) { $inext = shift @ilist; next if ( $inext <= $iend ); last if ( $inext > $iend ); } last if ( $inext <= $iend ); } return; } ## end sub correct_lp_indentation_pass_1 sub undo_lp_ci { # If there is a single, long parameter within parens, like this: # # $self->command( "/msg " # . $infoline->chan # . " You said $1, but did you know that it's square was " # . $1 * $1 . " ?" ); # # we can remove the continuation indentation of the 2nd and higher lines # to achieve this effect, which is more pleasing: # # $self->command("/msg " # . $infoline->chan # . " You said $1, but did you know that it's square was " # . $1 * $1 . " ?"); my ( $self, $line_open, $i_start, $closing_index, $ri_first, $ri_last ) = @_; my $max_line = @{$ri_first} - 1; # must be multiple lines return unless $max_line > $line_open; my $lev_start = $levels_to_go[$i_start]; my $ci_start_plus = 1 + $ci_levels_to_go[$i_start]; # see if all additional lines in this container have continuation # indentation my $line_1 = 1 + $line_open; my $n = $line_open; while ( ++$n <= $max_line ) { my $ibeg = $ri_first->[$n]; my $iend = $ri_last->[$n]; if ( $ibeg eq $closing_index ) { $n--; last } return if ( $lev_start != $levels_to_go[$ibeg] ); return if ( $ci_start_plus != $ci_levels_to_go[$ibeg] ); last if ( $closing_index <= $iend ); } # we can reduce the indentation of all continuation lines my $continuation_line_count = $n - $line_open; @ci_levels_to_go[ @{$ri_first}[ $line_1 .. $n ] ] = (0) x ($continuation_line_count); @leading_spaces_to_go[ @{$ri_first}[ $line_1 .. 
      $n ] ] = @reduced_spaces_to_go[ @{$ri_first}[ $line_1 .. $n ] ];
    return;
} ## end sub undo_lp_ci

###############################################
# CODE SECTION 10: Code to break long statements
###############################################

use constant DEBUG_BREAK_LINES => 0;

sub break_long_lines {

    #-----------------------------------------------------------
    # Break a batch of tokens into lines which do not exceed the
    # maximum line length.
    #-----------------------------------------------------------

    my ( $self, $saw_good_break, $rcolon_list, $rbond_strength_bias ) = @_;

    # Input parameters:
    #  $saw_good_break - a flag set by break_lists
    #  $rcolon_list - ref to a list of all the ? and : tokens in the batch,
    #    in order.
    #  $rbond_strength_bias - small bond strength bias values set by break_lists

    # Output: returns references to the arrays:
    #  @i_first
    #  @i_last
    # which contain the indexes $i of the first and last tokens on each
    # line.

    # In addition, the array:
    #   $forced_breakpoint_to_go[$i]
    # may be updated to be =1 for any index $i after which there must be
    # a break.  This signals later routines not to undo the breakpoint.

    # Method:
    # This routine is called if a statement is longer than the maximum line
    # length, or if a preliminary scan located desirable break points.
    # Sub break_lists has already looked at these tokens and set breakpoints
    # (in array $forced_breakpoint_to_go[$i]) where it wants breaks (for
    # example after commas, after opening parens, and before closing parens).
    # This routine will honor these breakpoints and also add additional
    # breakpoints as necessary to keep the line length below the maximum
    # requested.  It bases its decision on where the 'bond strength' is
    # lowest.
my @i_first = (); # the first index to output my @i_last = (); # the last index to output my @i_colon_breaks = (); # needed to decide if we have to break at ?'s if ( $types_to_go[0] eq ':' ) { push @i_colon_breaks, 0 } # Get the 'bond strengths' between tokens my $rbond_strength_to_go = $self->set_bond_strengths(); # Add any comma bias set by break_lists if ( @{$rbond_strength_bias} ) { foreach my $item ( @{$rbond_strength_bias} ) { my ( $ii, $bias ) = @{$item}; if ( $ii >= 0 && $ii <= $max_index_to_go ) { $rbond_strength_to_go->[$ii] += $bias; } elsif (DEVEL_MODE) { my $KK = $K_to_go[0]; my $lno = $self->[_rLL_]->[$KK]->[_LINE_INDEX_]; Fault( "Bad bond strength bias near line $lno: i=$ii must be between 0 and $max_index_to_go\n" ); } } } my $imin = 0; my $imax = $max_index_to_go; if ( $types_to_go[$imin] eq 'b' ) { $imin++ } if ( $types_to_go[$imax] eq 'b' ) { $imax-- } my $i_begin = $imin; my $last_break_strength = NO_BREAK; my $i_last_break = -1; my $line_count = 0; # see if any ?/:'s are in order my $colons_in_order = 1; my $last_tok = EMPTY_STRING; foreach ( @{$rcolon_list} ) { if ( $_ eq $last_tok ) { $colons_in_order = 0; last } $last_tok = $_; } # This is a sufficient but not necessary condition for colon chain my $is_colon_chain = ( $colons_in_order && @{$rcolon_list} > 2 ); #------------------------------------------ # BEGINNING of main loop to set breakpoints # Keep iterating until we reach the end #------------------------------------------ while ( $i_begin <= $imax ) { #------------------------------------------------------------------ # Find the best next breakpoint based on token-token bond strengths #------------------------------------------------------------------ my ( $i_lowest, $lowest_strength, $leading_alignment_type, $Msg ) = $self->break_lines_inner_loop( $i_begin, $i_last_break, $imax, $last_break_strength, $line_count, $rbond_strength_to_go, $saw_good_break, ); # Now make any adjustments required by ternary breakpoint rules if ( 
@{$rcolon_list} ) { my $i_next_nonblank = $inext_to_go[$i_lowest]; #------------------------------------------------------- # ?/: rule 1 : if a break here will separate a '?' on this # line from its closing ':', then break at the '?' instead. # But do not break a sequential chain of ?/: statements #------------------------------------------------------- if ( !$is_colon_chain ) { foreach my $i ( $i_begin + 1 .. $i_lowest - 1 ) { next unless ( $tokens_to_go[$i] eq '?' ); # do not break if statement is broken by side comment next if ( $tokens_to_go[$max_index_to_go] eq '#' && terminal_type_i( 0, $max_index_to_go ) !~ /^[\;\}]$/ ); # no break needed if matching : is also on the line next if ( defined( $mate_index_to_go[$i] ) && $mate_index_to_go[$i] <= $i_next_nonblank ); $i_lowest = $i; if ( $want_break_before{'?'} ) { $i_lowest-- } $i_next_nonblank = $inext_to_go[$i_lowest]; last; } } my $next_nonblank_type = $types_to_go[$i_next_nonblank]; #------------------------------------------------------------- # ?/: rule 2 : if we break at a '?', then break at its ':' # # Note: this rule is also in sub break_lists to handle a break # at the start and end of a line (in case breaks are dictated # by side comments). #------------------------------------------------------------- if ( $next_nonblank_type eq '?' ) { $self->set_closing_breakpoint($i_next_nonblank); } elsif ( $types_to_go[$i_lowest] eq '?' ) { $self->set_closing_breakpoint($i_lowest); } #-------------------------------------------------------- # ?/: rule 3 : if we break at a ':' then we save # its location for further work below. We may need to go # back and break at its '?'. 
#-------------------------------------------------------- if ( $next_nonblank_type eq ':' ) { push @i_colon_breaks, $i_next_nonblank; } elsif ( $types_to_go[$i_lowest] eq ':' ) { push @i_colon_breaks, $i_lowest; } # here we should set breaks for all '?'/':' pairs which are # separated by this line } # guard against infinite loop (should never happen) if ( $i_lowest <= $i_last_break ) { DEVEL_MODE && Fault("i_lowest=$i_lowest <= i_last_break=$i_last_break\n"); $i_lowest = $imax; } DEBUG_BREAK_LINES && print STDOUT "BREAK: best is i = $i_lowest strength = $lowest_strength;\nReason>> $Msg\n"; $line_count++; # save this line segment, after trimming blanks at the ends push( @i_first, ( $types_to_go[$i_begin] eq 'b' ) ? $i_begin + 1 : $i_begin ); push( @i_last, ( $types_to_go[$i_lowest] eq 'b' ) ? $i_lowest - 1 : $i_lowest ); # set a forced breakpoint at a container opening, if necessary, to # signal a break at a closing container. Excepting '(' for now. if ( ( $tokens_to_go[$i_lowest] eq '{' || $tokens_to_go[$i_lowest] eq '[' ) && !$forced_breakpoint_to_go[$i_lowest] ) { $self->set_closing_breakpoint($i_lowest); } # get ready to find the next breakpoint $last_break_strength = $lowest_strength; $i_last_break = $i_lowest; $i_begin = $i_lowest + 1; # skip past a blank if ( ( $i_begin <= $imax ) && ( $types_to_go[$i_begin] eq 'b' ) ) { $i_begin++; } } #------------------------------------------------- # END of main loop to set continuation breakpoints #------------------------------------------------- #----------------------------------------------------------- # ?/: rule 4 -- if we broke at a ':', then break at # corresponding '?' 
unless this is a chain of ?: expressions #----------------------------------------------------------- if (@i_colon_breaks) { my $is_chain = ( $colons_in_order && @i_colon_breaks > 1 ); if ( !$is_chain ) { $self->do_colon_breaks( \@i_colon_breaks, \@i_first, \@i_last ); } } return ( \@i_first, \@i_last, $rbond_strength_to_go ); } ## end sub break_long_lines # small bond strength numbers to help break ties use constant TINY_BIAS => 0.0001; use constant MAX_BIAS => 0.001; sub break_lines_inner_loop { #----------------------------------------------------------------- # Find the best next breakpoint in index range ($i_begin .. $imax) # which, if possible, does not exceed the maximum line length. #----------------------------------------------------------------- my ( $self, # $i_begin, $i_last_break, $imax, $last_break_strength, $line_count, $rbond_strength_to_go, $saw_good_break, ) = @_; # Given: # $i_begin = first index of range # $i_last_break = index of previous break # $imax = last index of range # $last_break_strength = bond strength of last break # $line_count = number of output lines so far # $rbond_strength_to_go = ref to array of bond strengths # $saw_good_break = true if old line had a good breakpoint # Returns: # $i_lowest = index of best breakpoint # $lowest_strength = 'bond strength' at best breakpoint # $leading_alignment_type = special token type after break # $Msg = string of debug info my $Msg = EMPTY_STRING; my $strength = NO_BREAK; my $i_test = $i_begin - 1; my $i_lowest = -1; my $starting_sum = $summed_lengths_to_go[$i_begin]; my $lowest_strength = NO_BREAK; my $leading_alignment_type = EMPTY_STRING; my $leading_spaces = leading_spaces_to_go($i_begin); my $maximum_line_length = $maximum_line_length_at_level[ $levels_to_go[$i_begin] ]; DEBUG_BREAK_LINES && do { $Msg .= "updating leading spaces to be $leading_spaces at i=$i_begin\n"; }; # Do not separate an isolated bare word from an opening paren. # Alternate Fix #2 for issue b1299. 
This waits as long as possible # to make the decision. if ( $types_to_go[$i_begin] eq 'i' && substr( $tokens_to_go[$i_begin], 0, 1 ) =~ /\w/ ) { my $i_next_nonblank = $inext_to_go[$i_begin]; if ( $tokens_to_go[$i_next_nonblank] eq '(' ) { $rbond_strength_to_go->[$i_begin] = NO_BREAK; } } # Avoid a break which would strand a single punctuation # token. For example, we do not want to strand a leading # '.' which is followed by a long quoted string. # But note that we do want to do this with -extrude (l=1) # so please test any changes to this code on -extrude. if ( ( $i_begin < $imax ) && ( $tokens_to_go[$i_begin] eq $types_to_go[$i_begin] ) && !$forced_breakpoint_to_go[$i_begin] && !( # Allow break after a closing eval brace. This is an # approximate way to simulate a forced breakpoint made in # Section B below. No differences have been found, but if # necessary the full logic of Section B could be used here # (see c165). $tokens_to_go[$i_begin] eq '}' && $block_type_to_go[$i_begin] && $block_type_to_go[$i_begin] eq 'eval' ) && ( ( $leading_spaces + $summed_lengths_to_go[ $i_begin + 1 ] - $starting_sum ) < $maximum_line_length ) ) { $i_test = min( $imax, $inext_to_go[$i_begin] ) - 1; DEBUG_BREAK_LINES && do { $Msg .= " :skip ahead at i=$i_test"; }; } #------------------------------------------------------- # Begin INNER_LOOP over the indexes in the _to_go arrays #------------------------------------------------------- while ( ++$i_test <= $imax ) { my $type = $types_to_go[$i_test]; my $token = $tokens_to_go[$i_test]; my $i_next_nonblank = $inext_to_go[$i_test]; my $next_nonblank_type = $types_to_go[$i_next_nonblank]; my $next_nonblank_token = $tokens_to_go[$i_next_nonblank]; my $next_nonblank_block_type = $block_type_to_go[$i_next_nonblank]; #--------------------------------------------------------------- # Section A: Get token-token strength and handle any adjustments #--------------------------------------------------------------- # adjustments to the previous bond 
strength may have been made, and # we must keep the bond strength of a token and its following blank # the same; my $last_strength = $strength; $strength = $rbond_strength_to_go->[$i_test]; if ( $type eq 'b' ) { $strength = $last_strength } # reduce strength a bit to break ties at an old comma breakpoint ... if ( $old_breakpoint_to_go[$i_test] # Patch: limited to just commas to avoid blinking states && $type eq ',' # which is a 'good' breakpoint, meaning ... # we don't want to break before it && !$want_break_before{$type} # and either we want to break before the next token # or the next token is not short (i.e. not a '*', '/' etc.) && $i_next_nonblank <= $imax && ( $want_break_before{$next_nonblank_type} || $token_lengths_to_go[$i_next_nonblank] > 2 || $next_nonblank_type eq ',' || $is_opening_type{$next_nonblank_type} ) ) { $strength -= TINY_BIAS; DEBUG_BREAK_LINES && do { $Msg .= " :-bias at i=$i_test" }; } # otherwise increase strength a bit if this token would be at the # maximum line length. This is necessary to avoid blinking # in the above example when the -iob flag is added. else { my $len = $leading_spaces + $summed_lengths_to_go[ $i_test + 1 ] - $starting_sum; if ( $len >= $maximum_line_length ) { $strength += TINY_BIAS; DEBUG_BREAK_LINES && do { $Msg .= " :+bias at i=$i_test" }; } } #------------------------------------- # Section B: Handle forced breakpoints #------------------------------------- my $must_break; # Force an immediate break at certain operators # with lower level than the start of the line, # unless we've already seen a better break. # # Note on an issue with a preceding '?' : # There may be a break at a previous ? if the line is long. Because # of this we do not want to force a break if there is a previous ? on # this line. For now the best way to do this is to not break if we # have seen a lower strength point, which is probably a ?. # # Example of unwanted breaks we are avoiding at a '.' following a ? 
# from pod2html using perltidy -gnu: # ) # ? "\n<A NAME=\"" # . $value # . "\">\n$text</A>\n" # : "\n$type$pod2.html\#" . $value . "\">$text<\/A>\n"; if ( ( $strength <= $lowest_strength ) && ( $nesting_depth_to_go[$i_begin] > $nesting_depth_to_go[$i_next_nonblank] ) && ( $next_nonblank_type =~ /^(\.|\&\&|\|\|)$/ || ( $next_nonblank_type eq 'k' ## /^(and|or)$/ # note: includes 'xor' now && $is_and_or{$next_nonblank_token} ) ) ) { $self->set_forced_breakpoint($i_next_nonblank); DEBUG_BREAK_LINES && do { $Msg .= " :Forced break at i=$i_next_nonblank" }; } if ( # Try to put a break where requested by break_lists $forced_breakpoint_to_go[$i_test] # break between ) { in a continued line so that the '{' can # be outdented # See similar logic in break_lists which catches instances # where a line is just something like ') {'. We have to # be careful because the corresponding block keyword might # not be on the first line, such as 'for' here: # # eval { # for ("a") { # for $x ( 1, 2 ) { local $_ = "b"; s/(.*)/+$1/ } # } # }; # || ( $line_count && ( $token eq ')' ) && ( $next_nonblank_type eq '{' ) && ($next_nonblank_block_type) && ( $next_nonblank_block_type ne $tokens_to_go[$i_begin] ) # RT #104427: Dont break before opening sub brace because # sub block breaks handled at higher level, unless # it looks like the preceding list is long and broken && !( ( $next_nonblank_block_type =~ /$SUB_PATTERN/ || $next_nonblank_block_type =~ /$ASUB_PATTERN/ ) && ( $nesting_depth_to_go[$i_begin] == $nesting_depth_to_go[$i_next_nonblank] ) ) && !$rOpts_opening_brace_always_on_right ) # There is an implied forced break at a terminal opening brace || ( ( $type eq '{' ) && ( $i_test == $imax ) ) ) { # Forced breakpoints must sometimes be overridden, for example # because of a side comment causing a NO_BREAK. It is easier # to catch this here than when they are set. 
if ( $strength < NO_BREAK - 1 ) { $strength = $lowest_strength - TINY_BIAS; $must_break = 1; DEBUG_BREAK_LINES && do { $Msg .= " :set must_break at i=$i_next_nonblank" }; } } # quit if a break here would put a good terminal token on # the next line and we already have a possible break if ( ( $next_nonblank_type eq ';' || $next_nonblank_type eq ',' ) && !$must_break && ( ( $leading_spaces + $summed_lengths_to_go[ $i_next_nonblank + 1 ] - $starting_sum ) > $maximum_line_length ) ) { if ( $i_lowest >= 0 ) { DEBUG_BREAK_LINES && do { $Msg .= " :quit at good terminal='$next_nonblank_type'"; }; last; } } #------------------------------------------------------------ # Section C: Look for the lowest bond strength between tokens #------------------------------------------------------------ if ( ( $strength <= $lowest_strength ) && ( $strength < NO_BREAK ) ) { # break at previous best break if it would have produced # a leading alignment of certain common tokens, and it # is different from the latest candidate break if ($leading_alignment_type) { DEBUG_BREAK_LINES && do { $Msg .= " :last at leading_alignment='$leading_alignment_type'"; }; last; } # Force at least one breakpoint if old code had good # break It is only called if a breakpoint is required or # desired. This will probably need some adjustments # over time. A goal is to try to be sure that, if a new # side comment is introduced into formatted text, then # the same breakpoints will occur. 
scbreak.t if ( $i_test == $imax # we are at the end && !$forced_breakpoint_count && $saw_good_break # old line had good break && $type =~ /^[#;\{]$/ # and this line ends in # ';' or side comment && $i_last_break < 0 # and we haven't made a break && $i_lowest >= 0 # and we saw a possible break && $i_lowest < $imax - 1 # (but not just before this ;) && $strength - $lowest_strength < 0.5 * WEAK # and it's good ) { DEBUG_BREAK_LINES && do { $Msg .= " :last at good old break\n"; }; last; } # Do not skip past an important break point in a short final # segment. For example, without this check we would miss the # break at the final / in the following code: # # $depth_stop = # ( $tau * $mass_pellet * $q_0 * # ( 1. - exp( -$t_stop / $tau ) ) - # 4. * $pi * $factor * $k_ice * # ( $t_melt - $t_ice ) * # $r_pellet * # $t_stop ) / # ( $rho_ice * $Qs * $pi * $r_pellet**2 ); # if ( $line_count > 2 && $i_lowest >= 0 # and we saw a possible break && $i_lowest < $i_test && $i_test > $imax - 2 && $nesting_depth_to_go[$i_begin] > $nesting_depth_to_go[$i_lowest] && $lowest_strength < $last_break_strength - .5 * WEAK ) { # Make this break for math operators for now my $ir = $inext_to_go[$i_lowest]; my $il = iprev_to_go($ir); if ( $types_to_go[$il] =~ /^[\/\*\+\-\%]$/ || $types_to_go[$ir] =~ /^[\/\*\+\-\%]$/ ) { DEBUG_BREAK_LINES && do { $Msg .= " :last-noskip_short"; }; last; } } # Update the minimum bond strength location $lowest_strength = $strength; $i_lowest = $i_test; if ($must_break) { DEBUG_BREAK_LINES && do { $Msg .= " :last-must_break"; }; last; } # set flags to remember if a break here will produce a # leading alignment of certain common tokens if ( $line_count > 0 && $i_test < $imax && ( $lowest_strength - $last_break_strength <= MAX_BIAS ) ) { my $i_last_end = iprev_to_go($i_begin); my $tok_beg = $tokens_to_go[$i_begin]; my $type_beg = $types_to_go[$i_begin]; if ( # check for leading alignment of certain tokens ( $tok_beg eq $next_nonblank_token && 
$is_chain_operator{$tok_beg} && ( $type_beg eq 'k' || $type_beg eq $tok_beg ) && $nesting_depth_to_go[$i_begin] >= $nesting_depth_to_go[$i_next_nonblank] ) || ( $tokens_to_go[$i_last_end] eq $token && $is_chain_operator{$token} && ( $type eq 'k' || $type eq $token ) && $nesting_depth_to_go[$i_last_end] >= $nesting_depth_to_go[$i_test] ) ) { $leading_alignment_type = $next_nonblank_type; } } } #----------------------------------------------------------- # Section D: See if the maximum line length will be exceeded #----------------------------------------------------------- # Quit if there are no more tokens to test last if ( $i_test >= $imax ); # Keep going if we have not reached the limit my $excess = $leading_spaces + $summed_lengths_to_go[ $i_test + 2 ] - $starting_sum - $maximum_line_length; if ( $excess < 0 ) { next; } elsif ( $excess == 0 ) { # To prevent blinkers we will avoid leaving a token exactly at # the line length limit unless it is the last token or one of # several "good" types. # # The following code was a blinker with -pbp before this # modification: # $last_nonblank_token eq '(' # && $is_indirect_object_taker{ $paren_type # [$paren_depth] } # The issue causing the problem is that if the # term [$paren_depth] gets broken across a line then # the whitespace routine doesn't see both opening and closing # brackets and will format like '[ $paren_depth ]'. This # leads to an oscillation in length depending if we break # before the closing bracket or not. if ( $i_test + 1 < $imax && $next_nonblank_type ne ',' && !$is_closing_type{$next_nonblank_type} ) { # too long DEBUG_BREAK_LINES && do { $Msg .= " :too_long"; } } else { next; } } else { # too long } # a break here makes the line too long ... DEBUG_BREAK_LINES && do { my $ltok = $token; my $rtok = $next_nonblank_token ? 
$next_nonblank_token : EMPTY_STRING; my $i_testp2 = $i_test + 2; if ( $i_testp2 > $max_index_to_go + 1 ) { $i_testp2 = $max_index_to_go + 1; } if ( length($ltok) > 6 ) { $ltok = substr( $ltok, 0, 8 ) } if ( length($rtok) > 6 ) { $rtok = substr( $rtok, 0, 8 ) } print STDOUT "BREAK: i=$i_test imax=$imax $types_to_go[$i_test] $next_nonblank_type sp=($leading_spaces) lnext= $summed_lengths_to_go[$i_testp2] str=$strength $ltok $rtok\n"; }; # Exception: allow one extra terminal token after exceeding line length # if it would strand this token. if ( $i_lowest == $i_test && $token_lengths_to_go[$i_test] > 1 && ( $next_nonblank_type eq ';' || $next_nonblank_type eq ',' ) && $rOpts_fuzzy_line_length ) { DEBUG_BREAK_LINES && do { $Msg .= " :do_not_strand next='$next_nonblank_type'"; }; next; } # Stop here if we have a solution and the line will be too long if ( $i_lowest >= 0 ) { DEBUG_BREAK_LINES && do { $Msg .= " :Done-too_long && i_lowest=$i_lowest at itest=$i_test, imax=$imax"; }; last; } } #----------------------------------------------------- # End INNER_LOOP over the indexes in the _to_go arrays #----------------------------------------------------- # Be sure we return an index in the range ($i_begin .. $imax). # We will break at imax if no other break was found. if ( $i_lowest < 0 ) { $i_lowest = $imax } return ( $i_lowest, $lowest_strength, $leading_alignment_type, $Msg ); } ## end sub break_lines_inner_loop sub do_colon_breaks { my ( $self, $ri_colon_breaks, $ri_first, $ri_last ) = @_; # using a simple method for deciding if we are in a ?/: chain -- # this is a chain if it has multiple ?/: pairs all in order; # otherwise not.
# Note that if line starts in a ':' we count that above as a break my @insert_list = (); foreach ( @{$ri_colon_breaks} ) { my $i_question = $mate_index_to_go[$_]; if ( defined($i_question) ) { if ( $want_break_before{'?'} ) { $i_question = iprev_to_go($i_question); } if ( $i_question >= 0 ) { push @insert_list, $i_question; } } $self->insert_additional_breaks( \@insert_list, $ri_first, $ri_last ); } return; } ## end sub do_colon_breaks ########################################### # CODE SECTION 11: Code to break long lists ########################################### { ## begin closure break_lists # These routines and variables are involved in finding good # places to break long lists. use constant DEBUG_BREAK_LISTS => 0; my ( $block_type, $current_depth, $depth, $i, $i_last_colon, $i_line_end, $i_line_start, $i_last_nonblank_token, $last_nonblank_block_type, $last_nonblank_token, $last_nonblank_type, $last_old_breakpoint_count, $minimum_depth, $next_nonblank_block_type, $next_nonblank_token, $next_nonblank_type, $old_breakpoint_count, $starting_breakpoint_count, $starting_depth, $token, $type, $type_sequence, ); my ( @breakpoint_stack, @breakpoint_undo_stack, @comma_index, @container_type, @identifier_count_stack, @index_before_arrow, @interrupted_list, @item_count_stack, @last_comma_index, @last_dot_index, @last_nonblank_type, @old_breakpoint_count_stack, @opening_structure_index_stack, @rfor_semicolon_list, @has_old_logical_breakpoints, @rand_or_list, @i_equals, @override_cab3, @type_sequence_stack, ); # these arrays must retain values between calls my ( @has_broken_sublist, @dont_align, @want_comma_break ); my $length_tol; my $lp_tol_boost; sub initialize_break_lists { @dont_align = (); @has_broken_sublist = (); @want_comma_break = (); #--------------------------------------------------- # Set tolerances to prevent formatting instabilities #--------------------------------------------------- # Define tolerances to use when checking if closed # containers will fit 
on one line. This is necessary to avoid # formatting instability. The basic tolerance is based on the # following: # - Always allow for at least one extra space after a closing token so # that we do not strand a comma or semicolon. (oneline.t). # - Use an increased line length tolerance when -ci > -i to avoid # blinking states (case b923 and others). $length_tol = 1 + max( 0, $rOpts_continuation_indentation - $rOpts_indent_columns ); # In addition, it may be necessary to use a few extra tolerance spaces # when -lp is used and/or when -xci is used. The history of this # so far is as follows: # FIX1: At least 3 characters were found to be required for -lp # to fix cases b1059 b1063 b1117. # FIX2: Further testing showed that we need a total of 3 extra spaces # when -lp is set for non-lists, and at least 2 spaces when -lp and # -xci are set. # Fixes cases b1063 b1103 b1134 b1135 b1136 b1138 b1140 b1143 b1144 # b1145 b1146 b1147 b1148 b1151 b1152 b1153 b1154 b1156 b1157 b1164 # b1165 # FIX3: To fix cases b1169 b1170 b1171, an update was made in sub # 'find_token_starting_list' to go back before an initial blank space. # This fixed these three cases, and allowed the tolerances to be # reduced to continue to fix all other known cases of instability. # This gives the current tolerance formulation. $lp_tol_boost = 0; if ($rOpts_line_up_parentheses) { # boost tol for combination -lp -xci if ($rOpts_extended_continuation_indentation) { $lp_tol_boost = 2; } # boost tol for combination -lp and any -vtc > 0, but only for # non-list containers else { foreach ( keys %closing_vertical_tightness ) { next unless ( $closing_vertical_tightness{$_} ); $lp_tol_boost = 1; # Fixes B1193; last; } } } # Define a level where list formatting becomes highly stressed and # needs to be simplified. Introduced for case b1262. # $list_stress_level = min($stress_level_alpha, $stress_level_beta + 2); # This is now '$high_stress_level'.
return; } ## end sub initialize_break_lists # routine to define essential variables when we go 'up' to # a new depth sub check_for_new_minimum_depth { my ( $self, $depth_t, $seqno ) = @_; if ( $depth_t < $minimum_depth ) { $minimum_depth = $depth_t; # these arrays need not retain values between calls $type_sequence_stack[$depth_t] = $seqno; $override_cab3[$depth_t] = undef; if ( $rOpts_comma_arrow_breakpoints == 3 && $seqno ) { $override_cab3[$depth_t] = $self->[_roverride_cab3_]->{$seqno}; } $breakpoint_stack[$depth_t] = $starting_breakpoint_count; $container_type[$depth_t] = EMPTY_STRING; $identifier_count_stack[$depth_t] = 0; $index_before_arrow[$depth_t] = -1; $interrupted_list[$depth_t] = 1; $item_count_stack[$depth_t] = 0; $last_nonblank_type[$depth_t] = EMPTY_STRING; $opening_structure_index_stack[$depth_t] = -1; $breakpoint_undo_stack[$depth_t] = undef; $comma_index[$depth_t] = undef; $last_comma_index[$depth_t] = undef; $last_dot_index[$depth_t] = undef; $old_breakpoint_count_stack[$depth_t] = undef; $has_old_logical_breakpoints[$depth_t] = 0; $rand_or_list[$depth_t] = []; $rfor_semicolon_list[$depth_t] = []; $i_equals[$depth_t] = -1; # these arrays must retain values between calls if ( !defined( $has_broken_sublist[$depth_t] ) ) { $dont_align[$depth_t] = 0; $has_broken_sublist[$depth_t] = 0; $want_comma_break[$depth_t] = 0; } } return; } ## end sub check_for_new_minimum_depth # routine to decide which commas to break at within a container; # returns: # $bp_count = number of comma breakpoints set # $do_not_break_apart = a flag indicating if container need not # be broken open sub set_comma_breakpoints { my ( $self, $dd, $rbond_strength_bias ) = @_; my $bp_count = 0; my $do_not_break_apart = 0; # anything to do? if ( $item_count_stack[$dd] ) { # Do not break a list unless there are some non-line-ending commas. # This avoids getting different results with only non-essential # commas, and fixes b1192. 
my $seqno = $type_sequence_stack[$dd]; my $real_comma_count = $seqno ? $self->[_rtype_count_by_seqno_]->{$seqno}->{','} : 1; # handle commas not in containers... if ( $dont_align[$dd] ) { $self->do_uncontained_comma_breaks( $dd, $rbond_strength_bias ); } # handle commas within containers... elsif ($real_comma_count) { my $fbc = $forced_breakpoint_count; # always open comma lists not preceded by keywords, # barewords, identifiers (that is, anything that doesn't # look like a function call) my $must_break_open = $last_nonblank_type[$dd] !~ /^[kwiU]$/; $self->table_maker( { depth => $dd, i_opening_paren => $opening_structure_index_stack[$dd], i_closing_paren => $i, item_count => $item_count_stack[$dd], identifier_count => $identifier_count_stack[$dd], rcomma_index => $comma_index[$dd], next_nonblank_type => $next_nonblank_type, list_type => $container_type[$dd], interrupted => $interrupted_list[$dd], rdo_not_break_apart => \$do_not_break_apart, must_break_open => $must_break_open, has_broken_sublist => $has_broken_sublist[$dd], } ); $bp_count = $forced_breakpoint_count - $fbc; $do_not_break_apart = 0 if $must_break_open; } } return ( $bp_count, $do_not_break_apart ); } ## end sub set_comma_breakpoints # These types are excluded at breakpoints to prevent blinking # Switched from excluded to included as part of fix for b1214 my %is_uncontained_comma_break_included_type; BEGIN { my @q = qw< k R } ) ] Y Z U w i q Q . = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=>; @is_uncontained_comma_break_included_type{@q} = (1) x scalar(@q); } ## end BEGIN sub do_uncontained_comma_breaks { # Handle commas not in containers... # This is a catch-all routine for commas that we # don't know what to do with because the don't fall # within containers. We will bias the bond strength # to break at commas which ended lines in the input # file. This usually works better than just trying # to put as many items on a line as possible. 
A # downside is that if the input file is garbage it # won't work very well. However, the user can always # prevent following the old breakpoints with the # -iob flag. my ( $self, $dd, $rbond_strength_bias ) = @_; # Check added for issue c131; an error here would be due to an # error initializing @comma_index when entering depth $dd. if (DEVEL_MODE) { foreach my $ii ( @{ $comma_index[$dd] } ) { if ( $ii < 0 || $ii > $max_index_to_go ) { my $KK = $K_to_go[0]; my $lno = $self->[_rLL_]->[$KK]->[_LINE_INDEX_]; Fault(<0 [ignore a ci change by -xci] # ... fixes b1220. If ci>0 we are in the middle of a snippet, # maybe because -boc has been forcing out previous lines. # For example, we will follow the user and break after # 'print' in this snippet: # print # "conformability (Not the same dimension)\n", # "\t", $have, " is ", text_unit($hu), "\n", # "\t", $want, " is ", text_unit($wu), "\n", # ; # # Another example, just one comma, where we will break after # the return: # return # $x * cos($a) - $y * sin($a), # $x * sin($a) + $y * cos($a); # Breaking a print statement: # print SAVEOUT # ( $? & 127 ) ? " (SIG#" . ( $? & 127 ) . ")" : "", # ( $? & 128 ) ? " -- core dumped" : "", "\n"; # # But we will not force a break after the opening paren here # (causes a blinker): # $heap->{stream}->set_output_filter( # poe::filter::reference->new('myotherfreezer') ), # ; # my $i_first_comma = $comma_index[$dd]->[0]; my $level_comma = $levels_to_go[$i_first_comma]; my $ci_start = $ci_levels_to_go[0]; # Here we want to use the value of ci before any -xci adjustment if ( $ci_start && $rOpts_extended_continuation_indentation ) { my $K0 = $K_to_go[0]; if ( $self->[_rseqno_controlling_my_ci_]->{$K0} ) { $ci_start = 0 } } if ( !$ci_start && $old_breakpoint_to_go[$i_first_comma] && $level_comma == $levels_to_go[0] ) { my $ibreak = -1; my $obp_count = 0; foreach my $ii ( reverse( 0 .. 
$i_first_comma - 1 ) ) { if ( $old_breakpoint_to_go[$ii] ) { $obp_count++; last if ( $obp_count > 1 ); $ibreak = $ii if ( $levels_to_go[$ii] == $level_comma ); } } # Changed rule from multiple old commas to just one here: if ( $ibreak >= 0 && $obp_count == 1 && $old_comma_break_count > 0 ) { my $ibreak_m = $ibreak; $ibreak_m-- if ( $types_to_go[$ibreak_m] eq 'b' ); if ( $ibreak_m >= 0 ) { # In order to avoid blinkers we have to be fairly # restrictive: # OLD Rules: # Rule 1: Do not to break before an opening token # Rule 2: avoid breaking at ternary operators # (see b931, which is similar to the above print example) # Rule 3: Do not break at chain operators to fix case b1119 # - The previous test was '$typem !~ /^[\(\{\[L\?\:]$/' # NEW Rule, replaced above rules after case b1214: # only break at one of the included types # Be sure to test any changes to these rules against runs # with -l=0 such as the 'bbvt' test (perltidyrc_colin) # series. my $type_m = $types_to_go[$ibreak_m]; # Switched from excluded to included for b1214. If necessary # the token could also be checked if type_m eq 'k' if ( $is_uncontained_comma_break_included_type{$type_m} ) { # Rule added to fix b1449: # Do not break before a '?' if -nbot is set # Otherwise, we may alternately arrive here and # set the break, or not, depending on the input. my $no_break; my $ibreak_p = $inext_to_go[$ibreak_m]; if ( !$rOpts_break_at_old_ternary_breakpoints && $ibreak_p <= $max_index_to_go ) { my $type_p = $types_to_go[$ibreak_p]; $no_break = $type_p eq '?'; } $self->set_forced_breakpoint($ibreak) if ( !$no_break ); } } } } return; } ## end sub do_uncontained_comma_breaks my %is_logical_container; my %quick_filter; BEGIN { my @q = qw# if elsif unless while and or err not && | || ? : ! #; @is_logical_container{@q} = (1) x scalar(@q); # This filter will allow most tokens to skip past a section of code %quick_filter = %is_assignment; @q = qw# => . 
; < > ~ #; push @q, ','; push @q, 'f'; # added for ';' for issue c154 @quick_filter{@q} = (1) x scalar(@q); } ## end BEGIN sub set_for_semicolon_breakpoints { my ( $self, $dd ) = @_; foreach ( @{ $rfor_semicolon_list[$dd] } ) { $self->set_forced_breakpoint($_); } return; } ## end sub set_for_semicolon_breakpoints sub set_logical_breakpoints { my ( $self, $dd ) = @_; if ( $item_count_stack[$dd] == 0 && $is_logical_container{ $container_type[$dd] } || $has_old_logical_breakpoints[$dd] ) { # Look for breaks in this order: # 0 1 2 3 # or and || && foreach my $i ( 0 .. 3 ) { if ( $rand_or_list[$dd][$i] ) { foreach ( @{ $rand_or_list[$dd][$i] } ) { $self->set_forced_breakpoint($_); } # break at any 'if' and 'unless' too foreach ( @{ $rand_or_list[$dd][4] } ) { $self->set_forced_breakpoint($_); } $rand_or_list[$dd] = []; last; } } } return; } ## end sub set_logical_breakpoints sub is_unbreakable_container { # never break a container of one of these types # because bad things can happen (map1.t) my $dd = shift; return $is_sort_map_grep{ $container_type[$dd] }; } ## end sub is_unbreakable_container sub break_lists { my ( $self, $is_long_line, $rbond_strength_bias ) = @_; #-------------------------------------------------------------------- # This routine is called once per batch, if the batch is a list, to # set line breaks so that hierarchical structure can be displayed and # so that list items can be vertically aligned. The output of this # routine is stored in the array @forced_breakpoint_to_go, which is # used by sub 'break_long_lines' to set final breakpoints. This is # probably the most complex routine in perltidy, so I have # broken it into pieces and over-commented it. 
#-------------------------------------------------------------------- $starting_depth = $nesting_depth_to_go[0]; $block_type = SPACE; $current_depth = $starting_depth; $i = -1; $i_last_colon = -1; $i_line_end = -1; $i_line_start = -1; $last_nonblank_token = ';'; $last_nonblank_type = ';'; $last_nonblank_block_type = SPACE; $last_old_breakpoint_count = 0; $minimum_depth = $current_depth + 1; # forces update in check below $old_breakpoint_count = 0; $starting_breakpoint_count = $forced_breakpoint_count; $token = ';'; $type = ';'; $type_sequence = EMPTY_STRING; my $total_depth_variation = 0; my $i_old_assignment_break; my $depth_last = $starting_depth; my $comma_follows_last_closing_token; $self->check_for_new_minimum_depth( $current_depth, $parent_seqno_to_go[0] ) if ( $current_depth < $minimum_depth ); my $i_want_previous_break = -1; my $saw_good_breakpoint; #---------------------------------------- # Main loop over all tokens in this batch #---------------------------------------- while ( ++$i <= $max_index_to_go ) { if ( $type ne 'b' ) { $i_last_nonblank_token = $i - 1; $last_nonblank_type = $type; $last_nonblank_token = $token; $last_nonblank_block_type = $block_type; } $type = $types_to_go[$i]; $block_type = $block_type_to_go[$i]; $token = $tokens_to_go[$i]; $type_sequence = $type_sequence_to_go[$i]; my $i_next_nonblank = $inext_to_go[$i]; $next_nonblank_type = $types_to_go[$i_next_nonblank]; $next_nonblank_token = $tokens_to_go[$i_next_nonblank]; $next_nonblank_block_type = $block_type_to_go[$i_next_nonblank]; #------------------------------------------- # Loop Section A: Look for special breakpoints... #------------------------------------------- # set break if flag was set if ( $i_want_previous_break >= 0 ) { $self->set_forced_breakpoint($i_want_previous_break); $i_want_previous_break = -1; } $last_old_breakpoint_count = $old_breakpoint_count; # Check for a good old breakpoint .. 
if ( $old_breakpoint_to_go[$i] ) { ( $i_want_previous_break, $i_old_assignment_break ) = $self->examine_old_breakpoint( $i_next_nonblank, $i_want_previous_break, $i_old_assignment_break ); } next if ( $type eq 'b' ); $depth = $nesting_depth_to_go[ $i + 1 ]; $total_depth_variation += abs( $depth - $depth_last ); $depth_last = $depth; # safety check - be sure we always break after a comment # Shouldn't happen .. an error here probably means that the # nobreak flag did not get turned off correctly during # formatting. if ( $type eq '#' ) { if ( $i != $max_index_to_go ) { if (DEVEL_MODE) { Fault(<set_forced_breakpoint($i); } ## end if ( $i != $max_index_to_go) } ## end if ( $type eq '#' ) # Force breakpoints at certain tokens in long lines. # Note that such breakpoints will be undone later if these tokens # are fully contained within parens on a line. if ( # break before a keyword within a line $type eq 'k' && $i > 0 # if one of these keywords: && $is_if_unless_while_until_for_foreach{$token} # but do not break at something like '1 while' && ( $last_nonblank_type ne 'n' || $i > 2 ) # and let keywords follow a closing 'do' brace && ( !$last_nonblank_block_type || $last_nonblank_block_type ne 'do' ) && ( $is_long_line # or container is broken (by side-comment, etc) || ( $next_nonblank_token eq '(' && ( !defined( $mate_index_to_go[$i_next_nonblank] ) || $mate_index_to_go[$i_next_nonblank] < $i ) ) ) ) { $self->set_forced_breakpoint( $i - 1 ); } # remember locations of '||' and '&&' for possible breaks if we # decide this is a long logical expression. 
if ( $type eq '||' ) { push @{ $rand_or_list[$depth][2] }, $i; ++$has_old_logical_breakpoints[$depth] if ( ( $i == $i_line_start || $i == $i_line_end ) && $rOpts_break_at_old_logical_breakpoints ); } elsif ( $type eq '&&' ) { push @{ $rand_or_list[$depth][3] }, $i; ++$has_old_logical_breakpoints[$depth] if ( ( $i == $i_line_start || $i == $i_line_end ) && $rOpts_break_at_old_logical_breakpoints ); } elsif ( $type eq 'f' ) { push @{ $rfor_semicolon_list[$depth] }, $i; } elsif ( $type eq 'k' ) { if ( $token eq 'and' ) { push @{ $rand_or_list[$depth][1] }, $i; ++$has_old_logical_breakpoints[$depth] if ( ( $i == $i_line_start || $i == $i_line_end ) && $rOpts_break_at_old_logical_breakpoints ); } # break immediately at 'or's which are probably not in a logical # block -- but we will break in logical breaks below so that # they do not add to the forced_breakpoint_count elsif ( $token eq 'or' ) { push @{ $rand_or_list[$depth][0] }, $i; ++$has_old_logical_breakpoints[$depth] if ( ( $i == $i_line_start || $i == $i_line_end ) && $rOpts_break_at_old_logical_breakpoints ); if ( $is_logical_container{ $container_type[$depth] } ) { } else { if ($is_long_line) { $self->set_forced_breakpoint($i) } elsif ( ( $i == $i_line_start || $i == $i_line_end ) && $rOpts_break_at_old_logical_breakpoints ) { $saw_good_breakpoint = 1; } } } elsif ( $token eq 'if' || $token eq 'unless' ) { push @{ $rand_or_list[$depth][4] }, $i; if ( ( $i == $i_line_start || $i == $i_line_end ) && $rOpts_break_at_old_logical_breakpoints ) { $self->set_forced_breakpoint($i); } } } elsif ( $is_assignment{$type} ) { $i_equals[$depth] = $i; } #----------------------------------------- # Loop Section B: Handle a sequenced token #----------------------------------------- if ($type_sequence) { $self->break_lists_type_sequence; } #------------------------------------------ # Loop Section C: Handle Increasing Depth.. 
#------------------------------------------ # hardened against bad input syntax: depth jump must be 1 and type # must be opening..fixes c102 if ( $depth == $current_depth + 1 && $is_opening_type{$type} ) { $self->break_lists_increasing_depth(); } #------------------------------------------ # Loop Section D: Handle Decreasing Depth.. #------------------------------------------ # hardened against bad input syntax: depth jump must be 1 and type # must be closing .. fixes c102 elsif ( $depth == $current_depth - 1 && $is_closing_type{$type} ) { $self->break_lists_decreasing_depth(); $comma_follows_last_closing_token = $next_nonblank_type eq ',' || $next_nonblank_type eq '=>'; } #---------------------------------- # Loop Section E: Handle this token #---------------------------------- $current_depth = $depth; # most token types can skip the rest of this loop next unless ( $quick_filter{$type} ); # handle comma-arrow if ( $type eq '=>' ) { next if ( $last_nonblank_type eq '=>' ); next if $rOpts_break_at_old_comma_breakpoints; next if ( $rOpts_comma_arrow_breakpoints == 3 && !defined( $override_cab3[$depth] ) ); $want_comma_break[$depth] = 1; $index_before_arrow[$depth] = $i_last_nonblank_token; next; } elsif ( $type eq '.' ) { $last_dot_index[$depth] = $i; } # Turn off comma alignment if we are sure that this is not a list # environment. To be safe, we will do this if we see certain # non-list tokens, such as ';', '=', and also the environment is # not a list. 
## $type =~ /^[\;\<\>\~f]$/ || $is_assignment{$type} elsif ( $is_non_list_type{$type} && !$self->is_in_list_by_i($i) ) { $dont_align[$depth] = 1; $want_comma_break[$depth] = 0; $index_before_arrow[$depth] = -1; # no special comma breaks in C-style 'for' terms (c154) if ( $type eq 'f' ) { $last_comma_index[$depth] = undef } } # now just handle any commas next if ( $type ne ',' ); $self->study_comma($comma_follows_last_closing_token); } ## end while ( ++$i <= $max_index_to_go) #------------------------------------------- # END of loop over all tokens in this batch # Now set breaks for any unfinished lists .. #------------------------------------------- foreach my $dd ( reverse( $minimum_depth .. $current_depth ) ) { $interrupted_list[$dd] = 1; $has_broken_sublist[$dd] = 1 if ( $dd < $current_depth ); $self->set_comma_breakpoints( $dd, $rbond_strength_bias ) if ( $item_count_stack[$dd] ); $self->set_logical_breakpoints($dd) if ( $has_old_logical_breakpoints[$dd] ); $self->set_for_semicolon_breakpoints($dd); # break open container... my $i_opening = $opening_structure_index_stack[$dd]; if ( defined($i_opening) && $i_opening >= 0 ) { $self->set_forced_breakpoint($i_opening) unless ( is_unbreakable_container($dd) # Avoid a break which would place an isolated ' or " # on a line || ( $type eq 'Q' && $i_opening >= $max_index_to_go - 2 && ( $token eq "'" || $token eq '"' ) ) ); } } ## end for ( my $dd = $current_depth...) #---------------------------------------- # Return the flag '$saw_good_breakpoint'. #---------------------------------------- # This indicates if the input file had some good breakpoints. This # flag will be used to force a break in a line shorter than the # allowed line length. if ( $has_old_logical_breakpoints[$current_depth] ) { $saw_good_breakpoint = 1; } # A complex line with one break at an = has a good breakpoint. 
# This is not complex ($total_depth_variation=0): # $res1 # = 10; # # This is complex ($total_depth_variation=6): # $res2 = # (is_boundp("a", 'self-insert') && is_boundp("b", 'self-insert')); # The check ($i_old_.. < $max_index_to_go) was added to fix b1333 elsif ($i_old_assignment_break && $total_depth_variation > 4 && $old_breakpoint_count == 1 && $i_old_assignment_break < $max_index_to_go ) { $saw_good_breakpoint = 1; } return $saw_good_breakpoint; } ## end sub break_lists sub study_comma { # study and store info for a list comma my ( $self, $comma_follows_last_closing_token ) = @_; $last_dot_index[$depth] = undef; $last_comma_index[$depth] = $i; # break here if this comma follows a '=>' # but not if there is a side comment after the comma if ( $want_comma_break[$depth] ) { if ( $next_nonblank_type =~ /^[\)\}\]R]$/ ) { if ($rOpts_comma_arrow_breakpoints) { $want_comma_break[$depth] = 0; return; } } $self->set_forced_breakpoint($i) unless ( $next_nonblank_type eq '#' ); # break before the previous token if it looks safe # Example of something that we will not try to break before: # DBI::SQL_SMALLINT() => $ado_consts->{adSmallInt}, # Also we don't want to break at a binary operator (like +): # $c->createOval( # $x + $R, $y + # $R => $x - $R, # $y - $R, -fill => 'black', # ); my $ibreak = $index_before_arrow[$depth] - 1; if ( $ibreak > 0 && $tokens_to_go[ $ibreak + 1 ] !~ /^[\)\}\]]$/ ) { if ( $tokens_to_go[$ibreak] eq '-' ) { $ibreak-- } if ( $types_to_go[$ibreak] eq 'b' ) { $ibreak-- } if ( $types_to_go[$ibreak] =~ /^[,wiZCUG\(\{\[]$/ ) { # don't break before a comma, as in the following: # ( LONGER_THAN,=> 1, # EIGHTY_CHARACTERS,=> 2, # CAUSES_FORMATTING,=> 3, # LIKE_THIS,=> 4, # ); # This example is for -tso but should be general rule if ( $tokens_to_go[ $ibreak + 1 ] ne '->' && $tokens_to_go[ $ibreak + 1 ] ne ',' ) { $self->set_forced_breakpoint($ibreak); } } } $want_comma_break[$depth] = 0; $index_before_arrow[$depth] = -1; # handle list which mixes '=>'s and 
','s: # treat any list items so far as an interrupted list $interrupted_list[$depth] = 1; return; } # Break after all commas above starting depth... # But only if the last closing token was followed by a comma, # to avoid breaking a list operator (issue c119) if ( $depth < $starting_depth && $comma_follows_last_closing_token && !$dont_align[$depth] ) { $self->set_forced_breakpoint($i) unless ( $next_nonblank_type eq '#' ); return; } # add this comma to the list.. my $item_count = $item_count_stack[$depth]; if ( $item_count == 0 ) { # but do not form a list with no opening structure # for example: # open INFILE_COPY, ">$input_file_copy" # or die ("very long message"); if ( ( $opening_structure_index_stack[$depth] < 0 ) && $self->is_in_block_by_i($i) ) { $dont_align[$depth] = 1; } } $comma_index[$depth][$item_count] = $i; ++$item_count_stack[$depth]; if ( $last_nonblank_type =~ /^[iR\]]$/ ) { $identifier_count_stack[$depth]++; } return; } ## end sub study_comma my %poor_types; my %poor_keywords; my %poor_next_types; my %poor_next_keywords; BEGIN { # Setup filters for detecting very poor breaks to ignore. 
# b1097: old breaks after type 'L' and before 'R' are poor # b1450: old breaks at 'eq' and related operators are poor my @q = qw(== <= >= !=); @{poor_types}{@q} = (1) x scalar(@q); @{poor_next_types}{@q} = (1) x scalar(@q); $poor_types{'L'} = 1; $poor_next_types{'R'} = 1; @q = qw(eq ne le ge lt gt); @{poor_keywords}{@q} = (1) x scalar(@q); @{poor_next_keywords}{@q} = (1) x scalar(@q); } ## end BEGIN sub examine_old_breakpoint { my ( $self, $i_next_nonblank, $i_want_previous_break, $i_old_assignment_break ) = @_; # Look at an old breakpoint and set/update certain flags: # Given indexes of three tokens in this batch: # $i_next_nonblank - index of the next nonblank token # $i_want_previous_break - we want a break before this index # $i_old_assignment_break - the index of an '=' or equivalent # Update: # $old_breakpoint_count - a counter to increment unless poor break # Update and return: # $i_want_previous_break # $i_old_assignment_break #----------------------- # Filter out poor breaks #----------------------- # Just return if this is a poor break and pretend it does not exist. # Otherwise, poor breaks made under stress can cause instability. my $poor_break; if ( $type eq 'k' ) { $poor_break ||= $poor_keywords{$token} } else { $poor_break ||= $poor_types{$type} } if ( $next_nonblank_type eq 'k' ) { $poor_break ||= $poor_next_keywords{$next_nonblank_token}; } else { $poor_break ||= $poor_next_types{$next_nonblank_type} } # Also ignore any high stress level breaks; fixes b1395 $poor_break ||= $levels_to_go[$i] >= $high_stress_level; if ($poor_break) { goto RETURN } #-------------------------------------------- # Not a poor break, so continue to examine it #-------------------------------------------- $old_breakpoint_count++; $i_line_end = $i; $i_line_start = $i_next_nonblank; #--------------------------------------- # Do we want to break before this token? 
#--------------------------------------- # Break before certain keywords if user broke there and # this is a 'safe' break point. The idea is to retain # any preferred breaks for sequential list operations, # like a schwartzian transform. if ($rOpts_break_at_old_keyword_breakpoints) { if ( $next_nonblank_type eq 'k' && $is_keyword_returning_list{$next_nonblank_token} && ( $type =~ /^[=\)\]\}Riw]$/ || $type eq 'k' && $is_keyword_returning_list{$token} ) ) { # we actually have to set this break next time through # the loop because if we are at a closing token (such # as '}') which forms a one-line block, this break might # get undone. # But do not do this at an '=' if: # - the user wants breaks before an equals (b434 b903) # - or -naws is set (can be unstable, see b1354) my $skip = $type eq '=' && ( $want_break_before{$type} || !$rOpts_add_whitespace ); $i_want_previous_break = $i unless ($skip); } } # Break before attributes if user broke there if ($rOpts_break_at_old_attribute_breakpoints) { if ( $next_nonblank_type eq 'A' ) { $i_want_previous_break = $i; } } #--------------------------------- # Is this an old assignment break? #--------------------------------- if ( $is_assignment{$type} ) { $i_old_assignment_break = $i; } elsif ( $is_assignment{$next_nonblank_type} ) { $i_old_assignment_break = $i_next_nonblank; } RETURN: return ( $i_want_previous_break, $i_old_assignment_break ); } ## end sub examine_old_breakpoint sub break_lists_type_sequence { my ($self) = @_; # We have encountered a sequenced token while setting list breakpoints # if closing type, one of } ) ] : if ( $is_closing_sequence_token{$token} ) { if ( $type eq ':' ) { $i_last_colon = $i; # retain break at a ':' line break if ( ( $i == $i_line_start || $i == $i_line_end ) && $rOpts_break_at_old_ternary_breakpoints && $levels_to_go[$i] < $high_stress_level ) { $self->set_forced_breakpoint($i); # Break at a previous '=', but only if it is before # the mating '?'. Mate_index test fixes b1287. 
my $ieq = $i_equals[$depth]; my $mix = $mate_index_to_go[$i]; if ( !defined($mix) ) { $mix = -1 } if ( $ieq > 0 && $ieq < $mix ) { $self->set_forced_breakpoint( $i_equals[$depth] ); $i_equals[$depth] = -1; } } } # handle any postponed closing breakpoints if ( has_postponed_breakpoint($type_sequence) ) { my $inc = ( $type eq ':' ) ? 0 : 1; if ( $i >= $inc ) { $self->set_forced_breakpoint( $i - $inc ); } } } # must be opening token, one of { ( [ ? else { # set breaks at ?/: if they will get separated (and are # not a ?/: chain), or if the '?' is at the end of the # line if ( $token eq '?' ) { my $i_colon = $mate_index_to_go[$i]; if ( !defined($i_colon) # the ':' is not in this batch || $i == 0 # this '?' is the first token of the line || $i == $max_index_to_go # or this '?' is the last token ) { # don't break if # this has a side comment, and # don't break at a '?' if preceded by ':' on # this line of previous ?/: pair on this line. # This is an attempt to preserve a chain of ?/: # expressions (elsif2.t). if ( ( $i_last_colon < 0 || $parent_seqno_to_go[$i_last_colon] != $parent_seqno_to_go[$i] ) && $tokens_to_go[$max_index_to_go] ne '#' ) { $self->set_forced_breakpoint($i); } $self->set_closing_breakpoint($i); } } # must be one of { ( [ else { # do requested -lp breaks at the OPENING token for BROKEN # blocks. NOTE: this can be done for both -lp and -xlp, # but only -xlp can really take advantage of this. So this # is currently restricted to -xlp to avoid excess changes to # existing -lp formatting. 
if ( $rOpts_extended_line_up_parentheses && !defined( $mate_index_to_go[$i] ) ) { my $lp_object = $self->[_rlp_object_by_seqno_]->{$type_sequence}; if ($lp_object) { my $K_begin_line = $lp_object->get_K_begin_line(); my $i_begin_line = $K_begin_line - $K_to_go[0]; $self->set_forced_lp_break( $i_begin_line, $i ); } } } } return; } ## end sub break_lists_type_sequence sub break_lists_increasing_depth { my ($self) = @_; #-------------------------------------------- # prepare for a new list when depth increases # token $i is a '(','{', or '[' #-------------------------------------------- #---------------------------------------------------------- # BEGIN initialize depth arrays # ... use the same order as sub check_for_new_minimum_depth #---------------------------------------------------------- $type_sequence_stack[$depth] = $type_sequence; $override_cab3[$depth] = undef; if ( $rOpts_comma_arrow_breakpoints == 3 && $type_sequence ) { $override_cab3[$depth] = $self->[_roverride_cab3_]->{$type_sequence}; } $breakpoint_stack[$depth] = $forced_breakpoint_count; $container_type[$depth] = # k => && || ? : . $is_container_label_type{$last_nonblank_type} ? $last_nonblank_token : EMPTY_STRING; $identifier_count_stack[$depth] = 0; $index_before_arrow[$depth] = -1; $interrupted_list[$depth] = 0; $item_count_stack[$depth] = 0; $last_nonblank_type[$depth] = $last_nonblank_type; $opening_structure_index_stack[$depth] = $i; $breakpoint_undo_stack[$depth] = $forced_breakpoint_undo_count; $comma_index[$depth] = undef; $last_comma_index[$depth] = undef; $last_dot_index[$depth] = undef; $old_breakpoint_count_stack[$depth] = $old_breakpoint_count; $has_old_logical_breakpoints[$depth] = 0; $rand_or_list[$depth] = []; $rfor_semicolon_list[$depth] = []; $i_equals[$depth] = -1; # if line ends here then signal closing token to break if ( $next_nonblank_type eq 'b' || $next_nonblank_type eq '#' ) { $self->set_closing_breakpoint($i); } # Not all lists of values should be vertically aligned.. 
$dont_align[$depth] = # code BLOCKS are handled at a higher level ##( $block_type ne EMPTY_STRING ) $block_type # certain paren lists || ( $type eq '(' ) && ( # it does not usually look good to align a list of # identifiers in a parameter list, as in: # my($var1, $var2, ...) # (This test should probably be refined, for now I'm just # testing for any keyword) ( $last_nonblank_type eq 'k' ) # a trailing '(' usually indicates a non-list || ( $next_nonblank_type eq '(' ) ); $has_broken_sublist[$depth] = 0; $want_comma_break[$depth] = 0; #---------------------------- # END initialize depth arrays #---------------------------- # patch to outdent opening brace of long if/for/.. # statements (like this one). See similar coding in # set_continuation breaks. We also have to catch it here for # short line fragments which otherwise will not go through # break_long_lines. if ( $block_type # if we have the ')' but not its '(' in this batch.. && ( $last_nonblank_token eq ')' ) && !defined( $mate_index_to_go[$i_last_nonblank_token] ) # and user wants brace to left && !$rOpts_opening_brace_always_on_right && ( $type eq '{' ) # should be true && ( $token eq '{' ) # should be true ) { $self->set_forced_breakpoint( $i - 1 ); } return; } ## end sub break_lists_increasing_depth sub break_lists_decreasing_depth { my ( $self, $rbond_strength_bias ) = @_; # We have arrived at a closing container token in sub break_lists: # the token at index $i is one of these: ')','}', ']' # A number of important breakpoints for this container can now be set # based on the information that we have collected.
This includes: # - breaks at commas to format tables # - breaks at certain logical operators and other good breakpoints # - breaks at opening and closing containers if needed by selected # formatting styles # These breaks are made by calling sub 'set_forced_breakpoint' $self->check_for_new_minimum_depth( $depth, $parent_seqno_to_go[$i] ) if ( $depth < $minimum_depth ); # force all outer logical containers to break after we see an # old breakpoint $has_old_logical_breakpoints[$depth] ||= $has_old_logical_breakpoints[$current_depth]; # Patch to break between ') {' if the paren list is broken. # There is similar logic in break_long_lines for # non-broken lists. if ( $token eq ')' && $next_nonblank_block_type && $interrupted_list[$current_depth] && $next_nonblank_type eq '{' && !$rOpts_opening_brace_always_on_right ) { $self->set_forced_breakpoint($i); } #print "LISTY sees: i=$i type=$type tok=$token block=$block_type depth=$depth next=$next_nonblank_type next_block=$next_nonblank_block_type inter=$interrupted_list[$current_depth]\n"; #----------------------------------------------------------------- # Set breaks at commas to display a table of values if appropriate #----------------------------------------------------------------- my ( $bp_count, $do_not_break_apart ) = ( 0, 0 ); ( $bp_count, $do_not_break_apart ) = $self->set_comma_breakpoints( $current_depth, $rbond_strength_bias ) if ( $item_count_stack[$current_depth] ); #----------------------------------------------------------- # Now set flags needed to decide if we should break open the # container ... This is a long rambling section which has # grown over time to handle all situations.
#----------------------------------------------------------- my $i_opening = $opening_structure_index_stack[$current_depth]; my $saw_opening_structure = ( $i_opening >= 0 ); my $lp_object; if ( $rOpts_line_up_parentheses && $saw_opening_structure ) { $lp_object = $self->[_rlp_object_by_seqno_] ->{ $type_sequence_to_go[$i_opening] }; } # this term is long if we had to break at interior commas.. my $is_long_term = $bp_count > 0; # If this is a short container with one or more comma arrows, # then we will mark it as a long term to open it if requested. # $rOpts_comma_arrow_breakpoints = # 0 - open only if comma precedes closing brace # 1 - stable: except for one line blocks # 2 - try to form 1 line blocks # 3 - ignore => # 4 - always open up if vt=0 # 5 - stable: even for one line blocks if vt=0 my $cab_flag = $rOpts_comma_arrow_breakpoints; # replace -cab=3 if overridden if ( $cab_flag == 3 && $type_sequence ) { my $test_cab = $self->[_roverride_cab3_]->{$type_sequence}; if ( defined($test_cab) ) { $cab_flag = $test_cab } } # PATCH: Modify the -cab flag if we are not processing a list: # We only want the -cab flag to apply to list containers, so # for non-lists we use the default and stable -cab=5 value. # Fixes case b939a. if ( $type_sequence && !$self->[_ris_list_by_seqno_]->{$type_sequence} ) { $cab_flag = 5; } # Ignore old breakpoints when under stress. # Fixes b1203 b1204 as well as b1197-b1200. # But not if -lp: fixes b1264, b1265. NOTE: rechecked with # b1264 to see if this check is still required at all, and # these still require a check, but at higher level beta+3 # instead of beta: b1193 b780 if ( $saw_opening_structure && !$lp_object && $levels_to_go[$i_opening] >= $high_stress_level ) { $cab_flag = 2; # Do not break hash braces under stress (fixes b1238) $do_not_break_apart ||= $types_to_go[$i_opening] eq 'L'; # This option fixes b1235, b1237, b1240 with old and new # -lp, but formatting is nicer with next option.
## $is_long_term ||= ## $levels_to_go[$i_opening] > $stress_level_beta + 1; # This option fixes b1240 but not b1235, b1237 with new -lp, # but this gives better formatting than the previous option. # TODO: see if stress_level_alpha should also be considered $do_not_break_apart ||= $levels_to_go[$i_opening] > $stress_level_beta; } if ( !$is_long_term && $saw_opening_structure && $is_opening_token{ $tokens_to_go[$i_opening] } && $index_before_arrow[ $depth + 1 ] > 0 && !$opening_vertical_tightness{ $tokens_to_go[$i_opening] } ) { $is_long_term = $cab_flag == 4 || $cab_flag == 0 && $last_nonblank_token eq ',' || $cab_flag == 5 && $old_breakpoint_to_go[$i_opening]; } # mark term as long if the length between opening and closing # parens exceeds allowed line length if ( !$is_long_term && $saw_opening_structure ) { my $i_opening_minus = $self->find_token_starting_list($i_opening); my $excess = $self->excess_line_length( $i_opening_minus, $i ); # Use standard spaces for indentation of lists in -lp mode # if it gives a longer line length. This helps to avoid an # instability due to forming and breaking one-line blocks. # This fixes case b1314. my $indentation = $leading_spaces_to_go[$i_opening_minus]; if ( ref($indentation) && $self->[_ris_broken_container_]->{$type_sequence} ) { my $lp_spaces = $indentation->get_spaces(); my $std_spaces = $indentation->get_standard_spaces(); my $diff = $std_spaces - $lp_spaces; if ( $diff > 0 ) { $excess += $diff } } my $tol = $length_tol; # boost tol for an -lp container if ( $lp_tol_boost && $lp_object && ( $rOpts_extended_continuation_indentation || !$self->[_ris_list_by_seqno_]->{$type_sequence} ) ) { $tol += $lp_tol_boost; } # Patch to avoid blinking with -bbxi=2 and -cab=2 # in which variations in -ci cause unstable formatting # in edge cases. We just always add one ci level so that # the formatting is independent of the -BBX results. 
# Fixes cases b1137 b1149 b1150 b1155 b1158 b1159 b1160 # b1161 b1166 b1167 b1168 if ( !$ci_levels_to_go[$i_opening] && $self->[_rbreak_before_container_by_seqno_]->{$type_sequence} ) { $tol += $rOpts_continuation_indentation; } $is_long_term = $excess + $tol > 0; } # We've set breaks after all comma-arrows. Now we have to # undo them if this can be a one-line block # (the only breakpoints set will be due to comma-arrows) if ( # user doesn't require breaking after all comma-arrows ( $cab_flag != 0 ) && ( $cab_flag != 4 ) # and if the opening structure is in this batch && $saw_opening_structure # and either on the same old line && ( $old_breakpoint_count_stack[$current_depth] == $last_old_breakpoint_count # or user wants to form long blocks with arrows || $cab_flag == 2 ) # and we made breakpoints between the opening and closing && ( $breakpoint_undo_stack[$current_depth] < $forced_breakpoint_undo_count ) # and this block is short enough to fit on one line # Note: use < because need 1 more space for possible comma && !$is_long_term ) { $self->undo_forced_breakpoint_stack( $breakpoint_undo_stack[$current_depth] ); } # now see if we have any comma breakpoints left my $has_comma_breakpoints = ( $breakpoint_stack[$current_depth] != $forced_breakpoint_count ); # update broken-sublist flag of the outer container $has_broken_sublist[$depth] = $has_broken_sublist[$depth] || $has_broken_sublist[$current_depth] || $is_long_term || $has_comma_breakpoints; # Having come to the closing ')', '}', or ']', now we have to decide # if we should 'open up' the structure by placing breaks at the # opening and closing containers. This is a tricky decision. Here # are some of the basic considerations: # # -If this is a BLOCK container, then any breakpoints will have # already been set (and according to user preferences), so we need do # nothing here. 
        #
        # -If we have a comma-separated list for which we can align the list
        # items, then we need to do so because otherwise the vertical aligner
        # cannot currently do the alignment.
        #
        # -If this container does itself contain a container which has been
        # broken open, then it should be broken open to properly show the
        # structure.
        #
        # -If there is nothing to align, and no other reason to break apart,
        # then do not do it.
        #
        # We will not break open the parens of a long but 'simple' logical
        # expression.  For example:
        #
        # This is an example of a simple logical expression and its formatting:
        #
        #     if ( $bigwasteofspace1 && $bigwasteofspace2
        #         || $bigwasteofspace3 && $bigwasteofspace4 )
        #
        # Most people would prefer this to the 'spacey' version:
        #
        #     if (
        #         $bigwasteofspace1 && $bigwasteofspace2
        #         || $bigwasteofspace3 && $bigwasteofspace4
        #     )
        #
        # To illustrate the rules for breaking logical expressions, consider:
        #
        #     FULLY DENSE:
        #     if ( $opt_excl
        #         and ( exists $ids_excl_uc{$id_uc}
        #             or grep $id_uc =~ /$_/, @ids_excl_uc ))
        #
        # This is on the verge of being difficult to read.  The current
        # default is to open it up like this:
        #
        #     DEFAULT:
        #     if (
        #         $opt_excl
        #         and ( exists $ids_excl_uc{$id_uc}
        #             or grep $id_uc =~ /$_/, @ids_excl_uc )
        #     )
        #
        # This is a compromise which tries to avoid being too dense and too
        # spacey.  A more spaced version would be:
        #
        #     SPACEY:
        #     if (
        #         $opt_excl
        #         and (
        #             exists $ids_excl_uc{$id_uc}
        #             or grep $id_uc =~ /$_/, @ids_excl_uc
        #         )
        #     )
        #
        # Some people might prefer the spacey version -- an option could be
        # added.  The innermost expression contains a long block '( exists
        # $ids_...  ')'.
# # Here is how the logic goes: We will force a break at the 'or' that # the innermost expression contains, but we will not break apart its # opening and closing containers because (1) it contains no # multi-line sub-containers itself, and (2) there is no alignment to # be gained by breaking it open like this # # and ( # exists $ids_excl_uc{$id_uc} # or grep $id_uc =~ /$_/, @ids_excl_uc # ) # # (although this looks perfectly ok and might be good for long # expressions). The outer 'if' container, though, contains a broken # sub-container, so it will be broken open to avoid too much density. # Also, since it contains no 'or's, there will be a forced break at # its 'and'. # Handle the experimental flag --break-open-compact-parens # NOTE: This flag is not currently used and may eventually be removed. # If this flag is set, we will implement it by # pretending we did not see the opening structure, since in that case # parens always get opened up. if ( $saw_opening_structure && $rOpts_break_open_compact_parens ) { # This parameter is a one-character flag, as follows: # '0' matches no parens -> break open NOT OK # '1' matches all parens -> break open OK # Other values are same as used by the weld-exclusion-list my $flag = $rOpts_break_open_compact_parens; if ( $flag eq '*' || $flag eq '1' ) { $saw_opening_structure = 0; } else { # NOTE: $seqno will be equal to closure var $type_sequence here my $seqno = $type_sequence_to_go[$i_opening]; $saw_opening_structure = !$self->match_paren_control_flag( $seqno, $flag ); } } # Set some more flags telling something about this container.. my $is_simple_logical_expression; if ( $item_count_stack[$current_depth] == 0 && $saw_opening_structure && $tokens_to_go[$i_opening] eq '(' && $is_logical_container{ $container_type[$current_depth] } ) { # This seems to be a simple logical expression with # no existing breakpoints. Set a flag to prevent # opening it up. 
if ( !$has_comma_breakpoints ) { $is_simple_logical_expression = 1; } #--------------------------------------------------- # This seems to be a simple logical expression with # breakpoints (broken sublists, for example). Break # at all 'or's and '||'s. #--------------------------------------------------- else { $self->set_logical_breakpoints($current_depth); } } # break long terms at any C-style for semicolons (c154) if ( $is_long_term && @{ $rfor_semicolon_list[$current_depth] } ) { $self->set_for_semicolon_breakpoints($current_depth); # and open up a long 'for' or 'foreach' container to allow # leading term alignment unless -lp is used. $has_comma_breakpoints = 1 unless ($lp_object); } #---------------------------------------------------------------- # FINALLY: Break open container according to the flags which have # been set. #---------------------------------------------------------------- if ( # breaks for code BLOCKS are handled at a higher level !$block_type # we do not need to break at the top level of an 'if' # type expression && !$is_simple_logical_expression ## modification to keep ': (' containers vertically tight; ## but probably better to let user set -vt=1 to avoid ## inconsistency with other paren types ## && ($container_type[$current_depth] ne ':') # otherwise, we require one of these reasons for breaking: && ( # - this term has forced line breaks $has_comma_breakpoints # - the opening container is separated from this batch # for some reason (comment, blank line, code block) # - this is a non-paren container spanning multiple lines || !$saw_opening_structure # - this is a long block contained in another breakable # container || $is_long_term && !$self->is_in_block_by_i($i_opening) ) ) { # do special -lp breaks at the CLOSING token for INTACT # blocks (because we might not do them if the block does # not break open) if ($lp_object) { my $K_begin_line = $lp_object->get_K_begin_line(); my $i_begin_line = $K_begin_line - $K_to_go[0]; 
$self->set_forced_lp_break( $i_begin_line, $i_opening ); } # break after opening structure. # note: break before closing structure will be automatic if ( $minimum_depth <= $current_depth ) { if ( $i_opening >= 0 ) { if ( !$do_not_break_apart && !is_unbreakable_container($current_depth) ) { $self->set_forced_breakpoint($i_opening); # Do not let brace types L/R use vertical tightness # flags to recombine if we have to break on length # because instability is possible if both vt and vtc # flags are set ... see issue b1444. if ( $is_long_term && $types_to_go[$i_opening] eq 'L' && $opening_vertical_tightness{'{'} && $closing_vertical_tightness{'}'} ) { my $seqno = $type_sequence_to_go[$i_opening]; if ($seqno) { $self->[_rbreak_container_]->{$seqno} = 1; } } } } # break at ',' of lower depth level before opening token if ( $last_comma_index[$depth] ) { $self->set_forced_breakpoint( $last_comma_index[$depth] ); } # break at '.' of lower depth level before opening token if ( $last_dot_index[$depth] ) { $self->set_forced_breakpoint( $last_dot_index[$depth] ); } # break before opening structure if preceded by another # closing structure and a comma. This is normally # done by the previous closing brace, but not # if it was a one-line block. if ( $i_opening > 2 ) { my $i_prev = ( $types_to_go[ $i_opening - 1 ] eq 'b' ) ? $i_opening - 2 : $i_opening - 1; my $type_prev = $types_to_go[$i_prev]; my $token_prev = $tokens_to_go[$i_prev]; if ( $type_prev eq ',' && ( $types_to_go[ $i_prev - 1 ] eq ')' || $types_to_go[ $i_prev - 1 ] eq '}' ) ) { $self->set_forced_breakpoint($i_prev); } # also break before something like ':(' or '?(' # if appropriate. 
elsif ($type_prev =~ /^([k\:\?]|&&|\|\|)$/ && $want_break_before{$token_prev} ) { $self->set_forced_breakpoint($i_prev); } } } # break after comma following closing structure if ( $types_to_go[ $i + 1 ] eq ',' ) { $self->set_forced_breakpoint( $i + 1 ); } # break before an '=' following closing structure if ( $is_assignment{$next_nonblank_type} && ( $breakpoint_stack[$current_depth] != $forced_breakpoint_count ) ) { $self->set_forced_breakpoint($i); } # break at any comma before the opening structure. Added # for -lp, but seems to be good in general. It isn't # obvious how far back to look; the '5' below seems to # work well and will catch the comma in something like # push @list, myfunc( $param, $param, .. my $icomma = $last_comma_index[$depth]; if ( defined($icomma) && ( $i_opening - $icomma ) < 5 ) { unless ( $forced_breakpoint_to_go[$icomma] ) { $self->set_forced_breakpoint($icomma); } } } #----------------------------------------------------------- # Break open a logical container if it was already open #----------------------------------------------------------- elsif ($is_simple_logical_expression && $has_old_logical_breakpoints[$current_depth] ) { $self->set_logical_breakpoints($current_depth); } # Handle long container which does not get opened up elsif ($is_long_term) { # must set fake breakpoint to alert outer containers that # they are complex set_fake_breakpoint(); } return; } ## end sub break_lists_decreasing_depth } ## end closure break_lists my %is_kwiZ; my %is_key_type; BEGIN { # Added 'w' to fix b1172 my @q = qw(k w i Z ->); @is_kwiZ{@q} = (1) x scalar(@q); # added = for b1211 @q = qw<( [ { L R } ] ) = b>; push @q, ','; @is_key_type{@q} = (1) x scalar(@q); } ## end BEGIN use constant DEBUG_FIND_START => 0; sub find_token_starting_list { # When testing to see if a block will fit on one line, some # previous token(s) may also need to be on the line; particularly # if this is a sub call. So we will look back at least one # token.
    my ( $self, $i_opening_paren ) = @_;

    # This will be the return index
    my $i_opening_minus = $i_opening_paren;

    if ( $i_opening_minus <= 0 ) {
        return $i_opening_minus;
    }

    my $im1 = $i_opening_paren - 1;
    my ( $iprev_nb, $type_prev_nb ) = ( $im1, $types_to_go[$im1] );
    if ( $type_prev_nb eq 'b' && $iprev_nb > 0 ) {
        $iprev_nb -= 1;
        $type_prev_nb = $types_to_go[$iprev_nb];
    }

    if ( $type_prev_nb eq ',' ) {

        # a previous comma is a good break point
        # $i_opening_minus = $i_opening_paren;
    }

    elsif (
        $tokens_to_go[$i_opening_paren] eq '('

        # non-parens added here to fix case b1186
        || $is_kwiZ{$type_prev_nb}
      )
    {
        $i_opening_minus = $im1;

        # Walk back to improve length estimate...
        # FIX for cases b1169 b1170 b1171: start walking back
        # at the previous nonblank. This makes the result insensitive
        # to the flag --space-function-paren, and similar.
        # previous loop: for ( my $j = $im1 ; $j >= 0 ; $j-- ) {
        foreach my $j ( reverse( 0 .. $iprev_nb ) ) {
            if ( $is_key_type{ $types_to_go[$j] } ) {

                # fix for b1211
                if ( $types_to_go[$j] eq '=' ) { $i_opening_minus = $j }
                last;
            }
            $i_opening_minus = $j;
        }
        if ( $types_to_go[$i_opening_minus] eq 'b' ) { $i_opening_minus++ }
    }

    DEBUG_FIND_START && print <<EOM;
FIND_START: i=$i_opening_paren tok=$tokens_to_go[$i_opening_paren] => im=$i_opening_minus tok=$tokens_to_go[$i_opening_minus]
EOM

    return $i_opening_minus;
} ## end sub find_token_starting_list

{    ## begin closure table_maker

    my %is_keyword_with_special_leading_term;

    BEGIN {

        # These keywords have prototypes which allow a special leading item
        # followed by a list
        my @q =
          qw( chmod formline grep join kill map pack printf push sprintf unshift );
        @is_keyword_with_special_leading_term{@q} = (1) x scalar(@q);
    } ## end BEGIN

    use constant DEBUG_SPARSE => 0;

    sub table_maker {

        # Given a list of comma-separated items, set breakpoints at some of
        # the commas, if necessary, to make it easy to read.
        # This is done by making calls to 'set_forced_breakpoint'.
        # This is a complex routine because there are many special cases.
        # Returns: nothing

        # The numerous variables involved are contained in three hashes:
        # $rhash_IN : For contents see the calling routine
        # $rhash_A: For contents see return from sub 'table_layout_A'
        # $rhash_B: For contents see return from sub 'table_layout_B'

        my ( $self, $rhash_IN ) = @_;

        # Find lengths of all list items needed for calculating page layout
        my $rhash_A = table_layout_A($rhash_IN);
        return if ( !defined($rhash_A) );

        # Some variables received from caller...
        my $i_closing_paren    = $rhash_IN->{i_closing_paren};
        my $i_opening_paren    = $rhash_IN->{i_opening_paren};
        my $has_broken_sublist = $rhash_IN->{has_broken_sublist};
        my $interrupted        = $rhash_IN->{interrupted};

        #-----------------------------------------
        # Section A: Handle some special cases ...
        #-----------------------------------------

        #-------------------------------------------------------------
        # Special Case A1: Compound List Rule 1:
        # Break at (almost) every comma for a list containing a broken
        # sublist. This has higher priority than the Interrupted List
        # Rule.
        #-------------------------------------------------------------
        if ($has_broken_sublist) {
            $self->apply_broken_sublist_rule( $rhash_A, $interrupted );
            return;
        }

        #--------------------------------------------------------------
        # Special Case A2: Interrupted List Rule:
        # A list is forced to use old breakpoints if it was interrupted
        # by side comments or blank lines, or requested by user.
        #--------------------------------------------------------------
        if (   $rOpts_break_at_old_comma_breakpoints
            || $interrupted
            || $i_opening_paren < 0 )
        {
            my $i_first_comma     = $rhash_A->{_i_first_comma};
            my $i_true_last_comma = $rhash_A->{_i_true_last_comma};
            $self->copy_old_breakpoints( $i_first_comma, $i_true_last_comma );
            return;
        }

        #-----------------------------------------------------------------
        # Special Case A3: If it fits on one line, return and let the line
        # break logic decide if and where to break.
#----------------------------------------------------------------- # The -bbxi=2 parameters can add an extra hidden level of indentation # so they need a tolerance to avoid instability. Fixes b1259, 1260. my $opening_token = $tokens_to_go[$i_opening_paren]; my $tol = 0; if ( $break_before_container_types{$opening_token} && $container_indentation_options{$opening_token} && $container_indentation_options{$opening_token} == 2 ) { $tol = $rOpts_indent_columns; # use greater of -ci and -i (fix for case b1334) if ( $tol < $rOpts_continuation_indentation ) { $tol = $rOpts_continuation_indentation; } } my $i_opening_minus = $self->find_token_starting_list($i_opening_paren); my $excess = $self->excess_line_length( $i_opening_minus, $i_closing_paren ); return if ( $excess + $tol <= 0 ); #--------------------------------------- # Section B: Handle a multiline list ... #--------------------------------------- $self->break_multiline_list( $rhash_IN, $rhash_A, $i_opening_minus ); return; } ## end sub table_maker sub apply_broken_sublist_rule { my ( $self, $rhash_A, $interrupted ) = @_; my $ritem_lengths = $rhash_A->{_ritem_lengths}; my $ri_term_begin = $rhash_A->{_ri_term_begin}; my $ri_term_end = $rhash_A->{_ri_term_end}; my $ri_term_comma = $rhash_A->{_ri_term_comma}; my $item_count = $rhash_A->{_item_count_A}; my $i_first_comma = $rhash_A->{_i_first_comma}; my $i_true_last_comma = $rhash_A->{_i_true_last_comma}; # Break at every comma except for a comma between two # simple, small terms. This prevents long vertical # columns of, say, just 0's. my $small_length = 10; # 2 + actual maximum length wanted # We'll insert a break in long runs of small terms to # allow alignment in uniform tables. 
my $skipped_count = 0; my $columns = table_columns_available($i_first_comma); my $fields = int( $columns / $small_length ); if ( $rOpts_maximum_fields_per_table && $fields > $rOpts_maximum_fields_per_table ) { $fields = $rOpts_maximum_fields_per_table; } my $max_skipped_count = $fields - 1; my $is_simple_last_term = 0; my $is_simple_next_term = 0; foreach my $j ( 0 .. $item_count ) { $is_simple_last_term = $is_simple_next_term; $is_simple_next_term = 0; if ( $j < $item_count && $ri_term_end->[$j] == $ri_term_begin->[$j] && $ritem_lengths->[$j] <= $small_length ) { $is_simple_next_term = 1; } next if $j == 0; if ( $is_simple_last_term && $is_simple_next_term && $skipped_count < $max_skipped_count ) { $skipped_count++; } else { $skipped_count = 0; my $i_tc = $ri_term_comma->[ $j - 1 ]; last unless defined $i_tc; $self->set_forced_breakpoint($i_tc); } } # always break at the last comma if this list is # interrupted; we wouldn't want to leave a terminal '{', for # example. if ($interrupted) { $self->set_forced_breakpoint($i_true_last_comma); } return; } ## end sub apply_broken_sublist_rule sub set_emergency_comma_breakpoints { my ( $self, # $number_of_fields_best, $rhash_IN, $comma_count, $i_first_comma, ) = @_; # The number of fields worked out to be negative, so we # have to make an emergency fix. my $rcomma_index = $rhash_IN->{rcomma_index}; my $next_nonblank_type = $rhash_IN->{next_nonblank_type}; my $rdo_not_break_apart = $rhash_IN->{rdo_not_break_apart}; my $must_break_open = $rhash_IN->{must_break_open}; # are we an item contained in an outer list? my $in_hierarchical_list = $next_nonblank_type =~ /^[\}\,]$/; # In many cases, it may be best to not force a break if there is just # one comma, because the standard continuation break logic will do a # better job without it. # In the common case that all but one of the terms can fit # on a single line, it may look better not to break open the # containing parens. 
Consider, for example # $color = # join ( '/', # sort { $color_value{$::a} <=> $color_value{$::b}; } # keys %colors ); # which will look like this with the container broken: # $color = join ( # '/', # sort { $color_value{$::a} <=> $color_value{$::b}; } keys %colors # ); # Here is an example of this rule for a long last term: # log_message( 0, 256, 128, # "Number of routes in adj-RIB-in to be considered: $peercount" ); # And here is an example with a long first term: # $s = sprintf( # "%2d wallclock secs (%$f usr %$f sys + %$f cusr %$f csys = %$f CPU)", # $r, $pu, $ps, $cu, $cs, $tt # ) # if $style eq 'all'; my $i_last_comma = $rcomma_index->[ $comma_count - 1 ]; my $long_last_term = $self->excess_line_length( 0, $i_last_comma ) <= 0; my $long_first_term = $self->excess_line_length( $i_first_comma + 1, $max_index_to_go ) <= 0; # break at every comma ... if ( # if requested by user or is best looking $number_of_fields_best == 1 # or if this is a sublist of a larger list || $in_hierarchical_list # or if multiple commas and we don't have a long first or last # term || ( $comma_count > 1 && !( $long_last_term || $long_first_term ) ) ) { foreach ( 0 .. 
              $comma_count - 1 ) {
                $self->set_forced_breakpoint( $rcomma_index->[$_] );
            }
        }
        elsif ($long_last_term) {

            $self->set_forced_breakpoint($i_last_comma);
            ${$rdo_not_break_apart} = 1 unless $must_break_open;
        }
        elsif ($long_first_term) {

            $self->set_forced_breakpoint($i_first_comma);
        }
        else {

            # let breaks be defined by default bond strength logic
        }
        return;
    } ## end sub set_emergency_comma_breakpoints

    sub break_multiline_list {
        my ( $self, $rhash_IN, $rhash_A, $i_opening_minus ) = @_;

        # Overridden variables
        my $item_count       = $rhash_A->{_item_count_A};
        my $identifier_count = $rhash_A->{_identifier_count_A};

        # Derived variables:
        my $ritem_lengths          = $rhash_A->{_ritem_lengths};
        my $ri_term_begin          = $rhash_A->{_ri_term_begin};
        my $ri_term_end            = $rhash_A->{_ri_term_end};
        my $ri_term_comma          = $rhash_A->{_ri_term_comma};
        my $rmax_length            = $rhash_A->{_rmax_length};
        my $comma_count            = $rhash_A->{_comma_count};
        my $i_effective_last_comma = $rhash_A->{_i_effective_last_comma};
        my $first_term_length      = $rhash_A->{_first_term_length};
        my $i_first_comma          = $rhash_A->{_i_first_comma};
        my $i_last_comma           = $rhash_A->{_i_last_comma};
        my $i_true_last_comma      = $rhash_A->{_i_true_last_comma};

        # Variables received from caller
        my $i_opening_paren     = $rhash_IN->{i_opening_paren};
        my $i_closing_paren     = $rhash_IN->{i_closing_paren};
        my $rcomma_index        = $rhash_IN->{rcomma_index};
        my $next_nonblank_type  = $rhash_IN->{next_nonblank_type};
        my $list_type           = $rhash_IN->{list_type};
        my $interrupted         = $rhash_IN->{interrupted};
        my $rdo_not_break_apart = $rhash_IN->{rdo_not_break_apart};
        my $must_break_open     = $rhash_IN->{must_break_open};

        ## NOTE: these input vars from caller use the values from rhash_A (see above):
        ## my $item_count = $rhash_IN->{item_count};
        ## my $identifier_count = $rhash_IN->{identifier_count};

        # NOTE: i_opening_paren changes value below so we need to get these here
        my $opening_is_in_block = $self->is_in_block_by_i($i_opening_paren);
        my $opening_token       = $tokens_to_go[$i_opening_paren];
#--------------------------------------------------------------- # Section B1: Determine '$number_of_fields' = the best number of # fields to use if this is to be formatted as a table. #--------------------------------------------------------------- # Now we know that this block spans multiple lines; we have to set # at least one breakpoint -- real or fake -- as a signal to break # open any outer containers. set_fake_breakpoint(); # Set a flag indicating if we need to break open to keep -lp # items aligned. This is necessary if any of the list terms # exceeds the available space after the '('. my $need_lp_break_open = $must_break_open; my $is_lp_formatting = ref( $leading_spaces_to_go[$i_first_comma] ); if ( $is_lp_formatting && !$must_break_open ) { my $columns_if_unbroken = $maximum_line_length_at_level[ $levels_to_go[$i_opening_minus] ] - total_line_length( $i_opening_minus, $i_opening_paren ); $need_lp_break_open = ( $rmax_length->[0] > $columns_if_unbroken ) || ( $rmax_length->[1] > $columns_if_unbroken ) || ( $first_term_length > $columns_if_unbroken ); } my $hash_B = $self->table_layout_B( $rhash_IN, $rhash_A, $is_lp_formatting ); return if ( !defined($hash_B) ); # Updated variables $i_first_comma = $hash_B->{_i_first_comma_B}; $i_opening_paren = $hash_B->{_i_opening_paren_B}; $item_count = $hash_B->{_item_count_B}; # New variables my $columns = $hash_B->{_columns}; my $formatted_columns = $hash_B->{_formatted_columns}; my $formatted_lines = $hash_B->{_formatted_lines}; my $max_width = $hash_B->{_max_width}; my $new_identifier_count = $hash_B->{_new_identifier_count}; my $number_of_fields = $hash_B->{_number_of_fields}; my $odd_or_even = $hash_B->{_odd_or_even}; my $packed_columns = $hash_B->{_packed_columns}; my $packed_lines = $hash_B->{_packed_lines}; my $pair_width = $hash_B->{_pair_width}; my $ri_ragged_break_list = $hash_B->{_ri_ragged_break_list}; my $use_separate_first_term = $hash_B->{_use_separate_first_term}; # are we an item contained in an outer 
list? my $in_hierarchical_list = $next_nonblank_type =~ /^[\}\,]$/; my $unused_columns = $formatted_columns - $packed_columns; # set some empirical parameters to help decide if we should try to # align; high sparsity does not look good, especially with few lines my $sparsity = ($unused_columns) / ($formatted_columns); my $max_allowed_sparsity = ( $item_count < 3 ) ? 0.1 : ( $packed_lines == 1 ) ? 0.15 : ( $packed_lines == 2 ) ? 0.4 : 0.7; my $two_line_word_wrap_ok; if ( $opening_token eq '(' ) { # default is to allow wrapping of short paren lists $two_line_word_wrap_ok = 1; # but turn off word wrap where requested if ($rOpts_break_open_compact_parens) { # This parameter is a one-character flag, as follows: # '0' matches no parens -> break open NOT OK -> word wrap OK # '1' matches all parens -> break open OK -> word wrap NOT OK # Other values are the same as used by the weld-exclusion-list my $flag = $rOpts_break_open_compact_parens; if ( $flag eq '*' || $flag eq '1' ) { $two_line_word_wrap_ok = 0; } elsif ( $flag eq '0' ) { $two_line_word_wrap_ok = 1; } else { my $seqno = $type_sequence_to_go[$i_opening_paren]; $two_line_word_wrap_ok = !$self->match_paren_control_flag( $seqno, $flag ); } } } #------------------------------------------------------------------- # Section B2: Check for shortcut methods, which avoid treating # a list as a table for relatively small parenthesized lists. These # are usually easier to read if not formatted as tables. #------------------------------------------------------------------- if ( $packed_lines <= 2 # probably can fit in 2 lines && $item_count < 9 # doesn't have too many items && $opening_is_in_block # not a sub-container && $two_line_word_wrap_ok # ok to wrap this paren list ) { # Section B2A: Shortcut method 1: for -lp and just one comma: # This is a no-brainer, just break at the comma. 
if ( $is_lp_formatting # -lp && $item_count == 2 # two items, one comma && !$must_break_open ) { my $i_break = $rcomma_index->[0]; $self->set_forced_breakpoint($i_break); ${$rdo_not_break_apart} = 1; return; } # Section B2B: Shortcut method 2 is for most small ragged lists # which might look best if not displayed as a table. if ( ( $number_of_fields == 2 && $item_count == 3 ) || ( $new_identifier_count > 0 # isn't all quotes && $sparsity > 0.15 ) # would be fairly spaced gaps if aligned ) { my $break_count = $self->set_ragged_breakpoints( $ri_term_comma, $ri_ragged_break_list ); ++$break_count if ($use_separate_first_term); # NOTE: we should really use the true break count here, # which can be greater if there are large terms and # little space, but usually this will work well enough. unless ($must_break_open) { if ( $break_count <= 1 ) { ${$rdo_not_break_apart} = 1; } elsif ( $is_lp_formatting && !$need_lp_break_open ) { ${$rdo_not_break_apart} = 1; } } return; } } ## end shortcut methods # debug stuff DEBUG_SPARSE && do { # How many spaces across the page will we fill? my $columns_per_line = ( int $number_of_fields / 2 ) * $pair_width + ( $number_of_fields % 2 ) * $max_width; print STDOUT "SPARSE:cols=$columns commas=$comma_count items:$item_count ids=$identifier_count pairwidth=$pair_width fields=$number_of_fields lines packed: $packed_lines packed_cols=$packed_columns fmtd:$formatted_lines cols /line:$columns_per_line unused:$unused_columns fmtd:$formatted_columns sparsity=$sparsity allow=$max_allowed_sparsity\n"; }; #------------------------------------------------------------------ # Section B3: Compound List Rule 2: # If this list is too long for one line, and it is an item of a # larger list, then we must format it, regardless of sparsity # (ian.t). One reason that we have to do this is to trigger # Compound List Rule 1, above, which causes breaks at all commas of # all outer lists. In this way, the structure will be properly # displayed. 
#------------------------------------------------------------------ # Decide if this list is too long for one line unless broken my $total_columns = table_columns_available($i_opening_paren); my $too_long = $packed_columns > $total_columns; # For a paren list, include the length of the token just before the # '(' because this is likely a sub call, and we would have to # include the sub name on the same line as the list. This is still # imprecise, but not too bad. (steve.t) if ( !$too_long && $i_opening_paren > 0 && $opening_token eq '(' ) { $too_long = $self->excess_line_length( $i_opening_minus, $i_effective_last_comma + 1 ) > 0; } # TODO: For an item after a '=>', try to include the length of the # thing before the '=>'. This is crude and should be improved by # actually looking back token by token. if ( !$too_long && $i_opening_paren > 0 && $list_type eq '=>' ) { my $i_opening_minus_test = $i_opening_paren - 4; if ( $i_opening_minus >= 0 ) { $too_long = $self->excess_line_length( $i_opening_minus_test, $i_effective_last_comma + 1 ) > 0; } } # Always break lists contained in '[' and '{' if too long for 1 line, # and always break lists which are too long and part of a more complex # structure. my $must_break_open_container = $must_break_open || ( $too_long && ( $in_hierarchical_list || !$two_line_word_wrap_ok ) ); #-------------------------------------------------------------------- # Section B4: A table will work here. But do not attempt to align # columns if this is a tiny table or it would be too spaced. It # seems that the more packed lines we have, the sparser the list that # can be allowed and still look ok. 
        #--------------------------------------------------------------------
        if (   ( $formatted_lines < 3 && $packed_lines < $formatted_lines )
            || ( $formatted_lines < 2 )
            || ( $unused_columns > $max_allowed_sparsity * $formatted_columns )
          )
        {

            #----------------------------------------------------------------
            # Section B4A: too sparse: would not look good aligned in a table
            #----------------------------------------------------------------

            # use old breakpoints if this is a 'big' list
            if ( $packed_lines > 2 && $item_count > 10 ) {
                write_logfile_entry("List sparse: using old breakpoints\n");
                $self->copy_old_breakpoints( $i_first_comma, $i_last_comma );
            }

            # let the continuation logic handle it if 2 lines
            else {

                my $break_count =
                  $self->set_ragged_breakpoints( $ri_term_comma,
                    $ri_ragged_break_list );
                ++$break_count if ($use_separate_first_term);

                unless ($must_break_open_container) {
                    if ( $break_count <= 1 ) {
                        ${$rdo_not_break_apart} = 1;
                    }
                    elsif ( $is_lp_formatting && !$need_lp_break_open ) {
                        ${$rdo_not_break_apart} = 1;
                    }
                }
            }
            return;
        }

        #--------------------------------------------
        # Section B4B: Go ahead and format as a table
        #--------------------------------------------
        $self->write_formatted_table( $number_of_fields, $comma_count,
            $rcomma_index, $use_separate_first_term );

        return;
    } ## end sub break_multiline_list

    sub table_layout_A {

        my ($rhash_IN) = @_;

        # Find lengths of all list items needed to calculate page layout

        # Returns:
        #    - nothing if this list is empty, or
        #    - a ref to a hash containing some derived parameters

        my $i_opening_paren  = $rhash_IN->{i_opening_paren};
        my $i_closing_paren  = $rhash_IN->{i_closing_paren};
        my $identifier_count = $rhash_IN->{identifier_count};
        my $rcomma_index     = $rhash_IN->{rcomma_index};
        my $item_count       = $rhash_IN->{item_count};

        # nothing to do if no commas seen
        return if ( $item_count < 1 );

        my $i_first_comma     = $rcomma_index->[0];
        my $i_true_last_comma = $rcomma_index->[ $item_count - 1 ];
        my $i_last_comma      = $i_true_last_comma;
        if (
$i_last_comma >= $max_index_to_go ) { $item_count -= 1; return if ( $item_count < 1 ); $i_last_comma = $rcomma_index->[ $item_count - 1 ]; } my $comma_count = $item_count; my $ritem_lengths = []; my $ri_term_begin = []; my $ri_term_end = []; my $ri_term_comma = []; my $rmax_length = [ 0, 0 ]; my $i_prev_plus; my $first_term_length; my $i = $i_opening_paren; my $is_odd = 1; foreach my $j ( 0 .. $comma_count - 1 ) { $is_odd = 1 - $is_odd; $i_prev_plus = $i + 1; $i = $rcomma_index->[$j]; my $i_term_end = ( $i == 0 || $types_to_go[ $i - 1 ] eq 'b' ) ? $i - 2 : $i - 1; my $i_term_begin = ( $types_to_go[$i_prev_plus] eq 'b' ) ? $i_prev_plus + 1 : $i_prev_plus; push @{$ri_term_begin}, $i_term_begin; push @{$ri_term_end}, $i_term_end; push @{$ri_term_comma}, $i; # note: currently adding 2 to all lengths (for comma and space) my $length = 2 + token_sequence_length( $i_term_begin, $i_term_end ); push @{$ritem_lengths}, $length; if ( $j == 0 ) { $first_term_length = $length; } else { if ( $length > $rmax_length->[$is_odd] ) { $rmax_length->[$is_odd] = $length; } } } # now we have to make a distinction between the comma count and item # count, because the item count will be one greater than the comma # count if the last item is not terminated with a comma my $i_b = ( $types_to_go[ $i_last_comma + 1 ] eq 'b' ) ? $i_last_comma + 1 : $i_last_comma; my $i_e = ( $types_to_go[ $i_closing_paren - 1 ] eq 'b' ) ? 
$i_closing_paren - 2 : $i_closing_paren - 1; my $i_effective_last_comma = $i_last_comma; my $last_item_length = token_sequence_length( $i_b + 1, $i_e ); if ( $last_item_length > 0 ) { # add 2 to length because other lengths include a comma and a blank $last_item_length += 2; push @{$ritem_lengths}, $last_item_length; push @{$ri_term_begin}, $i_b + 1; push @{$ri_term_end}, $i_e; push @{$ri_term_comma}, undef; my $i_odd = $item_count % 2; if ( $last_item_length > $rmax_length->[$i_odd] ) { $rmax_length->[$i_odd] = $last_item_length; } $item_count++; $i_effective_last_comma = $i_e + 1; if ( $types_to_go[ $i_b + 1 ] =~ /^[iR\]]$/ ) { $identifier_count++; } } # be sure we do not extend beyond the current list length if ( $i_effective_last_comma >= $max_index_to_go ) { $i_effective_last_comma = $max_index_to_go - 1; } # Return the hash of derived variables. return { # Updated variables _item_count_A => $item_count, _identifier_count_A => $identifier_count, # New variables _ritem_lengths => $ritem_lengths, _ri_term_begin => $ri_term_begin, _ri_term_end => $ri_term_end, _ri_term_comma => $ri_term_comma, _rmax_length => $rmax_length, _comma_count => $comma_count, _i_effective_last_comma => $i_effective_last_comma, _first_term_length => $first_term_length, _i_first_comma => $i_first_comma, _i_last_comma => $i_last_comma, _i_true_last_comma => $i_true_last_comma, }; } ## end sub table_layout_A sub table_layout_B { my ( $self, $rhash_IN, $rhash_A, $is_lp_formatting ) = @_; # Determine variables for the best table layout, including # the best number of fields. 
        # Returns:
        #    - nothing if nothing more to do
        #    - a ref to a hash containing some derived parameters

        # Variables from caller
        my $i_opening_paren     = $rhash_IN->{i_opening_paren};
        my $list_type           = $rhash_IN->{list_type};
        my $next_nonblank_type  = $rhash_IN->{next_nonblank_type};
        my $rcomma_index        = $rhash_IN->{rcomma_index};
        my $rdo_not_break_apart = $rhash_IN->{rdo_not_break_apart};

        # Table size variables
        my $comma_count            = $rhash_A->{_comma_count};
        my $first_term_length      = $rhash_A->{_first_term_length};
        my $i_effective_last_comma = $rhash_A->{_i_effective_last_comma};
        my $i_first_comma          = $rhash_A->{_i_first_comma};
        my $identifier_count       = $rhash_A->{_identifier_count_A};
        my $item_count             = $rhash_A->{_item_count_A};
        my $ri_term_begin          = $rhash_A->{_ri_term_begin};
        my $ri_term_comma          = $rhash_A->{_ri_term_comma};
        my $ri_term_end            = $rhash_A->{_ri_term_end};
        my $ritem_lengths          = $rhash_A->{_ritem_lengths};
        my $rmax_length            = $rhash_A->{_rmax_length};

        # Specify if the list must have an even number of fields or not.
        # It is generally safest to assume an even number, because the
        # list items might be a hash list. But if we can be sure that
        # it is not a hash, then we can allow an odd number for more
        # flexibility.
        # 1 = odd field count ok, 2 = want even count
        my $odd_or_even = 2;
        if (
               $identifier_count >= $item_count - 1
            || $is_assignment{$next_nonblank_type}
            || (   $list_type
                && $list_type ne '=>'
                && $list_type !~ /^[\:\?]$/ )
          )
        {
            $odd_or_even = 1;
        }

        # do we have a long first term which should be
        # left on a line by itself?
        my $use_separate_first_term = (
            $odd_or_even == 1              # only if we can use 1 field/line
              && $item_count > 3           # need several items
              && $first_term_length >
              2 * $rmax_length->[0] - 2    # need long first term
              && $first_term_length >
              2 * $rmax_length->[1] - 2    # need long first term
        );

        # or do we know from the type of list that the first term should
        # be placed alone?
if ( !$use_separate_first_term ) { if ( $is_keyword_with_special_leading_term{$list_type} ) { $use_separate_first_term = 1; # should the container be broken open? if ( $item_count < 3 ) { if ( $i_first_comma - $i_opening_paren < 4 ) { ${$rdo_not_break_apart} = 1; } } elsif ($first_term_length < 20 && $i_first_comma - $i_opening_paren < 4 ) { my $columns = table_columns_available($i_first_comma); if ( $first_term_length < $columns ) { ${$rdo_not_break_apart} = 1; } } } } # if so, if ($use_separate_first_term) { # ..set a break and update starting values $self->set_forced_breakpoint($i_first_comma); $item_count--; #--------------------------------------------------------------- # Section B1A: Stop if one item remains ($i_first_comma = undef) #--------------------------------------------------------------- # Fix for b1442: use '$item_count' here instead of '$comma_count' # to make the result independent of any trailing comma. return if ( $item_count <= 1 ); $i_opening_paren = $i_first_comma; $i_first_comma = $rcomma_index->[1]; shift @{$ritem_lengths}; shift @{$ri_term_begin}; shift @{$ri_term_end}; shift @{$ri_term_comma}; } # if not, update the metrics to include the first term else { if ( $first_term_length > $rmax_length->[0] ) { $rmax_length->[0] = $first_term_length; } } # Field width parameters my $pair_width = ( $rmax_length->[0] + $rmax_length->[1] ); my $max_width = ( $rmax_length->[0] > $rmax_length->[1] ) ? $rmax_length->[0] : $rmax_length->[1]; # Number of free columns across the page width for laying out tables my $columns = table_columns_available($i_first_comma); # Patch for b1210 and b1216-b1218 when -vmll is set. If we are unable # to break after an opening paren, then the maximum line length for the # first line could be less than the later lines. So we need to reduce # the line length. Normally, we will get a break after an opening # paren, but in some cases we might not. 
if ( $rOpts_variable_maximum_line_length && $tokens_to_go[$i_opening_paren] eq '(' && @{$ri_term_begin} ) { my $ib = $ri_term_begin->[0]; my $type = $types_to_go[$ib]; # So far, the only known instance of this problem is when # a bareword follows an opening paren with -vmll if ( $type eq 'w' ) { # If a line starts with paren+space+terms, then its max length # could be up to ci+2-i spaces less than if the term went out # on a line after the paren. So.. my $tol_w = max( 0, 2 + $rOpts_continuation_indentation - $rOpts_indent_columns ); $columns = max( 0, $columns - $tol_w ); ## Here is the original b1210 fix, but it failed on b1216-b1218 ##my $columns2 = table_columns_available($i_opening_paren); ##$columns = min( $columns, $columns2 ); } } # Estimated maximum number of fields which fit this space. # This will be our first guess: my $number_of_fields_max = maximum_number_of_fields( $columns, $odd_or_even, $max_width, $pair_width ); my $number_of_fields = $number_of_fields_max; # Find the best-looking number of fields. # This will be our second guess, if possible. 
my ( $number_of_fields_best, $ri_ragged_break_list, $new_identifier_count ) = $self->study_list_complexity( $ri_term_begin, $ri_term_end, $ritem_lengths, $max_width ); if ( $number_of_fields_best != 0 && $number_of_fields_best < $number_of_fields_max ) { $number_of_fields = $number_of_fields_best; } # fix b1427 elsif ($number_of_fields_best > 1 && $number_of_fields_best > $number_of_fields_max ) { $number_of_fields_best = $number_of_fields_max; } # If we are crowded and the -lp option is being used, try # to undo some indentation if ( $is_lp_formatting && ( $number_of_fields == 0 || ( $number_of_fields == 1 && $number_of_fields != $number_of_fields_best ) ) ) { ( $number_of_fields, $number_of_fields_best, $columns ) = $self->lp_table_fix( $columns, $i_first_comma, $max_width, $number_of_fields, $number_of_fields_best, $odd_or_even, $pair_width, $ritem_lengths, ); } # try for one column if two won't work if ( $number_of_fields <= 0 ) { $number_of_fields = int( $columns / $max_width ); } # The user can place an upper bound on the number of fields, # which can be useful for doing maintenance on tables if ( $rOpts_maximum_fields_per_table && $number_of_fields > $rOpts_maximum_fields_per_table ) { $number_of_fields = $rOpts_maximum_fields_per_table; } # How many columns (characters) and lines would this container take # if no additional whitespace were added? my $packed_columns = token_sequence_length( $i_opening_paren + 1, $i_effective_last_comma + 1 ); if ( $columns <= 0 ) { $columns = 1 } # avoid divide by zero my $packed_lines = 1 + int( $packed_columns / $columns ); #----------------------------------------------------------------- # Section B1B: Stop here if we did not compute a positive number of # fields. In this case we just have to bail out. 
        #-----------------------------------------------------------------
        if ( $number_of_fields <= 0 ) {
            $self->set_emergency_comma_breakpoints(
                $number_of_fields_best, $rhash_IN,
                $comma_count,           $i_first_comma,
            );
            return;
        }

        #------------------------------------------------------------------
        # Section B1C: We have a tentative field count that seems to work.
        # Now we must look more closely to determine if a table layout will
        # actually look okay.
        #------------------------------------------------------------------

        # How many lines will this require?
        my $formatted_lines = $item_count / ($number_of_fields);
        if ( $formatted_lines != int $formatted_lines ) {
            $formatted_lines = 1 + int $formatted_lines;
        }

        # So far we've been trying to fill out to the right margin. But
        # compact tables are easier to read, so let's see if we can use fewer
        # fields without increasing the number of lines.
        $number_of_fields = compactify_table( $item_count, $number_of_fields,
            $formatted_lines, $odd_or_even );

        my $formatted_columns;

        if ( $number_of_fields > 1 ) {
            $formatted_columns =
              ( $pair_width * ( int( $item_count / 2 ) ) +
                  ( $item_count % 2 ) * $max_width );
        }
        else {
            $formatted_columns = $max_width * $item_count;
        }
        if ( $formatted_columns < $packed_columns ) {
            $formatted_columns = $packed_columns;
        }

        # Construct hash_B:
        return {

            # Updated variables
            _i_first_comma_B   => $i_first_comma,
            _i_opening_paren_B => $i_opening_paren,
            _item_count_B      => $item_count,

            # New variables
            _columns                 => $columns,
            _formatted_columns       => $formatted_columns,
            _formatted_lines         => $formatted_lines,
            _max_width               => $max_width,
            _new_identifier_count    => $new_identifier_count,
            _number_of_fields        => $number_of_fields,
            _odd_or_even             => $odd_or_even,
            _packed_columns          => $packed_columns,
            _packed_lines            => $packed_lines,
            _pair_width              => $pair_width,
            _ri_ragged_break_list    => $ri_ragged_break_list,
            _use_separate_first_term => $use_separate_first_term,
        };
    } ## end sub table_layout_B

    sub lp_table_fix {

        # try to undo some -lp indentation to improve
        # table formatting

        my (

            $self,    #

            $columns,
            $i_first_comma,
            $max_width,
            $number_of_fields,
            $number_of_fields_best,
            $odd_or_even,
            $pair_width,
            $ritem_lengths,

        ) = @_;

        my $available_spaces =
          $self->get_available_spaces_to_go($i_first_comma);
        if ( $available_spaces > 0 ) {

            my $spaces_wanted = $max_width - $columns;    # for 1 field

            if ( $number_of_fields_best == 0 ) {
                $number_of_fields_best =
                  get_maximum_fields_wanted($ritem_lengths);
            }

            if ( $number_of_fields_best != 1 ) {
                my $spaces_wanted_2 = 1 + $pair_width - $columns; # for 2 fields
                if ( $available_spaces > $spaces_wanted_2 ) {
                    $spaces_wanted = $spaces_wanted_2;
                }
            }

            if ( $spaces_wanted > 0 ) {
                my $deleted_spaces =
                  $self->reduce_lp_indentation( $i_first_comma,
                    $spaces_wanted );

                # redo the math
                if ( $deleted_spaces > 0 ) {
                    $columns = table_columns_available($i_first_comma);
                    $number_of_fields =
                      maximum_number_of_fields( $columns, $odd_or_even,
                        $max_width, $pair_width );

                    if (   $number_of_fields_best == 1
                        && $number_of_fields >= 1 )
                    {
                        $number_of_fields = $number_of_fields_best;
                    }
                }
            }
        }
        return ( $number_of_fields, $number_of_fields_best, $columns );
    } ## end sub lp_table_fix

    sub write_formatted_table {

        # Write a table of comma separated items with fixed number of fields
        my ( $self, $number_of_fields, $comma_count, $rcomma_index,
            $use_separate_first_term )
          = @_;

        write_logfile_entry(
            "List: auto formatting with $number_of_fields fields/row\n");

        my $j_first_break =
            $use_separate_first_term
          ? $number_of_fields
          : $number_of_fields - 1;

        my $j = $j_first_break;
        while ( $j < $comma_count ) {
            my $i_comma = $rcomma_index->[$j];
            $self->set_forced_breakpoint($i_comma);
            $j += $number_of_fields;
        }
        return;
    } ## end sub write_formatted_table
} ## end closure table_maker

sub study_list_complexity {

    # Look for complex tables which should be formatted with one term per line.
# Returns the following: # # \@i_ragged_break_list = list of good breakpoints to avoid lines # which are hard to read # $number_of_fields_best = suggested number of fields based on # complexity; = 0 if any number may be used. # my ( $self, $ri_term_begin, $ri_term_end, $ritem_lengths, $max_width ) = @_; my $item_count = @{$ri_term_begin}; my $complex_item_count = 0; my $number_of_fields_best = $rOpts_maximum_fields_per_table; my $i_max = @{$ritem_lengths} - 1; ##my @item_complexity; my $i_last_last_break = -3; my $i_last_break = -2; my @i_ragged_break_list; my $definitely_complex = 30; my $definitely_simple = 12; my $quote_count = 0; for my $i ( 0 .. $i_max ) { my $ib = $ri_term_begin->[$i]; my $ie = $ri_term_end->[$i]; # define complexity: start with the actual term length my $weighted_length = ( $ritem_lengths->[$i] - 2 ); ##TBD: join types here and check for variations ##my $str=join "", @tokens_to_go[$ib..$ie]; my $is_quote = 0; if ( $types_to_go[$ib] =~ /^[qQ]$/ ) { $is_quote = 1; $quote_count++; } elsif ( $types_to_go[$ib] =~ /^[w\-]$/ ) { $quote_count++; } if ( $ib eq $ie ) { if ( $is_quote && $tokens_to_go[$ib] =~ /\s/ ) { $complex_item_count++; $weighted_length *= 2; } else { } } else { if ( grep { $_ eq 'b' } @types_to_go[ $ib .. $ie ] ) { $complex_item_count++; $weighted_length *= 2; } if ( grep { $_ eq '..' } @types_to_go[ $ib .. $ie ] ) { $weighted_length += 4; } } # add weight for extra tokens. 
$weighted_length += 2 * ( $ie - $ib ); ## my $BUB = join '', @tokens_to_go[$ib..$ie]; ## print "# COMPLEXITY:$weighted_length $BUB\n"; ##push @item_complexity, $weighted_length; # now mark a ragged break after this item if it is 'long and # complex': if ( $weighted_length >= $definitely_complex ) { # if we broke after the previous term # then break before it too if ( $i_last_break == $i - 1 && $i > 1 && $i_last_last_break != $i - 2 ) { ## TODO: don't strand a small term pop @i_ragged_break_list; push @i_ragged_break_list, $i - 2; push @i_ragged_break_list, $i - 1; } push @i_ragged_break_list, $i; $i_last_last_break = $i_last_break; $i_last_break = $i; } # don't break before a small last term -- it will # not look good on a line by itself. elsif ($i == $i_max && $i_last_break == $i - 1 && $weighted_length <= $definitely_simple ) { pop @i_ragged_break_list; } } my $identifier_count = $i_max + 1 - $quote_count; # Need more tuning here.. if ( $max_width > 12 && $complex_item_count > $item_count / 2 && $number_of_fields_best != 2 ) { $number_of_fields_best = 1; } return ( $number_of_fields_best, \@i_ragged_break_list, $identifier_count ); } ## end sub study_list_complexity sub get_maximum_fields_wanted { # Not all tables look good with more than one field of items. # This routine looks at a table and decides if it should be # formatted with just one field or not. # This coding is still under development. my ($ritem_lengths) = @_; my $number_of_fields_best = 0; # For just a few items, we tentatively assume just 1 field. my $item_count = @{$ritem_lengths}; if ( $item_count <= 5 ) { $number_of_fields_best = 1; } # For larger tables, look at it both ways and see what looks best else { my $is_odd = 1; my @max_length = ( 0, 0 ); my @last_length_2 = ( undef, undef ); my @first_length_2 = ( undef, undef ); my $last_length = undef; my $total_variation_1 = 0; my $total_variation_2 = 0; my @total_variation_2 = ( 0, 0 ); foreach my $j ( 0 .. 
$item_count - 1 ) { $is_odd = 1 - $is_odd; my $length = $ritem_lengths->[$j]; if ( $length > $max_length[$is_odd] ) { $max_length[$is_odd] = $length; } if ( defined($last_length) ) { my $dl = abs( $length - $last_length ); $total_variation_1 += $dl; } $last_length = $length; my $ll = $last_length_2[$is_odd]; if ( defined($ll) ) { my $dl = abs( $length - $ll ); $total_variation_2[$is_odd] += $dl; } else { $first_length_2[$is_odd] = $length; } $last_length_2[$is_odd] = $length; } $total_variation_2 = $total_variation_2[0] + $total_variation_2[1]; my $factor = ( $item_count > 10 ) ? 1 : ( $item_count > 5 ) ? 0.75 : 0; unless ( $total_variation_2 < $factor * $total_variation_1 ) { $number_of_fields_best = 1; } } return ($number_of_fields_best); } ## end sub get_maximum_fields_wanted sub table_columns_available { my $i_first_comma = shift; my $columns = $maximum_line_length_at_level[ $levels_to_go[$i_first_comma] ] - leading_spaces_to_go($i_first_comma); # Patch: the vertical formatter does not line up lines whose lengths # exactly equal the available line length because of allowances # that must be made for side comments. Therefore, the number of # available columns is reduced by 1 character. $columns -= 1; return $columns; } ## end sub table_columns_available sub maximum_number_of_fields { # how many fields will fit in the available space? my ( $columns, $odd_or_even, $max_width, $pair_width ) = @_; my $max_pairs = int( $columns / $pair_width ); my $number_of_fields = $max_pairs * 2; if ( $odd_or_even == 1 && $max_pairs * $pair_width + $max_width <= $columns ) { $number_of_fields++; } return $number_of_fields; } ## end sub maximum_number_of_fields sub compactify_table { # given a table with a certain number of fields and a certain number # of lines, see if reducing the number of fields will make it look # better. 
my ( $item_count, $number_of_fields, $formatted_lines, $odd_or_even ) = @_; if ( $number_of_fields >= $odd_or_even * 2 && $formatted_lines > 0 ) { my $min_fields = $number_of_fields; while ($min_fields >= $odd_or_even && $min_fields * $formatted_lines >= $item_count ) { $number_of_fields = $min_fields; $min_fields -= $odd_or_even; } } return $number_of_fields; } ## end sub compactify_table sub set_ragged_breakpoints { # Set breakpoints in a list that cannot be formatted nicely as a # table. my ( $self, $ri_term_comma, $ri_ragged_break_list ) = @_; my $break_count = 0; foreach ( @{$ri_ragged_break_list} ) { my $j = $ri_term_comma->[$_]; if ($j) { $self->set_forced_breakpoint($j); $break_count++; } } return $break_count; } ## end sub set_ragged_breakpoints sub copy_old_breakpoints { my ( $self, $i_first_comma, $i_last_comma ) = @_; for my $i ( $i_first_comma .. $i_last_comma ) { if ( $old_breakpoint_to_go[$i] ) { # If the comma style is under certain controls, and if this is a # comma breakpoint with the comma at the beginning of the next # line, then we must pass that index instead. This will allow sub # set_forced_breakpoints to check and follow the user settings. This # produces a uniform style and can prevent instability (b1422). # # The flag '$controlled_comma_style' will be set if the user # entered any of -wbb=',' -wba=',' -kbb=',' -kba=','. It is not # needed or set for the -boc flag. my $ibreak = $i; if ( $types_to_go[$ibreak] ne ',' && $controlled_comma_style ) { my $index = $inext_to_go[$ibreak]; if ( $index > $ibreak && $types_to_go[$index] eq ',' ) { $ibreak = $index; } } $self->set_forced_breakpoint($ibreak); } } return; } ## end sub copy_old_breakpoints sub set_nobreaks { my ( $self, $i, $j ) = @_; if ( $i >= 0 && $i <= $j && $j <= $max_index_to_go ) { 0 && do { my ( $a, $b, $c ) = caller(); print STDOUT "NOBREAK: forced_breakpoint $forced_breakpoint_count from $a $c with i=$i max=$max_index_to_go type=$types_to_go[$i]\n"; }; @nobreak_to_go[ $i .. 
$j ] = (1) x ( $j - $i + 1 ); } # shouldn't happen; non-critical error else { if (DEVEL_MODE) { my ( $a, $b, $c ) = caller(); Fault(< $iend ) { return 0; } return $summed_lengths_to_go[ $iend + 1 ] - $summed_lengths_to_go[$ibeg]; } ## end sub token_sequence_length sub total_line_length { # return length of a line of tokens ($ibeg .. $iend) my ( $ibeg, $iend ) = @_; # Start with the leading spaces on this line ... my $length = $leading_spaces_to_go[$ibeg]; if ( ref($length) ) { $length = $length->get_spaces() } # ... then add the net token length $length += $summed_lengths_to_go[ $iend + 1 ] - $summed_lengths_to_go[$ibeg]; return $length; } ## end sub total_line_length sub excess_line_length { # return number of characters by which a line of tokens ($ibeg..$iend) # exceeds the allowable line length. # NOTE: profiling shows that efficiency of this routine is essential. my ( $self, $ibeg, $iend, $ignore_right_weld ) = @_; # Start with the leading spaces on this line ... my $excess = $leading_spaces_to_go[$ibeg]; if ( ref($excess) ) { $excess = $excess->get_spaces() } # ... then add the net token length, minus the maximum length $excess += $summed_lengths_to_go[ $iend + 1 ] - $summed_lengths_to_go[$ibeg] - $maximum_line_length_at_level[ $levels_to_go[$ibeg] ]; # ... and include right weld lengths unless requested not to if ( $total_weld_count && $type_sequence_to_go[$iend] && !$ignore_right_weld ) { my $wr = $self->[_rweld_len_right_at_K_]->{ $K_to_go[$iend] }; $excess += $wr if defined($wr); } return $excess; } ## end sub excess_line_length sub get_spaces { # return the number of leading spaces associated with an indentation # variable $indentation is either a constant number of spaces or an object # with a get_spaces method. my $indentation = shift; return ref($indentation) ? 
$indentation->get_spaces() : $indentation; } ## end sub get_spaces sub get_recoverable_spaces { # return the number of spaces (+ means shift right, - means shift left) # that we would like to shift a group of lines with the same indentation # to get them to line up with their opening parens my $indentation = shift; return ref($indentation) ? $indentation->get_recoverable_spaces() : 0; } ## end sub get_recoverable_spaces sub get_available_spaces_to_go { my ( $self, $ii ) = @_; my $item = $leading_spaces_to_go[$ii]; # return the number of available leading spaces associated with an # indentation variable. $indentation is either a constant number of # spaces or an object with a get_available_spaces method. return ref($item) ? $item->get_available_spaces() : 0; } ## end sub get_available_spaces_to_go { ## begin closure set_lp_indentation use constant DEBUG_LP => 0; # Stack of -lp index objects which survives between batches. my $rLP; my $max_lp_stack; # The predicted position of the next opening container which may start # an -lp indentation level. This survives between batches. my $lp_position_predictor; BEGIN { # Index names for the -lp stack variables. # Do not combine with other BEGIN blocks (c101). my $i = 0; use constant { _lp_ci_level_ => $i++, _lp_level_ => $i++, _lp_object_ => $i++, _lp_container_seqno_ => $i++, _lp_space_count_ => $i++, }; } ## end BEGIN sub initialize_lp_vars { # initialize gnu variables for a new file; # must be called once at the start of a new file. 
$lp_position_predictor = 0; $max_lp_stack = 0; # we can turn off -lp if all levels will be at or above the cutoff if ( $high_stress_level <= 1 ) { $rOpts_line_up_parentheses = 0; $rOpts_extended_line_up_parentheses = 0; } $rLP = []; # initialize the leading whitespace stack to negative levels # so that we can never run off the end of the stack $rLP->[$max_lp_stack]->[_lp_ci_level_] = -1; $rLP->[$max_lp_stack]->[_lp_level_] = -1; $rLP->[$max_lp_stack]->[_lp_object_] = undef; $rLP->[$max_lp_stack]->[_lp_container_seqno_] = SEQ_ROOT; $rLP->[$max_lp_stack]->[_lp_space_count_] = 0; return; } ## end sub initialize_lp_vars # hashes for efficient testing my %hash_test1; my %hash_test2; my %hash_test3; BEGIN { my @q = qw< } ) ] >; @hash_test1{@q} = (1) x scalar(@q); @q = qw(: ? f); push @q, ','; @hash_test2{@q} = (1) x scalar(@q); @q = qw( . || && ); @hash_test3{@q} = (1) x scalar(@q); } ## end BEGIN # shared variables, re-initialized for each batch my $rlp_object_list; my $max_lp_object_list; my %lp_comma_count; my %lp_arrow_count; my $space_count; my $current_level; my $current_ci_level; my $ii_begin_line; my $in_lp_mode; my $stack_changed; my $K_last_nonblank; my $last_nonblank_token; my $last_nonblank_type; my $last_last_nonblank_type; sub set_lp_indentation { my ($self) = @_; #------------------------------------------------------------------ # Define the leading whitespace for all tokens in the current batch # when the -lp formatting is selected. 
#------------------------------------------------------------------ return unless ($rOpts_line_up_parentheses); return unless ( defined($max_index_to_go) && $max_index_to_go >= 0 ); # List of -lp indentation objects created in this batch $rlp_object_list = []; $max_lp_object_list = -1; %lp_comma_count = (); %lp_arrow_count = (); $space_count = undef; $current_level = undef; $current_ci_level = undef; $ii_begin_line = 0; $in_lp_mode = 0; $stack_changed = 1; $K_last_nonblank = undef; $last_nonblank_token = EMPTY_STRING; $last_nonblank_type = EMPTY_STRING; $last_last_nonblank_type = EMPTY_STRING; my %last_lp_equals = (); my $rLL = $self->[_rLL_]; my $starting_in_quote = $self->[_this_batch_]->[_starting_in_quote_]; my $imin = 0; # The 'starting_in_quote' flag means that the first token is the first # token of a line and it is also the continuation of some kind of # multi-line quote or pattern. It must have no added leading # whitespace, so we can skip it. if ($starting_in_quote) { $imin += 1; } my $Kpnb = $K_to_go[0] - 1; if ( $Kpnb > 0 && $rLL->[$Kpnb]->[_TYPE_] eq 'b' ) { $Kpnb -= 1; } if ( $Kpnb >= 0 && $rLL->[$Kpnb]->[_TYPE_] ne 'b' ) { $K_last_nonblank = $Kpnb; } if ( defined($K_last_nonblank) ) { $last_nonblank_token = $rLL->[$K_last_nonblank]->[_TOKEN_]; $last_nonblank_type = $rLL->[$K_last_nonblank]->[_TYPE_]; } #----------------------------------- # Loop over all tokens in this batch #----------------------------------- foreach my $ii ( $imin .. 
$max_index_to_go ) { my $type = $types_to_go[$ii]; my $token = $tokens_to_go[$ii]; my $level = $levels_to_go[$ii]; my $ci_level = $ci_levels_to_go[$ii]; my $total_depth = $nesting_depth_to_go[$ii]; # get the top state from the stack if it has changed if ($stack_changed) { my $rLP_top = $rLP->[$max_lp_stack]; my $lp_object = $rLP_top->[_lp_object_]; if ($lp_object) { ( $space_count, $current_level, $current_ci_level ) = @{ $lp_object->get_spaces_level_ci() }; } else { $current_ci_level = $rLP_top->[_lp_ci_level_]; $current_level = $rLP_top->[_lp_level_]; $space_count = $rLP_top->[_lp_space_count_]; } $stack_changed = 0; } #------------------------------------------------------------ # Break at a previous '=' if necessary to control line length #------------------------------------------------------------ if ( $type eq '{' || $type eq '(' ) { $lp_comma_count{ $total_depth + 1 } = 0; $lp_arrow_count{ $total_depth + 1 } = 0; # If we come to an opening token after an '=' token of some # type, see if it would be helpful to 'break' after the '=' to # save space my $ii_last_equals = $last_lp_equals{$total_depth}; if ($ii_last_equals) { $self->lp_equals_break_check( $ii, $ii_last_equals ); } } #------------------------ # Handle decreasing depth #------------------------ # Note that one token may have both decreasing and then increasing # depth. For example, (level, ci) can go from (1,1) to (2,0). So, # in this example we would first go back to (1,0) then up to (2,0) # in a single call. if ( $level < $current_level || $ci_level < $current_ci_level ) { $self->lp_decreasing_depth($ii); } #------------------------ # handle increasing depth #------------------------ if ( $level > $current_level || $ci_level > $current_ci_level ) { $self->lp_increasing_depth($ii); } #------------------ # Handle all tokens #------------------ if ( $type ne 'b' ) { # Count commas and look for non-list characters. Once we see a # non-list character, we give up and don't look for any more # commas. 
if ( $type eq '=>' ) { $lp_arrow_count{$total_depth}++; # remember '=>' like '=' for estimating breaks (but see # above note for b1035) $last_lp_equals{$total_depth} = $ii; } elsif ( $type eq ',' ) { $lp_comma_count{$total_depth}++; } elsif ( $is_assignment{$type} ) { $last_lp_equals{$total_depth} = $ii; } # this token might start a new line if .. if ( $ii > $ii_begin_line && ( # this is the first nonblank token of the line $ii == 1 && $types_to_go[0] eq 'b' # or previous character was one of these: # /^([\:\?\,f])$/ || $hash_test2{$last_nonblank_type} # or previous character was opening and this is not # closing || ( $last_nonblank_type eq '{' && $type ne '}' ) || ( $last_nonblank_type eq '(' and $type ne ')' ) # or this token is one of these: # /^([\.]|\|\||\&\&)$/ || $hash_test3{$type} # or this is a closing structure || ( $last_nonblank_type eq '}' && $last_nonblank_token eq $last_nonblank_type ) # or previous token was keyword 'return' || ( $last_nonblank_type eq 'k' && ( $last_nonblank_token eq 'return' && $type ne '{' ) ) # or starting a new line at certain keywords is fine || ( $type eq 'k' && $is_if_unless_and_or_last_next_redo_return{ $token} ) # or this is after an assignment after a closing # structure || ( $is_assignment{$last_nonblank_type} && ( # /^[\}\)\]]$/ $hash_test1{$last_last_nonblank_type} # and it is significantly to the right || $lp_position_predictor > ( $maximum_line_length_at_level[$level] - $rOpts_maximum_line_length / 2 ) ) ) ) ) { check_for_long_gnu_style_lines($ii); $ii_begin_line = $ii; # back up 1 token if we want to break before that type # otherwise, we may strand tokens like '?' or ':' on a line if ( $ii_begin_line > 0 ) { my $wbb = $last_nonblank_type eq 'k' ? 
$want_break_before{$last_nonblank_token} : $want_break_before{$last_nonblank_type}; $ii_begin_line-- if ($wbb); } } $K_last_nonblank = $K_to_go[$ii]; $last_last_nonblank_type = $last_nonblank_type; $last_nonblank_type = $type; $last_nonblank_token = $token; } ## end if ( $type ne 'b' ) # remember the predicted position of this token on the output line if ( $ii > $ii_begin_line ) { ## NOTE: this is a critical loop - the following call has been ## expanded for about 2x speedup: ## $lp_position_predictor = ## total_line_length( $ii_begin_line, $ii ); my $indentation = $leading_spaces_to_go[$ii_begin_line]; if ( ref($indentation) ) { $indentation = $indentation->get_spaces(); } $lp_position_predictor = $indentation + $summed_lengths_to_go[ $ii + 1 ] - $summed_lengths_to_go[$ii_begin_line]; } else { $lp_position_predictor = $space_count + $token_lengths_to_go[$ii]; } # Store the indentation object for this token. # This allows us to manipulate the leading whitespace # (in case we have to reduce indentation to fit a line) without # having to change any token values. #--------------------------------------------------------------- # replace leading whitespace with indentation objects where used #--------------------------------------------------------------- if ( $rLP->[$max_lp_stack]->[_lp_object_] ) { my $lp_object = $rLP->[$max_lp_stack]->[_lp_object_]; $leading_spaces_to_go[$ii] = $lp_object; if ( $max_lp_stack > 0 && $ci_level && $rLP->[ $max_lp_stack - 1 ]->[_lp_object_] ) { $reduced_spaces_to_go[$ii] = $rLP->[ $max_lp_stack - 1 ]->[_lp_object_]; } else { $reduced_spaces_to_go[$ii] = $lp_object; } } } ## end loop over all tokens in this batch undo_incomplete_lp_indentation() if ( !$rOpts_extended_line_up_parentheses ); return; } ## end sub set_lp_indentation sub lp_equals_break_check { my ( $self, $ii, $ii_last_equals ) = @_; # If we come to an opening token after an '=' token of some # type, see if it would be helpful to 'break' after the '=' to # save space. 
# Given: # $ii = index of an opening token in the output batch # $ii_begin_line = index of token starting next output line # Update: # $lp_position_predictor - updated position predictor # $ii_begin_line = updated starting token index # Skip an empty set of parens, such as after channel(): # my $exchange = $self->_channel()->exchange( # This fixes issues b1318 b1322 b1323 b1328 my $is_empty_container; if ( $ii_last_equals && $ii < $max_index_to_go ) { my $seqno = $type_sequence_to_go[$ii]; my $inext_nb = $ii + 1; $inext_nb++ if ( $types_to_go[$inext_nb] eq 'b' ); my $seqno_nb = $type_sequence_to_go[$inext_nb]; $is_empty_container = $seqno && $seqno_nb && $seqno_nb == $seqno; } if ( $ii_last_equals && $ii_last_equals > $ii_begin_line && !$is_empty_container ) { my $seqno = $type_sequence_to_go[$ii]; # find the position if we break at the '=' my $i_test = $ii_last_equals; # Fix for issue b1229, check if want break before this token # Fix for issue b1356, if i_test is a blank, the leading spaces may # be incorrect (if it was an interline blank). # Fix for issue b1357 .. b1370, i_test must be prev nonblank # ( the ci value for blanks can vary ) # See also case b223 # Fix for issue b1371-b1374 : all of these and the above are fixed # by simply backing up one index and setting the leading spaces of # a blank equal to that of the equals. 
if ( $want_break_before{ $types_to_go[$i_test] } ) { $i_test -= 1; $leading_spaces_to_go[$i_test] = $leading_spaces_to_go[$ii_last_equals] if ( $types_to_go[$i_test] eq 'b' ); } elsif ( $types_to_go[ $i_test + 1 ] eq 'b' ) { $i_test++ } my $test_position = total_line_length( $i_test, $ii ); my $mll = $maximum_line_length_at_level[ $levels_to_go[$i_test] ]; #------------------------------------------------------ # Break if structure will reach the maximum line length #------------------------------------------------------ # Historically, -lp just used one-half line length here my $len_increase = $rOpts_maximum_line_length / 2; # For -xlp, we can also use the pre-computed lengths my $min_len = $self->[_rcollapsed_length_by_seqno_]->{$seqno}; if ( $min_len && $min_len > $len_increase ) { $len_increase = $min_len; } if ( # if we might exceed the maximum line length $lp_position_predictor + $len_increase > $mll # if a -bbx flag WANTS a break before this opening token || ( $seqno && $self->[_rbreak_before_container_by_seqno_]->{$seqno} ) # or we are beyond the 1/4 point and there was an old # break at an assignment (not '=>') [fix for b1035] || ( $lp_position_predictor > $mll - $rOpts_maximum_line_length * 3 / 4 && $types_to_go[$ii_last_equals] ne '=>' && ( $old_breakpoint_to_go[$ii_last_equals] || ( $ii_last_equals > 0 && $old_breakpoint_to_go[ $ii_last_equals - 1 ] ) || ( $ii_last_equals > 1 && $types_to_go[ $ii_last_equals - 1 ] eq 'b' && $old_breakpoint_to_go[ $ii_last_equals - 2 ] ) ) ) ) { # then make the switch -- note that we do not set a # real breakpoint here because we may not really need # one; sub break_lists will do that if necessary. my $Kc = $self->[_K_closing_container_]->{$seqno}; if ( # For -lp, only if the closing token is in this # batch (c117). Otherwise it cannot be done by sub # break_lists. defined($Kc) && $Kc <= $K_to_go[$max_index_to_go] # For -xlp, we only need one nonblank token after # the opening token. 
|| $rOpts_extended_line_up_parentheses ) { $ii_begin_line = $i_test + 1; $lp_position_predictor = $test_position; #-------------------------------------------------- # Fix for an opening container terminating a batch: #-------------------------------------------------- # To get alignment of a -lp container with its # contents, we have to put a break after $i_test. # For $ii<$max_index_to_go, this will be done by # sub break_lists based on the indentation object. # But for $ii=$max_index_to_go, the indentation # object for this seqno will not be created until # the next batch, so we have to set a break at # $i_test right now in order to get one. if ( $ii == $max_index_to_go && !$block_type_to_go[$ii] && $types_to_go[$ii] eq '{' && $seqno && !$self->[_ris_excluded_lp_container_]->{$seqno} ) { $self->set_forced_lp_break( $ii_begin_line, $ii ); } } } } return; } ## end sub lp_equals_break_check sub lp_decreasing_depth { my ( $self, $ii ) = @_; my $rLL = $self->[_rLL_]; my $level = $levels_to_go[$ii]; my $ci_level = $ci_levels_to_go[$ii]; # loop to find the first entry at or completely below this level while (1) { # Be sure we have not hit the stack bottom - should never # happen because only negative levels can get here, and # $level was forced to be positive above. 
if ( !$max_lp_stack ) { # non-fatal, just keep going except in DEVEL_MODE if (DEVEL_MODE) { Fault(<[$max_lp_stack]->[_lp_object_] ) { my $lp_object = $rLP->[$max_lp_stack]->[_lp_object_]; $lp_object->set_closed($ii); my $comma_count = 0; my $arrow_count = 0; my $type = $types_to_go[$ii]; if ( $type eq '}' || $type eq ')' ) { my $total_depth = $nesting_depth_to_go[$ii]; $comma_count = $lp_comma_count{$total_depth}; $arrow_count = $lp_arrow_count{$total_depth}; $comma_count = 0 unless $comma_count; $arrow_count = 0 unless $arrow_count; } $lp_object->set_comma_count($comma_count); $lp_object->set_arrow_count($arrow_count); # Undo any extra indentation if we saw no commas my $available_spaces = $lp_object->get_available_spaces(); my $K_start = $lp_object->get_K_begin_line(); if ( $available_spaces > 0 && $K_start >= $K_to_go[0] && ( $comma_count <= 0 || $arrow_count > 0 ) ) { my $i = $lp_object->get_lp_item_index(); # Safety check for a valid stack index. It # should be ok because we just checked that the # index K of the token associated with this # indentation is in this batch. if ( $i < 0 || $i > $max_lp_object_list ) { my $KK = $K_to_go[$ii]; my $lno = $rLL->[$KK]->[_LINE_INDEX_]; DEVEL_MODE && Fault(<=0 and <= max=$max_lp_object_list EOM last; } if ( $arrow_count == 0 ) { $rlp_object_list->[$i] ->permanently_decrease_available_spaces( $available_spaces); } else { $rlp_object_list->[$i] ->tentatively_decrease_available_spaces( $available_spaces); } foreach my $j ( $i + 1 .. 
$max_lp_object_list ) { $rlp_object_list->[$j] ->decrease_SPACES($available_spaces); } } } # go down one level --$max_lp_stack; my $rLP_top = $rLP->[$max_lp_stack]; my $ci_lev = $rLP_top->[_lp_ci_level_]; my $lev = $rLP_top->[_lp_level_]; my $spaces = $rLP_top->[_lp_space_count_]; if ( $rLP_top->[_lp_object_] ) { my $lp_obj = $rLP_top->[_lp_object_]; ( $spaces, $lev, $ci_lev ) = @{ $lp_obj->get_spaces_level_ci() }; } # stop when we reach a level at or below the current # level if ( $lev <= $level && $ci_lev <= $ci_level ) { $space_count = $spaces; $current_level = $lev; $current_ci_level = $ci_lev; last; } } return; } ## end sub lp_decreasing_depth sub lp_increasing_depth { my ( $self, $ii ) = @_; my $rLL = $self->[_rLL_]; my $type = $types_to_go[$ii]; my $level = $levels_to_go[$ii]; my $ci_level = $ci_levels_to_go[$ii]; $stack_changed = 1; # Compute the standard incremental whitespace. This will be # the minimum incremental whitespace that will be used. This # choice results in a smooth transition between the gnu-style # and the standard style. my $standard_increment = ( $level - $current_level ) * $rOpts_indent_columns + ( $ci_level - $current_ci_level ) * $rOpts_continuation_indentation; # Now we have to define how much extra incremental space # ("$available_space") we want. This extra space will be # reduced as necessary when long lines are encountered or when # it becomes clear that we do not have a good list. my $available_spaces = 0; my $align_seqno = 0; my $K_extra_space; my $last_nonblank_seqno; my $last_nonblank_block_type; if ( defined($K_last_nonblank) ) { $last_nonblank_seqno = $rLL->[$K_last_nonblank]->[_TYPE_SEQUENCE_]; $last_nonblank_block_type = $last_nonblank_seqno ? $self->[_rblock_type_of_seqno_]->{$last_nonblank_seqno} : undef; } $in_lp_mode = $rLP->[$max_lp_stack]->[_lp_object_]; #----------------------------------------------- # Initialize indentation spaces on empty stack.. 
#----------------------------------------------- if ( $max_lp_stack == 0 ) { $space_count = $level * $rOpts_indent_columns; } #---------------------------------------- # Add the standard space increment if ... #---------------------------------------- elsif ( # if this is a BLOCK, add the standard increment $last_nonblank_block_type # or if this is not a sequenced item || !$last_nonblank_seqno # or this container is excluded by user rules # or contains here-docs or multiline qw text || defined($last_nonblank_seqno) && $self->[_ris_excluded_lp_container_]->{$last_nonblank_seqno} # or if last nonblank token was not structural indentation || $last_nonblank_type ne '{' # and do not start -lp under stress .. fixes b1244, b1255 || !$in_lp_mode && $level >= $high_stress_level ) { # If we have entered lp mode, use the top lp object to get # the current indentation spaces because it may have # changed. Fixes b1285, b1286. if ($in_lp_mode) { $space_count = $in_lp_mode->get_spaces(); } $space_count += $standard_increment; } #--------------------------------------------------------------- # -lp mode: try to use space to the first non-blank level change #--------------------------------------------------------------- else { # see how much space we have available my $test_space_count = $lp_position_predictor; my $excess = 0; my $min_len = $self->[_rcollapsed_length_by_seqno_]->{$last_nonblank_seqno}; my $next_opening_too_far; if ( defined($min_len) ) { $excess = $test_space_count + $min_len - $maximum_line_length_at_level[$level]; if ( $excess > 0 ) { $test_space_count -= $excess; # will the next opening token be a long way out? 
$next_opening_too_far = $lp_position_predictor + $excess > $maximum_line_length_at_level[$level]; } } my $rLP_top = $rLP->[$max_lp_stack]; my $min_gnu_indentation = $rLP_top->[_lp_space_count_]; if ( $rLP_top->[_lp_object_] ) { $min_gnu_indentation = $rLP_top->[_lp_object_]->get_spaces(); } $available_spaces = $test_space_count - $min_gnu_indentation; # Do not startup -lp indentation mode if no space ... # ... or if it puts the opening far to the right if ( !$in_lp_mode && ( $available_spaces <= 0 || $next_opening_too_far ) ) { $space_count += $standard_increment; $available_spaces = 0; } # Use -lp mode else { $space_count = $test_space_count; $in_lp_mode = 1; if ( $available_spaces >= $standard_increment ) { $min_gnu_indentation += $standard_increment; } elsif ( $available_spaces > 1 ) { $min_gnu_indentation += $available_spaces + 1; # The "+1" space can cause mis-alignment if there is no # blank space between the opening paren and the next # nonblank token (i.e., -pt=2) and the container does not # get broken open. So we will mark this token for later # space removal by sub 'xlp_tweak' if this container # remains intact (issue git #106). if ( $type ne 'b' # Skip if the maximum line length is exceeded here && $excess <= 0 # This is only for level changes, not ci level changes. # But note: this test is here out of caution but I have # not found a case where it is actually necessary. && $is_opening_token{$last_nonblank_token} # Be sure we are at consecutive nonblanks. This test # should be true, but it guards against future coding # changes to level values assigned to blank spaces. 
&& $ii > 0 && $types_to_go[ $ii - 1 ] ne 'b' ) { $K_extra_space = $K_to_go[$ii]; } } elsif ( $is_opening_token{$last_nonblank_token} ) { if ( ( $tightness{$last_nonblank_token} < 2 ) ) { $min_gnu_indentation += 2; } else { $min_gnu_indentation += 1; } } else { $min_gnu_indentation += $standard_increment; } $available_spaces = $space_count - $min_gnu_indentation; if ( $available_spaces < 0 ) { $space_count = $min_gnu_indentation; $available_spaces = 0; } $align_seqno = $last_nonblank_seqno; } } #------------------------------------------- # update the state, but not on a blank token #------------------------------------------- if ( $type ne 'b' ) { if ( $rLP->[$max_lp_stack]->[_lp_object_] ) { $rLP->[$max_lp_stack]->[_lp_object_]->set_have_child(1); $in_lp_mode = 1; } #---------------------------------------- # Create indentation object if in lp-mode #---------------------------------------- ++$max_lp_stack; my $lp_object; if ($in_lp_mode) { # A negative level implies not to store the item in the # item_list my $lp_item_index = 0; if ( $level >= 0 ) { $lp_item_index = ++$max_lp_object_list; } my $K_begin_line = 0; if ( $ii_begin_line >= 0 && $ii_begin_line <= $max_index_to_go ) { $K_begin_line = $K_to_go[$ii_begin_line]; } # Minor Fix: when creating indentation at a side # comment we don't know what the space to the actual # next code token will be. We will allow a space for # sub correct_lp to move it in if necessary. 
if ( $type eq '#' && $max_index_to_go > 0 && $align_seqno ) { $available_spaces += 1; } my $standard_spaces = $leading_spaces_to_go[$ii]; $lp_object = Perl::Tidy::IndentationItem->new( spaces => $space_count, level => $level, ci_level => $ci_level, available_spaces => $available_spaces, lp_item_index => $lp_item_index, align_seqno => $align_seqno, stack_depth => $max_lp_stack, K_begin_line => $K_begin_line, standard_spaces => $standard_spaces, K_extra_space => $K_extra_space, ); DEBUG_LP && do { my $tok_beg = $rLL->[$K_begin_line]->[_TOKEN_]; my $token = $tokens_to_go[$ii]; print STDERR <= 0 ) { $rlp_object_list->[$max_lp_object_list] = $lp_object; } if ( $is_opening_token{$last_nonblank_token} && $last_nonblank_seqno ) { $self->[_rlp_object_by_seqno_]->{$last_nonblank_seqno} = $lp_object; } } #------------------------------------ # Store this indentation on the stack #------------------------------------ $rLP->[$max_lp_stack]->[_lp_ci_level_] = $ci_level; $rLP->[$max_lp_stack]->[_lp_level_] = $level; $rLP->[$max_lp_stack]->[_lp_object_] = $lp_object; $rLP->[$max_lp_stack]->[_lp_container_seqno_] = $last_nonblank_seqno; $rLP->[$max_lp_stack]->[_lp_space_count_] = $space_count; # If the opening paren is beyond the half-line length, then # we will use the minimum (standard) indentation. This will # help avoid problems associated with running out of space # near the end of a line. As a result, in deeply nested # lists, there will be some indentations which are limited # to this minimum standard indentation. But the most deeply # nested container will still probably be able to shift its # parameters to the right for proper alignment, so in most # cases this will not be noticeable. 
if ( $available_spaces > 0 && $lp_object ) { my $halfway = $maximum_line_length_at_level[$level] - $rOpts_maximum_line_length / 2; $lp_object->tentatively_decrease_available_spaces( $available_spaces) if ( $space_count > $halfway ); } } return; } ## end sub lp_increasing_depth sub check_for_long_gnu_style_lines { # look at the current estimated maximum line length, and # remove some whitespace if it exceeds the desired maximum my ($ii_to_go) = @_; # nothing can be done if no stack items defined for this line return if ( $max_lp_object_list < 0 ); # See if we have exceeded the maximum desired line length .. # keep 2 extra free because they are needed in some cases # (result of trial-and-error testing) my $tol = 2; # But reduce tol to 0 at a terminal comma; fixes b1432 if ( $tokens_to_go[$ii_to_go] eq ',' && $ii_to_go < $max_index_to_go ) { my $in = $ii_to_go + 1; if ( $types_to_go[$in] eq 'b' && $in < $max_index_to_go ) { $in++ } if ( $is_closing_token{ $tokens_to_go[$in] } ) { $tol = 0; } } my $spaces_needed = $lp_position_predictor - $maximum_line_length_at_level[ $levels_to_go[$ii_to_go] ] + $tol; return if ( $spaces_needed <= 0 ); # We are over the limit, so try to remove a requested number of # spaces from leading whitespace. We are only allowed to remove # from whitespace items created on this batch, since others have # already been used and cannot be undone. my @candidates = (); # loop over all whitespace items created for the current batch foreach my $i ( 0 .. $max_lp_object_list ) { my $item = $rlp_object_list->[$i]; # item must still be open to be a candidate (otherwise it # cannot influence the current token) next if ( $item->get_closed() >= 0 ); my $available_spaces = $item->get_available_spaces(); if ( $available_spaces > 0 ) { push( @candidates, [ $i, $available_spaces ] ); } } return unless (@candidates); # sort by available whitespace so that we can remove whitespace # from the maximum available first. 
@candidates = sort { $b->[1] <=> $a->[1] || $a->[0] <=> $b->[0] } @candidates; # keep removing whitespace until we are done or have no more foreach my $candidate (@candidates) { my ( $i, $available_spaces ) = @{$candidate}; my $deleted_spaces = ( $available_spaces > $spaces_needed ) ? $spaces_needed : $available_spaces; # remove the incremental space from this item $rlp_object_list->[$i]->decrease_available_spaces($deleted_spaces); my $i_debug = $i; # update the leading whitespace of this item and all items # that came after it $i -= 1; while ( ++$i <= $max_lp_object_list ) { my $old_spaces = $rlp_object_list->[$i]->get_spaces(); if ( $old_spaces >= $deleted_spaces ) { $rlp_object_list->[$i]->decrease_SPACES($deleted_spaces); } # shouldn't happen except for code bug: else { # non-fatal, keep going except in DEVEL_MODE if (DEVEL_MODE) { my $level = $rlp_object_list->[$i_debug]->get_level(); my $ci_level = $rlp_object_list->[$i_debug]->get_ci_level(); my $old_level = $rlp_object_list->[$i]->get_level(); my $old_ci_level = $rlp_object_list->[$i]->get_ci_level(); Fault(< 0 ); } return; } ## end sub check_for_long_gnu_style_lines sub undo_incomplete_lp_indentation { #------------------------------------------------------------------ # Undo indentation for all incomplete -lp indentation levels of the # current batch unless -xlp is set. #------------------------------------------------------------------ # This routine is called once after each output stream batch is # finished to undo indentation for all incomplete -lp indentation # levels. If this routine is called then comments and blank lines will # disrupt this indentation style. In older versions of perltidy this # was always done because it could cause problems otherwise, but recent # improvements allow fairly good results to be obtained by skipping # this step with the -xlp flag. 
        # nothing to do if no stack items defined for this line
        return if ( $max_lp_object_list < 0 );

        # loop over all whitespace items created for the current batch
        foreach my $i ( 0 .. $max_lp_object_list ) {
            my $item = $rlp_object_list->[$i];

            # only look for open items
            next if ( $item->get_closed() >= 0 );

            # Tentatively remove all of the available space
            # (The vertical aligner will try to get it back later)
            my $available_spaces = $item->get_available_spaces();
            if ( $available_spaces > 0 ) {

                # delete incremental space for this item
                $rlp_object_list->[$i]
                  ->tentatively_decrease_available_spaces($available_spaces);

                # Reduce the total indentation space of any nodes that follow
                # Note that any such nodes must necessarily be dependents
                # of this node.
                foreach ( $i + 1 .. $max_lp_object_list ) {
                    $rlp_object_list->[$_]->decrease_SPACES($available_spaces);
                }
            }
        }
        return;
    } ## end sub undo_incomplete_lp_indentation
} ## end closure set_lp_indentation

#----------------------------------------------------------------------
# sub to set a requested break before an opening container in -lp mode.
#----------------------------------------------------------------------
sub set_forced_lp_break {

    my ( $self, $i_begin_line, $i_opening ) = @_;

    # Given:
    #   $i_begin_line = index of break in the _to_go arrays
    #   $i_opening = index of the opening container

    # Set any requested break at a token before this opening container
    # token. This is often an '=' or '=>' but can also be things like
    # '.', ',', 'return'. It was defined by sub set_lp_indentation.

    # Important:
    #   For intact containers, call this at the closing token.
    #   For broken containers, call this at the opening token.
    # This will avoid needless breaks when it turns out that the
    # container does not actually get broken. This isn't known until
    # the closing container for intact blocks.

    return
      if ( $i_begin_line < 0
        || $i_begin_line > $max_index_to_go );

    # Handle request to put a break immediately before this token.
    # We may not want to do that since we are also breaking after it.
    if ( $i_begin_line == $i_opening ) {

        # The following rules should be reviewed. We may want to always
        # allow the break. If we do not do the break, the indentation
        # may be off.

        # RULE: don't break before it unless it is welded to a qw.
        # This works well, but we may want to relax this to allow
        # breaks in additional cases.
        return
          if ( !$self->[_rK_weld_right_]->{ $K_to_go[$i_opening] } );
        return unless ( $types_to_go[$max_index_to_go] eq 'q' );
    }

    # Only break for breakpoints at the same
    # indentation level as the opening paren
    my $test1 = $nesting_depth_to_go[$i_opening];
    my $test2 = $nesting_depth_to_go[$i_begin_line];
    return if ( $test2 != $test1 );

    # Back up at a blank (fixes case b932)
    my $ibr = $i_begin_line - 1;
    if (   $ibr > 0
        && $types_to_go[$ibr] eq 'b' )
    {
        $ibr--;
    }
    if ( $ibr >= 0 ) {
        my $i_nonblank = $self->set_forced_breakpoint($ibr);

        # Crude patch to prevent sub recombine_breakpoints from undoing
        # this break, especially after an '='. It will leave old
        # breakpoints alone. See c098/x045 for some examples.
if ( defined($i_nonblank) ) { $old_breakpoint_to_go[$i_nonblank] = 1; } } return; } ## end sub set_forced_lp_break sub reduce_lp_indentation { # reduce the leading whitespace at token $i if possible by $spaces_needed # (a large value of $spaces_needed will remove all excess space) # NOTE: to be called from break_lists only for a sequence of tokens # contained between opening and closing parens/braces/brackets my ( $self, $i, $spaces_wanted ) = @_; my $deleted_spaces = 0; my $item = $leading_spaces_to_go[$i]; my $available_spaces = $item->get_available_spaces(); if ( $available_spaces > 0 && ( ( $spaces_wanted <= $available_spaces ) || !$item->get_have_child() ) ) { # we'll remove these spaces, but mark them as recoverable $deleted_spaces = $item->tentatively_decrease_available_spaces($spaces_wanted); } return $deleted_spaces; } ## end sub reduce_lp_indentation ########################################################### # CODE SECTION 13: Preparing batches for vertical alignment ########################################################### sub check_convey_batch_input { # Check for valid input to sub convey_batch_to_vertical_aligner. An # error here would most likely be due to an error in the calling # routine 'sub grind_batch_of_CODE'. my ( $self, $ri_first, $ri_last ) = @_; if ( !defined($ri_first) || !defined($ri_last) ) { Fault(<=0 EOM } my ( $ibeg, $iend ); foreach my $n ( 0 .. $nmax ) { my $ibeg_m = $ibeg; my $iend_m = $iend; $ibeg = $ri_first->[$n]; $iend = $ri_last->[$n]; if ( $ibeg < 0 || $iend < $ibeg || $iend > $max_index_to_go ) { Fault(<= ibeg and be in the range (0..$max_index_to_go) EOM } next if ( $n == 0 ); if ( $ibeg <= $iend_m ) { Fault(<add_closing_side_comment( $ri_first, $ri_last ); } if ( $n_last_line > 0 || $rOpts_extended_continuation_indentation ) { $self->undo_ci( $ri_first, $ri_last, $this_batch->[_rix_seqno_controlling_ci_] ); } # for multi-line batches ... 
if ( $n_last_line > 0 ) { # flush before a long if statement to avoid unwanted alignment $self->flush_vertical_aligner() if ( $type_beg_next eq 'k' && $is_if_unless{$token_beg_next} ); $self->set_logical_padding( $ri_first, $ri_last, $starting_in_quote ) if ($rOpts_logical_padding); $self->xlp_tweak( $ri_first, $ri_last ) if ($rOpts_extended_line_up_parentheses); } if (DEVEL_MODE) { $self->check_batch_summed_lengths() } # ---------------------------------------------------------- # define the vertical alignments for all lines of this batch # ---------------------------------------------------------- my $rline_alignments = $self->make_vertical_alignments( $ri_first, $ri_last ); # ---------------------------------------------- # loop to send each line to the vertical aligner # ---------------------------------------------- my ( $type_beg, $type_end, $token_beg, $ljump ); for my $n ( 0 .. $n_last_line ) { # ---------------------------------------------------------------- # This hash will hold the args for vertical alignment of this line # We will populate it as we go. # ---------------------------------------------------------------- my $rvao_args = {}; my $type_beg_last = $type_beg; my $type_end_last = $type_end; my $ibeg = $ibeg_next; my $iend = $iend_next; my $Kbeg = $K_to_go[$ibeg]; my $Kend = $K_to_go[$iend]; $type_beg = $type_beg_next; $type_end = $type_end_next; $token_beg = $token_beg_next; # --------------------------------------------------- # Define the check value 'Kend' to send for this line # --------------------------------------------------- # The 'Kend' value is an integer for checking that lines come out of # the far end of the pipeline in the right order. It increases # linearly along the token stream. But we only send ending K values of # non-comments down the pipeline. This is equivalent to checking that # the last CODE_type is blank or equal to 'VER'. See also sub # resync_lines_and_tokens for related coding. 
        # Note that '$batch_CODE_type' is the code type of the line to which
        # the ending token belongs.
        my $Kend_code =
          $batch_CODE_type && $batch_CODE_type ne 'VER' ? undef : $Kend;

        # Get some vars on line [n+1], if any,
        # and define $ljump = level jump needed by 'sub get_final_indentation'
        if ( $n < $n_last_line ) {
            $ibeg_next = $ri_first->[ $n + 1 ];
            $iend_next = $ri_last->[ $n + 1 ];

            $type_beg_next  = $types_to_go[$ibeg_next];
            $type_end_next  = $types_to_go[$iend_next];
            $token_beg_next = $tokens_to_go[$ibeg_next];

            my $Kbeg_next = $K_to_go[$ibeg_next];
            $ljump = $rLL->[$Kbeg_next]->[_LEVEL_] - $rLL->[$Kend]->[_LEVEL_];
        }
        elsif ( !$is_block_comment && $Kend < $Klimit ) {

            # Patch for git #51, a bare closing qw paren was not outdented
            # if the flag '-nodelete-old-newlines' is set.
            # Note that we are just looking ahead for the next nonblank
            # character. We could scan past an arbitrary number of block
            # comments or hanging side comments by calling K_next_code, but it
            # could add significant run time with very little to be gained.
            my $Kbeg_next = $Kend + 1;
            if (   $Kbeg_next < $Klimit
                && $rLL->[$Kbeg_next]->[_TYPE_] eq 'b' )
            {
                $Kbeg_next += 1;
            }
            $ljump =
              $rLL->[$Kbeg_next]->[_LEVEL_] - $rLL->[$Kend]->[_LEVEL_];
        }
        else {
            $ljump = 0;
        }

        # ---------------------------------------------
        # get the vertical alignment info for this line
        # ---------------------------------------------

        # The lines are broken into fields which can be spaced by the vertical
        # aligner to achieve vertical alignment. These fields are the actual
        # text which will be output, so from here on no more changes can be
        # made to the text.
        my $rline_alignment = $rline_alignments->[$n];
        my ( $rtokens, $rfields, $rpatterns, $rfield_lengths ) =
          @{$rline_alignment};

        # Programming check: (shouldn't happen)
        # The number of tokens which separate the fields must always be
        # one less than the number of fields. If this is not true then
        # an error has been introduced in sub make_alignment_patterns.
if (DEVEL_MODE) { if ( @{$rfields} && ( @{$rtokens} != ( @{$rfields} - 1 ) ) ) { my $nt = @{$rtokens}; my $nf = @{$rfields}; my $msg = <get_final_indentation( $ibeg, $iend, $rfields, $rpatterns, $ri_first, $ri_last, $rindentation_list, $ljump, $starting_in_quote, $is_static_block_comment, ); # -------------------------------- # define flag 'outdent_long_lines' # -------------------------------- if ( # we will allow outdenting of long lines.. # which are long quotes, if allowed ( $type_beg eq 'Q' && $rOpts_outdent_long_quotes ) # which are long block comments, if allowed || ( $type_beg eq '#' && $rOpts_outdent_long_comments # but not if this is a static block comment && !$is_static_block_comment ) ) { $rvao_args->{outdent_long_lines} = 1; # convert -lp indentation objects to spaces to allow outdenting if ( ref($indentation) ) { $indentation = $indentation->get_spaces(); } } # -------------------------------------------------- # define flags 'break_alignment_before' and '_after' # -------------------------------------------------- # These flags tell the vertical aligner to stop alignment before or # after this line. if ($is_outdented_line) { $rvao_args->{break_alignment_before} = 1; $rvao_args->{break_alignment_after} = 1; } elsif ($do_not_pad) { $rvao_args->{break_alignment_before} = 1; } # flush at an 'if' which follows a line with (1) terminal semicolon # or (2) terminal block_type which is not an 'if'. This prevents # unwanted alignment between the lines. 
elsif ( $type_beg eq 'k' && $token_beg eq 'if' ) { my $type_m = 'b'; my $block_type_m; if ( $Kbeg > 0 ) { my $Km = $Kbeg - 1; $type_m = $rLL->[$Km]->[_TYPE_]; if ( $type_m eq 'b' && $Km > 0 ) { $Km -= 1; $type_m = $rLL->[$Km]->[_TYPE_]; } if ( $type_m eq '#' && $Km > 0 ) { $Km -= 1; $type_m = $rLL->[$Km]->[_TYPE_]; if ( $type_m eq 'b' && $Km > 0 ) { $Km -= 1; $type_m = $rLL->[$Km]->[_TYPE_]; } } my $seqno_m = $rLL->[$Km]->[_TYPE_SEQUENCE_]; if ($seqno_m) { $block_type_m = $self->[_rblock_type_of_seqno_]->{$seqno_m}; } } # break after anything that is not if-like if ( $type_m eq ';' || ( $type_m eq '}' && $block_type_m && $block_type_m ne 'if' && $block_type_m ne 'unless' && $block_type_m ne 'elsif' && $block_type_m ne 'else' ) ) { $rvao_args->{break_alignment_before} = 1; } } # ---------------------------------- # define 'rvertical_tightness_flags' # ---------------------------------- # These flags tell the vertical aligner if/when to combine consecutive # lines, based on the user input parameters. $rvao_args->{rvertical_tightness_flags} = $self->set_vertical_tightness_flags( $n, $n_last_line, $ibeg, $iend, $ri_first, $ri_last, $ending_in_quote, $closing_side_comment ) unless ( $is_block_comment || $self->[_no_vertical_tightness_flags_] ); # ---------------------------------- # define 'is_terminal_ternary' flag # ---------------------------------- # This flag is set at the final ':' of a ternary chain to request # vertical alignment of the final term. Here is a slightly complex # example: # # $self->{_text} = ( # !$section ? '' # : $type eq 'item' ? "the $section entry" # : "the section on $section" # ) # . ( # $page # ? ( $section ? ' in ' : '' ) . "the $page$page_ext manpage" # : ' elsewhere in this document' # ); # if ( $type_beg eq ':' || $n > 0 && $type_end_last eq ':' ) { my $is_terminal_ternary = 0; my $last_leading_type = $n > 0 ? 
$type_beg_last : ':'; my $terminal_type = $types_to_go[$i_terminal]; if ( $terminal_type ne ';' && $n_last_line > $n && $level_end == $lev ) { my $Kbeg_next = $K_to_go[$ibeg_next]; $level_end = $rLL->[$Kbeg_next]->[_LEVEL_]; $terminal_type = $rLL->[$Kbeg_next]->[_TYPE_]; } if ( $last_leading_type eq ':' && ( ( $terminal_type eq ';' && $level_end <= $lev ) || ( $terminal_type ne ':' && $level_end < $lev ) ) ) { # the terminal term must not contain any ternary terms, as in # my $ECHO = ( # $Is_MSWin32 ? ".\\echo$$" # : $Is_MacOS ? ":echo$$" # : ( $Is_NetWare ? "echo$$" : "./echo$$" ) # ); $is_terminal_ternary = 1; my $KP = $rLL->[$Kbeg]->[_KNEXT_SEQ_ITEM_]; while ( defined($KP) && $KP <= $Kend ) { my $type_KP = $rLL->[$KP]->[_TYPE_]; if ( $type_KP eq '?' || $type_KP eq ':' ) { $is_terminal_ternary = 0; last; } $KP = $rLL->[$KP]->[_KNEXT_SEQ_ITEM_]; } } $rvao_args->{is_terminal_ternary} = $is_terminal_ternary; } # ------------------------------------------------- # add any new closing side comment to the last line # ------------------------------------------------- if ( $closing_side_comment && $n == $n_last_line && @{$rfields} ) { $rfields->[-1] .= " $closing_side_comment"; # NOTE: Patch for csc. We can just use 1 for the length of the csc # because its length should not be a limiting factor from here on. $rfield_lengths->[-1] += 2; # repack $rline_alignment = [ $rtokens, $rfields, $rpatterns, $rfield_lengths ]; } # ------------------------ # define flag 'list_seqno' # ------------------------ # This flag indicates if this line is contained in a multi-line list if ( !$is_block_comment ) { my $parent_seqno = $parent_seqno_to_go[$ibeg]; $rvao_args->{list_seqno} = $ris_list_by_seqno->{$parent_seqno}; } # The alignment tokens have been marked with nesting_depths, so we need # to pass nesting depths to the vertical aligner. They remain invariant # under all formatting operations. Previously, level values were sent # to the aligner. 
But they can be altered in welding and other # operations, and this can lead to alignment errors. my $nesting_depth_beg = $nesting_depth_to_go[$ibeg]; my $nesting_depth_end = $nesting_depth_to_go[$iend]; # A quirk in the definition of nesting depths is that the closing token # has the same depth as internal tokens. The vertical aligner is # programmed to expect them to have the lower depth, so we fix this. if ( $is_closing_type{ $types_to_go[$ibeg] } ) { $nesting_depth_beg-- } if ( $is_closing_type{ $types_to_go[$iend] } ) { $nesting_depth_end-- } # Adjust nesting depths to keep -lp indentation for qw lists. This is # required because qw lists contained in brackets do not get nesting # depths, but the vertical aligner is watching nesting depth changes to # decide if a -lp block is intact. Without this patch, qw lists # enclosed in angle brackets will not get the correct -lp indentation. # Looking for line with isolated qw ... if ( $rOpts_line_up_parentheses && $type_beg eq 'q' && $ibeg == $iend ) { # ... which is part of a multiline qw my $Km = $self->K_previous_nonblank($Kbeg); my $Kp = $self->K_next_nonblank($Kbeg); if ( defined($Km) && $rLL->[$Km]->[_TYPE_] eq 'q' || defined($Kp) && $rLL->[$Kp]->[_TYPE_] eq 'q' ) { $nesting_depth_beg++; $nesting_depth_end++; } } # --------------------------------- # define flag 'forget_side_comment' # --------------------------------- # This flag tells the vertical aligner to reset the side comment # location if we are entering a new block from level 0. This is # intended to keep side comments from drifting too far to the right. 
if ( $block_type_to_go[$i_terminal] && $nesting_depth_end > $nesting_depth_beg ) { $rvao_args->{forget_side_comment} = !$self->[_radjusted_levels_]->[$Kbeg]; } # ----------------------------------- # Store the remaining non-flag values # ----------------------------------- $rvao_args->{Kend} = $Kend_code; $rvao_args->{ci_level} = $ci_levels_to_go[$ibeg]; $rvao_args->{indentation} = $indentation; $rvao_args->{level_end} = $nesting_depth_end; $rvao_args->{level} = $nesting_depth_beg; $rvao_args->{rline_alignment} = $rline_alignment; $rvao_args->{maximum_line_length} = $maximum_line_length_at_level[ $levels_to_go[$ibeg] ]; # -------------------------------------- # send this line to the vertical aligner # -------------------------------------- my $vao = $self->[_vertical_aligner_object_]; $vao->valign_input($rvao_args); $do_not_pad = 0; } ## end of loop to output each line # Set flag indicating if the last line ends in an opening # token and is very short, so that a blank line is not # needed if the subsequent line is a comment. # Examples of what we are looking for: # { # && ( # BEGIN { # default { # sub { $self->[_last_output_short_opening_token_] # line ends in opening token # /^[\{\(\[L]$/ = $is_opening_type{$type_end} # and either && ( # line has either single opening token $iend_next == $ibeg_next # or is a single token followed by opening token. 
# Note that sub identifiers have blanks like 'sub doit' # $token_beg !~ /\s+/ || ( $iend_next - $ibeg_next <= 2 && index( $token_beg, SPACE ) < 0 ) ) # and limit total to 10 character widths && token_sequence_length( $ibeg_next, $iend_next ) <= 10; # remember indentation of lines containing opening containers for # later use by sub get_final_indentation $self->save_opening_indentation( $ri_first, $ri_last, $rindentation_list, $this_batch->[_runmatched_opening_indexes_] ) if ( $this_batch->[_runmatched_opening_indexes_] || $types_to_go[$max_index_to_go] eq 'q' ); # output any new -cscw block comment if ($cscw_block_comment) { $self->flush_vertical_aligner(); my $file_writer_object = $self->[_file_writer_object_]; $file_writer_object->write_code_line( $cscw_block_comment . "\n" ); } return; } ## end sub convey_batch_to_vertical_aligner sub check_batch_summed_lengths { my ( $self, $msg ) = @_; $msg = EMPTY_STRING unless defined($msg); my $rLL = $self->[_rLL_]; # Verify that the summed lengths are correct. We want to be sure that # errors have not been introduced by programming changes. Summed lengths # are defined in sub store_token. Operations like padding and unmasking # semicolons can change token lengths, but those operations are expected to # update the summed lengths when they make changes. So the summed lengths # should always be correct. foreach my $i ( 0 .. $max_index_to_go ) { my $len_by_sum = $summed_lengths_to_go[ $i + 1 ] - $summed_lengths_to_go[$i]; my $len_tok_i = $token_lengths_to_go[$i]; my $KK = $K_to_go[$i]; my $len_tok_K; # For --indent-only, there is not always agreement between # token lengths in _rLL_ and token_lengths_to_go, so skip that check. if ( defined($KK) && !$rOpts_indent_only ) { $len_tok_K = $rLL->[$KK]->[_TOKEN_LENGTH_]; } if ( $len_by_sum != $len_tok_i || defined($len_tok_K) && $len_by_sum != $len_tok_K ) { my $lno = defined($KK) ? 
$rLL->[$KK]->[_LINE_INDEX_] + 1 : "undef"; $KK = 'undef' unless defined($KK); my $tok = $tokens_to_go[$i]; my $type = $types_to_go[$i]; Fault(<[$KK]->[_TOKEN_LENGTH_]=$len_tok_K near line $lno starting with '$tokens_to_go[0]..' at token i=$i K=$KK token_type='$type' token='$tok' EOM } } return; } ## end sub check_batch_summed_lengths { ## begin closure set_vertical_alignment_markers my %is_vertical_alignment_type; my %is_not_vertical_alignment_token; my %is_vertical_alignment_keyword; my %is_terminal_alignment_type; my %is_low_level_alignment_token; BEGIN { my @q; # Replaced =~ and // in the list. // had been removed in RT 119588 @q = qw# = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x= { ? : => && || ~~ !~~ =~ !~ // <=> -> #; @is_vertical_alignment_type{@q} = (1) x scalar(@q); # These 'tokens' are not aligned. We need this to remove [ # from the above list because it has type ='{' @q = qw([); @is_not_vertical_alignment_token{@q} = (1) x scalar(@q); # these are the only types aligned at a line end @q = qw(&& || =>); @is_terminal_alignment_type{@q} = (1) x scalar(@q); # these tokens only align at line level @q = ( '{', '(' ); @is_low_level_alignment_token{@q} = (1) x scalar(@q); # eq and ne were removed from this list to improve alignment chances @q = qw(if unless and or err for foreach while until); @is_vertical_alignment_keyword{@q} = (1) x scalar(@q); } ## end BEGIN my $ralignment_type_to_go; my $ralignment_counts; my $ralignment_hash_by_line; sub set_vertical_alignment_markers { my ( $self, $ri_first, $ri_last ) = @_; #---------------------------------------------------------------------- # This routine looks at output lines for certain tokens which can serve # as vertical alignment markers (such as an '='). 
        #----------------------------------------------------------------------

        # Input parameters:
        #   $ri_first = ref to list of starting line indexes in _to_go arrays
        #   $ri_last  = ref to list of ending line indexes in _to_go arrays

        # Method: We look at each token $i in this output batch and set
        # $ralignment_type_to_go->[$i] equal to those tokens at which we would
        # accept vertical alignment.

        # Initialize closure (and return) variables:
        $ralignment_type_to_go   = [];
        $ralignment_counts       = [];
        $ralignment_hash_by_line = [];

        # NOTE: closing side comments can insert up to 2 additional tokens
        # beyond the original $max_index_to_go, so we need to check ri_last for
        # the last index.
        my $max_line = @{$ri_first} - 1;
        my $max_i    = $ri_last->[$max_line];
        if ( $max_i < $max_index_to_go ) { $max_i = $max_index_to_go }

        # -----------------------------------------------------------------
        # Shortcut:
        # - no alignments if there is only 1 token.
        # - and nothing to do if we aren't allowed to change whitespace.
        # -----------------------------------------------------------------
        if ( $max_i <= 0 || !$rOpts_add_whitespace ) {
            goto RETURN;
        }

        # -------------------------------
        # First handle any side comment.
        # -------------------------------
        my $i_terminal = $max_i;
        if ( $types_to_go[$max_i] eq '#' ) {

            # We know $max_i > 0 if we get here.
$i_terminal -= 1; if ( $i_terminal > 0 && $types_to_go[$i_terminal] eq 'b' ) { $i_terminal -= 1; } my $token = $tokens_to_go[$max_i]; my $KK = $K_to_go[$max_i]; # Do not align various special side comments my $do_not_align = ( # it is any specially marked side comment ( defined($KK) && $self->[_rspecial_side_comment_type_]->{$KK} ) # or it is a static side comment || ( $rOpts->{'static-side-comments'} && $token =~ /$static_side_comment_pattern/ ) # or a closing side comment || ( $types_to_go[$i_terminal] eq '}' && $tokens_to_go[$i_terminal] eq '}' && $token =~ /$closing_side_comment_prefix_pattern/ ) ); # - For the specific combination -vc -nvsc, we put all side comments # at fixed locations. Note that we will lose hanging side comment # alignments. Otherwise, hsc's can move to strange locations. # - For -nvc -nvsc we make all side comments vertical alignments # because the vertical aligner will check for -nvsc and be able # to reduce the final padding to the side comments for long lines. # and keep hanging side comments aligned. if ( !$do_not_align && !$rOpts_valign_side_comments && $rOpts_valign_code ) { $do_not_align = 1; my $ipad = $max_i - 1; if ( $types_to_go[$ipad] eq 'b' ) { my $pad_spaces = $rOpts->{'minimum-space-to-comment'} - $token_lengths_to_go[$ipad]; $self->pad_token( $ipad, $pad_spaces ); } } if ( !$do_not_align ) { $ralignment_type_to_go->[$max_i] = '#'; $ralignment_hash_by_line->[$max_line]->{$max_i} = '#'; $ralignment_counts->[$max_line]++; } } # ---------------------------------------------- # Nothing more to do on this line if -nvc is set # ---------------------------------------------- if ( !$rOpts_valign_code ) { goto RETURN; } # ------------------------------------- # Loop over each line of this batch ... # ------------------------------------- foreach my $line ( 0 .. 
          $max_line ) {

            my $ibeg = $ri_first->[$line];
            my $iend = $ri_last->[$line];

            next if ( $iend <= $ibeg );

            # back up before any side comment
            if ( $iend > $i_terminal ) { $iend = $i_terminal }

            #----------------------------------
            # Loop over all tokens on this line
            #----------------------------------
            $self->set_vertical_alignment_markers_token_loop( $line, $ibeg,
                $iend );
        }

      RETURN:
        return ( $ralignment_type_to_go, $ralignment_counts,
            $ralignment_hash_by_line );
    } ## end sub set_vertical_alignment_markers

    sub set_vertical_alignment_markers_token_loop {
        my ( $self, $line, $ibeg, $iend ) = @_;

        # Set vertical alignment markers for the tokens on one line
        # of the current output batch. This is done by updating the
        # three closure variables:
        #   $ralignment_type_to_go
        #   $ralignment_counts
        #   $ralignment_hash_by_line

        # Input parameters:
        #   $line = index of this line in the current batch
        #   $ibeg, $iend = index range of tokens to check in the _to_go arrays

        my $level_beg = $levels_to_go[$ibeg];
        my $token_beg = $tokens_to_go[$ibeg];
        my $type_beg  = $types_to_go[$ibeg];
        my $type_beg_special_char =
          ( $type_beg eq '.' || $type_beg eq ':' || $type_beg eq '?'
); my $last_vertical_alignment_BEFORE_index = -1; my $vert_last_nonblank_type = $type_beg; my $vert_last_nonblank_token = $token_beg; # ---------------------------------------------------------------- # Initialization code merged from 'sub delete_needless_alignments' # ---------------------------------------------------------------- my $i_good_paren = -1; my $i_elsif_close = $ibeg - 1; my $i_elsif_open = $iend + 1; my @imatch_list; if ( $type_beg eq 'k' ) { # Initialization for paren patch: mark a location of a paren we # should keep, such as one following something like a leading # 'if', 'elsif', $i_good_paren = $ibeg + 1; if ( $types_to_go[$i_good_paren] eq 'b' ) { $i_good_paren++; } # Initialization for 'elsif' patch: remember the paren range of # an elsif, and do not make alignments within them because this # can cause loss of padding and overall brace alignment in the # vertical aligner. if ( $token_beg eq 'elsif' && $i_good_paren < $iend && $tokens_to_go[$i_good_paren] eq '(' ) { $i_elsif_open = $i_good_paren; $i_elsif_close = $mate_index_to_go[$i_good_paren]; if ( !defined($i_elsif_close) ) { $i_elsif_close = -1 } } } ## end if ( $type_beg eq 'k' ) # -------------------------------------------- # Loop over each token in this output line ... # -------------------------------------------- foreach my $i ( $ibeg + 1 .. $iend ) { next if ( $types_to_go[$i] eq 'b' ); my $type = $types_to_go[$i]; my $token = $tokens_to_go[$i]; my $alignment_type = EMPTY_STRING; # ---------------------------------------------- # Check for 'paren patch' : Remove excess parens # ---------------------------------------------- # Excess alignment of parens can prevent other good alignments. # For example, note the parens in the first two rows of the # following snippet. 
They would normally get marked for # alignment and aligned as follows: # my $w = $columns * $cell_w + ( $columns + 1 ) * $border; # my $h = $rows * $cell_h + ( $rows + 1 ) * $border; # my $img = new Gimp::Image( $w, $h, RGB ); # This causes unnecessary paren alignment and prevents the # third equals from aligning. If we remove the unwanted # alignments we get: # my $w = $columns * $cell_w + ( $columns + 1 ) * $border; # my $h = $rows * $cell_h + ( $rows + 1 ) * $border; # my $img = new Gimp::Image( $w, $h, RGB ); # A rule for doing this which works well is to remove alignment # of parens whose containers do not contain other aligning # tokens, with the exception that we always keep alignment of # the first opening paren on a line (for things like 'if' and # 'elsif' statements). if ( $token eq ')' && @imatch_list ) { # undo the corresponding opening paren if: # - it is at the top of the stack # - and not the first overall opening paren # - does not follow a leading keyword on this line my $imate = $mate_index_to_go[$i]; if ( !defined($imate) ) { $imate = -1 } if ( $imatch_list[-1] eq $imate && ( $ibeg > 1 || @imatch_list > 1 ) && $imate > $i_good_paren ) { if ( $ralignment_type_to_go->[$imate] ) { $ralignment_type_to_go->[$imate] = EMPTY_STRING; $ralignment_counts->[$line]--; delete $ralignment_hash_by_line->[$line]->{$imate}; } pop @imatch_list; } } # do not align tokens at lower level than start of line # except for side comments if ( $levels_to_go[$i] < $level_beg ) { next; } #-------------------------------------------------------- # First see if we want to align BEFORE this token #-------------------------------------------------------- # The first possible token that we can align before # is index 2 because: 1) it doesn't normally make sense to # align before the first token and 2) the second # token must be a blank if we are to align before # the third if ( $i < $ibeg + 2 ) { } # must follow a blank token elsif ( $types_to_go[ $i - 1 ] ne 'b' ) { } # 
otherwise, do not align two in a row to create a # blank field elsif ( $last_vertical_alignment_BEFORE_index == $i - 2 ) { } # align before one of these keywords # (within a line, since $i>1) elsif ( $type eq 'k' ) { # /^(if|unless|and|or|eq|ne)$/ if ( $is_vertical_alignment_keyword{$token} ) { $alignment_type = $token; } } # align qw in a 'use' statement (issue git #93) elsif ( $type eq 'q' ) { if ( $types_to_go[0] eq 'k' && $tokens_to_go[0] eq 'use' ) { $alignment_type = $type; } } # align before one of these types.. elsif ( $is_vertical_alignment_type{$type} && !$is_not_vertical_alignment_token{$token} ) { $alignment_type = $token; # Do not align a terminal token. Although it might # occasionally look ok to do this, this has been found to be # a good general rule. The main problems are: # (1) that the terminal token (such as an = or :) might get # moved far to the right where it is hard to see because # nothing follows it, and # (2) doing so may prevent other good alignments. # Current exceptions are && and || and => if ( $i == $iend ) { $alignment_type = EMPTY_STRING unless ( $is_terminal_alignment_type{$type} ); } # Do not align leading ': (' or '. ('. This would prevent # alignment in something like the following: # $extra_space .= # ( $input_line_number < 10 ) ? " " # : ( $input_line_number < 100 ) ? " " # : ""; # or # $code = # ( $case_matters ? $accessor : " lc($accessor) " ) # . ( $yesno ? " eq " : " ne " ) # Also, do not align a ( following a leading ? so we can # align something like this: # $converter{$_}->{ushortok} = # $PDL::IO::Pic::biggrays # ? ( m/GIF/ ? 0 : 1 ) # : ( m/GIF|RAST|IFF/ ? 
0 : 1 ); if ( $type_beg_special_char && $i == $ibeg + 2 && $types_to_go[ $i - 1 ] eq 'b' ) { $alignment_type = EMPTY_STRING; } # Certain tokens only align at the same level as the # initial line level if ( $is_low_level_alignment_token{$token} && $levels_to_go[$i] != $level_beg ) { $alignment_type = EMPTY_STRING; } if ( $token eq '(' ) { # For a paren after keyword, only align if-like parens, # such as: # if ( $a ) { &a } # elsif ( $b ) { &b } # ^-------------------aligned parens if ( $vert_last_nonblank_type eq 'k' && !$is_if_unless_elsif{$vert_last_nonblank_token} ) { $alignment_type = EMPTY_STRING; } # Do not align a spaced-function-paren if requested. # Issue git #53, #73. if ( !$rOpts_function_paren_vertical_alignment ) { my $seqno = $type_sequence_to_go[$i]; $alignment_type = EMPTY_STRING if ( $self->[_ris_function_call_paren_]->{$seqno} ); } # make () align with qw in a 'use' statement (git #93) if ( $tokens_to_go[0] eq 'use' && $types_to_go[0] eq 'k' && defined( $mate_index_to_go[$i] ) && $mate_index_to_go[$i] == $i + 1 ) { $alignment_type = 'q'; ## Note on discussion git #101. We could make this ## a separate type '()' to separate it from qw's: ## $alignment_type = ## $rOpts_valign_empty_parens_with_qw ? 'q' : '()'; } } # be sure the alignment tokens are unique # This experiment didn't work well: reason not determined # if ($token ne $type) {$alignment_type .= $type} } # NOTE: This is deactivated because it causes the previous # if/elsif alignment to fail #elsif ( $type eq '}' && $token eq '}' && $block_type_to_go[$i]) #{ $alignment_type = $type; } if ($alignment_type) { $last_vertical_alignment_BEFORE_index = $i; } #-------------------------------------------------------- # Next see if we want to align AFTER the previous nonblank #-------------------------------------------------------- # We want to line up ',' and interior ';' tokens, with the added # space AFTER these tokens. (Note: interior ';' is included # because it may occur in short blocks). 
elsif ( # previous token IS one of these: ( $vert_last_nonblank_type eq ',' || $vert_last_nonblank_type eq ';' ) # and it follows a blank && $types_to_go[ $i - 1 ] eq 'b' # and it's NOT one of these && !$is_closing_token{$type} # then go ahead and align ) { $alignment_type = $vert_last_nonblank_type; } #----------------------- # Set the alignment type #----------------------- if ($alignment_type) { # but do not align the opening brace of an anonymous sub if ( $token eq '{' && $block_type_to_go[$i] && $block_type_to_go[$i] =~ /$ASUB_PATTERN/ ) { } # and do not make alignments within 'elsif' parens elsif ( $i > $i_elsif_open && $i < $i_elsif_close ) { } # and ignore any tokens which have leading padded spaces # example: perl527/lop.t elsif ( substr( $alignment_type, 0, 1 ) eq SPACE ) { } else { $ralignment_type_to_go->[$i] = $alignment_type; $ralignment_hash_by_line->[$line]->{$i} = $alignment_type; $ralignment_counts->[$line]++; push @imatch_list, $i; } } $vert_last_nonblank_type = $type; $vert_last_nonblank_token = $token; } return; } ## end sub set_vertical_alignment_markers_token_loop } ## end closure set_vertical_alignment_markers sub make_vertical_alignments { my ( $self, $ri_first, $ri_last ) = @_; #---------------------------- # Shortcut for a single token #---------------------------- if ( $max_index_to_go == 0 ) { if ( @{$ri_first} == 1 && $ri_last->[0] == 0 ) { my $rtokens = []; my $rfields = [ $tokens_to_go[0] ]; my $rpatterns = [ $types_to_go[0] ]; my $rfield_lengths = [ $summed_lengths_to_go[1] - $summed_lengths_to_go[0] ]; return [ [ $rtokens, $rfields, $rpatterns, $rfield_lengths ] ]; } # Strange line packing, not fatal but should not happen elsif (DEVEL_MODE) { my $max_line = @{$ri_first} - 1; my $ibeg = $ri_first->[0]; my $iend = $ri_last->[0]; my $tok_b = $tokens_to_go[$ibeg]; my $tok_e = $tokens_to_go[$iend]; my $type_b = $types_to_go[$ibeg]; my $type_e = $types_to_go[$iend]; Fault( "Strange..max_index=0 but nlines=$max_line ibeg=$ibeg tok=$tok_b 
type=$type_b iend=$iend tok=$tok_e type=$type_e; please check\n" ); } } #--------------------------------------------------------- # Step 1: Define the alignment tokens for the entire batch #--------------------------------------------------------- my ( $ralignment_type_to_go, $ralignment_counts, $ralignment_hash_by_line ); # We only need to make this call if vertical alignment of code is # requested or if a line might have a side comment. if ( $rOpts_valign_code || $types_to_go[$max_index_to_go] eq '#' ) { ( $ralignment_type_to_go, $ralignment_counts, $ralignment_hash_by_line ) = $self->set_vertical_alignment_markers( $ri_first, $ri_last ); } #---------------------------------------------- # Step 2: Break each line into alignment fields #---------------------------------------------- my $rline_alignments = []; my $max_line = @{$ri_first} - 1; foreach my $line ( 0 .. $max_line ) { my $ibeg = $ri_first->[$line]; my $iend = $ri_last->[$line]; my $rtok_fld_pat_len = $self->make_alignment_patterns( $ibeg, $iend, $ralignment_type_to_go, $ralignment_counts->[$line], $ralignment_hash_by_line->[$line] ); push @{$rline_alignments}, $rtok_fld_pat_len; } return $rline_alignments; } ## end sub make_vertical_alignments sub get_seqno { # get opening and closing sequence numbers of a token for the vertical # aligner. Assign qw quotes a value to allow qw opening and closing tokens # to be treated somewhat like opening and closing tokens for stacking # tokens by the vertical aligner. 
my ( $self, $ii, $ending_in_quote ) = @_; my $rLL = $self->[_rLL_]; my $KK = $K_to_go[$ii]; my $seqno = $rLL->[$KK]->[_TYPE_SEQUENCE_]; if ( $rLL->[$KK]->[_TYPE_] eq 'q' ) { my $SEQ_QW = -1; my $token = $rLL->[$KK]->[_TOKEN_]; if ( $ii > 0 ) { $seqno = $SEQ_QW if ( $token =~ /^qw\s*[\(\{\[]/ ); } else { if ( !$ending_in_quote ) { $seqno = $SEQ_QW if ( $token =~ /[\)\}\]]$/ ); } } } return ($seqno); } ## end sub get_seqno { my %undo_extended_ci; sub initialize_undo_ci { %undo_extended_ci = (); return; } sub undo_ci { # Undo continuation indentation in certain sequences my ( $self, $ri_first, $ri_last, $rix_seqno_controlling_ci ) = @_; my ( $line_1, $line_2, $lev_last ); my $max_line = @{$ri_first} - 1; my $rseqno_controlling_my_ci = $self->[_rseqno_controlling_my_ci_]; # Prepare a list of controlling indexes for each line if required. # This is used for efficient processing below. Note: this is # critical for speed. In the initial implementation I just looped # through the @$rix_seqno_controlling_ci list below. Using NYT_prof, I # found that this routine was causing a huge run time in large lists. # On a very large list test case, this new coding dropped the run time # of this routine from 30 seconds to 169 milliseconds. my @i_controlling_ci; if ( $rix_seqno_controlling_ci && @{$rix_seqno_controlling_ci} ) { my @tmp = reverse @{$rix_seqno_controlling_ci}; my $ix_next = pop @tmp; foreach my $line ( 0 .. $max_line ) { my $iend = $ri_last->[$line]; while ( defined($ix_next) && $ix_next <= $iend ) { push @{ $i_controlling_ci[$line] }, $ix_next; $ix_next = pop @tmp; } } } # Loop over all lines of the batch ... # Workaround originally created for problem c007, in which the # combination -lp -xci could produce a "Program bug" message in unusual # circumstances. my $skip_SECTION_1; if ( $rOpts_line_up_parentheses && $rOpts_extended_continuation_indentation ) { # Only set this flag if -lp is actually used here foreach my $line ( 0 .. 
$max_line ) { my $ibeg = $ri_first->[$line]; if ( ref( $leading_spaces_to_go[$ibeg] ) ) { $skip_SECTION_1 = 1; last; } } } foreach my $line ( 0 .. $max_line ) { my $ibeg = $ri_first->[$line]; my $iend = $ri_last->[$line]; my $lev = $levels_to_go[$ibeg]; #----------------------------------- # SECTION 1: Undo needless common CI #----------------------------------- # We are looking at leading tokens and looking for a sequence all # at the same level and all at a higher level than enclosing lines. # For example, we can undo continuation indentation in sort/map/grep # chains # my $dat1 = pack( "n*", # map { $_, $lookup->{$_} } # sort { $a <=> $b } # grep { $lookup->{$_} ne $default } keys %$lookup ); # to become # my $dat1 = pack( "n*", # map { $_, $lookup->{$_} } # sort { $a <=> $b } # grep { $lookup->{$_} ne $default } keys %$lookup ); if ( $line > 0 && !$skip_SECTION_1 ) { # if we have started a chain.. if ($line_1) { # see if it continues.. if ( $lev == $lev_last ) { if ( $types_to_go[$ibeg] eq 'k' && $is_sort_map_grep{ $tokens_to_go[$ibeg] } ) { # chain continues... # check for chain ending at end of a statement my $is_semicolon_terminated = ( $line == $max_line && ( $types_to_go[$iend] eq ';' # with possible side comment || ( $types_to_go[$iend] eq '#' && $iend - $ibeg >= 2 && $types_to_go[ $iend - 2 ] eq ';' && $types_to_go[ $iend - 1 ] eq 'b' ) ) ); $line_2 = $line if ($is_semicolon_terminated); } else { # kill chain $line_1 = undef; } } elsif ( $lev < $lev_last ) { # chain ends with previous line $line_2 = $line - 1; } elsif ( $lev > $lev_last ) { # kill chain $line_1 = undef; } # undo the continuation indentation if a chain ends if ( defined($line_2) && defined($line_1) ) { my $continuation_line_count = $line_2 - $line_1 + 1; @ci_levels_to_go[ @{$ri_first}[ $line_1 .. $line_2 ] ] = (0) x ($continuation_line_count) if ( $continuation_line_count >= 0 ); @leading_spaces_to_go[ @{$ri_first} [ $line_1 .. 
                          $line_2 ] ] =
                          @reduced_spaces_to_go[ @{$ri_first}
                          [ $line_1 .. $line_2 ] ];

                        $line_1 = undef;
                    }
                }

                # not in a chain yet..
                else {

                    # look for start of a new sort/map/grep chain
                    if ( $lev > $lev_last ) {
                        if (   $types_to_go[$ibeg] eq 'k'
                            && $is_sort_map_grep{ $tokens_to_go[$ibeg] } )
                        {
                            $line_1 = $line;
                        }
                    }
                }
            }

            #-------------------------------------
            # SECTION 2: Undo ci at cuddled blocks
            #-------------------------------------

            # Note that sub get_final_indentation will be called later to
            # actually do this, but for now we will tentatively mark cuddled
            # lines with ci=0 so that the -xci loop which follows will be
            # correct at cuddles.
            if (
                $types_to_go[$ibeg] eq '}'
                && ( $nesting_depth_to_go[$iend] + 1 ==
                    $nesting_depth_to_go[$ibeg] )
              )
            {
                my $terminal_type = $types_to_go[$iend];
                if ( $terminal_type eq '#' && $iend > $ibeg ) {
                    $terminal_type = $types_to_go[ $iend - 1 ];
                    if ( $terminal_type eq '#' && $iend - 1 > $ibeg ) {
                        $terminal_type = $types_to_go[ $iend - 2 ];
                    }
                }

                # Patch for rt144979, part 2. Coordinated with part 1.
                # Skip cuddled braces.
                my $seqno_beg = $type_sequence_to_go[$ibeg];
                my $is_cuddled_closing_brace = $seqno_beg
                  && $self->[_ris_cuddled_closing_brace_]->{$seqno_beg};

                if ( $terminal_type eq '{' && !$is_cuddled_closing_brace ) {
                    $ci_levels_to_go[$ibeg] = 0;
                }
            }

            #--------------------------------------------------------
            # SECTION 3: Undo ci set by sub extended_ci if not needed
            #--------------------------------------------------------

            # Undo the ci of the leading token if its controlling token
            # went out on a previous line without ci
            if ( $ci_levels_to_go[$ibeg] ) {
                my $Kbeg  = $K_to_go[$ibeg];
                my $seqno = $rseqno_controlling_my_ci->{$Kbeg};
                if ( $seqno && $undo_extended_ci{$seqno} ) {

                    # but do not undo ci set by the -lp flag
                    if ( !ref( $reduced_spaces_to_go[$ibeg] ) ) {
                        $ci_levels_to_go[$ibeg] = 0;
                        $leading_spaces_to_go[$ibeg] =
                          $reduced_spaces_to_go[$ibeg];
                    }
                }
            }

            # Flag any controlling opening tokens in lines without ci.
This # will be used later in the above if statement to undo the ci which # they added. The array i_controlling_ci[$line] was prepared at # the top of this routine. if ( !$ci_levels_to_go[$ibeg] && defined( $i_controlling_ci[$line] ) ) { foreach my $i ( @{ $i_controlling_ci[$line] } ) { my $seqno = $type_sequence_to_go[$i]; $undo_extended_ci{$seqno} = 1; } } $lev_last = $lev; } return; } ## end sub undo_ci } { ## begin closure set_logical_padding my %is_math_op; BEGIN { my @q = qw( + - * / ); @is_math_op{@q} = (1) x scalar(@q); } sub set_logical_padding { # Look at a batch of lines and see if extra padding can improve the # alignment when there are certain leading operators. Here is an # example, in which some extra space is introduced before # '( $year' to make it line up with the subsequent lines: # # if ( ( $Year < 1601 ) # || ( $Year > 2899 ) # || ( $EndYear < 1601 ) # || ( $EndYear > 2899 ) ) # { # &Error_OutOfRange; # } # my ( $self, $ri_first, $ri_last, $starting_in_quote ) = @_; my $max_line = @{$ri_first} - 1; my ( $ibeg, $ibeg_next, $ibegm, $iend, $iendm, $ipad, $pad_spaces, $tok_next, $type_next, $has_leading_op_next, $has_leading_op ); # Patch to produce padding in the first line of short code blocks. # This is part of an update to fix cases b562 .. b983. # This is needed to compensate for a change which was made in 'sub # starting_one_line_block' to prevent blinkers. Previously, that sub # would not look at the total block size and rely on sub # break_long_lines to break up long blocks. Consequently, the # first line of those batches would end in the opening block brace of a # sort/map/grep/eval block. When this was changed to immediately check # for blocks which were too long, the opening block brace would go out # in a single batch, and the block contents would go out as the next # batch. This caused the logic in this routine which decides if the # first line should be padded to be incorrect. 
To fix this, we set a # flag if the previous batch ended in an opening sort/map/grep/eval # block brace, and use it to adjust the logic to compensate. # For example, the following would have previously been a single batch # but now is two batches. We want to pad the line starting in '$dir': # my (@indices) = # batch n-1 (prev batch n) # sort { # batch n-1 (prev batch n) # $dir eq 'left' # batch n # ? $cells[$a] <=> $cells[$b] # batch n # : $cells[$b] <=> $cells[$a]; # batch n # } ( 0 .. $#cells ); # batch n my $rLL = $self->[_rLL_]; my $rblock_type_of_seqno = $self->[_rblock_type_of_seqno_]; my $is_short_block; if ( $K_to_go[0] > 0 ) { my $Kp = $K_to_go[0] - 1; if ( $Kp > 0 && $rLL->[$Kp]->[_TYPE_] eq 'b' ) { $Kp -= 1; } if ( $Kp > 0 && $rLL->[$Kp]->[_TYPE_] eq '#' ) { $Kp -= 1; if ( $Kp > 0 && $rLL->[$Kp]->[_TYPE_] eq 'b' ) { $Kp -= 1; } } my $seqno = $rLL->[$Kp]->[_TYPE_SEQUENCE_]; if ($seqno) { my $block_type = $rblock_type_of_seqno->{$seqno}; if ($block_type) { $is_short_block = $is_sort_map_grep_eval{$block_type}; $is_short_block ||= $want_one_line_block{$block_type}; } } } # looking at each line of this batch.. foreach my $line ( 0 .. $max_line - 1 ) { # see if the next line begins with a logical operator $ibeg = $ri_first->[$line]; $iend = $ri_last->[$line]; $ibeg_next = $ri_first->[ $line + 1 ]; $tok_next = $tokens_to_go[$ibeg_next]; $type_next = $types_to_go[$ibeg_next]; $has_leading_op_next = ( $tok_next =~ /^\w/ ) ? $is_chain_operator{$tok_next} # + - * / : ? && || : $is_chain_operator{$type_next}; # and, or next unless ($has_leading_op_next); # next line must not be at lesser depth next if ( $nesting_depth_to_go[$ibeg] > $nesting_depth_to_go[$ibeg_next] ); # identify the token in this line to be padded on the left $ipad = undef; # handle lines at same depth... if ( $nesting_depth_to_go[$ibeg] == $nesting_depth_to_go[$ibeg_next] ) { # if this is not first line of the batch ... if ( $line > 0 ) { # and we have leading operator.. 
                    next if $has_leading_op;

                    # Introduce padding if..
                    # 1. the previous line is at lesser depth, or
                    # 2. the previous line ends in an assignment
                    # 3. the previous line ends in a 'return'
                    # 4. the previous line ends in a comma
                    # Example 1: previous line at lesser depth
                    #       if (   ( $Year < 1601 )      # <- we are here but
                    #           || ( $Year > 2899 )      #  list has not yet
                    #           || ( $EndYear < 1601 )   # collapsed vertically
                    #           || ( $EndYear > 2899 ) )
                    #       {
                    #
                    # Example 2: previous line ending in assignment:
                    #    $leapyear =
                    #        $year % 4   ? 0     # <- We are here
                    #      : $year % 100 ? 1
                    #      : $year % 400 ? 0
                    #      : 1;
                    #
                    # Example 3: previous line ending in comma:
                    #    push @expr,
                    #        /test/   ? undef
                    #      : eval($_) ? 1
                    #      : eval($_) ? 1
                    #      :            0;

                    # be sure levels agree (never indent after an indented 'if')
                    next
                      if ( $levels_to_go[$ibeg] ne $levels_to_go[$ibeg_next] );

                    # allow padding on first line after a comma but only if:
                    # (1) this is line 2 and
                    # (2) there are more than three lines and
                    # (3) lines 3 and 4 have the same leading operator
                    # These rules try to prevent padding within a long
                    # comma-separated list.
                    my $ok_comma;
                    if (   $types_to_go[$iendm] eq ','
                        && $line == 1
                        && $max_line > 2 )
                    {
                        my $ibeg_next_next = $ri_first->[ $line + 2 ];
                        my $tok_next_next  = $tokens_to_go[$ibeg_next_next];
                        $ok_comma = $tok_next_next eq $tok_next;
                    }

                    next
                      unless (
                           $is_assignment{ $types_to_go[$iendm] }
                        || $ok_comma
                        || ( $nesting_depth_to_go[$ibegm] <
                            $nesting_depth_to_go[$ibeg] )
                        || (   $types_to_go[$iendm] eq 'k'
                            && $tokens_to_go[$iendm] eq 'return' )
                      );

                    # we will add padding before the first token
                    $ipad = $ibeg;
                }

                # for first line of the batch..
                else {

                    # WARNING: Never indent if first line is starting in a
                    # continued quote, which would change the quote.
next if $starting_in_quote; # if this is text after closing '}' # then look for an interior token to pad if ( $types_to_go[$ibeg] eq '}' ) { } # otherwise, we might pad if it looks really good elsif ($is_short_block) { $ipad = $ibeg; } else { # we might pad token $ibeg, so be sure that it # is at the same depth as the next line. next if ( $nesting_depth_to_go[$ibeg] != $nesting_depth_to_go[$ibeg_next] ); # We can pad on line 1 of a statement if at least 3 # lines will be aligned. Otherwise, it # can look very confusing. # We have to be careful not to pad if there are too few # lines. The current rule is: # (1) in general we require at least 3 consecutive lines # with the same leading chain operator token, # (2) but an exception is that we only require two lines # with leading colons if there are no more lines. For example, # the first $i in the following snippet would get padding # by the second rule: # # $i == 1 ? ( "First", "Color" ) # : $i == 2 ? ( "Then", "Rarity" ) # : ( "Then", "Name" ); next if ( $max_line <= 1 ); my $leading_token = $tokens_to_go[$ibeg_next]; my $tokens_differ; # never indent line 1 of a '.' series because # previous line is most likely at same level. # TODO: we should also look at the leading_spaces # of the last output line and skip if it is same # as this line. next if ( $leading_token eq '.' ); my $count = 1; foreach my $l ( 2 .. 3 ) { last if ( $line + $l > $max_line ); $count++; my $ibeg_next_next = $ri_first->[ $line + $l ]; next if ( $tokens_to_go[$ibeg_next_next] eq $leading_token ); $tokens_differ = 1; last; } next if ($tokens_differ); next if ( $count < 3 && $leading_token ne ':' ); $ipad = $ibeg; } } } # find interior token to pad if necessary if ( !defined($ipad) ) { foreach my $i ( $ibeg .. 
                  $iend - 1 ) {

                    # find any unclosed container
                    next
                      unless ( $type_sequence_to_go[$i]
                        && defined( $mate_index_to_go[$i] )
                        && $mate_index_to_go[$i] > $iend );

                    # find next nonblank token to pad
                    $ipad = $inext_to_go[$i];
                    last if $ipad;
                }
                last if ( !$ipad || $ipad > $iend );
            }

            # We cannot pad the first leading token of a file because
            # it could cause a bug in which the starting indentation
            # level is guessed incorrectly each time the code is run
            # through perltidy, thus causing the code to march off to
            # the right.  For example, the following snippet would have
            # this problem:

            ## ov_method mycan( $package, '(""' ), $package
            ##  or ov_method mycan( $package, '(0+' ), $package
            ##  or ov_method mycan( $package, '(bool' ), $package
            ##  or ov_method mycan( $package, '(nomethod' ), $package;

            # If this snippet is within a block this won't happen
            # unless the user just processes the snippet alone within
            # an editor.  In that case either the user will see and
            # fix the problem or it will be corrected next time the
            # entire file is processed with perltidy.
            my $this_batch      = $self->[_this_batch_];
            my $peak_batch_size = $this_batch->[_peak_batch_size_];
            next if ( $ipad == 0 && $peak_batch_size <= 1 );

            # next line must not be at greater depth
            my $iend_next = $ri_last->[ $line + 1 ];
            next
              if ( $nesting_depth_to_go[ $iend_next + 1 ] >
                $nesting_depth_to_go[$ipad] );

            # lines must be somewhat similar to be padded..
            my $inext_next = $inext_to_go[$ibeg_next];
            my $type       = $types_to_go[$ipad];

            # see if there are multiple continuation lines
            my $logical_continuation_lines = 1;
            if ( $line + 2 <= $max_line ) {
                my $leading_token  = $tokens_to_go[$ibeg_next];
                my $ibeg_next_next = $ri_first->[ $line + 2 ];
                if (   $tokens_to_go[$ibeg_next_next] eq $leading_token
                    && $nesting_depth_to_go[$ibeg_next] eq
                    $nesting_depth_to_go[$ibeg_next_next] )
                {
                    $logical_continuation_lines++;
                }
            }

            # see if leading types match
            my $types_match = $types_to_go[$inext_next] eq $type;
            my $matches_without_bang;

            # if first line has leading !
then compare the following token if ( !$types_match && $type eq '!' ) { $types_match = $matches_without_bang = $types_to_go[$inext_next] eq $types_to_go[ $ipad + 1 ]; } if ( # either we have multiple continuation lines to follow # and we are not padding the first token ( $logical_continuation_lines > 1 && ( $ipad > 0 || $is_short_block ) ) # or.. || ( # types must match $types_match # and keywords must match if keyword && !( $type eq 'k' && $tokens_to_go[$ipad] ne $tokens_to_go[$inext_next] ) ) ) { #----------------------begin special checks-------------- # # SPECIAL CHECK 1: # A check is needed before we can make the pad. # If we are in a list with some long items, we want each # item to stand out. So in the following example, the # first line beginning with '$casefold->' would look good # padded to align with the next line, but then it # would be indented more than the last line, so we # won't do it. # # ok( # $casefold->{code} eq '0041' # && $casefold->{status} eq 'C' # && $casefold->{mapping} eq '0061', # 'casefold 0x41' # ); # # Note: # It would be faster, and almost as good, to use a comma # count, and not pad if comma_count > 1 and the previous # line did not end with a comma. # my $ok_to_pad = 1; my $ibg = $ri_first->[ $line + 1 ]; my $depth = $nesting_depth_to_go[ $ibg + 1 ]; # just use simplified formula for leading spaces to avoid # needless sub calls my $lsp = $levels_to_go[$ibg] + $ci_levels_to_go[$ibg]; # look at each line beyond the next .. my $l = $line + 1; foreach my $ltest ( $line + 2 .. 
$max_line ) { $l = $ltest; my $ibeg_t = $ri_first->[$l]; # quit looking at the end of this container last if ( $nesting_depth_to_go[ $ibeg_t + 1 ] < $depth ) || ( $nesting_depth_to_go[$ibeg_t] < $depth ); # cannot do the pad if a later line would be # outdented more if ( $levels_to_go[$ibeg_t] + $ci_levels_to_go[$ibeg_t] < $lsp ) { $ok_to_pad = 0; last; } } # don't pad if we end in a broken list if ( $l == $max_line ) { my $i2 = $ri_last->[$l]; if ( $types_to_go[$i2] eq '#' ) { my $i1 = $ri_first->[$l]; next if terminal_type_i( $i1, $i2 ) eq ','; } } # SPECIAL CHECK 2: # a minus may introduce a quoted variable, and we will # add the pad only if this line begins with a bare word, # such as for the word 'Button' here: # [ # Button => "Print letter \"~$_\"", # -command => [ sub { print "$_[0]\n" }, $_ ], # -accelerator => "Meta+$_" # ]; # # On the other hand, if 'Button' is quoted, it looks best # not to pad: # [ # 'Button' => "Print letter \"~$_\"", # -command => [ sub { print "$_[0]\n" }, $_ ], # -accelerator => "Meta+$_" # ]; if ( $types_to_go[$ibeg_next] eq 'm' ) { $ok_to_pad = 0 if $types_to_go[$ibeg] eq 'Q'; } next unless $ok_to_pad; #----------------------end special check--------------- my $length_1 = total_line_length( $ibeg, $ipad - 1 ); my $length_2 = total_line_length( $ibeg_next, $inext_next - 1 ); $pad_spaces = $length_2 - $length_1; # If the first line has a leading ! and the second does # not, then remove one space to try to align the next # leading characters, which are often the same. For example: # if ( !$ts # || $ts == $self->Holder # || $self->Holder->Type eq "Arena" ) # # This usually helps readability, but if there are subsequent # ! operators things will still get messed up. For example: # # if ( !exists $Net::DNS::typesbyname{$qtype} # && exists $Net::DNS::classesbyname{$qtype} # && !exists $Net::DNS::classesbyname{$qclass} # && exists $Net::DNS::typesbyname{$qclass} ) # We can't fix that. 
if ($matches_without_bang) { $pad_spaces-- } # make sure this won't change if -lp is used my $indentation_1 = $leading_spaces_to_go[$ibeg]; if ( ref($indentation_1) && $indentation_1->get_recoverable_spaces() == 0 ) { my $indentation_2 = $leading_spaces_to_go[$ibeg_next]; if ( ref($indentation_2) && $indentation_2->get_recoverable_spaces() != 0 ) { $pad_spaces = 0; } } # we might be able to handle a pad of -1 by removing a blank # token if ( $pad_spaces < 0 ) { # Deactivated for -kpit due to conflict. This block deletes # a space in an attempt to improve alignment in some cases, # but it may conflict with user spacing requests. For now # it is just deactivated if the -kpit option is used. if ( $pad_spaces == -1 ) { if ( $ipad > $ibeg && $types_to_go[ $ipad - 1 ] eq 'b' && !%keyword_paren_inner_tightness ) { $self->pad_token( $ipad - 1, $pad_spaces ); } } $pad_spaces = 0; } # now apply any padding for alignment if ( $ipad >= 0 && $pad_spaces ) { my $length_t = total_line_length( $ibeg, $iend ); if ( $pad_spaces + $length_t <= $maximum_line_length_at_level[ $levels_to_go[$ibeg] ] ) { $self->pad_token( $ipad, $pad_spaces ); } } } } continue { $iendm = $iend; $ibegm = $ibeg; $has_leading_op = $has_leading_op_next; } ## end of loop over lines return; } ## end sub set_logical_padding } ## end closure set_logical_padding sub pad_token { # insert $pad_spaces before token number $ipad my ( $self, $ipad, $pad_spaces ) = @_; my $rLL = $self->[_rLL_]; my $KK = $K_to_go[$ipad]; my $tok = $rLL->[$KK]->[_TOKEN_]; my $tok_len = $rLL->[$KK]->[_TOKEN_LENGTH_]; if ( $pad_spaces > 0 ) { $tok = SPACE x $pad_spaces . 
$tok; $tok_len += $pad_spaces; } elsif ( $pad_spaces == 0 ) { return; } elsif ( $pad_spaces == -1 && $tokens_to_go[$ipad] eq SPACE ) { $tok = EMPTY_STRING; $tok_len = 0; } else { # shouldn't happen DEVEL_MODE && Fault("unexpected request for pad spaces = $pad_spaces\n"); return; } $tok = $rLL->[$KK]->[_TOKEN_] = $tok; $tok_len = $rLL->[$KK]->[_TOKEN_LENGTH_] = $tok_len; $token_lengths_to_go[$ipad] += $pad_spaces; $tokens_to_go[$ipad] = $tok; foreach my $i ( $ipad .. $max_index_to_go ) { $summed_lengths_to_go[ $i + 1 ] += $pad_spaces; } return; } ## end sub pad_token sub xlp_tweak { # Remove one indentation space from unbroken containers marked with # 'K_extra_space'. These are mostly two-line lists with short names # formatted with -xlp -pt=2. # # Before this fix (extra space in line 2): # is($module->VERSION, $expected, # "$main_module->VERSION matches $module->VERSION ($expected)"); # # After this fix: # is($module->VERSION, $expected, # "$main_module->VERSION matches $module->VERSION ($expected)"); # # Notes: # - This fixes issue git #106 # - This must be called after 'set_logical_padding'. # - This is currently only applied to -xlp. It would also work for -lp # but that style is essentially frozen. 
my ( $self, $ri_first, $ri_last ) = @_; # Must be 2 or more lines return unless ( @{$ri_first} > 1 ); # Pull indentation object from start of second line my $ibeg_1 = $ri_first->[1]; my $lp_object = $leading_spaces_to_go[$ibeg_1]; return if ( !ref($lp_object) ); # This only applies to an indentation object with a marked token my $K_extra_space = $lp_object->get_K_extra_space(); return unless ($K_extra_space); # Look for the marked token within the first line of this batch my $ibeg_0 = $ri_first->[0]; my $iend_0 = $ri_last->[0]; my $ii = $ibeg_0 + $K_extra_space - $K_to_go[$ibeg_0]; return if ( $ii <= $ibeg_0 || $ii > $iend_0 ); # Skip padded tokens, they have already been aligned my $tok = $tokens_to_go[$ii]; return if ( substr( $tok, 0, 1 ) eq SPACE ); # Skip 'if'-like statements, this does not improve them return if ( $types_to_go[$ibeg_0] eq 'k' && $is_if_unless_elsif{ $tokens_to_go[$ibeg_0] } ); # Looks okay, reduce indentation by 1 space if possible my $spaces = $lp_object->get_spaces(); if ( $spaces > 0 ) { $lp_object->decrease_SPACES(1); } return; } ## end sub xlp_tweak { ## begin closure make_alignment_patterns my %keyword_map; my %operator_map; my %is_w_n_C; my %is_my_local_our; my %is_kwU; my %is_use_like; my %is_binary_type; my %is_binary_keyword; my %name_map; BEGIN { # Note: %block_type_map is now global to enable the -gal=s option # map certain keywords to the same 'if' class to align # long if/elsif sequences. 
        # [elsif.pl]
        %keyword_map = (
            'unless'  => 'if',
            'else'    => 'if',
            'elsif'   => 'if',
            'when'    => 'given',
            'default' => 'given',
            'case'    => 'switch',

            # treat an 'undef' similar to numbers and quotes
            'undef' => 'Q',
        );

        # map certain operators to the same class for pattern matching
        %operator_map = (
            '!~' => '=~',
            '+=' => '+=',
            '-=' => '+=',
            '*=' => '+=',
            '/=' => '+=',
        );

        %is_w_n_C = (
            'w' => 1,
            'n' => 1,
            'C' => 1,
        );

        # leading keywords to skip for efficiency when making parenless
        # container names
        my @q = qw( my local our return );
        @{is_my_local_our}{@q} = (1) x scalar(@q);

        # leading keywords where we should just join one token to form
        # parenless name
        @q = qw( use );
        @{is_use_like}{@q} = (1) x scalar(@q);

        # leading token types which may be used to make a container name
        @q = qw( k w U );
        @{is_kwU}{@q} = (1) x scalar(@q);

        # token types which prevent using leading word as a container name
        @q = qw(
          x / : % . | ^ < = > || >= != *= => !~ == && |= .= -= =~ += <= %= ^= x= ~~ ** << /=
          &= // >> ~. &. |. ^.
          **= <<= >>= &&= ||= //= <=> !~~ &.= |.= ^.= <<~
        );
        push @q, ',';
        @{is_binary_type}{@q} = (1) x scalar(@q);

        # token keywords which prevent using leading word as a container name
        @q = qw(and or err eq ne cmp);
        @is_binary_keyword{@q} = (1) x scalar(@q);

        # Some common function calls whose args can be aligned.  These do not
        # give good alignments if the lengths differ significantly.
        %name_map = (
            'unlike' => 'like',
            'isnt'   => 'is',
            ##'is_deeply' => 'is',   # poor; name lengths too different
        );

    } ## end BEGIN

    sub make_alignment_patterns {

        my ( $self, $ibeg, $iend, $ralignment_type_to_go, $alignment_count,
            $ralignment_hash )
          = @_;

        #------------------------------------------------------------------
        # This sub creates arrays of vertical alignment info for one output
        # line.
        #------------------------------------------------------------------

        # Input parameters:
        #  $ibeg, $iend - index range of this line in the _to_go arrays
        #  $ralignment_type_to_go - alignment type of tokens, like '=', if any
        #  $alignment_count - number of alignment tokens in the line
        #  $ralignment_hash - this contains all of the alignments for this
        #    line.  It is not yet used but is available for future coding in
        #    case there is a need to do a preliminary scan of alignment tokens.

        # The arrays which are created contain strings that can be tested by
        # the vertical aligner to see if consecutive lines can be aligned
        # vertically.
        #
        # The four arrays are indexed on the vertical
        # alignment fields and are:
        # @tokens - a list of any vertical alignment tokens for this line.
        #   These are tokens, such as '=' '&&' '#' etc which
        #   we might want to align vertically.  These are
        #   decorated with various information such as
        #   nesting depth to prevent unwanted vertical
        #   alignment matches.
        # @fields - the actual text of the line between the vertical alignment
        #   tokens.
        # @patterns - a modified list of token types, one for each alignment
        #   field.  These should normally each match before alignment is
        #   allowed, even when the alignment tokens match.
        # @field_lengths - the display width of each field

        if (DEVEL_MODE) {
            my $new_count = 0;
            if ( defined($ralignment_hash) ) {
                $new_count = keys %{$ralignment_hash};
            }
            my $old_count = $alignment_count;
            $old_count = 0 unless ($old_count);
            if ( $new_count != $old_count ) {
                my $K   = $K_to_go[$ibeg];
                my $rLL = $self->[_rLL_];
                my $lnl = $rLL->[$K]->[_LINE_INDEX_];
                Fault(
"alignment hash token count gives count=$new_count but old count is $old_count near line=$lnl\n"
                );
            }
        }

        # -------------------------------------
        # Shortcut for lines without alignments
        # -------------------------------------
        if ( !$alignment_count ) {
            my $rtokens = [];
            my $rfield_lengths =
              [ $summed_lengths_to_go[ $iend + 1 ] -
                  $summed_lengths_to_go[$ibeg] ];
            my $rpatterns;
            my $rfields;
            if ( $ibeg == $iend ) {
                $rfields   = [ $tokens_to_go[$ibeg] ];
                $rpatterns = [ $types_to_go[$ibeg] ];
            }
            else {
                $rfields =
                  [ join( EMPTY_STRING, @tokens_to_go[ $ibeg .. $iend ] ) ];
                $rpatterns =
                  [ join( EMPTY_STRING, @types_to_go[ $ibeg .. $iend ] ) ];
            }
            return [ $rtokens, $rfields, $rpatterns, $rfield_lengths ];
        }

        my $i_start      = $ibeg;
        my $depth        = 0;
        my $i_depth_prev = $i_start;
        my $depth_prev   = $depth;
        my %container_name = ( 0 => EMPTY_STRING );

        my @tokens        = ();
        my @fields        = ();
        my @patterns      = ();
        my @field_lengths = ();

        #-------------------------------------------------------------
        # Make a container name for any uncontained commas, issue c089
        #-------------------------------------------------------------
        # This is a generalization of the fix for rt136416 which was a
        # specialized patch just for 'use Module' statements.
        # We restrict this to semicolon-terminated statements; that way
        # we know that the top level commas are not in a list container.
        if ( $ibeg == 0 && $iend == $max_index_to_go ) {
            my $iterm = $max_index_to_go;
            if ( $types_to_go[$iterm] eq '#' ) {
                $iterm = iprev_to_go($iterm);
            }

            # Alignment lines ending like '=> sub {';  fixes issue c093
            my $term_type_ok = $types_to_go[$iterm] eq ';';
            $term_type_ok ||=
              $tokens_to_go[$iterm] eq '{' && $block_type_to_go[$iterm];

            if (   $iterm > $ibeg
                && $term_type_ok
                && !$is_my_local_our{ $tokens_to_go[$ibeg] }
                && $levels_to_go[$ibeg] eq $levels_to_go[$iterm] )
            {
                $container_name{'0'} =
                  make_uncontained_comma_name( $iterm, $ibeg, $iend );
            }
        }

        #--------------------------------
        # Begin main loop over all tokens
        #--------------------------------
        my $j = 0;    # field index

        $patterns[0] = EMPTY_STRING;
        my %token_count;
        for my $i ( $ibeg .. $iend ) {

            #-------------------------------------------------------------
            # Part 1: keep track of containers balanced on this line only.
            #-------------------------------------------------------------
            # These are used below to prevent unwanted cross-line alignments.
            # Unbalanced containers already avoid aligning across
            # container boundaries.
            my $type = $types_to_go[$i];
            if ( $type_sequence_to_go[$i] ) {
                my $token = $tokens_to_go[$i];
                if ( $is_opening_token{$token} ) {

                    # if container is balanced on this line...
                    my $i_mate = $mate_index_to_go[$i];
                    if ( !defined($i_mate) ) { $i_mate = -1 }
                    if ( $i_mate > $i && $i_mate <= $iend ) {
                        $i_depth_prev = $i;
                        $depth_prev   = $depth;
                        $depth++;

                        # Append the previous token name to make the container name
                        # more unique. This name will also be given to any commas
                        # within this container, and it helps avoid undesirable
                        # alignments of different types of containers.

                        # Containers beginning with { and [ are given those names
                        # for uniqueness. That way commas in different containers
                        # will not match.
                        # Here is an example of what this prevents:
                        #   a => [ 1, 2, 3 ],
                        #   b => { b1 => 4, b2 => 5 },
                        # Here is another example of what we avoid by labeling the
                        # commas properly:
                        #   is_d( [ $a, $a ], [ $b, $c ] );
                        #   is_d( { foo => $a, bar => $a }, { foo => $b, bar => $c } );
                        #   is_d( [ \$a, \$a ], [ \$b, \$c ] );

                        my $name =
                          $token eq '(' ? $self->make_paren_name($i) : $token;

                        # name cannot be '.', so change to something else if so
                        if ( $name eq '.' ) { $name = 'dot' }

                        $container_name{$depth} = "+" . $name;

                        # Make the container name even more unique if necessary.
                        # If we are not vertically aligning this opening paren,
                        # append a character count to avoid bad alignment since
                        # it usually looks bad to align commas within containers
                        # for which the opening parens do not align. Here
                        # is an example of very BAD alignment of commas (because
                        # the atan2 functions are not all aligned):
                        #    $XY =
                        #      $X * $RTYSQP1 * atan2( $X, $RTYSQP1 ) +
                        #      $Y * $RTXSQP1 * atan2( $Y, $RTXSQP1 ) -
                        #      $X * atan2( $X, 1 ) -
                        #      $Y * atan2( $Y, 1 );
                        #
                        # On the other hand, it is usually okay to align commas
                        # if opening parens align, such as:
                        #    glVertex3d( $cx + $s * $xs, $cy, $z );
                        #    glVertex3d( $cx, $cy + $s * $ys, $z );
                        #    glVertex3d( $cx - $s * $xs, $cy, $z );
                        #    glVertex3d( $cx, $cy - $s * $ys, $z );
                        #
                        # To distinguish between these situations, we append
                        # the length of the line from the previous matching
                        # token, or beginning of line, to the function name.
                        # This will allow the vertical aligner to reject
                        # undesirable matches.

                        # if we are not aligning on this paren...
                        if ( !$ralignment_type_to_go->[$i] ) {

                            my $len = length_tag( $i, $ibeg, $i_start );

                            # tack this length onto the container name to try
                            # to make a unique token name
                            $container_name{$depth} .= "-" . $len;
                        } ## end if ( !$ralignment_type_to_go...)
                    } ## end if ( $i_mate > $i && $i_mate...)
                } ## end if ( $is_opening_token...)
                elsif ( $is_closing_type{$token} ) {
                    $i_depth_prev = $i;
                    $depth_prev   = $depth;
                    $depth-- if $depth > 0;
                }
            } ## end if ( $type_sequence_to_go...)

            #------------------------------------------------------------
            # Part 2: if we find a new synchronization token, we are done
            # with a field
            #------------------------------------------------------------
            if ( $i > $i_start && $ralignment_type_to_go->[$i] ) {

                my $tok = my $raw_tok = $ralignment_type_to_go->[$i];

                # map similar items
                my $tok_map = $operator_map{$tok};
                $tok = $tok_map if ($tok_map);

                # make separators in different nesting depths unique
                # by appending the nesting depth digit.
                if ( $raw_tok ne '#' ) {
                    $tok .= "$nesting_depth_to_go[$i]";
                }

                # also decorate commas with any container name to avoid
                # unwanted cross-line alignments.
                if ( $raw_tok eq ',' || $raw_tok eq '=>' ) {

                    # If we are at an opening token which increased depth, we have
                    # to use the name from the previous depth.
                    my $depth_last = $i == $i_depth_prev ? $depth_prev : $depth;
                    my $depth_p =
                      ( $depth_last < $depth ? $depth_last : $depth );
                    if ( $container_name{$depth_p} ) {
                        $tok .= $container_name{$depth_p};
                    }
                }

                # Patch to avoid aligning leading and trailing if, unless.
                # Mark trailing if, unless statements with container names.
                # This makes them different from leading if, unless which
                # are not so marked at present. If we ever need to name
                # them too, we could use ci to distinguish them.
                # Example problem to avoid:
                #    return ( 2, "DBERROR" )
                #      if ( $retval == 2 );
                #    if ( scalar @_ ) {
                #        my ( $a, $b, $c, $d, $e, $f ) = @_;
                #    }
                if ( $raw_tok eq '(' ) {
                    if (   $ci_levels_to_go[$ibeg]
                        && $container_name{$depth} =~ /^\+(if|unless)/ )
                    {
                        $tok .= $container_name{$depth};
                    }
                }

                # Decorate block braces with block types to avoid
                # unwanted alignments such as the following:
                #    foreach ( @{$routput_array} ) { $fh->print($_) }
                #    eval { $fh->close() };
                if ( $raw_tok eq '{' && $block_type_to_go[$i] ) {
                    my $block_type = $block_type_to_go[$i];

                    # map certain related block types to allow
                    # else blocks to align
                    $block_type = $block_type_map{$block_type}
                      if ( defined( $block_type_map{$block_type} ) );

                    # remove sub names to allow one-line sub braces to align
                    # regardless of name
                    if ( $block_type =~ /$SUB_PATTERN/ ) { $block_type = 'sub' }

                    # allow all control-type blocks to align
                    if ( $block_type =~ /^[A-Z]+$/ ) { $block_type = 'BEGIN' }

                    $tok .= $block_type;
                }

                # Mark multiple copies of certain tokens with the copy number
                # This will allow the aligner to decide if they are matched.
                # For now, only do this for equals. For example, the two
                # equals on the next line will be labeled '=0' and '=0.2'.
                # Later, the '=0.2' will be ignored in alignment because it
                # has no match.
                #    $| = $debug = 1 if $opt_d;
                #    $full_index = 1 if $opt_i;
                if ( $raw_tok eq '=' || $raw_tok eq '=>' ) {
                    $token_count{$tok}++;
                    if ( $token_count{$tok} > 1 ) {
                        $tok .= '.' . $token_count{$tok};
                    }
                }

                # concatenate the text of the consecutive tokens to form
                # the field
                push( @fields,
                    join( EMPTY_STRING, @tokens_to_go[ $i_start ..
                      $i - 1 ] ) );

                push @field_lengths,
                  $summed_lengths_to_go[$i] - $summed_lengths_to_go[$i_start];

                # store the alignment token for this field
                push( @tokens, $tok );

                # get ready for the next batch
                $i_start = $i;
                $j++;
                $patterns[$j] = EMPTY_STRING;
            } ## end if ( new synchronization token

            #-----------------------------------------------
            # Part 3: continue accumulating the next pattern
            #-----------------------------------------------

            # for keywords we have to use the actual text
            if ( $type eq 'k' ) {

                my $tok_fix = $tokens_to_go[$i];

                # but map certain keywords to a common string to allow
                # alignment.
                $tok_fix = $keyword_map{$tok_fix}
                  if ( defined( $keyword_map{$tok_fix} ) );
                $patterns[$j] .= $tok_fix;
            }

            elsif ( $type eq 'b' ) {
                $patterns[$j] .= $type;
            }

            # Mark most things before arrows as a quote to
            # get them to line up. Testfile: mixed.pl.

            # handle $type =~ /^[wnC]$/
            elsif ( $is_w_n_C{$type} ) {

                my $type_fix = $type;

                if ( $i < $iend - 1 ) {
                    my $next_type = $types_to_go[ $i + 1 ];
                    my $i_next_nonblank =
                      ( ( $next_type eq 'b' ) ? $i + 2 : $i + 1 );

                    if ( $types_to_go[$i_next_nonblank] eq '=>' ) {
                        $type_fix = 'Q';

                        # Patch to ignore leading minus before words,
                        # by changing pattern 'mQ' into just 'Q',
                        # so that we can align things like this:
                        #  Button => "Print letter \"~$_\"",
                        #  -command => [ sub { print "$_[0]\n" }, $_ ],
                        if ( $patterns[$j] eq 'm' ) {
                            $patterns[$j] = EMPTY_STRING;
                        }
                    }
                }

                # Convert a bareword within braces into a quote for
                # matching. This will allow alignment of expressions like
                # this:
                #    local ( $SIG{'INT'} ) = IGNORE;
                #    local ( $SIG{ALRM} ) = 'POSTMAN';
                if (   $type eq 'w'
                    && $i > $ibeg
                    && $i < $iend
                    && $types_to_go[ $i - 1 ] eq 'L'
                    && $types_to_go[ $i + 1 ] eq 'R' )
                {
                    $type_fix = 'Q';
                }

                # patch to make numbers and quotes align
                if ( $type eq 'n' ) { $type_fix = 'Q' }

                $patterns[$j] .= $type_fix;
            } ## end elsif ( $is_w_n_C{$type} )

            # ignore any ! in patterns
            elsif ( $type eq '!'
              ) { }

            # everything else
            else {
                $patterns[$j] .= $type;

                # remove any zero-level name at first fat comma
                if ( $depth == 0 && $type eq '=>' ) {
                    $container_name{$depth} = EMPTY_STRING;
                }
            }
        } ## end for my $i ( $ibeg .. $iend)

        #---------------------------------------------------------------
        # End of main loop .. join text of tokens to make the last field
        #---------------------------------------------------------------
        push( @fields,
            join( EMPTY_STRING, @tokens_to_go[ $i_start .. $iend ] ) );
        push @field_lengths,
          $summed_lengths_to_go[ $iend + 1 ] - $summed_lengths_to_go[$i_start];

        return [ \@tokens, \@fields, \@patterns, \@field_lengths ];
    } ## end sub make_alignment_patterns

    sub make_uncontained_comma_name {
        my ( $iterm, $ibeg, $iend ) = @_;

        # Make a container name by combining all leading barewords,
        # keywords and functions.
        my $name  = EMPTY_STRING;
        my $count = 0;
        my $count_max;
        my $iname_end;
        my $ilast_blank;
        for ( $ibeg .. $iterm ) {
            my $type = $types_to_go[$_];

            if ( $type eq 'b' ) {
                $ilast_blank = $_;
                next;
            }

            my $token = $tokens_to_go[$_];

            # Give up if we find an opening paren, binary operator or
            # comma within or after the proposed container name.
            if (   $token eq '('
                || $is_binary_type{$type}
                || $type eq 'k' && $is_binary_keyword{$token} )
            {
                $name = EMPTY_STRING;
                last;
            }

            # The container name is only built of certain types:
            last if ( !$is_kwU{$type} );

            # Normally it is made of one word, but two words for 'use'
            if ( $count == 0 ) {
                if (   $type eq 'k'
                    && $is_use_like{ $tokens_to_go[$_] } )
                {
                    $count_max = 2;
                }
                else {
                    $count_max = 1;
                }
            }
            elsif ( defined($count_max) && $count >= $count_max ) {
                last;
            }

            if ( defined( $name_map{$token} ) ) {
                $token = $name_map{$token};
            }

            $name .= SPACE .
              $token;
            $iname_end = $_;
            $count++;
        }

        # Require a space after the container name token(s)
        if (   $name
            && defined($ilast_blank)
            && $ilast_blank > $iname_end )
        {
            $name = substr( $name, 1 );
        }
        return $name;
    } ## end sub make_uncontained_comma_name

    sub length_tag {

        my ( $i, $ibeg, $i_start ) = @_;

        # Generate a line length to be used as a tag for rejecting bad
        # alignments. The tag is the length of the line from the previous
        # matching token, or beginning of line, to the function name. This
        # will allow the vertical aligner to reject undesirable matches.

        # The basic method: sum length from previous alignment
        my $len = token_sequence_length( $i_start, $i - 1 );

        # Minor patch: do not include the length of any '!'.
        # Otherwise, commas in the following line will not
        # match
        #  ok( 20, tapprox( ( pdl 2, 3 ), ( pdl 2, 3 ) ) );
        #  ok( 21, !tapprox( ( pdl 2, 3 ), ( pdl 2, 4 ) ) );
        if ( grep { $_ eq '!' } @types_to_go[ $i_start .. $i - 1 ] ) {
            $len -= 1;
        }

        if ( $i_start == $ibeg ) {

            # For first token, use distance from start of
            # line but subtract off the indentation due to
            # level. Otherwise, results could vary with
            # indentation.
            $len +=
              leading_spaces_to_go($ibeg) -
              $levels_to_go[$i_start] * $rOpts_indent_columns;
        }
        if ( $len < 0 ) { $len = 0 }
        return $len;
    } ## end sub length_tag

} ## end closure make_alignment_patterns

sub make_paren_name {
    my ( $self, $i ) = @_;

    # The token at index $i is a '('.
    # Create an alignment name for it to avoid incorrect alignments.

    # Start with the name of the previous nonblank token...
    my $name = EMPTY_STRING;
    my $im   = $i - 1;
    return EMPTY_STRING if ( $im < 0 );
    if ( $types_to_go[$im] eq 'b' ) { $im--; }
    return EMPTY_STRING if ( $im < 0 );
    $name = $tokens_to_go[$im];

    # Prepend any sub name to an isolated -> to avoid unwanted alignments
    # [test case is test8/penco.pl]
    if ( $name eq '->' ) {
        $im--;
        if ( $im >= 0 && $types_to_go[$im] ne 'b' ) {
            $name = $tokens_to_go[$im] .
              $name;
        }
    }

    # Finally, remove any leading arrows
    if ( substr( $name, 0, 2 ) eq '->' ) {
        $name = substr( $name, 2 );
    }
    return $name;
} ## end sub make_paren_name

{ ## begin closure get_final_indentation

    my ( $last_indentation_written, $last_unadjusted_indentation,
        $last_leading_token );

    sub initialize_get_final_indentation {
        $last_indentation_written    = 0;
        $last_unadjusted_indentation = 0;
        $last_leading_token          = EMPTY_STRING;
        return;
    } ## end sub initialize_get_final_indentation

    sub get_final_indentation {

        my (
            $self,    #

            $ibeg,
            $iend,
            $rfields,
            $rpatterns,
            $ri_first,
            $ri_last,
            $rindentation_list,
            $level_jump,
            $starting_in_quote,
            $is_static_block_comment,

        ) = @_;

        #--------------------------------------------------------------
        # This routine makes any necessary adjustments to get the final
        # indentation of a line in the Formatter.
        #--------------------------------------------------------------

        # It starts with the basic indentation which has been defined for the
        # leading token, and then takes into account any options that the user
        # has set regarding special indenting and outdenting.

        # This routine has to resolve a number of complex interacting issues,
        # including:
        # 1. The various -cti=n type flags, which contain the desired change in
        #    indentation for lines ending in commas and semicolons, should be
        #    followed,
        # 2. qw quotes require special processing and do not fit perfectly
        #    with normal containers,
        # 3. formatting with -wn can complicate things, especially with qw
        #    quotes,
        # 4. formatting with the -lp option is complicated, and does not
        #    work well with qw quotes and with -wn formatting.
        # 5. a number of special situations, such as 'cuddled' formatting.
        # 6. This routine is mainly concerned with outdenting closing tokens
        #    but note that there is some overlap with the functions of sub
        #    undo_ci, which was processed earlier, so care has to be taken to
        #    keep them coordinated.
        # Find the last code token of this line
        my $i_terminal    = $iend;
        my $terminal_type = $types_to_go[$iend];
        if ( $terminal_type eq '#' && $i_terminal > $ibeg ) {
            $i_terminal -= 1;
            $terminal_type = $types_to_go[$i_terminal];
            if ( $terminal_type eq 'b' && $i_terminal > $ibeg ) {
                $i_terminal -= 1;
                $terminal_type = $types_to_go[$i_terminal];
            }
        }

        my $is_outdented_line;

        my $type_beg            = $types_to_go[$ibeg];
        my $token_beg           = $tokens_to_go[$ibeg];
        my $level_beg           = $levels_to_go[$ibeg];
        my $block_type_beg      = $block_type_to_go[$ibeg];
        my $leading_spaces_beg  = $leading_spaces_to_go[$ibeg];
        my $seqno_beg           = $type_sequence_to_go[$ibeg];
        my $is_closing_type_beg = $is_closing_type{$type_beg};

        # QW INDENTATION PATCH 3:
        my $seqno_qw_closing;
        if ( $type_beg eq 'q' && $ibeg == 0 ) {
            my $KK = $K_to_go[$ibeg];
            $seqno_qw_closing =
              $self->[_rending_multiline_qw_seqno_by_K_]->{$KK};
        }

        my $is_semicolon_terminated = $terminal_type eq ';'
          && ( $nesting_depth_to_go[$iend] < $nesting_depth_to_go[$ibeg]
            || $seqno_qw_closing );

        # NOTE: A future improvement would be to make it semicolon terminated
        # even if it does not have a semicolon but is followed by a closing
        # block brace. This would undo ci even for something like the
        # following, in which the final paren does not have a semicolon because
        # it is a possible weld location:

        # if ($BOLD_MATH) {
        #     (
        #         $labels, $comment,
        #         join( '', '', &make_math( $mode, '', '', $_ ), '' )
        #     )
        # }
        #

        # MOJO patch: Set a flag if this line begins with ')->'
        my $leading_paren_arrow = (
                 $is_closing_type_beg
              && $token_beg eq ')'
              && (
                ( $ibeg < $i_terminal && $types_to_go[ $ibeg + 1 ] eq '->' )
                || (   $ibeg < $i_terminal - 1
                    && $types_to_go[ $ibeg + 1 ] eq 'b'
                    && $types_to_go[ $ibeg + 2 ] eq '->' )
              )
        );

        #---------------------------------------------------------
        # Section 1: set a flag and a default indentation
        #
        # Most lines are indented according to the initial token.
        # But it is common to outdent to the level just after the
        # terminal token in certain cases...
        # adjust_indentation flag:
        #       0 - do not adjust
        #       1 - outdent
        #       2 - vertically align with opening token
        #       3 - indent
        #---------------------------------------------------------

        my $adjust_indentation         = 0;
        my $default_adjust_indentation = 0;

        # Parameters needed for option 2, aligning with opening token:
        my (
            $opening_indentation, $opening_offset,
            $is_leading,          $opening_exists
        );

        #-------------------------------------
        # Section 1A:
        # if line starts with a sequenced item
        #-------------------------------------
        if ( $seqno_beg || $seqno_qw_closing ) {

            # This can be tedious so we let a sub do it
            (
                $adjust_indentation,
                $default_adjust_indentation,
                $opening_indentation,
                $opening_offset,
                $is_leading,
                $opening_exists,

            ) = $self->get_closing_token_indentation(

                $ibeg,
                $iend,
                $ri_first,
                $ri_last,
                $rindentation_list,
                $level_jump,
                $i_terminal,
                $is_semicolon_terminated,
                $seqno_qw_closing,

            );
        }

        #--------------------------------------------------------
        # Section 1B:
        # if at ');', '};', '>;', and '];' of a terminal qw quote
        #--------------------------------------------------------
        elsif (
               substr( $rpatterns->[0], 0, 2 ) eq 'qb'
            && substr( $rfields->[0], -1, 1 ) eq ';'
            ##         $rpatterns->[0] =~ /^qb*;$/
            && $rfields->[0] =~ /^([\)\}\]\>]);$/
          )
        {
            if ( $closing_token_indentation{$1} == 0 ) {
                $adjust_indentation = 1;
            }
            else {
                $adjust_indentation = 3;
            }
        }

        #---------------------------------------------------------
        # Section 2: set indentation according to flag set above
        #
        # Select the indentation object to define leading
        # whitespace. If we are outdenting something like '} } );'
        # then we want to use one level below the last token
        # ($i_terminal) in order to get it to fully outdent through
        # all levels.
        #---------------------------------------------------------
        my $indentation;
        my $lev;
        my $level_end = $levels_to_go[$iend];

        #------------------------------------
        # Section 2A: adjust_indentation == 0
        # No change in indentation
        #------------------------------------
        if ( $adjust_indentation == 0 ) {
            $indentation = $leading_spaces_beg;
            $lev         = $level_beg;
        }

        #-------------------------------------------------------------------
        # Section 2B: adjust_indentation == 1
        # Change the indentation to be that of a different token on the line
        #-------------------------------------------------------------------
        elsif ( $adjust_indentation == 1 ) {

            # Previously, the indentation of the terminal token was used:
            # OLD CODING:
            # $indentation = $reduced_spaces_to_go[$i_terminal];
            # $lev         = $levels_to_go[$i_terminal];

            # Generalization for MOJO patch:
            # Use the lowest level indentation of the tokens on the line.
            # For example, here we can use the indentation of the ending ';':
            #    } until ($selection > 0 and $selection < 10);   # ok to use ';'
            # But this will not outdent if we use the terminal indentation:
            #    )->then( sub {      # use indentation of the ->, not the {
            # Warning: reduced_spaces_to_go[] may be a reference, do not
            # do numerical checks with it

            my $i_ind = $ibeg;
            $indentation = $reduced_spaces_to_go[$i_ind];
            $lev         = $levels_to_go[$i_ind];
            while ( $i_ind < $i_terminal ) {
                $i_ind++;
                if ( $levels_to_go[$i_ind] < $lev ) {
                    $indentation = $reduced_spaces_to_go[$i_ind];
                    $lev         = $levels_to_go[$i_ind];
                }
            }
        }

        #--------------------------------------------------------------
        # Section 2C: adjust_indentation == 2
        # Handle indented closing token which aligns with opening token
        #--------------------------------------------------------------
        elsif ( $adjust_indentation == 2 ) {

            # handle option to align closing token with opening token
            $lev = $level_beg;

            # calculate spaces needed to align with opening token
            my $space_count =
              get_spaces($opening_indentation) + $opening_offset;

            # Indent less than the previous
            # line.
            #
            # Problem: For -lp we don't exactly know what it was if there
            # were recoverable spaces sent to the aligner. A good solution
            # would be to force a flush of the vertical alignment buffer, so
            # that we would know. For now, this rule is used for -lp:
            #
            # When the last line did not start with a closing token we will
            # be optimistic that the aligner will recover everything wanted.
            #
            # This rule will prevent us from breaking a hierarchy of closing
            # tokens, and in a worst case will leave a closing paren too far
            # indented, but this is better than frequently leaving it not
            # indented enough.
            my $last_spaces = get_spaces($last_indentation_written);

            if ( ref($last_indentation_written)
                && !$is_closing_token{$last_leading_token} )
            {
                $last_spaces +=
                  get_recoverable_spaces($last_indentation_written);
            }

            # reset the indentation to the new space count if it works
            # only options are all or none: nothing in-between looks good
            $lev = $level_beg;

            my $diff = $last_spaces - $space_count;
            if ( $diff > 0 ) {
                $indentation = $space_count;
            }
            else {

                # We need to fix things ... but there is no good way to do it.
                # The best solution is for the user to use a longer maximum
                # line length. We could get a smooth variation if we just move
                # the paren in using
                #    $space_count -= ( 1 - $diff );
                # But unfortunately this can give a rather unbalanced look.

                # For -xlp we currently allow a tolerance of one indentation
                # level and then revert to a simpler default. This will jump
                # suddenly but keeps a balanced look.
                if (   $rOpts_extended_line_up_parentheses
                    && $diff >= -$rOpts_indent_columns
                    && $space_count > $leading_spaces_beg )
                {
                    $indentation = $space_count;
                }

                # Otherwise revert to defaults
                elsif ( $default_adjust_indentation == 0 ) {
                    $indentation = $leading_spaces_beg;
                }
                elsif ( $default_adjust_indentation == 1 ) {
                    $indentation = $reduced_spaces_to_go[$i_terminal];
                    $lev         = $levels_to_go[$i_terminal];
                }
            }
        }

        #-------------------------------------------------------------
        # Section 2D: adjust_indentation == 3
        # Full indentation of closing tokens (-icb and -icp or -cti=2)
        #-------------------------------------------------------------
        else {

            # handle -icb (indented closing code block braces)
            # Updated method for indented block braces: indent one full level if
            # there is no continuation indentation. This will occur for major
            # structures such as sub, if, else, but not for things like map
            # blocks.
            #
            # Note: only code blocks without continuation indentation are
            # handled here (if, else, unless, ..). In the following snippet,
            # the terminal brace of the sort block will have continuation
            # indentation as shown so it will not be handled by the coding
            # here. We would have to undo the continuation indentation to do
            # this, but it probably looks ok as is. This is a possible future
            # update for semicolon terminated lines.
            #
            #     if ($sortby eq 'date' or $sortby eq 'size') {
            #         @files = sort {
            #             $file_data{$a}{$sortby} <=> $file_data{$b}{$sortby}
            #                 or $a cmp $b
            #                 } @files;
            #         }
            #
            if (   $block_type_beg
                && $ci_levels_to_go[$i_terminal] == 0 )
            {
                my $spaces = get_spaces( $leading_spaces_to_go[$i_terminal] );
                $indentation = $spaces + $rOpts_indent_columns;

                # NOTE: for -lp we could create a new indentation object, but
                # there is probably no need to do it
            }

            # handle -icp and any -icb block braces which fall through above
            # test such as the 'sort' block mentioned above.
            else {

                # There are currently two ways to handle -icp...
                # One way is to use the indentation of the previous line:
                # $indentation = $last_indentation_written;

                # The other way is to use the indentation that the previous line
                # would have had if it hadn't been adjusted:
                $indentation = $last_unadjusted_indentation;

                # Current method: use the minimum of the two. This avoids
                # inconsistent indentation.
                if ( get_spaces($last_indentation_written) <
                    get_spaces($indentation) )
                {
                    $indentation = $last_indentation_written;
                }
            }

            # use previous indentation but use own level
            # to cause list to be flushed properly
            $lev = $level_beg;
        }

        #-------------------------------------------------------------
        # Remember indentation except for multi-line quotes, which get
        # no indentation
        #-------------------------------------------------------------
        if ( !( $ibeg == 0 && $starting_in_quote ) ) {
            $last_indentation_written    = $indentation;
            $last_unadjusted_indentation = $leading_spaces_beg;
            $last_leading_token          = $token_beg;

            # Patch to make a line which is the end of a qw quote work with the
            # -lp option. Make $token_beg look like a closing token of some
            # type even if it is not. This variable will become
            # $last_leading_token at the end of this loop. Then, if the -lp
            # style is selected, and the next line is also a
            # closing token, it will not get more indentation than this line.
            # We need to do this because qw quotes (at present) only get
            # continuation indentation, not one level of indentation, so we
            # need to turn off the -lp indentation.

            # ...
            # a picture is worth a thousand words:

            # perltidy -wn -gnu (Without this patch):
            #   ok(defined(
            #   $seqio = $gb->get_Stream_by_batch([qw(J00522 AF303112
            #   2981014)])
            #   ));

            # perltidy -wn -gnu (With this patch):
            #   ok(defined(
            #   $seqio = $gb->get_Stream_by_batch([qw(J00522 AF303112
            #   2981014)])
            #   ));
            if ( $seqno_qw_closing
                && ( length($token_beg) > 1 || $token_beg eq '>' ) )
            {
                $last_leading_token = ')';
            }
        }

        #---------------------------------------------------------------------
        # Rule: lines with leading closing tokens should not be outdented more
        # than the line which contained the corresponding opening token.
        #---------------------------------------------------------------------

        # Updated per bug report in alex_bug.pl: we must not
        # mess with the indentation of closing logical braces, so
        # we must treat something like '} else {' as if it were
        # an isolated brace
        my $is_isolated_block_brace = $block_type_beg
          && ( $i_terminal == $ibeg
            || $is_if_elsif_else_unless_while_until_for_foreach{$block_type_beg}
          );

        # only do this for a ':' which is aligned with its leading '?'
        my $is_unaligned_colon = $type_beg eq ':' && !$is_leading;

        if (
            defined($opening_indentation)
            && !$leading_paren_arrow    # MOJO patch
            && !$is_isolated_block_brace
            && !$is_unaligned_colon
          )
        {
            if ( get_spaces($opening_indentation) > get_spaces($indentation) ) {
                $indentation = $opening_indentation;
            }
        }

        #----------------------------------------------------
        # remember the indentation of each line of this batch
        #----------------------------------------------------
        push @{$rindentation_list}, $indentation;

        #---------------------------------------------
        # outdent lines with certain leading tokens...
        #---------------------------------------------
        if (

            # must be first word of this batch
            $ibeg == 0

            # and ...
            && (

                # certain leading keywords if requested
                $rOpts_outdent_keywords
                && $type_beg eq 'k'
                && $outdent_keyword{$token_beg}

                # or labels if requested
                || $rOpts_outdent_labels && $type_beg eq 'J'

                # or static block comments if requested
                || $is_static_block_comment
                && $rOpts_outdent_static_block_comments
            )
          )
        {
            my $space_count = leading_spaces_to_go($ibeg);
            if ( $space_count > 0 ) {
                $space_count -= $rOpts_continuation_indentation;
                $is_outdented_line = 1;
                if ( $space_count < 0 ) { $space_count = 0 }

                # do not promote a spaced static block comment to non-spaced;
                # this is not normally necessary but could be for some
                # unusual user inputs (such as -ci = -i)
                if ( $type_beg eq '#' && $space_count == 0 ) {
                    $space_count = 1;
                }

                $indentation = $space_count;
            }
        }

        return (

            $indentation,
            $lev,
            $level_end,
            $i_terminal,
            $is_outdented_line,

        );
    } ## end sub get_final_indentation

    sub get_closing_token_indentation {

        # Determine indentation adjustment for a line with a leading closing
        # token - i.e. one of these: ) ] } :

        my (
            $self,    #

            $ibeg,
            $iend,
            $ri_first,
            $ri_last,
            $rindentation_list,
            $level_jump,
            $i_terminal,
            $is_semicolon_terminated,
            $seqno_qw_closing,

        ) = @_;

        my $adjust_indentation         = 0;
        my $default_adjust_indentation = $adjust_indentation;
        my $terminal_type              = $types_to_go[$i_terminal];

        my $type_beg            = $types_to_go[$ibeg];
        my $token_beg           = $tokens_to_go[$ibeg];
        my $level_beg           = $levels_to_go[$ibeg];
        my $block_type_beg      = $block_type_to_go[$ibeg];
        my $leading_spaces_beg  = $leading_spaces_to_go[$ibeg];
        my $seqno_beg           = $type_sequence_to_go[$ibeg];
        my $is_closing_type_beg = $is_closing_type{$type_beg};

        my (
            $opening_indentation, $opening_offset,
            $is_leading,          $opening_exists
        );

        # Honor any flag to reduce -ci set by the -bbxi=n option
        if ( $seqno_beg && $self->[_rwant_reduced_ci_]->{$seqno_beg} ) {

            # if this is an opening, it must be alone on the line ...
            if ( $is_closing_type{$type_beg} || $ibeg == $i_terminal ) {
                $adjust_indentation = 1;
            }

            # ...
            # or a single welded unit (fix for b1173)
            elsif ($total_weld_count) {
                my $K_beg      = $K_to_go[$ibeg];
                my $Kterm      = $K_to_go[$i_terminal];
                my $Kterm_test = $self->[_rK_weld_left_]->{$Kterm};
                if ( defined($Kterm_test) && $Kterm_test >= $K_beg ) {
                    $Kterm = $Kterm_test;
                }
                if ( $Kterm == $K_beg ) { $adjust_indentation = 1 }
            }
        }

        my $ris_bli_container = $self->[_ris_bli_container_];
        my $is_bli_beg = $seqno_beg ? $ris_bli_container->{$seqno_beg} : 0;

        # Update the $is_bli flag as we go. It is initially 1.
        # We note seeing a leading opening brace by setting it to 2.
        # If we get to the closing brace without seeing the opening then we
        # turn it off. This occurs if the opening brace did not get output
        # at the start of a line, so we will then indent the closing brace
        # in the default way.
        if ( $is_bli_beg && $is_bli_beg == 1 ) {
            my $K_opening_container = $self->[_K_opening_container_];
            my $K_opening           = $K_opening_container->{$seqno_beg};
            my $K_beg               = $K_to_go[$ibeg];
            if ( $K_beg eq $K_opening ) {
                $ris_bli_container->{$seqno_beg} = $is_bli_beg = 2;
            }
            else { $is_bli_beg = 0 }
        }

        # QW PATCH for the combination -lp -wn
        # For -lp formatting use $ibeg_weld_fix to get around the problem
        # that with -lp type formatting the opening and closing tokens do not
        # have sequence numbers.
        my $ibeg_weld_fix = $ibeg;
        if ( $seqno_qw_closing && $total_weld_count ) {
            my $i_plus = $inext_to_go[$ibeg];
            if ( $i_plus <= $max_index_to_go ) {
                my $K_plus = $K_to_go[$i_plus];
                if ( defined( $self->[_rK_weld_left_]->{$K_plus} ) ) {
                    $ibeg_weld_fix = $i_plus;
                }
            }
        }

        # if we are at a closing token of some type..
        if ( $is_closing_type_beg || $seqno_qw_closing ) {

            my $K_beg = $K_to_go[$ibeg];

            # get the indentation of the line containing the corresponding
            # opening token
            (
                $opening_indentation, $opening_offset,
                $is_leading,          $opening_exists
              )
              = $self->get_opening_indentation( $ibeg_weld_fix, $ri_first,
                $ri_last, $rindentation_list, $seqno_qw_closing );

            # Patch for rt144979, part 1. Coordinated with part 2.
# Do not undo ci for a cuddled closing brace control; it # needs to be treated exactly the same ci as an isolated # closing brace. my $is_cuddled_closing_brace = $seqno_beg && $self->[_ris_cuddled_closing_brace_]->{$seqno_beg}; # First set the default behavior: if ( # default behavior is to outdent closing lines # of the form: "); }; ]; )->xxx;" $is_semicolon_terminated # and 'cuddled parens' of the form: ")->pack(". Bug fix for RT # #123749]: the TYPES here were incorrectly ')' and '('. The # corrected TYPES are '}' and '{'. But skip a cuddled block. || ( $terminal_type eq '{' && $type_beg eq '}' && ( $nesting_depth_to_go[$iend] + 1 == $nesting_depth_to_go[$ibeg] ) && !$is_cuddled_closing_brace ) # remove continuation indentation for any line like # } ... { # or without ending '{' and unbalanced, such as # such as '}->{$operator}' || ( $type_beg eq '}' && ( $types_to_go[$iend] eq '{' || $levels_to_go[$iend] < $level_beg ) # but not if a cuddled block && !$is_cuddled_closing_brace ) # and when the next line is at a lower indentation level... # PATCH #1: and only if the style allows undoing continuation # for all closing token types. We should really wait until # the indentation of the next line is known and then make # a decision, but that would require another pass. # PATCH #2: and not if this token is under -xci control || ( $level_jump < 0 && !$some_closing_token_indentation && !$self->[_rseqno_controlling_my_ci_]->{$K_beg} ) # Patch for -wn=2, multiple welded closing tokens || ( $i_terminal > $ibeg && $is_closing_type{ $types_to_go[$iend] } ) # Alternate Patch for git #51, isolated closing qw token not # outdented if no-delete-old-newlines is set. This works, but # a more general patch elsewhere fixes the real problem: ljump. 
# || ( $seqno_qw_closing && $ibeg == $i_terminal ) ) { $adjust_indentation = 1; } # outdent something like '),' if ( $terminal_type eq ',' # Removed this constraint for -wn # OLD: allow just one character before the comma # && $i_terminal == $ibeg + 1 # require LIST environment; otherwise, we may outdent too much - # this can happen in calls without parentheses (overload.t); && $self->is_in_list_by_i($i_terminal) ) { $adjust_indentation = 1; } # undo continuation indentation of a terminal closing token if # it is the last token before a level decrease. This will allow # a closing token to line up with its opening counterpart, and # avoids an indentation jump larger than 1 level. my $rLL = $self->[_rLL_]; my $Klimit = $self->[_Klimit_]; if ( $i_terminal == $ibeg && $is_closing_type_beg && defined($K_beg) && $K_beg < $Klimit ) { my $K_plus = $K_beg + 1; my $type_plus = $rLL->[$K_plus]->[_TYPE_]; if ( $type_plus eq 'b' && $K_plus < $Klimit ) { $type_plus = $rLL->[ ++$K_plus ]->[_TYPE_]; } if ( $type_plus eq '#' && $K_plus < $Klimit ) { $type_plus = $rLL->[ ++$K_plus ]->[_TYPE_]; if ( $type_plus eq 'b' && $K_plus < $Klimit ) { $type_plus = $rLL->[ ++$K_plus ]->[_TYPE_]; } # Note: we have skipped past just one comment (perhaps a # side comment). There could be more, and we could easily # skip past all the rest with the following code, or with a # while loop. It would be rare to have to do this, and # those block comments would still be indented, so it would # to leave them indented. So it seems best to just stop at # a maximum of one comment. ##if ($type_plus eq '#') { ## $K_plus = $self->K_next_code($K_plus); ##} } if ( !$is_bli_beg && defined($K_plus) ) { my $lev = $level_beg; my $level_next = $rLL->[$K_plus]->[_LEVEL_]; # and do not undo ci if it was set by the -xci option $adjust_indentation = 1 if ( $level_next < $lev && !$self->[_rseqno_controlling_my_ci_]->{$K_beg} ); } # Patch for RT #96101, in which closing brace of anonymous subs # was not outdented. 
We should look ahead and see if there is # a level decrease at the next token (i.e., a closing token), # but right now we do not have that information. For now # we see if we are in a list, and this works well. # See test files 'sub*.t' for good test cases. if ( !$rOpts_indent_closing_brace && $block_type_beg && $self->[_ris_asub_block_]->{$seqno_beg} && $self->is_in_list_by_i($i_terminal) ) { ( $opening_indentation, $opening_offset, $is_leading, $opening_exists ) = $self->get_opening_indentation( $ibeg, $ri_first, $ri_last, $rindentation_list ); my $indentation = $leading_spaces_beg; if ( defined($opening_indentation) && get_spaces($indentation) > get_spaces($opening_indentation) ) { $adjust_indentation = 1; } } } # YVES patch 1 of 2: # Undo ci of line with leading closing eval brace, # but not beyond the indentation of the line with # the opening brace. if ( $block_type_beg && $block_type_beg eq 'eval' && !ref($leading_spaces_beg) && !$rOpts_indent_closing_brace ) { ( $opening_indentation, $opening_offset, $is_leading, $opening_exists ) = $self->get_opening_indentation( $ibeg, $ri_first, $ri_last, $rindentation_list ); my $indentation = $leading_spaces_beg; if ( defined($opening_indentation) && get_spaces($indentation) > get_spaces($opening_indentation) ) { $adjust_indentation = 1; } } # patch for issue git #40: -bli setting has priority $adjust_indentation = 0 if ($is_bli_beg); $default_adjust_indentation = $adjust_indentation; # Now modify default behavior according to user request: # handle option to indent non-blocks of the form ); }; ]; # But don't do special indentation to something like ')->pack(' if ( !$block_type_beg ) { # Note that logical padding has already been applied, so we may # need to remove some spaces to get a valid hash key. my $tok = $token_beg; my $cti = $closing_token_indentation{$tok}; # Fix the value of 'cti' for an isolated non-welded closing qw # delimiter. 
                if ( $seqno_qw_closing && $ibeg_weld_fix == $ibeg ) {

                    # A quote delimiter which is not a container will not have
                    # a cti value defined.  In this case use the style of a
                    # paren. For example
                    #   my @fars = (
                    #      qw<
                    #        far
                    #        farfar
                    #        farfars-far
                    #      >,
                    #   );
                    if ( !defined($cti) && length($tok) == 1 ) {

                        # something other than ')', '}', ']' ; use flag for ')'
                        $cti = $closing_token_indentation{')'};

                        # But for now, do not outdent non-container qw
                        # delimiters because it would change existing
                        # formatting.
                        if ( $tok ne '>' ) { $cti = 3 }
                    }

                    # A non-welded closing qw cannot currently use -cti=1
                    # because that option requires a sequence number to find
                    # the opening indentation, and qw quote delimiters are not
                    # sequenced items.
                    if ( defined($cti) && $cti == 1 ) { $cti = 0 }
                }

                if ( !defined($cti) ) {

                    # $cti may not be defined for several reasons.
                    # - padding may have been applied so the character
                    #   has a length > 1
                    # - we may have welded to a closing quote token.
                    #   Here is an example (perltidy -wn):
                    #       __PACKAGE__->load_components( qw(
                    #  >         Core
                    #  >
                    #  >     ) );
                    $adjust_indentation = 0;

                }
                elsif ( $cti == 1 ) {
                    if (   $i_terminal <= $ibeg + 1
                        || $is_semicolon_terminated )
                    {
                        $adjust_indentation = 2;
                    }
                    else {
                        $adjust_indentation = 0;
                    }
                }
                elsif ( $cti == 2 ) {
                    if ($is_semicolon_terminated) {
                        $adjust_indentation = 3;
                    }
                    else {
                        $adjust_indentation = 0;
                    }
                }
                elsif ( $cti == 3 ) {
                    $adjust_indentation = 3;
                }
            }

            # handle option to indent blocks
            else {
                if (
                    $rOpts_indent_closing_brace
                    && (
                        $i_terminal == $ibeg    #  isolated terminal '}'
                        || $is_semicolon_terminated
                    )
                  )                             #  } xxxx ;
                {
                    $adjust_indentation = 3;
                }
            }
        } ## end if ( $is_closing_type_beg || $seqno_qw_closing )

        # if line begins with a ':', align it with any
        # previous line leading with corresponding ?
elsif ( $type_beg eq ':' ) { ( $opening_indentation, $opening_offset, $is_leading, $opening_exists ) = $self->get_opening_indentation( $ibeg, $ri_first, $ri_last, $rindentation_list ); if ($is_leading) { $adjust_indentation = 2; } } return ( $adjust_indentation, $default_adjust_indentation, $opening_indentation, $opening_offset, $is_leading, $opening_exists, ); } ## end sub get_closing_token_indentation } ## end closure get_final_indentation sub get_opening_indentation { # get the indentation of the line which output the opening token # corresponding to a given closing token in the current output batch. # # given: # $i_closing - index in this line of a closing token ')' '}' or ']' # # $ri_first - reference to list of the first index $i for each output # line in this batch # $ri_last - reference to list of the last index $i for each output line # in this batch # $rindentation_list - reference to a list containing the indentation # used for each line. # $qw_seqno - optional sequence number to use if normal seqno not defined # (NOTE: would be more general to just look this up from index i) # # return: # -the indentation of the line which contained the opening token # which matches the token at index $i_opening # -and its offset (number of columns) from the start of the line # my ( $self, $i_closing, $ri_first, $ri_last, $rindentation_list, $qw_seqno ) = @_; # first, see if the opening token is in the current batch my $i_opening = $mate_index_to_go[$i_closing]; my ( $indent, $offset, $is_leading, $exists ); $exists = 1; if ( defined($i_opening) && $i_opening >= 0 ) { # it is..look up the indentation ( $indent, $offset, $is_leading ) = lookup_opening_indentation( $i_opening, $ri_first, $ri_last, $rindentation_list ); } # if not, it should have been stored in the hash by a previous batch else { my $seqno = $type_sequence_to_go[$i_closing]; $seqno = $qw_seqno unless ($seqno); ( $indent, $offset, $is_leading, $exists ) = get_saved_opening_indentation($seqno); } return ( 
$indent, $offset, $is_leading, $exists ); } ## end sub get_opening_indentation sub examine_vertical_tightness_flags { my ($self) = @_; # For efficiency, we will set a flag to skip all calls to sub # 'set_vertical_tightness_flags' if vertical tightness is not possible with # the user input parameters. If vertical tightness is possible, we will # simply leave the flag undefined and return. # Vertical tightness is never possible with --freeze-whitespace if ($rOpts_freeze_whitespace) { $self->[_no_vertical_tightness_flags_] = 1; return; } # This sub is coordinated with sub set_vertical_tightness_flags. # The Section numbers in the following comments are the sections # in sub set_vertical_tightness_flags: # Examine controls for Section 1a: return if ($rOpts_line_up_parentheses); foreach my $key ( keys %opening_vertical_tightness ) { return if ( $opening_vertical_tightness{$key} ); } # Examine controls for Section 1b: foreach my $key ( keys %closing_vertical_tightness ) { return if ( $closing_vertical_tightness{$key} ); } # Examine controls for Section 1c: foreach my $key ( keys %opening_token_right ) { return if ( $opening_token_right{$key} ); } # Examine controls for Section 1d: foreach my $key ( keys %stack_opening_token ) { return if ( $stack_opening_token{$key} ); } foreach my $key ( keys %stack_closing_token ) { return if ( $stack_closing_token{$key} ); } # Examine controls for Section 2: return if ($rOpts_block_brace_vertical_tightness); # Examine controls for Section 3: return if ($rOpts_stack_closing_block_brace); # None of the controls used for vertical tightness are set, so # we can skip all calls to sub set_vertical_tightness_flags $self->[_no_vertical_tightness_flags_] = 1; return; } ## end sub examine_vertical_tightness_flags sub set_vertical_tightness_flags { my ( $self, $n, $n_last_line, $ibeg, $iend, $ri_first, $ri_last, $ending_in_quote, $closing_side_comment ) = @_; # Define vertical tightness controls for the nth line of a batch. 
    # Note: do not call this sub for a block comment or if
    # $rOpts_freeze_whitespace is set.

    # These parameters are passed to the vertical aligner to indicate
    # whether we should combine this line with the next line to achieve the
    # desired vertical tightness.  This was previously an array but
    # has been converted to a hash:

    # old   hash              Meaning
    # index key
    #
    # 0   _vt_type:           1=opening non-block    2=closing non-block
    #                         3=opening block brace  4=closing block brace
    #
    # 1a  _vt_opening_flag:   1=no multiple steps, 2=multiple steps ok
    # 1b  _vt_closing_flag:   spaces of padding to use if closing
    # 2   _vt_seqno:          sequence number of container
    # 3   _vt_valid_flag:     do not append if this flag is false. Will be
    #                         true if appropriate -vt flag is set.  Otherwise,
    #                         will be made true only for 2 line container in
    #                         parens with -lp
    # 4   _vt_seqno_beg:      sequence number of first token of line
    # 5   _vt_seqno_end:      sequence number of last token of line
    # 6   _vt_min_lines:      min number of lines for joining opening cache,
    #                           0=no constraint
    # 7   _vt_max_lines:      max number of lines for joining opening cache,
    #                           0=no constraint

    # The vertical tightness mechanism can add whitespace, so whitespace can
    # continually increase if we allowed it when the -fws flag is set.
    # See case b499 for an example.

    # Define these values...
    my $vt_type         = 0;
    my $vt_opening_flag = 0;
    my $vt_closing_flag = 0;
    my $vt_seqno        = 0;
    my $vt_valid_flag   = 0;
    my $vt_seqno_beg    = 0;
    my $vt_seqno_end    = 0;
    my $vt_min_lines    = 0;
    my $vt_max_lines    = 0;

    # Uses these global parameters:
    #   $rOpts_block_brace_tightness
    #   $rOpts_block_brace_vertical_tightness
    #   $rOpts_stack_closing_block_brace
    #   $rOpts_line_up_parentheses
    #   %opening_vertical_tightness
    #   %closing_vertical_tightness
    #   %opening_token_right
    #   %stack_closing_token
    #   %stack_opening_token

    #--------------------------------------------------------------
    # Vertical Tightness Flags Section 1:
    # Handle Lines 1 ..
n-1 but not the last line # For non-BLOCK tokens, we will need to examine the next line # too, so we won't consider the last line. #-------------------------------------------------------------- if ( $n < $n_last_line ) { #-------------------------------------------------------------- # Vertical Tightness Flags Section 1a: # Look for Type 1, last token of this line is a non-block opening token #-------------------------------------------------------------- my $ibeg_next = $ri_first->[ $n + 1 ]; my $token_end = $tokens_to_go[$iend]; my $iend_next = $ri_last->[ $n + 1 ]; if ( $type_sequence_to_go[$iend] && !$block_type_to_go[$iend] && $is_opening_token{$token_end} && ( $opening_vertical_tightness{$token_end} > 0 # allow 2-line method call to be closed up || ( $rOpts_line_up_parentheses && $token_end eq '(' && $self->[_rlp_object_by_seqno_] ->{ $type_sequence_to_go[$iend] } && $iend > $ibeg && $types_to_go[ $iend - 1 ] ne 'b' ) ) ) { # avoid multiple jumps in nesting depth in one line if # requested my $ovt = $opening_vertical_tightness{$token_end}; # Turn off the -vt flag if the next line ends in a weld. # This avoids an instability with one-line welds (fixes b1183). my $type_end_next = $types_to_go[$iend_next]; $ovt = 0 if ( $self->[_rK_weld_left_]->{ $K_to_go[$iend_next] } && $is_closing_type{$type_end_next} ); # The flag '_rbreak_container_' avoids conflict of -bom and -pt=1 # or -pt=2; fixes b1270. See similar patch above for $cvt. my $seqno = $type_sequence_to_go[$iend]; if ( $ovt && $seqno && $self->[_rbreak_container_]->{$seqno} ) { $ovt = 0; } # The flag '_rmax_vertical_tightness_' avoids welding conflicts. 
            if ( defined( $self->[_rmax_vertical_tightness_]->{$seqno} ) ) {
                $ovt =
                  min( $ovt, $self->[_rmax_vertical_tightness_]->{$seqno} );
            }

            unless (
                $ovt < 2
                && ( $nesting_depth_to_go[ $iend_next + 1 ] !=
                    $nesting_depth_to_go[$ibeg_next] )
              )
            {

                # If -vt flag has not been set, mark this as invalid
                # and aligner will validate it if it sees the closing paren
                # within 2 lines.
                my $valid_flag = $ovt;

                $vt_type         = 1;
                $vt_opening_flag = $ovt;
                $vt_seqno        = $type_sequence_to_go[$iend];
                $vt_valid_flag   = $valid_flag;
            }
        }

        #--------------------------------------------------------------
        # Vertical Tightness Flags Section 1b:
        # Look for Type 2, first token of next line is a non-block closing
        # token .. and be sure this line does not have a side comment
        #--------------------------------------------------------------
        my $token_next = $tokens_to_go[$ibeg_next];
        if (   $type_sequence_to_go[$ibeg_next]
            && !$block_type_to_go[$ibeg_next]
            && $is_closing_token{$token_next}
            && $types_to_go[$iend] ne '#' )    # for safety, shouldn't happen!
        {
            my $cvt = $closing_vertical_tightness{$token_next};

            # Avoid conflict of -bom and -pvt=1 or -pvt=2, fixes b977, b1303
            # See similar patch above for $ovt.
            my $seqno = $type_sequence_to_go[$ibeg_next];
            if ( $cvt && $self->[_rbreak_container_]->{$seqno} ) {
                $cvt = 0;
            }

            # Implement cvt=3: like cvt=0 for assigned structures, like cvt=1
            # otherwise.  Added for rt136417.
            if ( $cvt == 3 ) {
                $cvt = $self->[_ris_assigned_structure_]->{$seqno} ? 0 : 1;
            }

            # The unusual combination -pvtc=2 -dws -naws can be unstable.
            # This fixes b1282, b1283.  This can be moved to set_options.
            if (   $cvt == 2
                && $rOpts_delete_old_whitespace
                && !$rOpts_add_whitespace )
            {
                $cvt = 1;
            }

            # Fix for b1379, b1380, b1381, b1382, b1384 part 2,
            # instability with adding and deleting trailing commas:
            # Reducing -cvt=2 to =1 fixes stability for -wtc=b in b1379,1380.
            # Reducing -cvt>0 to =0 fixes stability for -wtc=b in b1381,1382.
# Reducing -cvt>0 to =0 fixes stability for -wtc=m in b1384 if ( $cvt && $self->[_ris_bare_trailing_comma_by_seqno_]->{$seqno} ) { $cvt = 0; } if ( # Never append a trailing line like ')->pack(' because it # will throw off later alignment. So this line must start at a # deeper level than the next line (fix1 for welding, git #45). ( $nesting_depth_to_go[$ibeg_next] >= $nesting_depth_to_go[ $iend_next + 1 ] + 1 ) && ( $cvt == 2 || ( !$self->is_in_list_by_i($ibeg_next) && ( $cvt == 1 # allow closing up 2-line method calls || ( $rOpts_line_up_parentheses && $token_next eq ')' && $type_sequence_to_go[$ibeg_next] && $self->[_rlp_object_by_seqno_] ->{ $type_sequence_to_go[$ibeg_next] } ) ) ) ) ) { # decide which trailing closing tokens to append.. my $ok = 0; if ( $cvt == 2 || $iend_next == $ibeg_next ) { $ok = 1 } else { my $str = join( EMPTY_STRING, @types_to_go[ $ibeg_next + 1 .. $ibeg_next + 2 ] ); # append closing token if followed by comment or ';' # or another closing token (fix2 for welding, git #45) if ( $str =~ /^b?[\)\]\}R#;]/ ) { $ok = 1 } } if ($ok) { my $valid_flag = $cvt; my $min_lines = 0; my $max_lines = 0; # Fix for b1187 and b1188: Blinking can occur if we allow # welded tokens to re-form into one-line blocks during # vertical alignment when -lp used. So for this case we # set the minimum number of lines to be 1 instead of 0. # The maximum should be 1 if -vtc is not used. If -vtc is # used, we turn the valid # flag off and set the maximum to 0. This is equivalent to # using a large number. my $seqno_ibeg_next = $type_sequence_to_go[$ibeg_next]; if ( $rOpts_line_up_parentheses && $total_weld_count && $seqno_ibeg_next && $self->[_rlp_object_by_seqno_]->{$seqno_ibeg_next} && $self->is_welded_at_seqno($seqno_ibeg_next) ) { $min_lines = 1; $max_lines = $cvt ? 0 : 1; $valid_flag = 0; } $vt_type = 2; $vt_closing_flag = $tightness{$token_next} == 2 ? 
0 : 1; $vt_seqno = $type_sequence_to_go[$ibeg_next]; $vt_valid_flag = $valid_flag; $vt_min_lines = $min_lines; $vt_max_lines = $max_lines; } } } #-------------------------------------------------------------- # Vertical Tightness Flags Section 1c: # Implement the Opening Token Right flag (Type 2).. # If requested, move an isolated trailing opening token to the end of # the previous line which ended in a comma. We could do this # in sub recombine_breakpoints but that would cause problems # with -lp formatting. The problem is that indentation will # quickly move far to the right in nested expressions. By # doing it after indentation has been set, we avoid changes # to the indentation. Actual movement of the token takes place # in sub valign_output_step_B. # Note added 4 May 2021: the man page suggests that the -otr flags # are mainly for opening tokens following commas. But this seems # to have been generalized long ago to include other situations. # I checked the coding back to 2012 and it is essentially the same # as here, so it is best to leave this unchanged for now. #-------------------------------------------------------------- if ( $opening_token_right{ $tokens_to_go[$ibeg_next] } # previous line is not opening # (use -sot to combine with it) && !$is_opening_token{$token_end} # previous line ended in one of these # (add other cases if necessary; '=>' and '.' are not necessary && !$block_type_to_go[$ibeg_next] # this is a line with just an opening token && ( $iend_next == $ibeg_next || $iend_next == $ibeg_next + 2 && $types_to_go[$iend_next] eq '#' ) # Fix for case b1060 when both -baoo and -otr are set: # to avoid blinking, honor the -baoo flag over the -otr flag. && $token_end ne '||' && $token_end ne '&&' # Keep break after '=' if -lp. Fixes b964 b1040 b1062 b1083 b1089. # Generalized from '=' to $is_assignment to fix b1375. 
&& !( $is_assignment{ $types_to_go[$iend] } && $rOpts_line_up_parentheses && $type_sequence_to_go[$ibeg_next] && $self->[_rlp_object_by_seqno_] ->{ $type_sequence_to_go[$ibeg_next] } ) # looks bad if we align vertically with the wrong container && $tokens_to_go[$ibeg] ne $tokens_to_go[$ibeg_next] # give -kba priority over -otr (b1445) && !$self->[_rbreak_after_Klast_]->{ $K_to_go[$iend] } ) { my $spaces = ( $types_to_go[ $ibeg_next - 1 ] eq 'b' ) ? 1 : 0; $vt_type = 2; $vt_closing_flag = $spaces; $vt_seqno = $type_sequence_to_go[$ibeg_next]; $vt_valid_flag = 1; } #-------------------------------------------------------------- # Vertical Tightness Flags Section 1d: # Stacking of opening and closing tokens (Type 2) #-------------------------------------------------------------- my $stackable; my $token_beg_next = $tokens_to_go[$ibeg_next]; # patch to make something like 'qw(' behave like an opening paren # (aran.t) if ( $types_to_go[$ibeg_next] eq 'q' ) { if ( $token_beg_next =~ /^qw\s*([\[\(\{])$/ ) { $token_beg_next = $1; } } if ( $is_closing_token{$token_end} && $is_closing_token{$token_beg_next} ) { # avoid instability of combo -bom and -sct; b1179 my $seq_next = $type_sequence_to_go[$ibeg_next]; $stackable = $stack_closing_token{$token_beg_next} unless ( $block_type_to_go[$ibeg_next] || $seq_next && $self->[_rbreak_container_]->{$seq_next} ); } elsif ($is_opening_token{$token_end} && $is_opening_token{$token_beg_next} ) { $stackable = $stack_opening_token{$token_beg_next} unless ( $block_type_to_go[$ibeg_next] ) ; # shouldn't happen; just checking } if ($stackable) { my $is_semicolon_terminated; if ( $n + 1 == $n_last_line ) { my ( $terminal_type, $i_terminal ) = terminal_type_i( $ibeg_next, $iend_next ); $is_semicolon_terminated = $terminal_type eq ';' && $nesting_depth_to_go[$iend_next] < $nesting_depth_to_go[$ibeg_next]; } # this must be a line with just an opening token # or end in a semicolon if ( $is_semicolon_terminated || ( $iend_next == $ibeg_next || 
$iend_next == $ibeg_next + 2 && $types_to_go[$iend_next] eq '#' ) ) { my $spaces = ( $types_to_go[ $ibeg_next - 1 ] eq 'b' ) ? 1 : 0; $vt_type = 2; $vt_closing_flag = $spaces; $vt_seqno = $type_sequence_to_go[$ibeg_next]; $vt_valid_flag = 1; } } } #-------------------------------------------------------------- # Vertical Tightness Flags Section 2: # Handle type 3, opening block braces on last line of the batch # Check for a last line with isolated opening BLOCK curly #-------------------------------------------------------------- elsif ($rOpts_block_brace_vertical_tightness && $ibeg eq $iend && $types_to_go[$iend] eq '{' && $block_type_to_go[$iend] && $block_type_to_go[$iend] =~ /$block_brace_vertical_tightness_pattern/ ) { $vt_type = 3; $vt_opening_flag = $rOpts_block_brace_vertical_tightness; $vt_seqno = 0; $vt_valid_flag = 1; } #-------------------------------------------------------------- # Vertical Tightness Flags Section 3: # Handle type 4, a closing block brace on the last line of the batch Check # for a last line with isolated closing BLOCK curly # Patch: added a check for any new closing side comment which the # -csc option may generate. If it exists, there will be a side comment # so we cannot combine with a brace on the next line. This issue # occurs for the combination -scbb and -csc is used. #-------------------------------------------------------------- elsif ($rOpts_stack_closing_block_brace && $ibeg eq $iend && $block_type_to_go[$iend] && $types_to_go[$iend] eq '}' && ( !$closing_side_comment || $n < $n_last_line ) ) { my $spaces = $rOpts_block_brace_tightness == 2 ? 
0 : 1; $vt_type = 4; $vt_closing_flag = $spaces; $vt_seqno = $type_sequence_to_go[$iend]; $vt_valid_flag = 1; } # get the sequence numbers of the ends of this line $vt_seqno_beg = $type_sequence_to_go[$ibeg]; if ( !$vt_seqno_beg ) { if ( $types_to_go[$ibeg] eq 'q' ) { $vt_seqno_beg = $self->get_seqno( $ibeg, $ending_in_quote ); } else { $vt_seqno_beg = EMPTY_STRING } } $vt_seqno_end = $type_sequence_to_go[$iend]; if ( !$vt_seqno_end ) { if ( $types_to_go[$iend] eq 'q' ) { $vt_seqno_end = $self->get_seqno( $iend, $ending_in_quote ); } else { $vt_seqno_end = EMPTY_STRING } } if ( !defined($vt_seqno) ) { $vt_seqno = EMPTY_STRING } my $rvertical_tightness_flags = { _vt_type => $vt_type, _vt_opening_flag => $vt_opening_flag, _vt_closing_flag => $vt_closing_flag, _vt_seqno => $vt_seqno, _vt_valid_flag => $vt_valid_flag, _vt_seqno_beg => $vt_seqno_beg, _vt_seqno_end => $vt_seqno_end, _vt_min_lines => $vt_min_lines, _vt_max_lines => $vt_max_lines, }; return ($rvertical_tightness_flags); } ## end sub set_vertical_tightness_flags ########################################################## # CODE SECTION 14: Code for creating closing side comments ########################################################## { ## begin closure accumulate_csc_text # These routines are called once per batch when the --closing-side-comments flag # has been set. 
my %block_leading_text; my %block_opening_line_number; my $csc_new_statement_ok; my $csc_last_label; my %csc_block_label; my $accumulating_text_for_block; my $leading_block_text; my $rleading_block_if_elsif_text; my $leading_block_text_level; my $leading_block_text_length_exceeded; my $leading_block_text_line_length; my $leading_block_text_line_number; sub initialize_csc_vars { %block_leading_text = (); %block_opening_line_number = (); $csc_new_statement_ok = 1; $csc_last_label = EMPTY_STRING; %csc_block_label = (); $rleading_block_if_elsif_text = []; $accumulating_text_for_block = EMPTY_STRING; reset_block_text_accumulator(); return; } ## end sub initialize_csc_vars sub reset_block_text_accumulator { # save text after 'if' and 'elsif' to append after 'else' if ($accumulating_text_for_block) { ## ( $accumulating_text_for_block =~ /^(if|elsif)$/ ) { if ( $is_if_elsif{$accumulating_text_for_block} ) { push @{$rleading_block_if_elsif_text}, $leading_block_text; } } $accumulating_text_for_block = EMPTY_STRING; $leading_block_text = EMPTY_STRING; $leading_block_text_level = 0; $leading_block_text_length_exceeded = 0; $leading_block_text_line_number = 0; $leading_block_text_line_length = 0; return; } ## end sub reset_block_text_accumulator sub set_block_text_accumulator { my ( $self, $i ) = @_; $accumulating_text_for_block = $tokens_to_go[$i]; if ( $accumulating_text_for_block !~ /^els/ ) { $rleading_block_if_elsif_text = []; } $leading_block_text = EMPTY_STRING; $leading_block_text_level = $levels_to_go[$i]; $leading_block_text_line_number = $self->get_output_line_number(); $leading_block_text_length_exceeded = 0; # this will contain the column number of the last character # of the closing side comment $leading_block_text_line_length = length($csc_last_label) + length($accumulating_text_for_block) + length( $rOpts->{'closing-side-comment-prefix'} ) + $leading_block_text_level * $rOpts_indent_columns + 3; return; } ## end sub set_block_text_accumulator sub 
accumulate_block_text { my ( $self, $i ) = @_; # accumulate leading text for -csc, ignoring any side comments if ( $accumulating_text_for_block && !$leading_block_text_length_exceeded && $types_to_go[$i] ne '#' ) { my $added_length = $token_lengths_to_go[$i]; $added_length += 1 if $i == 0; my $new_line_length = $leading_block_text_line_length + $added_length; # we can add this text if we don't exceed some limits.. if ( # we must not have already exceeded the text length limit length($leading_block_text) < $rOpts_closing_side_comment_maximum_text # and either: # the new total line length must be below the line length limit # or the new length must be below the text length limit # (ie, we may allow one token to exceed the text length limit) && ( $new_line_length < $maximum_line_length_at_level[$leading_block_text_level] || length($leading_block_text) + $added_length < $rOpts_closing_side_comment_maximum_text ) # UNLESS: we are adding a closing paren before the brace we seek. # This is an attempt to avoid situations where the ... to be # added are longer than the omitted right paren, as in: # foreach my $item (@a_rather_long_variable_name_here) { # &whatever; # } ## end foreach my $item (@a_rather_long_variable_name_here... 
|| ( $tokens_to_go[$i] eq ')' && ( ( $i + 1 <= $max_index_to_go && $block_type_to_go[ $i + 1 ] && $block_type_to_go[ $i + 1 ] eq $accumulating_text_for_block ) || ( $i + 2 <= $max_index_to_go && $block_type_to_go[ $i + 2 ] && $block_type_to_go[ $i + 2 ] eq $accumulating_text_for_block ) ) ) ) { # add an extra space at each newline if ( $i == 0 && $types_to_go[$i] ne 'b' ) { $leading_block_text .= SPACE; } # add the token text $leading_block_text .= $tokens_to_go[$i]; $leading_block_text_line_length = $new_line_length; } # show that text was truncated if necessary elsif ( $types_to_go[$i] ne 'b' ) { $leading_block_text_length_exceeded = 1; $leading_block_text .= '...'; } } return; } ## end sub accumulate_block_text sub accumulate_csc_text { my ($self) = @_; # called once per output buffer when -csc is used. Accumulates # the text placed after certain closing block braces. # Defines and returns the following for this buffer: my $block_leading_text = EMPTY_STRING; # the leading text of the last '}' my $rblock_leading_if_elsif_text; my $i_block_leading_text = -1; # index of token owning block_leading_text my $block_line_count = 100; # how many lines the block spans my $terminal_type = 'b'; # type of last nonblank token my $i_terminal = 0; # index of last nonblank token my $terminal_block_type = EMPTY_STRING; # update most recent statement label $csc_last_label = EMPTY_STRING unless ($csc_last_label); if ( $types_to_go[0] eq 'J' ) { $csc_last_label = $tokens_to_go[0] } my $block_label = $csc_last_label; # Loop over all tokens of this batch for my $i ( 0 .. 
$max_index_to_go ) { my $type = $types_to_go[$i]; my $block_type = $block_type_to_go[$i]; my $token = $tokens_to_go[$i]; $block_type = EMPTY_STRING unless ($block_type); # remember last nonblank token type if ( $type ne '#' && $type ne 'b' ) { $terminal_type = $type; $terminal_block_type = $block_type; $i_terminal = $i; } my $type_sequence = $type_sequence_to_go[$i]; if ( $block_type && $type_sequence ) { if ( $token eq '}' ) { # restore any leading text saved when we entered this block if ( defined( $block_leading_text{$type_sequence} ) ) { ( $block_leading_text, $rblock_leading_if_elsif_text ) = @{ $block_leading_text{$type_sequence} }; $i_block_leading_text = $i; delete $block_leading_text{$type_sequence}; $rleading_block_if_elsif_text = $rblock_leading_if_elsif_text; } if ( defined( $csc_block_label{$type_sequence} ) ) { $block_label = $csc_block_label{$type_sequence}; delete $csc_block_label{$type_sequence}; } # if we run into a '}' then we probably started accumulating # at something like a trailing 'if' clause..no harm done. if ( $accumulating_text_for_block && $levels_to_go[$i] <= $leading_block_text_level ) { my $lev = $levels_to_go[$i]; reset_block_text_accumulator(); } if ( defined( $block_opening_line_number{$type_sequence} ) ) { my $output_line_number = $self->get_output_line_number(); $block_line_count = $output_line_number - $block_opening_line_number{$type_sequence} + 1; delete $block_opening_line_number{$type_sequence}; } else { # Error: block opening line undefined for this line.. # This shouldn't be possible, but it is not a # significant problem. 
} } elsif ( $token eq '{' ) { my $line_number = $self->get_output_line_number(); $block_opening_line_number{$type_sequence} = $line_number; # set a label for this block, except for # a bare block which already has the label # A label can only be used on the next { if ( $block_type =~ /:$/ ) { $csc_last_label = EMPTY_STRING; } $csc_block_label{$type_sequence} = $csc_last_label; $csc_last_label = EMPTY_STRING; if ( $accumulating_text_for_block && $levels_to_go[$i] == $leading_block_text_level ) { if ( $accumulating_text_for_block eq $block_type ) { # save any leading text before we enter this block $block_leading_text{$type_sequence} = [ $leading_block_text, $rleading_block_if_elsif_text ]; $block_opening_line_number{$type_sequence} = $leading_block_text_line_number; reset_block_text_accumulator(); } else { # shouldn't happen, but not a serious error. # We were accumulating -csc text for block type # $accumulating_text_for_block and unexpectedly # encountered a '{' for block type $block_type. } } } } if ( $type eq 'k' && $csc_new_statement_ok && $is_if_elsif_else_unless_while_until_for_foreach{$token} && $token =~ /$closing_side_comment_list_pattern/ ) { $self->set_block_text_accumulator($i); } else { # note: ignoring type 'q' because of tricks being played # with 'q' for hanging side comments if ( $type ne 'b' && $type ne '#' && $type ne 'q' ) { $csc_new_statement_ok = ( $block_type || $type eq 'J' || $type eq ';' ); } if ( $type eq ';' && $accumulating_text_for_block && $levels_to_go[$i] == $leading_block_text_level ) { reset_block_text_accumulator(); } else { $self->accumulate_block_text($i); } } } # Treat an 'else' block specially by adding preceding 'if' and # 'elsif' text. Otherwise, the 'end else' is not helpful, # especially for cuddled-else formatting. 
if ( $terminal_block_type =~ /^els/ && $rblock_leading_if_elsif_text ) { $block_leading_text = $self->make_else_csc_text( $i_terminal, $terminal_block_type, $block_leading_text, $rblock_leading_if_elsif_text ); } # if this line ends in a label then remember it for the next pass $csc_last_label = EMPTY_STRING; if ( $terminal_type eq 'J' ) { $csc_last_label = $tokens_to_go[$i_terminal]; } return ( $terminal_type, $i_terminal, $i_block_leading_text, $block_leading_text, $block_line_count, $block_label ); } ## end sub accumulate_csc_text sub make_else_csc_text { # create additional -csc text for an 'else' and optionally 'elsif', # depending on the value of switch # # = 0 add 'if' text to trailing else # = 1 same as 0 plus: # add 'if' to 'elsif's if can fit in line length # add last 'elsif' to trailing else if can fit in one line # = 2 same as 1 but do not check if exceed line length # # $rif_elsif_text = a reference to a list of all previous closing # side comments created for this if block # my ( $self, $i_terminal, $block_type, $block_leading_text, $rif_elsif_text ) = @_; my $csc_text = $block_leading_text; if ( $block_type eq 'elsif' && $rOpts_closing_side_comment_else_flag == 0 ) { return $csc_text; } my $count = @{$rif_elsif_text}; return $csc_text unless ($count); my $if_text = '[ if' . $rif_elsif_text->[0]; # always show the leading 'if' text on 'else' if ( $block_type eq 'else' ) { $csc_text .= $if_text; } # see if that's all if ( $rOpts_closing_side_comment_else_flag == 0 ) { return $csc_text; } my $last_elsif_text = EMPTY_STRING; if ( $count > 1 ) { $last_elsif_text = ' [elsif' . $rif_elsif_text->[ $count - 1 ]; if ( $count > 2 ) { $last_elsif_text = ' [...' . $last_elsif_text; } } # tentatively append one more item my $saved_text = $csc_text; if ( $block_type eq 'else' ) { $csc_text .= $last_elsif_text; } else { $csc_text .= SPACE . 
$if_text;
        }

        # all done if no length checks requested
        if ( $rOpts_closing_side_comment_else_flag == 2 ) {
            return $csc_text;
        }

        # undo it if line length exceeded
        my $length =
          length($csc_text) +
          length($block_type) +
          length( $rOpts->{'closing-side-comment-prefix'} ) +
          $levels_to_go[$i_terminal] * $rOpts_indent_columns + 3;
        if (
            $length > $maximum_line_length_at_level[$leading_block_text_level] )
        {
            $csc_text = $saved_text;
        }
        return $csc_text;
    } ## end sub make_else_csc_text
} ## end closure accumulate_csc_text

{    ## begin closure balance_csc_text

    # Some additional routines for handling the --closing-side-comments option

    my %matching_char;

    BEGIN {
        %matching_char = (
            '{' => '}',
            '(' => ')',
            '[' => ']',
            '}' => '{',
            ')' => '(',
            ']' => '[',
        );
    } ## end BEGIN

    sub balance_csc_text {

        # Append characters to balance a closing side comment so that editors
        # such as vim can correctly jump through code.
        # Simple Example:
        #  input  = ## end foreach my $foo ( sort { $b  ...
        #  output = ## end foreach my $foo ( sort { $b  ...})

        # NOTE: This routine does not currently filter out structures within
        # quoted text because the bounce algorithms in text editors do not
        # necessarily do this either (a version of vim was checked and
        # did not do this).

        # Some complex examples which will cause trouble for some editors:
        #  while ( $mask_string =~ /\{[^{]*?\}/g ) {
        #  if ( $mask_str =~ /\}\s*els[^\{\}]+\{$/ ) {
        #  if ( $1 eq '{' ) {
        # test file test1/braces.pl has many such examples.

        my ($csc) = @_;

        # loop to examine characters one-by-one, RIGHT to LEFT and
        # build a balancing ending, LEFT to RIGHT.
        foreach my $pos ( reverse( 0 .. length($csc) - 1 ) ) {

            my $char = substr( $csc, $pos, 1 );

            # ignore everything except structural characters
            next unless ( $matching_char{$char} );

            # pop most recently appended character
            my $top = chop($csc);

            # push it back plus the mate to the newest character
            # unless they balance each other.
            $csc = $csc . $top .
$matching_char{$char} unless $top eq $char; } # return the balanced string return $csc; } ## end sub balance_csc_text } ## end closure balance_csc_text sub add_closing_side_comment { my ( $self, $ri_first, $ri_last ) = @_; my $rLL = $self->[_rLL_]; # add closing side comments after closing block braces if -csc used my ( $closing_side_comment, $cscw_block_comment ); #--------------------------------------------------------------- # Step 1: loop through all tokens of this line to accumulate # the text needed to create the closing side comments. Also see # how the line ends. #--------------------------------------------------------------- my ( $terminal_type, $i_terminal, $i_block_leading_text, $block_leading_text, $block_line_count, $block_label ) = $self->accumulate_csc_text(); #--------------------------------------------------------------- # Step 2: make the closing side comment if this ends a block #--------------------------------------------------------------- my $have_side_comment = $types_to_go[$max_index_to_go] eq '#'; # if this line might end in a block closure.. if ( $terminal_type eq '}' # Fix 1 for c091, this is only for blocks && $block_type_to_go[$i_terminal] # ..and either && ( # the block is long enough ( $block_line_count >= $rOpts->{'closing-side-comment-interval'} ) # or there is an existing comment to check || ( $have_side_comment && $rOpts->{'closing-side-comment-warnings'} ) ) # .. and if this is one of the types of interest && $block_type_to_go[$i_terminal] =~ /$closing_side_comment_list_pattern/ # .. but not an anonymous sub # These are not normally of interest, and their closing braces are # often followed by commas or semicolons anyway. This also avoids # possible erratic output due to line numbering inconsistencies # in the cases where their closing braces terminate a line. 
        && $block_type_to_go[$i_terminal] ne 'sub'

        # ..and the corresponding opening brace is not in this batch
        # (because we do not need to tag one-line blocks, although this
        # should also be caught with a positive -csci value)
        && !defined( $mate_index_to_go[$i_terminal] )

        # ..and either
        && (

            # this is the last token (line doesn't have a side comment)
            !$have_side_comment

            # or the old side comment is a closing side comment
            || $tokens_to_go[$max_index_to_go] =~
            /$closing_side_comment_prefix_pattern/
        )
      )
    {

        # then make the closing side comment text
        if ($block_label) { $block_label .= SPACE }
        my $token =
"$rOpts->{'closing-side-comment-prefix'} $block_label$block_type_to_go[$i_terminal]";

        # append any extra descriptive text collected above
        if ( $i_block_leading_text == $i_terminal ) {
            $token .= $block_leading_text;
        }

        $token = balance_csc_text($token)
          if $rOpts->{'closing-side-comments-balanced'};

        $token =~ s/\s*$//;    # trim any trailing whitespace

        # handle case of existing closing side comment
        if ($have_side_comment) {

            # warn if requested and tokens differ significantly
            if ( $rOpts->{'closing-side-comment-warnings'} ) {
                my $old_csc = $tokens_to_go[$max_index_to_go];
                my $new_csc = $token;
                $new_csc =~ s/\s+//g;            # trim all whitespace
                $old_csc =~ s/\s+//g;            # trim all whitespace
                $new_csc =~ s/[\]\)\}\s]*$//;    # trim trailing structures
                $old_csc =~ s/[\]\)\}\s]*$//;    # trim trailing structures
                $new_csc =~ s/(\.\.\.)$//;       # trim trailing '...'
                my $new_trailing_dots = $1;
                $old_csc =~ s/(\.\.\.)\s*$//;    # trim trailing '...'

                # Patch to handle multiple closing side comments at
                # else and elsif's.  These have become too complicated
                # to check, so if we see an indication of
                # '[ if' or '[ # elsif', then assume they were made
                # by perltidy.
if ( $block_type_to_go[$i_terminal] eq 'else' ) { if ( $old_csc =~ /\[\s*elsif/ ) { $old_csc = $new_csc } } elsif ( $block_type_to_go[$i_terminal] eq 'elsif' ) { if ( $old_csc =~ /\[\s*if/ ) { $old_csc = $new_csc } } # if old comment is contained in new comment, # only compare the common part. if ( length($new_csc) > length($old_csc) ) { $new_csc = substr( $new_csc, 0, length($old_csc) ); } # if the new comment is shorter and has been limited, # only compare the common part. if ( length($new_csc) < length($old_csc) && $new_trailing_dots ) { $old_csc = substr( $old_csc, 0, length($new_csc) ); } # any remaining difference? if ( $new_csc ne $old_csc ) { # just leave the old comment if we are below the threshold # for creating side comments if ( $block_line_count < $rOpts->{'closing-side-comment-interval'} ) { $token = undef; } # otherwise we'll make a note of it else { warning( "perltidy -cscw replaced: $tokens_to_go[$max_index_to_go]\n" ); # save the old side comment in a new trailing block # comment my $timestamp = EMPTY_STRING; if ( $rOpts->{'timestamp'} ) { my ( $day, $month, $year ) = (localtime)[ 3, 4, 5 ]; $year += 1900; $month += 1; $timestamp = "$year-$month-$day"; } $cscw_block_comment = "## perltidy -cscw $timestamp: $tokens_to_go[$max_index_to_go]"; } } # No differences.. we can safely delete old comment if we # are below the threshold elsif ( $block_line_count < $rOpts->{'closing-side-comment-interval'} ) { # Since the line breaks have already been set, we have # to remove the token from the _to_go array and also # from the line range (this fixes issue c081). # Note that we can only get here if -cscw has been set # because otherwise the old comment is already deleted. 
$token = undef; my $ibeg = $ri_first->[-1]; my $iend = $ri_last->[-1]; if ( $iend > $ibeg && $iend == $max_index_to_go && $types_to_go[$max_index_to_go] eq '#' ) { $iend--; $max_index_to_go--; if ( $iend > $ibeg && $types_to_go[$max_index_to_go] eq 'b' ) { $iend--; $max_index_to_go--; } $ri_last->[-1] = $iend; } } } # switch to the new csc (unless we deleted it!) if ($token) { my $len_tok = length($token); # NOTE: length no longer important my $added_len = $len_tok - $token_lengths_to_go[$max_index_to_go]; $tokens_to_go[$max_index_to_go] = $token; $token_lengths_to_go[$max_index_to_go] = $len_tok; my $K = $K_to_go[$max_index_to_go]; $rLL->[$K]->[_TOKEN_] = $token; $rLL->[$K]->[_TOKEN_LENGTH_] = $len_tok; $summed_lengths_to_go[ $max_index_to_go + 1 ] += $added_len; } } # handle case of NO existing closing side comment else { # To avoid inserting a new token in the token arrays, we # will just return the new side comment so that it can be # inserted just before it is needed in the call to the # vertical aligner. $closing_side_comment = $token; } } return ( $closing_side_comment, $cscw_block_comment ); } ## end sub add_closing_side_comment ############################ # CODE SECTION 15: Summarize ############################ sub wrapup { # This is the last routine called when a file is formatted. 
# Flush buffer and write any informative messages my ( $self, $severe_error ) = @_; $self->flush(); my $file_writer_object = $self->[_file_writer_object_]; $file_writer_object->decrement_output_line_number() ; # fix up line number since it was incremented we_are_at_the_last_line(); my $max_depth = $self->[_maximum_BLOCK_level_]; my $at_line = $self->[_maximum_BLOCK_level_at_line_]; write_logfile_entry( "Maximum leading structural depth is $max_depth in input at line $at_line\n" ); my $added_semicolon_count = $self->[_added_semicolon_count_]; my $first_added_semicolon_at = $self->[_first_added_semicolon_at_]; my $last_added_semicolon_at = $self->[_last_added_semicolon_at_]; if ( $added_semicolon_count > 0 ) { my $first = ( $added_semicolon_count > 1 ) ? "First" : EMPTY_STRING; my $what = ( $added_semicolon_count > 1 ) ? "semicolons were" : "semicolon was"; write_logfile_entry("$added_semicolon_count $what added:\n"); write_logfile_entry( " $first at input line $first_added_semicolon_at\n"); if ( $added_semicolon_count > 1 ) { write_logfile_entry( " Last at input line $last_added_semicolon_at\n"); } write_logfile_entry(" (Use -nasc to prevent semicolon addition)\n"); write_logfile_entry("\n"); } my $deleted_semicolon_count = $self->[_deleted_semicolon_count_]; my $first_deleted_semicolon_at = $self->[_first_deleted_semicolon_at_]; my $last_deleted_semicolon_at = $self->[_last_deleted_semicolon_at_]; if ( $deleted_semicolon_count > 0 ) { my $first = ( $deleted_semicolon_count > 1 ) ? "First" : EMPTY_STRING; my $what = ( $deleted_semicolon_count > 1 ) ? 
"semicolons were" : "semicolon was"; write_logfile_entry( "$deleted_semicolon_count unnecessary $what deleted:\n"); write_logfile_entry( " $first at input line $first_deleted_semicolon_at\n"); if ( $deleted_semicolon_count > 1 ) { write_logfile_entry( " Last at input line $last_deleted_semicolon_at\n"); } write_logfile_entry(" (Use -ndsm to prevent semicolon deletion)\n"); write_logfile_entry("\n"); } my $embedded_tab_count = $self->[_embedded_tab_count_]; my $first_embedded_tab_at = $self->[_first_embedded_tab_at_]; my $last_embedded_tab_at = $self->[_last_embedded_tab_at_]; if ( $embedded_tab_count > 0 ) { my $first = ( $embedded_tab_count > 1 ) ? "First" : EMPTY_STRING; my $what = ( $embedded_tab_count > 1 ) ? "quotes or patterns" : "quote or pattern"; write_logfile_entry("$embedded_tab_count $what had embedded tabs:\n"); write_logfile_entry( "This means the display of this script could vary with device or software\n" ); write_logfile_entry(" $first at input line $first_embedded_tab_at\n"); if ( $embedded_tab_count > 1 ) { write_logfile_entry( " Last at input line $last_embedded_tab_at\n"); } write_logfile_entry("\n"); } my $first_tabbing_disagreement = $self->[_first_tabbing_disagreement_]; my $last_tabbing_disagreement = $self->[_last_tabbing_disagreement_]; my $tabbing_disagreement_count = $self->[_tabbing_disagreement_count_]; my $in_tabbing_disagreement = $self->[_in_tabbing_disagreement_]; if ($first_tabbing_disagreement) { write_logfile_entry( "First indentation disagreement seen at input line $first_tabbing_disagreement\n" ); } my $first_btd = $self->[_first_brace_tabbing_disagreement_]; if ($first_btd) { my $msg = "First closing brace indentation disagreement started at input line $first_btd\n"; write_logfile_entry($msg); # leave a hint in the .ERR file if there was a brace error if ( get_saw_brace_error() ) { warning("NOTE: $msg") } } my $in_btd = $self->[_in_brace_tabbing_disagreement_]; if ($in_btd) { my $msg = "Ending with brace indentation 
disagreement which started at input line $in_btd\n";
        write_logfile_entry($msg);

        # leave a hint in the .ERR file if there was a brace error
        if ( get_saw_brace_error() ) { warning("NOTE: $msg") }
    }

    if ($in_tabbing_disagreement) {
        my $msg =
"Ending with indentation disagreement which started at input line $in_tabbing_disagreement\n";
        write_logfile_entry($msg);
    }
    else {

        if ($last_tabbing_disagreement) {

            write_logfile_entry(
"Last indentation disagreement seen at input line $last_tabbing_disagreement\n"
            );
        }
        else {
            write_logfile_entry("No indentation disagreement seen\n");
        }
    }

    if ($first_tabbing_disagreement) {
        write_logfile_entry(
"Note: Indentation disagreement detection is not accurate for outdenting and -lp.\n"
        );
    }
    write_logfile_entry("\n");

    my $vao = $self->[_vertical_aligner_object_];
    $vao->report_anything_unusual();

    $file_writer_object->report_line_length_errors();

    # Define the formatter self-check for convergence.
    $self->[_converged_] =
         $severe_error
      || $file_writer_object->get_convergence_check()
      || $rOpts->{'indent-only'};

    return;
} ## end sub wrapup

} ## end package Perl::Tidy::Formatter
1;
Perl-Tidy-20230309/lib/Perl/Tidy/IOScalar.pm
#####################################################################
#
# This is a stripped down version of IO::Scalar
# Given a reference to a scalar, it supplies either:
# a getline method which reads lines (mode='r'), or
# a print method which writes lines (mode='w')
#
#####################################################################
package Perl::Tidy::IOScalar;
use strict;
use warnings;
use Carp;
our $VERSION = '20230309';

use constant EMPTY_STRING => q{};

sub AUTOLOAD {

    # Catch any undefined sub calls so that we are sure to get
    # some diagnostic information.  This sub should never be called
    # except for a programming error.
our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); my $my_package = __PACKAGE__; print STDERR <[1]; if ( $mode ne 'r' ) { confess <[2]++; return $self->[0]->[$i]; } sub print { my ( $self, $msg ) = @_; my $mode = $self->[1]; if ( $mode ne 'w' ) { confess <[0] } .= $msg; return; } sub close { return } 1; Perl-Tidy-20230309/lib/Perl/Tidy.pm0000644000175000017500000062574514401152107015562 0ustar stevesteve# ########################################################### # # perltidy - a perl script indenter and formatter # # Copyright (c) 2000-2022 by Steve Hancock # Distributed under the GPL license agreement; see file COPYING # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License along # with this program; if not, write to the Free Software Foundation, Inc., # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. # # For brief instructions, try 'perltidy -h'. # For more complete documentation, try 'man perltidy' # or visit http://perltidy.sourceforge.net # # This script is an example of the default style. It was formatted with: # # perltidy Tidy.pm # # Code Contributions: See ChangeLog.html for a complete history. # Michael Cartmell supplied code for adaptation to VMS and helped with # v-strings. # Hugh S. Myers supplied sub streamhandle and the supporting code to # create a Perl::Tidy module which can operate on strings, arrays, etc. # Yves Orton supplied coding to help detect Windows versions. 
# Axel Rose supplied a patch for MacPerl. # Sebastien Aperghis-Tramoni supplied a patch for the defined or operator. # Dan Tyrell contributed a patch for binary I/O. # Ueli Hugenschmidt contributed a patch for -fpsc # Sam Kington supplied a patch to identify the initial indentation of # entabbed code. # jonathan swartz supplied patches for: # * .../ pattern, which looks upwards from directory # * --notidy, to be used in directories where we want to avoid # accidentally tidying # * prefilter and postfilter # * iterations option # # Many others have supplied key ideas, suggestions, and bug reports; # see the CHANGES file. # ############################################################ package Perl::Tidy; # perlver reports minimum version needed is 5.8.0 # 5.004 needed for IO::File # 5.008 needed for wide characters use 5.008; use warnings; use strict; use Exporter; use Carp; use English qw( -no_match_vars ); use Digest::MD5 qw(md5_hex); use Perl::Tidy::Debugger; use Perl::Tidy::DevNull; use Perl::Tidy::Diagnostics; use Perl::Tidy::FileWriter; use Perl::Tidy::Formatter; use Perl::Tidy::HtmlWriter; use Perl::Tidy::IOScalar; use Perl::Tidy::IOScalarArray; use Perl::Tidy::IndentationItem; use Perl::Tidy::LineSink; use Perl::Tidy::LineSource; use Perl::Tidy::Logger; use Perl::Tidy::Tokenizer; use Perl::Tidy::VerticalAligner; local $OUTPUT_AUTOFLUSH = 1; # DEVEL_MODE can be turned on for extra checking during development use constant DEVEL_MODE => 0; use constant EMPTY_STRING => q{}; use constant SPACE => q{ }; use vars qw{ $VERSION @ISA @EXPORT }; @ISA = qw( Exporter ); @EXPORT = qw( &perltidy ); use Cwd; use Encode (); use Encode::Guess; use IO::File; use File::Basename; use File::Copy; use File::Temp qw(tempfile); BEGIN { # Release version is the approximate YYYYMMDD of the release. # Development version is (Last Release).(Development Number) # To make the number continually increasing, the Development Number is a 2 # digit number starting at 01 after a release. 
It is continually bumped # along at significant points during development. If it ever reaches 99 # then the Release version must be bumped, and it is probably past time for # a release anyway. $VERSION = '20230309'; } ## end BEGIN sub DESTROY { # required to avoid call to AUTOLOAD in some versions of perl } sub AUTOLOAD { # Catch any undefined sub calls so that we are sure to get # some diagnostic information. This sub should never be called # except for a programming error. our $AUTOLOAD; return if ( $AUTOLOAD =~ /\bDESTROY$/ ); my ( $pkg, $fname, $lno ) = caller(); print STDERR <new( $filename, $mode ) }; } elsif ( $ref eq 'SCALAR' ) { $New = sub { Perl::Tidy::IOScalar->new( $filename, $mode ) }; } else { # Accept an object with a getline method for reading. Note: # IO::File is built-in and does not respond to the defined # operator. If this causes trouble, the check can be # skipped and we can just let it crash if there is no # getline. if ( $mode =~ /[rR]/ ) { # RT#97159; part 1 of 2: updated to use 'can' ##if ( $ref eq 'IO::File' || defined &{ $ref . 
"::getline" } ) { if ( $ref->can('getline') ) { $New = sub { $filename }; } else { $New = sub { undef }; confess <can('print') ) { $New = sub { $filename }; } else { $New = sub { undef }; confess <new( $filename, $mode ) }; } } $fh = $New->( $filename, $mode ); if ( !$fh ) { Warn("Couldn't open file:$filename in mode:$mode : $ERRNO\n"); } else { # Case 1: handle encoded data if ($is_encoded_data) { if ( ref($fh) eq 'IO::File' ) { ## binmode object call not available in older perl versions ## $fh->binmode(":raw:encoding(UTF-8)"); binmode $fh, ":raw:encoding(UTF-8)"; } elsif ( $filename eq '-' ) { binmode STDOUT, ":raw:encoding(UTF-8)"; } else { # shouldn't happen } } # Case 2: handle unencoded data else { if ( ref($fh) eq 'IO::File' ) { binmode $fh } elsif ( $filename eq '-' ) { binmode STDOUT } else { } # shouldn't happen } } return $fh, ( $ref or $filename ); } ## end sub streamhandle sub find_input_line_ending { # Peek at a file and return first line ending character. # Return undefined value in case of any trouble. 
my ($input_file) = @_; my $ending; # silently ignore input from object or stdin if ( ref($input_file) || $input_file eq '-' ) { return $ending; } my $fh; open( $fh, '<', $input_file ) || return $ending; binmode $fh; my $buf; read( $fh, $buf, 1024 ); close $fh || return $ending; if ( $buf && $buf =~ /([\012\015]+)/ ) { my $test = $1; # dos if ( $test =~ /^(\015\012)+$/ ) { $ending = "\015\012" } # mac elsif ( $test =~ /^\015+$/ ) { $ending = "\015" } # unix elsif ( $test =~ /^\012+$/ ) { $ending = "\012" } # unknown else { } } # no ending seen else { } return $ending; } ## end sub find_input_line_ending { ## begin closure for sub catfile my $missing_file_spec; BEGIN { $missing_file_spec = !eval { require File::Spec; 1 }; } sub catfile { # concatenate a path and file basename # returns undef in case of error my @parts = @_; # use File::Spec if we can unless ($missing_file_spec) { return File::Spec->catfile(@parts); } # Perl 5.004 systems may not have File::Spec so we'll make # a simple try. We assume File::Basename is available. # return if not successful. my $name = pop @parts; my $path = join '/', @parts; my $test_file = $path . $name; my ( $test_name, $test_path ) = fileparse($test_file); return $test_file if ( $test_name eq $name ); return if ( $OSNAME eq 'VMS' ); # this should work at least for Windows and Unix: $test_file = $path . '/' . $name; ( $test_name, $test_path ) = fileparse($test_file); return $test_file if ( $test_name eq $name ); return; } ## end sub catfile } ## end closure for sub catfile # Here is a map of the flow of data from the input source to the output # line sink: # # LineSource-->Tokenizer-->Formatter-->VerticalAligner-->FileWriter--> # input groups output # lines tokens lines of lines lines # lines # # The names correspond to the package names responsible for the unit processes. # # The overall process is controlled by the "main" package. 
#
# LineSource is the stream of input lines
#
# Tokenizer analyzes a line and breaks it into tokens, peeking ahead
# if necessary.  A token is any section of the input line which should be
# manipulated as a single entity during formatting.  For example, a single
# ',' character is a token, and so is an entire side comment.  It handles
# the complexities of Perl syntax, such as distinguishing between '<<' as
# a shift operator and as a here-document, or distinguishing between '/'
# as a divide symbol and as a pattern delimiter.
#
# Formatter inserts and deletes whitespace between tokens, and breaks
# sequences of tokens at appropriate points as output lines.  It bases its
# decisions on the default rules as modified by any command-line options.
#
# VerticalAligner collects groups of lines together and tries to line up
# certain tokens, such as '=>', '#', and '=' by adding whitespace.
#
# FileWriter simply writes lines to the output stream.
#
# The Logger package, not shown, records significant events and warning
# messages.  It writes a .LOG file, which may be saved with a
# '-log' or a '-g' flag.

{ #<<<  (this side comment avoids excessive indentation in a closure)

my $Warn_count;
my $fh_stderr;
my $loaded_unicode_gcstring;
my $rstatus;

# Bump Warn_count only: it is essential to bump the count on all warnings, even
# if no message goes out, so that the correct exit status is set.
sub Warn_count_bump { $Warn_count++; return }

# Output Warn message only
sub Warn_msg { my $msg = shift; $fh_stderr->print($msg); return }

# Output Warn message and bump Warn count
sub Warn { my $msg = shift; $fh_stderr->print($msg); $Warn_count++; return }

sub is_char_mode {

    my ($string) = @_;

    # Returns:
    #   true  if $string is in Perl's internal character mode
    #         (also called the 'upgraded form', or UTF8=1)
    #   false if $string is in Perl's internal byte mode

    # This function isolates the call to Perl's internal function
    # utf8::is_utf8() which is true for strings represented in an 'upgraded
    # form'. It is available after Perl version 5.8.
    # See https://perldoc.perl.org/Encode.
    # See also comments in Carp.pm and other modules using this function

    return 1 if ( utf8::is_utf8($string) );
    return;
} ## end sub is_char_mode

my $md5_hex = sub {
    my ($buf) = @_;

    # Evaluate the MD5 sum for a string
    # Patch for [rt.cpan.org #88020]
    # Use utf8::encode since md5_hex() only operates on bytes.
    # my $digest = md5_hex( utf8::encode($sink_buffer) );

    # Note added 20180114: the above patch did not work correctly.  I'm not
    # sure why.  But switching to the method recommended in the Perl 5
    # documentation for Encode worked.  According to this we can either use
    #    $octets = encode_utf8($string)  or equivalently
    #    $octets = encode("utf8",$string)
    # and then calculate the checksum.  So:
    my $octets = Encode::encode( "utf8", $buf );
    my $digest = md5_hex($octets);
    return $digest;
};

BEGIN {

    # Array index names for $self.
    # Do not combine with other BEGIN blocks (c101).
my $i = 0; use constant { _actual_output_extension_ => $i++, _debugfile_stream_ => $i++, _decoded_input_as_ => $i++, _destination_stream_ => $i++, _diagnostics_object_ => $i++, _display_name_ => $i++, _file_extension_separator_ => $i++, _fileroot_ => $i++, _is_encoded_data_ => $i++, _length_function_ => $i++, _line_separator_default_ => $i++, _line_separator_ => $i++, _logger_object_ => $i++, _output_file_ => $i++, _postfilter_ => $i++, _prefilter_ => $i++, _rOpts_ => $i++, _saw_pbp_ => $i++, _tabsize_ => $i++, _teefile_stream_ => $i++, _user_formatter_ => $i++, _input_copied_verbatim_ => $i++, _input_output_difference_ => $i++, }; } ## end BEGIN sub perltidy { my %input_hash = @_; my %defaults = ( argv => undef, destination => undef, formatter => undef, logfile => undef, errorfile => undef, teefile => undef, debugfile => undef, perltidyrc => undef, source => undef, stderr => undef, dump_options => undef, dump_options_type => undef, dump_getopt_flags => undef, dump_options_category => undef, dump_options_range => undef, dump_abbreviations => undef, prefilter => undef, postfilter => undef, ); # Status information which can be returned for diagnostic purposes. # NOTE: This is intended only for testing and subject to change. # List of "key => value" hash entries: # Some relevant user input parameters for convenience: # opt_format => value of --format: 'tidy', 'html', or 'user' # opt_encoding => value of -enc flag: 'utf8', 'none', or 'guess' # opt_encode_output => value of -eos flag: 'eos' or 'neos' # opt_max_iterations => value of --iterations=n # file_count => number of files processed in this call # If multiple files are processed, then the following values will be for # the last file only: # input_name => name of the input stream # output_name => name of the output stream # The following two variables refer to Perl's two internal string modes, # and have the values 0 for 'byte' mode and 1 for 'char' mode: # char_mode_source => true if source is in 'char' mode. 
Will be false # unless we received a source string ref with utf8::is_utf8() set. # char_mode_used => true if text processed by perltidy in 'char' mode. # Normally true for text identified as utf8, otherwise false. # This tells if Unicode::GCString was used # gcs_used => true if -gcs and Unicode::GCString found & used # These variables tell what utf8 decoding/encoding was done: # input_decoded_as => non-blank if perltidy decoded the source text # output_encoded_as => non-blank if perltidy encoded before return # These variables are related to iterations and convergence testing: # iteration_count => number of iterations done # ( can be from 1 to opt_max_iterations ) # converged => true if stopped on convergence # ( can only happen if opt_max_iterations > 1 ) # blinking => true if stopped on blinking states # ( i.e., unstable formatting, should not happen ) $rstatus = { file_count => 0, opt_format => EMPTY_STRING, opt_encoding => EMPTY_STRING, opt_encode_output => EMPTY_STRING, opt_max_iterations => EMPTY_STRING, input_name => EMPTY_STRING, output_name => EMPTY_STRING, char_mode_source => 0, char_mode_used => 0, input_decoded_as => EMPTY_STRING, output_encoded_as => EMPTY_STRING, gcs_used => 0, iteration_count => 0, converged => 0, blinking => 0, }; # Fix for issue git #57 $Warn_count = 0; # don't overwrite callers ARGV local @ARGV = @ARGV; local *STDERR = *STDERR; if ( my @bad_keys = grep { !exists $defaults{$_} } keys %input_hash ) { local $LIST_SEPARATOR = ')('; my @good_keys = sort keys %defaults; @bad_keys = sort @bad_keys; confess <{'input_name'}; $input_stream_name = '(unknown)' unless ($input_stream_name); Die(<('dump_options'); my $dump_getopt_flags = $get_hash_ref->('dump_getopt_flags'); my $dump_options_category = $get_hash_ref->('dump_options_category'); my $dump_abbreviations = $get_hash_ref->('dump_abbreviations'); my $dump_options_range = $get_hash_ref->('dump_options_range'); # validate dump_options_type if ( defined($dump_options) ) { unless ( 
defined($dump_options_type) ) { $dump_options_type = 'perltidyrc'; } if ( $dump_options_type ne 'perltidyrc' && $dump_options_type ne 'full' ) { croak <new(); } # see if ARGV is overridden if ( defined($argv) ) { my $rargv = ref $argv; if ( $rargv eq 'SCALAR' ) { $argv = ${$argv}; $rargv = undef } # ref to ARRAY if ($rargv) { if ( $rargv eq 'ARRAY' ) { @ARGV = @{$argv}; } else { croak <[_file_extension_separator_] = $dot; #------------------------- # get command line options #------------------------- my ( $rOpts, $config_file, $rraw_options, $roption_string, $rexpansion, $roption_category, $roption_range ) = process_command_line( $perltidyrc_stream, $is_Windows, $Windows_type, $rpending_complaint, $dump_options_type, ); # Only filenames should remain in @ARGV my @Arg_files = @ARGV; $self->[_rOpts_] = $rOpts; my $saw_pbp = grep { $_ eq '-pbp' || $_ eq '-perl-best-practices' } @{$rraw_options}; $self->[_saw_pbp_] = $saw_pbp; #------------------------------------ # Handle requests to dump information #------------------------------------ # return or exit immediately after all dumps my $quit_now = 0; # Getopt parameters and their flags if ( defined($dump_getopt_flags) ) { $quit_now = 1; foreach my $op ( @{$roption_string} ) { my $opt = $op; my $flag = EMPTY_STRING; # Examples: # some-option=s # some-option=i # some-option:i # some-option! 
if ( $opt =~ /(.*)(!|=.*|:.*)$/ ) { $opt = $1; $flag = $2; } $dump_getopt_flags->{$opt} = $flag; } } if ( defined($dump_options_category) ) { $quit_now = 1; %{$dump_options_category} = %{$roption_category}; } if ( defined($dump_options_range) ) { $quit_now = 1; %{$dump_options_range} = %{$roption_range}; } if ( defined($dump_abbreviations) ) { $quit_now = 1; %{$dump_abbreviations} = %{$rexpansion}; } if ( defined($dump_options) ) { $quit_now = 1; %{$dump_options} = %{$rOpts}; } Exit(0) if ($quit_now); # make printable string of options for this run as possible diagnostic my $readable_options = readable_options( $rOpts, $roption_string ); # dump from command line if ( $rOpts->{'dump-options'} ) { print STDOUT $readable_options; Exit(0); } # --dump-block-summary requires one filename in the arg list. # This is a safety precaution in case a user accidentally adds -dbs to the # command line parameters and is expecting formatted output to stdout. # Another precaution, added elsewhere, is to ignore -dbs in a .perltidyrc my $numf = @Arg_files; if ( $rOpts->{'dump-block-summary'} && $numf != 1 ) { Die(<check_options( $is_Windows, $Windows_type, $rpending_complaint ); if ($user_formatter) { $rOpts->{'format'} = 'user'; } # there must be one entry here for every possible format my %default_file_extension = ( tidy => 'tdy', html => 'html', user => EMPTY_STRING, ); $rstatus->{'opt_format'} = $rOpts->{'format'}; $rstatus->{'opt_max_iterations'} = $rOpts->{'iterations'}; $rstatus->{'opt_encode_output'} = $rOpts->{'encode-output-strings'} ? 'eos' : 'neos'; # be sure we have a valid output format unless ( exists $default_file_extension{ $rOpts->{'format'} } ) { my $formats = join SPACE, sort map { "'" . $_ . 
"'" } keys %default_file_extension; my $fmt = $rOpts->{'format'}; Die("-format='$fmt' but must be one of: $formats\n"); } my $output_extension = $self->make_file_extension( $rOpts->{'output-file-extension'}, $default_file_extension{ $rOpts->{'format'} } ); # get parameters associated with the -b option my ( $in_place_modify, $backup_extension, $delete_backup ) = $self->check_in_place_modify( $source_stream, $destination_stream ); Perl::Tidy::Formatter::check_options($rOpts); Perl::Tidy::Tokenizer::check_options($rOpts); Perl::Tidy::VerticalAligner::check_options($rOpts); if ( $rOpts->{'format'} eq 'html' ) { Perl::Tidy::HtmlWriter->check_options($rOpts); } # make the pattern of file extensions that we shouldn't touch my $forbidden_file_extensions = "(($dot_pattern)(LOG|DEBUG|ERR|TEE)"; if ($output_extension) { my $ext = quotemeta($output_extension); $forbidden_file_extensions .= "|$ext"; } if ( $in_place_modify && $backup_extension ) { my $ext = quotemeta($backup_extension); $forbidden_file_extensions .= "|$ext"; } $forbidden_file_extensions .= ')$'; # Create a diagnostics object if requested; # This is only useful for code development my $diagnostics_object = undef; if ( $rOpts->{'DIAGNOSTICS'} ) { $diagnostics_object = Perl::Tidy::Diagnostics->new(); } # no filenames should be given if input is from an array if ($source_stream) { if ( @Arg_files > 0 ) { Die( "You may not specify any filenames when a source array is given\n" ); } # we'll stuff the source array into Arg_files unshift( @Arg_files, $source_stream ); # No special treatment for source stream which is a filename. # This will enable checks for binary files and other bad stuff. 
$source_stream = undef unless ref($source_stream); } # use stdin by default if no source array and no args else { unshift( @Arg_files, '-' ) unless @Arg_files; } # Flag for loading module Unicode::GCString for evaluating text width: # undef = ok to use but not yet loaded # 0 = do not use; failed to load or not wanted # 1 = successfully loaded and ok to use # The module is not actually loaded unless/until it is needed if ( !$rOpts->{'use-unicode-gcstring'} ) { $loaded_unicode_gcstring = 0; } # Remove duplicate filenames. Otherwise, for example if the user entered # perltidy -b myfile.pl myfile.pl # the backup version of the original would be lost. if ( @Arg_files > 1 ) { my %seen = (); @Arg_files = grep { !$seen{$_}++ } @Arg_files; } # If requested, process in order of increasing file size # This can significantly reduce perl's virtual memory usage during testing. if ( @Arg_files > 1 && $rOpts->{'file-size-order'} ) { @Arg_files = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, -e $_ ? -s $_ : 0 ] } @Arg_files; } my $logfile_header = make_logfile_header( $rOpts, $config_file, $rraw_options, $Windows_type, $readable_options, ); # Store some values needed by lower level routines $self->[_diagnostics_object_] = $diagnostics_object; $self->[_postfilter_] = $postfilter; $self->[_prefilter_] = $prefilter; $self->[_user_formatter_] = $user_formatter; #-------------------------- # loop to process all files #-------------------------- $self->process_all_files( \%input_hash, \@Arg_files, # filename stuff... $output_extension, $forbidden_file_extensions, $in_place_modify, $backup_extension, $delete_backup, # logfile stuff... $logfile_header, $rpending_complaint, $rpending_logfile_message, ); #----- # Exit #----- # Fix for RT #130297: return a true value if anything was written to the # standard error output, even non-fatal warning messages, otherwise return # false. 
# These exit codes are returned: # 0 = perltidy ran to completion with no errors # 1 = perltidy could not run to completion due to errors # 2 = perltidy ran to completion with error messages # Note that if perltidy is run with multiple files, any single file with # errors or warnings will write a line like # '## Please see file testing.t.ERR' # to standard output for each file with errors, so the flag will be true, # even if only some of the multiple files may have had errors. NORMAL_EXIT: my $ret = $Warn_count ? 2 : 0; return wantarray ? ( $ret, $rstatus ) : $ret; ERROR_EXIT: return wantarray ? ( 1, $rstatus ) : 1; } ## end sub perltidy sub make_file_extension { # Make a file extension, adding any leading '.' if necessary. # (the '.' may actually be an '_' under VMS). my ( $self, $extension, $default ) = @_; # '$extension' is the first choice (usually a user entry) # '$default' is a backup extension $extension = EMPTY_STRING unless defined($extension); $extension =~ s/^\s+//; $extension =~ s/\s+$//; # Use default extension if nothing remains of the first choice # if ( length($extension) == 0 ) { $extension = $default; $extension = EMPTY_STRING unless defined($extension); $extension =~ s/^\s+//; $extension =~ s/\s+$//; } # Only extensions with these leading characters get a '.' # This rule gives the user some freedom. if ( $extension =~ /^[a-zA-Z0-9]/ ) { my $dot = $self->[_file_extension_separator_]; $extension = $dot . $extension; } return $extension; } ## end sub make_file_extension sub check_in_place_modify { my ( $self, $source_stream, $destination_stream ) = @_; # get parameters associated with the -b option my $rOpts = $self->[_rOpts_]; # check for -b option; # silently ignore unless beautify mode my $in_place_modify = $rOpts->{'backup-and-modify-in-place'} && $rOpts->{'format'} eq 'tidy'; my ( $backup_extension, $delete_backup ); # Turn off -b with warnings in case of conflicts with other options. 
    # NOTE: Do this silently, without warnings, if there is a source or
    # destination stream, or standard output is used. This is because the -b
    # flag may have been in a .perltidyrc file and warnings break
    # Test::NoWarnings. See email discussion with Merijn Brand 26 Feb 2014.
    if ($in_place_modify) {
        if (   $rOpts->{'standard-output'}
            || $destination_stream
            || ref $source_stream
            || $rOpts->{'outfile'}
            || defined( $rOpts->{'output-path'} ) )
        {
            $in_place_modify = 0;
        }
    }

    if ($in_place_modify) {

        # If the backup extension contains a / character then the backup should
        # be deleted when the -b option is used. On older versions of
        # perltidy this will generate an error message due to an illegal
        # file name.
        #
        # A backup file will still be generated but will be deleted
        # at the end. If -bext='/' then this extension will be
        # the default 'bak'. Otherwise it will be whatever characters
        # remain after all '/' characters are removed. For example:
        # -bext         extension   slashes
        # '/'           bak         1
        # '/delete'     delete      1
        # 'delete/'     delete      1
        # '/dev/null'   devnull     2    (Currently not allowed)
        my $bext = $rOpts->{'backup-file-extension'};
        $delete_backup = ( $rOpts->{'backup-file-extension'} =~ s/\///g );

        # At present only one forward slash is allowed.
        # In the future multiple slashes may be permitted to support other
        # options
        if ( $delete_backup > 1 ) {
            Die("-bext=$bext contains more than one '/'\n");
        }

        $backup_extension =
          $self->make_file_extension( $rOpts->{'backup-file-extension'},
            'bak' );
    }

    my $backup_method = $rOpts->{'backup-method'};
    if (   defined($backup_method)
        && $backup_method ne 'copy'
        && $backup_method ne 'move' )
    {
        Die(
"Unexpected --backup-method='$backup_method'; must be one of: 'move', 'copy'\n"
        );
    }

    return ( $in_place_modify, $backup_extension, $delete_backup );
} ## end sub check_in_place_modify

sub backup_method_copy {

    my ( $self, $input_file, $output_file, $backup_extension, $delete_backup )
      = @_;

    # Handle the -b (--backup-and-modify-in-place) option with -bm='copy':
    # - First copy $input_file to $backup_file.
    # - Then open input file and rewrite with contents of $output_file
    # - Then delete the backup if requested

    # NOTES:
    # - Die immediately on any error.
    # - $output_file is actually an ARRAY ref

    my $backup_file = $input_file . $backup_extension;

    unless ( -f $input_file ) {

        # no real file to back up ..
# This shouldn't happen because of numerous preliminary checks Die( "problem with -b backing up input file '$input_file': not a file\n" ); } if ( -f $backup_file ) { unlink($backup_file) or Die( "unable to remove previous '$backup_file' for -b option; check permissions: $ERRNO\n" ); } # Copy input file to backup File::Copy::copy( $input_file, $backup_file ) or Die("File::Copy failed trying to backup source: $ERRNO"); # set permissions of the backup file to match the input file my @input_file_stat = stat($input_file); my $in_place_modify = 1; $self->set_output_file_permissions( $backup_file, \@input_file_stat, $in_place_modify ); # set the modification time of the copy to the original value (rt#145999) my ( $read_time, $write_time ) = @input_file_stat[ 8, 9 ]; if ( defined($write_time) ) { utime( $read_time, $write_time, $backup_file ) || Warn("error setting times for backup file '$backup_file'\n"); } # Open the original input file for writing ... opening with ">" will # truncate the existing data. open( my $fout, ">", $input_file ) || Die( "problem re-opening $input_file for write for -b option; check file and directory permissions: $ERRNO\n" ); if ( $self->[_is_encoded_data_] ) { binmode $fout, ":raw:encoding(UTF-8)"; } # Now copy the formatted output to it.. # if formatted output is in an ARRAY ref (normally this is true)... if ( ref($output_file) eq 'ARRAY' ) { foreach my $line ( @{$output_file} ) { $fout->print($line) or Die("cannot print to '$input_file' with -b option: $OS_ERROR\n"); } } # or in a SCALAR ref (less efficient, and only used for testing) elsif ( ref($output_file) eq 'SCALAR' ) { foreach my $line ( split /^/, ${$output_file} ) { $fout->print($line) or Die("cannot print to '$input_file' with -b option: $OS_ERROR\n"); } } # Error if anything else ... 
# This can only happen if the output was changed from \@tmp_buff else { my $ref = ref($output_file); Die(<close() or Die("cannot close '$input_file' with -b option: $OS_ERROR\n"); # Set permissions of the output file to match the input file. This is # necessary even if the inode remains unchanged because suid/sgid bits may # have been reset. $self->set_output_file_permissions( $input_file, \@input_file_stat, $in_place_modify ); # Keep original modification time if no change (rt#145999) if ( !$self->[_input_output_difference_] && defined($write_time) ) { utime( $read_time, $write_time, $input_file ) || Warn("error setting times for '$input_file'\n"); } #--------------------------------------------------------- # remove the original file for in-place modify as follows: # $delete_backup=0 never # $delete_backup=1 only if no errors # $delete_backup>1 always : NOT ALLOWED, too risky #--------------------------------------------------------- if ( $delete_backup && -f $backup_file ) { # Currently, $delete_backup may only be 1. But if a future update # allows a value > 1, then reduce it to 1 if there were warnings. if ( $delete_backup > 1 && $self->[_logger_object_]->get_warning_count() ) { $delete_backup = 1; } # As an added safety precaution, do not delete the source file # if its size has dropped from positive to zero, since this # could indicate a disaster of some kind, including a hardware # failure. Actually, this could happen if you had a file of # all comments (or pod) and deleted everything with -dac (-dap) # for some reason. 
if ( !-s $input_file && -s $backup_file && $delete_backup == 1 ) { Warn( "output file '$input_file' missing or zero length; original '$backup_file' not deleted\n" ); } else { unlink($backup_file) or Die( "unable to remove backup file '$backup_file' for -b option; check permissions: $ERRNO\n" ); } } # Verify that inode is unchanged during development if (DEVEL_MODE) { my @output_file_stat = stat($input_file); my $inode_input = $input_file_stat[1]; my $inode_output = $output_file_stat[1]; if ( $inode_input != $inode_output ) { Fault(<[_is_encoded_data_]; my ( $fout, $iname ) = Perl::Tidy::streamhandle( $input_file, 'w', $is_encoded_data ); if ( !$fout ) { Die( "problem re-opening $input_file for write for -b option; check file and directory permissions: $ERRNO\n" ); } # Now copy the formatted output to it.. # if formatted output is in an ARRAY ref ... if ( ref($output_file) eq 'ARRAY' ) { foreach my $line ( @{$output_file} ) { $fout->print($line) or Die("cannot print to '$input_file' with -b option: $OS_ERROR\n"); } } # or in a SCALAR ref (less efficient, for testing only) elsif ( ref($output_file) eq 'SCALAR' ) { foreach my $line ( split /^/, ${$output_file} ) { $fout->print($line) or Die("cannot print to '$input_file' with -b option: $OS_ERROR\n"); } } # Error if anything else ... 
# This can only happen if the output was changed from \@tmp_buff else { my $ref = ref($output_file); Die(<close() or Die("cannot close '$input_file' with -b option: $OS_ERROR\n"); # set permissions of the output file to match the input file my $in_place_modify = 1; $self->set_output_file_permissions( $input_file, \@input_file_stat, $in_place_modify ); # Keep original modification time if no change (rt#145999) my ( $read_time, $write_time ) = @input_file_stat[ 8, 9 ]; if ( !$self->[_input_output_difference_] && defined($write_time) ) { utime( $read_time, $write_time, $input_file ) || Warn("error setting times for '$input_file'\n"); } #--------------------------------------------------------- # remove the original file for in-place modify as follows: # $delete_backup=0 never # $delete_backup=1 only if no errors # $delete_backup>1 always : NOT ALLOWED, too risky #--------------------------------------------------------- if ( $delete_backup && -f $backup_name ) { # Currently, $delete_backup may only be 1. But if a future update # allows a value > 1, then reduce it to 1 if there were warnings. if ( $delete_backup > 1 && $self->[_logger_object_]->get_warning_count() ) { $delete_backup = 1; } # As an added safety precaution, do not delete the source file # if its size has dropped from positive to zero, since this # could indicate a disaster of some kind, including a hardware # failure. Actually, this could happen if you had a file of # all comments (or pod) and deleted everything with -dac (-dap) # for some reason. 
if ( !-s $input_file && -s $backup_name && $delete_backup == 1 ) { Warn( "output file '$input_file' missing or zero length; original '$backup_name' not deleted\n" ); } else { unlink($backup_name) or Die( "unable to remove previous '$backup_name' for -b option; check permissions: $ERRNO\n" ); } } return; } ## end sub backup_method_move sub set_output_file_permissions { my ( $self, $output_file, $rinput_file_stat, $in_place_modify ) = @_; # Given: # $output_file = the file whose permissions we will set # $rinput_file_stat = the result of stat($input_file) # $in_place_modify = true if --backup-and-modify-in-place is set my ( $mode_i, $uid_i, $gid_i ) = @{$rinput_file_stat}[ 2, 4, 5 ]; my ( $uid_o, $gid_o ) = ( stat($output_file) )[ 4, 5 ]; my $input_file_permissions = $mode_i & oct(7777); my $output_file_permissions = $input_file_permissions; #rt128477: avoid inconsistent owner/group and suid/sgid if ( $uid_i != $uid_o || $gid_i != $gid_o ) { # try to change owner and group to match input file if # in -b mode. Note: chown returns number of files # successfully changed. if ( $in_place_modify && chown( $uid_i, $gid_i, $output_file ) ) { # owner/group successfully changed } else { # owner or group differ: do not copy suid and sgid $output_file_permissions = $mode_i & oct(777); if ( $input_file_permissions != $output_file_permissions ) { Warn( "Unable to copy setuid and/or setgid bits for output file '$output_file'\n" ); } } } # Mark the output file for rw unless we are in -b mode. # Explanation: perltidy does not unlink existing output # files before writing to them, for safety. If a # designated output file exists and is not writable, # perltidy will halt. This can prevent a data loss if a # user accidentally enters "perltidy infile -o # important_ro_file", or "perltidy infile -st # >important_ro_file". But it also means that perltidy can # get locked out of rerunning unless it marks its own # output files writable. 
The alternative, of always # unlinking the designated output file, is less safe and # not always possible, except in -b mode, where there is an # assumption that a previous backup can be unlinked even if # not writable. if ( !$in_place_modify ) { $output_file_permissions |= oct(600); } if ( !chmod( $output_file_permissions, $output_file ) ) { # couldn't change file permissions my $operm = sprintf "%04o", $output_file_permissions; Warn( "Unable to set permissions for output file '$output_file' to $operm\n" ); } return; } ## end sub set_output_file_permissions sub get_decoded_string_buffer { my ( $self, $input_file, $display_name, $rpending_logfile_message ) = @_; # Decode the input buffer if necessary or requested # Given # $input_file = the input file or stream # $display_name = its name to use in error messages # Return # $buf = string buffer with input, decoded from utf8 if necessary # $is_encoded_data = true if $buf is decoded from utf8 # $decoded_input_as = true if perltidy decoded input buf # $encoding_log_message = messages for log file, # $length_function = function to use for measuring string width # Return nothing on any error; this is a signal to skip this file my $rOpts = $self->[_rOpts_]; my $source_object = Perl::Tidy::LineSource->new( input_file => $input_file, rOpts => $rOpts, ); # return nothing if error return unless ($source_object); my $buf = EMPTY_STRING; while ( my $line = $source_object->get_line() ) { $buf .= $line; } my $encoding_in = EMPTY_STRING; my $rOpts_character_encoding = $rOpts->{'character-encoding'}; my $encoding_log_message; my $decoded_input_as = EMPTY_STRING; $rstatus->{'char_mode_source'} = 0; # Case 1: If Perl is already in a character-oriented mode for this # string rather than a byte-oriented mode. Normally, this happens if # the caller has decoded a utf8 string before calling perltidy. But it # could also happen if the user has done some unusual manipulations of # the source. 
In any case, we will not attempt to decode it because # that could result in an output string in a different mode. if ( is_char_mode($buf) ) { $encoding_in = "utf8"; $rstatus->{'char_mode_source'} = 1; } # Case 2. No input stream encoding requested. This is appropriate # for single-byte encodings like ascii, latin-1, etc elsif ( !$rOpts_character_encoding || $rOpts_character_encoding eq 'none' ) { # nothing to do } # Case 3. guess input stream encoding if requested elsif ( lc($rOpts_character_encoding) eq 'guess' ) { # The guessing strategy is simple: use Encode::Guess to guess # an encoding. If and only if the guess is utf8, try decoding and # use it if successful. Otherwise, we proceed assuming the # characters are encoded as single bytes (same as if 'none' had # been specified as the encoding). # In testing I have found that including additional guess 'suspect' # encodings sometimes works but can sometimes lead to disaster by # using an incorrect decoding. my $buf_in = $buf; my $decoder = guess_encoding( $buf_in, 'utf8' ); if ( ref($decoder) ) { $encoding_in = $decoder->name; if ( $encoding_in ne 'UTF-8' && $encoding_in ne 'utf8' ) { $encoding_in = EMPTY_STRING; $buf = $buf_in; $encoding_log_message .= <decode($buf_in); 1 } ) { $encoding_log_message .= <[_is_encoded_data_] = $is_encoded_data; # Delete any Byte Order Mark (BOM), which can cause trouble if ($is_encoded_data) { $buf =~ s/^\x{FEFF}//; } $rstatus->{'input_name'} = $display_name; $rstatus->{'opt_encoding'} = $rOpts_character_encoding; $rstatus->{'char_mode_used'} = $encoding_in ? 
1 : 0; $rstatus->{'input_decoded_as'} = $decoded_input_as; # Define the function to determine the display width of character # strings my $length_function = sub { return length( $_[0] ) }; if ($is_encoded_data) { # Try to load Unicode::GCString for defining text display width, if # requested, when the first encoded file is encountered if ( !defined($loaded_unicode_gcstring) ) { if ( eval { require Unicode::GCString; 1 } ) { $loaded_unicode_gcstring = 1; } else { $loaded_unicode_gcstring = 0; if ( $rOpts->{'use-unicode-gcstring'} ) { Warn(<new( $_[0] )->columns; }; $encoding_log_message .= <{'gcs_used'} = 1; } } return ( $buf, $is_encoded_data, $decoded_input_as, $encoding_log_message, $length_function, ); } ## end sub get_decoded_string_buffer sub process_all_files { my ( $self, $rinput_hash, $rfiles, $output_extension, $forbidden_file_extensions, $in_place_modify, $backup_extension, $delete_backup, $logfile_header, $rpending_complaint, $rpending_logfile_message, ) = @_; # This routine is the main loop to process all files. 
# Total formatting is done with these layers of subroutines: # perltidy - main routine; checks run parameters # *process_all_files - main loop to process all files; *THIS LAYER # process_filter_layer - do any pre and post processing; # process_iteration_layer - handle any iterations on formatting # process_single_case - solves one formatting problem my $rOpts = $self->[_rOpts_]; my $dot = $self->[_file_extension_separator_]; my $diagnostics_object = $self->[_diagnostics_object_]; my $line_separator_default = $self->[_line_separator_default_]; my $destination_stream = $rinput_hash->{'destination'}; my $errorfile_stream = $rinput_hash->{'errorfile'}; my $logfile_stream = $rinput_hash->{'logfile'}; my $teefile_stream = $rinput_hash->{'teefile'}; my $debugfile_stream = $rinput_hash->{'debugfile'}; my $source_stream = $rinput_hash->{'source'}; my $stderr_stream = $rinput_hash->{'stderr'}; my $number_of_files = @{$rfiles}; while ( my $input_file = shift @{$rfiles} ) { my $fileroot; my @input_file_stat; my $display_name; #-------------------------- # prepare this input stream #-------------------------- if ($source_stream) { $fileroot = "perltidy"; $display_name = ""; # If the source is from an array or string, then .LOG output # is only possible if a logfile stream is specified. This prevents # unexpected perltidy.LOG files. 
if ( !defined($logfile_stream) ) { $logfile_stream = Perl::Tidy::DevNull->new(); # Likewise for .TEE and .DEBUG output } if ( !defined($teefile_stream) ) { $teefile_stream = Perl::Tidy::DevNull->new(); } if ( !defined($debugfile_stream) ) { $debugfile_stream = Perl::Tidy::DevNull->new(); } } elsif ( $input_file eq '-' ) { # '-' indicates input from STDIN $fileroot = "perltidy"; # root name to use for .ERR, .LOG, etc $display_name = ""; $in_place_modify = 0; } else { $fileroot = $input_file; $display_name = $input_file; unless ( -e $input_file ) { # file doesn't exist - check for a file glob if ( $input_file =~ /([\?\*\[\{])/ ) { # Windows shell may not remove quotes, so do it my $input_file = $input_file; if ( $input_file =~ /^\'(.+)\'$/ ) { $input_file = $1 } if ( $input_file =~ /^\"(.+)\"$/ ) { $input_file = $1 } my $pattern = fileglob_to_re($input_file); my $dh; if ( opendir( $dh, './' ) ) { my @files = grep { /$pattern/ && !-d } readdir($dh); closedir($dh); next unless (@files); unshift @{$rfiles}, @files; next; } } Warn("skipping file: '$input_file': no matches found\n"); next; } unless ( -f $input_file ) { Warn("skipping file: $input_file: not a regular file\n"); next; } # As a safety precaution, skip zero length files. # If for example a source file got clobbered somehow, # the old .tdy or .bak files might still exist so we # shouldn't overwrite them with zero length files. unless ( -s $input_file ) { Warn("skipping file: $input_file: Zero size\n"); next; } # And avoid formatting extremely large files. Since perltidy reads # files into memory, trying to process an extremely large file # could cause system problems. 
            my $size_in_mb = ( -s $input_file ) / ( 1024 * 1024 );
            if ( $size_in_mb > $rOpts->{'maximum-file-size-mb'} ) {
                $size_in_mb = sprintf( "%0.1f", $size_in_mb );
                Warn(
"skipping file: $input_file: size $size_in_mb MB exceeds limit $rOpts->{'maximum-file-size-mb'}; use -mfs=i to change\n"
                );
                next;
            }

            unless ( ( -T $input_file ) || $rOpts->{'force-read-binary'} ) {
                Warn("skipping file: $input_file: Non-text (override with -f)\n"
                );
                next;
            }

            # Input file must be writable for -b -bm='copy'. We must catch
            # this early to prevent encountering trouble after unlinking the
            # previous backup.
            if ( $in_place_modify && !-w $input_file ) {
                my $backup_method = $rOpts->{'backup-method'};
                if ( defined($backup_method) && $backup_method eq 'copy' ) {
                    Warn
"skipping file '$input_file' for -b option: file reported as non-writable\n";
                    next;
                }
            }

            # we should have a valid filename now
            $fileroot        = $input_file;
            @input_file_stat = stat($input_file);

            if ( $OSNAME eq 'VMS' ) {
                ( $fileroot, $dot ) = check_vms_filename($fileroot);
                $self->[_file_extension_separator_] = $dot;
            }

            # add option to change path here
            if ( defined( $rOpts->{'output-path'} ) ) {

                my ( $base, $old_path ) = fileparse($fileroot);
                my $new_path = $rOpts->{'output-path'};
                unless ( -d $new_path ) {
                    unless ( mkdir $new_path, 0777 ) {
                        Die("unable to create directory $new_path: $ERRNO\n");
                    }
                }
                my $path = $new_path;
                $fileroot = catfile( $path, $base );
                unless ($fileroot) {
                    Die(<get_decoded_string_buffer( $input_file, $display_name, $rpending_logfile_message );

        # Skip this file on any error
        next if ( !defined($buf) );

        # Register this file name with the Diagnostics package, if any.
        $diagnostics_object->set_input_file($input_file)
          if $diagnostics_object;

        # OK: the (possibly decoded) input is now in string $buf. We just need
        # to prepare the output and error logger before formatting it.
#-------------------------- # prepare the output stream #-------------------------- my $output_file = undef; my $output_name = EMPTY_STRING; my $actual_output_extension; if ( $rOpts->{'outfile'} ) { if ( $number_of_files <= 1 ) { if ( $rOpts->{'standard-output'} ) { my $saw_pbp = $self->[_saw_pbp_]; my $msg = "You may not use -o and -st together"; $msg .= " (-pbp contains -st; see manual)" if ($saw_pbp); Die("$msg\n"); } elsif ($destination_stream) { Die( "You may not specify a destination array and -o together\n" ); } elsif ( defined( $rOpts->{'output-path'} ) ) { Die("You may not specify -o and -opath together\n"); } elsif ( defined( $rOpts->{'output-file-extension'} ) ) { Die("You may not specify -o and -oext together\n"); } $output_file = $rOpts->{outfile}; $output_name = $output_file; # make sure user gives a file name after -o if ( $output_file =~ /^-/ ) { Die("You must specify a valid filename after -o\n"); } # do not overwrite input file with -o if ( @input_file_stat && ( $output_file eq $input_file ) ) { Die("Use 'perltidy -b $input_file' to modify in-place\n"); } } else { Die("You may not use -o with more than one input file\n"); } } elsif ( $rOpts->{'standard-output'} ) { if ($destination_stream) { my $saw_pbp = $self->[_saw_pbp_]; my $msg = "You may not specify a destination array and -st together\n"; $msg .= " (-pbp contains -st; see manual)" if ($saw_pbp); Die("$msg\n"); } $output_file = '-'; $output_name = ""; if ( $number_of_files <= 1 ) { } else { Die("You may not use -st with more than one input file\n"); } } elsif ($destination_stream) { $output_file = $destination_stream; $output_name = ""; } elsif ($source_stream) { # source but no destination goes to stdout $output_file = '-'; $output_name = ""; } elsif ( $input_file eq '-' ) { $output_file = '-'; $output_name = ""; } else { if ($in_place_modify) { # Send output to a temporary array buffer. This will # allow efficient copying back to the input by # sub backup_and_modify_in_place, below. 
my @tmp_buff; $output_file = \@tmp_buff; $output_name = $display_name; } else { $actual_output_extension = $output_extension; $output_file = $fileroot . $output_extension; $output_name = $output_file; } } $rstatus->{'file_count'} += 1; $rstatus->{'output_name'} = $output_name; $rstatus->{'iteration_count'} = 0; $rstatus->{'converged'} = 0; #------------------------------------------ # initialize the error logger for this file #------------------------------------------ my $warning_file = $fileroot . $dot . "ERR"; if ($errorfile_stream) { $warning_file = $errorfile_stream } my $log_file = $fileroot . $dot . "LOG"; if ($logfile_stream) { $log_file = $logfile_stream } # The logger object handles warning messages, logfile messages, # and can supply basic run information to lower level routines. my $logger_object = Perl::Tidy::Logger->new( rOpts => $rOpts, log_file => $log_file, warning_file => $warning_file, fh_stderr => $fh_stderr, display_name => $display_name, is_encoded_data => $is_encoded_data, ); $logger_object->write_logfile_entry($logfile_header); $logger_object->write_logfile_entry($encoding_log_message) if $encoding_log_message; # Now we can add any pending messages to the log if ( ${$rpending_logfile_message} ) { $logger_object->write_logfile_entry( ${$rpending_logfile_message} ); } if ( ${$rpending_complaint} ) { $logger_object->complain( ${$rpending_complaint} ); } # Use input line endings if requested my $line_separator = $line_separator_default; if ( $rOpts->{'preserve-line-endings'} ) { my $ls_input = find_input_line_ending($input_file); if ( defined($ls_input) ) { $line_separator = $ls_input } } # additional parameters needed by lower level routines $self->[_actual_output_extension_] = $actual_output_extension; $self->[_debugfile_stream_] = $debugfile_stream; $self->[_decoded_input_as_] = $decoded_input_as; $self->[_destination_stream_] = $destination_stream; $self->[_display_name_] = $display_name; $self->[_fileroot_] = $fileroot; 
        $self->[_is_encoded_data_]         = $is_encoded_data;
        $self->[_length_function_]         = $length_function;
        $self->[_line_separator_]          = $line_separator;
        $self->[_logger_object_]           = $logger_object;
        $self->[_output_file_]             = $output_file;
        $self->[_teefile_stream_]          = $teefile_stream;
        $self->[_input_copied_verbatim_]   = 0;
        $self->[_input_output_difference_] = 1;    ## updated later if -b used

        #----------------------------------------------------------
        # Do all formatting of this buffer.
        # Results will go to the selected output file or stream(s)
        #----------------------------------------------------------
        $self->process_filter_layer($buf);

        #--------------------------------------------------
        # Handle the -b option (backup and modify in-place)
        #--------------------------------------------------
        if ($in_place_modify) {

            # For -b option, leave the file unchanged if a severe error caused
            # formatting to be skipped. Otherwise we will overwrite any backup.
            if ( !$self->[_input_copied_verbatim_] ) {

                my $backup_method = $rOpts->{'backup-method'};

                # Option 1, -bm='copy': uses newer version in which original is
                # copied to the backup and rewritten; see git #103.
                if ( defined($backup_method) && $backup_method eq 'copy' ) {
                    $self->backup_method_copy(
                        $input_file,       $output_file,
                        $backup_extension, $delete_backup
                    );
                }

                # Option 2, -bm='move': uses older version, where original is
                # moved to the backup and formatted output goes to a new file.
else { $self->backup_method_move( $input_file, $output_file, $backup_extension, $delete_backup ); } } $output_file = $input_file; } #------------------------------------------------------------------- # Otherwise set output file ownership and permissions if appropriate #------------------------------------------------------------------- elsif ( $output_file && -f $output_file && !-l $output_file ) { if (@input_file_stat) { if ( $rOpts->{'format'} eq 'tidy' ) { $self->set_output_file_permissions( $output_file, \@input_file_stat, $in_place_modify ); } # else use default permissions for html and any other format } } $logger_object->finish() if $logger_object; } ## end of main loop to process all files return; } ## end sub process_all_files sub process_filter_layer { my ( $self, $buf ) = @_; # This is the filter layer of processing. # Do all requested formatting on the string '$buf', including any # pre- and post-processing with filters. # Store the results in the selected output file(s) or stream(s). 
# Total formatting is done with these layers of subroutines: # perltidy - main routine; checks run parameters # process_all_files - main loop to process all files; # *process_filter_layer - do any pre and post processing; *THIS LAYER # process_iteration_layer - handle any iterations on formatting # process_single_case - solves one formatting problem # Data Flow in this layer: # $buf # -> optional prefilter operation # -> [ formatting by sub process_iteration_layer ] # -> ( optional postfilter_buffer for postfilter, other operations ) # -> ( optional destination_buffer for encoding ) # -> final sink_object # What is done based on format type: # utf8 decoding is done for all format types # prefiltering is applied to all format types # - because it may be needed to get through the tokenizer # postfiltering is only done for format='tidy' # - might cause problems operating on html text # encoding of decoded output is only done for format='tidy' # - because html does its own encoding; user formatter does what it wants my $rOpts = $self->[_rOpts_]; my $is_encoded_data = $self->[_is_encoded_data_]; my $logger_object = $self->[_logger_object_]; my $output_file = $self->[_output_file_]; my $user_formatter = $self->[_user_formatter_]; my $destination_stream = $self->[_destination_stream_]; my $prefilter = $self->[_prefilter_]; my $postfilter = $self->[_postfilter_]; my $decoded_input_as = $self->[_decoded_input_as_]; my $line_separator = $self->[_line_separator_]; my $remove_terminal_newline = !$rOpts->{'add-terminal-newline'} && substr( $buf, -1, 1 ) !~ /\n/; # vars for postfilter, if used my $use_postfilter_buffer; my $postfilter_buffer; # vars for destination buffer, if used my $destination_buffer; my $use_destination_buffer; my $encode_destination_buffer; # vars for iterations, if done my $sink_object; # vars for checking assertions, if needed my $digest_input; my $saved_input_buf; my $ref_destination_stream = ref($destination_stream); # Setup vars for postfilter, 
destination buffer, assertions and sink object # if needed. These are only used for 'tidy' formatting. if ( $rOpts->{'format'} eq 'tidy' ) { # evaluate MD5 sum of input file, if needed, before any prefilter if ( $rOpts->{'assert-tidy'} || $rOpts->{'assert-untidy'} || $rOpts->{'backup-and-modify-in-place'} ) { $digest_input = $md5_hex->($buf); $saved_input_buf = $buf; } #----------------------- # Setup postfilter buffer #----------------------- # If we need access to the output for filtering or checking assertions # before writing to its ultimate destination, then we will send it # to a temporary buffer. The variables are: # $postfilter_buffer = the buffer to capture the output # $use_postfilter_buffer = is a postfilter buffer used? # These are used below, just after iterations are made. $use_postfilter_buffer = $postfilter || $remove_terminal_newline || $rOpts->{'assert-tidy'} || $rOpts->{'assert-untidy'} || $rOpts->{'backup-and-modify-in-place'}; #------------------------- # Setup destination_buffer #------------------------- # If the final output destination is not a file, then we might need to # encode the result at the end of processing. So in this case we will # send the output to a temporary buffer. # The key variables are: # $destination_buffer - receives the formatted output # $use_destination_buffer - is $destination_buffer used? # $encode_destination_buffer - encode $destination_buffer? 
# These are used by sub 'copy_buffer_to_destination', below if ($ref_destination_stream) { $use_destination_buffer = 1; $output_file = \$destination_buffer; $self->[_output_file_] = $output_file; # Strings and arrays use special encoding rules if ( $ref_destination_stream eq 'SCALAR' || $ref_destination_stream eq 'ARRAY' ) { $encode_destination_buffer = $rOpts->{'encode-output-strings'} && $decoded_input_as; } # An object with a print method will use file encoding rules elsif ( $ref_destination_stream->can('print') ) { $encode_destination_buffer = $is_encoded_data; } else { confess <new( output_file => $use_postfilter_buffer ? \$postfilter_buffer : $output_file, line_separator => $line_separator, is_encoded_data => $is_encoded_data, ); } #----------------------------------------------------------------------- # Apply any prefilter. The prefilter is a code reference that will be # applied to the source before tokenizing. Note that we are doing this # for all format types ('tidy', 'html', 'user') because it may be needed # to avoid tokenization errors. #----------------------------------------------------------------------- $buf = $prefilter->($buf) if $prefilter; #---------------------------------------------------------------------- # Format contents of string '$buf', iterating if requested. # For 'tidy', formatted result will be written to '$sink_object' # For 'html' and 'user', result goes directly to its ultimate destination. #---------------------------------------------------------------------- $self->process_iteration_layer( $buf, $sink_object ); #-------------------------------- # Do postfilter buffer processing #-------------------------------- if ($use_postfilter_buffer) { my $sink_object_post = Perl::Tidy::LineSink->new( output_file => $output_file, line_separator => $line_separator, is_encoded_data => $is_encoded_data, ); #---------------------------------------------------------------------- # Apply any postfilter. 
The postfilter is a code reference that will be # applied to the source after tidying. #---------------------------------------------------------------------- my $buf_post = $postfilter ? $postfilter->($postfilter_buffer) : $postfilter_buffer; if ( defined($digest_input) ) { my $digest_output = $md5_hex->($buf_post); $self->[_input_output_difference_] = $digest_output ne $digest_input; } # Check if file changed if requested, but only after any postfilter if ( $rOpts->{'assert-tidy'} ) { if ( $self->[_input_output_difference_] ) { my $diff_msg = compare_string_buffers( $saved_input_buf, $buf_post, $is_encoded_data ); $logger_object->warning(<interrupt_logfile(); $logger_object->warning( $diff_msg . "\n" ); $logger_object->resume_logfile(); } } if ( $rOpts->{'assert-untidy'} ) { if ( !$self->[_input_output_difference_] ) { $logger_object->warning( "assertion failure: '--assert-untidy' is set but output equals input\n" ); } } my $source_object = Perl::Tidy::LineSource->new( input_file => \$buf_post, rOpts => $rOpts, ); # Copy the filtered buffer to the final destination if ( !$remove_terminal_newline ) { while ( my $line = $source_object->get_line() ) { $sink_object_post->write_line($line); } } else { # Copy the filtered buffer but remove the newline char from the # final line my $line; while ( my $next_line = $source_object->get_line() ) { $sink_object_post->write_line($line) if ($line); $line = $next_line; } if ($line) { $sink_object_post->set_line_separator(undef); chomp $line; $sink_object_post->write_line($line); } } $sink_object_post->close_output_file(); $source_object->close_input_file(); } #-------------------------------------------------------- # Do destination buffer processing, encoding if required. #-------------------------------------------------------- if ($use_destination_buffer) { $self->copy_buffer_to_destination( $destination_buffer, $destination_stream, $encode_destination_buffer ); } else { # output went to a file in 'tidy' mode... 
if ( $is_encoded_data && $rOpts->{'format'} eq 'tidy' ) { $rstatus->{'output_encoded_as'} = 'UTF-8'; } } # The final formatted result should now be in the selected output file(s) # or stream(s). return; } ## end sub process_filter_layer sub process_iteration_layer { my ( $self, $buf, $sink_object ) = @_; # This is the iteration layer of processing. # Do all formatting, iterating if requested, on the source string $buf. # Output depends on format type: # For 'tidy' formatting, output goes to sink object # For 'html' formatting, output goes to the ultimate destination # For 'user' formatting, user formatter handles output # Total formatting is done with these layers of subroutines: # perltidy - main routine; checks run parameters # process_all_files - main loop to process all files; # process_filter_layer - do any pre and post processing # *process_iteration_layer - do any iterations on formatting; *THIS LAYER # process_single_case - solves one formatting problem # Data Flow in this layer: # $buf -> [ loop over iterations ] -> $sink_object # Only 'tidy' formatting can use multiple iterations. my $diagnostics_object = $self->[_diagnostics_object_]; my $display_name = $self->[_display_name_]; my $fileroot = $self->[_fileroot_]; my $is_encoded_data = $self->[_is_encoded_data_]; my $length_function = $self->[_length_function_]; my $line_separator = $self->[_line_separator_]; my $logger_object = $self->[_logger_object_]; my $rOpts = $self->[_rOpts_]; my $tabsize = $self->[_tabsize_]; my $user_formatter = $self->[_user_formatter_]; # create a source object for the buffer my $source_object = Perl::Tidy::LineSource->new( input_file => \$buf, rOpts => $rOpts, ); # make a debugger object if requested my $debugger_object; if ( $rOpts->{DEBUG} ) { my $debug_file = $self->[_debugfile_stream_] || $fileroot . 
$self->make_file_extension('DEBUG'); $debugger_object = Perl::Tidy::Debugger->new( $debug_file, $is_encoded_data ); } # make a tee file handle if requested my $fh_tee; if ( $rOpts->{'tee-pod'} || $rOpts->{'tee-block-comments'} || $rOpts->{'tee-side-comments'} ) { my $tee_file = $self->[_teefile_stream_] || $fileroot . $self->make_file_extension('TEE'); ( $fh_tee, my $tee_filename ) = Perl::Tidy::streamhandle( $tee_file, 'w', $is_encoded_data ); if ( !$fh_tee ) { Warn("couldn't open TEE file $tee_file: $ERRNO\n"); } } # vars for iterations and convergence test my $max_iterations = 1; my $convergence_log_message; my $do_convergence_test; my %saw_md5; # Only 'tidy' formatting can use multiple iterations if ( $rOpts->{'format'} eq 'tidy' ) { # check iteration count and quietly fix if necessary: # - iterations option only applies to code beautification mode # - the convergence check should stop most runs on iteration 2, and # virtually all on iteration 3. But we'll allow up to 6. $max_iterations = $rOpts->{'iterations'}; if ( !defined($max_iterations) || $max_iterations <= 0 ) { $max_iterations = 1; } elsif ( $max_iterations > 6 ) { $max_iterations = 6; } # get starting MD5 sum for convergence test if ( $max_iterations > 1 ) { $do_convergence_test = 1; my $digest = $md5_hex->($buf); $saw_md5{$digest} = 0; } } # save objects to allow redirecting output during iterations my $sink_object_final = $sink_object; my $logger_object_final = $logger_object; my $iteration_of_formatter_convergence; #--------------------- # Loop over iterations #--------------------- foreach my $iter ( 1 .. 
$max_iterations ) { $rstatus->{'iteration_count'} += 1; # send output stream to temp buffers until last iteration my $sink_buffer; if ( $iter < $max_iterations ) { $sink_object = Perl::Tidy::LineSink->new( output_file => \$sink_buffer, line_separator => $line_separator, is_encoded_data => $is_encoded_data, ); } else { $sink_object = $sink_object_final; } # Save logger, debugger and tee output only on pass 1 because: # (1) line number references must be to the starting # source, not an intermediate result, and # (2) we need to know if there are errors so we can stop the # iterations early if necessary. # (3) the tee option only works on first pass if comments are also # being deleted. if ( $iter > 1 ) { $debugger_object->close_debug_file() if ($debugger_object); $fh_tee->close() if ($fh_tee); $debugger_object = undef; $logger_object = undef; $fh_tee = undef; } #--------------------------------- # create a formatter for this file #--------------------------------- my $formatter; if ($user_formatter) { $formatter = $user_formatter; } elsif ( $rOpts->{'format'} eq 'html' ) { my $html_toc_extension = $self->make_file_extension( $rOpts->{'html-toc-extension'}, 'toc' ); my $html_src_extension = $self->make_file_extension( $rOpts->{'html-src-extension'}, 'src' ); $formatter = Perl::Tidy::HtmlWriter->new( input_file => $fileroot, html_file => $self->[_output_file_], extension => $self->[_actual_output_extension_], html_toc_extension => $html_toc_extension, html_src_extension => $html_src_extension, ); } elsif ( $rOpts->{'format'} eq 'tidy' ) { $formatter = Perl::Tidy::Formatter->new( logger_object => $logger_object, diagnostics_object => $diagnostics_object, sink_object => $sink_object, length_function => $length_function, is_encoded_data => $is_encoded_data, fh_tee => $fh_tee, ); } else { Die("I don't know how to do -format=$rOpts->{'format'}\n"); } unless ($formatter) { Die("Unable to continue with $rOpts->{'format'} formatting\n"); } #----------------------------------- 
# create the tokenizer for this file #----------------------------------- my $tokenizer = Perl::Tidy::Tokenizer->new( source_object => $source_object, logger_object => $logger_object, debugger_object => $debugger_object, diagnostics_object => $diagnostics_object, tabsize => $tabsize, rOpts => $rOpts, starting_level => $rOpts->{'starting-indentation-level'}, indent_columns => $rOpts->{'indent-columns'}, look_for_hash_bang => $rOpts->{'look-for-hash-bang'}, look_for_autoloader => $rOpts->{'look-for-autoloader'}, look_for_selfloader => $rOpts->{'look-for-selfloader'}, trim_qw => $rOpts->{'trim-qw'}, extended_syntax => $rOpts->{'extended-syntax'}, continuation_indentation => $rOpts->{'continuation-indentation'}, outdent_labels => $rOpts->{'outdent-labels'}, ); #--------------------------------- # do processing for this iteration #--------------------------------- $self->process_single_case( $tokenizer, $formatter ); #----------------------------------------- # close the input source and report errors #----------------------------------------- $source_object->close_input_file(); # see if the formatter is converged if ( $max_iterations > 1 && !defined($iteration_of_formatter_convergence) && $formatter->can('get_convergence_check') ) { if ( $formatter->get_convergence_check() ) { $iteration_of_formatter_convergence = $iter; $rstatus->{'converged'} = 1; } } # line source for next iteration (if any) comes from the current # temporary output buffer if ( $iter < $max_iterations ) { $sink_object->close_output_file(); $source_object = Perl::Tidy::LineSource->new( input_file => \$sink_buffer, rOpts => $rOpts, ); # stop iterations if errors or converged my $stop_now = $self->[_input_copied_verbatim_]; $stop_now ||= $tokenizer->get_unexpected_error_count(); my $stopping_on_error = $stop_now; if ($stop_now) { $convergence_log_message = <($sink_buffer); if ( !defined( $saw_md5{$digest} ) ) { $saw_md5{$digest} = $iter; } else { # Deja vu, stop iterating $stop_now = 1; my $iterm = 
$iter - 1; if ( $saw_md5{$digest} != $iterm ) { # Blinking (oscillating) between two or more stable # end states. This is unlikely to occur with normal # parameters, but it can occur in stress testing # with extreme parameter values, such as very short # maximum line lengths. We want to catch and fix # them when they happen. $rstatus->{'blinking'} = 1; $convergence_log_message = <write_diagnostics( $convergence_log_message) if $diagnostics_object; # Uncomment to search for blinking states # Warn( "$display_name: blinking; iter $iter same as for $saw_md5{$digest}\n" ); } else { $convergence_log_message = <write_diagnostics( $convergence_log_message) if $diagnostics_object && $iterm > 2; $rstatus->{'converged'} = 1; } } } ## end if ($do_convergence_test) if ($stop_now) { if (DEVEL_MODE) { if ( defined($iteration_of_formatter_convergence) ) { # This message cannot appear unless the formatter # convergence test above is temporarily skipped for # testing. if ( $iteration_of_formatter_convergence < $iter - 1 ) { print STDERR "STRANGE Early conv in $display_name: Stopping on it=$iter, converged in formatter on $iteration_of_formatter_convergence\n"; } } elsif ( !$stopping_on_error ) { print STDERR "STRANGE no conv in $display_name: stopping on it=$iter, but not converged in formatter\n"; } } # we are stopping the iterations early; # copy the output stream to its final destination $sink_object = $sink_object_final; while ( my $line = $source_object->get_line() ) { $sink_object->write_line($line); } $source_object->close_input_file(); last; } } ## end if ( $iter < $max_iterations) } ## end loop over iterations for one source file $sink_object->close_output_file() if $sink_object; $debugger_object->close_debug_file() if $debugger_object; $fh_tee->close() if $fh_tee; # leave logger object open for additional messages $logger_object = $logger_object_final; $logger_object->write_logfile_entry($convergence_log_message) if $convergence_log_message; return; } ## end sub 
process_iteration_layer sub process_single_case { # run the formatter on a single defined case my ( $self, $tokenizer, $formatter ) = @_; # Total formatting is done with these layers of subroutines: # perltidy - main routine; checks run parameters # process_all_files - main loop to process all files; # process_filter_layer - do any pre and post processing; # process_iteration_layer - do any iterations on formatting # *process_single_case - solve one formatting problem; *THIS LAYER while ( my $line = $tokenizer->get_line() ) { $formatter->write_line($line); } # user-defined formatters are possible, and may not have a # sub 'finish_formatting', so we have to check if ( $formatter->can('finish_formatting') ) { my $severe_error = $tokenizer->report_tokenization_errors(); my $verbatim = $formatter->finish_formatting($severe_error); $self->[_input_copied_verbatim_] = $verbatim; } return; } ## end sub process_single_case sub copy_buffer_to_destination { my ( $self, $destination_buffer, $destination_stream, $encode_destination_buffer ) = @_; # Copy $destination_buffer to the final $destination_stream, # encoding if the flag $encode_destination_buffer is true. # Data Flow: # $destination_buffer -> [ encode? 
] -> $destination_stream $rstatus->{'output_encoded_as'} = EMPTY_STRING; if ($encode_destination_buffer) { my $encoded_buffer; if ( !eval { $encoded_buffer = Encode::encode( "UTF-8", $destination_buffer, Encode::FB_CROAK | Encode::LEAVE_SRC ); 1; } ) { Warn( "Error attempting to encode output string ref; encoding not done\n" ); } else { $destination_buffer = $encoded_buffer; $rstatus->{'output_encoded_as'} = 'UTF-8'; } } # Send data for SCALAR, ARRAY & OBJ refs to its final destination if ( ref($destination_stream) eq 'SCALAR' ) { ${$destination_stream} = $destination_buffer; } elsif ($destination_buffer) { my @lines = split /^/, $destination_buffer; if ( ref($destination_stream) eq 'ARRAY' ) { @{$destination_stream} = @lines; } # destination stream must be an object with print method else { foreach my $line (@lines) { $destination_stream->print($line); } my $ref_destination_stream = ref($destination_stream); if ( $ref_destination_stream->can('close') ) { $destination_stream->close(); } } } else { # Empty destination buffer not going to a string ... could # happen for example if user deleted all pod or comments } return; } ## end sub copy_buffer_to_destination } ## end of closure for sub perltidy sub line_diff { # Given two strings, return # $diff_marker = a string with caret (^) symbols indicating differences # $pos1 = character position of first difference; pos1=-1 if no difference # Form exclusive or of the strings, which has null characters where strings # have same common characters so non-null characters indicate character # differences. my ( $s1, $s2 ) = @_; my $diff_marker = EMPTY_STRING; my $pos = -1; my $pos1 = $pos; if ( defined($s1) && defined($s2) ) { my $count = 0; my $mask = $s1 ^ $s2; while ( $mask =~ /[^\0]/g ) { $count++; my $pos_last = $pos; $pos = $LAST_MATCH_START[0]; if ( $count == 1 ) { $pos1 = $pos; } $diff_marker .= SPACE x ( $pos - $pos_last - 1 ) . 
'^'; # we could continue to mark all differences, but there is no point last; } } return wantarray ? ( $diff_marker, $pos1 ) : $diff_marker; } ## end sub line_diff sub compare_string_buffers { # Compare input and output string buffers and return a brief text # description of the first difference. my ( $bufi, $bufo, $is_encoded_data ) = @_; my $leni = length($bufi); my $leno = defined($bufo) ? length($bufo) : 0; my $msg = "Input file length is $leni chars\nOutput file length is $leno chars\n"; return $msg unless $leni && $leno; my ( $fhi, $fnamei ) = streamhandle( \$bufi, 'r', $is_encoded_data ); my ( $fho, $fnameo ) = streamhandle( \$bufo, 'r', $is_encoded_data ); return $msg unless ( $fho && $fhi ); # for safety, shouldn't happen my ( $linei, $lineo ); my ( $counti, $counto ) = ( 0, 0 ); my ( $last_nonblank_line, $last_nonblank_count ) = ( EMPTY_STRING, 0 ); my $truncate = sub { my ( $str, $lenmax ) = @_; if ( length($str) > $lenmax ) { $str = substr( $str, 0, $lenmax ) . "..."; } return $str; }; while (1) { if ($linei) { $last_nonblank_line = $linei; $last_nonblank_count = $counti; } $linei = $fhi->getline(); $lineo = $fho->getline(); # compare chomp'ed lines if ( defined($linei) ) { $counti++; chomp $linei } if ( defined($lineo) ) { $counto++; chomp $lineo } # see if one or both ended before a difference last unless ( defined($linei) && defined($lineo) ); next if ( $linei eq $lineo ); # lines differ ... 
my ( $line_diff, $pos1 ) = line_diff( $linei, $lineo ); my $reason = "Files first differ at character $pos1 of line $counti"; my ( $leading_ws_i, $leading_ws_o ) = ( EMPTY_STRING, EMPTY_STRING ); if ( $linei =~ /^(\s+)/ ) { $leading_ws_i = $1; } if ( $lineo =~ /^(\s+)/ ) { $leading_ws_o = $1; } if ( $leading_ws_i ne $leading_ws_o ) { $reason .= "; leading whitespace differs"; if ( $leading_ws_i =~ /\t/ ) { $reason .= "; input has tab char"; } } else { my ( $trailing_ws_i, $trailing_ws_o ) = ( EMPTY_STRING, EMPTY_STRING ); if ( $linei =~ /(\s+)$/ ) { $trailing_ws_i = $1; } if ( $lineo =~ /(\s+)$/ ) { $trailing_ws_o = $1; } if ( $trailing_ws_i ne $trailing_ws_o ) { $reason .= "; trailing whitespace differs"; } } $msg .= $reason . "\n"; # limit string display length if ( $pos1 > 60 ) { my $drop = $pos1 - 40; $linei = "..." . substr( $linei, $drop ); $lineo = "..." . substr( $lineo, $drop ); $line_diff = SPACE x 3 . substr( $line_diff, $drop ); } $linei = $truncate->( $linei, 72 ); $lineo = $truncate->( $lineo, 72 ); $last_nonblank_line = $truncate->( $last_nonblank_line, 72 ); if ($last_nonblank_line) { $msg .= <$counto:$lineo $line_diff EOM return $msg; } ## end while # no line differences found, but one file may have fewer lines if ( $counti > $counto ) { $msg .= < '.*' $x =~ s#\?#.#g; # '?' -> '.' return "^$x\\z"; # match whole word } ## end sub fileglob_to_re sub make_logfile_header { my ( $rOpts, $config_file, $rraw_options, $Windows_type, $readable_options ) = @_; # Note: the punctuation variable '$]' is not in older versions of # English.pm so leave it as is to avoid failing installation tests. 
my $msg = "perltidy version $VERSION log file on a $OSNAME system, OLD_PERL_VERSION=$]\n"; if ($Windows_type) { $msg .= "Windows type is $Windows_type\n"; } my $options_string = join( SPACE, @{$rraw_options} ); if ($config_file) { $msg .= "Found Configuration File >>> $config_file \n"; } $msg .= "Configuration and command line parameters for this run:\n"; $msg .= "$options_string\n"; if ( $rOpts->{'DEBUG'} || $rOpts->{'show-options'} ) { $rOpts->{'logfile'} = 1; # force logfile to be saved $msg .= "Final parameter set for this run\n"; $msg .= "------------------------------------\n"; $msg .= $readable_options; $msg .= "------------------------------------\n"; } $msg .= "To find error messages search for 'WARNING' with your editor\n"; return $msg; } ## end sub make_logfile_header sub generate_options { ###################################################################### # Generate and return references to: # @option_string - the list of options to be passed to Getopt::Long # @defaults - the list of default options # %expansion - a hash showing how all abbreviations are expanded # %category - a hash giving the general category of each option # %option_range - a hash giving the valid ranges of certain options # Note: a few options are not documented in the man page and usage # message. This is because these are experimental or debug options and # may or may not be retained in future versions. # # Here are the undocumented flags as far as I know. Any of them # may disappear at any time. They are mainly for fine-tuning # and debugging. # # fll --> fuzzy-line-length # a trivial parameter which gets # turned off for the extrude option # which is mainly for debugging # scl --> short-concatenation-item-length # helps break at '.' 
# recombine # for debugging line breaks # I --> DIAGNOSTICS # for debugging [**DEACTIVATED**] ###################################################################### # here is a summary of the Getopt codes: # <none> does not take an argument # =s takes a mandatory string # :s takes an optional string (DO NOT USE - filenames will get eaten up) # =i takes a mandatory integer # :i takes an optional integer (NOT RECOMMENDED - can cause trouble) # ! does not take an argument and may be negated # i.e., -foo and -nofoo are allowed # a double dash signals the end of the options list # #----------------------------------------------- # Define the option string passed to GetOptions. #----------------------------------------------- my @option_string = (); my %expansion = (); my %option_category = (); my %option_range = (); my $rexpansion = \%expansion; # names of categories in manual # leading integers will allow sorting my @category_name = ( '0. I/O control', '1. Basic formatting options', '2. Code indentation control', '3. Whitespace control', '4. Comment controls', '5. Linebreak controls', '6. Controlling list formatting', '7. Retaining or ignoring existing line breaks', '8. Blank line control', '9. Other controls', '10. HTML options', '11. pod2html options', '12. Controlling HTML properties', '13. Debugging', ); # These options are parsed directly by perltidy: # help h # version v # However, they are included in the option set so that they will # be seen in the options dump. # These long option names have no abbreviations or are treated specially @option_string = qw( html! noprofile no-profile npro recombine! 
notidy ); my $category = 13; # Debugging foreach (@option_string) { my $opt = $_; # must avoid changing the actual flag $opt =~ s/!$//; $option_category{$opt} = $category_name[$category]; } $category = 11; # HTML $option_category{html} = $category_name[$category]; # routine to install and check options my $add_option = sub { my ( $long_name, $short_name, $flag ) = @_; push @option_string, $long_name . $flag; $option_category{$long_name} = $category_name[$category]; if ($short_name) { if ( $expansion{$short_name} ) { my $existing_name = $expansion{$short_name}[0]; Die( "redefining abbreviation $short_name for $long_name; already used for $existing_name\n" ); } $expansion{$short_name} = [$long_name]; if ( $flag eq '!' ) { my $nshort_name = 'n' . $short_name; my $nolong_name = 'no' . $long_name; if ( $expansion{$nshort_name} ) { my $existing_name = $expansion{$nshort_name}[0]; Die( "attempting to redefine abbreviation $nshort_name for $nolong_name; already used for $existing_name\n" ); } $expansion{$nshort_name} = [$nolong_name]; } } return; }; # Install long option names which have a simple abbreviation. # Options with code '!' get standard negation ('no' for long names, # 'n' for abbreviations). Categories follow the manual. ########################### $category = 0; # I/O_Control ########################### $add_option->( 'backup-and-modify-in-place', 'b', '!' ); $add_option->( 'backup-file-extension', 'bext', '=s' ); $add_option->( 'backup-method', 'bm', '=s' ); $add_option->( 'character-encoding', 'enc', '=s' ); $add_option->( 'force-read-binary', 'f', '!' ); $add_option->( 'format', 'fmt', '=s' ); $add_option->( 'iterations', 'it', '=i' ); $add_option->( 'logfile', 'log', '!' ); $add_option->( 'logfile-gap', 'g', ':i' ); $add_option->( 'outfile', 'o', '=s' ); $add_option->( 'output-file-extension', 'oext', '=s' ); $add_option->( 'output-path', 'opath', '=s' ); $add_option->( 'profile', 'pro', '=s' ); $add_option->( 'quiet', 'q', '!' 
); $add_option->( 'standard-error-output', 'se', '!' ); $add_option->( 'standard-output', 'st', '!' ); $add_option->( 'use-unicode-gcstring', 'gcs', '!' ); $add_option->( 'warning-output', 'w', '!' ); $add_option->( 'add-terminal-newline', 'atnl', '!' ); # options which are both toggle switches and values moved here # to hide from tidyview (which does not show category 0 flags): # -ole moved here from category 1 # -sil moved here from category 2 $add_option->( 'output-line-ending', 'ole', '=s' ); $add_option->( 'starting-indentation-level', 'sil', '=i' ); ######################################## $category = 1; # Basic formatting options ######################################## $add_option->( 'check-syntax', 'syn', '!' ); $add_option->( 'entab-leading-whitespace', 'et', '=i' ); $add_option->( 'indent-columns', 'i', '=i' ); $add_option->( 'maximum-line-length', 'l', '=i' ); $add_option->( 'variable-maximum-line-length', 'vmll', '!' ); $add_option->( 'whitespace-cycle', 'wc', '=i' ); $add_option->( 'perl-syntax-check-flags', 'pscf', '=s' ); $add_option->( 'preserve-line-endings', 'ple', '!' ); $add_option->( 'tabs', 't', '!' ); $add_option->( 'default-tabsize', 'dt', '=i' ); $add_option->( 'extended-syntax', 'xs', '!' ); $add_option->( 'assert-tidy', 'ast', '!' ); $add_option->( 'assert-untidy', 'asu', '!' ); $add_option->( 'encode-output-strings', 'eos', '!' ); $add_option->( 'sub-alias-list', 'sal', '=s' ); $add_option->( 'grep-alias-list', 'gal', '=s' ); $add_option->( 'grep-alias-exclusion-list', 'gaxl', '=s' ); $add_option->( 'use-feature', 'uf', '=s' ); ######################################## $category = 2; # Code indentation control ######################################## $add_option->( 'continuation-indentation', 'ci', '=i' ); $add_option->( 'extended-continuation-indentation', 'xci', '!' ); $add_option->( 'line-up-parentheses', 'lp', '!' ); $add_option->( 'extended-line-up-parentheses', 'xlp', '!' 
); $add_option->( 'line-up-parentheses-exclusion-list', 'lpxl', '=s' ); $add_option->( 'line-up-parentheses-inclusion-list', 'lpil', '=s' ); $add_option->( 'outdent-keyword-list', 'okwl', '=s' ); $add_option->( 'outdent-keywords', 'okw', '!' ); $add_option->( 'outdent-labels', 'ola', '!' ); $add_option->( 'outdent-long-quotes', 'olq', '!' ); $add_option->( 'indent-closing-brace', 'icb', '!' ); $add_option->( 'closing-token-indentation', 'cti', '=i' ); $add_option->( 'closing-paren-indentation', 'cpi', '=i' ); $add_option->( 'closing-brace-indentation', 'cbi', '=i' ); $add_option->( 'closing-square-bracket-indentation', 'csbi', '=i' ); $add_option->( 'brace-left-and-indent', 'bli', '!' ); $add_option->( 'brace-left-and-indent-list', 'blil', '=s' ); $add_option->( 'brace-left-and-indent-exclusion-list', 'blixl', '=s' ); ######################################## $category = 3; # Whitespace control ######################################## $add_option->( 'add-trailing-commas', 'atc', '!' ); $add_option->( 'add-semicolons', 'asc', '!' ); $add_option->( 'add-whitespace', 'aws', '!' ); $add_option->( 'block-brace-tightness', 'bbt', '=i' ); $add_option->( 'brace-tightness', 'bt', '=i' ); $add_option->( 'delete-old-whitespace', 'dws', '!' ); $add_option->( 'delete-repeated-commas', 'drc', '!' ); $add_option->( 'delete-trailing-commas', 'dtc', '!' ); $add_option->( 'delete-weld-interfering-commas', 'dwic', '!' ); $add_option->( 'delete-semicolons', 'dsm', '!' ); $add_option->( 'function-paren-vertical-alignment', 'fpva', '!' ); $add_option->( 'keyword-paren-inner-tightness', 'kpit', '=i' ); $add_option->( 'keyword-paren-inner-tightness-list', 'kpitl', '=s' ); $add_option->( 'logical-padding', 'lop', '!' 
); $add_option->( 'nospace-after-keyword', 'nsak', '=s' ); $add_option->( 'nowant-left-space', 'nwls', '=s' ); $add_option->( 'nowant-right-space', 'nwrs', '=s' ); $add_option->( 'paren-tightness', 'pt', '=i' ); $add_option->( 'space-after-keyword', 'sak', '=s' ); $add_option->( 'space-for-semicolon', 'sfs', '!' ); $add_option->( 'space-function-paren', 'sfp', '!' ); $add_option->( 'space-keyword-paren', 'skp', '!' ); $add_option->( 'space-terminal-semicolon', 'sts', '!' ); $add_option->( 'square-bracket-tightness', 'sbt', '=i' ); $add_option->( 'square-bracket-vertical-tightness', 'sbvt', '=i' ); $add_option->( 'square-bracket-vertical-tightness-closing', 'sbvtc', '=i' ); $add_option->( 'tight-secret-operators', 'tso', '!' ); $add_option->( 'trim-qw', 'tqw', '!' ); $add_option->( 'trim-pod', 'trp', '!' ); $add_option->( 'want-left-space', 'wls', '=s' ); $add_option->( 'want-right-space', 'wrs', '=s' ); $add_option->( 'want-trailing-commas', 'wtc', '=s' ); $add_option->( 'space-prototype-paren', 'spp', '=i' ); $add_option->( 'valign-code', 'vc', '!' ); $add_option->( 'valign-block-comments', 'vbc', '!' ); $add_option->( 'valign-side-comments', 'vsc', '!' ); $add_option->( 'valign-exclusion-list', 'vxl', '=s' ); $add_option->( 'valign-inclusion-list', 'vil', '=s' ); ######################################## $category = 4; # Comment controls ######################################## $add_option->( 'closing-side-comment-else-flag', 'csce', '=i' ); $add_option->( 'closing-side-comment-interval', 'csci', '=i' ); $add_option->( 'closing-side-comment-list', 'cscl', '=s' ); $add_option->( 'closing-side-comment-maximum-text', 'csct', '=i' ); $add_option->( 'closing-side-comment-prefix', 'cscp', '=s' ); $add_option->( 'closing-side-comment-warnings', 'cscw', '!' ); $add_option->( 'closing-side-comments', 'csc', '!' ); $add_option->( 'closing-side-comments-balanced', 'cscb', '!' ); $add_option->( 'code-skipping', 'cs', '!' 
); $add_option->( 'code-skipping-begin', 'csb', '=s' ); $add_option->( 'code-skipping-end', 'cse', '=s' ); $add_option->( 'format-skipping', 'fs', '!' ); $add_option->( 'format-skipping-begin', 'fsb', '=s' ); $add_option->( 'format-skipping-end', 'fse', '=s' ); $add_option->( 'hanging-side-comments', 'hsc', '!' ); $add_option->( 'indent-block-comments', 'ibc', '!' ); $add_option->( 'indent-spaced-block-comments', 'isbc', '!' ); $add_option->( 'fixed-position-side-comment', 'fpsc', '=i' ); $add_option->( 'minimum-space-to-comment', 'msc', '=i' ); $add_option->( 'non-indenting-braces', 'nib', '!' ); $add_option->( 'non-indenting-brace-prefix', 'nibp', '=s' ); $add_option->( 'outdent-long-comments', 'olc', '!' ); $add_option->( 'outdent-static-block-comments', 'osbc', '!' ); $add_option->( 'static-block-comment-prefix', 'sbcp', '=s' ); $add_option->( 'static-block-comments', 'sbc', '!' ); $add_option->( 'static-side-comment-prefix', 'sscp', '=s' ); $add_option->( 'static-side-comments', 'ssc', '!' ); $add_option->( 'ignore-side-comment-lengths', 'iscl', '!' ); ######################################## $category = 5; # Linebreak controls ######################################## $add_option->( 'add-newlines', 'anl', '!' ); $add_option->( 'block-brace-vertical-tightness', 'bbvt', '=i' ); $add_option->( 'block-brace-vertical-tightness-list', 'bbvtl', '=s' ); $add_option->( 'brace-follower-vertical-tightness', 'bfvt', '=i' ); $add_option->( 'brace-vertical-tightness', 'bvt', '=i' ); $add_option->( 'brace-vertical-tightness-closing', 'bvtc', '=i' ); $add_option->( 'cuddled-else', 'ce', '!' ); $add_option->( 'cuddled-block-list', 'cbl', '=s' ); $add_option->( 'cuddled-block-list-exclusive', 'cblx', '!' ); $add_option->( 'cuddled-break-option', 'cbo', '=i' ); $add_option->( 'cuddled-paren-brace', 'cpb', '!' ); $add_option->( 'delete-old-newlines', 'dnl', '!' ); $add_option->( 'opening-brace-always-on-right', 'bar', '!' ); $add_option->( 'opening-brace-on-new-line', 'bl', '!' 
); $add_option->( 'opening-hash-brace-right', 'ohbr', '!' ); $add_option->( 'opening-paren-right', 'opr', '!' ); $add_option->( 'opening-square-bracket-right', 'osbr', '!' ); $add_option->( 'opening-anonymous-sub-brace-on-new-line', 'asbl', '!' ); $add_option->( 'opening-sub-brace-on-new-line', 'sbl', '!' ); $add_option->( 'paren-vertical-tightness', 'pvt', '=i' ); $add_option->( 'paren-vertical-tightness-closing', 'pvtc', '=i' ); $add_option->( 'weld-nested-containers', 'wn', '!' ); $add_option->( 'weld-nested-exclusion-list', 'wnxl', '=s' ); $add_option->( 'weld-fat-comma', 'wfc', '!' ); $add_option->( 'space-backslash-quote', 'sbq', '=i' ); $add_option->( 'stack-closing-block-brace', 'scbb', '!' ); $add_option->( 'stack-closing-hash-brace', 'schb', '!' ); $add_option->( 'stack-closing-paren', 'scp', '!' ); $add_option->( 'stack-closing-square-bracket', 'scsb', '!' ); $add_option->( 'stack-opening-hash-brace', 'sohb', '!' ); $add_option->( 'stack-opening-paren', 'sop', '!' ); $add_option->( 'stack-opening-square-bracket', 'sosb', '!' ); $add_option->( 'vertical-tightness', 'vt', '=i' ); $add_option->( 'vertical-tightness-closing', 'vtc', '=i' ); $add_option->( 'want-break-after', 'wba', '=s' ); $add_option->( 'want-break-before', 'wbb', '=s' ); $add_option->( 'break-after-all-operators', 'baao', '!' ); $add_option->( 'break-before-all-operators', 'bbao', '!' ); $add_option->( 'keep-interior-semicolons', 'kis', '!' 
); $add_option->( 'one-line-block-semicolons', 'olbs', '=i' ); $add_option->( 'one-line-block-nesting', 'olbn', '=i' ); $add_option->( 'one-line-block-exclusion-list', 'olbxl', '=s' ); $add_option->( 'break-before-hash-brace', 'bbhb', '=i' ); $add_option->( 'break-before-hash-brace-and-indent', 'bbhbi', '=i' ); $add_option->( 'break-before-square-bracket', 'bbsb', '=i' ); $add_option->( 'break-before-square-bracket-and-indent', 'bbsbi', '=i' ); $add_option->( 'break-before-paren', 'bbp', '=i' ); $add_option->( 'break-before-paren-and-indent', 'bbpi', '=i' ); $add_option->( 'brace-left-list', 'bll', '=s' ); $add_option->( 'brace-left-exclusion-list', 'blxl', '=s' ); $add_option->( 'break-after-labels', 'bal', '=i' ); # This was an experiment mentioned in git #78, originally named -bopl. I # expanded it to also open logical blocks, based on git discussion #100, # and renamed it -bocp. It works, but will remain commented out due to # apparent lack of interest. # $add_option->( 'break-open-compact-parens', 'bocp', '=s' ); ######################################## $category = 6; # Controlling list formatting ######################################## $add_option->( 'break-at-old-comma-breakpoints', 'boc', '!' ); $add_option->( 'comma-arrow-breakpoints', 'cab', '=i' ); $add_option->( 'maximum-fields-per-table', 'mft', '=i' ); ######################################## $category = 7; # Retaining or ignoring existing line breaks ######################################## $add_option->( 'break-at-old-keyword-breakpoints', 'bok', '!' ); $add_option->( 'break-at-old-logical-breakpoints', 'bol', '!' ); $add_option->( 'break-at-old-method-breakpoints', 'bom', '!' ); $add_option->( 'break-at-old-semicolon-breakpoints', 'bos', '!' ); $add_option->( 'break-at-old-ternary-breakpoints', 'bot', '!' ); $add_option->( 'break-at-old-attribute-breakpoints', 'boa', '!' 
); $add_option->( 'keep-old-breakpoints-before', 'kbb', '=s' ); $add_option->( 'keep-old-breakpoints-after', 'kba', '=s' ); $add_option->( 'ignore-old-breakpoints', 'iob', '!' ); ######################################## $category = 8; # Blank line control ######################################## $add_option->( 'blanks-before-blocks', 'bbb', '!' ); $add_option->( 'blanks-before-comments', 'bbc', '!' ); $add_option->( 'blank-lines-before-subs', 'blbs', '=i' ); $add_option->( 'blank-lines-before-packages', 'blbp', '=i' ); $add_option->( 'long-block-line-count', 'lbl', '=i' ); $add_option->( 'maximum-consecutive-blank-lines', 'mbl', '=i' ); $add_option->( 'keep-old-blank-lines', 'kbl', '=i' ); $add_option->( 'keyword-group-blanks-list', 'kgbl', '=s' ); $add_option->( 'keyword-group-blanks-size', 'kgbs', '=s' ); $add_option->( 'keyword-group-blanks-repeat-count', 'kgbr', '=i' ); $add_option->( 'keyword-group-blanks-before', 'kgbb', '=i' ); $add_option->( 'keyword-group-blanks-after', 'kgba', '=i' ); $add_option->( 'keyword-group-blanks-inside', 'kgbi', '!' ); $add_option->( 'keyword-group-blanks-delete', 'kgbd', '!' ); $add_option->( 'blank-lines-after-opening-block', 'blao', '=i' ); $add_option->( 'blank-lines-before-closing-block', 'blbc', '=i' ); $add_option->( 'blank-lines-after-opening-block-list', 'blaol', '=s' ); $add_option->( 'blank-lines-before-closing-block-list', 'blbcl', '=s' ); ######################################## $category = 9; # Other controls ######################################## $add_option->( 'delete-block-comments', 'dbc', '!' ); $add_option->( 'delete-closing-side-comments', 'dcsc', '!' ); $add_option->( 'delete-pod', 'dp', '!' ); $add_option->( 'delete-side-comments', 'dsc', '!' ); $add_option->( 'tee-block-comments', 'tbc', '!' ); $add_option->( 'tee-pod', 'tp', '!' ); $add_option->( 'tee-side-comments', 'tsc', '!' ); $add_option->( 'look-for-autoloader', 'lal', '!' ); $add_option->( 'look-for-hash-bang', 'x', '!' 
); $add_option->( 'look-for-selfloader', 'lsl', '!' ); $add_option->( 'pass-version-line', 'pvl', '!' ); ######################################## $category = 13; # Debugging ######################################## $add_option->( 'DIAGNOSTICS', 'I', '!' ) if (DEVEL_MODE); $add_option->( 'DEBUG', 'D', '!' ); $add_option->( 'dump-block-summary', 'dbs', '!' ); $add_option->( 'dump-block-minimum-lines', 'dbl', '=i' ); $add_option->( 'dump-block-types', 'dbt', '=s' ); $add_option->( 'dump-cuddled-block-list', 'dcbl', '!' ); $add_option->( 'dump-defaults', 'ddf', '!' ); $add_option->( 'dump-long-names', 'dln', '!' ); $add_option->( 'dump-options', 'dop', '!' ); $add_option->( 'dump-profile', 'dpro', '!' ); $add_option->( 'dump-short-names', 'dsn', '!' ); $add_option->( 'dump-token-types', 'dtt', '!' ); $add_option->( 'dump-want-left-space', 'dwls', '!' ); $add_option->( 'dump-want-right-space', 'dwrs', '!' ); $add_option->( 'fuzzy-line-length', 'fll', '!' ); $add_option->( 'help', 'h', EMPTY_STRING ); $add_option->( 'short-concatenation-item-length', 'scl', '=i' ); $add_option->( 'show-options', 'opt', '!' ); $add_option->( 'timestamp', 'ts', '!' ); $add_option->( 'version', 'v', EMPTY_STRING ); $add_option->( 'memoize', 'mem', '!' ); $add_option->( 'file-size-order', 'fso', '!' 
    );
    $add_option->( 'maximum-file-size-mb', 'maxfs', '=i' );
    $add_option->( 'maximum-level-errors', 'maxle', '=i' );
    $add_option->( 'maximum-unexpected-errors', 'maxue', '=i' );

    #---------------------------------------------------------------------

    # The Perl::Tidy::HtmlWriter will add its own options to the string
    Perl::Tidy::HtmlWriter->make_getopt_long_names( \@option_string );

    ########################################
    # Set categories 10, 11, 12
    ########################################
    # Based on their known order
    $category = 12;    # HTML properties
    foreach my $opt (@option_string) {
        my $long_name = $opt;
        $long_name =~ s/(!|=.*|:.*)$//;
        unless ( defined( $option_category{$long_name} ) ) {
            if ( $long_name =~ /^html-linked/ ) {
                $category = 10;    # HTML options
            }
            elsif ( $long_name =~ /^pod2html/ ) {
                $category = 11;    # Pod2html
            }
            $option_category{$long_name} = $category_name[$category];
        }
    }

    #---------------------------------------
    # Assign valid ranges to certain options
    #---------------------------------------
    # In the future, these may be used to make preliminary checks
    # hash keys are long names
    # If key or value is undefined:
    #   strings may have any value
    #   integer ranges are >=0
    # If value is defined:
    #   value is [qw(any valid words)] for strings
    #   value is [min, max] for integers
    #   if min is undefined, there is no lower limit
    #   if max is undefined, there is no upper limit
    # Parameters not listed here have defaults
    %option_range = (
        'format' => [ 'tidy', 'html', 'user' ],
        'output-line-ending' => [ 'dos', 'win', 'mac', 'unix' ],
        'space-backslash-quote' => [ 0, 2 ],
        'block-brace-tightness' => [ 0, 2 ],
        'keyword-paren-inner-tightness' => [ 0, 2 ],
        'brace-tightness' => [ 0, 2 ],
        'paren-tightness' => [ 0, 2 ],
        'square-bracket-tightness' => [ 0, 2 ],
        'block-brace-vertical-tightness' => [ 0, 2 ],
        'brace-follower-vertical-tightness' => [ 0, 2 ],
        'brace-vertical-tightness' => [ 0, 2 ],
        'brace-vertical-tightness-closing' => [ 0, 2 ],
        'paren-vertical-tightness' => [ 0, 2 ],
        'paren-vertical-tightness-closing' => [ 0, 2 ],
        'square-bracket-vertical-tightness' => [ 0, 2 ],
        'square-bracket-vertical-tightness-closing' => [ 0, 2 ],
        'vertical-tightness' => [ 0, 2 ],
        'vertical-tightness-closing' => [ 0, 2 ],
        'closing-brace-indentation' => [ 0, 3 ],
        'closing-paren-indentation' => [ 0, 3 ],
        'closing-square-bracket-indentation' => [ 0, 3 ],
        'closing-token-indentation' => [ 0, 3 ],
        'closing-side-comment-else-flag' => [ 0, 2 ],
        'comma-arrow-breakpoints' => [ 0, 5 ],
        'keyword-group-blanks-before' => [ 0, 2 ],
        'keyword-group-blanks-after' => [ 0, 2 ],
        'space-prototype-paren' => [ 0, 2 ],
        'break-after-labels' => [ 0, 2 ],
    );

    # Note: we could actually allow negative ci if someone really wants it:
    # $option_range{'continuation-indentation'} = [ undef, undef ];

    #------------------------------------------------------------------
    # DEFAULTS: Assign default values to the above options here, except
    # for 'outfile' and 'help'.
    # These settings should approximate the perlstyle(1) suggestions.
#------------------------------------------------------------------ my @defaults = qw( add-newlines add-terminal-newline add-semicolons add-whitespace blanks-before-blocks blanks-before-comments blank-lines-before-subs=1 blank-lines-before-packages=1 keyword-group-blanks-size=5 keyword-group-blanks-repeat-count=0 keyword-group-blanks-before=1 keyword-group-blanks-after=1 nokeyword-group-blanks-inside nokeyword-group-blanks-delete block-brace-tightness=0 block-brace-vertical-tightness=0 brace-follower-vertical-tightness=1 brace-tightness=1 brace-vertical-tightness-closing=0 brace-vertical-tightness=0 break-after-labels=0 break-at-old-logical-breakpoints break-at-old-ternary-breakpoints break-at-old-attribute-breakpoints break-at-old-keyword-breakpoints break-before-hash-brace=0 break-before-hash-brace-and-indent=0 break-before-square-bracket=0 break-before-square-bracket-and-indent=0 break-before-paren=0 break-before-paren-and-indent=0 comma-arrow-breakpoints=5 nocheck-syntax character-encoding=guess closing-side-comment-interval=6 closing-side-comment-maximum-text=20 closing-side-comment-else-flag=0 closing-side-comments-balanced closing-paren-indentation=0 closing-brace-indentation=0 closing-square-bracket-indentation=0 continuation-indentation=2 noextended-continuation-indentation cuddled-break-option=1 delete-old-newlines delete-semicolons dump-block-minimum-lines=20 dump-block-types=sub extended-syntax encode-output-strings function-paren-vertical-alignment fuzzy-line-length hanging-side-comments indent-block-comments indent-columns=4 iterations=1 keep-old-blank-lines=1 keyword-paren-inner-tightness=1 logical-padding long-block-line-count=8 look-for-autoloader look-for-selfloader maximum-consecutive-blank-lines=1 maximum-fields-per-table=0 maximum-line-length=80 maximum-file-size-mb=10 maximum-level-errors=1 maximum-unexpected-errors=0 memoize minimum-space-to-comment=4 nobrace-left-and-indent nocuddled-else nodelete-old-whitespace nohtml nologfile 
non-indenting-braces noquiet noshow-options nostatic-side-comments notabs nowarning-output one-line-block-semicolons=1 one-line-block-nesting=0 outdent-labels outdent-long-quotes outdent-long-comments paren-tightness=1 paren-vertical-tightness-closing=0 paren-vertical-tightness=0 pass-version-line noweld-nested-containers recombine nouse-unicode-gcstring use-feature=class valign-code valign-block-comments valign-side-comments short-concatenation-item-length=8 space-for-semicolon space-backslash-quote=1 space-prototype-paren=1 square-bracket-tightness=1 square-bracket-vertical-tightness-closing=0 square-bracket-vertical-tightness=0 static-block-comments timestamp trim-qw format=tidy backup-method=copy backup-file-extension=bak code-skipping format-skipping default-tabsize=8 pod2html html-table-of-contents html-entities ); #----------------------------------------------------------------------- # Define abbreviations which will be expanded into the above primitives. # These may be defined recursively. 
#----------------------------------------------------------------------- %expansion = ( %expansion, 'freeze-newlines' => [qw(noadd-newlines nodelete-old-newlines)], 'fnl' => [qw(freeze-newlines)], 'freeze-whitespace' => [qw(noadd-whitespace nodelete-old-whitespace)], 'fws' => [qw(freeze-whitespace)], 'freeze-blank-lines' => [qw(maximum-consecutive-blank-lines=0 keep-old-blank-lines=2)], 'fbl' => [qw(freeze-blank-lines)], 'indent-only' => [qw(freeze-newlines freeze-whitespace)], 'outdent-long-lines' => [qw(outdent-long-quotes outdent-long-comments)], 'nooutdent-long-lines' => [qw(nooutdent-long-quotes nooutdent-long-comments)], 'oll' => [qw(outdent-long-lines)], 'noll' => [qw(nooutdent-long-lines)], 'io' => [qw(indent-only)], 'delete-all-comments' => [qw(delete-block-comments delete-side-comments delete-pod)], 'nodelete-all-comments' => [qw(nodelete-block-comments nodelete-side-comments nodelete-pod)], 'dac' => [qw(delete-all-comments)], 'ndac' => [qw(nodelete-all-comments)], 'gnu' => [qw(gnu-style)], 'pbp' => [qw(perl-best-practices)], 'tee-all-comments' => [qw(tee-block-comments tee-side-comments tee-pod)], 'notee-all-comments' => [qw(notee-block-comments notee-side-comments notee-pod)], 'tac' => [qw(tee-all-comments)], 'ntac' => [qw(notee-all-comments)], 'html' => [qw(format=html)], 'nhtml' => [qw(format=tidy)], 'tidy' => [qw(format=tidy)], 'brace-left' => [qw(opening-brace-on-new-line)], # -cb is now a synonym for -ce 'cb' => [qw(cuddled-else)], 'cuddled-blocks' => [qw(cuddled-else)], 'utf8' => [qw(character-encoding=utf8)], 'UTF8' => [qw(character-encoding=utf8)], 'guess' => [qw(character-encoding=guess)], 'swallow-optional-blank-lines' => [qw(kbl=0)], 'noswallow-optional-blank-lines' => [qw(kbl=1)], 'sob' => [qw(kbl=0)], 'nsob' => [qw(kbl=1)], 'break-after-comma-arrows' => [qw(cab=0)], 'nobreak-after-comma-arrows' => [qw(cab=1)], 'baa' => [qw(cab=0)], 'nbaa' => [qw(cab=1)], 'blanks-before-subs' => [qw(blbs=1 blbp=1)], 'bbs' => [qw(blbs=1 blbp=1)], 
'noblanks-before-subs' => [qw(blbs=0 blbp=0)], 'nbbs' => [qw(blbs=0 blbp=0)], 'keyword-group-blanks' => [qw(kgbb=2 kgbi kgba=2)], 'kgb' => [qw(kgbb=2 kgbi kgba=2)], 'nokeyword-group-blanks' => [qw(kgbb=1 nkgbi kgba=1)], 'nkgb' => [qw(kgbb=1 nkgbi kgba=1)], 'break-at-old-trinary-breakpoints' => [qw(bot)], 'cti=0' => [qw(cpi=0 cbi=0 csbi=0)], 'cti=1' => [qw(cpi=1 cbi=1 csbi=1)], 'cti=2' => [qw(cpi=2 cbi=2 csbi=2)], 'icp' => [qw(cpi=2 cbi=2 csbi=2)], 'nicp' => [qw(cpi=0 cbi=0 csbi=0)], 'closing-token-indentation=0' => [qw(cpi=0 cbi=0 csbi=0)], 'closing-token-indentation=1' => [qw(cpi=1 cbi=1 csbi=1)], 'closing-token-indentation=2' => [qw(cpi=2 cbi=2 csbi=2)], 'indent-closing-paren' => [qw(cpi=2 cbi=2 csbi=2)], 'noindent-closing-paren' => [qw(cpi=0 cbi=0 csbi=0)], 'vt=0' => [qw(pvt=0 bvt=0 sbvt=0)], 'vt=1' => [qw(pvt=1 bvt=1 sbvt=1)], 'vt=2' => [qw(pvt=2 bvt=2 sbvt=2)], 'vertical-tightness=0' => [qw(pvt=0 bvt=0 sbvt=0)], 'vertical-tightness=1' => [qw(pvt=1 bvt=1 sbvt=1)], 'vertical-tightness=2' => [qw(pvt=2 bvt=2 sbvt=2)], 'vtc=0' => [qw(pvtc=0 bvtc=0 sbvtc=0)], 'vtc=1' => [qw(pvtc=1 bvtc=1 sbvtc=1)], 'vtc=2' => [qw(pvtc=2 bvtc=2 sbvtc=2)], 'vertical-tightness-closing=0' => [qw(pvtc=0 bvtc=0 sbvtc=0)], 'vertical-tightness-closing=1' => [qw(pvtc=1 bvtc=1 sbvtc=1)], 'vertical-tightness-closing=2' => [qw(pvtc=2 bvtc=2 sbvtc=2)], 'otr' => [qw(opr ohbr osbr)], 'opening-token-right' => [qw(opr ohbr osbr)], 'notr' => [qw(nopr nohbr nosbr)], 'noopening-token-right' => [qw(nopr nohbr nosbr)], 'sot' => [qw(sop sohb sosb)], 'nsot' => [qw(nsop nsohb nsosb)], 'stack-opening-tokens' => [qw(sop sohb sosb)], 'nostack-opening-tokens' => [qw(nsop nsohb nsosb)], 'sct' => [qw(scp schb scsb)], 'stack-closing-tokens' => [qw(scp schb scsb)], 'nsct' => [qw(nscp nschb nscsb)], 'nostack-closing-tokens' => [qw(nscp nschb nscsb)], 'sac' => [qw(sot sct)], 'nsac' => [qw(nsot nsct)], 'stack-all-containers' => [qw(sot sct)], 'nostack-all-containers' => [qw(nsot nsct)], 'act=0' => [qw(pt=0 sbt=0 bt=0 
bbt=0)], 'act=1' => [qw(pt=1 sbt=1 bt=1 bbt=1)], 'act=2' => [qw(pt=2 sbt=2 bt=2 bbt=2)], 'all-containers-tightness=0' => [qw(pt=0 sbt=0 bt=0 bbt=0)], 'all-containers-tightness=1' => [qw(pt=1 sbt=1 bt=1 bbt=1)], 'all-containers-tightness=2' => [qw(pt=2 sbt=2 bt=2 bbt=2)], 'stack-opening-block-brace' => [qw(bbvt=2 bbvtl=*)], 'sobb' => [qw(bbvt=2 bbvtl=*)], 'nostack-opening-block-brace' => [qw(bbvt=0)], 'nsobb' => [qw(bbvt=0)], 'converge' => [qw(it=4)], 'noconverge' => [qw(it=1)], 'conv' => [qw(it=4)], 'nconv' => [qw(it=1)], 'valign' => [qw(vc vsc vbc)], 'novalign' => [qw(nvc nvsc nvbc)], # NOTE: This is a possible future shortcut. But it will remain # deactivated until the -lpxl flag is no longer experimental. # 'line-up-function-parentheses' => [ qw(lp), q#lpxl=[ { F(2# ], # 'lfp' => [qw(line-up-function-parentheses)], # 'mangle' originally deleted pod and comments, but to keep it # reversible, it no longer does. But if you really want to # delete them, just use: # -mangle -dac # An interesting use for 'mangle' is to do this: # perltidy -mangle myfile.pl -st | perltidy -o myfile.pl.new # which will form as many one-line blocks as possible 'mangle' => [ qw( keep-old-blank-lines=0 delete-old-newlines delete-old-whitespace delete-semicolons indent-columns=0 maximum-consecutive-blank-lines=0 maximum-line-length=100000 noadd-newlines noadd-semicolons noadd-whitespace noblanks-before-blocks blank-lines-before-subs=0 blank-lines-before-packages=0 notabs ) ], # 'extrude' originally deleted pod and comments, but to keep it # reversible, it no longer does. But if you really want to # delete them, just use # extrude -dac # # An interesting use for 'extrude' is to do this: # perltidy -extrude myfile.pl -st | perltidy -o myfile.pl.new # which will break up all one-line blocks. 
        'extrude' => [
            qw(
              ci=0
              delete-old-newlines
              delete-old-whitespace
              delete-semicolons
              indent-columns=0
              maximum-consecutive-blank-lines=0
              maximum-line-length=1
              noadd-semicolons
              noadd-whitespace
              noblanks-before-blocks
              blank-lines-before-subs=0
              blank-lines-before-packages=0
              nofuzzy-line-length
              notabs
              norecombine
            )
        ],

        # this style tries to follow the GNU Coding Standards (which do
        # not really apply to perl but which are followed by some perl
        # programmers).
        'gnu-style' => [
            qw(
              lp bl noll pt=2 bt=2 sbt=2 cpi=1 csbi=1 cbi=1
            )
        ],

        # Style suggested in Damian Conway's Perl Best Practices
        'perl-best-practices' => [
            qw(l=78 i=4 ci=4 st se vt=2 cti=0 pt=1 bt=1 sbt=1 bbt=1 nsfs nolq),
q(wbb=% + - * / x != == >= <= =~ !~ < > | & = **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=)
        ],

        # Additional styles can be added here
    );

    Perl::Tidy::HtmlWriter->make_abbreviated_names( \%expansion );

    # Uncomment next line to dump all expansions for debugging:
    # dump_short_names(\%expansion);
    return (
        \@option_string, \@defaults, \%expansion,
        \%option_category, \%option_range
    );

} ## end sub generate_options

# Memoize process_command_line. Given same @ARGV passed in, return same
# values and same @ARGV back.
# This patch was supplied by Jonathan Swartz Nov 2012 and significantly speeds
# up masontidy (https://metacpan.org/module/masontidy)

my %process_command_line_cache;

sub process_command_line {

    my @q = @_;
    my (
        $perltidyrc_stream, $is_Windows, $Windows_type,
        $rpending_complaint, $dump_options_type
    ) = @q;

    my $use_cache = !defined($perltidyrc_stream) && !$dump_options_type;
    if ($use_cache) {
        my $cache_key = join( chr(28), @ARGV );
        if ( my $result = $process_command_line_cache{$cache_key} ) {
            my ( $argv, @retvals ) = @{$result};
            @ARGV = @{$argv};
            return @retvals;
        }
        else {
            my @retvals = _process_command_line(@q);
            $process_command_line_cache{$cache_key} = [ \@ARGV, @retvals ]
              if $retvals[0]->{'memoize'};
            return @retvals;
        }
    }
    else {
        return _process_command_line(@q);
    }
} ## end sub process_command_line

# (note the underscore here)
sub _process_command_line {

    my (
        $perltidyrc_stream, $is_Windows, $Windows_type,
        $rpending_complaint, $dump_options_type
    ) = @_;

    use Getopt::Long;

    # Save any current Getopt::Long configuration
    # and set to Getopt::Long defaults. Use eval to avoid
    # breaking old versions of Perl without these routines.
    # Previous configuration is reset at the exit of this routine.
    my $glc;
    if ( eval { $glc = Getopt::Long::Configure(); 1 } ) {
        my $ok = eval { Getopt::Long::ConfigDefaults(); 1 };
        if ( !$ok && DEVEL_MODE ) {
            Fault("Failed call to Getopt::Long::ConfigDefaults: $EVAL_ERROR\n");
        }
    }
    else { $glc = undef }

    my (
        $roption_string, $rdefaults, $rexpansion,
        $roption_category, $roption_range
    ) = generate_options();

    #--------------------------------------------------------------
    # set the defaults by passing the above list through GetOptions
    #--------------------------------------------------------------
    my %Opts = ();
    {
        local @ARGV = ();

        # do not load the defaults if we are just dumping perltidyrc
        unless ( $dump_options_type eq 'perltidyrc' ) {
            for my $i ( @{$rdefaults} ) {
                push @ARGV, "--" .
                  $i }
        }
        if ( !GetOptions( \%Opts, @{$roption_string} ) ) {
            Die(
"Programming Bug reported by 'GetOptions': error in setting default options"
            );
        }
    }

    my @raw_options = ();
    my $config_file = EMPTY_STRING;
    my $saw_ignore_profile = 0;
    my $saw_dump_profile = 0;

    #--------------------------------------------------------------
    # Take a first look at the command-line parameters. Do as many
    # immediate dumps as possible, which can avoid confusion if the
    # perltidyrc file has an error.
    #--------------------------------------------------------------
    foreach my $i (@ARGV) {

        $i =~ s/^--/-/;
        if ( $i =~ /^-(npro|noprofile|no-profile)$/ ) {
            $saw_ignore_profile = 1;
        }

        # note: this must come before -pro and -profile, below:
        elsif ( $i =~ /^-(dump-profile|dpro)$/ ) {
            $saw_dump_profile = 1;
        }
        elsif ( $i =~ /^-(pro|profile)=(.+)/ ) {
            if ($config_file) {
                Warn(
"Only one -pro=filename allowed, using '$2' instead of '$config_file'\n"
                );
            }
            $config_file = $2;

            # resolve /.../, meaning look upwards from directory
            if ( defined($config_file) ) {
                if ( my ( $start_dir, $search_file ) =
                    ( $config_file =~ m{^(.*)\.\.\./(.*)$} ) )
                {
                    $start_dir = '.'
                      if !$start_dir;
                    $start_dir = Cwd::realpath($start_dir);
                    if ( my $found_file =
                        find_file_upwards( $start_dir, $search_file ) )
                    {
                        $config_file = $found_file;
                    }
                }
            }
            unless ( -e $config_file ) {
                Warn("cannot find file given with -pro=$config_file: $ERRNO\n");
                $config_file = EMPTY_STRING;
            }
        }
        elsif ( $i =~ /^-(pro|profile)=?$/ ) {
            Die("usage: -pro=filename or --profile=filename, no spaces\n");
        }
        elsif ( $i =~ /^-(help|h|HELP|H|\?)$/ ) {
            usage();
            Exit(0);
        }
        elsif ( $i =~ /^-(version|v)$/ ) {
            show_version();
            Exit(0);
        }
        elsif ( $i =~ /^-(dump-defaults|ddf)$/ ) {
            dump_defaults( @{$rdefaults} );
            Exit(0);
        }
        elsif ( $i =~ /^-(dump-long-names|dln)$/ ) {
            dump_long_names( @{$roption_string} );
            Exit(0);
        }
        elsif ( $i =~ /^-(dump-short-names|dsn)$/ ) {
            dump_short_names($rexpansion);
            Exit(0);
        }
        elsif ( $i =~ /^-(dump-token-types|dtt)$/ ) {
            Perl::Tidy::Tokenizer->dump_token_types(*STDOUT);
            Exit(0);
        }
    }

    if ( $saw_dump_profile && $saw_ignore_profile ) {
        Warn("No profile to dump because of -npro\n");
        Exit(1);
    }

    #----------------------------------------
    # read any .perltidyrc configuration file
    #----------------------------------------
    unless ($saw_ignore_profile) {

        # resolve possible conflict between $perltidyrc_stream passed
        # as call parameter to perltidy and -pro=filename on command
        # line.
        if ($perltidyrc_stream) {
            if ($config_file) {
                Warn(<{'grep-alias-exclusion-list'};
    if ($exclude_string) {
        $exclude_string =~ s/,/ /g;    # allow commas
        $exclude_string =~ s/^\s+//;
        $exclude_string =~ s/\s+$//;
        my @q = split /\s+/, $exclude_string;
        @is_excluded_word{@q} = (1) x scalar(@q);
    }

    # The special option -gaxl='*' removes all defaults
    if ( $is_excluded_word{'*'} ) { $default_string = EMPTY_STRING }

    # combine the defaults and any input list
    my $input_string = $rOpts->{'grep-alias-list'};
    if ($input_string) { $input_string .= SPACE .
          $default_string }
    else { $input_string = $default_string }

    # Now make the final list of unique grep alias words
    $input_string =~ s/,/ /g;    # allow commas
    $input_string =~ s/^\s+//;
    $input_string =~ s/\s+$//;
    my @word_list = split /\s+/, $input_string;
    my @filtered_word_list;
    my %seen;

    foreach my $word (@word_list) {
        if ($word) {
            if ( $word !~ /^\w[\w\d]*$/ ) {
                Warn(
                    "unexpected word in --grep-alias-list: '$word' - ignoring\n"
                );
            }
            if ( !$seen{$word} && !$is_excluded_word{$word} ) {
                $seen{$word}++;
                push @filtered_word_list, $word;
            }
        }
    }
    my $joined_words = join SPACE, @filtered_word_list;
    $rOpts->{'grep-alias-list'} = $joined_words;

    return;
} ## end sub make_grep_alias_string

sub cleanup_word_list {
    my ( $rOpts, $option_name, $rforced_words ) = @_;

    # Clean up the list of words in a user option to simplify use by
    # later routines (delete repeats, replace commas with single space,
    # remove non-words)

    # Given:
    #   $rOpts - the global option hash
    #   $option_name - hash key of this option
    #   $rforced_words - ref to list of any words to be added

    # Returns:
    #   \%seen - hash of the final list of words

    my %seen;
    my @input_list;

    my $input_string = $rOpts->{$option_name};
    if ( defined($input_string) && length($input_string) ) {
        $input_string =~ s/,/ /g;    # allow commas
        $input_string =~ s/^\s+//;
        $input_string =~ s/\s+$//;
        @input_list = split /\s+/, $input_string;
    }

    if ($rforced_words) {
        push @input_list, @{$rforced_words};
    }

    my @filtered_word_list;
    foreach my $word (@input_list) {
        if ($word) {

            # look for obviously bad words
            if ( $word =~ /^\d/ || $word !~ /^\w[\w\d]*$/ ) {
                Warn("unexpected '$option_name' word '$word' - ignoring\n");
            }
            if ( !$seen{$word} ) {
                $seen{$word}++;
                push @filtered_word_list, $word;
            }
        }
    }
    $rOpts->{$option_name} = join SPACE, @filtered_word_list;
    return \%seen;
} ## end sub cleanup_word_list

sub check_options {

    my ( $self, $is_Windows, $Windows_type, $rpending_complaint ) = @_;

    my $rOpts = $self->[_rOpts_];
    #------------------------------------------------------------
    # check and handle any interactions among the basic options..
    #------------------------------------------------------------

    # Since perltidy only encodes in utf8, problems can occur if we let it
    # decode anything else. See discussions for issue git #83.
    my $encoding = $rOpts->{'character-encoding'};
    if ( $encoding !~ /^\s*(guess|none|utf8|utf-8)\s*$/i ) {
        Die(<{'vertical-tightness'} ) {
        my $vt = $rOpts->{'vertical-tightness'};
        $rOpts->{'paren-vertical-tightness'} = $vt;
        $rOpts->{'square-bracket-vertical-tightness'} = $vt;
        $rOpts->{'brace-vertical-tightness'} = $vt;
    }

    if ( defined $rOpts->{'vertical-tightness-closing'} ) {
        my $vtc = $rOpts->{'vertical-tightness-closing'};
        $rOpts->{'paren-vertical-tightness-closing'} = $vtc;
        $rOpts->{'square-bracket-vertical-tightness-closing'} = $vtc;
        $rOpts->{'brace-vertical-tightness-closing'} = $vtc;
    }

    if ( defined $rOpts->{'closing-token-indentation'} ) {
        my $cti = $rOpts->{'closing-token-indentation'};
        $rOpts->{'closing-square-bracket-indentation'} = $cti;
        $rOpts->{'closing-brace-indentation'} = $cti;
        $rOpts->{'closing-paren-indentation'} = $cti;
    }

    # Syntax checking is no longer supported due to concerns about executing
    # code in BEGIN blocks. The flag is still accepted for backwards
    # compatibility but is ignored if set.
    $rOpts->{'check-syntax'} = 0;

    my $check_blank_count = sub {
        my ( $key, $abbrev ) = @_;
        if ( $rOpts->{$key} ) {
            if ( $rOpts->{$key} < 0 ) {
                $rOpts->{$key} = 0;
                Warn("negative value of $abbrev, setting 0\n");
            }
            if ( $rOpts->{$key} > 100 ) {
                Warn("unreasonably large value of $abbrev, reducing\n");
                $rOpts->{$key} = 100;
            }
        }
        return;
    };

    # check for reasonable number of blank lines and fix to avoid problems
    # (the keys must match the long option names registered for -blao/-blbc)
    $check_blank_count->( 'blank-lines-before-subs', '-blbs' );
    $check_blank_count->( 'blank-lines-before-packages', '-blbp' );
    $check_blank_count->( 'blank-lines-after-opening-block', '-blao' );
    $check_blank_count->( 'blank-lines-before-closing-block', '-blbc' );

    # setting a non-negative logfile gap causes logfile to be saved
    if ( defined( $rOpts->{'logfile-gap'} ) && $rOpts->{'logfile-gap'} >= 0 ) {
        $rOpts->{'logfile'} = 1;
    }

    # set short-cut flag when only indentation is to be done.
    # Note that the user may or may not have already set the
    # indent-only flag.
    if (   !$rOpts->{'add-whitespace'}
        && !$rOpts->{'delete-old-whitespace'}
        && !$rOpts->{'add-newlines'}
        && !$rOpts->{'delete-old-newlines'} )
    {
        $rOpts->{'indent-only'} = 1;
    }

    # -isbc implies -ibc
    if ( $rOpts->{'indent-spaced-block-comments'} ) {
        $rOpts->{'indent-block-comments'} = 1;
    }

    # -bar cannot be used with -bl or -bli; arbitrarily keep -bar
    if ( $rOpts->{'opening-brace-always-on-right'} ) {

        if ( $rOpts->{'opening-brace-on-new-line'} ) {
            Warn(<{'opening-brace-on-new-line'} = 0;
        }
        if ( $rOpts->{'brace-left-and-indent'} ) {
            Warn(<{'brace-left-and-indent'} = 0;
        }
    }

    # it simplifies things if -bl is 0 rather than undefined
    if ( !defined( $rOpts->{'opening-brace-on-new-line'} ) ) {
        $rOpts->{'opening-brace-on-new-line'} = 0;
    }

    if ( $rOpts->{'entab-leading-whitespace'} ) {
        if ( $rOpts->{'entab-leading-whitespace'} < 0 ) {
            Warn("-et=n must use a positive integer; ignoring -et\n");
            $rOpts->{'entab-leading-whitespace'} = undef;
        }

        # entab leading whitespace has priority over the older 'tabs' option
        if ( $rOpts->{'tabs'} ) {

            # The following warning could be added but would annoy a lot of
            # users who have a perltidyrc with both -t and -et=n. So instead
            # there is a note in the manual that -et overrides -t.
            ##Warn("-tabs and -et=n conflict; ignoring -tabs\n");
            $rOpts->{'tabs'} = 0;
        }
    }

    # set a default tabsize to be used in guessing the starting indentation
    # level if and only if this run does not use tabs and the old code does
    # use tabs
    if ( $rOpts->{'default-tabsize'} ) {
        if ( $rOpts->{'default-tabsize'} < 0 ) {
            Warn("negative value of -dt, setting 0\n");
            $rOpts->{'default-tabsize'} = 0;
        }
        if ( $rOpts->{'default-tabsize'} > 20 ) {
            Warn("unreasonably large value of -dt, reducing\n");
            $rOpts->{'default-tabsize'} = 20;
        }
    }
    else {
        $rOpts->{'default-tabsize'} = 8;
    }

    # Check and clean up any use-feature list
    my $saw_use_feature_class;
    if ( $rOpts->{'use-feature'} ) {
        my $rseen = cleanup_word_list( $rOpts, 'use-feature' );
        $saw_use_feature_class = $rseen->{'class'};
    }

    # Check and clean up any sub-alias-list
    if (
        defined( $rOpts->{'sub-alias-list'} )
        && length( $rOpts->{'sub-alias-list'} )
        || $saw_use_feature_class
      )
    {
        my @forced_words;

        # include 'sub' for convenience if this option is used
        push @forced_words, 'sub';

        # use-feature=class requires method as a sub alias
        push @forced_words, 'method' if ($saw_use_feature_class);

        cleanup_word_list( $rOpts, 'sub-alias-list', \@forced_words );
    }

    make_grep_alias_string($rOpts);

    # Turn on fuzzy-line-length unless this is an extrude run, as determined
    # by the -l and -ci settings. Otherwise blinkers can form (case b935)
    if ( !$rOpts->{'fuzzy-line-length'} ) {
        if (   $rOpts->{'maximum-line-length'} != 1
            || $rOpts->{'continuation-indentation'} != 0 )
        {
            $rOpts->{'fuzzy-line-length'} = 1;
        }
    }

    # Large values of -scl can cause convergence problems, issue c167
    if ( $rOpts->{'short-concatenation-item-length'} > 12 ) {
        $rOpts->{'short-concatenation-item-length'} = 12;
    }

    # The freeze-whitespace option is currently a derived option which has its
    # own key
    $rOpts->{'freeze-whitespace'} = !$rOpts->{'add-whitespace'}
      && !$rOpts->{'delete-old-whitespace'};

    # Turn off certain options if whitespace is frozen
    # Note: vertical alignment will be automatically shut off
    if ( $rOpts->{'freeze-whitespace'} ) {
        $rOpts->{'logical-padding'} = 0;
    }

    # Define $tabsize, the number of spaces per tab for use in
    # guessing the indentation of source lines with leading tabs.
    # Assume same as for this run if tabs are used, otherwise assume
    # a default value, typically 8
    $self->[_tabsize_] =
        $rOpts->{'entab-leading-whitespace'}
      ? $rOpts->{'entab-leading-whitespace'}
      : $rOpts->{'tabs'} ?
$rOpts->{'indent-columns'} : $rOpts->{'default-tabsize'}; # Define the default line ending, before any -ple option is applied $self->[_line_separator_default_] = get_line_separator_default($rOpts); return; } ## end sub check_options sub get_line_separator_default { my ( $rOpts, $input_file ) = @_; # Get the line separator that will apply unless overriden by a # --preserve-line-endings flag for a specific file my $line_separator_default = "\n"; my $ole = $rOpts->{'output-line-ending'}; if ($ole) { my %endings = ( dos => "\015\012", win => "\015\012", mac => "\015", unix => "\012", ); $line_separator_default = $endings{ lc $ole }; if ( !$line_separator_default ) { my $str = join SPACE, keys %endings; Die(<{'preserve-line-endings'} ) { Warn("Ignoring -ple; conflicts with -ole\n"); $rOpts->{'preserve-line-endings'} = undef; } } return $line_separator_default; } ## end sub get_line_separator_default sub find_file_upwards { my ( $search_dir, $search_file ) = @_; $search_dir =~ s{/+$}{}; $search_file =~ s{^/+}{}; while (1) { my $try_path = "$search_dir/$search_file"; if ( -f $try_path ) { return $try_path; } elsif ( $search_dir eq '/' ) { return; } else { $search_dir = dirname($search_dir); } } # This return is for Perl-Critic. # We shouldn't get out of the while loop without a return return; } ## end sub find_file_upwards sub expand_command_abbreviations { # go through @ARGV and expand any abbreviations my ( $rexpansion, $rraw_options, $config_file ) = @_; # set a pass limit to prevent an infinite loop; # 10 should be plenty, but it may be increased to allow deeply # nested expansions. my $max_passes = 10; # keep looping until all expansions have been converted into actual # dash parameters.. foreach my $pass_count ( 0 .. $max_passes ) { my @new_argv = (); my $abbrev_count = 0; # loop over each item in @ARGV.. foreach my $word (@ARGV) { # convert any leading 'no-' to just 'no' if ( $word =~ /^(-[-]?no)-(.*)/ ) { $word = $1 . 
$2 } # if it is a dash flag (instead of a file name).. if ( $word =~ /^-[-]?([\w\-]+)(.*)/ ) { my $abr = $1; my $flags = $2; # save the raw input for debug output in case of circular refs if ( $pass_count == 0 ) { push( @{$rraw_options}, $word ); } # recombine abbreviation and flag, if necessary, # to allow abbreviations with arguments such as '-vt=1' if ( $rexpansion->{ $abr . $flags } ) { $abr = $abr . $flags; $flags = EMPTY_STRING; } # if we see this dash item in the expansion hash.. if ( $rexpansion->{$abr} ) { $abbrev_count++; # stuff all of the words that it expands to into the # new arg list for the next pass foreach my $abbrev ( @{ $rexpansion->{$abr} } ) { next unless $abbrev; # for safety; shouldn't happen push( @new_argv, '--' . $abbrev . $flags ); } } # not in expansion hash, must be actual long name else { push( @new_argv, $word ); } } # not a dash item, so just save it for the next pass else { push( @new_argv, $word ); } } ## end of this pass # update parameter list @ARGV to the new one @ARGV = @new_argv; last if ( !$abbrev_count ); # make sure we are not in an infinite loop if ( $pass_count == $max_passes ) { local $LIST_SEPARATOR = ')('; Warn(<{$abbrev} }; print STDOUT "$abbrev --> @list\n"; } return; } ## end sub dump_short_names sub check_vms_filename { # given a valid filename (the perltidy input file) # create a modified filename and separator character # suitable for VMS. # # Contributed by Michael Cartmell # my $filename = shift; my ( $base, $path ) = fileparse($filename); # remove explicit ; version $base =~ s/;-?\d*$// # remove explicit . version ie two dots in filename NB ^ escapes a dot or $base =~ s/( # begin capture $1 (?:^|[^^])\. # match a dot not preceded by a caret (?: # followed by nothing | # or .*[^^] # anything ending in a non caret ) ) # end capture $1 \.-?\d*$ # match . version number /$1/x; # normalize filename, if there are no unescaped dots then append one $base .= '.' 
      unless $base =~ /(?:^|[^^])\./;

    # if we don't already have an extension then we just append the extension
    my $separator = ( $base =~ /\.$/ ) ? EMPTY_STRING : "_";
    return ( $path . $base, $separator );
} ## end sub check_vms_filename

sub Win_OS_Type {

    # TODO: are these more standard names?
    # Win32s Win95 Win98 WinMe WinNT3.51 WinNT4 Win2000 WinXP/.Net Win2003

    # Returns a string that determines what MS OS we are on.
    # Returns win32s,95,98,Me,NT3.51,NT4,2000,XP/.Net,Win2003
    # Returns blank string if not an MS system.
    # Original code contributed by: Yves Orton
    # We need to know this to decide where to look for config files

    my $rpending_complaint = shift;
    my $os                 = EMPTY_STRING;
    return $os unless $OSNAME =~ /win32|dos/i;    # is it a MS box?

    # Systems built from Perl source may not have Win32.pm
    # But probably have Win32::GetOSVersion() anyway so the
    # following line is not 'required':
    # return $os unless eval('require Win32');

    # Use the standard API call to determine the version
    my ( $undef, $major, $minor, $build, $id );
    my $ok = eval {
        ( $undef, $major, $minor, $build, $id ) = Win32::GetOSVersion();
        1;
    };
    if ( !$ok && DEVEL_MODE ) {
        Fault("Could not call Win32::GetOSVersion(): $EVAL_ERROR\n");
    }

    #
    #    NAME                   ID   MAJOR  MINOR
    #    Windows NT 4           2      4       0
    #    Windows 2000           2      5       0
    #    Windows XP             2      5       1
    #    Windows Server 2003    2      5       2

    return "win32s" unless $id;    # If id==0 then it's a win32s box.
    $os = {                        # Magic numbers from MSDN
                                   # documentation of GetOSVersion
        1 => {
            0  => "95",
            10 => "98",
            90 => "Me",
        },
        2 => {
            0  => "2000",      # or NT 4, see below
            1  => "XP/.Net",
            2  => "Win2003",
            51 => "NT3.51",
        }
    }->{$id}->{$minor};

    # If $os is undefined, the above code is out of date.  Suggested updates
    # are welcome.
    unless ( defined $os ) {
        $os = EMPTY_STRING;

        # Deactivated this message 20180322 because it was needlessly
        # causing some test scripts to fail.  Need help from someone
        # with expertise in Windows to decide what is possible with windows.
${$rpending_complaint} .= </.../, meaning look upwards from directory my $config_file = shift; if ($config_file) { if ( my ( $start_dir, $search_file ) = ( $config_file =~ m{^(.*)\.\.\./(.*)$} ) ) { ${$rconfig_file_chatter} .= "# Searching Upward: $config_file\n"; $start_dir = '.' if !$start_dir; $start_dir = Cwd::realpath($start_dir); if ( my $found_file = find_file_upwards( $start_dir, $search_file ) ) { $config_file = $found_file; ${$rconfig_file_chatter} .= "# Found: $config_file\n"; } } } return $config_file; }; my $config_file; # look in current directory first $config_file = ".perltidyrc"; return $config_file if $exists_config_file->($config_file); if ($is_Windows) { $config_file = "perltidy.ini"; return $config_file if $exists_config_file->($config_file); } # Default environment vars. my @envs = qw(PERLTIDY HOME); # Check the NT/2k/XP locations, first a local machine def, then a # network def push @envs, qw(USERPROFILE HOMESHARE) if $OSNAME =~ /win32/i; # Now go through the environment ... 
foreach my $var (@envs) { ${$rconfig_file_chatter} .= "# Examining: \$ENV{$var}"; if ( defined( $ENV{$var} ) ) { ${$rconfig_file_chatter} .= " = $ENV{$var}\n"; # test ENV{ PERLTIDY } as file: if ( $var eq 'PERLTIDY' ) { $config_file = "$ENV{$var}"; $config_file = $resolve_config_file->($config_file); return $config_file if $exists_config_file->($config_file); } # test ENV as directory: $config_file = catfile( $ENV{$var}, ".perltidyrc" ); $config_file = $resolve_config_file->($config_file); return $config_file if $exists_config_file->($config_file); if ($is_Windows) { $config_file = catfile( $ENV{$var}, "perltidy.ini" ); $config_file = $resolve_config_file->($config_file); return $config_file if $exists_config_file->($config_file); } } else { ${$rconfig_file_chatter} .= "\n"; } } # then look for a system-wide definition # where to look varies with OS if ($is_Windows) { if ($Windows_type) { my ( $os, $system, $allusers ) = Win_Config_Locs( $rpending_complaint, $Windows_type ); # Check All Users directory, if there is one. # i.e. C:\Documents and Settings\User\perltidy.ini if ($allusers) { $config_file = catfile( $allusers, ".perltidyrc" ); return $config_file if $exists_config_file->($config_file); $config_file = catfile( $allusers, "perltidy.ini" ); return $config_file if $exists_config_file->($config_file); } # Check system directory. # retain old code in case someone has been able to create # a file with a leading period. 
            $config_file = catfile( $system, ".perltidyrc" );
            return $config_file if $exists_config_file->($config_file);

            $config_file = catfile( $system, "perltidy.ini" );
            return $config_file if $exists_config_file->($config_file);
        }
    }

    # Place to add customization code for other systems
    elsif ( $OSNAME eq 'OS2' ) {
    }
    elsif ( $OSNAME eq 'MacOS' ) {
    }
    elsif ( $OSNAME eq 'VMS' ) {
    }

    # Assume some kind of Unix
    else {

        $config_file = "/usr/local/etc/perltidyrc";
        return $config_file if $exists_config_file->($config_file);

        $config_file = "/etc/perltidyrc";
        return $config_file if $exists_config_file->($config_file);
    }

    # Couldn't find a config file
    return;
} ## end sub find_config_file

sub Win_Config_Locs {

    # In scalar context returns the OS name (95 98 ME NT3.51 NT4 2000 XP),
    # or undef if it's not a win32 OS.  In list context returns OS, System
    # Directory, and All Users Directory.  All Users will be empty on a
    # 9x/Me box.  Contributed by: Yves Orton.

    my ( $rpending_complaint, $os ) = @_;
    if ( !$os ) { $os = Win_OS_Type(); }

    return unless $os;

    my $system   = EMPTY_STRING;
    my $allusers = EMPTY_STRING;

    if ( $os =~ /9[58]|Me/ ) {
        $system = "C:/Windows";
    }
    elsif ( $os =~ /NT|XP|200?/ ) {
        $system = ( $os =~ /XP/ ) ? "C:/Windows/" : "C:/WinNT/";
        $allusers =
          ( $os =~ /NT/ )
          ? "C:/WinNT/profiles/All Users/"
          : "C:/Documents and Settings/All Users/";
    }
    else {

        # This currently would only happen on a win32s computer.  I don't have
        # one to test, so I am unsure how to proceed.  Suggestions welcome!
        ${$rpending_complaint} .=
"I don't know a sensible place to look for config files on an $os system.\n";
        return;
    }
    return wantarray ?
( $os, $system, $allusers ) : $os; } ## end sub Win_Config_Locs sub dump_config_file { my ( $fh, $config_file, $rconfig_file_chatter ) = @_; print STDOUT "${$rconfig_file_chatter}"; if ($fh) { print STDOUT "# Dump of file: '$config_file'\n"; while ( my $line = $fh->getline() ) { print STDOUT $line } my $ok = eval { $fh->close(); 1 }; if ( !$ok && DEVEL_MODE ) { Fault("Could not close file handle(): $EVAL_ERROR\n"); } } else { print STDOUT "# ...no config file found\n"; } return; } ## end sub dump_config_file sub read_config_file { my ( $fh, $config_file, $rexpansion ) = @_; my @config_list = (); # file is bad if non-empty $death_message is returned my $death_message = EMPTY_STRING; my $name = undef; my $line_no; my $opening_brace_line; while ( my $line = $fh->getline() ) { $line_no++; chomp $line; ( $line, $death_message ) = strip_comment( $line, $config_file, $line_no ); last if ($death_message); next unless $line; $line =~ s/^\s*(.*?)\s*$/$1/; # trim both ends next unless $line; my $body = $line; # Look for complete or partial abbreviation definition of the form # name { body } or name { or name { body # See rules in perltidy's perldoc page # Section: Other Controls - Creating a new abbreviation if ( $line =~ /^((\w+)\s*\{)(.*)?$/ ) { ( $name, $body ) = ( $2, $3 ); # Cannot start new abbreviation unless old abbreviation is complete last if ($opening_brace_line); $opening_brace_line = $line_no unless ( $body && $body =~ s/\}$// ); # handle a new alias definition if ( $rexpansion->{$name} ) { local $LIST_SEPARATOR = ')('; my @names = sort keys %{$rexpansion}; $death_message = "Here is a list of all installed aliases\n(@names)\n" . 
"Attempting to redefine alias ($name) in config file $config_file line $INPUT_LINE_NUMBER\n"; last; } $rexpansion->{$name} = []; } # leading opening braces not allowed elsif ( $line =~ /^{/ ) { $opening_brace_line = undef; $death_message = "Unexpected '{' at line $line_no in config file '$config_file'\n"; last; } # Look for abbreviation closing: body } or } elsif ( $line =~ /^(.*)?\}$/ ) { $body = $1; if ($opening_brace_line) { $opening_brace_line = undef; } else { $death_message = "Unexpected '}' at line $line_no in config file '$config_file'\n"; last; } } # Now store any parameters if ($body) { my ( $rbody_parts, $msg ) = parse_args($body); if ($msg) { $death_message = <{$name} }, @{$rbody_parts}; } else { push( @config_list, @{$rbody_parts} ); } } } if ($opening_brace_line) { $death_message = "Didn't see a '}' to match the '{' at line $opening_brace_line in config file '$config_file'\n"; } my $ok = eval { $fh->close(); 1 }; if ( !$ok && DEVEL_MODE ) { Fault("Could not close file handle(): $EVAL_ERROR\n"); } return ( \@config_list, $death_message ); } ## end sub read_config_file sub strip_comment { # Strip any comment from a command line my ( $instr, $config_file, $line_no ) = @_; my $msg = EMPTY_STRING; # check for full-line comment if ( $instr =~ /^\s*#/ ) { return ( EMPTY_STRING, $msg ); } # nothing to do if no comments if ( $instr !~ /#/ ) { return ( $instr, $msg ); } # handle case of no quotes elsif ( $instr !~ /['"]/ ) { # We now require a space before the # of a side comment # this allows something like: # -sbcp=# # Otherwise, it would have to be quoted: # -sbcp='#' $instr =~ s/\s+\#.*$//; return ( $instr, $msg ); } # handle comments and quotes my $outstr = EMPTY_STRING; my $quote_char = EMPTY_STRING; while (1) { # looking for ending quote character if ($quote_char) { if ( $instr =~ /\G($quote_char)/gc ) { $quote_char = EMPTY_STRING; $outstr .= $1; } elsif ( $instr =~ /\G(.)/gc ) { $outstr .= $1; } # error..we reached the end without seeing the ending 
quote char else { $msg = < in this text: $instr Please fix this line or use -npro to avoid reading this file EOM last; } } # accumulating characters and looking for start of a quoted string else { if ( $instr =~ /\G([\"\'])/gc ) { $outstr .= $1; $quote_char = $1; } # Note: not yet enforcing the space-before-hash rule for side # comments if the parameter is quoted. elsif ( $instr =~ /\G#/gc ) { last; } elsif ( $instr =~ /\G(.)/gc ) { $outstr .= $1; } else { last; } } } return ( $outstr, $msg ); } ## end sub strip_comment sub parse_args { # Parse a command string containing multiple string with possible # quotes, into individual commands. It might look like this, for example: # # -wba=" + - " -some-thing -wbb='. && ||' # # There is no need, at present, to handle escaped quote characters. # (They are not perltidy tokens, so needn't be in strings). my ($body) = @_; my @body_parts = (); my $quote_char = EMPTY_STRING; my $part = EMPTY_STRING; my $msg = EMPTY_STRING; # Check for external call with undefined $body - added to fix # github issue Perl-Tidy-Sweetened issue #23 if ( !defined($body) ) { $body = EMPTY_STRING } while (1) { # looking for ending quote character if ($quote_char) { if ( $body =~ /\G($quote_char)/gc ) { $quote_char = EMPTY_STRING; } elsif ( $body =~ /\G(.)/gc ) { $part .= $1; } # error..we reached the end without seeing the ending quote char else { if ( length($part) ) { push @body_parts, $part; } $msg = < in this text: $body EOM last; } } # accumulating characters and looking for start of a quoted string else { if ( $body =~ /\G([\"\'])/gc ) { $quote_char = $1; } elsif ( $body =~ /\G(\s+)/gc ) { if ( length($part) ) { push @body_parts, $part; } $part = EMPTY_STRING; } elsif ( $body =~ /\G(.)/gc ) { $part .= $1; } else { if ( length($part) ) { push @body_parts, $part; } last; } } } return ( \@body_parts, $msg ); } ## end sub parse_args sub dump_long_names { my @names = @_; print STDOUT < does not take an argument # =s takes a mandatory string # :s 
takes an optional string # =i takes a mandatory integer # :i takes an optional integer # ! does not take an argument and may be negated # i.e., -foo and -nofoo are allowed # a double dash signals the end of the options list # #-------------------------------------------------- EOM foreach my $name ( sort @names ) { print STDOUT "$name\n" } return; } ## end sub dump_long_names sub dump_defaults { my @defaults = @_; print STDOUT "Default command line options:\n"; foreach my $line ( sort @defaults ) { print STDOUT "$line\n" } return; } ## end sub dump_defaults sub readable_options { # return options for this run as a string which could be # put in a perltidyrc file my ( $rOpts, $roption_string ) = @_; my %Getopt_flags; my $rGetopt_flags = \%Getopt_flags; my $readable_options = "# Final parameter set for this run.\n"; $readable_options .= "# See utility 'perltidyrc_dump.pl' for nicer formatting.\n"; foreach my $opt ( @{$roption_string} ) { my $flag = EMPTY_STRING; if ( $opt =~ /(.*)(!|=.*)$/ ) { $opt = $1; $flag = $2; } if ( defined( $rOpts->{$opt} ) ) { $rGetopt_flags->{$opt} = $flag; } } foreach my $key ( sort keys %{$rOpts} ) { my $flag = $rGetopt_flags->{$key}; my $value = $rOpts->{$key}; my $prefix = '--'; my $suffix = EMPTY_STRING; if ($flag) { if ( $flag =~ /^=/ ) { if ( $value !~ /^\d+$/ ) { $value = '"' . $value . '"' } $suffix = "=" . $value; } elsif ( $flag =~ /^!/ ) { $prefix .= "no" unless ($value); } else { # shouldn't happen $readable_options .= "# ERROR in dump_options: unrecognized flag $flag for $key\n"; } } $readable_options .= $prefix . $key . $suffix . "\n"; } return $readable_options; } ## end sub readable_options sub show_version { print STDOUT <<"EOM"; This is perltidy, v$VERSION Copyright 2000-2022, Steve Hancock Perltidy is free software and may be copied under the terms of the GNU General Public License, which is included in the distribution files. 
Complete documentation for perltidy can be found using 'man perltidy' or on the internet at http://perltidy.sourceforge.net. EOM return; } ## end sub show_version sub usage { print STDOUT <outfile perltidy [ options ] outfile Options have short and long forms. Short forms are shown; see man pages for long forms. Note: '=s' indicates a required string, and '=n' indicates a required integer. I/O control -h show this help -o=file name of the output file (only if single input file) -oext=s change output extension from 'tdy' to s -opath=path change path to be 'path' for output files -b backup original to .bak and modify file in-place -bext=s change default backup extension from 'bak' to s -q deactivate error messages (for running under editor) -w include non-critical warning messages in the .ERR error output -log save .LOG file, which has useful diagnostics -f force perltidy to read a binary file -g like -log but writes more detailed .LOG file, for debugging scripts -opt write the set of options actually used to a .LOG file -npro ignore .perltidyrc configuration command file -pro=file read configuration commands from file instead of .perltidyrc -st send output to standard output, STDOUT -se send all error output to standard error output, STDERR -v display version number to standard output and quit Basic Options: -i=n use n columns per indentation level (default n=4) -t tabs: use one tab character per indentation level, not recommended -nt no tabs: use n spaces per indentation level (default) -et=n entab leading whitespace n spaces per tab; not recommended -io "indent only": just do indentation, no other formatting. 
-sil=n set starting indentation level to n; use if auto detection fails -ole=s specify output line ending (s=dos or win, mac, unix) -ple keep output line endings same as input (input must be filename) Whitespace Control -fws freeze whitespace; this disables all whitespace changes and disables the following switches: -bt=n sets brace tightness, n= (0 = loose, 1=default, 2 = tight) -bbt same as -bt but for code block braces; same as -bt if not given -bbvt block braces vertically tight; use with -bl or -bli -bbvtl=s make -bbvt to apply to selected list of block types -pt=n paren tightness (n=0, 1 or 2) -sbt=n square bracket tightness (n=0, 1, or 2) -bvt=n brace vertical tightness, n=(0=open, 1=close unless multiple steps on a line, 2=always close) -pvt=n paren vertical tightness (see -bvt for n) -sbvt=n square bracket vertical tightness (see -bvt for n) -bvtc=n closing brace vertical tightness: n=(0=open, 1=sometimes close, 2=always close) -pvtc=n closing paren vertical tightness, see -bvtc for n. -sbvtc=n closing square bracket vertical tightness, see -bvtc for n. -ci=n sets continuation indentation=n, default is n=2 spaces -lp line up parentheses, brackets, and non-BLOCK braces -sfs add space before semicolon in for( ; ; ) -aws allow perltidy to add whitespace (default) -dws delete all old non-essential whitespace -icb indent closing brace of a code block -cti=n closing indentation of paren, square bracket, or non-block brace: n=0 none, =1 align with opening, =2 one full indentation level -icp equivalent to -cti=2 -wls=s want space left of tokens in string; i.e. -nwls='+ - * /' -wrs=s want space right of tokens in string; -sts put space before terminal semicolon of a statement -sak=s put space between keywords given in s and '('; -nsak=s no space between keywords in s and '('; i.e. 
         -nsak='my our local'

Line Break Control
 -fnl    freeze newlines; this disables all line break changes
            and disables the following switches:
 -anl    add newlines;  ok to introduce new line breaks
 -bbs    add blank line before subs and packages
 -bbc    add blank line before block comments
 -bbb    add blank line between major blocks
 -kbl=n  keep old blank lines? 0=no, 1=some, 2=all
 -mbl=n  maximum consecutive blank lines to output (default=1)
 -ce     cuddled else; use this style: '} else {'
 -cb     cuddled blocks (other than 'if-elsif-else')
 -cbl=s  list of blocks to cuddle, default 'try-catch-finally'
 -dnl    delete old newlines (default)
 -l=n    maximum line length;  default n=80
 -bl     opening brace on new line
 -sbl    opening sub brace on new line.  value of -bl is used if not given.
 -bli    opening brace on new line and indented
 -bar    opening brace always on right, even for long clauses
 -vt=n   vertical tightness (requires -lp); n controls break after opening
         token: 0=never  1=no break if next line balanced   2=no break
 -vtc=n  vertical tightness of closing container; n controls if closing
         token starts new line: 0=always  1=not unless list  2=never
 -wba=s  want break after tokens in string; i.e. wba=': .'
 -wbb=s  want break before tokens in string
 -wn     weld nested: combines opening and closing tokens when both are adjacent
 -wnxl=s weld nested exclusion list: provides some control over the types of
         containers which can be welded

Following Old Breakpoints
 -kis    keep interior semicolons.  Allows multiple statements per line.
-boc break at old comma breaks: turns off all automatic list formatting -bol break at old logical breakpoints: or, and, ||, && (default) -bom break at old method call breakpoints: -> -bok break at old list keyword breakpoints such as map, sort (default) -bot break at old conditional (ternary ?:) operator breakpoints (default) -boa break at old attribute breakpoints -cab=n break at commas after a comma-arrow (=>): n=0 break at all commas after => n=1 stable: break unless this breaks an existing one-line container n=2 break only if a one-line container cannot be formed n=3 do not treat commas after => specially at all Comment controls -ibc indent block comments (default) -isbc indent spaced block comments; may indent unless no leading space -msc=n minimum desired spaces to side comment, default 4 -fpsc=n fix position for side comments; default 0; -csc add or update closing side comments after closing BLOCK brace -dcsc delete closing side comments created by a -csc command -cscp=s change closing side comment prefix to be other than '## end' -cscl=s change closing side comment to apply to selected list of blocks -csci=n minimum number of lines needed to apply a -csc tag, default n=6 -csct=n maximum number of columns of appended text, default n=20 -cscw causes warning if old side comment is overwritten with -csc -sbc use 'static block comments' identified by leading '##' (default) -sbcp=s change static block comment identifier to be other than '##' -osbc outdent static block comments -ssc use 'static side comments' identified by leading '##' (default) -sscp=s change static side comment identifier to be other than '##' Delete selected text -dac delete all comments AND pod -dbc delete block comments -dsc delete side comments -dp delete pod Send selected text to a '.TEE' file -tac tee all comments AND pod -tbc tee block comments -tsc tee side comments -tp tee pod Outdenting -olq outdent long quoted strings (default) -olc outdent a long block comment line -ola outdent 
statement labels -okw outdent control keywords (redo, next, last, goto, return) -okwl=s specify alternative keywords for -okw command Other controls -mft=n maximum fields per table; default n=0 (no limit) -x do not format lines before hash-bang line (i.e., for VMS) -asc allows perltidy to add a ';' when missing (default) -dsm allows perltidy to delete an unnecessary ';' (default) Combinations of other parameters -gnu attempt to follow GNU Coding Standards as applied to perl -mangle remove as many newlines as possible (but keep comments and pods) -extrude insert as many newlines as possible Dump and die, debugging -dop dump options used in this run to standard output and quit -ddf dump default options to standard output and quit -dsn dump all option short names to standard output and quit -dln dump option long names to standard output and quit -dpro dump whatever configuration file is in effect to standard output -dtt dump all token types to standard output and quit HTML -html write an html file (see 'man perl2web' for many options) Note: when -html is used, no indentation or formatting are done. Hint: try perltidy -html -css=mystyle.css filename.pl and edit mystyle.css to change the appearance of filename.html. -nnn gives line numbers -pre only writes out
<pre>..</pre>
code section -toc places a table of contents to subs at the top (default) -pod passes pod text through pod2html (default) -frm write html as a frame (3 files) -text=s extra extension for table of contents if -frm, default='toc' -sext=s extra extension for file content if -frm, default='src' A prefix of "n" negates short form toggle switches, and a prefix of "no" negates the long forms. For example, -nasc means don't add missing semicolons. If you are unable to see this entire text, try "perltidy -h | more" For more detailed information, and additional options, try "man perltidy", or go to the perltidy home page at http://perltidy.sourceforge.net EOF return; } ## end sub usage 1; Perl-Tidy-20230309/t/0002755000175000017500000000000014401515241013067 5ustar stevestevePerl-Tidy-20230309/t/snippets16.t0000644000175000017500000002737614373177244015324 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 spp.spp1 #2 spp.spp2 #3 git16.def #4 git10.def #5 git10.git10 #6 multiple_equals.def #7 align31.def #8 almost1.def #9 almost2.def #10 almost3.def #11 rt130394.def #12 rt131115.def #13 rt131115.rt131115 #14 ndsm1.def #15 ndsm1.ndsm #16 rt131288.def #17 rt130394.rt130394 #18 git18.def #19 here2.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'git10' => "-wn -ce -cbl=sort,map,grep", 'ndsm' => "-ndsm", 'rt130394' => "-olbn=1", 'rt131115' => "-bli", 'spp1' => "-spp=1", 'spp2' => "-spp=2", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'align31' => <<'----------', # do not align the commas $w->insert( ListBox => origin => [ 270, 160 ], size => [ 200, 55 ], ); ---------- 'almost1' => <<'----------', # not a good alignment my $realname = 
catfile( $dir, $file ); my $display_name = defined $disp ? catfile( $disp, $file ) : $file; ---------- 'almost2' => <<'----------', # not a good alignment my $substname = ( $indtot > 1 ? $indname . $indno : $indname ); my $incname = $indname . ( $indtot > 1 ? $indno : "" ); ---------- 'almost3' => <<'----------', # not a good alignment sub head { match_on_type @_ => Null => sub { die "Cannot get head of Null" }, ArrayRef => sub { $_->[0] }; } ---------- 'git10' => <<'----------', # perltidy -wn -ce -cbl=sort,map,grep @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] } map { [ $_, length($_) ] } @unsorted; ---------- 'git16' => <<'----------', # git#16, two equality lines with fat commas on the right my $Package = $Self->RepositoryGet( %Param, Result => 'SCALAR' ); my %Structure = $Self->PackageParse( String => $Package ); ---------- 'git18' => <<'----------', # parsing stuff like 'x17' before fat comma my %bb = ( 123x18 => '123x18', 123 x19 => '123 x19', 123x 20 => '123x 20', 2 x 7 => '2 x 7', x40 => 'x40', 'd' x17 => "'d' x17", c x17 => 'c x17', ); foreach my $key ( keys %bb ) { print "key='$key' => $bb{$key}\n"; } ---------- 'here2' => <<'----------', $_ = ""; s|(?:)|"${\< <<'----------', # ignore second '=' here $| = $debug = 1 if $opt_d; $full_index = 1 if $opt_i; $query_all = $opt_A if $opt_A; # not aligning multiple '='s here $start = $end = $len = $ismut = $number = $allele_ori = $allele_mut = $proof = $xxxxreg = $reg = $dist = ''; ---------- 'ndsm1' => <<'----------', ;;;;; # 1 trapped semicolon sub numerically {$a <=> $b}; ;;;;; sub Numerically {$a <=> $b}; # trapped semicolon @: = qw;2c72656b636168 2020202020 ;; __; ---------- 'rt130394' => <<'----------', # rt130394: keep on one line with -olbn=1 $factorial = sub { reduce { $a * $b } 1 .. 
11 }; ---------- 'rt131115' => <<'----------', # closing braces to be inteded with -bli sub a { my %uniq; foreach my $par (@_) { $uniq{$par} = 1; } } ---------- 'rt131288' => <<'----------', sub OptArgs2::STYLE_FULL { 3 } $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . "\n"; ---------- 'spp' => <<'----------', sub get_val() { } sub get_Val () { } sub Get_val () { } my $sub1=sub () { }; my $sub2=sub () { }; ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'spp.spp1' => { source => "spp", params => "spp1", expect => <<'#1...........', sub get_val() { } sub get_Val () { } sub Get_val () { } my $sub1 = sub () { }; my $sub2 = sub () { }; #1........... }, 'spp.spp2' => { source => "spp", params => "spp2", expect => <<'#2...........', sub get_val () { } sub get_Val () { } sub Get_val () { } my $sub1 = sub () { }; my $sub2 = sub () { }; #2........... }, 'git16.def' => { source => "git16", params => "def", expect => <<'#3...........', # git#16, two equality lines with fat commas on the right my $Package = $Self->RepositoryGet( %Param, Result => 'SCALAR' ); my %Structure = $Self->PackageParse( String => $Package ); #3........... }, 'git10.def' => { source => "git10", params => "def", expect => <<'#4...........', # perltidy -wn -ce -cbl=sort,map,grep @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] } map { [ $_, length($_) ] } @unsorted; #4........... }, 'git10.git10' => { source => "git10", params => "git10", expect => <<'#5...........', # perltidy -wn -ce -cbl=sort,map,grep @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] } map { [ $_, length($_) ] } @unsorted; #5........... 
}, 'multiple_equals.def' => { source => "multiple_equals", params => "def", expect => <<'#6...........', # ignore second '=' here $| = $debug = 1 if $opt_d; $full_index = 1 if $opt_i; $query_all = $opt_A if $opt_A; # not aligning multiple '='s here $start = $end = $len = $ismut = $number = $allele_ori = $allele_mut = $proof = $xxxxreg = $reg = $dist = ''; #6........... }, 'align31.def' => { source => "align31", params => "def", expect => <<'#7...........', # do not align the commas $w->insert( ListBox => origin => [ 270, 160 ], size => [ 200, 55 ], ); #7........... }, 'almost1.def' => { source => "almost1", params => "def", expect => <<'#8...........', # not a good alignment my $realname = catfile( $dir, $file ); my $display_name = defined $disp ? catfile( $disp, $file ) : $file; #8........... }, 'almost2.def' => { source => "almost2", params => "def", expect => <<'#9...........', # not a good alignment my $substname = ( $indtot > 1 ? $indname . $indno : $indname ); my $incname = $indname . ( $indtot > 1 ? $indno : "" ); #9........... }, 'almost3.def' => { source => "almost3", params => "def", expect => <<'#10...........', # not a good alignment sub head { match_on_type @_ => Null => sub { die "Cannot get head of Null" }, ArrayRef => sub { $_->[0] }; } #10........... }, 'rt130394.def' => { source => "rt130394", params => "def", expect => <<'#11...........', # rt130394: keep on one line with -olbn=1 $factorial = sub { reduce { $a * $b } 1 .. 11; }; #11........... }, 'rt131115.def' => { source => "rt131115", params => "def", expect => <<'#12...........', # closing braces to be inteded with -bli sub a { my %uniq; foreach my $par (@_) { $uniq{$par} = 1; } } #12........... }, 'rt131115.rt131115' => { source => "rt131115", params => "rt131115", expect => <<'#13...........', # closing braces to be inteded with -bli sub a { my %uniq; foreach my $par (@_) { $uniq{$par} = 1; } } #13........... 
}, 'ndsm1.def' => { source => "ndsm1", params => "def", expect => <<'#14...........', ; # 1 trapped semicolon sub numerically { $a <=> $b } sub Numerically { $a <=> $b }; # trapped semicolon @: = qw;2c72656b636168 2020202020 ;; __; #14........... }, 'ndsm1.ndsm' => { source => "ndsm1", params => "ndsm", expect => <<'#15...........', ; ; ; ; ; # 1 trapped semicolon sub numerically { $a <=> $b }; ; ; ; ; ; sub Numerically { $a <=> $b }; # trapped semicolon @: = qw;2c72656b636168 2020202020 ;; __; #15........... }, 'rt131288.def' => { source => "rt131288", params => "def", expect => <<'#16...........', sub OptArgs2::STYLE_FULL { 3 } $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . "\n"; #16........... }, 'rt130394.rt130394' => { source => "rt130394", params => "rt130394", expect => <<'#17...........', # rt130394: keep on one line with -olbn=1 $factorial = sub { reduce { $a * $b } 1 .. 11 }; #17........... }, 'git18.def' => { source => "git18", params => "def", expect => <<'#18...........', # parsing stuff like 'x17' before fat comma my %bb = ( 123 x 18 => '123x18', 123 x 19 => '123 x19', 123 x 20 => '123x 20', 2 x 7 => '2 x 7', x40 => 'x40', 'd' x 17 => "'d' x17", c x17 => 'c x17', ); foreach my $key ( keys %bb ) { print "key='$key' => $bb{$key}\n"; } #18........... }, 'here2.def' => { source => "here2", params => "def", expect => <<'#19...........', $_ = ""; s|(?:)|"${\< $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets13.t0000644000175000017500000003400114373177244015300 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 align10.def #2 align11.def #3 align12.def #4 align13.def #5 rt127633.def #6 rt127633.rt127633 #7 align14.def #8 align15.def #9 align16.def #10 break5.def #11 align19.def #12 align20.def #13 align21.def #14 align22.def #15 align23.def #16 align24.def #17 align25.def #18 align26.def #19 align27.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'rt127633' => "-baao", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'align10' => <<'----------', $message =~ &rhs_wordwrap( $message, $width ); $message_len =~ split( /^/, $message ); ---------- 'align11' => <<'----------', my $accountno = getnextacctno( $env, $bornum, $dbh ); my $item = getiteminformation( $env, $itemno ); my $account = "Insert into accountlines bla bla"; ---------- 'align12' => <<'----------', my $type = shift || "o"; my $fname = ( $type eq 'oo' ? 'orte_city' : 'orte' ); my $suffix = ( $coord_system eq 'standard' ? 
'' : '-orig' ); ---------- 'align13' => <<'----------', # symbols =~ and !~ are equivalent in alignment ok( $out !~ /EXACT /, "No 'baz'" ); ok( $out =~ //, "Got 'liz'" ); # liz ok( $out =~ //, "Got 'zoo'" ); # zoo ok( $out !~ //, "Got 'zap'" ); # zap ---------- 'align14' => <<'----------', # align the = my($apple)=new Fruit("Apple1",.1,.30); my($grapefruit)=new Grapefruit("Grapefruit1",.3); my($redgrapefruit)=new RedGrapefruit("Grapefruit2",.3); ---------- 'align15' => <<'----------', # align both = and // my$color=$opts{'-color'}//'black'; my$background=$opts{'-background'}//'none'; my$linewidth=$opts{'-linewidth'}//1; my$radius=$opts{'-radius'}//0; ---------- 'align16' => <<'----------', # align all at first => use constant { PHFAM => [ { John => 1, Jane => 2, Sally => 3 }, 33, 28, 3 ], FAMILY => [qw( John Jane Sally )], AGES => { John => 33, Jane => 28, Sally => 3 }, RFAM => [ [qw( John Jane Sally )] ], THREE => 3, SPIT => sub { shift }, }; ---------- 'align19' => <<'----------', # different lhs patterns, do not align the '=' @_ = qw(sort grep map do eval); @is_not_zero_continuation_block_type{@_} = (1) x scalar(@_); ---------- 'align20' => <<'----------', # marginal two-line match; different lhs patterns; do not align $w[$i] = $t; $t = 1000000; ---------- 'align21' => <<'----------', # two lines with large gap but same lhs pattern so align equals local (@pieces) = split( /\./, $filename, 2 ); local ($just_dir_and_base) = $pieces[0]; # two lines with 3 alignment tokens $expect = "1$expect" if $expect =~ /^e/i; $p = "1$p" if defined $p and $p =~ /^e/i; # two lines where alignment causes a large gap is( eval { sysopen( my $ro, $foo, &O_RDONLY | $TAINT0 ) }, undef ); is( $@, '' ); ---------- 'align22' => <<'----------', # two equality lines with different patterns to left of equals do not align $signame{$_} = ++$signal; $signum[$signal] = $_; ---------- 'align23' => <<'----------', # two equality lines with same pattern on left of equals will align my $orig = my 
$format = "^<<<<< ~~\n"; my $abc = "abc"; ---------- 'align24' => <<'----------', # Do not align interior fat commas here; different container types my $p = TAP::Parser::SubclassTest->new( { exec => [ $cat => $file ], sources => { MySourceHandler => { accept_all => 1 } }, } ); ---------- 'align25' => <<'----------', # do not align internal commas here; different container types is_deeply( [ $a, $a ], [ $b, $c ] ); is_deeply( { foo => $a, bar => $a }, { foo => $b, bar => $c } ); is_deeply( [ \$a, \$a ], [ \$b, \$c ] ); ---------- 'align26' => <<'----------', # align first of multiple equals $SIG{PIPE}=sub{die"writingtoaclosedpipe"}; $SIG{BREAK}=$SIG{INT}=$SIG{TERM}; $SIG{HUP}=\&some_handler; ---------- 'align27' => <<'----------', # do not align first equals here (unmatched commas on left side of =) my ( $self, $name, $type ) = @_; my $html_toc_fh = $self->{_html_toc_fh}; my $html_prelim_fh = $self->{_html_prelim_fh}; ---------- 'break5' => <<'----------', # do not break at .'s after the ? return ( ( $pod eq $pod2 ) & amp; & ( $htype eq "NAME" ) ) ? "\n<A NAME=\"" . $value . "\">\n$text</A>\n" : "\n$type$pod2.html\#" . $value . "\">$text<\/A>\n"; ---------- 'rt127633' => <<'----------', # keep lines long; do not break after 'return' and '.' with -baoo return $ref eq 'SCALAR' ? $self->encode_scalar( $object, $name, $type, $attr ) : $ref eq 'ARRAY'; my $s = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' . 'bbbbbbbbbbbbbbbbbbbbbbbbb'; ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'align10.def' => { source => "align10", params => "def", expect => <<'#1...........', $message =~ &rhs_wordwrap( $message, $width ); $message_len =~ split( /^/, $message ); #1........... 
}, 'align11.def' => { source => "align11", params => "def", expect => <<'#2...........', my $accountno = getnextacctno( $env, $bornum, $dbh ); my $item = getiteminformation( $env, $itemno ); my $account = "Insert into accountlines bla bla"; #2........... }, 'align12.def' => { source => "align12", params => "def", expect => <<'#3...........', my $type = shift || "o"; my $fname = ( $type eq 'oo' ? 'orte_city' : 'orte' ); my $suffix = ( $coord_system eq 'standard' ? '' : '-orig' ); #3........... }, 'align13.def' => { source => "align13", params => "def", expect => <<'#4...........', # symbols =~ and !~ are equivalent in alignment ok( $out !~ /EXACT /, "No 'baz'" ); ok( $out =~ //, "Got 'liz'" ); # liz ok( $out =~ //, "Got 'zoo'" ); # zoo ok( $out !~ //, "Got 'zap'" ); # zap #4........... }, 'rt127633.def' => { source => "rt127633", params => "def", expect => <<'#5...........', # keep lines long; do not break after 'return' and '.' with -baoo return $ref eq 'SCALAR' ? $self->encode_scalar( $object, $name, $type, $attr ) : $ref eq 'ARRAY'; my $s = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' . 'bbbbbbbbbbbbbbbbbbbbbbbbb'; #5........... }, 'rt127633.rt127633' => { source => "rt127633", params => "rt127633", expect => <<'#6...........', # keep lines long; do not break after 'return' and '.' with -baoo return $ref eq 'SCALAR' ? $self->encode_scalar( $object, $name, $type, $attr ) : $ref eq 'ARRAY'; my $s = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' . 'bbbbbbbbbbbbbbbbbbbbbbbbb'; #6........... }, 'align14.def' => { source => "align14", params => "def", expect => <<'#7...........', # align the = my ($apple) = new Fruit( "Apple1", .1, .30 ); my ($grapefruit) = new Grapefruit( "Grapefruit1", .3 ); my ($redgrapefruit) = new RedGrapefruit( "Grapefruit2", .3 ); #7........... 
}, 'align15.def' => { source => "align15", params => "def", expect => <<'#8...........', # align both = and // my $color = $opts{'-color'} // 'black'; my $background = $opts{'-background'} // 'none'; my $linewidth = $opts{'-linewidth'} // 1; my $radius = $opts{'-radius'} // 0; #8........... }, 'align16.def' => { source => "align16", params => "def", expect => <<'#9...........', # align all at first => use constant { PHFAM => [ { John => 1, Jane => 2, Sally => 3 }, 33, 28, 3 ], FAMILY => [qw( John Jane Sally )], AGES => { John => 33, Jane => 28, Sally => 3 }, RFAM => [ [qw( John Jane Sally )] ], THREE => 3, SPIT => sub { shift }, }; #9........... }, 'break5.def' => { source => "break5", params => "def", expect => <<'#10...........', # do not break at .'s after the ? return ( ( $pod eq $pod2 ) & amp; & ( $htype eq "NAME" ) ) ? "\n<A NAME=\"" . $value . "\">\n$text</A>\n" : "\n$type$pod2.html\#" . $value . "\">$text<\/A>\n"; #10........... }, 'align19.def' => { source => "align19", params => "def", expect => <<'#11...........', # different lhs patterns, do not align the '=' @_ = qw(sort grep map do eval); @is_not_zero_continuation_block_type{@_} = (1) x scalar(@_); #11........... }, 'align20.def' => { source => "align20", params => "def", expect => <<'#12...........', # marginal two-line match; different lhs patterns; do not align $w[$i] = $t; $t = 1000000; #12........... }, 'align21.def' => { source => "align21", params => "def", expect => <<'#13...........', # two lines with large gap but same lhs pattern so align equals local (@pieces) = split( /\./, $filename, 2 ); local ($just_dir_and_base) = $pieces[0]; # two lines with 3 alignment tokens $expect = "1$expect" if $expect =~ /^e/i; $p = "1$p" if defined $p and $p =~ /^e/i; # two lines where alignment causes a large gap is( eval { sysopen( my $ro, $foo, &O_RDONLY | $TAINT0 ) }, undef ); is( $@, '' ); #13........... 
}, 'align22.def' => { source => "align22", params => "def", expect => <<'#14...........', # two equality lines with different patterns to left of equals do not align $signame{$_} = ++$signal; $signum[$signal] = $_; #14........... }, 'align23.def' => { source => "align23", params => "def", expect => <<'#15...........', # two equality lines with same pattern on left of equals will align my $orig = my $format = "^<<<<< ~~\n"; my $abc = "abc"; #15........... }, 'align24.def' => { source => "align24", params => "def", expect => <<'#16...........', # Do not align interior fat commas here; different container types my $p = TAP::Parser::SubclassTest->new( { exec => [ $cat => $file ], sources => { MySourceHandler => { accept_all => 1 } }, } ); #16........... }, 'align25.def' => { source => "align25", params => "def", expect => <<'#17...........', # do not align internal commas here; different container types is_deeply( [ $a, $a ], [ $b, $c ] ); is_deeply( { foo => $a, bar => $a }, { foo => $b, bar => $c } ); is_deeply( [ \$a, \$a ], [ \$b, \$c ] ); #17........... }, 'align26.def' => { source => "align26", params => "def", expect => <<'#18...........', # align first of multiple equals $SIG{PIPE} = sub { die "writingtoaclosedpipe" }; $SIG{BREAK} = $SIG{INT} = $SIG{TERM}; $SIG{HUP} = \&some_handler; #18........... }, 'align27.def' => { source => "align27", params => "def", expect => <<'#19...........', # do not align first equals here (unmatched commas on left side of =) my ( $self, $name, $type ) = @_; my $html_toc_fh = $self->{_html_toc_fh}; my $html_prelim_fh = $self->{_html_prelim_fh}; #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/filter_example.t0000755000175000017500000000301013743146146016263 0ustar stevesteve# Test use of prefilter and postfilter parameters use strict; use Carp; use Perl::Tidy; use Test::More; my $name = 'filter_example'; BEGIN { plan tests => 1; } my $source = <<'ENDS'; use Method::Signatures::Simple; method foo1 { $self->bar } # with signature method foo2($bar, %opts) { $self->bar(reverse $bar) if $opts{rev}; } # attributes method foo3 : lvalue { $self->{foo} } # change invocant name method foo4 ( $class : $bar ) { $class->bar($bar) } ENDS my $expect = <<'ENDE'; use Method::Signatures::Simple; method foo1 { $self->bar } # with signature method foo2 ( $bar, %opts ) { $self->bar( reverse $bar ) if $opts{rev}; } # attributes method foo3 : lvalue { $self->{foo}; } # change invocant name method foo4 ( $class : $bar ) { $class->bar($bar) } ENDE my $output; my $stderr_string; my $errorfile_string; my $params = ""; my $err = Perl::Tidy::perltidy( #argv => '-npro', # fix for RT#127679, avoid reading unwanted .perltidyrc argv => '', perltidyrc => \$params, # avoid reading unwanted .perltidyrc prefilter => sub { $_ = $_[0]; s/^\s*method\s+(\w.*)/sub METHOD_$1/gm; return $_ }, postfilter => sub { $_ = $_[0]; s/sub\s+METHOD_/method /gm; return $_ }, source => \$source, destination => \$output, stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { ok(0); } else { is( $output, $expect, $name ); } Perl-Tidy-20230309/t/snippets14.t0000644000175000017500000007471114373177244015315 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 else1.def #2 else2.def #3 ternary3.def #4 align17.def #5 align18.def #6 kgb1.def #7 kgb1.kgb #8 kgb2.def #9 kgb2.kgb #10 kgb3.def #11 kgb3.kgb #12 kgb4.def #13 kgb4.kgb #14 kgb5.def #15 kgb5.kgb #16 kgbd.def #17 kgbd.kgbd #18 kgb_tight.def #19 gnu5.def # To locate test #13 
you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'kgb' => "-kgb", 'kgbd' => "-kgbd -kgb", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'align17' => <<'----------', # align => even at broken sub block my%opt=( 'cc'=>sub{$param::cachecom=1;}, 'cd'=>sub{$param::cachedisable=1;}, 'p'=>sub{ $param::pflag=1; $param::build=0; } ); ---------- 'align18' => <<'----------', #align '&&' for($ENV{HTTP_USER_AGENT}){ $page= /Mac/&&'m/Macintrash.html' ||/Win(dows)?NT/&&'e/evilandrude.html' ||/Win|MSIE|WebTV/&&'m/MicroslothWindows.html' ||/Linux/&&'l/Linux.html' ||/HP-UX/&&'h/HP-SUX.html' ||/SunOS/&&'s/ScumOS.html' ||'a/AppendixB.html'; } ---------- 'else1' => <<'----------', # pad after 'if' when followed by 'elsif' if ( not defined $dir or not length $dir ) { $rslt = ''; } elsif ( $dir =~ /^\$\([^\)]+\)\Z(?!\n)/s ) { $rslt = $dir; } else { $rslt = vmspath($dir); } ---------- 'else2' => <<'----------', # no pad after 'if' when followed by 'else' if ( $m = $g[$x][$y] ) { print $$m{v}; $$m{i}->() } else { print " " } ---------- 'gnu5' => <<'----------', # side comments limit gnu type formatting with l=80; note extra comma push @tests, [ "Lowest code point requiring 13 bytes to represent", # 2**36 "\xff\x80\x80\x80\x80\x80\x81\x80\x80\x80\x80\x80\x80", ($::is64bit) ? 
0x1000000000 : -1, # overflows on 32bit ], ; ---------- 'kgb1' => <<'----------', # a variety of line types for testing -kgb use strict; use Test; use Encode qw(from_to encode decode encode_utf8 decode_utf8 find_encoding is_utf8); use charnames qw(greek); our $targetdir = "/usr/local/doc/HTML/Perl"; local ( $tocfile, $loffile, $lotfile, $footfile, $citefile, $idxfile, $figure_captions, $table_captions, $footnotes, $citations, %font_size, %index, %done, $t_title, $t_author, $t_date, $t_address, $t_affil, $changed ); my @UNITCHECKs = B::unitcheck_av->isa("B::AV") ? B::unitcheck_av->ARRAY : (); my @CHECKs = B::check_av->isa("B::AV") ? B::check_av->ARRAY : (); my $dna = Bio::LiveSeq::DNA->new( -seq => $dnasequence ); my $min = 1; my $max = length($dnasequence); my $T = $G->_strongly_connected; my %R = $T->vertex_roots; my @C; # We're not calling the strongly_connected_components() # Do not separate this hanging side comment from previous my $G = shift; my $exon = Bio::LiveSeq::Exon->new( -seq => $dna, -start => $min, -end => $max, -strand => 1 ); my $octal_mode; my @inputs = ( 0777, 0700, 0470, 0407, 0433, 0400, 0430, 0403, 0111, 0100, 0110, 0101, 0731, 0713, 0317, 0371, 0173, 0137 ); my $impulse = ( 1 - $factor ) * ( 170 - $u ) + ( 350 / $u**0.65 + 500 / $u**5 ) * $factor; my $r = q{ pm_to_blib: $(TO_INST_PM) }; my $regcomp_re = "(?ckWARN(?:\\d+)?reg\\w*|vWARN\\d+|$regcomp_fail_re)"; my $position = List::MoreUtils::firstidx { refaddr $_ == $key } my @exons = ($exon); my $fastafile2 = "/tmp/tmpfastafile2"; my $grepcut = 'egrep -v "[[:digit:]]|^ *$|sequences" | cut -c8-'; # grep/cut my $alignprogram = "/usr/local/etc/bioinfo/fasta2/align -s /usr/local/etc/bioinfo/fasta2/idnaa.mat $fastafile1 $fastafile2 2>/dev/null | $grepcut" ; # ALIGN my $xml = new Mioga::XML::Simple( forcearray => 1 ); my $xml_tree = $xml->XMLin($skel_file); my $skel_name = ( exists( $xml_tree->{'name'} ) ) ? 
$xml_tree->{'name'} : ""; my $grp = GroupGetValues( $conf->{dbh}, $group_id ); my $adm_profile = ProfileGetUser( $conf->{dbh}, $grp->{id_admin}, $group_id ); my $harness = TAP::Harness->new( { verbosity => 1, formatter_class => "TAP::Formatter::Console" } ); require File::Temp; require Time::HiRes; my ( $fh, $filename ) = File::Temp::tempfile("Time-HiRes-utime-XXXXXXXXX"); use File::Basename qw[dirname]; my $dirname = dirname($filename); my $CUT = qr/\n=cut.*$EOP/; my $pod_or_DATA = qr/ ^=(?:head[1-4]|item) .*? $CUT | ^=pod .*? $CUT | ^=for .*? $CUT | ^=begin .*? $CUT | ^__(DATA|END)__\r?\n.* /smx; require Cwd; ( my $boot = $self->{NAME} ) =~ s/:/_/g; doit( sub { @E::ISA = qw/F/ }, sub { @E::ISA = qw/D/; @C::ISA = qw/F/ }, sub { @C::ISA = qw//; @A::ISA = qw/K/ }, sub { @A::ISA = qw//; @J::ISA = qw/F K/ }, sub { @J::ISA = qw/F/; @H::ISA = qw/K G/ }, sub { @H::ISA = qw/G/; @B::ISA = qw/B/ }, sub { @B::ISA = qw//; @K::ISA = qw/K J I/ }, sub { @K::ISA = qw/J I/; @D::ISA = qw/A H B C/ }, return; ); my %extractor_for = ( quotelike => [ $ws, $variable, $id, { MATCH => \&extract_quotelike } ], regex => [ $ws, $pod_or_DATA, $id, $exql ], string => [ $ws, $pod_or_DATA, $id, $exql ], code => [ $ws, { DONT_MATCH => $pod_or_DATA }, $variable, $id, { DONT_MATCH => \&extract_quotelike } ], code_no_comments => [ { DONT_MATCH => $comment }, $ncws, { DONT_MATCH => $pod_or_DATA }, $variable, $id, { DONT_MATCH => \&extract_quotelike } ], executable => [ $ws, { DONT_MATCH => $pod_or_DATA } ], executable_no_comments => [ { DONT_MATCH => $comment }, $ncws, { DONT_MATCH => $pod_or_DATA } ], all => [ { MATCH => qr/(?s:.*)/ } ], ); exit 1; ---------- 'kgb2' => <<'----------', # with -kgb, do no break after last my sub next_sibling { my $self = shift; my $parent = $_PARENT{refaddr $self} or return ''; my $key = refaddr $self; my $elements = $parent->{children}; my $position = List::MoreUtils::firstidx { refaddr $_ == $key } @$elements; $elements->[$position + 1] || ''; } ---------- 'kgb3' => 
<<'----------', #!/usr/bin/perl -w use strict; # with -kgb, no break after hash bang our ( @Changed, $TAP ); # break after isolated 'our' use File::Compare; use Symbol; use Text::Wrap(); use Text::Warp(); use Blast::IPS::MathUtils qw( set_interpolation_points table_row_interpolation two_point_interpolation ); # with -kgb, break around isolated 'local' below use Text::Warp(); local($delta2print) = (defined $size) ? int($size/50) : $defaultdelta2print; print "break before this line\n"; ---------- 'kgb4' => <<'----------', print "hello"; # with -kgb, break after this line use strict; use warnings; use Test::More tests => 1; use Pod::Simple::XHTML; my $c = < <<'----------', # with -kgb, do not put blank in ternary print "Starting\n"; # with -kgb, break after this line my $A = "1"; my $B = "0"; my $C = "1"; my $D = "1"; my $result = $A ? $B ? $C ? "+A +B +C" : "+A +B -C" : "+A -B" : "-A"; my $F = "0"; print "with -kgb, put blank above this line; result=$result\n"; ---------- 'kgb_tight' => <<'----------', # a variety of line types for testing -kgb use strict; use Test; use Encode qw(from_to encode decode encode_utf8 decode_utf8 find_encoding is_utf8); use charnames qw(greek); our $targetdir = "/usr/local/doc/HTML/Perl"; local ( $tocfile, $loffile, $lotfile, $footfile, $citefile, $idxfile, $figure_captions, $table_captions, $footnotes, $citations, %font_size, %index, %done, $t_title, $t_author, $t_date, $t_address, $t_affil, $changed ); my @UNITCHECKs = B::unitcheck_av->isa("B::AV") ? B::unitcheck_av->ARRAY : (); my @CHECKs = B::check_av->isa("B::AV") ? 
B::check_av->ARRAY : (); my $dna = Bio::LiveSeq::DNA->new( -seq => $dnasequence ); my $min = 1; my $max = length($dnasequence); my $T = $G->_strongly_connected; my %R = $T->vertex_roots; my @C; # We're not calling the strongly_connected_components() # Do not separate this hanging side comment from previous my $G = shift; my $exon = Bio::LiveSeq::Exon->new( -seq => $dna, -start => $min, -end => $max, -strand => 1 ); my @inputs = ( 0777, 0700, 0470, 0407, 0433, 0400, 0430, 0403, 0111, 0100, 0110, 0101, 0731, 0713, 0317, 0371, 0173, 0137 ); my $impulse = ( 1 - $factor ) * ( 170 - $u ) + ( 350 / $u**0.65 + 500 / $u**5 ) * $factor; my $r = q{ pm_to_blib: $(TO_INST_PM) }; my $regcomp_re = "(?ckWARN(?:\\d+)?reg\\w*|vWARN\\d+|$regcomp_fail_re)"; my $position = List::MoreUtils::firstidx { refaddr $_ == $key } my $alignprogram = "/usr/local/etc/bioinfo/fasta2/align -s /usr/local/etc/bioinfo/fasta2/idnaa.mat $fastafile1 $fastafile2 2>/dev/null | $grepcut" ; # ALIGN my $skel_name = ( exists( $xml_tree->{'name'} ) ) ? $xml_tree->{'name'} : ""; my $grp = GroupGetValues( $conf->{dbh}, $group_id ); my $adm_profile = ProfileGetUser( $conf->{dbh}, $grp->{id_admin}, $group_id ); my $harness = TAP::Harness->new( { verbosity => 1, formatter_class => "TAP::Formatter::Console" } ); require File::Temp; require Time::HiRes; my ( $fh, $filename ) = File::Temp::tempfile("Time-HiRes-utime-XXXXXXXXX"); use File::Basename qw[dirname]; my $dirname = dirname($filename); my $CUT = qr/\n=cut.*$EOP/; my $pod_or_DATA = qr/ ^=(?:head[1-4]|item) .*? $CUT | ^=pod .*? $CUT | ^=for .*? $CUT | ^=begin .*? 
$CUT | ^__(DATA|END)__\r?\n.* /smx; require Cwd; print "continuing\n"; exit 1; ---------- 'kgbd' => <<'----------', package A1::B2; use strict; require Exporter; use A1::Context; use A1::Database; use A1::Bibliotek; use A1::Author; use A1::Title; use vars qw($VERSION @ISA @EXPORT); $VERSION = 0.01; ---------- 'ternary3' => <<'----------', # this previously caused trouble because of the = and =~ push( @aligns, ( ( $a = shift @a ) =~ /[^n]/ ) ? $a : (@isnum) ? 'n' : 'l' ) unless $opt_a; ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'else1.def' => { source => "else1", params => "def", expect => <<'#1...........', # pad after 'if' when followed by 'elsif' if ( not defined $dir or not length $dir ) { $rslt = ''; } elsif ( $dir =~ /^\$\([^\)]+\)\Z(?!\n)/s ) { $rslt = $dir; } else { $rslt = vmspath($dir); } #1........... }, 'else2.def' => { source => "else2", params => "def", expect => <<'#2...........', # no pad after 'if' when followed by 'else' if ( $m = $g[$x][$y] ) { print $$m{v}; $$m{i}->() } else { print " " } #2........... }, 'ternary3.def' => { source => "ternary3", params => "def", expect => <<'#3...........', # this previously caused trouble because of the = and =~ push( @aligns, ( ( $a = shift @a ) =~ /[^n]/ ) ? $a : (@isnum) ? 'n' : 'l' ) unless $opt_a; #3........... }, 'align17.def' => { source => "align17", params => "def", expect => <<'#4...........', # align => even at broken sub block my %opt = ( 'cc' => sub { $param::cachecom = 1; }, 'cd' => sub { $param::cachedisable = 1; }, 'p' => sub { $param::pflag = 1; $param::build = 0; } ); #4........... 
}, 'align18.def' => { source => "align18", params => "def", expect => <<'#5...........', #align '&&' for ( $ENV{HTTP_USER_AGENT} ) { $page = /Mac/ && 'm/Macintrash.html' || /Win(dows)?NT/ && 'e/evilandrude.html' || /Win|MSIE|WebTV/ && 'm/MicroslothWindows.html' || /Linux/ && 'l/Linux.html' || /HP-UX/ && 'h/HP-SUX.html' || /SunOS/ && 's/ScumOS.html' || 'a/AppendixB.html'; } #5........... }, 'kgb1.def' => { source => "kgb1", params => "def", expect => <<'#6...........', # a variety of line types for testing -kgb use strict; use Test; use Encode qw(from_to encode decode encode_utf8 decode_utf8 find_encoding is_utf8); use charnames qw(greek); our $targetdir = "/usr/local/doc/HTML/Perl"; local ( $tocfile, $loffile, $lotfile, $footfile, $citefile, $idxfile, $figure_captions, $table_captions, $footnotes, $citations, %font_size, %index, %done, $t_title, $t_author, $t_date, $t_address, $t_affil, $changed ); my @UNITCHECKs = B::unitcheck_av->isa("B::AV") ? B::unitcheck_av->ARRAY : (); my @CHECKs = B::check_av->isa("B::AV") ? 
B::check_av->ARRAY : (); my $dna = Bio::LiveSeq::DNA->new( -seq => $dnasequence ); my $min = 1; my $max = length($dnasequence); my $T = $G->_strongly_connected; my %R = $T->vertex_roots; my @C; # We're not calling the strongly_connected_components() # Do not separate this hanging side comment from previous my $G = shift; my $exon = Bio::LiveSeq::Exon->new( -seq => $dna, -start => $min, -end => $max, -strand => 1 ); my $octal_mode; my @inputs = ( 0777, 0700, 0470, 0407, 0433, 0400, 0430, 0403, 0111, 0100, 0110, 0101, 0731, 0713, 0317, 0371, 0173, 0137 ); my $impulse = ( 1 - $factor ) * ( 170 - $u ) + ( 350 / $u**0.65 + 500 / $u**5 ) * $factor; my $r = q{ pm_to_blib: $(TO_INST_PM) }; my $regcomp_re = "(?ckWARN(?:\\d+)?reg\\w*|vWARN\\d+|$regcomp_fail_re)"; my $position = List::MoreUtils::firstidx { refaddr $_ == $key } my @exons = ($exon); my $fastafile2 = "/tmp/tmpfastafile2"; my $grepcut = 'egrep -v "[[:digit:]]|^ *$|sequences" | cut -c8-'; # grep/cut my $alignprogram = "/usr/local/etc/bioinfo/fasta2/align -s /usr/local/etc/bioinfo/fasta2/idnaa.mat $fastafile1 $fastafile2 2>/dev/null | $grepcut" ; # ALIGN my $xml = new Mioga::XML::Simple( forcearray => 1 ); my $xml_tree = $xml->XMLin($skel_file); my $skel_name = ( exists( $xml_tree->{'name'} ) ) ? $xml_tree->{'name'} : ""; my $grp = GroupGetValues( $conf->{dbh}, $group_id ); my $adm_profile = ProfileGetUser( $conf->{dbh}, $grp->{id_admin}, $group_id ); my $harness = TAP::Harness->new( { verbosity => 1, formatter_class => "TAP::Formatter::Console" } ); require File::Temp; require Time::HiRes; my ( $fh, $filename ) = File::Temp::tempfile("Time-HiRes-utime-XXXXXXXXX"); use File::Basename qw[dirname]; my $dirname = dirname($filename); my $CUT = qr/\n=cut.*$EOP/; my $pod_or_DATA = qr/ ^=(?:head[1-4]|item) .*? $CUT | ^=pod .*? $CUT | ^=for .*? $CUT | ^=begin .*? 
$CUT | ^__(DATA|END)__\r?\n.* /smx; require Cwd; ( my $boot = $self->{NAME} ) =~ s/:/_/g; doit( sub { @E::ISA = qw/F/ }, sub { @E::ISA = qw/D/; @C::ISA = qw/F/ }, sub { @C::ISA = qw//; @A::ISA = qw/K/ }, sub { @A::ISA = qw//; @J::ISA = qw/F K/ }, sub { @J::ISA = qw/F/; @H::ISA = qw/K G/ }, sub { @H::ISA = qw/G/; @B::ISA = qw/B/ }, sub { @B::ISA = qw//; @K::ISA = qw/K J I/ }, sub { @K::ISA = qw/J I/; @D::ISA = qw/A H B C/ }, return; ); my %extractor_for = ( quotelike => [ $ws, $variable, $id, { MATCH => \&extract_quotelike } ], regex => [ $ws, $pod_or_DATA, $id, $exql ], string => [ $ws, $pod_or_DATA, $id, $exql ], code => [ $ws, { DONT_MATCH => $pod_or_DATA }, $variable, $id, { DONT_MATCH => \&extract_quotelike } ], code_no_comments => [ { DONT_MATCH => $comment }, $ncws, { DONT_MATCH => $pod_or_DATA }, $variable, $id, { DONT_MATCH => \&extract_quotelike } ], executable => [ $ws, { DONT_MATCH => $pod_or_DATA } ], executable_no_comments => [ { DONT_MATCH => $comment }, $ncws, { DONT_MATCH => $pod_or_DATA } ], all => [ { MATCH => qr/(?s:.*)/ } ], ); exit 1; #6........... }, 'kgb1.kgb' => { source => "kgb1", params => "kgb", expect => <<'#7...........', # a variety of line types for testing -kgb use strict; use Test; use Encode qw(from_to encode decode encode_utf8 decode_utf8 find_encoding is_utf8); use charnames qw(greek); our $targetdir = "/usr/local/doc/HTML/Perl"; local ( $tocfile, $loffile, $lotfile, $footfile, $citefile, $idxfile, $figure_captions, $table_captions, $footnotes, $citations, %font_size, %index, %done, $t_title, $t_author, $t_date, $t_address, $t_affil, $changed ); my @UNITCHECKs = B::unitcheck_av->isa("B::AV") ? B::unitcheck_av->ARRAY : (); my @CHECKs = B::check_av->isa("B::AV") ? 
B::check_av->ARRAY : (); my $dna = Bio::LiveSeq::DNA->new( -seq => $dnasequence ); my $min = 1; my $max = length($dnasequence); my $T = $G->_strongly_connected; my %R = $T->vertex_roots; my @C; # We're not calling the strongly_connected_components() # Do not separate this hanging side comment from previous my $G = shift; my $exon = Bio::LiveSeq::Exon->new( -seq => $dna, -start => $min, -end => $max, -strand => 1 ); my $octal_mode; my @inputs = ( 0777, 0700, 0470, 0407, 0433, 0400, 0430, 0403, 0111, 0100, 0110, 0101, 0731, 0713, 0317, 0371, 0173, 0137 ); my $impulse = ( 1 - $factor ) * ( 170 - $u ) + ( 350 / $u**0.65 + 500 / $u**5 ) * $factor; my $r = q{ pm_to_blib: $(TO_INST_PM) }; my $regcomp_re = "(?ckWARN(?:\\d+)?reg\\w*|vWARN\\d+|$regcomp_fail_re)"; my $position = List::MoreUtils::firstidx { refaddr $_ == $key } my @exons = ($exon); my $fastafile2 = "/tmp/tmpfastafile2"; my $grepcut = 'egrep -v "[[:digit:]]|^ *$|sequences" | cut -c8-'; # grep/cut my $alignprogram = "/usr/local/etc/bioinfo/fasta2/align -s /usr/local/etc/bioinfo/fasta2/idnaa.mat $fastafile1 $fastafile2 2>/dev/null | $grepcut" ; # ALIGN my $xml = new Mioga::XML::Simple( forcearray => 1 ); my $xml_tree = $xml->XMLin($skel_file); my $skel_name = ( exists( $xml_tree->{'name'} ) ) ? $xml_tree->{'name'} : ""; my $grp = GroupGetValues( $conf->{dbh}, $group_id ); my $adm_profile = ProfileGetUser( $conf->{dbh}, $grp->{id_admin}, $group_id ); my $harness = TAP::Harness->new( { verbosity => 1, formatter_class => "TAP::Formatter::Console" } ); require File::Temp; require Time::HiRes; my ( $fh, $filename ) = File::Temp::tempfile("Time-HiRes-utime-XXXXXXXXX"); use File::Basename qw[dirname]; my $dirname = dirname($filename); my $CUT = qr/\n=cut.*$EOP/; my $pod_or_DATA = qr/ ^=(?:head[1-4]|item) .*? $CUT | ^=pod .*? $CUT | ^=for .*? $CUT | ^=begin .*? 
$CUT | ^__(DATA|END)__\r?\n.* /smx; require Cwd; ( my $boot = $self->{NAME} ) =~ s/:/_/g; doit( sub { @E::ISA = qw/F/ }, sub { @E::ISA = qw/D/; @C::ISA = qw/F/ }, sub { @C::ISA = qw//; @A::ISA = qw/K/ }, sub { @A::ISA = qw//; @J::ISA = qw/F K/ }, sub { @J::ISA = qw/F/; @H::ISA = qw/K G/ }, sub { @H::ISA = qw/G/; @B::ISA = qw/B/ }, sub { @B::ISA = qw//; @K::ISA = qw/K J I/ }, sub { @K::ISA = qw/J I/; @D::ISA = qw/A H B C/ }, return; ); my %extractor_for = ( quotelike => [ $ws, $variable, $id, { MATCH => \&extract_quotelike } ], regex => [ $ws, $pod_or_DATA, $id, $exql ], string => [ $ws, $pod_or_DATA, $id, $exql ], code => [ $ws, { DONT_MATCH => $pod_or_DATA }, $variable, $id, { DONT_MATCH => \&extract_quotelike } ], code_no_comments => [ { DONT_MATCH => $comment }, $ncws, { DONT_MATCH => $pod_or_DATA }, $variable, $id, { DONT_MATCH => \&extract_quotelike } ], executable => [ $ws, { DONT_MATCH => $pod_or_DATA } ], executable_no_comments => [ { DONT_MATCH => $comment }, $ncws, { DONT_MATCH => $pod_or_DATA } ], all => [ { MATCH => qr/(?s:.*)/ } ], ); exit 1; #7........... }, 'kgb2.def' => { source => "kgb2", params => "def", expect => <<'#8...........', # with -kgb, do no break after last my sub next_sibling { my $self = shift; my $parent = $_PARENT{ refaddr $self} or return ''; my $key = refaddr $self; my $elements = $parent->{children}; my $position = List::MoreUtils::firstidx { refaddr $_ == $key } @$elements; $elements->[ $position + 1 ] || ''; } #8........... }, 'kgb2.kgb' => { source => "kgb2", params => "kgb", expect => <<'#9...........', # with -kgb, do no break after last my sub next_sibling { my $self = shift; my $parent = $_PARENT{ refaddr $self} or return ''; my $key = refaddr $self; my $elements = $parent->{children}; my $position = List::MoreUtils::firstidx { refaddr $_ == $key } @$elements; $elements->[ $position + 1 ] || ''; } #9........... 
}, 'kgb3.def' => { source => "kgb3", params => "def", expect => <<'#10...........', #!/usr/bin/perl -w use strict; # with -kgb, no break after hash bang our ( @Changed, $TAP ); # break after isolated 'our' use File::Compare; use Symbol; use Text::Wrap(); use Text::Warp(); use Blast::IPS::MathUtils qw( set_interpolation_points table_row_interpolation two_point_interpolation ); # with -kgb, break around isolated 'local' below use Text::Warp(); local ($delta2print) = ( defined $size ) ? int( $size / 50 ) : $defaultdelta2print; print "break before this line\n"; #10........... }, 'kgb3.kgb' => { source => "kgb3", params => "kgb", expect => <<'#11...........', #!/usr/bin/perl -w use strict; # with -kgb, no break after hash bang our ( @Changed, $TAP ); # break after isolated 'our' use File::Compare; use Symbol; use Text::Wrap(); use Text::Warp(); use Blast::IPS::MathUtils qw( set_interpolation_points table_row_interpolation two_point_interpolation ); # with -kgb, break around isolated 'local' below use Text::Warp(); local ($delta2print) = ( defined $size ) ? int( $size / 50 ) : $defaultdelta2print; print "break before this line\n"; #11........... }, 'kgb4.def' => { source => "kgb4", params => "def", expect => <<'#12...........', print "hello"; # with -kgb, break after this line use strict; use warnings; use Test::More tests => 1; use Pod::Simple::XHTML; my $c = < { source => "kgb4", params => "kgb", expect => <<'#13...........', print "hello"; # with -kgb, break after this line use strict; use warnings; use Test::More tests => 1; use Pod::Simple::XHTML; my $c = < { source => "kgb5", params => "def", expect => <<'#14...........', # with -kgb, do not put blank in ternary print "Starting\n"; # with -kgb, break after this line my $A = "1"; my $B = "0"; my $C = "1"; my $D = "1"; my $result = $A ? $B ? $C ? "+A +B +C" : "+A +B -C" : "+A -B" : "-A"; my $F = "0"; print "with -kgb, put blank above this line; result=$result\n"; #14........... 
}, 'kgb5.kgb' => { source => "kgb5", params => "kgb", expect => <<'#15...........', # with -kgb, do not put blank in ternary print "Starting\n"; # with -kgb, break after this line my $A = "1"; my $B = "0"; my $C = "1"; my $D = "1"; my $result = $A ? $B ? $C ? "+A +B +C" : "+A +B -C" : "+A -B" : "-A"; my $F = "0"; print "with -kgb, put blank above this line; result=$result\n"; #15........... }, 'kgbd.def' => { source => "kgbd", params => "def", expect => <<'#16...........', package A1::B2; use strict; require Exporter; use A1::Context; use A1::Database; use A1::Bibliotek; use A1::Author; use A1::Title; use vars qw($VERSION @ISA @EXPORT); $VERSION = 0.01; #16........... }, 'kgbd.kgbd' => { source => "kgbd", params => "kgbd", expect => <<'#17...........', package A1::B2; use strict; require Exporter; use A1::Context; use A1::Database; use A1::Bibliotek; use A1::Author; use A1::Title; use vars qw($VERSION @ISA @EXPORT); $VERSION = 0.01; #17........... }, 'kgb_tight.def' => { source => "kgb_tight", params => "def", expect => <<'#18...........', # a variety of line types for testing -kgb use strict; use Test; use Encode qw(from_to encode decode encode_utf8 decode_utf8 find_encoding is_utf8); use charnames qw(greek); our $targetdir = "/usr/local/doc/HTML/Perl"; local ( $tocfile, $loffile, $lotfile, $footfile, $citefile, $idxfile, $figure_captions, $table_captions, $footnotes, $citations, %font_size, %index, %done, $t_title, $t_author, $t_date, $t_address, $t_affil, $changed ); my @UNITCHECKs = B::unitcheck_av->isa("B::AV") ? B::unitcheck_av->ARRAY : (); my @CHECKs = B::check_av->isa("B::AV") ? 
B::check_av->ARRAY : (); my $dna = Bio::LiveSeq::DNA->new( -seq => $dnasequence ); my $min = 1; my $max = length($dnasequence); my $T = $G->_strongly_connected; my %R = $T->vertex_roots; my @C; # We're not calling the strongly_connected_components() # Do not separate this hanging side comment from previous my $G = shift; my $exon = Bio::LiveSeq::Exon->new( -seq => $dna, -start => $min, -end => $max, -strand => 1 ); my @inputs = ( 0777, 0700, 0470, 0407, 0433, 0400, 0430, 0403, 0111, 0100, 0110, 0101, 0731, 0713, 0317, 0371, 0173, 0137 ); my $impulse = ( 1 - $factor ) * ( 170 - $u ) + ( 350 / $u**0.65 + 500 / $u**5 ) * $factor; my $r = q{ pm_to_blib: $(TO_INST_PM) }; my $regcomp_re = "(?ckWARN(?:\\d+)?reg\\w*|vWARN\\d+|$regcomp_fail_re)"; my $position = List::MoreUtils::firstidx { refaddr $_ == $key } my $alignprogram = "/usr/local/etc/bioinfo/fasta2/align -s /usr/local/etc/bioinfo/fasta2/idnaa.mat $fastafile1 $fastafile2 2>/dev/null | $grepcut" ; # ALIGN my $skel_name = ( exists( $xml_tree->{'name'} ) ) ? $xml_tree->{'name'} : ""; my $grp = GroupGetValues( $conf->{dbh}, $group_id ); my $adm_profile = ProfileGetUser( $conf->{dbh}, $grp->{id_admin}, $group_id ); my $harness = TAP::Harness->new( { verbosity => 1, formatter_class => "TAP::Formatter::Console" } ); require File::Temp; require Time::HiRes; my ( $fh, $filename ) = File::Temp::tempfile("Time-HiRes-utime-XXXXXXXXX"); use File::Basename qw[dirname]; my $dirname = dirname($filename); my $CUT = qr/\n=cut.*$EOP/; my $pod_or_DATA = qr/ ^=(?:head[1-4]|item) .*? $CUT | ^=pod .*? $CUT | ^=for .*? $CUT | ^=begin .*? $CUT | ^__(DATA|END)__\r?\n.* /smx; require Cwd; print "continuing\n"; exit 1; #18........... }, 'gnu5.def' => { source => "gnu5", params => "def", expect => <<'#19...........', # side comments limit gnu type formatting with l=80; note extra comma push @tests, [ "Lowest code point requiring 13 bytes to represent", # 2**36 "\xff\x80\x80\x80\x80\x80\x81\x80\x80\x80\x80\x80\x80", ($::is64bit) ? 
0x1000000000 : -1, # overflows on 32bit ], ; #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? $rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<<STDERR>>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/testwide-tidy.pl.src0000644000175000017500000000036114174123067017017 0ustar stevesteve# really simple @a= ( "Plain", "Zwölf große Boxkämpfer jagen Vik quer über den Sylter.", "Jeż wlókł gęś. Uf! 
Bądź choć przy nim, stań!", "Любя, съешь щипцы, — вздохнёт мэр, — кайф жгуч.", ); Perl-Tidy-20230309/t/testwide-tidy.t0000644000175000017500000000671714255414171016073 0ustar stevesteveuse strict; use warnings; use utf8; use FindBin qw($Bin); use File::Temp qw(tempfile); use Test::More; BEGIN { unshift @INC, "./" } use Perl::Tidy; # This tests the -eos (--encode-output-strings) which was added for issue # git #83 to fix an issue with tidyall. # NOTE: to prevent automatic conversion of line endings LF to CRLF under github # Actions with Windows, which would cause test failure, it is essential that # there be a file 't/.gitattributes' with the line: # * -text # The test file is UTF-8 encoded plan( tests => 6 ); test_all(); sub my_note { my ($msg) = @_; # work around problem where sub Test::More::note does not exist # in older versions of perl if ($] >= 5.010) { note($msg); } return; } sub test_all { my $test_file = "$Bin/testwide-tidy.pl.src"; my $tidy_file = "$Bin/testwide-tidy.pl.srctdy"; my $tidy_str = slurp_raw($tidy_file); test_file2file( $test_file, $tidy_str ); test_scalar2scalar( $test_file, $tidy_str ); test_scalararray2scalararray( $test_file, $tidy_str ); } sub test_file2file { my $test_file = shift; my $tidy_str = shift; my $tidy_hex = unpack( 'H*', $tidy_str ); my $tmp_file = File::Temp->new( TMPDIR => 1 ); my $source = $test_file; my $destination = $tmp_file->filename(); my_note("Testing file2file: '$source' => '$destination'\n"); my $tidyresult = Perl::Tidy::perltidy( argv => '-utf8 -npro', source => $source, destination => $destination ); ok( !$tidyresult, 'perltidy' ); my $destination_str = slurp_raw($destination); my $destination_hex = unpack( 'H*', $destination_str ); my_note("Comparing contents:\n $tidy_hex\n $destination_hex\n"); ok($tidy_hex eq $destination_hex, 'file content compare'); } sub test_scalar2scalar { my $test_file = shift; my $tidy_str = shift; my $tidy_hex = unpack( 'H*', $tidy_str ); my $source = slurp_raw($test_file); my 
$destination; my_note("Testing scalar2scalar\n"); my $tidyresult = Perl::Tidy::perltidy( argv => '-utf8 -eos -npro', source => \$source, destination => \$destination ); ok( !$tidyresult, 'perltidy' ); my $destination_hex = unpack( 'H*', $destination ); my_note("Comparing contents:\n $tidy_hex\n $destination_hex\n"); ok($tidy_hex eq $destination_hex, 'scalar content compare'); } sub test_scalararray2scalararray { my $test_file = shift; my $tidy_str = shift; my $tidy_hex = unpack( 'H*', $tidy_str ); my $source = [ lines_raw($test_file) ]; my $destination = []; my_note("Testing scalararray2scalararray\n"); my $tidyresult = Perl::Tidy::perltidy( argv => '-utf8 -eos -npro', source => $source, destination => $destination ); ok( !$tidyresult, 'perltidy' ); my $destination_str = join( '', @$destination ); my $destination_hex = unpack( 'H*', $destination_str ); my_note("Comparing contents:\n $tidy_hex\n $destination_hex\n"); ok($tidy_hex eq $destination_hex, 'scalararray content compare'); } sub slurp_raw { my $filename = shift; open( TMP, '<', $filename ); binmode( TMP, ':raw' ); local $/; my $contents = <TMP>; close(TMP); return $contents; } sub lines_raw { my $filename = shift; open( TMP, '<', $filename ); binmode( TMP, ':raw' ); my @contents = <TMP>; close(TMP); return @contents; } Perl-Tidy-20230309/t/snippets10.t0000644000175000017500000006703714373177244015314 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 scl.def #2 scl.scl #3 semicolon2.def #4 side_comments1.def #5 sil1.def #6 sil1.sil #7 slashslash.def #8 smart.def #9 space1.def #10 space2.def #11 space3.def #12 space4.def #13 space5.def #14 structure1.def #15 style.def #16 style.style1 #17 style.style2 #18 style.style3 #19 style.style4 #20 style.style5 # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # 
########################################### $rparams = { 'def' => "", 'scl' => "-scl=12", 'sil' => "-sil=0", 'style1' => <<'----------', -b -se -w -i=2 -l=100 -nolq -bbt=1 -bt=2 -pt=2 -nsfs -sbt=2 -sbvt=2 -nhsc -isbc -bvt=2 -pvt=2 -wbb="% + - * / x != == >= <= =~ < > | & **= += *= &= <<= &&= -= /= |= >>= ||= .= %= ^= x=" -mbl=2 ---------- 'style2' => <<'----------', -bt=2 -nwls=".." -nwrs=".." -pt=2 -nsfs -sbt=2 -cuddled-blocks -bar -nsbl -nbbc ---------- 'style3' => <<'----------', -l=160 -cbi=1 -cpi=1 -csbi=1 -lp -nolq -csci=20 -csct=40 -csc -isbc -cuddled-blocks -nsbl -dcsc ---------- 'style4' => <<'----------', -bt=2 -pt=2 -sbt=2 -cuddled-blocks -bar ---------- 'style5' => <<'----------', -b -bext="~" -et=8 -l=77 -cbi=2 -cpi=2 -csbi=2 -ci=4 -nolq -nasc -bt=2 -ndsm -nwls="++ -- ?" -nwrs="++ --" -pt=2 -nsfs -nsts -sbt=2 -sbvt=1 -wls="= .= =~ !~ :" -wrs="= .= =~ !~ ? :" -ncsc -isbc -msc=2 -nolc -bvt=1 -bl -sbl -pvt=1 -wba="% + - * / x != == >= <= =~ !~ < > | & >= < = **= += *= &= <<= &&= -= /= |= >>= ||= .= %= ^= x= . << >> -> && ||" -wbb=" " -cab=1 -mbl=2 ---------- }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'scl' => <<'----------', # try -scl=12 to see '$returns' joined with the previous line $format = "format STDOUT =\n" . &format_line('Function: @') . '$name' . "\n" . &format_line('Arguments: @') . '$args' . "\n" . &format_line('Returns: @') . '$returns' . "\n" . &format_line(' ~~ ^') . '$desc' . "\n.\n"; ---------- 'semicolon2' => <<'----------', # will not add semicolon for this block type $highest = List::Util::reduce { Sort::Versions::versioncmp( $a, $b ) > 0 ? 
$a : $b } ---------- 'side_comments1' => <<'----------', # side comments at different indentation levels should not be aligned { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } #end level 4 } # end level 3 } # end level 2 } # end level 1 ---------- 'sil1' => <<'----------', ############################################################# # This will walk to the left because of bad -sil guess SKIP: { ############################################################# } # This will walk to the right if it is the first line of a file. ov_method mycan( $package, '(""' ), $package or ov_method mycan( $package, '(0+' ), $package or ov_method mycan( $package, '(bool' ), $package or ov_method mycan( $package, '(nomethod' ), $package; ---------- 'slashslash' => <<'----------', $home = $ENV{HOME} // $ENV{LOGDIR} // ( getpwuid($<) )[7] // die "You're homeless!\n"; defined( $x // $y ); $version = 'v' . join '.', map ord, split //, $version->PV; foreach ( split( //, $lets ) ) { } foreach ( split( //, $input ) ) { } 'xyz' =~ //; ---------- 'smart' => <<'----------', \&foo !~~ \&foo; \&foo ~~ \&foo; \&foo ~~ \&foo; \&foo ~~ sub {}; sub {} ~~ \&foo; \&foo ~~ \&bar; \&bar ~~ \&foo; 1 ~~ sub{shift}; sub{shift} ~~ 1; 0 ~~ sub{shift}; sub{shift} ~~ 0; 1 ~~ sub{scalar @_}; sub{scalar @_} ~~ 1; [] ~~ \&bar; \&bar ~~ []; {} ~~ \&bar; \&bar ~~ {}; qr// ~~ \&bar; \&bar ~~ qr//; a_const ~~ "a constant"; "a constant" ~~ a_const; a_const ~~ a_const; a_const ~~ a_const; a_const ~~ b_const; b_const ~~ a_const; {} ~~ {}; {} ~~ {}; {} ~~ {1 => 2}; {1 => 2} ~~ {}; {1 => 2} ~~ {1 => 2}; {1 => 2} ~~ {1 => 2}; {1 => 2} ~~ {1 => 3}; {1 => 3} ~~ {1 => 2}; {1 => 2} ~~ {2 => 3}; {2 => 3} ~~ {1 => 2}; \%main:: ~~ {map {$_ => 'x'} keys %main::}; {map {$_ => 'x'} keys %main::} ~~ \%main::; \%hash ~~ \%tied_hash; \%tied_hash ~~ \%hash; \%tied_hash ~~ \%tied_hash; \%tied_hash ~~ \%tied_hash; \%:: ~~ [keys %main::]; [keys %main::] ~~ \%::; \%:: ~~ []; [] ~~ \%::; {"" => 1} ~~ [undef]; [undef] ~~ {"" 
=> 1}; {foo => 1} ~~ qr/^(fo[ox])$/; qr/^(fo[ox])$/ ~~ {foo => 1}; +{0..100} ~~ qr/[13579]$/; qr/[13579]$/ ~~ +{0..100}; +{foo => 1, bar => 2} ~~ "foo"; "foo" ~~ +{foo => 1, bar => 2}; +{foo => 1, bar => 2} ~~ "baz"; "baz" ~~ +{foo => 1, bar => 2}; [] ~~ []; [] ~~ []; [] ~~ [1]; [1] ~~ []; [["foo"], ["bar"]] ~~ [qr/o/, qr/a/]; [qr/o/, qr/a/] ~~ [["foo"], ["bar"]]; ["foo", "bar"] ~~ [qr/o/, qr/a/]; [qr/o/, qr/a/] ~~ ["foo", "bar"]; $deep1 ~~ $deep1; $deep1 ~~ $deep1; $deep1 ~~ $deep2; $deep2 ~~ $deep1; \@nums ~~ \@tied_nums; \@tied_nums ~~ \@nums; [qw(foo bar baz quux)] ~~ qr/x/; qr/x/ ~~ [qw(foo bar baz quux)]; [qw(foo bar baz quux)] ~~ qr/y/; qr/y/ ~~ [qw(foo bar baz quux)]; [qw(1foo 2bar)] ~~ 2; 2 ~~ [qw(1foo 2bar)]; [qw(1foo 2bar)] ~~ "2"; "2" ~~ [qw(1foo 2bar)]; 2 ~~ 2; 2 ~~ 2; 2 ~~ 3; 3 ~~ 2; 2 ~~ "2"; "2" ~~ 2; 2 ~~ "2.0"; "2.0" ~~ 2; 2 ~~ "2bananas"; "2bananas" ~~ 2; 2_3 ~~ "2_3"; "2_3" ~~ 2_3; qr/x/ ~~ "x"; "x" ~~ qr/x/; qr/y/ ~~ "x"; "x" ~~ qr/y/; 12345 ~~ qr/3/; qr/3/ ~~ 12345; @nums ~~ 7; 7 ~~ @nums; @nums ~~ \@nums; \@nums ~~ @nums; @nums ~~ \\@nums; \\@nums ~~ @nums; @nums ~~ [1..10]; [1..10] ~~ @nums; @nums ~~ [0..9]; [0..9] ~~ @nums; %hash ~~ "foo"; "foo" ~~ %hash; %hash ~~ /bar/; /bar/ ~~ %hash; ---------- 'space1' => <<'----------', # We usually want a space at '} (', for example: map { 1 * $_; } ( $y, $M, $w, $d, $h, $m, $s ); # But not others: &{ $_->[1] }( delete $_[$#_]{ $_->[0] } ); # remove unwanted spaces after $ and -> here &{ $ _ -> [1] }( delete $ _ [$#_ ]{ $_ -> [0] } ); # this has both tabs and spaces to remove $ setup = $ labels -> labelsetup( Output_Width => 2.625) ; ---------- 'space2' => <<'----------', # space before this opening paren for$i(0..20){} # retain any space between '-' and bare word $myhash{USER-NAME}='steve'; ---------- 'space3' => <<'----------', # Treat newline as a whitespace. 
Otherwise, we might combine # 'Send' and '-recipients' here my $msg = new Fax::Send -recipients => $to, -data => $data; ---------- 'space4' => <<'----------', # first prototype line will cause space between 'redirect' and '(' to close sub html::redirect($); #<-- temporary prototype; use html; print html::redirect ('http://www.glob.com.au/'); ---------- 'space5' => <<'----------', # first prototype line commented out; space after 'redirect' remains #sub html::redirect($); #<-- temporary prototype; use html; print html::redirect ('http://www.glob.com.au/'); ---------- 'structure1' => <<'----------', push@contents,$c->table({-width=>'100%'},$c->Tr($c->td({-align=>'left'},"The emboldened field names are mandatory, ","the remainder are optional",),$c->td({-align=>'right'},$c->a({-href=>'help.cgi',-target=>'_blank'},"What are the various fields?")))); ---------- 'style' => <<'----------', # This test snippet is from package bbbike v3.214 by Slaven Rezic; GPL 2.0 licence sub arrange_topframe { my(@order) = ($hslabel_frame, $km_frame, $speed_frame[0], $power_frame[0], $wind_frame, $percent_frame, $temp_frame, @speed_frame[1..$#speed_frame], @power_frame[1..$#power_frame], ); my(@col) = (0, 1, 3, 4+$#speed_frame, 5+$#speed_frame+$#power_frame, 2, 6+$#speed_frame+$#power_frame, 4..3+$#speed_frame, 5+$#speed_frame..4+$#speed_frame+$#power_frame); $top->idletasks; my $width = 0; my(%gridslaves) = map {($_, 1)} $top_frame->gridSlaves; for(my $i = 0; $i <= $#order; $i++) { my $w = $order[$i]; next unless Tk::Exists($w); my $col = $col[$i] || 0; $width += $w->reqwidth; if ($gridslaves{$w}) { $w->gridForget; } if ($width <= $top->width) { $w->grid(-row => 0, -column => $col, -sticky => 'nsew'); # XXX } } } ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'scl.def' => { source => "scl", params => "def", expect => <<'#1...........', # try -scl=12 to see '$returns' joined with the previous line 
$format = "format STDOUT =\n" . &format_line('Function: @') . '$name' . "\n" . &format_line('Arguments: @') . '$args' . "\n" . &format_line('Returns: @') . '$returns' . "\n" . &format_line(' ~~ ^') . '$desc' . "\n.\n"; #1........... }, 'scl.scl' => { source => "scl", params => "scl", expect => <<'#2...........', # try -scl=12 to see '$returns' joined with the previous line $format = "format STDOUT =\n" . &format_line('Function: @') . '$name' . "\n" . &format_line('Arguments: @') . '$args' . "\n" . &format_line('Returns: @') . '$returns' . "\n" . &format_line(' ~~ ^') . '$desc' . "\n.\n"; #2........... }, 'semicolon2.def' => { source => "semicolon2", params => "def", expect => <<'#3...........', # will not add semicolon for this block type $highest = List::Util::reduce { Sort::Versions::versioncmp( $a, $b ) > 0 ? $a : $b } #3........... }, 'side_comments1.def' => { source => "side_comments1", params => "def", expect => <<'#4...........', # side comments at different indentation levels should not be aligned { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } #end level 4 } # end level 3 } # end level 2 } # end level 1 #4........... }, 'sil1.def' => { source => "sil1", params => "def", expect => <<'#5...........', ############################################################# # This will walk to the left because of bad -sil guess SKIP: { ############################################################# } # This will walk to the right if it is the first line of a file. ov_method mycan( $package, '(""' ), $package or ov_method mycan( $package, '(0+' ), $package or ov_method mycan( $package, '(bool' ), $package or ov_method mycan( $package, '(nomethod' ), $package; #5........... 
}, 'sil1.sil' => { source => "sil1", params => "sil", expect => <<'#6...........', ############################################################# # This will walk to the left because of bad -sil guess SKIP: { ############################################################# } # This will walk to the right if it is the first line of a file. ov_method mycan( $package, '(""' ), $package or ov_method mycan( $package, '(0+' ), $package or ov_method mycan( $package, '(bool' ), $package or ov_method mycan( $package, '(nomethod' ), $package; #6........... }, 'slashslash.def' => { source => "slashslash", params => "def", expect => <<'#7...........', $home = $ENV{HOME} // $ENV{LOGDIR} // ( getpwuid($<) )[7] // die "You're homeless!\n"; defined( $x // $y ); $version = 'v' . join '.', map ord, split //, $version->PV; foreach ( split( //, $lets ) ) { } foreach ( split( //, $input ) ) { } 'xyz' =~ //; #7........... }, 'smart.def' => { source => "smart", params => "def", expect => <<'#8...........', \&foo !~~ \&foo; \&foo ~~ \&foo; \&foo ~~ \&foo; \&foo ~~ sub { }; sub { } ~~ \&foo; \&foo ~~ \&bar; \&bar ~~ \&foo; 1 ~~ sub { shift }; sub { shift } ~~ 1; 0 ~~ sub { shift }; sub { shift } ~~ 0; 1 ~~ sub { scalar @_ }; sub { scalar @_ } ~~ 1; [] ~~ \&bar; \&bar ~~ []; {} ~~ \&bar; \&bar ~~ {}; qr// ~~ \&bar; \&bar ~~ qr//; a_const ~~ "a constant"; "a constant" ~~ a_const; a_const ~~ a_const; a_const ~~ a_const; a_const ~~ b_const; b_const ~~ a_const; {} ~~ {}; {} ~~ {}; {} ~~ { 1 => 2 }; { 1 => 2 } ~~ {}; { 1 => 2 } ~~ { 1 => 2 }; { 1 => 2 } ~~ { 1 => 2 }; { 1 => 2 } ~~ { 1 => 3 }; { 1 => 3 } ~~ { 1 => 2 }; { 1 => 2 } ~~ { 2 => 3 }; { 2 => 3 } ~~ { 1 => 2 }; \%main:: ~~ { map { $_ => 'x' } keys %main:: }; { map { $_ => 'x' } keys %main:: } ~~ \%main::; \%hash ~~ \%tied_hash; \%tied_hash ~~ \%hash; \%tied_hash ~~ \%tied_hash; \%tied_hash ~~ \%tied_hash; \%:: ~~ [ keys %main:: ]; [ keys %main:: ] ~~ \%::; \%:: ~~ []; [] ~~ \%::; { "" => 1 } ~~ [undef]; [undef] ~~ { "" => 1 }; { foo => 1 } 
~~ qr/^(fo[ox])$/; qr/^(fo[ox])$/ ~~ { foo => 1 }; +{ 0 .. 100 } ~~ qr/[13579]$/; qr/[13579]$/ ~~ +{ 0 .. 100 }; +{ foo => 1, bar => 2 } ~~ "foo"; "foo" ~~ +{ foo => 1, bar => 2 }; +{ foo => 1, bar => 2 } ~~ "baz"; "baz" ~~ +{ foo => 1, bar => 2 }; [] ~~ []; [] ~~ []; [] ~~ [1]; [1] ~~ []; [ ["foo"], ["bar"] ] ~~ [ qr/o/, qr/a/ ]; [ qr/o/, qr/a/ ] ~~ [ ["foo"], ["bar"] ]; [ "foo", "bar" ] ~~ [ qr/o/, qr/a/ ]; [ qr/o/, qr/a/ ] ~~ [ "foo", "bar" ]; $deep1 ~~ $deep1; $deep1 ~~ $deep1; $deep1 ~~ $deep2; $deep2 ~~ $deep1; \@nums ~~ \@tied_nums; \@tied_nums ~~ \@nums; [qw(foo bar baz quux)] ~~ qr/x/; qr/x/ ~~ [qw(foo bar baz quux)]; [qw(foo bar baz quux)] ~~ qr/y/; qr/y/ ~~ [qw(foo bar baz quux)]; [qw(1foo 2bar)] ~~ 2; 2 ~~ [qw(1foo 2bar)]; [qw(1foo 2bar)] ~~ "2"; "2" ~~ [qw(1foo 2bar)]; 2 ~~ 2; 2 ~~ 2; 2 ~~ 3; 3 ~~ 2; 2 ~~ "2"; "2" ~~ 2; 2 ~~ "2.0"; "2.0" ~~ 2; 2 ~~ "2bananas"; "2bananas" ~~ 2; 2_3 ~~ "2_3"; "2_3" ~~ 2_3; qr/x/ ~~ "x"; "x" ~~ qr/x/; qr/y/ ~~ "x"; "x" ~~ qr/y/; 12345 ~~ qr/3/; qr/3/ ~~ 12345; @nums ~~ 7; 7 ~~ @nums; @nums ~~ \@nums; \@nums ~~ @nums; @nums ~~ \\@nums; \\@nums ~~ @nums; @nums ~~ [ 1 .. 10 ]; [ 1 .. 10 ] ~~ @nums; @nums ~~ [ 0 .. 9 ]; [ 0 .. 9 ] ~~ @nums; %hash ~~ "foo"; "foo" ~~ %hash; %hash ~~ /bar/; /bar/ ~~ %hash; #8........... }, 'space1.def' => { source => "space1", params => "def", expect => <<'#9...........', # We usually want a space at '} (', for example: map { 1 * $_; } ( $y, $M, $w, $d, $h, $m, $s ); # But not others: &{ $_->[1] }( delete $_[$#_]{ $_->[0] } ); # remove unwanted spaces after $ and -> here &{ $_->[1] }( delete $_[$#_]{ $_->[0] } ); # this has both tabs and spaces to remove $setup = $labels->labelsetup( Output_Width => 2.625 ); #9........... }, 'space2.def' => { source => "space2", params => "def", expect => <<'#10...........', # space before this opening paren for $i ( 0 .. 20 ) { } # retain any space between '-' and bare word $myhash{ USER-NAME } = 'steve'; #10........... 
}, 'space3.def' => { source => "space3", params => "def", expect => <<'#11...........', # Treat newline as a whitespace. Otherwise, we might combine # 'Send' and '-recipients' here my $msg = new Fax::Send -recipients => $to, -data => $data; #11........... }, 'space4.def' => { source => "space4", params => "def", expect => <<'#12...........', # first prototype line will cause space between 'redirect' and '(' to close sub html::redirect($); #<-- temporary prototype; use html; print html::redirect('http://www.glob.com.au/'); #12........... }, 'space5.def' => { source => "space5", params => "def", expect => <<'#13...........', # first prototype line commented out; space after 'redirect' remains #sub html::redirect($); #<-- temporary prototype; use html; print html::redirect ('http://www.glob.com.au/'); #13........... }, 'structure1.def' => { source => "structure1", params => "def", expect => <<'#14...........', push @contents, $c->table( { -width => '100%' }, $c->Tr( $c->td( { -align => 'left' }, "The emboldened field names are mandatory, ", "the remainder are optional", ), $c->td( { -align => 'right' }, $c->a( { -href => 'help.cgi', -target => '_blank' }, "What are the various fields?" ) ) ) ); #14........... }, 'style.def' => { source => "style", params => "def", expect => <<'#15...........', # This test snippet is from package bbbike v3.214 by Slaven Rezic; GPL 2.0 licence sub arrange_topframe { my (@order) = ( $hslabel_frame, $km_frame, $speed_frame[0], $power_frame[0], $wind_frame, $percent_frame, $temp_frame, @speed_frame[ 1 .. $#speed_frame ], @power_frame[ 1 .. $#power_frame ], ); my (@col) = ( 0, 1, 3, 4 + $#speed_frame, 5 + $#speed_frame + $#power_frame, 2, 6 + $#speed_frame + $#power_frame, 4 .. 3 + $#speed_frame, 5 + $#speed_frame .. 
4 + $#speed_frame + $#power_frame ); $top->idletasks; my $width = 0; my (%gridslaves) = map { ( $_, 1 ) } $top_frame->gridSlaves; for ( my $i = 0 ; $i <= $#order ; $i++ ) { my $w = $order[$i]; next unless Tk::Exists($w); my $col = $col[$i] || 0; $width += $w->reqwidth; if ( $gridslaves{$w} ) { $w->gridForget; } if ( $width <= $top->width ) { $w->grid( -row => 0, -column => $col, -sticky => 'nsew' ); # XXX } } } #15........... }, 'style.style1' => { source => "style", params => "style1", expect => <<'#16...........', # This test snippet is from package bbbike v3.214 by Slaven Rezic; GPL 2.0 licence sub arrange_topframe { my (@order) = ( $hslabel_frame, $km_frame, $speed_frame[0], $power_frame[0], $wind_frame, $percent_frame, $temp_frame, @speed_frame[1 .. $#speed_frame], @power_frame[1 .. $#power_frame], ); my (@col) = ( 0, 1, 3, 4 + $#speed_frame, 5 + $#speed_frame + $#power_frame, 2, 6 + $#speed_frame + $#power_frame, 4 .. 3 + $#speed_frame, 5 + $#speed_frame .. 4 + $#speed_frame + $#power_frame ); $top->idletasks; my $width = 0; my (%gridslaves) = map { ($_, 1) } $top_frame->gridSlaves; for (my $i = 0; $i <= $#order; $i++) { my $w = $order[$i]; next unless Tk::Exists($w); my $col = $col[$i] || 0; $width += $w->reqwidth; if ($gridslaves{$w}) { $w->gridForget; } if ($width <= $top->width) { $w->grid( -row => 0, -column => $col, -sticky => 'nsew' ); # XXX } } } #16........... 
}, 'style.style2' => { source => "style", params => "style2", expect => <<'#17...........', # This test snippet is from package bbbike v3.214 by Slaven Rezic; GPL 2.0 licence sub arrange_topframe { my (@order) = ( $hslabel_frame, $km_frame, $speed_frame[0], $power_frame[0], $wind_frame, $percent_frame, $temp_frame, @speed_frame[1..$#speed_frame], @power_frame[1..$#power_frame], ); my (@col) = ( 0, 1, 3, 4 + $#speed_frame, 5 + $#speed_frame + $#power_frame, 2, 6 + $#speed_frame + $#power_frame, 4..3 + $#speed_frame, 5 + $#speed_frame..4 + $#speed_frame + $#power_frame ); $top->idletasks; my $width = 0; my (%gridslaves) = map { ($_, 1) } $top_frame->gridSlaves; for (my $i = 0; $i <= $#order; $i++) { my $w = $order[$i]; next unless Tk::Exists($w); my $col = $col[$i] || 0; $width += $w->reqwidth; if ($gridslaves{$w}) { $w->gridForget; } if ($width <= $top->width) { $w->grid( -row => 0, -column => $col, -sticky => 'nsew' ); # XXX } } } #17........... }, 'style.style3' => { source => "style", params => "style3", expect => <<'#18...........', # This test snippet is from package bbbike v3.214 by Slaven Rezic; GPL 2.0 licence sub arrange_topframe { my (@order) = ( $hslabel_frame, $km_frame, $speed_frame[0], $power_frame[0], $wind_frame, $percent_frame, $temp_frame, @speed_frame[ 1 .. $#speed_frame ], @power_frame[ 1 .. $#power_frame ], ); my (@col) = ( 0, 1, 3, 4 + $#speed_frame, 5 + $#speed_frame + $#power_frame, 2, 6 + $#speed_frame + $#power_frame, 4 .. 3 + $#speed_frame, 5 + $#speed_frame .. 4 + $#speed_frame + $#power_frame ); $top->idletasks; my $width = 0; my (%gridslaves) = map { ( $_, 1 ) } $top_frame->gridSlaves; for ( my $i = 0 ; $i <= $#order ; $i++ ) { my $w = $order[$i]; next unless Tk::Exists($w); my $col = $col[$i] || 0; $width += $w->reqwidth; if ( $gridslaves{$w} ) { $w->gridForget; } if ( $width <= $top->width ) { $w->grid( -row => 0, -column => $col, -sticky => 'nsew' ); # XXX } } } ## end sub arrange_topframe #18........... 
}, 'style.style4' => { source => "style", params => "style4", expect => <<'#19...........', # This test snippet is from package bbbike v3.214 by Slaven Rezic; GPL 2.0 licence sub arrange_topframe { my (@order) = ( $hslabel_frame, $km_frame, $speed_frame[0], $power_frame[0], $wind_frame, $percent_frame, $temp_frame, @speed_frame[1 .. $#speed_frame], @power_frame[1 .. $#power_frame], ); my (@col) = ( 0, 1, 3, 4 + $#speed_frame, 5 + $#speed_frame + $#power_frame, 2, 6 + $#speed_frame + $#power_frame, 4 .. 3 + $#speed_frame, 5 + $#speed_frame .. 4 + $#speed_frame + $#power_frame ); $top->idletasks; my $width = 0; my (%gridslaves) = map { ($_, 1) } $top_frame->gridSlaves; for (my $i = 0 ; $i <= $#order ; $i++) { my $w = $order[$i]; next unless Tk::Exists($w); my $col = $col[$i] || 0; $width += $w->reqwidth; if ($gridslaves{$w}) { $w->gridForget; } if ($width <= $top->width) { $w->grid( -row => 0, -column => $col, -sticky => 'nsew' ); # XXX } } } #19........... }, 'style.style5' => { source => "style", params => "style5", expect => <<'#20...........', # This test snippet is from package bbbike v3.214 by Slaven Rezic; GPL 2.0 licence sub arrange_topframe { my (@order) = ( $hslabel_frame, $km_frame, $speed_frame[0], $power_frame[0], $wind_frame, $percent_frame, $temp_frame, @speed_frame[1 .. $#speed_frame], @power_frame[1 .. $#power_frame], ); my (@col) = ( 0, 1, 3, 4 + $#speed_frame, 5 + $#speed_frame + $#power_frame, 2, 6 + $#speed_frame + $#power_frame, 4 .. 3 + $#speed_frame, 5 + $#speed_frame .. 4 + $#speed_frame + $#power_frame ); $top->idletasks; my $width = 0; my (%gridslaves) = map { ($_, 1) } $top_frame->gridSlaves; for (my $i = 0; $i <= $#order; $i++) { my $w = $order[$i]; next unless Tk::Exists($w); my $col = $col[$i] || 0; $width += $w->reqwidth; if ($gridslaves{$w}) { $w->gridForget; } if ($width <= $top->width) { $w->grid( -row => 0, -column => $col, -sticky => 'nsew' ); # XXX } } } #20........... 
        },
    };

    my $ntests = 0 + keys %{$rtests};
    plan tests => $ntests;
}

###############
# EXECUTE TESTS
###############

foreach my $key ( sort keys %{$rtests} ) {
    my $output;
    my $sname  = $rtests->{$key}->{source};
    my $expect = $rtests->{$key}->{expect};
    my $pname  = $rtests->{$key}->{params};
    my $source = $rsources->{$sname};
    my $params = defined($pname) ? $rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets25.t0000644000175000017500000005006514373177245015314 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 novalign.def #2 novalign.novalign1 #3 novalign.novalign2 #4 novalign.novalign3 #5 lp2.def #6 lp2.lp #7 braces.braces8 #8 rt140025.def #9 rt140025.rt140025 #10 xlp1.def #11 xlp1.xlp1 #12 git74.def #13 git74.git74 #14 git77.def #15 git77.git77 #16 vxl.def #17 vxl.vxl1 #18 vxl.vxl2 #19 bal.bal1 # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'bal1' => "-bal=1", 'braces8' => <<'----------', -bl -bbvt=1 -blxl=' ' -bll='sub do asub' ---------- 'def' => "", 'git74' => <<'----------', -xlp --iterations=2 --maximum-line-length=120 --line-up-parentheses --continuation-indentation=4 --closing-token-indentation=1 --want-left-space="= -> ( )" --want-right-space="= -> ( )" --space-function-paren --space-keyword-paren --space-terminal-semicolon --opening-brace-on-new-line --opening-sub-brace-on-new-line --opening-anonymous-sub-brace-on-new-line --brace-left-and-indent --brace-left-and-indent-list="*" --break-before-hash-brace=3 ---------- 'git77' => <<'----------', -gal='Grep Map' ---------- 'lp' => "-lp", 'novalign1' => "-novalign", 'novalign2' => "-nvsc -nvbc -msc=2", 'novalign3' => "-nvc", 'rt140025' => "-lp -xci -ci=4 -ce", 'vxl1' => <<'----------', -vxl='=' ---------- 'vxl2' => <<'----------', -vxl='*' -vil='=' ---------- 'xlp1' => "-xlp", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'bal' => <<'----------', { L1: L2: L3: return; }; ---------- 'braces' => <<'----------', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( 
$_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; ---------- 'git74' => <<'----------', $self->func( { command => [ 'command', 'argument1', 'argument2' ], callback => sub { my ($res) = @_; print($res); } } ); my $test_var = $self->test_call( # $arg1, $arg2 ); my $test_var = $self->test_call( $arg1, # $arg2 ); my $test_var = $self->test_call( # $arg1, $arg2, ); my $test_var = $self->test_call( $arg1, $arg2, ); my $test_var = $self->test_call( $arg1, $arg2 ); my $test_var = $self->test_call( $arg1, $arg2, ); my $test_var = $self->test_call( $arg1, $arg2 ); ---------- 'git77' => <<'----------', # These should format about the same with -gal='Map Grep'. # NOTE: The braces only align if the internal code flag ALIGN_GREP_ALIASES is set return +{ Map { $_->init_arg => $_->get_value($instance) } Grep { $_->has_value($instance) } Grep { defined( $_->init_arg ) } $class->get_all_attributes }; return +{ map { $_->init_arg => $_->get_value($instance) } grep { $_->has_value($instance) } grep { defined( $_->init_arg ) } $class->get_all_attributes }; ---------- 'lp2' => <<'----------', # test issue git #74, lost -lp when final anon sub brace followed by '}' Util::Parser->new( Handlers => { Init => sub { $self->init(@_) }, Mid => { sub { shift; $self->mid(@_) } }, Final => sub { shift; $self->final(@_) } } )->parse( $_[0] ); ---------- 'novalign' => <<'----------', { # simple vertical alignment of '=' and '#' # A long line to test -nvbc ... 
normally this will cause the previous line to move left my $lines = 0; # checksum: #lines my $bytes = 0; # checksum: #bytes my $sum = 0; # checksum: system V sum my $patchdata = 0; # saw patch data my $pos = 0; # start of patch data # a hanging side comment my $endkit = 0; # saw end of kit my $fail = 0; # failed } ---------- 'rt140025' => <<'----------', eval { my $cpid; my $cmd; FORK: { if( $cpid = fork ) { close( STDOUT ); last; } elsif( defined $cpid ) { close( STDIN ); open( STDIN, '<', '/dev/null' ) or die( "open3: $!\n" ); exec $cmd or die( "exec: $!\n" ); } elsif( $! == EAGAIN ) { sleep 3; redo FORK; } else { die( "Can't fork: $!\n" ); } } }; ---------- 'vxl' => <<'----------', # if equals is excluded then ternary is automatically excluded # side comment alignments always remain $co_description = ($color) ? 'bold cyan' : ''; # description $co_prompt = ($color) ? 'bold green' : ''; # prompt $co_unused = ($color) ? 'on_green' : 'reverse'; # unused ---------- 'xlp1' => <<'----------', # test -xlp with comments, broken sub blocks, blank line, line length limit $cb1 = $act_page->Checkbutton( -text => M "Verwenden", -variable => \$qualitaet_s_optimierung, -command => sub { change_state_all( $act_page1, $qualitaet_s_optimierung, { $cb1 => 1 } ) ; # sc }, )->grid( # block comment -row => $gridy++, -column => 2, -sticky => 'e' ); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'novalign.def' => { source => "novalign", params => "def", expect => <<'#1...........', { # simple vertical alignment of '=' and '#' # A long line to test -nvbc ... normally this will cause the previous line to move left my $lines = 0; # checksum: #lines my $bytes = 0; # checksum: #bytes my $sum = 0; # checksum: system V sum my $patchdata = 0; # saw patch data my $pos = 0; # start of patch data # a hanging side comment my $endkit = 0; # saw end of kit my $fail = 0; # failed } #1........... 
}, 'novalign.novalign1' => { source => "novalign", params => "novalign1", expect => <<'#2...........', { # simple vertical alignment of '=' and '#' # A long line to test -nvbc ... normally this will cause the previous line to move left my $lines = 0; # checksum: #lines my $bytes = 0; # checksum: #bytes my $sum = 0; # checksum: system V sum my $patchdata = 0; # saw patch data my $pos = 0; # start of patch data # a hanging side comment my $endkit = 0; # saw end of kit my $fail = 0; # failed } #2........... }, 'novalign.novalign2' => { source => "novalign", params => "novalign2", expect => <<'#3...........', { # simple vertical alignment of '=' and '#' # A long line to test -nvbc ... normally this will cause the previous line to move left my $lines = 0; # checksum: #lines my $bytes = 0; # checksum: #bytes my $sum = 0; # checksum: system V sum my $patchdata = 0; # saw patch data my $pos = 0; # start of patch data # a hanging side comment my $endkit = 0; # saw end of kit my $fail = 0; # failed } #3........... }, 'novalign.novalign3' => { source => "novalign", params => "novalign3", expect => <<'#4...........', { # simple vertical alignment of '=' and '#' # A long line to test -nvbc ... normally this will cause the previous line to move left my $lines = 0; # checksum: #lines my $bytes = 0; # checksum: #bytes my $sum = 0; # checksum: system V sum my $patchdata = 0; # saw patch data my $pos = 0; # start of patch data # a hanging side comment my $endkit = 0; # saw end of kit my $fail = 0; # failed } #4........... }, 'lp2.def' => { source => "lp2", params => "def", expect => <<'#5...........', # test issue git #74, lost -lp when final anon sub brace followed by '}' Util::Parser->new( Handlers => { Init => sub { $self->init(@_) }, Mid => { sub { shift; $self->mid(@_) } }, Final => sub { shift; $self->final(@_) } } )->parse( $_[0] ); #5........... 
}, 'lp2.lp' => { source => "lp2", params => "lp", expect => <<'#6...........', # test issue git #74, lost -lp when final anon sub brace followed by '}' Util::Parser->new( Handlers => { Init => sub { $self->init(@_) }, Mid => { sub { shift; $self->mid(@_) } }, Final => sub { shift; $self->final(@_) } } )->parse( $_[0] ); #6........... }, 'braces.braces8' => { source => "braces", params => "braces8", expect => <<'#7...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; #7........... }, 'rt140025.def' => { source => "rt140025", params => "def", expect => <<'#8...........', eval { my $cpid; my $cmd; FORK: { if ( $cpid = fork ) { close(STDOUT); last; } elsif ( defined $cpid ) { close(STDIN); open( STDIN, '<', '/dev/null' ) or die("open3: $!\n"); exec $cmd or die("exec: $!\n"); } elsif ( $! == EAGAIN ) { sleep 3; redo FORK; } else { die("Can't fork: $!\n"); } } }; #8........... }, 'rt140025.rt140025' => { source => "rt140025", params => "rt140025", expect => <<'#9...........', eval { my $cpid; my $cmd; FORK: { if ( $cpid = fork ) { close(STDOUT); last; } elsif ( defined $cpid ) { close(STDIN); open( STDIN, '<', '/dev/null' ) or die("open3: $!\n"); exec $cmd or die("exec: $!\n"); } elsif ( $! == EAGAIN ) { sleep 3; redo FORK; } else { die("Can't fork: $!\n"); } } }; #9........... 
}, 'xlp1.def' => { source => "xlp1", params => "def", expect => <<'#10...........', # test -xlp with comments, broken sub blocks, blank line, line length limit $cb1 = $act_page->Checkbutton( -text => M "Verwenden", -variable => \$qualitaet_s_optimierung, -command => sub { change_state_all( $act_page1, $qualitaet_s_optimierung, { $cb1 => 1 } ) ; # sc }, )->grid( # block comment -row => $gridy++, -column => 2, -sticky => 'e' ); #10........... }, 'xlp1.xlp1' => { source => "xlp1", params => "xlp1", expect => <<'#11...........', # test -xlp with comments, broken sub blocks, blank line, line length limit $cb1 = $act_page->Checkbutton( -text => M "Verwenden", -variable => \$qualitaet_s_optimierung, -command => sub { change_state_all( $act_page1, $qualitaet_s_optimierung, { $cb1 => 1 } ) ; # sc }, )->grid( # block comment -row => $gridy++, -column => 2, -sticky => 'e' ); #11........... }, 'git74.def' => { source => "git74", params => "def", expect => <<'#12...........', $self->func( { command => [ 'command', 'argument1', 'argument2' ], callback => sub { my ($res) = @_; print($res); } } ); my $test_var = $self->test_call( # $arg1, $arg2 ); my $test_var = $self->test_call( $arg1, # $arg2 ); my $test_var = $self->test_call( # $arg1, $arg2, ); my $test_var = $self->test_call( $arg1, $arg2, ); my $test_var = $self->test_call( $arg1, $arg2 ); my $test_var = $self->test_call( $arg1, $arg2, ); my $test_var = $self->test_call( $arg1, $arg2 ); #12........... 
}, 'git74.git74' => { source => "git74", params => "git74", expect => <<'#13...........', $self -> func ( { command => [ 'command', 'argument1', 'argument2' ], callback => sub { my ($res) = @_ ; print ($res) ; } } ) ; my $test_var = $self -> test_call ( # $arg1, $arg2 ) ; my $test_var = $self -> test_call ( $arg1, # $arg2 ) ; my $test_var = $self -> test_call ( # $arg1, $arg2, ) ; my $test_var = $self -> test_call ( $arg1, $arg2, ) ; my $test_var = $self -> test_call ( $arg1, $arg2 ) ; my $test_var = $self -> test_call ( $arg1, $arg2, ) ; my $test_var = $self -> test_call ( $arg1, $arg2 ) ; #13........... }, 'git77.def' => { source => "git77", params => "def", expect => <<'#14...........', # These should format about the same with -gal='Map Grep'. # NOTE: The braces only align if the internal code flag ALIGN_GREP_ALIASES is set return +{ Map { $_->init_arg => $_->get_value($instance) } Grep { $_->has_value($instance) } Grep { defined( $_->init_arg ) } $class->get_all_attributes }; return +{ map { $_->init_arg => $_->get_value($instance) } grep { $_->has_value($instance) } grep { defined( $_->init_arg ) } $class->get_all_attributes }; #14........... }, 'git77.git77' => { source => "git77", params => "git77", expect => <<'#15...........', # These should format about the same with -gal='Map Grep'. # NOTE: The braces only align if the internal code flag ALIGN_GREP_ALIASES is set return +{ Map { $_->init_arg => $_->get_value($instance) } Grep { $_->has_value($instance) } Grep { defined( $_->init_arg ) } $class->get_all_attributes }; return +{ map { $_->init_arg => $_->get_value($instance) } grep { $_->has_value($instance) } grep { defined( $_->init_arg ) } $class->get_all_attributes }; #15........... }, 'vxl.def' => { source => "vxl", params => "def", expect => <<'#16...........', # if equals is excluded then ternary is automatically excluded # side comment alignments always remain $co_description = ($color) ? 'bold cyan' : ''; # description $co_prompt = ($color) ? 
'bold green' : ''; # prompt $co_unused = ($color) ? 'on_green' : 'reverse'; # unused #16........... }, 'vxl.vxl1' => { source => "vxl", params => "vxl1", expect => <<'#17...........', # if equals is excluded then ternary is automatically excluded # side comment alignments always remain $co_description = ($color) ? 'bold cyan' : ''; # description $co_prompt = ($color) ? 'bold green' : ''; # prompt $co_unused = ($color) ? 'on_green' : 'reverse'; # unused #17........... }, 'vxl.vxl2' => { source => "vxl", params => "vxl2", expect => <<'#18...........', # if equals is excluded then ternary is automatically excluded # side comment alignments always remain $co_description = ($color) ? 'bold cyan' : ''; # description $co_prompt = ($color) ? 'bold green' : ''; # prompt $co_unused = ($color) ? 'on_green' : 'reverse'; # unused #18........... }, 'bal.bal1' => { source => "bal", params => "bal1", expect => <<'#19...........', { L1: L2: L3: return; }; #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
      $rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets15.t0000644000175000017500000003425114373177244015311 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 gnu5.gnu #2 wngnu1.def #3 olbs.def #4 olbs.olbs0 #5 olbs.olbs2 #6 break_old_methods.break_old_methods #7 break_old_methods.def #8 bom1.bom #9 bom1.def #10 align28.def #11 align29.def #12 align30.def #13 git09.def #14 git09.git09 #15 git14.def #16 sal.def #17 sal.sal #18 spp.def #19 spp.spp0 # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'bom' => "-bom -wn", 'break_old_methods' => "--break-at-old-method-breakpoints", 'def' => "", 'git09' => "-ce -cbl=map,sort,grep", 'gnu' => "-gnu", 'olbs0' => "-olbs=0", 'olbs2' => "-olbs=2", 'sal' => <<'----------', -sal='method fun' ---------- 'spp0' => "-spp=0", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'align28' => <<'----------', # tests for 'delete_needless_parens' # align all '='s; but do not align parens my $w = $columns * $cell_w + ( $columns + 1 ) * $border; my $h = $rows * $cell_h + ( $rows + 1 ) * $border; my $img = new Gimp::Image( $w, $h, RGB ); # keep leading paren after if as alignment for padding eval { if ( $a->{'abc'} eq 'ABC' ) { no_op(23) } else { no_op(42) } }; ---------- 'align29' => <<'----------', # alignment with lots of commas is( floor(1.23441242), 1, "Basic floor(1.23441242) test" ); is( fmod( 3.5, 2.0 ), 1.5, "Basic fmod(3.5, 2.0) test" ); is( join( " ", frexp(1) ), "0.5 1", "Basic frexp(1) test" ); is( ldexp( 0, 1 ), 0, "Basic ldexp(0,1) test" ); is( log10(1), 0, "Basic log10(1) test" ); ---------- 'align30' => <<'----------', # commas on lhs align, commas on rhs do not 
(different subs) ($x,$y,$z)=spherical_to_cartesian($rho,$theta,$phi); ($rho_c,$theta,$z)=spherical_to_cylindrical($rho_s,$theta,$phi); ( $r2, $theta2, $z2 )=cartesian_to_cylindrical( $x1, $y1, $z1 ); # two-line if/elsif gets aligned if($i==$depth){$_++;} elsif($i>$depth){$_=0;} ---------- 'bom1' => <<'----------', # keep cuddled call chain with -bom return Mojo::Promise->resolve( $query_params )->then( &_reveal_event )->then(sub ($code) { return $c->render(text => '', status => $code); })->catch(sub { # 1. return error return $c->render(json => {}, status => 400); }); ---------- 'break_old_methods' => <<'----------', my $q = $rs ->related_resultset('CDs') ->related_resultset('Tracks') ->search({ 'track.id' => { -ident => 'none_search.id' }, }) ->as_query; ---------- 'git09' => <<'----------', # no one-line block for first map with -ce -cbl=map,sort,grep @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] } map { [$_, length($_)] } @unsorted; ---------- 'git14' => <<'----------', # git#14; do not break at trailing 'or' $second = { key1 => 'aaa', key2 => 'bbb', } if $flag1 or $flag2; ---------- 'gnu5' => <<'----------', # side comments limit gnu type formatting with l=80; note extra comma push @tests, [ "Lowest code point requiring 13 bytes to represent", # 2**36 "\xff\x80\x80\x80\x80\x80\x81\x80\x80\x80\x80\x80\x80", ($::is64bit) ? 
0x1000000000 : -1, # overflows on 32bit ], ; ---------- 'olbs' => <<'----------', for $x ( 1, 2 ) { s/(.*)/+$1/ } for $x ( 1, 2 ) { s/(.*)/+$1/ } # side comment if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked" } for $x ( 1, 2 ) { s/(.*)/+$1/; } for $x ( 1, 2 ) { s/(.*)/+$1/; } # side comment if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked"; } ---------- 'sal' => <<'----------', sub get_val () { } method get_value () { } fun get_other_value () { } ---------- 'spp' => <<'----------', sub get_val() { } sub get_Val () { } sub Get_val () { } my $sub1=sub () { }; my $sub2=sub () { }; ---------- 'wngnu1' => <<'----------', # test with -wn -gnu foreach my $parameter ( qw( set_themes add_themes severity maximum_violations_per_document _non_public_data ) ) { is( $config->get($parameter), undef, qq<"$parameter" is not defined via get() for $policy_short_name.>, ); } ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'gnu5.gnu' => { source => "gnu5", params => "gnu", expect => <<'#1...........', # side comments limit gnu type formatting with l=80; note extra comma push @tests, [ "Lowest code point requiring 13 bytes to represent", # 2**36 "\xff\x80\x80\x80\x80\x80\x81\x80\x80\x80\x80\x80\x80", ($::is64bit) ? 0x1000000000 : -1, # overflows on 32bit ], ; #1........... }, 'wngnu1.def' => { source => "wngnu1", params => "def", expect => <<'#2...........', # test with -wn -gnu foreach my $parameter ( qw( set_themes add_themes severity maximum_violations_per_document _non_public_data ) ) { is( $config->get($parameter), undef, qq<"$parameter" is not defined via get() for $policy_short_name.>, ); } #2........... 
}, 'olbs.def' => { source => "olbs", params => "def", expect => <<'#3...........', for $x ( 1, 2 ) { s/(.*)/+$1/ } for $x ( 1, 2 ) { s/(.*)/+$1/ } # side comment if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked" } for $x ( 1, 2 ) { s/(.*)/+$1/; } for $x ( 1, 2 ) { s/(.*)/+$1/; } # side comment if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked"; } #3........... }, 'olbs.olbs0' => { source => "olbs", params => "olbs0", expect => <<'#4...........', for $x ( 1, 2 ) { s/(.*)/+$1/ } for $x ( 1, 2 ) { s/(.*)/+$1/ } # side comment if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked" } for $x ( 1, 2 ) { s/(.*)/+$1/ } for $x ( 1, 2 ) { s/(.*)/+$1/ } # side comment if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked"; } #4........... }, 'olbs.olbs2' => { source => "olbs", params => "olbs2", expect => <<'#5...........', for $x ( 1, 2 ) { s/(.*)/+$1/; } for $x ( 1, 2 ) { s/(.*)/+$1/; } # side comment if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked"; } for $x ( 1, 2 ) { s/(.*)/+$1/; } for $x ( 1, 2 ) { s/(.*)/+$1/; } # side comment if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked"; } #5........... }, 'break_old_methods.break_old_methods' => { source => "break_old_methods", params => "break_old_methods", expect => <<'#6...........', my $q = $rs ->related_resultset('CDs') ->related_resultset('Tracks') ->search( { 'track.id' => { -ident => 'none_search.id' }, } )->as_query; #6........... }, 'break_old_methods.def' => { source => "break_old_methods", params => "def", expect => <<'#7...........', my $q = $rs->related_resultset('CDs')->related_resultset('Tracks')->search( { 'track.id' => { -ident => 'none_search.id' }, } )->as_query; #7........... 
}, 'bom1.bom' => { source => "bom1", params => "bom", expect => <<'#8...........', # keep cuddled call chain with -bom return Mojo::Promise->resolve( $query_params )->then( &_reveal_event )->then( sub ($code) { return $c->render( text => '', status => $code ); } )->catch( sub { # 1. return error return $c->render( json => {}, status => 400 ); } ); #8........... }, 'bom1.def' => { source => "bom1", params => "def", expect => <<'#9...........', # keep cuddled call chain with -bom return Mojo::Promise->resolve($query_params)->then(&_reveal_event)->then( sub ($code) { return $c->render( text => '', status => $code ); } )->catch( sub { # 1. return error return $c->render( json => {}, status => 400 ); } ); #9........... }, 'align28.def' => { source => "align28", params => "def", expect => <<'#10...........', # tests for 'delete_needless_parens' # align all '='s; but do not align parens my $w = $columns * $cell_w + ( $columns + 1 ) * $border; my $h = $rows * $cell_h + ( $rows + 1 ) * $border; my $img = new Gimp::Image( $w, $h, RGB ); # keep leading paren after if as alignment for padding eval { if ( $a->{'abc'} eq 'ABC' ) { no_op(23) } else { no_op(42) } }; #10........... }, 'align29.def' => { source => "align29", params => "def", expect => <<'#11...........', # alignment with lots of commas is( floor(1.23441242), 1, "Basic floor(1.23441242) test" ); is( fmod( 3.5, 2.0 ), 1.5, "Basic fmod(3.5, 2.0) test" ); is( join( " ", frexp(1) ), "0.5 1", "Basic frexp(1) test" ); is( ldexp( 0, 1 ), 0, "Basic ldexp(0,1) test" ); is( log10(1), 0, "Basic log10(1) test" ); #11........... 
}, 'align30.def' => { source => "align30", params => "def", expect => <<'#12...........', # commas on lhs align, commas on rhs do not (different subs) ( $x, $y, $z ) = spherical_to_cartesian( $rho, $theta, $phi ); ( $rho_c, $theta, $z ) = spherical_to_cylindrical( $rho_s, $theta, $phi ); ( $r2, $theta2, $z2 ) = cartesian_to_cylindrical( $x1, $y1, $z1 ); # two-line if/elsif gets aligned if ( $i == $depth ) { $_++; } elsif ( $i > $depth ) { $_ = 0; } #12........... }, 'git09.def' => { source => "git09", params => "def", expect => <<'#13...........', # no one-line block for first map with -ce -cbl=map,sort,grep @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] } map { [ $_, length($_) ] } @unsorted; #13........... }, 'git09.git09' => { source => "git09", params => "git09", expect => <<'#14...........', # no one-line block for first map with -ce -cbl=map,sort,grep @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] } map { [ $_, length($_) ] } @unsorted; #14........... }, 'git14.def' => { source => "git14", params => "def", expect => <<'#15...........', # git#14; do not break at trailing 'or' $second = { key1 => 'aaa', key2 => 'bbb', } if $flag1 or $flag2; #15........... }, 'sal.def' => { source => "sal", params => "def", expect => <<'#16...........', sub get_val () { } method get_value () { } fun get_other_value() { } #16........... }, 'sal.sal' => { source => "sal", params => "sal", expect => <<'#17...........', sub get_val () { } method get_value () { } fun get_other_value () { } #17........... }, 'spp.def' => { source => "spp", params => "def", expect => <<'#18...........', sub get_val() { } sub get_Val () { } sub Get_val () { } my $sub1 = sub () { }; my $sub2 = sub () { }; #18........... }, 'spp.spp0' => { source => "spp", params => "spp0", expect => <<'#19...........', sub get_val() { } sub get_Val() { } sub Get_val() { } my $sub1 = sub() { }; my $sub2 = sub() { }; #19........... 
        },
    };

    my $ntests = 0 + keys %{$rtests};
    plan tests => $ntests;
}

###############
# EXECUTE TESTS
###############

foreach my $key ( sort keys %{$rtests} ) {
    my $output;
    my $sname  = $rtests->{$key}->{source};
    my $expect = $rtests->{$key}->{expect};
    my $pname  = $rtests->{$key}->{params};
    my $source = $rsources->{$sname};
    my $params = defined($pname) ? $rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/test_DEBUG.t0000755000175000017500000000205614265013033015145 0ustar stevesteve# Test that the -D (-DEBUG) flag works use strict; use Carp; use Perl::Tidy; use Test::More; my $name = 'DEBUG test'; BEGIN { plan tests => 2; } my $source = <<'EOM'; my @words = qw( alpha beta gamma ); EOM my $expect = <<'EOM'; my @words = qw( alpha beta gamma ); EOM my $debug_expect = <<'EOM'; Use -dump-token-types (-dtt) to get a list of token type codes 1: my @words = qw( 1: kkbiiiiiib=bqqq 2: alpha beta gamma 2: qqqqqqqqqqqqqqqq 3: ); 3: q; EOM my $output; my $stderr_string; my $errorfile_string; my $debug_string; my $perltidyrc = ""; my $err = Perl::Tidy::perltidy( argv => '-D -npro', perltidyrc => \$perltidyrc, # avoid reading unwanted .perltidyrc source => \$source, destination => \$output, stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set debugfile => \$debug_string, ); if ( $err || $stderr_string || $errorfile_string ) { ok(0); } else { is( $output, $expect, $name ); is( $debug_string, $debug_expect, $name ); } Perl-Tidy-20230309/t/snippets28.t0000644000175000017500000001611314373177245015313 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 olbxl.olbxl2 #2 recombine5.def #3 recombine6.def #4 recombine7.def #5 recombine8.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'olbxl2' => <<'----------', -olbxl='*' ---------- }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'olbxl' => <<'----------', eval { require Ace }; @list = map { $frm{ ( /@(.*?)>/ ? $1 : $_ ) }++ ? 
() : ($_); } @list; $color = join( '/', sort { $color_value{$::a} <=> $color_value{$::b}; } keys %colors ); @sorted = sort { $SortDir * $PageTotal{$a} <=> $SortDir * $PageTotal{$b} }; ---------- 'recombine5' => <<'----------', # recombine uses reverse optimization $rotate = Math::MatrixReal->new_from_string( "[ " . cos($theta) . " " . -sin($theta) . " ]\n" . "[ " . sin($theta) . " " . cos($theta) . " ]\n" ); ---------- 'recombine6' => <<'----------', # recombine operation uses forward optimization $filecol = (/^$/) ? $filecol : (s/^\+//) ? $filecol + $_ : (s/^\-//) ? $filecol - $_ : (s/^>//) ? ($filecol + $_) % $pages : (s/^]//) ? (($filecol + $_ >= $pages) ? 0 : $filecol + $_) : (s/^ <<'----------', # recombine uses forward optimization, must recombine at = my $J = int( 365.25 * ( $y + 4712 ) ) + int( ( 30.6 * $m ) + 0.5 ) + 59 + $d - 0.5; ---------- 'recombine8' => <<'----------', # recombine uses normal forward mode $v_gb = -1*(eval($pmt_gb))*(-1+((((-1+(1/((eval($i_gb)/100)+1))** ((eval($n_gb)-1)))))/(eval($i_gb)/100))); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'olbxl.olbxl2' => { source => "olbxl", params => "olbxl2", expect => <<'#1...........', eval { require Ace; }; @list = map { $frm{ ( /@(.*?)>/ ? $1 : $_ ) }++ ? () : ($_); } @list; $color = join( '/', sort { $color_value{$::a} <=> $color_value{$::b}; } keys %colors ); @sorted = sort { $SortDir * $PageTotal{$a} <=> $SortDir * $PageTotal{$b} }; #1........... }, 'recombine5.def' => { source => "recombine5", params => "def", expect => <<'#2...........', # recombine uses reverse optimization $rotate = Math::MatrixReal->new_from_string( "[ " . cos($theta) . " " . -sin($theta) . " ]\n" . "[ " . sin($theta) . " " . cos($theta) . " ]\n" ); #2........... }, 'recombine6.def' => { source => "recombine6", params => "def", expect => <<'#3...........', # recombine operation uses forward optimization $filecol = (/^$/) ? 
$filecol : (s/^\+//) ? $filecol + $_ : (s/^\-//) ? $filecol - $_ : (s/^>//) ? ( $filecol + $_ ) % $pages : (s/^]//) ? ( ( $filecol + $_ >= $pages ) ? 0 : $filecol + $_ ) : (s/^ { source => "recombine7", params => "def", expect => <<'#4...........', # recombine uses forward optimization, must recombine at = my $J = int( 365.25 * ( $y + 4712 ) ) + int( ( 30.6 * $m ) + 0.5 ) + 59 + $d - 0.5; #4........... }, 'recombine8.def' => { source => "recombine8", params => "def", expect => <<'#5...........', # recombine uses normal forward mode $v_gb = -1 * ( eval($pmt_gb) ) * ( -1 + ( ( ( ( -1 + ( 1 / ( ( eval($i_gb) / 100 ) + 1 ) ) **( ( eval($n_gb) - 1 ) ) ) ) ) / ( eval($i_gb) / 100 ) ) ); #5........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets12.t0000644000175000017500000003746514373177244015320 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 vtc1.def #2 vtc1.vtc #3 vtc2.def #4 vtc2.vtc #5 vtc3.def #6 vtc3.vtc #7 vtc4.def #8 vtc4.vtc #9 wn1.def #10 wn1.wn #11 wn2.def #12 wn2.wn #13 wn3.def #14 wn3.wn #15 wn4.def #16 wn4.wn #17 wn5.def #18 wn5.wn #19 wn6.def #20 wn6.wn # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'vtc' => <<'----------', -sbvtc=2 -bvtc=2 -pvtc=2 ---------- 'wn' => "-wn", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'vtc1' => <<'----------', @lol = ( [ 'Dr. Watson', undef, '221b', 'Baker St.', undef, 'London', 'NW1', undef, 'England', undef ], [ 'Sam Gamgee', undef, undef, 'Bagshot Row', undef, 'Hobbiton', undef, undef, 'The Shire', undef], ); ---------- 'vtc2' => <<'----------', ok( $s->call( SOAP::Data->name('getStateName') ->attr( { xmlns => 'urn:/My/Examples' } ), 1 )->result eq 'Alabama' ); ---------- 'vtc3' => <<'----------', $day_long = ( "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday" )[$wday]; ---------- 'vtc4' => <<'----------', my$bg_color=$im->colorAllocate(unpack('C3',pack('H2H2H2',unpack('a2a2a2',(length($options_r->{'bg_color'})?$options_r->{'bg_color'}:$MIDI::Opus::BG_color))))); ---------- 'wn1' => <<'----------', my $bg_color = $im->colorAllocate( unpack( 'C3', pack( 'H2H2H2', unpack( 'a2a2a2', ( length( $options_r->{'bg_color'} ) ? 
$options_r->{'bg_color'} : $MIDI::Opus::BG_color ) ) ) ) ); ---------- 'wn2' => <<'----------', if ($PLATFORM eq 'aix') { skip_symbols([qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern )]); } ---------- 'wn3' => <<'----------', deferred->resolve->then( sub { push @out, 'Resolve'; return $then; } )->then( sub { push @out, 'Reject'; push @out, @_; } ); ---------- 'wn4' => <<'----------', {{{ # Orignal formatting looks nice but would be hard to duplicate return exists $G->{ Attr }->{ E } && exists $G->{ Attr }->{ E }->{ $u } && exists $G->{ Attr }->{ E }->{ $u }->{ $v } ? %{ $G->{ Attr }->{ E }->{ $u }->{ $v } } : ( ); }}} ---------- 'wn5' => <<'----------', # qw weld with -wn use_all_ok( qw{ PPI PPI::Tokenizer PPI::Lexer PPI::Dumper PPI::Find PPI::Normal PPI::Util PPI::Cache } ); ---------- 'wn6' => <<'----------', # illustration of some do-not-weld rules # do not weld a two-line function call $trans->add_transformation( PDL::Graphics::TriD::Scale->new( $sx, $sy, $sz ) ); # but weld this more complex statement my $compass = uc( opposite_direction( line_to_canvas_direction( @{ $coords[0] }, @{ $coords[1] } ) ) ); # OLD: do not weld to a one-line block because the function could # get separated from its opening paren. # NEW: (30-jan-2021): keep one-line block together for stability $_[0]->code_handler ( sub { $morexxxxxxxxxxxxxxxxxx .= $_[1] . ":" . $_[0] . "\n" } ); # another example; do not weld because the sub is not broken $wrapped->add_around_modifier( sub { push @tracelog => 'around 1'; $_[0]->(); } ); # but okay to weld here because the sub is broken $wrapped->add_around_modifier( sub { push @tracelog => 'around 1'; $_[0]->(); } ); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'vtc1.def' => { source => "vtc1", params => "def", expect => <<'#1...........', @lol = ( [ 'Dr. 
Watson', undef, '221b', 'Baker St.', undef, 'London', 'NW1', undef, 'England', undef ], [ 'Sam Gamgee', undef, undef, 'Bagshot Row', undef, 'Hobbiton', undef, undef, 'The Shire', undef ], ); #1........... }, 'vtc1.vtc' => { source => "vtc1", params => "vtc", expect => <<'#2...........', @lol = ( [ 'Dr. Watson', undef, '221b', 'Baker St.', undef, 'London', 'NW1', undef, 'England', undef ], [ 'Sam Gamgee', undef, undef, 'Bagshot Row', undef, 'Hobbiton', undef, undef, 'The Shire', undef ], ); #2........... }, 'vtc2.def' => { source => "vtc2", params => "def", expect => <<'#3...........', ok( $s->call( SOAP::Data->name('getStateName') ->attr( { xmlns => 'urn:/My/Examples' } ), 1 )->result eq 'Alabama' ); #3........... }, 'vtc2.vtc' => { source => "vtc2", params => "vtc", expect => <<'#4...........', ok( $s->call( SOAP::Data->name('getStateName') ->attr( { xmlns => 'urn:/My/Examples' } ), 1 )->result eq 'Alabama' ); #4........... }, 'vtc3.def' => { source => "vtc3", params => "def", expect => <<'#5...........', $day_long = ( "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday" )[$wday]; #5........... }, 'vtc3.vtc' => { source => "vtc3", params => "vtc", expect => <<'#6...........', $day_long = ( "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday" )[$wday]; #6........... }, 'vtc4.def' => { source => "vtc4", params => "def", expect => <<'#7...........', my $bg_color = $im->colorAllocate( unpack( 'C3', pack( 'H2H2H2', unpack( 'a2a2a2', ( length( $options_r->{'bg_color'} ) ? $options_r->{'bg_color'} : $MIDI::Opus::BG_color ) ) ) ) ); #7........... }, 'vtc4.vtc' => { source => "vtc4", params => "vtc", expect => <<'#8...........', my $bg_color = $im->colorAllocate( unpack( 'C3', pack( 'H2H2H2', unpack( 'a2a2a2', ( length( $options_r->{'bg_color'} ) ? $options_r->{'bg_color'} : $MIDI::Opus::BG_color ) ) ) ) ); #8........... 
}, 'wn1.def' => { source => "wn1", params => "def", expect => <<'#9...........', my $bg_color = $im->colorAllocate( unpack( 'C3', pack( 'H2H2H2', unpack( 'a2a2a2', ( length( $options_r->{'bg_color'} ) ? $options_r->{'bg_color'} : $MIDI::Opus::BG_color ) ) ) ) ); #9........... }, 'wn1.wn' => { source => "wn1", params => "wn", expect => <<'#10...........', my $bg_color = $im->colorAllocate( unpack( 'C3', pack( 'H2H2H2', unpack( 'a2a2a2', ( length( $options_r->{'bg_color'} ) ? $options_r->{'bg_color'} : $MIDI::Opus::BG_color ) ) ) ) ); #10........... }, 'wn2.def' => { source => "wn2", params => "def", expect => <<'#11...........', if ( $PLATFORM eq 'aix' ) { skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ] ); } #11........... }, 'wn2.wn' => { source => "wn2", params => "wn", expect => <<'#12...........', if ( $PLATFORM eq 'aix' ) { skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ] ); } #12........... }, 'wn3.def' => { source => "wn3", params => "def", expect => <<'#13...........', deferred->resolve->then( sub { push @out, 'Resolve'; return $then; } )->then( sub { push @out, 'Reject'; push @out, @_; } ); #13........... }, 'wn3.wn' => { source => "wn3", params => "wn", expect => <<'#14...........', deferred->resolve->then( sub { push @out, 'Resolve'; return $then; } )->then( sub { push @out, 'Reject'; push @out, @_; } ); #14........... }, 'wn4.def' => { source => "wn4", params => "def", expect => <<'#15...........', { { { # Orignal formatting looks nice but would be hard to duplicate return exists $G->{Attr}->{E} && exists $G->{Attr}->{E}->{$u} && exists $G->{Attr}->{E}->{$u}->{$v} ? %{ $G->{Attr}->{E}->{$u}->{$v} } : (); } } } #15........... }, 'wn4.wn' => { source => "wn4", params => "wn", expect => <<'#16...........', { { { # Orignal formatting looks nice but would be hard to duplicate return exists $G->{Attr}->{E} && exists $G->{Attr}->{E}->{$u} && exists $G->{Attr}->{E}->{$u}->{$v} ? 
%{ $G->{Attr}->{E}->{$u}->{$v} } : (); } } } #16........... }, 'wn5.def' => { source => "wn5", params => "def", expect => <<'#17...........', # qw weld with -wn use_all_ok( qw{ PPI PPI::Tokenizer PPI::Lexer PPI::Dumper PPI::Find PPI::Normal PPI::Util PPI::Cache } ); #17........... }, 'wn5.wn' => { source => "wn5", params => "wn", expect => <<'#18...........', # qw weld with -wn use_all_ok( qw{ PPI PPI::Tokenizer PPI::Lexer PPI::Dumper PPI::Find PPI::Normal PPI::Util PPI::Cache } ); #18........... }, 'wn6.def' => { source => "wn6", params => "def", expect => <<'#19...........', # illustration of some do-not-weld rules # do not weld a two-line function call $trans->add_transformation( PDL::Graphics::TriD::Scale->new( $sx, $sy, $sz ) ); # but weld this more complex statement my $compass = uc( opposite_direction( line_to_canvas_direction( @{ $coords[0] }, @{ $coords[1] } ) ) ); # OLD: do not weld to a one-line block because the function could # get separated from its opening paren. # NEW: (30-jan-2021): keep one-line block together for stability $_[0]->code_handler( sub { $morexxxxxxxxxxxxxxxxxx .= $_[1] . ":" . $_[0] . "\n" } ); # another example; do not weld because the sub is not broken $wrapped->add_around_modifier( sub { push @tracelog => 'around 1'; $_[0]->(); } ); # but okay to weld here because the sub is broken $wrapped->add_around_modifier( sub { push @tracelog => 'around 1'; $_[0]->(); } ); #19........... }, 'wn6.wn' => { source => "wn6", params => "wn", expect => <<'#20...........', # illustration of some do-not-weld rules # do not weld a two-line function call $trans->add_transformation( PDL::Graphics::TriD::Scale->new( $sx, $sy, $sz ) ); # but weld this more complex statement my $compass = uc( opposite_direction( line_to_canvas_direction( @{ $coords[0] }, @{ $coords[1] } ) ) ); # OLD: do not weld to a one-line block because the function could # get separated from its opening paren. 
# NEW: (30-jan-2021): keep one-line block together for stability $_[0]->code_handler( sub { $morexxxxxxxxxxxxxxxxxx .= $_[1] . ":" . $_[0] . "\n" } ); # another example; do not weld because the sub is not broken $wrapped->add_around_modifier( sub { push @tracelog => 'around 1'; $_[0]->(); } ); # but okay to weld here because the sub is broken $wrapped->add_around_modifier( sub { push @tracelog => 'around 1'; $_[0]->(); } );
#20...........
        },
    };

    my $ntests = 0 + keys %{$rtests};
    plan tests => $ntests;
}

###############
# EXECUTE TESTS
###############

foreach my $key ( sort keys %{$rtests} ) {
    my $output;
    my $sname  = $rtests->{$key}->{source};
    my $expect = $rtests->{$key}->{expect};
    my $pname  = $rtests->{$key}->{params};
    my $source = $rsources->{$sname};
    my $params = defined($pname) ? $rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets27.t0000644000175000017500000005542414373177245015322 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 wtc.wtc1 #2 wtc.wtc2 #3 wtc.wtc3 #4 wtc.wtc4 #5 wtc.wtc5 #6 wtc.wtc6 #7 dwic.def #8 dwic.dwic #9 wtc.wtc7 #10 rt144979.def #11 rt144979.rt144979 #12 bfvt.bfvt0 #13 bfvt.bfvt2 #14 bfvt.def #15 cpb.cpb #16 cpb.def #17 rt145706.def #18 olbxl.def #19 olbxl.olbxl1 # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'bfvt0' => "-bfvt=0", 'bfvt2' => "-bfvt=2", 'cpb' => "-cpb", 'def' => "", 'dwic' => "-wn -dwic", 'olbxl1' => "-olbxl=eval", 'rt144979' => "-xci -ce -lp", 'wtc1' => "-wtc=0 -dtc", 'wtc2' => "-wtc=1 -atc", 'wtc3' => "-wtc=m -atc", 'wtc4' => "-wtc=m -atc -dtc", 'wtc5' => "-wtc=b -atc -dtc -vtc=2", 'wtc6' => "-wtc=i -atc -dtc -vtc=2", 'wtc7' => "-wtc=h -atc -dtc -vtc=2", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'bfvt' => <<'----------', # combines with -bfvt>0 eval { require XSLoader; XSLoader::load( 'Sys::Syslog', $VERSION ); 1; } or do { require DynaLoader; push @ISA, 'DynaLoader'; bootstrap Sys::Syslog $VERSION; }; # combines with -bfvt=2 eval { ( $line, $cond ) = $self->_normalize_if_elif($line); 1; } or die sprintf "Error at line %d\nLine %d: %s\n%s", ( $line_info->start_line_num() ) x 2, $line, $@; # stable for bfvt<2; combines for bfvt=2; has ci my $domain = shift || eval { require Net::Domain; Net::Domain::hostfqdn(); } || ""; # stays combined for all bfvt; has ci my $domain = shift || eval { require Net::Domain; Net::Domain::hostfqdn(); } || ""; ---------- 'cpb' => <<'----------', foreach my $dir ( '05_lexer', '07_token', 
'08_regression', '11_util', '13_data', '15_transform' ) { my @perl = find_files( catdir( 't', 'data', $dir ) ); push @files, @perl; } ---------- 'dwic' => <<'----------', skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ], ); ---------- 'olbxl' => <<'----------', eval { require Ace }; @list = map { $frm{ ( /@(.*?)>/ ? $1 : $_ ) }++ ? () : ($_); } @list; $color = join( '/', sort { $color_value{$::a} <=> $color_value{$::b}; } keys %colors ); @sorted = sort { $SortDir * $PageTotal{$a} <=> $SortDir * $PageTotal{$b} }; ---------- 'rt144979' => <<'----------', # part 1 GetOptions( "format|f=s" => sub { my ( $n, $v ) = @_; if ( ( my $k = $formats{$v} ) ) { $format = $k; } else { die("--format must be 'system' or 'user'\n"); } return; }, ); # part 2 {{{ my $desc = $access ? "for -$op under use filetest 'access' $desc_tail" : "for -$op $desc_tail"; { local $SIG{__WARN__} = sub { my $w = shift; if ($w =~ /^File::stat ignores VMS ACLs/) { ++$vwarn; } elsif ( $w =~ /^File::stat ignores use filetest 'access'/) { ++$awarn; } else { $warnings .= $w; } }; $rv = eval "$access; -$op \$stat"; } }}} ---------- 'rt145706' => <<'----------', # some tests for default setting --use-feature=class, rt145706 class Example::Subclass1 : isa(Example::Base) { ... } class Example::Subclass2 : isa(Example::Base 2.345) { ... } class Example::Subclass3 : isa(Example::Base) 1.345 { ... } field $y : param(the_y_value); class Pointer 2.0 { field $x : param; field $y : param; method to_string() { return "($x, $y)"; } } ADJUST { $x = 0; } # these should not produce errors method paint => sub { ...; }; method painter => sub { ...; }; is( ( method Pack "a", "b", "c" ), "method,a,b,c" ); class ExtendsBasicAttributes is BasicAttributes{ ... } class BrokenExtendsBasicAttributes is BasicAttributes{ ... 
} class +Night with +Bad { public nine { return 'crazy' } }; my $x = field(50); ---------- 'wtc' => <<'----------', # both single and multiple line lists: @LoL = ( [ "fred", "barney", ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart", ], ); # single line ( $name, $body ) = ( $2, $3, ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow', ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green', }; # matches 'i' my @list = ( $xx, $yy ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle' ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1, ); }, )->pack( -side => 'left', ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string }, }, }, }; ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'wtc.wtc1' => { source => "wtc", params => "wtc1", expect => <<'#1...........', # both single and multiple line lists: @LoL = ( [ "fred", "barney" ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart" ] ); # single line ( $name, $body ) = ( $2, $3 ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow' ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green' }; # matches 'i' my @list = ( $xx, $yy ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle' ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1 ); } )->pack( -side => 'left' ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string } } } }; #1........... 
}, 'wtc.wtc2' => { source => "wtc", params => "wtc2", expect => <<'#2...........', # both single and multiple line lists: @LoL = ( [ "fred", "barney", ], [ "george", "jane", "elroy", ], [ "homer", "marge", "bart", ], ); # single line ( $name, $body, ) = ( $2, $3, ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow', ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green', }; # matches 'i' my @list = ( $xx, $yy, ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle', ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1, ); }, )->pack( -side => 'left', ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string }, }, }, }; #2........... }, 'wtc.wtc3' => { source => "wtc", params => "wtc3", expect => <<'#3...........', # both single and multiple line lists: @LoL = ( [ "fred", "barney", ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart", ], ); # single line ( $name, $body ) = ( $2, $3, ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow', ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green', }; # matches 'i' my @list = ( $xx, $yy, ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle', ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1, ); }, )->pack( -side => 'left', ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string }, }, }, }; #3........... 
}, 'wtc.wtc4' => { source => "wtc", params => "wtc4", expect => <<'#4...........', # both single and multiple line lists: @LoL = ( [ "fred", "barney" ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart" ], ); # single line ( $name, $body ) = ( $2, $3 ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow', ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green', }; # matches 'i' my @list = ( $xx, $yy, ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle', ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1 ); }, )->pack( -side => 'left' ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string } } }, }; #4........... }, 'wtc.wtc5' => { source => "wtc", params => "wtc5", expect => <<'#5...........', # both single and multiple line lists: @LoL = ( [ "fred", "barney" ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart" ], ); # single line ( $name, $body ) = ( $2, $3 ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow' ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green', }; # matches 'i' my @list = ( $xx, $yy, ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle', ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1 ); }, )->pack( -side => 'left' ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string } } } }; #5........... 
}, 'wtc.wtc6' => { source => "wtc", params => "wtc6", expect => <<'#6...........', # both single and multiple line lists: @LoL = ( [ "fred", "barney" ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart" ] ); # single line ( $name, $body ) = ( $2, $3 ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow' ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green', }; # matches 'i' my @list = ( $xx, $yy, ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle' ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1 ); }, )->pack( -side => 'left' ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string } } } }; #6........... }, 'dwic.def' => { source => "dwic", params => "def", expect => <<'#7...........', skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ], ); #7........... }, 'dwic.dwic' => { source => "dwic", params => "dwic", expect => <<'#8...........', skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ] ); #8........... 
}, 'wtc.wtc7' => { source => "wtc", params => "wtc7", expect => <<'#9...........', # both single and multiple line lists: @LoL = ( [ "fred", "barney" ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart" ] ); # single line ( $name, $body ) = ( $2, $3 ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow' ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green', }; # matches 'i' my @list = ( $xx, $yy ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle' ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1 ); }, )->pack( -side => 'left' ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string } } } }; #9........... }, 'rt144979.def' => { source => "rt144979", params => "def", expect => <<'#10...........', # part 1 GetOptions( "format|f=s" => sub { my ( $n, $v ) = @_; if ( ( my $k = $formats{$v} ) ) { $format = $k; } else { die("--format must be 'system' or 'user'\n"); } return; }, ); # part 2 { { { my $desc = $access ? "for -$op under use filetest 'access' $desc_tail" : "for -$op $desc_tail"; { local $SIG{__WARN__} = sub { my $w = shift; if ( $w =~ /^File::stat ignores VMS ACLs/ ) { ++$vwarn; } elsif ( $w =~ /^File::stat ignores use filetest 'access'/ ) { ++$awarn; } else { $warnings .= $w; } }; $rv = eval "$access; -$op \$stat"; } } } } #10........... }, 'rt144979.rt144979' => { source => "rt144979", params => "rt144979", expect => <<'#11...........', # part 1 GetOptions( "format|f=s" => sub { my ( $n, $v ) = @_; if ( ( my $k = $formats{$v} ) ) { $format = $k; } else { die("--format must be 'system' or 'user'\n"); } return; }, ); # part 2 { { { my $desc = $access ? 
"for -$op under use filetest 'access' $desc_tail" : "for -$op $desc_tail"; { local $SIG{__WARN__} = sub { my $w = shift; if ( $w =~ /^File::stat ignores VMS ACLs/ ) { ++$vwarn; } elsif ( $w =~ /^File::stat ignores use filetest 'access'/ ) { ++$awarn; } else { $warnings .= $w; } }; $rv = eval "$access; -$op \$stat"; } } } } #11........... }, 'bfvt.bfvt0' => { source => "bfvt", params => "bfvt0", expect => <<'#12...........', # combines with -bfvt>0 eval { require XSLoader; XSLoader::load( 'Sys::Syslog', $VERSION ); 1; } or do { require DynaLoader; push @ISA, 'DynaLoader'; bootstrap Sys::Syslog $VERSION; }; # combines with -bfvt=2 eval { ( $line, $cond ) = $self->_normalize_if_elif($line); 1; } or die sprintf "Error at line %d\nLine %d: %s\n%s", ( $line_info->start_line_num() ) x 2, $line, $@; # stable for bfvt<2; combines for bfvt=2; has ci my $domain = shift || eval { require Net::Domain; Net::Domain::hostfqdn(); } || ""; # stays combined for all bfvt; has ci my $domain = shift || eval { require Net::Domain; Net::Domain::hostfqdn(); } || ""; #12........... }, 'bfvt.bfvt2' => { source => "bfvt", params => "bfvt2", expect => <<'#13...........', # combines with -bfvt>0 eval { require XSLoader; XSLoader::load( 'Sys::Syslog', $VERSION ); 1; } or do { require DynaLoader; push @ISA, 'DynaLoader'; bootstrap Sys::Syslog $VERSION; }; # combines with -bfvt=2 eval { ( $line, $cond ) = $self->_normalize_if_elif($line); 1; } or die sprintf "Error at line %d\nLine %d: %s\n%s", ( $line_info->start_line_num() ) x 2, $line, $@; # stable for bfvt<2; combines for bfvt=2; has ci my $domain = shift || eval { require Net::Domain; Net::Domain::hostfqdn(); } || ""; # stays combined for all bfvt; has ci my $domain = shift || eval { require Net::Domain; Net::Domain::hostfqdn(); } || ""; #13........... 
}, 'bfvt.def' => { source => "bfvt", params => "def", expect => <<'#14...........', # combines with -bfvt>0 eval { require XSLoader; XSLoader::load( 'Sys::Syslog', $VERSION ); 1; } or do { require DynaLoader; push @ISA, 'DynaLoader'; bootstrap Sys::Syslog $VERSION; }; # combines with -bfvt=2 eval { ( $line, $cond ) = $self->_normalize_if_elif($line); 1; } or die sprintf "Error at line %d\nLine %d: %s\n%s", ( $line_info->start_line_num() ) x 2, $line, $@; # stable for bfvt<2; combines for bfvt=2; has ci my $domain = shift || eval { require Net::Domain; Net::Domain::hostfqdn(); } || ""; # stays combined for all bfvt; has ci my $domain = shift || eval { require Net::Domain; Net::Domain::hostfqdn(); } || ""; #14........... }, 'cpb.cpb' => { source => "cpb", params => "cpb", expect => <<'#15...........', foreach my $dir ( '05_lexer', '07_token', '08_regression', '11_util', '13_data', '15_transform' ) { my @perl = find_files( catdir( 't', 'data', $dir ) ); push @files, @perl; } #15........... }, 'cpb.def' => { source => "cpb", params => "def", expect => <<'#16...........', foreach my $dir ( '05_lexer', '07_token', '08_regression', '11_util', '13_data', '15_transform' ) { my @perl = find_files( catdir( 't', 'data', $dir ) ); push @files, @perl; } #16........... }, 'rt145706.def' => { source => "rt145706", params => "def", expect => <<'#17...........', # some tests for default setting --use-feature=class, rt145706 class Example::Subclass1 : isa(Example::Base) { ... } class Example::Subclass2 : isa(Example::Base 2.345) { ... } class Example::Subclass3 : isa(Example::Base) 1.345 { ... } field $y : param(the_y_value); class Pointer 2.0 { field $x : param; field $y : param; method to_string() { return "($x, $y)"; } } ADJUST { $x = 0; } # these should not produce errors method paint => sub { ...; }; method painter => sub { ...; }; is( ( method Pack "a", "b", "c" ), "method,a,b,c" ); class ExtendsBasicAttributes is BasicAttributes { ... 
} class BrokenExtendsBasicAttributes is BasicAttributes { ... } class +Night with +Bad { public nine { return 'crazy' } }; my $x = field(50); #17........... }, 'olbxl.def' => { source => "olbxl", params => "def", expect => <<'#18...........', eval { require Ace }; @list = map { $frm{ ( /@(.*?)>/ ? $1 : $_ ) }++ ? () : ($_); } @list; $color = join( '/', sort { $color_value{$::a} <=> $color_value{$::b}; } keys %colors ); @sorted = sort { $SortDir * $PageTotal{$a} <=> $SortDir * $PageTotal{$b} }; #18........... }, 'olbxl.olbxl1' => { source => "olbxl", params => "olbxl1", expect => <<'#19...........', eval { require Ace; }; @list = map { $frm{ ( /@(.*?)>/ ? $1 : $_ ) }++ ? () : ($_); } @list; $color = join( '/', sort { $color_value{$::a} <=> $color_value{$::b}; } keys %colors ); @sorted = sort { $SortDir * $PageTotal{$a} <=> $SortDir * $PageTotal{$b} }; #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<<STDERR>>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets4.t0000644000175000017500000004146614373177246015237 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 gnu1.gnu #2 gnu2.def #3 gnu2.gnu #4 gnu3.def #5 gnu3.gnu #6 gnu4.def #7 gnu4.gnu #8 hanging_side_comments1.def #9 hanging_side_comments2.def #10 hash1.def #11 hashbang.def #12 here1.def #13 html1.def #14 html1.html #15 ident1.def #16 if1.def #17 iscl1.def #18 iscl1.iscl #19 label1.def #20 lextest1.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'gnu' => "-gnu", 'html' => <<'----------', -fmt="html" -nts ---------- 'iscl' => "-iscl", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'gnu1' => <<'----------', @common_sometimes = ( "aclocal.m4", "acconfig.h", "config.h.top", "config.h.bot", "stamp-h.in", 'stamp-vti' ); ---------- 'gnu2' => <<'----------', $search_mb = $menu_bar->Menubutton( '-text' => 'Search', '-relief' => 'raised', '-borderwidth' => 2, )->pack( '-side' => 'left', '-padx' => 2 ); ---------- 'gnu3' => <<'----------', $output_rules .= &file_contents_with_transform( 's/\@TEXI\@/' . $info_cursor . '/g; ' . 's/\@VTI\@/' . $vti . '/g; ' . 's/\@VTEXI\@/' . $vtexi . '/g;' . 's,\@MDDIR\@,' . $conf_pat . ',g;', 'texi-vers'); ---------- 'gnu4' => <<'----------', my $mzef = Bio::Tools::MZEF->new( '-file' => Bio::Root::IO->catfile("t", "genomic-seq.mzef")); ---------- 'hanging_side_comments1' => <<'----------', $valuestr .= $value . 
" " ; # with a trailing space in case there are multiple values # for this tag (allowed in GFF2 and .ace format) ---------- 'hanging_side_comments2' => <<'----------', # keep '=' lined up even with hanging side comments $ax=1;# side comment # hanging side comment $boondoggle=5;# side comment $beetle=5;# side comment # hanging side comment $d=3; ---------- 'hash1' => <<'----------', %TV=(flintstones=>{series=>"flintstones",nights=>[qw(monday thursday friday)], members=>[{name=>"fred",role=>"lead",age=>36,},{name=>"wilma",role=>"wife", age=>31,},{name=>"pebbles",role=>"kid",age=>4,},],},jetsons=>{series=>"jetsons", nights=>[qw(wednesday saturday)],members=>[{name=>"george",role=>"lead",age=>41, },{name=>"jane",role=>"wife",age=>39,},{name=>"elroy",role=>"kid",age=>9,},],}, simpsons=>{series=>"simpsons",nights=>[qw(monday)],members=>[{name=>"homer", role=>"lead",age=>34,},{name=>"marge",role=>"wife",age=>37,},{name=>"bart", role=>"kid",age=>11,},],},); ---------- 'hashbang' => <<'----------', #!/usr/bin/perl ---------- 'here1' => <<'----------', is( <<~`END`, "ok\n", '<<~`HEREDOC`' ); $Perl -le "print 'ok'" END ---------- 'html1' => <<'----------', if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked" } else { $editlblk = "off"; $editlblkchecked = "unchecked" } ---------- 'ident1' => <<'----------', package A; sub new { print "A::new! $_[0] $_[1]\n"; return 1; } package main; my $scanner = new A::() ; $scanner = new A::; $scanner = new A 'a'; ---------- 'if1' => <<'----------', # one-line blocks if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked" } else { $editlblk = "off"; $editlblkchecked = "unchecked" } ---------- 'iscl1' => <<'----------', # -iscl will not allow alignment of hanging side comments (currently) $gsmatch = ( $sub >= 50 ) ? 
"equal" : "lequal"; # Force an equal match for # dev, but be more forgiving # for releases ---------- 'label1' => <<'----------', INIT : { $a++; print "looping with label INIT:, a=$a\n"; if ($a<10) {goto INIT} } package: { print "hello!\n"; } sub: { print "hello!\n"; } ---------- 'lextest1' => <<'----------', $_= <<'EOL'; $url = new URI::URL "http://www/"; die if $url eq "xXx"; EOL LOOP:{print(" digits"),redo LOOP if/\G\d+\b[,.;]?\s*/gc;print(" lowercase"), redo LOOP if/\G[a-z]+\b[,.;]?\s*/gc;print(" UPPERCASE"), redo LOOP if/\G[A-Z]+\b[,.;]?\s*/gc;print(" Capitalized"), redo LOOP if/\G[A-Z][a-z]+\b[,.;]?\s*/gc; print(" MiXeD"),redo LOOP if/\G[A-Za-z]+\b[,.;]?\s*/gc;print( " alphanumeric"),redo LOOP if/\G[A-Za-z0-9]+\b[,.;]?\s*/gc;print(" line-noise" ),redo LOOP if/\G[^A-Za-z0-9]+/gc;print". That's all!\n";} ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'gnu1.gnu' => { source => "gnu1", params => "gnu", expect => <<'#1...........', @common_sometimes = ( "aclocal.m4", "acconfig.h", "config.h.top", "config.h.bot", "stamp-h.in", 'stamp-vti' ); #1........... }, 'gnu2.def' => { source => "gnu2", params => "def", expect => <<'#2...........', $search_mb = $menu_bar->Menubutton( '-text' => 'Search', '-relief' => 'raised', '-borderwidth' => 2, )->pack( '-side' => 'left', '-padx' => 2 ); #2........... }, 'gnu2.gnu' => { source => "gnu2", params => "gnu", expect => <<'#3...........', $search_mb = $menu_bar->Menubutton( '-text' => 'Search', '-relief' => 'raised', '-borderwidth' => 2, )->pack('-side' => 'left', '-padx' => 2); #3........... }, 'gnu3.def' => { source => "gnu3", params => "def", expect => <<'#4...........', $output_rules .= &file_contents_with_transform( 's/\@TEXI\@/' . $info_cursor . '/g; ' . 's/\@VTI\@/' . $vti . '/g; ' . 's/\@VTEXI\@/' . $vtexi . '/g;' . 's,\@MDDIR\@,' . $conf_pat . ',g;', 'texi-vers' ); #4........... 
}, 'gnu3.gnu' => { source => "gnu3", params => "gnu", expect => <<'#5...........', $output_rules .= &file_contents_with_transform( 's/\@TEXI\@/' . $info_cursor . '/g; ' . 's/\@VTI\@/' . $vti . '/g; ' . 's/\@VTEXI\@/' . $vtexi . '/g;' . 's,\@MDDIR\@,' . $conf_pat . ',g;', 'texi-vers' ); #5........... }, 'gnu4.def' => { source => "gnu4", params => "def", expect => <<'#6...........', my $mzef = Bio::Tools::MZEF->new( '-file' => Bio::Root::IO->catfile( "t", "genomic-seq.mzef" ) ); #6........... }, 'gnu4.gnu' => { source => "gnu4", params => "gnu", expect => <<'#7...........', my $mzef = Bio::Tools::MZEF->new( '-file' => Bio::Root::IO->catfile("t", "genomic-seq.mzef")); #7........... }, 'hanging_side_comments1.def' => { source => "hanging_side_comments1", params => "def", expect => <<'#8...........', $valuestr .= $value . " "; # with a trailing space in case there are multiple values # for this tag (allowed in GFF2 and .ace format) #8........... }, 'hanging_side_comments2.def' => { source => "hanging_side_comments2", params => "def", expect => <<'#9...........', # keep '=' lined up even with hanging side comments $ax = 1; # side comment # hanging side comment $boondoggle = 5; # side comment $beetle = 5; # side comment # hanging side comment $d = 3; #9........... 
}, 'hash1.def' => { source => "hash1", params => "def", expect => <<'#10...........', %TV = ( flintstones => { series => "flintstones", nights => [qw(monday thursday friday)], members => [ { name => "fred", role => "lead", age => 36, }, { name => "wilma", role => "wife", age => 31, }, { name => "pebbles", role => "kid", age => 4, }, ], }, jetsons => { series => "jetsons", nights => [qw(wednesday saturday)], members => [ { name => "george", role => "lead", age => 41, }, { name => "jane", role => "wife", age => 39, }, { name => "elroy", role => "kid", age => 9, }, ], }, simpsons => { series => "simpsons", nights => [qw(monday)], members => [ { name => "homer", role => "lead", age => 34, }, { name => "marge", role => "wife", age => 37, }, { name => "bart", role => "kid", age => 11, }, ], }, ); #10........... }, 'hashbang.def' => { source => "hashbang", params => "def", expect => <<'#11...........', #!/usr/bin/perl #11........... }, 'here1.def' => { source => "here1", params => "def", expect => <<'#12...........', is( <<~`END`, "ok\n", '<<~`HEREDOC`' ); $Perl -le "print 'ok'" END #12........... }, 'html1.def' => { source => "html1", params => "def", expect => <<'#13...........', if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked" } else { $editlblk = "off"; $editlblkchecked = "unchecked" } #13........... }, 'html1.html' => { source => "html1", params => "html", expect => <<'#14...........', perltidy

perltidy


if   ( $editlblk eq 1 ) { $editlblk = "on";  $editlblkchecked = "checked" }
else                    { $editlblk = "off"; $editlblkchecked = "unchecked" }
#14........... }, 'ident1.def' => { source => "ident1", params => "def", expect => <<'#15...........', package A; sub new { print "A::new! $_[0] $_[1]\n"; return 1; } package main; my $scanner = new A::(); $scanner = new A::; $scanner = new A 'a'; #15........... }, 'if1.def' => { source => "if1", params => "def", expect => <<'#16...........', # one-line blocks if ( $editlblk eq 1 ) { $editlblk = "on"; $editlblkchecked = "checked" } else { $editlblk = "off"; $editlblkchecked = "unchecked" } #16........... }, 'iscl1.def' => { source => "iscl1", params => "def", expect => <<'#17...........', # -iscl will not allow alignment of hanging side comments (currently) $gsmatch = ( $sub >= 50 ) ? "equal" : "lequal"; # Force an equal match for # dev, but be more forgiving # for releases #17........... }, 'iscl1.iscl' => { source => "iscl1", params => "iscl", expect => <<'#18...........', # -iscl will not allow alignment of hanging side comments (currently) $gsmatch = ( $sub >= 50 ) ? "equal" : "lequal"; # Force an equal match for # dev, but be more forgiving # for releases #18........... }, 'label1.def' => { source => "label1", params => "def", expect => <<'#19...........', INIT: { $a++; print "looping with label INIT:, a=$a\n"; if ( $a < 10 ) { goto INIT } } package: { print "hello!\n"; } sub: { print "hello!\n"; } #19........... }, 'lextest1.def' => { source => "lextest1", params => "def", expect => <<'#20...........', $_ = <<'EOL'; $url = new URI::URL "http://www/"; die if $url eq "xXx"; EOL LOOP: { print(" digits"), redo LOOP if /\G\d+\b[,.;]?\s*/gc; print(" lowercase"), redo LOOP if /\G[a-z]+\b[,.;]?\s*/gc; print(" UPPERCASE"), redo LOOP if /\G[A-Z]+\b[,.;]?\s*/gc; print(" Capitalized"), redo LOOP if /\G[A-Z][a-z]+\b[,.;]?\s*/gc; print(" MiXeD"), redo LOOP if /\G[A-Za-z]+\b[,.;]?\s*/gc; print(" alphanumeric"), redo LOOP if /\G[A-Za-z0-9]+\b[,.;]?\s*/gc; print(" line-noise"), redo LOOP if /\G[^A-Za-z0-9]+/gc; print ". That's all!\n"; } #20........... 
}, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? $rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<<STDERR>>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets1.t0000644000175000017500000003734314373177244015231 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 105484.def #2 align1.def #3 align2.def #4 align3.def #5 align4.def #6 align5.def #7 align6.def #8 align7.def #9 align8.def #10 align9.def #11 andor1.def #12 andor10.def #13 andor2.def #14 andor3.def #15 andor4.def #16 andor5.def #17 andor6.def #18 andor7.def #19 andor8.def #20 andor9.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { '105484' => <<'----------', switch (1) { case x { 2 } else { } } ---------- 'align1' => <<'----------', return ( $fetch_key eq $fk && $store_key eq $sk && $fetch_value eq $fv && $store_value eq $sv && $_ eq 'original' ); ---------- 'align2' => <<'----------', same = ( ( $aP eq $bP ) && ( $aS eq $bS ) && ( $aT eq $bT ) && ( $a->{'title'} eq $b->{'title'} ) && ( $a->{'href'} eq $b->{'href'} ) ); ---------- 'align3' => <<'----------', # This greatly improved after dropping 'ne' and 'eq': if ( $dir eq $updir and # if we have an updir @collapsed and # and something to collapse length $collapsed[-1] and # and its not the rootdir $collapsed[-1] ne $updir and # nor another updir $collapsed[-1] ne $curdir # nor the curdir ) { $bla} ---------- 'align4' => <<'----------', # removed 'eq' and '=~' from alignment tokens to get alignment of '?'s my $salute = $name eq $EMPTY_STR ? 'Customer' : $name =~ m/\A((?:Sir|Dame) \s+ \S+) /xms ? $1 : $name =~ m/(.*), \s+ Ph[.]?D \z /xms ? 
"Dr $1" : $name; ---------- 'align5' => <<'----------', # some lists printline( "Broadcast", &bintodq($b), ( $b, $mask, $bcolor, 0 ) ); printline( "HostMin", &bintodq($hmin), ( $hmin, $mask, $bcolor, 0 ) ); printline( "HostMax", &bintodq($hmax), ( $hmax, $mask, $bcolor, 0 ) ); ---------- 'align6' => <<'----------', # align opening parens if ( ( index( $msg_line_lc, $nick1 ) != -1 ) || ( index( $msg_line_lc, $nick2 ) != -1 ) || ( index( $msg_line_lc, $nick3 ) != -1 ) ) { do_something(); } ---------- 'align7' => <<'----------', # Alignment with two fat commas in second line my $ct = Courriel::Header::ContentType->new( mime_type => 'multipart/alternative', attributes => { boundary => unique_boundary }, ); ---------- 'align8' => <<'----------', # aligning '=' and padding 'if' if ( $tag == 263 ) { $bbi->{"Info.Thresholding"} = $value } elsif ( $tag == 264 ) { $bbi->{"Info.CellWidth"} = $value } elsif ( $tag == 265 ) { $bbi->{"Info.CellLength"} = $value } ---------- 'align9' => <<'----------', # test of aligning || my $os = ( $ExtUtils::MM_Unix::Is_OS2 || 0 ) + ( $ExtUtils::MM_Unix::Is_Mac || 0 ) + ( $ExtUtils::MM_Unix::Is_Win32 || 0 ) + ( $ExtUtils::MM_Unix::Is_Dos || 0 ) + ( $ExtUtils::MM_Unix::Is_VMS || 0 ); ---------- 'andor1' => <<'----------', return 1 if $det_a < 0 and $det_b > 0 or $det_a > 0 and $det_b < 0; ---------- 'andor10' => <<'----------', if ( ( ($a) and ( $b == 13 ) and ( $c - 24 = 0 ) and ("test") and ( $rudolph eq "reindeer" or $rudolph eq "red nosed" ) and $test ) or ( $nobody and ( $noone or $none ) ) ) { $i++; } ---------- 'andor2' => <<'----------', # breaks at = or at && but not both my $success = ( system("$Config{cc} -o $te $tc $libs $HIDE") == 0 ) && -e $te ? 
1 : 0; ---------- 'andor3' => <<'----------', ok( ( $obj->name() eq $obj2->name() ) and ( $obj->version() eq $obj2->version() ) and ( $obj->help() eq $obj2->help() ) ); ---------- 'andor4' => <<'----------', if ( !$verbose_error && ( !$options->{'log'} && ( ( $options->{'verbose'} & 8 ) || ( $options->{'verbose'} & 16 ) || ( $options->{'verbose'} & 32 ) || ( $options->{'verbose'} & 64 ) ) ) ) ---------- 'andor5' => <<'----------', # two levels of && with side comments if ( defined &syscopy && \&syscopy != \&copy && !$to_a_handle && !( $from_a_handle && $^O eq 'os2' ) # OS/2 cannot handle && !( $from_a_handle && $^O eq 'mpeix' ) # and neither can MPE/iX. ) { return syscopy( $from, $to ); } ---------- 'andor6' => <<'----------', # Example of nested ands and ors sub is_miniwhile { # check for one-line loop (`foo() while $y--') my $op = shift; return ( !null($op) and null( $op->sibling ) and $op->ppaddr eq "pp_null" and class($op) eq "UNOP" and ( ( $op->first->ppaddr =~ /^pp_(and|or)$/ and $op->first->first->sibling->ppaddr eq "pp_lineseq" ) or ( $op->first->ppaddr eq "pp_lineseq" and not null $op->first->first->sibling and $op->first->first->sibling->ppaddr eq "pp_unstack" ) ) ); } ---------- 'andor7' => <<'----------', # original is single line: $a = 1 if $l and !$r or !$l and $r; ---------- 'andor8' => <<'----------', # original is broken: $a = 1 if $l and !$r or !$l and $r; ---------- 'andor9' => <<'----------', if ( ( ( $old_new and $old_new eq 'changed' ) and ( $db_new and $db_new eq 'changed' ) and ( not defined $old_db ) ) or ( ( $old_new and $old_new eq 'changed' ) and ( $db_new and $db_new eq 'new' ) and ( $old_db and $old_db eq 'new' ) ) or ( ( $old_new and $old_new eq 'new' ) and ( $db_new and $db_new eq 'new' ) and ( not defined $old_db ) ) ) { return "update"; } ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { '105484.def' => { source => "105484", params => "def", expect 
=> <<'#1...........', switch (1) { case x { 2 } else { } } #1........... }, 'align1.def' => { source => "align1", params => "def", expect => <<'#2...........', return ( $fetch_key eq $fk && $store_key eq $sk && $fetch_value eq $fv && $store_value eq $sv && $_ eq 'original' ); #2........... }, 'align2.def' => { source => "align2", params => "def", expect => <<'#3...........', same = ( ( $aP eq $bP ) && ( $aS eq $bS ) && ( $aT eq $bT ) && ( $a->{'title'} eq $b->{'title'} ) && ( $a->{'href'} eq $b->{'href'} ) ); #3........... }, 'align3.def' => { source => "align3", params => "def", expect => <<'#4...........', # This greatly improved after dropping 'ne' and 'eq': if ( $dir eq $updir and # if we have an updir @collapsed and # and something to collapse length $collapsed[-1] and # and its not the rootdir $collapsed[-1] ne $updir and # nor another updir $collapsed[-1] ne $curdir # nor the curdir ) { $bla; } #4........... }, 'align4.def' => { source => "align4", params => "def", expect => <<'#5...........', # removed 'eq' and '=~' from alignment tokens to get alignment of '?'s my $salute = $name eq $EMPTY_STR ? 'Customer' : $name =~ m/\A((?:Sir|Dame) \s+ \S+) /xms ? $1 : $name =~ m/(.*), \s+ Ph[.]?D \z /xms ? "Dr $1" : $name; #5........... }, 'align5.def' => { source => "align5", params => "def", expect => <<'#6...........', # some lists printline( "Broadcast", &bintodq($b), ( $b, $mask, $bcolor, 0 ) ); printline( "HostMin", &bintodq($hmin), ( $hmin, $mask, $bcolor, 0 ) ); printline( "HostMax", &bintodq($hmax), ( $hmax, $mask, $bcolor, 0 ) ); #6........... }, 'align6.def' => { source => "align6", params => "def", expect => <<'#7...........', # align opening parens if ( ( index( $msg_line_lc, $nick1 ) != -1 ) || ( index( $msg_line_lc, $nick2 ) != -1 ) || ( index( $msg_line_lc, $nick3 ) != -1 ) ) { do_something(); } #7........... 
}, 'align7.def' => { source => "align7", params => "def", expect => <<'#8...........', # Alignment with two fat commas in second line my $ct = Courriel::Header::ContentType->new( mime_type => 'multipart/alternative', attributes => { boundary => unique_boundary }, ); #8........... }, 'align8.def' => { source => "align8", params => "def", expect => <<'#9...........', # aligning '=' and padding 'if' if ( $tag == 263 ) { $bbi->{"Info.Thresholding"} = $value } elsif ( $tag == 264 ) { $bbi->{"Info.CellWidth"} = $value } elsif ( $tag == 265 ) { $bbi->{"Info.CellLength"} = $value } #9........... }, 'align9.def' => { source => "align9", params => "def", expect => <<'#10...........', # test of aligning || my $os = ( $ExtUtils::MM_Unix::Is_OS2 || 0 ) + ( $ExtUtils::MM_Unix::Is_Mac || 0 ) + ( $ExtUtils::MM_Unix::Is_Win32 || 0 ) + ( $ExtUtils::MM_Unix::Is_Dos || 0 ) + ( $ExtUtils::MM_Unix::Is_VMS || 0 ); #10........... }, 'andor1.def' => { source => "andor1", params => "def", expect => <<'#11...........', return 1 if $det_a < 0 and $det_b > 0 or $det_a > 0 and $det_b < 0; #11........... }, 'andor10.def' => { source => "andor10", params => "def", expect => <<'#12...........', if ( ( ($a) and ( $b == 13 ) and ( $c - 24 = 0 ) and ("test") and ( $rudolph eq "reindeer" or $rudolph eq "red nosed" ) and $test ) or ( $nobody and ( $noone or $none ) ) ) { $i++; } #12........... }, 'andor2.def' => { source => "andor2", params => "def", expect => <<'#13...........', # breaks at = or at && but not both my $success = ( system("$Config{cc} -o $te $tc $libs $HIDE") == 0 ) && -e $te ? 1 : 0; #13........... }, 'andor3.def' => { source => "andor3", params => "def", expect => <<'#14...........', ok( ( $obj->name() eq $obj2->name() ) and ( $obj->version() eq $obj2->version() ) and ( $obj->help() eq $obj2->help() ) ); #14........... 
}, 'andor4.def' => { source => "andor4", params => "def", expect => <<'#15...........', if ( !$verbose_error && ( !$options->{'log'} && ( ( $options->{'verbose'} & 8 ) || ( $options->{'verbose'} & 16 ) || ( $options->{'verbose'} & 32 ) || ( $options->{'verbose'} & 64 ) ) ) ) #15........... }, 'andor5.def' => { source => "andor5", params => "def", expect => <<'#16...........', # two levels of && with side comments if ( defined &syscopy && \&syscopy != \&copy && !$to_a_handle && !( $from_a_handle && $^O eq 'os2' ) # OS/2 cannot handle && !( $from_a_handle && $^O eq 'mpeix' ) # and neither can MPE/iX. ) { return syscopy( $from, $to ); } #16........... }, 'andor6.def' => { source => "andor6", params => "def", expect => <<'#17...........', # Example of nested ands and ors sub is_miniwhile { # check for one-line loop (`foo() while $y--') my $op = shift; return ( !null($op) and null( $op->sibling ) and $op->ppaddr eq "pp_null" and class($op) eq "UNOP" and ( ( $op->first->ppaddr =~ /^pp_(and|or)$/ and $op->first->first->sibling->ppaddr eq "pp_lineseq" ) or ( $op->first->ppaddr eq "pp_lineseq" and not null $op->first->first->sibling and $op->first->first->sibling->ppaddr eq "pp_unstack" ) ) ); } #17........... }, 'andor7.def' => { source => "andor7", params => "def", expect => <<'#18...........', # original is single line: $a = 1 if $l and !$r or !$l and $r; #18........... }, 'andor8.def' => { source => "andor8", params => "def", expect => <<'#19...........', # original is broken: $a = 1 if $l and !$r or !$l and $r; #19........... 
}, 'andor9.def' => { source => "andor9", params => "def", expect => <<'#20...........', if ( ( ( $old_new and $old_new eq 'changed' ) and ( $db_new and $db_new eq 'changed' ) and ( not defined $old_db ) ) or ( ( $old_new and $old_new eq 'changed' ) and ( $db_new and $db_new eq 'new' ) and ( $old_db and $old_db eq 'new' ) ) or ( ( $old_new and $old_new eq 'new' ) and ( $db_new and $db_new eq 'new' ) and ( not defined $old_db ) ) ) { return "update"; } #20........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? $rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<<STDERR>>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/testwide.pl.src0000644000175000017500000000037312563255655016064 0ustar stevesteve%pangrams=("Plain","ASCII", "Zwölf große Boxkämpfer jagen Vik quer über den Sylter.","DE", "Jeż wlókł gęś. Uf! Bądź choć przy nim, stań!","PL", "Любя, съешь щипцы, — вздохнёт мэр, — кайф жгуч.","RU"); Perl-Tidy-20230309/t/testwide-tidy.pl.srctdy0000644000175000017500000000040214174123067017534 0ustar stevesteve# really simple @a = ( "Plain", "Zwölf große Boxkämpfer jagen Vik quer über den Sylter.", "Jeż wlókł gęś. Uf! Bądź choć przy nim, stań!", "Любя, съешь щипцы, — вздохнёт мэр, — кайф жгуч.", ); Perl-Tidy-20230309/t/snippets9.t0000644000175000017500000002707114373177246015240 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 rt70747.rt70747 #2 rt74856.def #3 rt78156.def #4 rt78764.def #5 rt79813.def #6 rt79947.def #7 rt80645.def #8 rt81852.def #9 rt81852.rt81852 #10 rt81854.def #11 rt87502.def #12 rt93197.def #13 rt94338.def #14 rt95419.def #15 rt95708.def #16 rt96021.def #17 rt96101.def #18 rt98902.def #19 rt98902.rt98902 #20 rt99961.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'rt70747' => "-i=2", 'rt81852' => <<'----------', -wn -act=2 ---------- 'rt98902' => "-boc", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'rt70747' => <<'----------', coerce Q2RawStatGroupArray, from ArrayRef [Q2StatGroup], via { [ map { my $g = $_->as_hash; $g->{stats} = [ map { scalar $_->as_array } @{ $g->{stats} } ]; $g; } @$_; ] }; ---------- 'rt74856' => <<'----------', { my $foo = '1'; #<<< my $bar = (test()) ? 
'some value' : undef; #>>> my $baz = 'something else'; } ---------- 'rt78156' => <<'----------', package Some::Class 2.012; ---------- 'rt78764' => <<'----------', qr/3/ ~~ ['1234'] ? 1 : 0; map { $_ ~~ [ '0', '1' ] ? 'x' : 'o' } @a; ---------- 'rt79813' => <<'----------', my %hash = ( a => { bbbbbbbbb => { cccccccccc => 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', }, },); ---------- 'rt79947' => <<'----------', try { croak "An Error!"; } catch ($error) { print STDERR $error . "\n"; } ---------- 'rt80645' => <<'----------', BEGIN { $^W = 1; } use warnings; use strict; @$ = 'test'; print $#{$}; ---------- 'rt81852' => <<'----------', do { { next if ( $n % 2 ); print $n, "\n"; } } while ( $n++ < 10 ); ---------- 'rt81854' => <<'----------', return "this is a descriptive error message" if $res->is_error or not length $data; ---------- 'rt87502' => <<'----------', if ( @ARGV ~~ { map { $_ => 1 } qw(re restart reload) } ) { # CODE } ---------- 'rt93197' => <<'----------', $to = $to->{$_} ||= {} for @key; if (1) {2;} else {3;} ---------- 'rt94338' => <<'----------', # for-loop in a parenthesized block-map triggered an error message map( { foreach my $item ( '0', '1' ) { print $item} } qw(a b c) ); ---------- 'rt95419' => <<'----------', case "blah" => sub { { a => 1 } }; ---------- 'rt95708' => <<'----------', use strict; use JSON; my $ref = { when => time(), message => 'abc' }; my $json = encode_json { when => time(), message => 'abc' }; my $json2 = encode_json + { when => time(), message => 'abc' }; ---------- 'rt96021' => <<'----------', $a->@*; $a->**; $a->$*; $a->&*; $a->%*; $a->$#* ---------- 'rt96101' => <<'----------', # Example for rt.cpan.org #96101; Perltidy not properly formatting subroutine # references inside subroutine execution. 
# closing brace of second sub should get outdented here sub startup { my $self = shift; $self->plugin( 'authentication' => { 'autoload_user' => 1, 'session_key' => rand(), 'load_user' => sub { return HaloVP::Users->load(@_); }, 'validate_user' => sub { return HaloVP::Users->login(@_); } } ); } ---------- 'rt98902' => <<'----------', my %foo = ( alpha => 1, beta => 2, gamma => 3, ); my @bar = map { { number => $_, character => chr $_, padding => ( ' ' x $_ ), } } ( 0 .. 32 ); ---------- 'rt99961' => <<'----------', %thing = %{ print qq[blah1\n]; $b; }; ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'rt70747.rt70747' => { source => "rt70747", params => "rt70747", expect => <<'#1...........', coerce Q2RawStatGroupArray, from ArrayRef [Q2StatGroup], via { [ map { my $g = $_->as_hash; $g->{stats} = [ map { scalar $_->as_array } @{ $g->{stats} } ]; $g; } @$_; ] }; #1........... }, 'rt74856.def' => { source => "rt74856", params => "def", expect => <<'#2...........', { my $foo = '1'; #<<< my $bar = (test()) ? 'some value' : undef; #>>> my $baz = 'something else'; } #2........... }, 'rt78156.def' => { source => "rt78156", params => "def", expect => <<'#3...........', package Some::Class 2.012; #3........... }, 'rt78764.def' => { source => "rt78764", params => "def", expect => <<'#4...........', qr/3/ ~~ ['1234'] ? 1 : 0; map { $_ ~~ [ '0', '1' ] ? 'x' : 'o' } @a; #4........... }, 'rt79813.def' => { source => "rt79813", params => "def", expect => <<'#5...........', my %hash = ( a => { bbbbbbbbb => { cccccccccc => 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', }, }, ); #5........... }, 'rt79947.def' => { source => "rt79947", params => "def", expect => <<'#6...........', try { croak "An Error!"; } catch ($error) { print STDERR $error . "\n"; } #6........... 
}, 'rt80645.def' => { source => "rt80645", params => "def", expect => <<'#7...........', BEGIN { $^W = 1; } use warnings; use strict; @$ = 'test'; print $#{$}; #7........... }, 'rt81852.def' => { source => "rt81852", params => "def", expect => <<'#8...........', do { { next if ( $n % 2 ); print $n, "\n"; } } while ( $n++ < 10 ); #8........... }, 'rt81852.rt81852' => { source => "rt81852", params => "rt81852", expect => <<'#9...........', do {{ next if ($n % 2); print $n, "\n"; }} while ($n++ < 10); #9........... }, 'rt81854.def' => { source => "rt81854", params => "def", expect => <<'#10...........', return "this is a descriptive error message" if $res->is_error or not length $data; #10........... }, 'rt87502.def' => { source => "rt87502", params => "def", expect => <<'#11...........', if ( @ARGV ~~ { map { $_ => 1 } qw(re restart reload) } ) { # CODE } #11........... }, 'rt93197.def' => { source => "rt93197", params => "def", expect => <<'#12...........', $to = $to->{$_} ||= {} for @key; if (1) { 2; } else { 3; } #12........... }, 'rt94338.def' => { source => "rt94338", params => "def", expect => <<'#13...........', # for-loop in a parenthesized block-map triggered an error message map( { foreach my $item ( '0', '1' ) { print $item; } } qw(a b c) ); #13........... }, 'rt95419.def' => { source => "rt95419", params => "def", expect => <<'#14...........', case "blah" => sub { { a => 1 } }; #14........... }, 'rt95708.def' => { source => "rt95708", params => "def", expect => <<'#15...........', use strict; use JSON; my $ref = { when => time(), message => 'abc' }; my $json = encode_json { when => time(), message => 'abc' }; my $json2 = encode_json + { when => time(), message => 'abc' }; #15........... }, 'rt96021.def' => { source => "rt96021", params => "def", expect => <<'#16...........', $a->@*; $a->**; $a->$*; $a->&*; $a->%*; $a->$#* #16........... 
}, 'rt96101.def' => { source => "rt96101", params => "def", expect => <<'#17...........', # Example for rt.cpan.org #96101; Perltidy not properly formatting subroutine # references inside subroutine execution. # closing brace of second sub should get outdented here sub startup { my $self = shift; $self->plugin( 'authentication' => { 'autoload_user' => 1, 'session_key' => rand(), 'load_user' => sub { return HaloVP::Users->load(@_); }, 'validate_user' => sub { return HaloVP::Users->login(@_); } } ); } #17........... }, 'rt98902.def' => { source => "rt98902", params => "def", expect => <<'#18...........', my %foo = ( alpha => 1, beta => 2, gamma => 3, ); my @bar = map { { number => $_, character => chr $_, padding => ( ' ' x $_ ), } } ( 0 .. 32 ); #18........... }, 'rt98902.rt98902' => { source => "rt98902", params => "rt98902", expect => <<'#19...........', my %foo = ( alpha => 1, beta => 2, gamma => 3, ); my @bar = map { { number => $_, character => chr $_, padding => ( ' ' x $_ ), } } ( 0 .. 32 ); #19........... }, 'rt99961.def' => { source => "rt99961", params => "def", expect => <<'#20...........', %thing = %{ print qq[blah1\n]; $b; }; #20........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<<STDERR>>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/test-eol.t0000644000175000017500000000705313721052601015013 0ustar stevesteveuse strict; use File::Temp; use Test; use Carp; BEGIN {plan tests => 4} use Perl::Tidy; #---------------------------------------------------------------------- ## test string->string #---------------------------------------------------------------------- my $source_template = <<'EOM'; %height=("letter",27.9, "legal",35.6, "arche",121.9, "archd",91.4, "archc",61, "archb",45.7, "archa",30.5, "flsa",33, "flse",33, "halfletter",21.6, "11x17",43.2, "ledger",27.9); %width=("letter",21.6, "legal",21.6, "arche",91.4, "archd",61, "archc",45.7, "archb",30.5, "archa",22.9, "flsa",21.6, "flse",21.6, "halfletter",14, "11x17",27.9, "ledger",43.2); EOM my $perltidyrc; my $expected_output_template=<<'EOM'; %height = ( "letter", 27.9, "legal", 35.6, "arche", 121.9, "archd", 91.4, "archc", 61, "archb", 45.7, "archa", 30.5, "flsa", 33, "flse", 33, "halfletter", 21.6, "11x17", 43.2, "ledger", 27.9 ); %width = ( "letter", 21.6, "legal", 21.6, "arche", 91.4, "archd", 61, "archc", 45.7, "archb", 30.5, "archa", 22.9, "flsa", 21.6, "flse", 21.6, "halfletter", 14, "11x17", 27.9, "ledger", 43.2 ); EOM my $source; my $output; my $expected_output; my $CR = chr(015); my $LF = chr(012); $perltidyrc = <<'EOM'; -gnu --output-line-ending="unix" # use *nix LF EOLs EOM $source = $source_template; $source =~ s/\n/$CR$LF/gmsx; $expected_output = $expected_output_template; $expected_output =~ s/\n/$LF/gmsx; # my ($source_fh, $source_filename) = File::Temp::tempfile(); close $source_filename; my ($output_fh, $output_filename) = File::Temp::tempfile(); close $output_filename; # print STDERR "# source_filename = ", $source_filename, "\n"; # print STDERR "# output_filename = ", $output_filename, "\n"; # open $source_fh, ">", $source_filename; # binmode $source_fh, ":raw"; # print $source_fh $source; # close $source_fh; # in-memory output (non-UTF8) 
Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$perltidyrc, argv => '', ); ok($output, $expected_output, "in-memory EOLs (non-UTF8)"); # file output (non-UTF8) Perl::Tidy::perltidy( source => \$source, destination => $output_filename, perltidyrc => \$perltidyrc, argv => '', ); {# slurp entire file local $/ = undef; open $output_fh, "<", $output_filename; binmode $output_fh, ":raw"; $output = <$output_fh>; } ok($output, $expected_output, "output file EOLs (non-UTF8)"); $perltidyrc = <<'EOM'; -gnu --character-encoding="utf8" # treat files as UTF-8 (decode and encode) --output-line-ending="unix" # use *nix LF EOLs EOM # in-memory (UTF8) $source = $source_template; $source =~ s/\n/$CR$LF/gmsx; $expected_output = $expected_output_template; $expected_output =~ s/\n/$LF/gmsx; Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$perltidyrc, argv => '', ); ok($output, $expected_output, "in-memory EOLs (UTF8)"); # file output (UTF8) Perl::Tidy::perltidy( source => \$source, destination => $output_filename, perltidyrc => \$perltidyrc, argv => '', ); {# slurp entire file local $/ = undef; open $output_fh, "<", $output_filename; binmode $output_fh, ":raw"; $output = <$output_fh>; } ok($output, $expected_output, "output file EOLs (UTF8)"); # Try to delete the tmpfile; # Comment this out if it causes a failure at Appveyor if ( -e $output_filename ) { unlink $output_filename } Perl-Tidy-20230309/t/snippets11.t0000644000175000017500000003551314373177244015307 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 sub1.def #2 sub2.def #3 switch1.def #4 syntax1.def #5 syntax2.def #6 ternary1.def #7 ternary2.def #8 tick1.def #9 trim_quote.def #10 tso1.def #11 tso1.tso #12 tutor.def #13 undoci1.def #14 use1.def #15 use2.def #16 version1.def #17 version2.def #18 vert.def #19 vmll.def #20 vmll.vmll # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; 
my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'tso' => "-tso", 'vmll' => <<'----------', -vmll -bbt=2 -bt=2 -pt=2 -sbt=2 ---------- }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'sub1' => <<'----------', my::doit(); join::doit(); for::doit(); sub::doit(); package::doit(); __END__::doit(); __DATA__::doit(); package my; sub doit{print"Hello My\n";}package join; sub doit{print"Hello Join\n";}package for; sub doit{print"Hello for\n";}package package; sub doit{print"Hello package\n";}package sub; sub doit{print"Hello sub\n";}package __END__; sub doit{print"Hello __END__\n";}package __DATA__; sub doit{print"Hello __DATA__\n";} ---------- 'sub2' => <<'----------', my $selector; # leading atrribute separator: $a = sub : locked { print "Hello, World!\n"; }; $a->(); # colon as both ?/: and attribute separator $a = $selector ? sub : locked { print "Hello, World!\n"; } : sub : locked { print "GOODBYE!\n"; }; $a->(); ---------- 'switch1' => <<'----------', sub classify_digit ($digit) { switch($digit) { case 0 { return 'zero' } case [ 2, 4, 6, 8 ]{ return 'even' } case [ 1, 3, 4, 7, 9 ]{ return 'odd' } case /[A-F]/i { return 'hex' } } } ---------- 'syntax1' => <<'----------', # Caused trouble: print $x **2; ---------- 'syntax2' => <<'----------', # ? was taken as pattern my $case_flag = File::Spec->case_tolerant ? '(?i)' : ''; ---------- 'ternary1' => <<'----------', my $flags = ( $_ & 1 ) ? ( $_ & 4 ) ? $THRf_DEAD : $THRf_ZOMBIE : ( $_ & 4 ) ? $THRf_R_DETACHED : $THRf_R_JOINABLE; ---------- 'ternary2' => <<'----------', my $a=($b) ? ($c) ? ($d) ? $d1 : $d2 : ($e) ? $e1 : $e2 : ($f) ? ($g) ? $g1 : $g2 : ($h) ? 
$h1 : $h2; ---------- 'tick1' => <<'----------', sub a'this { $p'u'a = "mooo\n"; print $p::u::a; } a::this(); # print "mooo" print $p'u'a; # print "mooo" sub a::that { $p't'u = "wwoo\n"; return sub { print $p't'u} } $a'that = a'that(); $a'that->(); # print "wwoo" $a'that = a'that(); $p::t::u = "booo\n"; $a'that->(); # print "booo" ---------- 'trim_quote' => <<'----------', # space after quote will get trimmed push @m, ' all :: pure_all manifypods ' . $self->{NOECHO} . '$(NOOP) ' unless $self->{SKIPHASH}{'all'}; ---------- 'tso1' => <<'----------', print 0+ '42 EUR'; # 42 ---------- 'tutor' => <<'----------', #!/usr/bin/perl $y=shift||5;for $i(1..10){$l[$i]="T";$w[$i]=999999;}while(1){print"Name:";$u=;$t=50;$a=time;for(0..9){$x="";for(1..$y){$x.=chr(int(rand(126-33)+33));}while($z ne $x){print"\r\n$x\r\n";$z=;chomp($z);$t-=5;}}$b=time;$t-=($b-$a)*2;$t=0-$t;$z=1;@q=@l;@p=@w;print "You scored $t points\r\nTopTen\r\n";for $i(1..10){if ($t<$p[$z]){$l[$i]=$u;chomp($l[$i]);$w[$i]=$t;$t=1000000}else{$l[$i]=$q[$z];$w[$i]=$p[$z];$z++;}print $l[$i],"\t",$w[$i],"\r\n";}} ---------- 'undoci1' => <<'----------', $rinfo{deleteStyle} = [ -fill => 'red', -stipple => '@' . Tk->findINC('demos/images/grey.25'), ]; ---------- 'use1' => <<'----------', # previously this caused an incorrect error message after '2.42' use lib "$Common::global::gInstallRoot/lib"; use CGI 2.42 qw(fatalsToBrowser); use RRDs 1.000101; # the 0666 must expect an operator use constant MODE => do { 0666 & ( 0777 & ~umask ) }; use IO::File (); ---------- 'use2' => <<'----------', # Keep the space before the '()' here: use Foo::Bar (); use Foo::Bar (); use Foo::Bar 1.0 (); use Foo::Bar qw(baz); use Foo::Bar 1.0 qw(baz); ---------- 'version1' => <<'----------', # VERSION statement unbroken, no semicolon added; our $VERSION = do { my @r = ( q$Revision: 2.2 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r } ---------- 'version2' => <<'----------', # On one line so MakeMaker will see it. 
require Exporter; our $VERSION = $Exporter::VERSION; ---------- 'vert' => <<'----------', # if $w->vert is tokenized as type 'U' then the ? will start a quote # and an error will occur. sub vert { } sub Restore { $w->vert ? $w->delta_width(0) : $w->delta_height(0); } ---------- 'vmll' => <<'----------', # perltidy -act=2 -vmll will leave these intact and greater than 80 columns # in length, which is what vmll does BEGIN {is_deeply(\@init_metas_called, [1]) || diag(Dumper(\@init_metas_called))} This has the comma on the next line exception {Class::MOP::Class->initialize("NonExistent")->rebless_instance($foo)}, ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'sub1.def' => { source => "sub1", params => "def", expect => <<'#1...........', my::doit(); join::doit(); for::doit(); sub::doit(); package::doit(); __END__::doit(); __DATA__::doit(); package my; sub doit { print "Hello My\n"; } package join; sub doit { print "Hello Join\n"; } package for; sub doit { print "Hello for\n"; } package package; sub doit { print "Hello package\n"; } package sub; sub doit { print "Hello sub\n"; } package __END__; sub doit { print "Hello __END__\n"; } package __DATA__; sub doit { print "Hello __DATA__\n"; } #1........... }, 'sub2.def' => { source => "sub2", params => "def", expect => <<'#2...........', my $selector; # leading atrribute separator: $a = sub : locked { print "Hello, World!\n"; }; $a->(); # colon as both ?/: and attribute separator $a = $selector ? sub : locked { print "Hello, World!\n"; } : sub : locked { print "GOODBYE!\n"; }; $a->(); #2........... }, 'switch1.def' => { source => "switch1", params => "def", expect => <<'#3...........', sub classify_digit ($digit) { switch ($digit) { case 0 { return 'zero' } case [ 2, 4, 6, 8 ]{ return 'even' } case [ 1, 3, 4, 7, 9 ]{ return 'odd' } case /[A-F]/i { return 'hex' } } } #3........... 
}, 'syntax1.def' => { source => "syntax1", params => "def", expect => <<'#4...........', # Caused trouble: print $x **2; #4........... }, 'syntax2.def' => { source => "syntax2", params => "def", expect => <<'#5...........', # ? was taken as pattern my $case_flag = File::Spec->case_tolerant ? '(?i)' : ''; #5........... }, 'ternary1.def' => { source => "ternary1", params => "def", expect => <<'#6...........', my $flags = ( $_ & 1 ) ? ( $_ & 4 ) ? $THRf_DEAD : $THRf_ZOMBIE : ( $_ & 4 ) ? $THRf_R_DETACHED : $THRf_R_JOINABLE; #6........... }, 'ternary2.def' => { source => "ternary2", params => "def", expect => <<'#7...........', my $a = ($b) ? ($c) ? ($d) ? $d1 : $d2 : ($e) ? $e1 : $e2 : ($f) ? ($g) ? $g1 : $g2 : ($h) ? $h1 : $h2; #7........... }, 'tick1.def' => { source => "tick1", params => "def", expect => <<'#8...........', sub a'this { $p'u'a = "mooo\n"; print $p::u::a; } a::this(); # print "mooo" print $p'u'a; # print "mooo" sub a::that { $p't'u = "wwoo\n"; return sub { print $p't'u} } $a'that = a'that(); $a'that->(); # print "wwoo" $a'that = a'that(); $p::t::u = "booo\n"; $a'that->(); # print "booo" #8........... }, 'trim_quote.def' => { source => "trim_quote", params => "def", expect => <<'#9...........', # space after quote will get trimmed push @m, ' all :: pure_all manifypods ' . $self->{NOECHO} . '$(NOOP) ' unless $self->{SKIPHASH}{'all'}; #9........... }, 'tso1.def' => { source => "tso1", params => "def", expect => <<'#10...........', print 0 + '42 EUR'; # 42 #10........... }, 'tso1.tso' => { source => "tso1", params => "tso", expect => <<'#11...........', print 0+ '42 EUR'; # 42 #11........... }, 'tutor.def' => { source => "tutor", params => "def", expect => <<'#12...........', #!/usr/bin/perl $y = shift || 5; for $i ( 1 .. 10 ) { $l[$i] = "T"; $w[$i] = 999999; } while (1) { print "Name:"; $u = ; $t = 50; $a = time; for ( 0 .. 9 ) { $x = ""; for ( 1 .. 
$y ) { $x .= chr( int( rand( 126 - 33 ) + 33 ) ); } while ( $z ne $x ) { print "\r\n$x\r\n"; $z = ; chomp($z); $t -= 5; } } $b = time; $t -= ( $b - $a ) * 2; $t = 0 - $t; $z = 1; @q = @l; @p = @w; print "You scored $t points\r\nTopTen\r\n"; for $i ( 1 .. 10 ) { if ( $t < $p[$z] ) { $l[$i] = $u; chomp( $l[$i] ); $w[$i] = $t; $t = 1000000; } else { $l[$i] = $q[$z]; $w[$i] = $p[$z]; $z++; } print $l[$i], "\t", $w[$i], "\r\n"; } } #12........... }, 'undoci1.def' => { source => "undoci1", params => "def", expect => <<'#13...........', $rinfo{deleteStyle} = [ -fill => 'red', -stipple => '@' . Tk->findINC('demos/images/grey.25'), ]; #13........... }, 'use1.def' => { source => "use1", params => "def", expect => <<'#14...........', # previously this caused an incorrect error message after '2.42' use lib "$Common::global::gInstallRoot/lib"; use CGI 2.42 qw(fatalsToBrowser); use RRDs 1.000101; # the 0666 must expect an operator use constant MODE => do { 0666 & ( 0777 & ~umask ) }; use IO::File (); #14........... }, 'use2.def' => { source => "use2", params => "def", expect => <<'#15...........', # Keep the space before the '()' here: use Foo::Bar (); use Foo::Bar (); use Foo::Bar 1.0 (); use Foo::Bar qw(baz); use Foo::Bar 1.0 qw(baz); #15........... }, 'version1.def' => { source => "version1", params => "def", expect => <<'#16...........', # VERSION statement unbroken, no semicolon added; our $VERSION = do { my @r = ( q$Revision: 2.2 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r } #16........... }, 'version2.def' => { source => "version2", params => "def", expect => <<'#17...........', # On one line so MakeMaker will see it. require Exporter; our $VERSION = $Exporter::VERSION; #17........... }, 'vert.def' => { source => "vert", params => "def", expect => <<'#18...........', # if $w->vert is tokenized as type 'U' then the ? will start a quote # and an error will occur. sub vert { } sub Restore { $w->vert ? $w->delta_width(0) : $w->delta_height(0); } #18........... 
}, 'vmll.def' => { source => "vmll", params => "def", expect => <<'#19...........', # perltidy -act=2 -vmll will leave these intact and greater than 80 columns # in length, which is what vmll does BEGIN { is_deeply( \@init_metas_called, [1] ) || diag( Dumper( \@init_metas_called ) ); } This has the comma on the next line exception { Class::MOP::Class->initialize("NonExistent")->rebless_instance($foo) }, #19........... }, 'vmll.vmll' => { source => "vmll", params => "vmll", expect => <<'#20...........', # perltidy -act=2 -vmll will leave these intact and greater than 80 columns # in length, which is what vmll does BEGIN {is_deeply(\@init_metas_called, [1]) || diag(Dumper(\@init_metas_called))} This has the comma on the next line exception { Class::MOP::Class->initialize("NonExistent")->rebless_instance($foo) }, #20........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<<STDERR>>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/.gitattributes0000644000175000017500000000001014174407332015757 0ustar stevesteve* -text Perl-Tidy-20230309/t/testwide.t0000755000175000017500000000561614212533056015121 0ustar stevesteveuse strict; use utf8; use Test; use Carp; use FindBin; BEGIN { unshift @INC, "./" } BEGIN { plan tests => 3 } use Perl::Tidy; my $source = <<'EOM'; %pangrams=("Plain","ASCII", "Zwölf große Boxkämpfer jagen Vik quer über den Sylter.","DE", "Jeż wlókł gęś. Uf! Bądź choć przy nim, stań!","PL", "Любя, съешь щипцы, — вздохнёт мэр, — кайф жгуч.","RU"); EOM my $expected_output = <<'EOM'; %pangrams = ( "Plain", "ASCII", "Zwölf große Boxkämpfer jagen Vik quer über den Sylter.", "DE", "Jeż wlókł gęś. Uf!
Bądź choć przy nim, stań!", "PL", "Любя, съешь щипцы, — вздохнёт мэр, — кайф жгуч.", "RU" ); EOM my $perltidyrc = <<'EOM'; -gnu -enc=utf8 EOM my $output; # The source is in character mode here, so perltidy will not decode. # So here we do not need to set -eos or -neos Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$perltidyrc, argv => '-nsyn', ); ok( $output, $expected_output ); Perl::Tidy::perltidy( source => $FindBin::Bin . '/testwide.pl.src', destination => \$output, perltidyrc => \$perltidyrc, argv => '-nsyn', ); # We have to be careful here ... In this test we are comparing $output to a # source string which is in character mode (since it is in this file declared # with 'use utf8'). We need to compare strings which have the same storage # mode. # The internal storage mode of $output was character mode (decoded) for # versions prior to 20220217.02, but is byte mode (encoded) for the latest # version of perltidy. # The following statement will decode $output if it is stored in byte mode, # and leave it unchanged (and return an error) otherwise. So this will work # with all versions of perltidy. See https://perldoc.perl.org/utf8 utf8::decode($output); ok( $output, $expected_output ); # Test writing encoded output to stdout with the -st flag # References: RT #133166, RT #133171, git #35 $output = ""; do { # Send STDOUT to a temporary file use File::Temp (); my $fh = new File::Temp(); my $tmpfile = $fh->filename; # Note that we are not specifying an encoding here. Perltidy should do that. local *STDOUT; open STDOUT, '>', $tmpfile or die "Can't open tmpfile: $!"; Perl::Tidy::perltidy( source => \$source, ##destination => ... we are using -st, so no destination is specified perltidyrc => \$perltidyrc, argv => '-nsyn -st', # added -st ); close STDOUT; # Read the temporary file back in. Note that here we need to specify # the encoding.
open TMP, '<', $tmpfile; binmode TMP, ":raw:encoding(UTF-8)"; while ( my $line = <TMP> ) { $output .= $line } }; ok( $output, $expected_output ); Perl-Tidy-20230309/t/testwide-passthrough.pl.src0000644000175000017500000000036114174123067020415 0ustar stevesteve# nothing to tidy or run "Plain"; "Zwölf große Boxkämpfer jagen Vik quer über den Sylter."; "Jeż wlókł gęś. Uf! Bądź choć przy nim, stań!"; "Любя, съешь щипцы, — вздохнёт мэр, — кайф жгуч."; Perl-Tidy-20230309/t/snippets20.t0000644000175000017500000005032314373177245015304 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 space6.def #2 space6.space6 #3 sub3.def #4 wc.def #5 wc.wc1 #6 wc.wc2 #7 ce2.ce #8 ce2.def #9 gnu6.def #10 gnu6.gnu #11 git25.def #12 git25.git25 #13 outdent.outdent2 #14 kpit.def #15 kpit.kpit #16 kpitl.def #17 kpitl.kpitl #18 hanging_side_comments3.def #19 lop.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'ce' => "-cuddled-blocks", 'def' => "", 'git25' => "-l=0", 'gnu' => "-gnu", 'kpit' => "-pt=2 -kpit=0", 'kpitl' => <<'----------', -kpit=0 -kpitl='return factorial' -pt=2 ---------- 'outdent2' => <<'----------', # test -okw and -okwl -okw -okwl='next' ---------- 'space6' => <<'----------', -nwrs="+ - / *" -nwls="+ - / *" ---------- 'wc1' => "-wc=4", 'wc2' => "-wc=4 -wn", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'ce2' => <<'----------', # Previously, perltidy -ce would move a closing brace below a pod section to # form '} else {'. No longer doing this because if you change back to -nce, the # brace cannot go back to where it was.
if ($notty) { $runnonstop = 1; share($runnonstop); } =pod If there is a TTY, we have to determine who it belongs to before we can ... =cut else { # Is Perl being run from a slave editor or graphical debugger? ... } ---------- 'git25' => <<'----------', # example for git #25; use -l=0; was losing alignment; sub 'fix_ragged_lists' was added to fix this my $mapping = [ # ... { 'is_col' => 'dsstdat', 'cr_col' => 'enroll_isaric_date', 'trans' => 0, }, { 'is_col' => 'corona_ieorres', 'cr_col' => '', 'trans' => 0, }, { 'is_col' => 'symptoms_fever', 'cr_col' => 'elig_fever', 'trans' => 1, 'manually_reviewed' => '@TODO', 'map' => { '0' => '0', '1' => '1', '9' => '@TODO' }, }, { 'is_col' => 'symptoms_cough', 'cr_col' => 'elig_cough', 'trans' => 1, 'manually_reviewed' => '@TODO', 'map' => { '0' => '0', '1' => '1', '9' => '@TODO' }, }, { 'is_col' => 'symptoms_dys_tachy_noea', 'cr_col' => 'elig_dyspnea', 'trans' => 1, 'manually_reviewed' => '@TODO', 'map' => { '0' => '0', '1' => '1', '9' => '@TODO' }, }, { 'is_col' => 'symptoms_clinical_susp', 'cr_col' => 'elig_ari', 'trans' => 0, }, { 'is_col' => 'sex', 'cr_col' => 'sex', 'trans' => 1, 'manually_reviewed' => 1, 'map' => { '0' => '1', '1' => '2' }, }, { 'is_col' => 'age', 'cr_col' => '', 'trans' => 0, }, { 'is_col' => 'ageu', 'cr_col' => '', 'trans' => 0, }, # ... ]; ---------- 'gnu6' => <<'----------', # These closing braces no longer have the same position with -gnu after an # update 13 dec 2021 in which the vertical aligner zeros recoverable spaces. # But adding the -xlp should make them all have the same indentation. $var1 = { 'foo10' => undef, 'foo72' => ' ', }; $var1 = { 'foo10' => undef, 'foo72' => ' ', }; $var2 = { 'foo72' => ' ', 'foo10' => undef, }; ---------- 'hanging_side_comments3' => <<'----------', if ( $var eq 'wastebasket' ) { # this sends a pure block # of hanging side comments #to the vertical aligner. #It caused a crash in #a test version of #sub 'delete_unmatched_tokens' #... 
#} } elsif ( $var eq 'spacecommand' ) { &die("No $val function") unless eval "defined &$val"; } ---------- 'kpit' => <<'----------', if ( seek(DATA, 0, 0) ) { ... } # The foreach keyword may be separated from the next opening paren foreach $req(@bgQueue) { ... } # This had trouble because a later padding operation removed the inside space while ($CmdJob eq "" && @CmdQueue > 0 && $RunNightlyWhenIdle != 1 || @CmdQueue > 0 && $RunNightlyWhenIdle == 2 && $bpc->isAdminJob($CmdQueue[0]->{host})) { ... } ---------- 'kpitl' => <<'----------', return ( $r**$n ) * ( pi**( $n / 2 ) ) / ( sqrt(pi) * factorial( 2 * ( int( $n / 2 ) ) + 2 ) / factorial( int( $n / 2 ) + 1 ) / ( 4**( int( $n / 2 ) + 1 ) ) ); ---------- 'lop' => <<'----------', # logical padding examples $same = ( ( $aP eq $bP ) && ( $aS eq $bS ) && ( $aT eq $bT ) && ( $a->{'title'} eq $b->{'title'} ) && ( $a->{'href'} eq $b->{'href'} ) ); $bits = $top > 0xffff ? 32 : $top > 0xff ? 16 : $top > 1 ? 8 : 1; lc( $self->mime_attr('content-type') || $self->{MIH_DefaultType} || 'text/plain' ); # Padding can also remove spaces; here the space after the '(' is lost: elsif ( $statement_type =~ /^sub\b/ || $paren_type[$paren_depth] =~ /^sub\b/ ) ---------- 'outdent' => <<'----------', my $i; LOOP: while ( $i = ) { chomp($i); next unless $i; fixit($i); } ---------- 'space6' => <<'----------', # test some spacing rules at possible filehandles my $z=$x/$y; # ok to change spaces around both sides of the / print $x / $y; # do not remove space before or after / here print $x/$y; # do not add a space before the / here print $x+$y; # do not add a space before the + here ---------- 'sub3' => <<'----------', # keep these one-line blocks intact my $aa = sub #line 245 "Parse.yp" { n_stmtexp $_[1] }; my $bb = sub # { n_stmtexp $_[1] }; ---------- 'wc' => <<'----------', { my (@indices) = sort { $dir eq 'left' ? $cells[$a] <=> $cells[$b] : $cells[$b] <=> $cells[$a]; } (0 .. 
$#cells); {{{{ if ( !$array[0] ) { $array[0] = &$CantProcessPartFunc( $entity->{'fields'}{ 'content-type'} ); } }}}}} ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'space6.def' => { source => "space6", params => "def", expect => <<'#1...........', # test some spacing rules at possible filehandles my $z = $x / $y; # ok to change spaces around both sides of the / print $x / $y; # do not remove space before or after / here print $x/ $y; # do not add a space before the / here print $x+ $y; # do not add a space before the + here #1........... }, 'space6.space6' => { source => "space6", params => "space6", expect => <<'#2...........', # test some spacing rules at possible filehandles my $z = $x/$y; # ok to change spaces around both sides of the / print $x / $y; # do not remove space before or after / here print $x/$y; # do not add a space before the / here print $x+$y; # do not add a space before the + here #2........... }, 'sub3.def' => { source => "sub3", params => "def", expect => <<'#3...........', # keep these one-line blocks intact my $aa = sub #line 245 "Parse.yp" { n_stmtexp $_[1] }; my $bb = sub # { n_stmtexp $_[1] }; #3........... }, 'wc.def' => { source => "wc", params => "def", expect => <<'#4...........', { my (@indices) = sort { $dir eq 'left' ? $cells[$a] <=> $cells[$b] : $cells[$b] <=> $cells[$a]; } ( 0 .. $#cells ); { { { { if ( !$array[0] ) { $array[0] = &$CantProcessPartFunc( $entity->{'fields'}{'content-type'} ); } } } } } } #4........... }, 'wc.wc1' => { source => "wc", params => "wc1", expect => <<'#5...........', { my (@indices) = sort { $dir eq 'left' ? $cells[$a] <=> $cells[$b] : $cells[$b] <=> $cells[$a]; } ( 0 .. $#cells ); { { { { if ( !$array[0] ) { $array[0] = &$CantProcessPartFunc( $entity->{'fields'}{'content-type'} ); } } } } } } #5........... 
}, 'wc.wc2' => { source => "wc", params => "wc2", expect => <<'#6...........', { my (@indices) = sort { $dir eq 'left' ? $cells[$a] <=> $cells[$b] : $cells[$b] <=> $cells[$a]; } ( 0 .. $#cells ); { { { { if ( !$array[0] ) { $array[0] = &$CantProcessPartFunc( $entity->{'fields'}{'content-type'} ); } } } } } } #6........... }, 'ce2.ce' => { source => "ce2", params => "ce", expect => <<'#7...........', # Previously, perltidy -ce would move a closing brace below a pod section to # form '} else {'. No longer doing this because if you change back to -nce, the # brace cannot go back to where it was. if ($notty) { $runnonstop = 1; share($runnonstop); } =pod If there is a TTY, we have to determine who it belongs to before we can ... =cut else { # Is Perl being run from a slave editor or graphical debugger? ...; } #7........... }, 'ce2.def' => { source => "ce2", params => "def", expect => <<'#8...........', # Previously, perltidy -ce would move a closing brace below a pod section to # form '} else {'. No longer doing this because if you change back to -nce, the # brace cannot go back to where it was. if ($notty) { $runnonstop = 1; share($runnonstop); } =pod If there is a TTY, we have to determine who it belongs to before we can ... =cut else { # Is Perl being run from a slave editor or graphical debugger? ...; } #8........... }, 'gnu6.def' => { source => "gnu6", params => "def", expect => <<'#9...........', # These closing braces no longer have the same position with -gnu after an # update 13 dec 2021 in which the vertical aligner zeros recoverable spaces. # But adding the -xlp should make them all have the same indentation. $var1 = { 'foo10' => undef, 'foo72' => ' ', }; $var1 = { 'foo10' => undef, 'foo72' => ' ', }; $var2 = { 'foo72' => ' ', 'foo10' => undef, }; #9........... 
}, 'gnu6.gnu' => { source => "gnu6", params => "gnu", expect => <<'#10...........', # These closing braces no longer have the same position with -gnu after an # update 13 dec 2021 in which the vertical aligner zeros recoverable spaces. # But adding the -xlp should make them all have the same indentation. $var1 = { 'foo10' => undef, 'foo72' => ' ', }; $var1 = { 'foo10' => undef, 'foo72' => ' ', }; $var2 = { 'foo72' => ' ', 'foo10' => undef, }; #10........... }, 'git25.def' => { source => "git25", params => "def", expect => <<'#11...........', # example for git #25; use -l=0; was losing alignment; sub 'fix_ragged_lists' was added to fix this my $mapping = [ # ... { 'is_col' => 'dsstdat', 'cr_col' => 'enroll_isaric_date', 'trans' => 0, }, { 'is_col' => 'corona_ieorres', 'cr_col' => '', 'trans' => 0, }, { 'is_col' => 'symptoms_fever', 'cr_col' => 'elig_fever', 'trans' => 1, 'manually_reviewed' => '@TODO', 'map' => { '0' => '0', '1' => '1', '9' => '@TODO' }, }, { 'is_col' => 'symptoms_cough', 'cr_col' => 'elig_cough', 'trans' => 1, 'manually_reviewed' => '@TODO', 'map' => { '0' => '0', '1' => '1', '9' => '@TODO' }, }, { 'is_col' => 'symptoms_dys_tachy_noea', 'cr_col' => 'elig_dyspnea', 'trans' => 1, 'manually_reviewed' => '@TODO', 'map' => { '0' => '0', '1' => '1', '9' => '@TODO' }, }, { 'is_col' => 'symptoms_clinical_susp', 'cr_col' => 'elig_ari', 'trans' => 0, }, { 'is_col' => 'sex', 'cr_col' => 'sex', 'trans' => 1, 'manually_reviewed' => 1, 'map' => { '0' => '1', '1' => '2' }, }, { 'is_col' => 'age', 'cr_col' => '', 'trans' => 0, }, { 'is_col' => 'ageu', 'cr_col' => '', 'trans' => 0, }, # ... ]; #11........... }, 'git25.git25' => { source => "git25", params => "git25", expect => <<'#12...........', # example for git #25; use -l=0; was losing alignment; sub 'fix_ragged_lists' was added to fix this my $mapping = [ # ... 
{ 'is_col' => 'dsstdat', 'cr_col' => 'enroll_isaric_date', 'trans' => 0, }, { 'is_col' => 'corona_ieorres', 'cr_col' => '', 'trans' => 0, }, { 'is_col' => 'symptoms_fever', 'cr_col' => 'elig_fever', 'trans' => 1, 'manually_reviewed' => '@TODO', 'map' => { '0' => '0', '1' => '1', '9' => '@TODO' }, }, { 'is_col' => 'symptoms_cough', 'cr_col' => 'elig_cough', 'trans' => 1, 'manually_reviewed' => '@TODO', 'map' => { '0' => '0', '1' => '1', '9' => '@TODO' }, }, { 'is_col' => 'symptoms_dys_tachy_noea', 'cr_col' => 'elig_dyspnea', 'trans' => 1, 'manually_reviewed' => '@TODO', 'map' => { '0' => '0', '1' => '1', '9' => '@TODO' }, }, { 'is_col' => 'symptoms_clinical_susp', 'cr_col' => 'elig_ari', 'trans' => 0, }, { 'is_col' => 'sex', 'cr_col' => 'sex', 'trans' => 1, 'manually_reviewed' => 1, 'map' => { '0' => '1', '1' => '2' }, }, { 'is_col' => 'age', 'cr_col' => '', 'trans' => 0, }, { 'is_col' => 'ageu', 'cr_col' => '', 'trans' => 0, }, # ... ]; #12........... }, 'outdent.outdent2' => { source => "outdent", params => "outdent2", expect => <<'#13...........', my $i; LOOP: while ( $i = ) { chomp($i); next unless $i; fixit($i); } #13........... }, 'kpit.def' => { source => "kpit", params => "def", expect => <<'#14...........', if ( seek( DATA, 0, 0 ) ) { ... } # The foreach keyword may be separated from the next opening paren foreach $req (@bgQueue) { ...; } # This had trouble because a later padding operation removed the inside space while ($CmdJob eq "" && @CmdQueue > 0 && $RunNightlyWhenIdle != 1 || @CmdQueue > 0 && $RunNightlyWhenIdle == 2 && $bpc->isAdminJob( $CmdQueue[0]->{host} ) ) { ...; } #14........... }, 'kpit.kpit' => { source => "kpit", params => "kpit", expect => <<'#15...........', if ( seek(DATA, 0, 0) ) { ... 
} # The foreach keyword may be separated from the next opening paren foreach $req ( @bgQueue ) { ...; } # This had trouble because a later padding operation removed the inside space while ( $CmdJob eq "" && @CmdQueue > 0 && $RunNightlyWhenIdle != 1 || @CmdQueue > 0 && $RunNightlyWhenIdle == 2 && $bpc->isAdminJob($CmdQueue[0]->{host}) ) { ...; } #15........... }, 'kpitl.def' => { source => "kpitl", params => "def", expect => <<'#16...........', return ( $r**$n ) * ( pi**( $n / 2 ) ) / ( sqrt(pi) * factorial( 2 * ( int( $n / 2 ) ) + 2 ) / factorial( int( $n / 2 ) + 1 ) / ( 4**( int( $n / 2 ) + 1 ) ) ); #16........... }, 'kpitl.kpitl' => { source => "kpitl", params => "kpitl", expect => <<'#17...........', return ( $r**$n ) * (pi**($n / 2)) / ( sqrt(pi) * factorial( 2 * (int($n / 2)) + 2 ) / factorial( int($n / 2) + 1 ) / (4**(int($n / 2) + 1))); #17........... }, 'hanging_side_comments3.def' => { source => "hanging_side_comments3", params => "def", expect => <<'#18...........', if ( $var eq 'wastebasket' ) { # this sends a pure block # of hanging side comments #to the vertical aligner. #It caused a crash in #a test version of #sub 'delete_unmatched_tokens' #... #} } elsif ( $var eq 'spacecommand' ) { &die("No $val function") unless eval "defined &$val"; } #18........... }, 'lop.def' => { source => "lop", params => "def", expect => <<'#19...........', # logical padding examples $same = ( ( $aP eq $bP ) && ( $aS eq $bS ) && ( $aT eq $bT ) && ( $a->{'title'} eq $b->{'title'} ) && ( $a->{'href'} eq $b->{'href'} ) ); $bits = $top > 0xffff ? 32 : $top > 0xff ? 16 : $top > 1 ? 8 : 1; lc( $self->mime_attr('content-type') || $self->{MIH_DefaultType} || 'text/plain' ); # Padding can also remove spaces; here the space after the '(' is lost: elsif ($statement_type =~ /^sub\b/ || $paren_type[$paren_depth] =~ /^sub\b/ ) #19........... 
}, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? $rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<<STDERR>>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/testsa.t0000644000175000017500000000337112563477536014606 0ustar stevesteveuse strict; use Test; use Carp; BEGIN {plan tests => 1} use Perl::Tidy; #---------------------------------------------------------------------- ## test string->array # Also tests flags -ce and -l=60 # Note that we have to use -npro to avoid using local .perltidyrc #---------------------------------------------------------------------- my $source = <<'EOM'; $seqno = $type_sequence[$i]; if ($seqno) { if (tok =~/[\(\[\{]/) { $indentation{$seqno} = indentation } } elsif (tok =~/[\)\]\}]/) { $min_indentation = $indentation{$seqno}; delete $indentation{$seqno}; if ($indentation < $min_indentation) {$indentation = $min_indentation} } EOM my @tidy_output; Perl::Tidy::perltidy( source => \$source, destination => \@tidy_output, perltidyrc => undef, argv => '-nsyn -ce -npro -l=60', ); my @expected_output=<DATA>; my $ok=1; if (@expected_output == @tidy_output) { while ( $_ = pop @tidy_output ) { s/\s+$//; my $expect = pop @expected_output; $expect=~s/\s+$//; if ( $expect ne $_ ) { print STDERR "got:$_"; print STDERR "---\n"; print STDERR "expected_output:$expect"; $ok=0; last; } } } else { print STDERR "Line Counts differ\n"; $ok=0; } ok ($ok,1); # This is the expected result of 'perltidy -ce -l=60' on the above string: __DATA__ $seqno = $type_sequence[$i]; if ($seqno) { if ( tok =~ /[\(\[\{]/ ) { $indentation{$seqno} = indentation; } } elsif ( tok =~ /[\)\]\}]/ ) { $min_indentation = $indentation{$seqno}; delete $indentation{$seqno}; if ( $indentation < $min_indentation ) { $indentation = $min_indentation; } } Perl-Tidy-20230309/t/snippets3.t0000644000175000017500000004530114373177245015225 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 ce_wn1.ce_wn #2 ce_wn1.def #3 colin.colin #4 colin.def #5 essential.def #6 essential.essential1 #7 essential.essential2 #8 extrude1.def #9 extrude1.extrude #10 extrude2.def #11 
extrude2.extrude #12 extrude3.def #13 extrude3.extrude #14 extrude4.def #15 extrude4.extrude #16 fabrice_bug.def #17 fabrice_bug.fabrice_bug #18 format1.def #19 given1.def #20 gnu1.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'ce_wn' => <<'----------', -cuddled-blocks -wn ---------- 'colin' => <<'----------', -l=0 -pt=2 -nsfs -sbt=2 -ohbr -opr -osbr -pvt=2 -schb -scp -scsb -sohb -sop -sosb ---------- 'def' => "", 'essential1' => <<'----------', -syn -i=0 -l=100000 -nasc -naws -dws -nanl -blbp=0 -blbs=0 -nbbb -kbl=0 -mbl=0 ---------- 'essential2' => "-extrude", 'extrude' => "--extrude", 'fabrice_bug' => "-bt=0", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'ce_wn1' => <<'----------', if ($BOLD_MATH) { ( $labels, $comment, join( '', ' < B > ', &make_math( $mode, '', '', $_ ), ' < /B>' ) ) } else { ( &process_math_in_latex( $mode, $math_style, $slevel, "\\mbox{$text}" ), $after ) } ---------- 'colin' => <<'----------', env(0, 15, 0, 10, { Xtitle => 'X-data', Ytitle => 'Y-data', Title => 'An example of errb and points', Font => 'Italic' }); ---------- 'essential' => <<'----------', # Run with mangle to squeeze out the white space # also run with extrude # never combine two bare words or numbers status and ::ok(1); return ::spw(...); for bla::bla:: abc; # do not combine 'overload::' and 'and' if $self->{bareStringify} and ref $_ and defined %overload:: and defined &{'overload::StrVal'}; # do not combine 'SINK' and 'if' my $size=-s::SINK if $file; # do not combine to make $inputeq"quit" if ($input eq"quit"); # do not combine a number with a concatenation dot to get a float '78.' $vt100_compatible ? "\e[0;0H" : ('-' x 78 . 
"\n"); # do not join a minus with a bare word, because you might form # a file test operator. Here "z-i" would be taken as a file test. if (CORE::abs($z - i) < $eps); # '= -' should not become =- or you will get a warning # and something like these could become ambiguous without space # after the '-': use constant III=>1; $a = $b - III; $a = - III; # keep a space between a token ending in '$' and any word; die @$ if $@; # avoid combining tokens to create new meanings. Example: # this must not become $a++$b $a+ +$b; # another example: do not combine these two &'s: allow_options & &OPT_EXECCGI; # Perl is sensitive to whitespace after the + here: $b = xvals $a + 0.1 * yvals $a; # keep paren separate here: use Foo::Bar (); # need space after foreach my; for example, this will fail in # older versions of Perl: foreach my$ft(@filetypes)... # must retain space between grep and left paren; "grep(" may fail my $match = grep (m/^-extrude$/, @list) ? 1 : 0; # don't stick numbers next to left parens, as in: use Mail::Internet 1.28 (); # do not remove space between an '&' and a bare word because # it may turn into a function evaluation, like here # between '&' and 'O_ACCMODE', producing a syntax error [File.pm] $opts{rdonly} = (($opts{mode} & O_ACCMODE) == O_RDONLY); ---------- 'extrude1' => <<'----------', # do not break before the ++ print $x++ . "\n"; ---------- 'extrude2' => <<'----------', if (-l pid_filename()) { return readlink(pid_filename()); } ---------- 'extrude3' => <<'----------', # Breaking before a ++ can cause perl to guess wrong print( ( $i++ & 1 ) ? 
$_ : ( $change{$_} || $_ ) ); # Space between '&' and 'O_ACCMODE' is essential here $opts{rdonly} = (($opts{mode} & O_ACCMODE) == O_RDONLY); ---------- 'extrude4' => <<'----------', # From Safe.pm caused trouble with extrude use Opcode 1.01, qw( opset opset_to_ops opmask_add empty_opset full_opset invert_opset verify_opset opdesc opcodes opmask define_optag opset_to_hex ); ---------- 'fabrice_bug' => <<'----------', # no space around ^variable with -bt=0 my $before = ${^PREMATCH}; my $after = ${PREMATCH}; ---------- 'format1' => <<'----------', if (/^--list$/o) { format = @<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< $_, $val . print "Available strips:\n"; for ( split ( /\|/, $known_strips ) ) { $val = $defs{$_}{'name'}; write; } } ---------- 'given1' => <<'----------', given ([9,"a",11]) { when (qr/\d/) { given ($count) { when (1) { ok($count==1) } else { ok($count!=1) } when ([5,6]) { ok(0) } else { ok(1) } } } ok(1) when 11; } ---------- 'gnu1' => <<'----------', @common_sometimes = ( "aclocal.m4", "acconfig.h", "config.h.top", "config.h.bot", "stamp-h.in", 'stamp-vti' ); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'ce_wn1.ce_wn' => { source => "ce_wn1", params => "ce_wn", expect => <<'#1...........', if ($BOLD_MATH) { ( $labels, $comment, join( '', ' < B > ', &make_math( $mode, '', '', $_ ), ' < /B>' ) ) } else { ( &process_math_in_latex( $mode, $math_style, $slevel, "\\mbox{$text}" ), $after ) } #1........... }, 'ce_wn1.def' => { source => "ce_wn1", params => "def", expect => <<'#2...........', if ($BOLD_MATH) { ( $labels, $comment, join( '', ' < B > ', &make_math( $mode, '', '', $_ ), ' < /B>' ) ) } else { ( &process_math_in_latex( $mode, $math_style, $slevel, "\\mbox{$text}" ), $after ) } #2........... 
}, 'colin.colin' => { source => "colin", params => "colin", expect => <<'#3...........', env(0, 15, 0, 10, { Xtitle => 'X-data', Ytitle => 'Y-data', Title => 'An example of errb and points', Font => 'Italic' }); #3........... }, 'colin.def' => { source => "colin", params => "def", expect => <<'#4...........', env( 0, 15, 0, 10, { Xtitle => 'X-data', Ytitle => 'Y-data', Title => 'An example of errb and points', Font => 'Italic' } ); #4........... }, 'essential.def' => { source => "essential", params => "def", expect => <<'#5...........', # Run with mangle to squeeze out the white space # also run with extrude # never combine two bare words or numbers status and ::ok(1); return ::spw(...); for bla::bla:: abc; # do not combine 'overload::' and 'and' if $self->{bareStringify} and ref $_ and defined %overload:: and defined &{'overload::StrVal'}; # do not combine 'SINK' and 'if' my $size = -s ::SINK if $file; # do not combine to make $inputeq"quit" if ( $input eq "quit" ); # do not combine a number with a concatenation dot to get a float '78.' $vt100_compatible ? "\e[0;0H" : ( '-' x 78 . "\n" ); # do not join a minus with a bare word, because you might form # a file test operator. Here "z-i" would be taken as a file test. if ( CORE::abs( $z - i ) < $eps ); # '= -' should not become =- or you will get a warning # and something like these could become ambiguous without space # after the '-': use constant III => 1; $a = $b - III; $a = - III; # keep a space between a token ending in '$' and any word; die @$ if $@; # avoid combining tokens to create new meanings. Example: # this must not become $a++$b $a + +$b; # another example: do not combine these two &'s: allow_options & &OPT_EXECCGI; # Perl is sensitive to whitespace after the + here: $b = xvals $a + 0.1 * yvals $a; # keep paren separate here: use Foo::Bar (); # need space after foreach my; for example, this will fail in # older versions of Perl: foreach my $ft (@filetypes) ... 
# must retain space between grep and left paren; "grep(" may fail my $match = grep ( m/^-extrude$/, @list ) ? 1 : 0; # don't stick numbers next to left parens, as in: use Mail::Internet 1.28 (); # do not remove space between an '&' and a bare word because # it may turn into a function evaluation, like here # between '&' and 'O_ACCMODE', producing a syntax error [File.pm] $opts{rdonly} = ( ( $opts{mode} & O_ACCMODE ) == O_RDONLY ); #5........... }, 'essential.essential1' => { source => "essential", params => "essential1", expect => <<'#6...........', # Run with mangle to squeeze out the white space # also run with extrude # never combine two bare words or numbers status and ::ok(1); return ::spw(...); for bla::bla:: abc; # do not combine 'overload::' and 'and' if$self->{bareStringify}and ref$_ and defined%overload:: and defined&{'overload::StrVal'}; # do not combine 'SINK' and 'if' my$size=-s::SINK if$file; # do not combine to make $inputeq"quit" if($input eq"quit"); # do not combine a number with a concatenation dot to get a float '78.' $vt100_compatible?"\e[0;0H":('-' x 78 ."\n"); # do not join a minus with a bare word, because you might form # a file test operator. Here "z-i" would be taken as a file test. if(CORE::abs($z- i)<$eps); # '= -' should not become =- or you will get a warning # and something like these could become ambiguous without space # after the '-': use constant III=>1; $a=$b- III; $a=- III; # keep a space between a token ending in '$' and any word; die@$ if$@; # avoid combining tokens to create new meanings. Example: # this must not become $a++$b $a+ +$b; # another example: do not combine these two &'s: allow_options& &OPT_EXECCGI; # Perl is sensitive to whitespace after the + here: $b=xvals$a + 0.1*yvals$a; # keep paren separate here: use Foo::Bar (); # need space after foreach my; for example, this will fail in # older versions of Perl: foreach my$ft(@filetypes)... 
# must retain space between grep and left paren; "grep(" may fail my$match=grep (m/^-extrude$/,@list)?1:0; # don't stick numbers next to left parens, as in: use Mail::Internet 1.28 (); # do not remove space between an '&' and a bare word because # it may turn into a function evaluation, like here # between '&' and 'O_ACCMODE', producing a syntax error [File.pm] $opts{rdonly}=(($opts{mode}& O_ACCMODE)==O_RDONLY); #6........... }, 'essential.essential2' => { source => "essential", params => "essential2", expect => <<'#7...........', # Run with mangle to squeeze out the white space # also run with extrude # never combine two bare words or numbers status and ::ok( 1 ) ; return ::spw( ... ) ; for bla::bla:: abc ; # do not combine 'overload::' and 'and' if $self -> {bareStringify} and ref $_ and defined %overload:: and defined &{ 'overload::StrVal' } ; # do not combine 'SINK' and 'if' my$size = -s::SINK if $file ; # do not combine to make $inputeq"quit" if ( $input eq "quit" ) ; # do not combine a number with a concatenation dot to get a float '78.' $vt100_compatible? "\e[0;0H" : ( '-' x 78 . "\n" ) ; # do not join a minus with a bare word, because you might form # a file test operator. Here "z-i" would be taken as a file test. if ( CORE::abs ( $z - i ) < $eps ) ; # '= -' should not become =- or you will get a warning # and something like these could become ambiguous without space # after the '-': use constant III=> 1 ; $a = $b - III ; $a = - III ; # keep a space between a token ending in '$' and any word; die @$ if $@ ; # avoid combining tokens to create new meanings. Example: # this must not become $a++$b $a + + $b ; # another example: do not combine these two &'s: allow_options & &OPT_EXECCGI ; # Perl is sensitive to whitespace after the + here: $b = xvals$a + 0.1 * yvals$a; # keep paren separate here: use Foo::Bar ( ) ; # need space after foreach my; for example, this will fail in # older versions of Perl: foreach my$ft ( @filetypes ) ... 
# must retain space between grep and left paren; "grep(" may fail my$match = grep ( m/^-extrude$/ , @list ) ? 1 : 0 ; # don't stick numbers next to left parens, as in: use Mail::Internet 1.28 ( ) ; # do not remove space between an '&' and a bare word because # it may turn into a function evaluation, like here # between '&' and 'O_ACCMODE', producing a syntax error [File.pm] $opts{rdonly} = ( ( $opts{mode} & O_ACCMODE ) == O_RDONLY ) ; #7........... }, 'extrude1.def' => { source => "extrude1", params => "def", expect => <<'#8...........', # do not break before the ++ print $x++ . "\n"; #8........... }, 'extrude1.extrude' => { source => "extrude1", params => "extrude", expect => <<'#9...........', # do not break before the ++ print$x++ . "\n" ; #9........... }, 'extrude2.def' => { source => "extrude2", params => "def", expect => <<'#10...........', if ( -l pid_filename() ) { return readlink( pid_filename() ); } #10........... }, 'extrude2.extrude' => { source => "extrude2", params => "extrude", expect => <<'#11...........', if ( -l pid_filename( ) ) { return readlink ( pid_filename( ) ) ; } #11........... }, 'extrude3.def' => { source => "extrude3", params => "def", expect => <<'#12...........', # Breaking before a ++ can cause perl to guess wrong print( ( $i++ & 1 ) ? $_ : ( $change{$_} || $_ ) ); # Space between '&' and 'O_ACCMODE' is essential here $opts{rdonly} = ( ( $opts{mode} & O_ACCMODE ) == O_RDONLY ); #12........... }, 'extrude3.extrude' => { source => "extrude3", params => "extrude", expect => <<'#13...........', # Breaking before a ++ can cause perl to guess wrong print ( ( $i++ & 1 ) ? $_ : ( $change{ $_ } || $_ ) ) ; # Space between '&' and 'O_ACCMODE' is essential here $opts{rdonly} = ( ( $opts{mode} & O_ACCMODE ) == O_RDONLY ) ; #13........... 
}, 'extrude4.def' => { source => "extrude4", params => "def", expect => <<'#14...........', # From Safe.pm caused trouble with extrude use Opcode 1.01, qw( opset opset_to_ops opmask_add empty_opset full_opset invert_opset verify_opset opdesc opcodes opmask define_optag opset_to_hex ); #14........... }, 'extrude4.extrude' => { source => "extrude4", params => "extrude", expect => <<'#15...........', # From Safe.pm caused trouble with extrude use Opcode 1.01 , qw( opset opset_to_ops opmask_add empty_opset full_opset invert_opset verify_opset opdesc opcodes opmask define_optag opset_to_hex ) ; #15........... }, 'fabrice_bug.def' => { source => "fabrice_bug", params => "def", expect => <<'#16...........', # no space around ^variable with -bt=0 my $before = ${^PREMATCH}; my $after = ${PREMATCH}; #16........... }, 'fabrice_bug.fabrice_bug' => { source => "fabrice_bug", params => "fabrice_bug", expect => <<'#17...........', # no space around ^variable with -bt=0 my $before = ${^PREMATCH}; my $after = ${ PREMATCH }; #17........... }, 'format1.def' => { source => "format1", params => "def", expect => <<'#18...........', if (/^--list$/o) { format = @<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< $_, $val . print "Available strips:\n"; for ( split( /\|/, $known_strips ) ) { $val = $defs{$_}{'name'}; write; } } #18........... }, 'given1.def' => { source => "given1", params => "def", expect => <<'#19...........', given ( [ 9, "a", 11 ] ) { when (qr/\d/) { given ($count) { when (1) { ok( $count == 1 ) } else { ok( $count != 1 ) } when ( [ 5, 6 ] ) { ok(0) } else { ok(1) } } } ok(1) when 11; } #19........... }, 'gnu1.def' => { source => "gnu1", params => "def", expect => <<'#20...........', @common_sometimes = ( "aclocal.m4", "acconfig.h", "config.h.top", "config.h.bot", "stamp-h.in", 'stamp-vti' ); #20........... 
}, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? $rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<<STDERR>>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets2.t0000644000175000017500000003231414373177245015224 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 angle.def #2 arrows1.def #3 arrows2.def #4 attrib1.def #5 attrib2.def #6 attrib3.def #7 bar1.bar #8 bar1.def #9 block1.def #10 boc1.boc #11 boc1.def #12 boc2.boc #13 boc2.def #14 break1.def #15 break2.def #16 break3.def #17 break4.def #18 carat.def #19 ce1.ce #20 ce1.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'bar' => "-bar", 'boc' => "-boc", 'ce' => "-cuddled-blocks", 'def' => "", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'angle' => <<'----------', # This is an angle operator: @message_list =sort sort_algorithm < INDEX_FILE >;# angle operator # Not an angle operator: # Patched added in guess routine for this case: if ( VERSION < 5.009 && $op->name eq 'aassign' ) { } ---------- 'arrows1' => <<'----------', # remove spaces around arrows my $obj = Bio::Variation::AAChange -> new; my $termcap = Term::Cap -> Tgetent( { TERM => undef } ); ---------- 'arrows2' => <<'----------', $_[ 0]-> Blue -> backColor(( $_[ 0]-> Blue -> backColor == cl::Blue ) ? cl::LightBlue : cl::Blue ); ---------- 'attrib1' => <<'----------', sub be_careful () : locked method { my $self = shift; # ... 
} ---------- 'attrib2' => <<'----------', sub witch () # prototype may be on new line, but cannot put line break within prototype : locked { print "and your little dog "; } ---------- 'attrib3' => <<'----------', package Canine; package Dog; my Canine $spot : Watchful ; package Felis; my $cat : Nervous; package X; sub foo : locked ; package X; sub Y::x : locked { 1 } package X; sub foo { 1 } package Y; BEGIN { *bar = \&X::foo; } package Z; sub Y::bar : locked ; ---------- 'bar1' => <<'----------', if ($bigwasteofspace1 && $bigwasteofspace2 || $bigwasteofspace3 && $bigwasteofspace4) { } ---------- 'block1' => <<'----------', # Some block tests print "start main running\n"; die "main now dying\n"; END {$a=6; print "1st end, a=$a\n"} CHECK {$a=8; print "1st check, a=$a\n"} INIT {$a=10; print "1st init, a=$a\n"} END {$a=12; print "2nd end, a=$a\n"} BEGIN {$a=14; print "1st begin, a=$a\n"} INIT {$a=16; print "2nd init, a=$a\n"} BEGIN {$a=18; print "2nd begin, a=$a\n"} CHECK {$a=20; print "2nd check, a=$a\n"} END {$a=23; print "3rd end, a=$a\n"} ---------- 'boc1' => <<'----------', # RT#98902 # Running with -boc (break-at-old-comma-breakpoints) should not # allow forming a single line my @bar = map { { number => $_, character => chr $_, padding => (' ' x $_), } } ( 0 .. 32 ); ---------- 'boc2' => <<'----------', my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); ---------- 'break1' => <<'----------', # break at ; $self->__print("*** Type 'p' now to show start up log\n") ; # XXX add to banner? 
---------- 'break2' => <<'----------', # break before the '->' ( $current_feature_item->children )[0]->set( $current_feature->primary_tag ); $sth->{'Database'}->{'xbase_tables'}->{ $parsed_sql->{'table'}[0] }->field_type($_); ---------- 'break3' => <<'----------', # keep the anonymous hash block together: my $red_color = $widget->window->get_colormap->color_alloc( { red => 65000, green => 0, blue => 0 } ); ---------- 'break4' => <<'----------', spawn( "$LINTIAN_ROOT/unpack/list-binpkg", "$LINTIAN_LAB/info/binary-packages", $v ) == 0 or fail("cannot create binary package list"); ---------- 'carat' => <<'----------', my $a=${^WARNING_BITS}; @{^HOWDY_PARDNER}=(101,102); ${^W} = 1; $bb[$^]] = "bubba"; ---------- 'ce1' => <<'----------', # test -ce with blank lines and comments between blocks if($value[0] =~ /^(\#)/){ # skip any comment line last SWITCH; } elsif($value[0] =~ /^(o)$/ or $value[0] =~ /^(os)$/){ $os=$value[1]; last SWITCH; } elsif($value[0] =~ /^(b)$/ or $value[0] =~ /^(dbfile)$/) # comment { $dbfile=$value[1]; last SWITCH; # Add the additional site }else{ $rebase_hash{$name} .= " $site"; } ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'angle.def' => { source => "angle", params => "def", expect => <<'#1...........', # This is an angle operator: @message_list = sort sort_algorithm < INDEX_FILE >; # angle operator # Not an angle operator: # Patched added in guess routine for this case: if ( VERSION < 5.009 && $op->name eq 'aassign' ) { } #1........... }, 'arrows1.def' => { source => "arrows1", params => "def", expect => <<'#2...........', # remove spaces around arrows my $obj = Bio::Variation::AAChange->new; my $termcap = Term::Cap->Tgetent( { TERM => undef } ); #2........... }, 'arrows2.def' => { source => "arrows2", params => "def", expect => <<'#3...........', $_[0]->Blue->backColor( ( $_[0]->Blue->backColor == cl::Blue ) ? cl::LightBlue : cl::Blue ); #3........... 
}, 'attrib1.def' => { source => "attrib1", params => "def", expect => <<'#4...........', sub be_careful () : locked method { my $self = shift; # ... } #4........... }, 'attrib2.def' => { source => "attrib2", params => "def", expect => <<'#5...........', sub witch () # prototype may be on new line, but cannot put line break within prototype : locked { print "and your little dog "; } #5........... }, 'attrib3.def' => { source => "attrib3", params => "def", expect => <<'#6...........', package Canine; package Dog; my Canine $spot : Watchful; package Felis; my $cat : Nervous; package X; sub foo : locked; package X; sub Y::x : locked { 1 } package X; sub foo { 1 } package Y; BEGIN { *bar = \&X::foo; } package Z; sub Y::bar : locked; #6........... }, 'bar1.bar' => { source => "bar1", params => "bar", expect => <<'#7...........', if ( $bigwasteofspace1 && $bigwasteofspace2 || $bigwasteofspace3 && $bigwasteofspace4 ) { } #7........... }, 'bar1.def' => { source => "bar1", params => "def", expect => <<'#8...........', if ( $bigwasteofspace1 && $bigwasteofspace2 || $bigwasteofspace3 && $bigwasteofspace4 ) { } #8........... }, 'block1.def' => { source => "block1", params => "def", expect => <<'#9...........', # Some block tests print "start main running\n"; die "main now dying\n"; END { $a = 6; print "1st end, a=$a\n" } CHECK { $a = 8; print "1st check, a=$a\n" } INIT { $a = 10; print "1st init, a=$a\n" } END { $a = 12; print "2nd end, a=$a\n" } BEGIN { $a = 14; print "1st begin, a=$a\n" } INIT { $a = 16; print "2nd init, a=$a\n" } BEGIN { $a = 18; print "2nd begin, a=$a\n" } CHECK { $a = 20; print "2nd check, a=$a\n" } END { $a = 23; print "3rd end, a=$a\n" } #9........... }, 'boc1.boc' => { source => "boc1", params => "boc", expect => <<'#10...........', # RT#98902 # Running with -boc (break-at-old-comma-breakpoints) should not # allow forming a single line my @bar = map { { number => $_, character => chr $_, padding => ( ' ' x $_ ), } } ( 0 .. 32 ); #10........... 
}, 'boc1.def' => { source => "boc1", params => "def", expect => <<'#11...........', # RT#98902 # Running with -boc (break-at-old-comma-breakpoints) should not # allow forming a single line my @bar = map { { number => $_, character => chr $_, padding => ( ' ' x $_ ), } } ( 0 .. 32 ); #11........... }, 'boc2.boc' => { source => "boc2", params => "boc", expect => <<'#12...........', my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1, ); #12........... }, 'boc2.def' => { source => "boc2", params => "def", expect => <<'#13...........', my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1, ); #13........... }, 'break1.def' => { source => "break1", params => "def", expect => <<'#14...........', # break at ; $self->__print("*** Type 'p' now to show start up log\n") ; # XXX add to banner? #14........... }, 'break2.def' => { source => "break2", params => "def", expect => <<'#15...........', # break before the '->' ( $current_feature_item->children )[0] ->set( $current_feature->primary_tag ); $sth->{'Database'}->{'xbase_tables'}->{ $parsed_sql->{'table'}[0] } ->field_type($_); #15........... }, 'break3.def' => { source => "break3", params => "def", expect => <<'#16...........', # keep the anonymous hash block together: my $red_color = $widget->window->get_colormap->color_alloc( { red => 65000, green => 0, blue => 0 } ); #16........... }, 'break4.def' => { source => "break4", params => "def", expect => <<'#17...........', spawn( "$LINTIAN_ROOT/unpack/list-binpkg", "$LINTIAN_LAB/info/binary-packages", $v ) == 0 or fail("cannot create binary package list"); #17........... }, 'carat.def' => { source => "carat", params => "def", expect => <<'#18...........', my $a = ${^WARNING_BITS}; @{^HOWDY_PARDNER} = ( 101, 102 ); ${^W} = 1; $bb[$^]] = "bubba"; #18........... 
}, 'ce1.ce' => { source => "ce1", params => "ce", expect => <<'#19...........', # test -ce with blank lines and comments between blocks if ( $value[0] =~ /^(\#)/ ) { # skip any comment line last SWITCH; } elsif ( $value[0] =~ /^(o)$/ or $value[0] =~ /^(os)$/ ) { $os = $value[1]; last SWITCH; } elsif ( $value[0] =~ /^(b)$/ or $value[0] =~ /^(dbfile)$/ ) # comment { $dbfile = $value[1]; last SWITCH; # Add the additional site } else { $rebase_hash{$name} .= " $site"; } #19........... }, 'ce1.def' => { source => "ce1", params => "def", expect => <<'#20...........', # test -ce with blank lines and comments between blocks if ( $value[0] =~ /^(\#)/ ) { # skip any comment line last SWITCH; } elsif ( $value[0] =~ /^(o)$/ or $value[0] =~ /^(os)$/ ) { $os = $value[1]; last SWITCH; } elsif ( $value[0] =~ /^(b)$/ or $value[0] =~ /^(dbfile)$/ ) # comment { $dbfile = $value[1]; last SWITCH; # Add the additional site } else { $rebase_hash{$name} .= " $site"; } #20........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',                 # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string, # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets6.t0000644000175000017500000003042114373177246015226 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 otr1.otr #2 pbp1.def #3 pbp1.pbp #4 pbp2.def #5 pbp2.pbp #6 pbp3.def #7 pbp3.pbp #8 pbp4.def #9 pbp4.pbp #10 pbp5.def #11 pbp5.pbp #12 print1.def #13 q1.def #14 q2.def #15 recombine1.def #16 recombine2.def #17 recombine3.def #18 recombine4.def #19 rt101547.def #20 rt102371.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'otr' => <<'----------', -ohbr -opr -osbr ---------- 'pbp' => "-pbp -nst -nse", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'otr1' => <<'----------', return $pdl->slice( join ',', ( map { $_ eq "X" ? ":" : ref $_ eq "ARRAY" ? join ':', @$_ : !ref $_ ? $_ : die "INVALID SLICE DEF $_" } @_ ) ); ---------- 'pbp1' => <<'----------', # break after '+' if default, before + if pbp my $min_gnu_indentation = $standard_increment + $gnu_stack[$max_gnu_stack_index]->get_SPACES(); ---------- 'pbp2' => <<'----------', $tmp = $day - 32075 + 1461 * ( $year + 4800 - ( 14 - $month ) / 12 ) / 4 + 367 * ( $month - 2 + ( ( 14 - $month ) / 12 ) * 12 ) / 12 - 3 * ( ( $year + 4900 - ( 14 - $month ) / 12 ) / 100 ) / 4; ---------- 'pbp3' => <<'----------', return $sec + $SecOff + ( SECS_PER_MINUTE * $min ) + ( SECS_PER_HOUR * $hour ) + ( SECS_PER_DAY * $days ); ---------- 'pbp4' => <<'----------', # with defaults perltidy will break after the '=' here my @host_seq = $level eq "easy" ? 
@reordered : 0..$last; # reordered has CDROM up front ---------- 'pbp5' => <<'----------', # illustates problem with -pbp: -ci should not equal -i say 'ok_200_24_hours.value '.average({'$and'=>[{time=>{'$gt',$time-60*60*24}},{status=>200}]}); ---------- 'print1' => <<'----------', # same text twice. Has uncontained commas; -- leave as is print "conformability (Not the same dimension)\n", "\t", $have, " is ", text_unit($hu), "\n", "\t", $want, " is ", text_unit($wu), "\n",; print "conformability (Not the same dimension)\n", "\t", $have, " is ", text_unit($hu), "\n", "\t", $want, " is ", text_unit($wu), "\n", ; ---------- 'q1' => <<'----------', print qq(You are in zone $thisTZ Difference with respect to GMT is ), $offset / 3600, qq( hours And local time is $hour hours $min minutes $sec seconds ); ---------- 'q2' => <<'----------', $a=qq XHello World\nX; print "$a"; ---------- 'recombine1' => <<'----------', # recombine '= [' here: $retarray = [ &{ $sth->{'xbase_parsed_sql'}{'selectfn'} } ( $xbase, $values, $sth->{'xbase_bind_values'} ) ] if defined $values; ---------- 'recombine2' => <<'----------', # recombine = unless old break there $a = [ length( $self->{fb}[-1] ), $#{ $self->{fb} } ] ; # set cursor at end of buffer and print this cursor ---------- 'recombine3' => <<'----------', # recombine final line $command = ( ($catpage =~ m:\.gz:) ? $ZCAT : $CAT ) . 
" < $catpage"; ---------- 'recombine4' => <<'----------', # do not recombine into two lines after a comma if # the term is complex (has parens) or changes level $delta_time = sprintf "%.4f", ( ( $done[0] + ( $done[1] / 1e6 ) ) - ( $start[0] + ( $start[1] / 1e6 ) ) ); ---------- 'rt101547' => <<'----------', { source_host => MM::Config->instance->host // q{}, } ---------- 'rt102371' => <<'----------', state $b //= ccc(); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'otr1.otr' => { source => "otr1", params => "otr", expect => <<'#1...........', return $pdl->slice( join ',', ( map { $_ eq "X" ? ":" : ref $_ eq "ARRAY" ? join ':', @$_ : !ref $_ ? $_ : die "INVALID SLICE DEF $_" } @_ ) ); #1........... }, 'pbp1.def' => { source => "pbp1", params => "def", expect => <<'#2...........', # break after '+' if default, before + if pbp my $min_gnu_indentation = $standard_increment + $gnu_stack[$max_gnu_stack_index]->get_SPACES(); #2........... }, 'pbp1.pbp' => { source => "pbp1", params => "pbp", expect => <<'#3...........', # break after '+' if default, before + if pbp my $min_gnu_indentation = $standard_increment + $gnu_stack[$max_gnu_stack_index]->get_SPACES(); #3........... }, 'pbp2.def' => { source => "pbp2", params => "def", expect => <<'#4...........', $tmp = $day - 32075 + 1461 * ( $year + 4800 - ( 14 - $month ) / 12 ) / 4 + 367 * ( $month - 2 + ( ( 14 - $month ) / 12 ) * 12 ) / 12 - 3 * ( ( $year + 4900 - ( 14 - $month ) / 12 ) / 100 ) / 4; #4........... }, 'pbp2.pbp' => { source => "pbp2", params => "pbp", expect => <<'#5...........', $tmp = $day - 32075 + 1461 * ( $year + 4800 - ( 14 - $month ) / 12 ) / 4 + 367 * ( $month - 2 + ( ( 14 - $month ) / 12 ) * 12 ) / 12 - 3 * ( ( $year + 4900 - ( 14 - $month ) / 12 ) / 100 ) / 4; #5........... 
}, 'pbp3.def' => { source => "pbp3", params => "def", expect => <<'#6...........', return $sec + $SecOff + ( SECS_PER_MINUTE * $min ) + ( SECS_PER_HOUR * $hour ) + ( SECS_PER_DAY * $days ); #6........... }, 'pbp3.pbp' => { source => "pbp3", params => "pbp", expect => <<'#7...........', return $sec + $SecOff + ( SECS_PER_MINUTE * $min ) + ( SECS_PER_HOUR * $hour ) + ( SECS_PER_DAY * $days ); #7........... }, 'pbp4.def' => { source => "pbp4", params => "def", expect => <<'#8...........', # with defaults perltidy will break after the '=' here my @host_seq = $level eq "easy" ? @reordered : 0 .. $last; # reordered has CDROM up front #8........... }, 'pbp4.pbp' => { source => "pbp4", params => "pbp", expect => <<'#9...........', # with defaults perltidy will break after the '=' here my @host_seq = $level eq "easy" ? @reordered : 0 .. $last; # reordered has CDROM up front #9........... }, 'pbp5.def' => { source => "pbp5", params => "def", expect => <<'#10...........', # illustates problem with -pbp: -ci should not equal -i say 'ok_200_24_hours.value ' . average( { '$and' => [ { time => { '$gt', $time - 60 * 60 * 24 } }, { status => 200 } ] } ); #10........... }, 'pbp5.pbp' => { source => "pbp5", params => "pbp", expect => <<'#11...........', # illustates problem with -pbp: -ci should not equal -i say 'ok_200_24_hours.value ' . average( { '$and' => [ { time => { '$gt', $time - 60 * 60 * 24 } }, { status => 200 } ] } ); #11........... }, 'print1.def' => { source => "print1", params => "def", expect => <<'#12...........', # same text twice. Has uncontained commas; -- leave as is print "conformability (Not the same dimension)\n", "\t", $have, " is ", text_unit($hu), "\n", "\t", $want, " is ", text_unit($wu), "\n",; print "conformability (Not the same dimension)\n", "\t", $have, " is ", text_unit($hu), "\n", "\t", $want, " is ", text_unit($wu), "\n", ; #12........... 
}, 'q1.def' => { source => "q1", params => "def", expect => <<'#13...........', print qq(You are in zone $thisTZ Difference with respect to GMT is ), $offset / 3600, qq( hours And local time is $hour hours $min minutes $sec seconds ); #13........... }, 'q2.def' => { source => "q2", params => "def", expect => <<'#14...........', $a = qq XHello World\nX; print "$a"; #14........... }, 'recombine1.def' => { source => "recombine1", params => "def", expect => <<'#15...........', # recombine '= [' here: $retarray = [ &{ $sth->{'xbase_parsed_sql'}{'selectfn'} } ( $xbase, $values, $sth->{'xbase_bind_values'} ) ] if defined $values; #15........... }, 'recombine2.def' => { source => "recombine2", params => "def", expect => <<'#16...........', # recombine = unless old break there $a = [ length( $self->{fb}[-1] ), $#{ $self->{fb} } ] ; # set cursor at end of buffer and print this cursor #16........... }, 'recombine3.def' => { source => "recombine3", params => "def", expect => <<'#17...........', # recombine final line $command = ( ( $catpage =~ m:\.gz: ) ? $ZCAT : $CAT ) . " < $catpage"; #17........... }, 'recombine4.def' => { source => "recombine4", params => "def", expect => <<'#18...........', # do not recombine into two lines after a comma if # the term is complex (has parens) or changes level $delta_time = sprintf "%.4f", ( ( $done[0] + ( $done[1] / 1e6 ) ) - ( $start[0] + ( $start[1] / 1e6 ) ) ); #18........... }, 'rt101547.def' => { source => "rt101547", params => "def", expect => <<'#19...........', { source_host => MM::Config->instance->host // q{}, } #19........... }, 'rt102371.def' => { source => "rt102371", params => "def", expect => <<'#20...........', state $b //= ccc(); #20........... 
        },
    };

    my $ntests = 0 + keys %{$rtests};
    plan tests => $ntests;
}

###############
# EXECUTE TESTS
###############

foreach my $key ( sort keys %{$rtests} ) {
    my $output;
    my $sname  = $rtests->{$key}->{source};
    my $expect = $rtests->{$key}->{expect};
    my $pname  = $rtests->{$key}->{params};
    my $source = $rsources->{$sname};
    my $params = defined($pname) ? $rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',                 # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string, # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/testss.t0000644000175000017500000000262110441455207014605 0ustar stevesteveuse strict; use Test; use Carp; BEGIN {plan tests => 1} use Perl::Tidy; #---------------------------------------------------------------------- ## test string->string #---------------------------------------------------------------------- my $source = <<'EOM'; %height=("letter",27.9, "legal",35.6, "arche",121.9, "archd",91.4, "archc",61, "archb",45.7, "archa",30.5, "flsa",33, "flse",33, "halfletter",21.6, "11x17",43.2, "ledger",27.9); %width=("letter",21.6, "legal",21.6, "arche",91.4, "archd",61, "archc",45.7, "archb",30.5, "archa",22.9, "flsa",21.6, "flse",21.6, "halfletter",14, "11x17",27.9, "ledger",43.2); EOM my $perltidyrc = <<'EOM'; -gnu EOM my $output; Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$perltidyrc, argv => '-nsyn', ); my $expected_output=<<'EOM'; %height = ( "letter", 27.9, "legal", 35.6, "arche", 121.9, "archd", 91.4, "archc", 61, "archb", 45.7, "archa", 30.5, "flsa", 33, "flse", 33, "halfletter", 21.6, "11x17", 43.2, "ledger", 27.9 ); %width = ( "letter", 21.6, "legal", 21.6, "arche", 91.4, "archd", 61, "archc", 45.7, "archb", 30.5, "archa", 22.9, "flsa", 21.6, "flse", 21.6, "halfletter", 14, "11x17", 27.9, "ledger", 43.2 ); EOM ok($output, $expected_output); Perl-Tidy-20230309/t/atee.t0000644000175000017500000000422013653026577014207 0ustar stevesteveuse strict; use Test; use Carp; use Perl::Tidy; BEGIN { plan tests => 2; } my $sname = 'atee.t'; my $source = <<'EOM'; # block comment =pod some pod =cut print "hello world\n"; $xx++; # side comment EOM my $expect = <<'EOM'; print "hello world\n"; $xx++; EOM my $teefile_expect = <<'EOM'; # block comment =pod some pod =cut $xx++; # side comment EOM # Test capturing the .LOG, .DEBUG, .TEE outputs to strings. 
# In this test we delete all comments and pod in the test script and send them
# to a .TEE file; also save .DEBUG and .LOG output
my $params = "-dac -tac -D -g";

# Verify correctness of the formatted output and the .TEE output
# (.DEBUG and .LOG have been verified to work but are not checked here because
# they may change over time, making work for maintaining this test file)

my $output;
my $teefile;
my $debugfile;
my $stderr_string;
my $errorfile_string;
my $logfile_string;
my $debugfile_string;
my $err = Perl::Tidy::perltidy(
    source      => \$source,
    destination => \$output,
    perltidyrc  => \$params,
    argv        => '',                 # for safety; hide any ARGV from perltidy
    stderr      => \$stderr_string,
    errorfile   => \$errorfile_string, # not used when -se flag is set
    teefile     => \$teefile,
    debugfile   => \$debugfile_string,
    logfile     => \$logfile_string,
);

if ( $err || $stderr_string || $errorfile_string ) {
    if ($err) {
        print STDERR "This error received calling Perl::Tidy with '$sname'\n";
        ok( !$err );
    }
    if ($stderr_string) {
        print STDERR "---------------------\n";
        print STDERR "<<STDERR>>\n$stderr_string\n";
        print STDERR "---------------------\n";
        print STDERR "This error received calling Perl::Tidy with '$sname'\n";
        ok( !$stderr_string );
    }
    if ($errorfile_string) {
        print STDERR "---------------------\n";
        print STDERR "<<.ERR file>>\n$errorfile_string\n";
        print STDERR "---------------------\n";
        print STDERR "This error received calling Perl::Tidy with '$sname'\n";
        ok( !$errorfile_string );
    }
}
else {
    ok( $output,  $expect );
    ok( $teefile, $teefile_expect );
}
Perl-Tidy-20230309/t/snippets22.t0000644000175000017500000003637414373177245015320 0ustar stevesteve
# Created with: ./make_t.pl

# Contents:
#1 here_long.here_long
#2 bbhb.bbhb2
#3 bbhb.bbhb3
#4 bbhb.def
#5 bbhb.bbhb4
#6 bbhb.bbhb5
#7 braces.braces7
#8 xci.def
#9 xci.xci1
#10 xci.xci2
#11 mangle4.def
#12 mangle4.mangle
#13 extrude5.def
#14 extrude5.extrude
#15 kba1.def
#16 kba1.kba1
#17 git45.def
#18 git45.git45
#19 boa.boa

# To locate test #13 you
can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'bbhb2' => "-bbhb=2 -bbp=2", 'bbhb3' => "-bbhb=3 -bbp=3", 'bbhb4' => "-bbhb=3 -bbp=3 -bbhbi=2 -bbpi=2", 'bbhb5' => "-bbhb=3 -bbp=3 -bbhbi=1 -bbpi=1", 'boa' => <<'----------', # -boa is default so we test nboa -nboa ---------- 'braces7' => <<'----------', -bli -blil='*' -blixl='eval' ---------- 'def' => "", 'extrude' => "--extrude", 'git45' => "-vtc=1 -wn", 'here_long' => "-l=33", 'kba1' => <<'----------', -kbb='=> ,' -kba='=>' ---------- 'mangle' => "--mangle", 'xci1' => "-xci", 'xci2' => "-pbp -nst -nse -xci", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'bbhb' => <<'----------', my %temp = ( supsup => 123, nested => { asdf => 456, yarg => 'yarp', }, ); ---------- 'boa' => <<'----------', my @field : field : Default(1) : Get('Name' => 'foo') : Set('Name'); ---------- 'braces' => <<'----------', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; ---------- 'extrude5' => <<'----------', use perl6-alpha; $var{-y} = 1; ---------- 'git45' => <<'----------', # git#45 -vtc=n and -wn were not working together if ( $self->_add_fqdn_host( name => $name, realm => $realm ) ) { ...; } # do not stack 
)->pack( my $hlist = $control::control->Scrolled( 'HList', drawbranch => 1, width => 20, -scrollbars => 'w' )->pack( -side => 'bottom', -expand => 1 ); ---------- 'here_long' => <<'----------', # must not break after here target regardless of maximum-line-length $sth= $dbh->prepare (<<"END_OF_SELECT") or die "Couldn't prepare SQL" ; SELECT COUNT(duration),SUM(duration) FROM logins WHERE username='$user' END_OF_SELECT ---------- 'kba1' => <<'----------', $this_env = join("", $before, $closures , $contents , ($defenv ? '': &balance_tags()) , $reopens ); $_ = $after; method 'foo1' => [ Int, Int ] => sub { my ( $self, $x, $y ) = ( shift, @_ ); ...; }; method 'foo2'=> [ Int, Int ]=> sub { my ( $self, $x, $y ) = ( shift, @_ ); ...; }; ---------- 'mangle4' => <<'----------', # a useful parsing test from 'signatures.t' use feature "signatures"; no warnings "experimental::signatures"; sub t086 ( #foo))) $ #foo))) a #foo))) , #foo))) , #foo))) $ #foo))) b #foo))) = #foo))) 333 #foo))) , #foo))) , #foo))) ) #foo))) { $a.$b } ---------- 'xci' => <<'----------', $self->{_text} = ( !$section ? '' : $type eq 'item' ? "the $section entry" : "the section on $section" ) . ( $page ? ( $section ? ' in ' : '' ) . "the $page$page_ext manpage" : ' elsewhere in this document' ); my $otherHashRef = $condition ? { 'a' => 'a value', 'b' => 'b value', 'c' => { 'd' => 'd value', 'e' => 'e value' } } : undef; my @globlist = ( grep { defined } @opt{qw( l q S t )} ) ? 
do { local *DIR; opendir DIR, './' or die "can't opendir './': $!"; my @a = grep { not /^\.+$/ } readdir DIR; closedir DIR; @a; } : (); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'here_long.here_long' => { source => "here_long", params => "here_long", expect => <<'#1...........', # must not break after here target regardless of maximum-line-length $sth = $dbh->prepare( <<"END_OF_SELECT") or die "Couldn't prepare SQL"; SELECT COUNT(duration),SUM(duration) FROM logins WHERE username='$user' END_OF_SELECT #1........... }, 'bbhb.bbhb2' => { source => "bbhb", params => "bbhb2", expect => <<'#2...........', my %temp = ( supsup => 123, nested => { asdf => 456, yarg => 'yarp', }, ); #2........... }, 'bbhb.bbhb3' => { source => "bbhb", params => "bbhb3", expect => <<'#3...........', my %temp = ( supsup => 123, nested => { asdf => 456, yarg => 'yarp', }, ); #3........... }, 'bbhb.def' => { source => "bbhb", params => "def", expect => <<'#4...........', my %temp = ( supsup => 123, nested => { asdf => 456, yarg => 'yarp', }, ); #4........... }, 'bbhb.bbhb4' => { source => "bbhb", params => "bbhb4", expect => <<'#5...........', my %temp = ( supsup => 123, nested => { asdf => 456, yarg => 'yarp', }, ); #5........... }, 'bbhb.bbhb5' => { source => "bbhb", params => "bbhb5", expect => <<'#6...........', my %temp = ( supsup => 123, nested => { asdf => 456, yarg => 'yarp', }, ); #6........... 
}, 'braces.braces7' => { source => "braces", params => "braces7", expect => <<'#7...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; #7........... }, 'xci.def' => { source => "xci", params => "def", expect => <<'#8...........', $self->{_text} = ( !$section ? '' : $type eq 'item' ? "the $section entry" : "the section on $section" ) . ( $page ? ( $section ? ' in ' : '' ) . "the $page$page_ext manpage" : ' elsewhere in this document' ); my $otherHashRef = $condition ? { 'a' => 'a value', 'b' => 'b value', 'c' => { 'd' => 'd value', 'e' => 'e value' } } : undef; my @globlist = ( grep { defined } @opt{qw( l q S t )} ) ? do { local *DIR; opendir DIR, './' or die "can't opendir './': $!"; my @a = grep { not /^\.+$/ } readdir DIR; closedir DIR; @a; } : (); #8........... }, 'xci.xci1' => { source => "xci", params => "xci1", expect => <<'#9...........', $self->{_text} = ( !$section ? '' : $type eq 'item' ? "the $section entry" : "the section on $section" ) . ( $page ? ( $section ? ' in ' : '' ) . "the $page$page_ext manpage" : ' elsewhere in this document' ); my $otherHashRef = $condition ? { 'a' => 'a value', 'b' => 'b value', 'c' => { 'd' => 'd value', 'e' => 'e value' } } : undef; my @globlist = ( grep { defined } @opt{qw( l q S t )} ) ? do { local *DIR; opendir DIR, './' or die "can't opendir './': $!"; my @a = grep { not /^\.+$/ } readdir DIR; closedir DIR; @a; } : (); #9........... 
}, 'xci.xci2' => { source => "xci", params => "xci2", expect => <<'#10...........', $self->{_text} = ( !$section ? '' : $type eq 'item' ? "the $section entry" : "the section on $section" ) . ( $page ? ( $section ? ' in ' : '' ) . "the $page$page_ext manpage" : ' elsewhere in this document' ); my $otherHashRef = $condition ? { 'a' => 'a value', 'b' => 'b value', 'c' => { 'd' => 'd value', 'e' => 'e value' } } : undef; my @globlist = ( grep {defined} @opt{qw( l q S t )} ) ? do { local *DIR; opendir DIR, './' or die "can't opendir './': $!"; my @a = grep { not /^\.+$/ } readdir DIR; closedir DIR; @a; } : (); #10........... }, 'mangle4.def' => { source => "mangle4", params => "def", expect => <<'#11...........', # a useful parsing test from 'signatures.t' use feature "signatures"; no warnings "experimental::signatures"; sub t086 ( #foo))) $ #foo))) a #foo))) , #foo))) , #foo))) $ #foo))) b #foo))) = #foo))) 333 #foo))) , #foo))) , #foo))) ) #foo))) { $a . $b } #11........... }, 'mangle4.mangle' => { source => "mangle4", params => "mangle", expect => <<'#12...........', # a useful parsing test from 'signatures.t' use feature "signatures"; no warnings "experimental::signatures"; sub t086(#foo))) $ #foo))) a#foo))) ,#foo))) ,#foo))) $ #foo))) b#foo))) =#foo))) 333#foo))) ,#foo))) ,#foo))) )#foo))) {$a.$b} #12........... }, 'extrude5.def' => { source => "extrude5", params => "def", expect => <<'#13...........', use perl6-alpha; $var{-y} = 1; #13........... }, 'extrude5.extrude' => { source => "extrude5", params => "extrude", expect => <<'#14...........', use perl6-alpha ; $var{-y} = 1 ; #14........... }, 'kba1.def' => { source => "kba1", params => "def", expect => <<'#15...........', $this_env = join( "", $before, $closures, $contents, ( $defenv ? 
'' : &balance_tags() ), $reopens ); $_ = $after; method 'foo1' => [ Int, Int ] => sub { my ( $self, $x, $y ) = ( shift, @_ ); ...; }; method 'foo2' => [ Int, Int ] => sub { my ( $self, $x, $y ) = ( shift, @_ ); ...; }; #15........... }, 'kba1.kba1' => { source => "kba1", params => "kba1", expect => <<'#16...........', $this_env = join( "", $before, $closures , $contents , ( $defenv ? '' : &balance_tags() ) , $reopens ); $_ = $after; method 'foo1' => [ Int, Int ] => sub { my ( $self, $x, $y ) = ( shift, @_ ); ...; }; method 'foo2' => [ Int, Int ] => sub { my ( $self, $x, $y ) = ( shift, @_ ); ...; }; #16........... }, 'git45.def' => { source => "git45", params => "def", expect => <<'#17...........', # git#45 -vtc=n and -wn were not working together if ( $self->_add_fqdn_host( name => $name, realm => $realm ) ) { ...; } # do not stack )->pack( my $hlist = $control::control->Scrolled( 'HList', drawbranch => 1, width => 20, -scrollbars => 'w' )->pack( -side => 'bottom', -expand => 1 ); #17........... }, 'git45.git45' => { source => "git45", params => "git45", expect => <<'#18...........', # git#45 -vtc=n and -wn were not working together if ( $self->_add_fqdn_host( name => $name, realm => $realm ) ) { ...; } # do not stack )->pack( my $hlist = $control::control->Scrolled( 'HList', drawbranch => 1, width => 20, -scrollbars => 'w' )->pack( -side => 'bottom', -expand => 1 ); #18........... }, 'boa.boa' => { source => "boa", params => "boa", expect => <<'#19...........', my @field : field : Default(1) : Get('Name' => 'foo') : Set('Name'); #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',                 # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string, # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets23.t0000644000175000017500000004152714373177245015315 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 boa.def #2 bol.bol #3 bol.def #4 bot.bot #5 bot.def #6 hash_bang.def #7 hash_bang.hash_bang #8 listop1.listop1 #9 sbcp.def #10 sbcp.sbcp1 #11 wnxl.def #12 wnxl.wnxl1 #13 wnxl.wnxl2 #14 wnxl.wnxl3 #15 wnxl.wnxl4 #16 align34.def #17 git47.def #18 git47.git47 #19 qw.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'bol' => <<'----------', # -bol is default, so test -nbol -nbol ---------- 'bot' => <<'----------', # -bot is default so we test -nbot -nbot ---------- 'def' => "", 'git47' => <<'----------', # perltidyrc from git #47 -pbp # Start with Perl Best Practices -w # Show all warnings -iob # Ignore old breakpoints -l=120 # 120 characters per line -mbl=2 # No more than 2 blank lines -i=2 # Indentation is 2 columns -ci=2 # Continuation indentation is 2 columns -vt=0 # Less vertical tightness -pt=2 # High parenthesis tightness -bt=2 # High brace tightness -sbt=2 # High square bracket tightness -wn # Weld nested containers -isbc # Don't indent comments without leading space -nst # Don't output to STDOUT ---------- 'hash_bang' => "-x", 'listop1' => <<'----------', # -bok is default so we test nbok -nbok ---------- 'sbcp1' => <<'----------', -sbc -sbcp='#x#' ---------- 'wnxl1' => <<'----------', # only weld parens, and only if leading keyword -wn -wnxl='^K( [ { q' ---------- 'wnxl2' => <<'----------', # do not weld leading '[' -wn -wnxl='^[' ---------- 'wnxl3' => <<'----------', # do not weld interior or ending '{' without a keyword -wn -wnxl='.K{' ---------- 'wnxl4' => <<'----------', # do not weld 
except parens or trailing brace with keyword -wn -wnxl='.K{ ^{ [' ---------- }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'align34' => <<'----------', # align all '{' and runs of '=' if ( $line =~ /^NAME>(.*)/i ) { $Cookies{'name'} = $1; } elsif ( $line =~ /^EMAIL>(.*)/i ) { $email = $1; } elsif ( $line =~ /^IP_ADDRESS>(.*)/i ) { $ipaddress = $1; } elsif ( $line =~ /^/i ) { $remoteuser = $1; } elsif ( $line =~ /^PASSWORD>(.*)/i ) { next; } elsif ( $line =~ /^IMAGE>(.*)/i ) { $image_url = $1; } elsif ( $line =~ /^LINKNAME>(.*)/i ) { $linkname = $1; } elsif ( $line =~ /^LINKURL>(.*)/i ) { $linkurl = $1; } else { $body .= $line; } ---------- 'boa' => <<'----------', my @field : field : Default(1) : Get('Name' => 'foo') : Set('Name'); ---------- 'bol' => <<'----------', return unless $cmd = $cmd || ($dot && $Last_Shell) || &prompt('|'); ---------- 'bot' => <<'----------', $foo = $condition ? undef : 1; ---------- 'git47' => <<'----------', # cannot weld here $promises[$i]->then( sub { $all->resolve(@_); () }, sub { $results->[$i] = [@_]; $all->reject(@$results) if --$remaining <= 0; return (); } ); sub _absolutize { [ map { _is_scoped($_) ? 
$_ : [ [ [ 'pc', 'scope' ] ], ' ', @$_ ] } @{ shift() } ] } $c->helpers->log->debug( sub { my $req = $c->req; my $method = $req->method; my $path = $req->url->path->to_abs_string; $c->helpers->timing->begin('mojo.timer'); return qq{$method "$path"}; } ) unless $stash->{'mojo.static'}; # A single signature var can weld return Mojo::Promise->resolve($query_params)->then(&_reveal_event)->then( sub ($code) { return $c->render( text => '', status => $code ); } ); ---------- 'hash_bang' => <<'----------', # above spaces will be retained with -x but not by default #!/usr/bin/perl my $date = localtime(); ---------- 'listop1' => <<'----------', my @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, rand ] } @list; ---------- 'qw' => <<'----------', # do not outdent ending ) more than initial qw line if ( $pos == 0 ) { @return = grep( /^$word/, sort qw( ! a b d h i m o q r u autobundle clean make test install force reload look ) ); } # outdent ')' even if opening is not '(' @EXPORT = ( qw) i Re Im rho theta arg sqrt log ln log10 logn cbrt root cplx cplxe ), @trig ); # outdent '>' like ')' @EXPORT = ( qw< i Re Im rho theta arg sqrt log ln log10 logn cbrt root cplx cplxe >, @trig ); # but ';' not outdented @EXPORT = ( qw; i Re Im rho theta arg sqrt log ln log10 logn cbrt root cplx cplxe ;, @trig ); ---------- 'sbcp' => <<'----------', @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', #x# 'Dec', 'Nov' ## 'Dec', 'Nov' 'Nov', 'Dec' ); ---------- 'wnxl' => <<'----------', if ( $PLATFORM eq 'aix' ) { skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ] ); } if ( _add_fqdn_host( name => ..., fqdn => ... 
) ) { ...; } do {{ next if ($n % 2); print $n, "\n"; }} while ($n++ < 10); threads->create( sub { my (%hash3); share(%hash3); $hash2{hash} = \%hash3; $hash3{"thread"} = "yes"; } )->join(); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'boa.def' => { source => "boa", params => "def", expect => <<'#1...........', my @field : field : Default(1) : Get('Name' => 'foo') : Set('Name'); #1........... }, 'bol.bol' => { source => "bol", params => "bol", expect => <<'#2...........', return unless $cmd = $cmd || ( $dot && $Last_Shell ) || &prompt('|'); #2........... }, 'bol.def' => { source => "bol", params => "def", expect => <<'#3...........', return unless $cmd = $cmd || ( $dot && $Last_Shell ) || &prompt('|'); #3........... }, 'bot.bot' => { source => "bot", params => "bot", expect => <<'#4...........', $foo = $condition ? undef : 1; #4........... }, 'bot.def' => { source => "bot", params => "def", expect => <<'#5...........', $foo = $condition ? undef : 1; #5........... }, 'hash_bang.def' => { source => "hash_bang", params => "def", expect => <<'#6...........', # above spaces will be retained with -x but not by default #!/usr/bin/perl my $date = localtime(); #6........... }, 'hash_bang.hash_bang' => { source => "hash_bang", params => "hash_bang", expect => <<'#7...........', # above spaces will be retained with -x but not by default #!/usr/bin/perl my $date = localtime(); #7........... }, 'listop1.listop1' => { source => "listop1", params => "listop1", expect => <<'#8...........', my @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, rand ] } @list; #8........... }, 'sbcp.def' => { source => "sbcp", params => "def", expect => <<'#9...........', @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', #x# 'Dec', 'Nov' ## 'Dec', 'Nov' 'Nov', 'Dec' ); #9........... 
}, 'sbcp.sbcp1' => { source => "sbcp", params => "sbcp1", expect => <<'#10...........', @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', #x# 'Dec', 'Nov' ## 'Dec', 'Nov' 'Nov', 'Dec' ); #10........... }, 'wnxl.def' => { source => "wnxl", params => "def", expect => <<'#11...........', if ( $PLATFORM eq 'aix' ) { skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ] ); } if ( _add_fqdn_host( name => ..., fqdn => ... ) ) { ...; } do { { next if ( $n % 2 ); print $n, "\n"; } } while ( $n++ < 10 ); threads->create( sub { my (%hash3); share(%hash3); $hash2{hash} = \%hash3; $hash3{"thread"} = "yes"; } )->join(); #11........... }, 'wnxl.wnxl1' => { source => "wnxl", params => "wnxl1", expect => <<'#12...........', if ( $PLATFORM eq 'aix' ) { skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ] ); } if ( _add_fqdn_host( name => ..., fqdn => ... ) ) { ...; } do { { next if ( $n % 2 ); print $n, "\n"; } } while ( $n++ < 10 ); threads->create( sub { my (%hash3); share(%hash3); $hash2{hash} = \%hash3; $hash3{"thread"} = "yes"; } )->join(); #12........... }, 'wnxl.wnxl2' => { source => "wnxl", params => "wnxl2", expect => <<'#13...........', if ( $PLATFORM eq 'aix' ) { skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ] ); } if ( _add_fqdn_host( name => ..., fqdn => ... ) ) { ...; } do { { next if ( $n % 2 ); print $n, "\n"; } } while ( $n++ < 10 ); threads->create( sub { my (%hash3); share(%hash3); $hash2{hash} = \%hash3; $hash3{"thread"} = "yes"; } )->join(); #13........... }, 'wnxl.wnxl3' => { source => "wnxl", params => "wnxl3", expect => <<'#14...........', if ( $PLATFORM eq 'aix' ) { skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ] ); } if ( _add_fqdn_host( name => ..., fqdn => ... 
) ) { ...; } do { { next if ( $n % 2 ); print $n, "\n"; } } while ( $n++ < 10 ); threads->create( sub { my (%hash3); share(%hash3); $hash2{hash} = \%hash3; $hash3{"thread"} = "yes"; } )->join(); #14........... }, 'wnxl.wnxl4' => { source => "wnxl", params => "wnxl4", expect => <<'#15...........', if ( $PLATFORM eq 'aix' ) { skip_symbols( [ qw( Perl_dump_fds Perl_ErrorNo Perl_GetVars PL_sys_intern ) ] ); } if ( _add_fqdn_host( name => ..., fqdn => ... ) ) { ...; } do { { next if ( $n % 2 ); print $n, "\n"; } } while ( $n++ < 10 ); threads->create( sub { my (%hash3); share(%hash3); $hash2{hash} = \%hash3; $hash3{"thread"} = "yes"; } )->join(); #15........... }, 'align34.def' => { source => "align34", params => "def", expect => <<'#16...........', # align all '{' and runs of '=' if ( $line =~ /^NAME>(.*)/i ) { $Cookies{'name'} = $1; } elsif ( $line =~ /^EMAIL>(.*)/i ) { $email = $1; } elsif ( $line =~ /^IP_ADDRESS>(.*)/i ) { $ipaddress = $1; } elsif ( $line =~ /^/i ) { $remoteuser = $1; } elsif ( $line =~ /^PASSWORD>(.*)/i ) { next; } elsif ( $line =~ /^IMAGE>(.*)/i ) { $image_url = $1; } elsif ( $line =~ /^LINKNAME>(.*)/i ) { $linkname = $1; } elsif ( $line =~ /^LINKURL>(.*)/i ) { $linkurl = $1; } else { $body .= $line; } #16........... }, 'git47.def' => { source => "git47", params => "def", expect => <<'#17...........', # cannot weld here $promises[$i]->then( sub { $all->resolve(@_); () }, sub { $results->[$i] = [@_]; $all->reject(@$results) if --$remaining <= 0; return (); } ); sub _absolutize { [ map { _is_scoped($_) ? 
$_ : [ [ [ 'pc', 'scope' ] ], ' ', @$_ ] } @{ shift() } ] } $c->helpers->log->debug( sub { my $req = $c->req; my $method = $req->method; my $path = $req->url->path->to_abs_string; $c->helpers->timing->begin('mojo.timer'); return qq{$method "$path"}; } ) unless $stash->{'mojo.static'}; # A single signature var can weld return Mojo::Promise->resolve($query_params)->then(&_reveal_event)->then( sub ($code) { return $c->render( text => '', status => $code ); } ); #17........... }, 'git47.git47' => { source => "git47", params => "git47", expect => <<'#18...........', # cannot weld here $promises[$i]->then( sub { $all->resolve(@_); () }, sub { $results->[$i] = [@_]; $all->reject(@$results) if --$remaining <= 0; return (); } ); sub _absolutize { [map { _is_scoped($_) ? $_ : [[['pc', 'scope']], ' ', @$_] } @{shift()}] } $c->helpers->log->debug(sub { my $req = $c->req; my $method = $req->method; my $path = $req->url->path->to_abs_string; $c->helpers->timing->begin('mojo.timer'); return qq{$method "$path"}; }) unless $stash->{'mojo.static'}; # A single signature var can weld return Mojo::Promise->resolve($query_params)->then(&_reveal_event)->then(sub ($code) { return $c->render(text => '', status => $code); }); #18........... }, 'qw.def' => { source => "qw", params => "def", expect => <<'#19...........', # do not outdent ending ) more than initial qw line if ( $pos == 0 ) { @return = grep( /^$word/, sort qw( ! a b d h i m o q r u autobundle clean make test install force reload look ) ); } # outdent ')' even if opening is not '(' @EXPORT = ( qw) i Re Im rho theta arg sqrt log ln log10 logn cbrt root cplx cplxe ), @trig ); # outdent '>' like ')' @EXPORT = ( qw< i Re Im rho theta arg sqrt log ln log10 logn cbrt root cplx cplxe >, @trig ); # but ';' not outdented @EXPORT = ( qw; i Re Im rho theta arg sqrt log ln log10 logn cbrt root cplx cplxe ;, @trig ); #19........... 
        },
    };

    my $ntests = 0 + keys %{$rtests};
    plan tests => $ntests;
}

###############
# EXECUTE TESTS
###############

foreach my $key ( sort keys %{$rtests} ) {
    my $output;
    my $sname  = $rtests->{$key}->{source};
    my $expect = $rtests->{$key}->{expect};
    my $pname  = $rtests->{$key}->{params};
    my $source = $rsources->{$sname};
    my $params = defined($pname) ? $rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',             # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR
"#> Test '$key' gave unexpected output. String lengths differ: output=$leno, expected=$lene\n";
            }
        }
    }
}
Perl-Tidy-20230309/t/testwide-passthrough.t0000644000175000017500000000716214255414056017466 0ustar stevesteve
use strict;
use warnings;
use utf8;

use FindBin qw($Bin);
use File::Temp qw(tempfile);
use Test::More;

BEGIN { unshift @INC, "./" }
use Perl::Tidy;

# This tests the -eos (--encode-output-strings) which was added for issue
# git #83 to fix an issue with tidyall.
# NOTE: to prevent automatic conversion of line endings LF to CRLF under github
# Actions with Windows, which would cause test failure, it is essential that
# there be a file 't/.gitattributes' with the line:
# * -text

# The test file has no tidying needs but is UTF-8 encoded, so all passes
# through perltidy should read/write identical contents (previously only
# file test behaved correctly)

# Test::More in perl versions before 5.10 does not have sub note
# so just skip this test

plan( tests => 6 );

test_all();

sub my_note {
    my ($msg) = @_;

    # try to work around problem where sub Test::More::note does not exist
    # in older versions of perl
    if ($] >= 5.010) {
        note($msg);
    }
    return;
}

sub test_all {
    my $test_file = "$Bin/testwide-passthrough.pl.src";
    test_file2file($test_file);
    test_scalar2scalar($test_file);
    test_scalararray2scalararray($test_file);
}

sub test_file2file {
    my $test_file = shift;

    my $tmp_file = File::Temp->new( TMPDIR => 1 );

    my $source      = $test_file;
    my $destination = $tmp_file->filename();

    my_note("Testing file2file: '$source' => '$destination'\n");

    my $tidyresult = Perl::Tidy::perltidy(
        argv        => '-utf8 -npro',
        source      => $source,
        destination => $destination
    );
    ok( !$tidyresult, 'perltidy' );

    my $source_str      = slurp_raw($source);
    my $destination_str = slurp_raw($destination);

    my $source_hex      = unpack( 'H*', $source_str );
    my $destination_hex = unpack( 'H*', $destination_str );
    my_note("Comparing contents:\n $source_hex\n $destination_hex\n");

    ok( $source_hex eq $destination_hex, 'file content compare' );
}

sub test_scalar2scalar {
    my $testfile = shift;

    my $source = slurp_raw($testfile);
    my $destination;

    my_note("Testing scalar2scalar\n");

    my $tidyresult = Perl::Tidy::perltidy(
        argv        => '-utf8 -eos -npro',
        source      => \$source,
        destination => \$destination
    );
    ok( !$tidyresult, 'perltidy' );

    my $source_hex      = unpack( 'H*', $source );
    my $destination_hex = unpack( 'H*', $destination );

    my_note("Comparing contents:\n $source_hex\n $destination_hex\n");
    ok(
$source_hex eq $destination_hex, 'scalar content compare' );
}

sub test_scalararray2scalararray {
    my $testfile = shift;

    my $source      = [ lines_raw($testfile) ];
    my $destination = [];

    my_note("Testing scalararray2scalararray\n");

    my $tidyresult = Perl::Tidy::perltidy(
        argv        => '-utf8 -eos -npro',
        source      => $source,
        destination => $destination
    );
    ok( !$tidyresult, 'perltidy' );

    my $source_str      = join( "", @$source );
    my $destination_str = join( "", @$destination );

    my $source_hex      = unpack( 'H*', $source_str );
    my $destination_hex = unpack( 'H*', $destination_str );

    my_note("Comparing contents:\n $source_hex\n $destination_hex\n");
    ok( $source_hex eq $destination_hex, 'scalararray content compare' );
}

sub slurp_raw {
    my $filename = shift;

    open( TMP, '<', $filename );
    binmode( TMP, ':raw' );
    local $/;
    my $contents = <TMP>;
    close(TMP);

    return $contents;
}

sub lines_raw {
    my $filename = shift;

    open( TMP, '<', $filename );
    binmode( TMP, ':raw' );
    my @contents = <TMP>;
    close(TMP);

    return @contents;
}
Perl-Tidy-20230309/t/snippets18.t0000644000175000017500000005245414373177245015322 0ustar stevesteve
# Created with: ./make_t.pl

# Contents:
#1 wn7.wn
#2 wn8.def
#3 wn8.wn
#4 comments.comments5
#5 braces.braces1
#6 braces.braces2
#7 braces.braces3
#8 braces.def
#9 csc.csc1
#10 csc.csc2
#11 csc.def
#12 iob.def
#13 iob.iob
#14 kis.def
#15 kis.kis
#16 maths.def
#17 maths.maths1
#18 maths.maths2
#19 misc_tests.def

# To locate test #13 you can search for its name or the string '#13'

use strict;
use Test::More;
use Carp;
use Perl::Tidy;
my $rparams;
my $rsources;
my $rtests;

BEGIN {

    ###########################################
    # BEGIN SECTION 1: Parameter combinations #
    ###########################################
    $rparams = {
        'braces1'   => "-bl -asbl",
        'braces2'   => "-sbl",
        'braces3'   => "-bli -bbvt=1",
        'comments5' => <<'----------',
# testing --delete-side-comments and --nostatic-block-comments
-dsc -nsbc
----------
        'csc1' => "-csc -csci=2 -ncscb",
        'csc2' => "-dcsc",
        'def'  => "",
        'iob'  => "-iob",
'kis' => "-kis", 'maths1' => <<'----------', # testing -break-before-all-operators and no spaces around math operators -bbao -nwls="= + - / *" -nwrs="= + - / *" ---------- 'maths2' => <<'----------', # testing -break-after-all-operators and no spaces around math operators -baao -nwls="= + - / *" -nwrs="= + - / *" ---------- 'wn' => "-wn", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'braces' => <<'----------', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; ---------- 'comments' => <<'----------', #!/usr/bin/perl -w # an initial hash bang line cannot be deleted with -dp #<<< format skipping of first code can cause an error message in perltidy v20210625 my $rvar = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]; #>>> sub length { return length($_[0]) } # side comment # hanging side comment # very longgggggggggggggggggggggggggggggggggggggggggggggggggggg hanging side comment # a blank will be inserted to prevent forming a hanging side comment sub macro_get_names { # # # %name = macro_get_names(); (key=macrohandle, value=macroname) # ##local(%name); # a static block comment without indentation local(%name)=(); ## a static side comment to test -ssc # a spaced block comment to test -isbc for (0..$#mac_ver) { # a very long comment for testing the parameter --nooutdent-long-comments (or -nolc) $name{$_} = $mac_ext[$idx{$mac_exti[$_]}]; $vmsfile =~ s/;[\d\-]*$//; # very long side comment; Clip off 
version number; we can use a newer version as well } %name; } @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', ## 'Dec', 'Nov' [a static block comment with indentation] 'Nov', 'Dec'); { # this side comment will not align my $IGNORE = 0; # This is a side comment # This is a hanging side comment # And so is this # A blank line interrupts the hsc's; this is a block comment } # side comments at different indentation levels should not normally be aligned { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } #end level 4 } # end level 3 } # end level 2 } # end level 1 #<<< do not let perltidy touch this unless -nfs is set my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>>> #<< test alternate format skipping string my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>> # some blank lines follow =pod Some pod before __END__ to delete with -dp =cut __END__ # text following __END__, not a comment =pod Some pod after __END__ to delete with -dp and trim with -trp =cut ---------- 'csc' => <<'----------', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } ## end sub message ---------- 'iob' => <<'----------', return "this is a descriptive error message" if $res->is_error or not length $data; ---------- 'kis' => <<'----------', dbmclose(%verb_delim); undef %verb_delim; dbmclose(%expanded); undef %expanded; ---------- 'maths' => <<'----------', $tmp = $day - 32075 + 1461 * ( $year + 4800 - ( 14 - $month ) / 12 ) / 4 + 367 * ( $month - 2 + ( ( 14 - $month ) / 12 ) * 12 ) / 12 - 3 * ( ( $year + 4900 - ( 14 - $month ) / 12 ) / 100 ) / 4; return ( $r**$n ) * ( pi**( $n / 2 ) ) / ( sqrt(pi) * factorial( 2 * ( int( $n / 2 ) ) + 2 ) / factorial( int( $n / 2 ) + 1 ) / ( 4**( int( $n / 2 ) + 1 ) ) ); $root=-$b+sqrt($b*$b-4.*$a*$c)/(2.*$a); ---------- 'misc_tests' => <<'----------', for ( @a = @$ap, $u = shift @a; @a; $u = $v ) { ... 
} # test -sfs $i = 1 ; # test -sts $i = 0; ## =1; test -ssc ;;;; # test -ndsm my ( $a, $b, $c ) = @_; # test -nsak="my for" ---------- 'wn7' => <<'----------', # do not weld paren to opening one-line non-paren container $Self->_Add($SortOrderDisplay{$Field->GenerateFieldForSelectSQL()}); # this will not get welded with -wn f( do { 1; !!(my $x = bless []); } ); ---------- 'wn8' => <<'----------', # Former -wn blinkers, which oscillated between two states # fixed RULE 1 only applies to '(' my $res = eval { { $die_on_fetch, 0 } }; my $res = eval { { $die_on_fetch, 0 } }; # fixed RULE 2 applies to any inner opening token; this is a stable # state with -wn $app->FORM->{'appbar1'}->set_status( _("Cannot delete zone $name: sub-zones or appellations exist.") ); # OLD: fixed RULE 1: this is now a stable state with -wn # NEW (30 jan 2021): do not weld if one interior token $app->FORM->{'appbar1'}->set_status(_( "Cannot delete zone $name: sub-zones or appellations exist.")); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'wn7.wn' => { source => "wn7", params => "wn", expect => <<'#1...........', # do not weld paren to opening one-line non-paren container $Self->_Add( $SortOrderDisplay{ $Field->GenerateFieldForSelectSQL() } ); # this will not get welded with -wn f( do { 1; !!( my $x = bless [] ); } ); #1........... 
}, 'wn8.def' => { source => "wn8", params => "def", expect => <<'#2...........', # Former -wn blinkers, which oscillated between two states # fixed RULE 1 only applies to '(' my $res = eval { { $die_on_fetch, 0 } }; my $res = eval { { $die_on_fetch, 0 } }; # fixed RULE 2 applies to any inner opening token; this is a stable # state with -wn $app->FORM->{'appbar1'}->set_status( _("Cannot delete zone $name: sub-zones or appellations exist.") ); # OLD: fixed RULE 1: this is now a stable state with -wn # NEW (30 jan 2021): do not weld if one interior token $app->FORM->{'appbar1'}->set_status( _("Cannot delete zone $name: sub-zones or appellations exist.") ); #2........... }, 'wn8.wn' => { source => "wn8", params => "wn", expect => <<'#3...........', # Former -wn blinkers, which oscillated between two states # fixed RULE 1 only applies to '(' my $res = eval { { $die_on_fetch, 0 } }; my $res = eval { { $die_on_fetch, 0 } }; # fixed RULE 2 applies to any inner opening token; this is a stable # state with -wn $app->FORM->{'appbar1'}->set_status( _("Cannot delete zone $name: sub-zones or appellations exist.") ); # OLD: fixed RULE 1: this is now a stable state with -wn # NEW (30 jan 2021): do not weld if one interior token $app->FORM->{'appbar1'}->set_status( _("Cannot delete zone $name: sub-zones or appellations exist.") ); #3........... }, 'comments.comments5' => { source => "comments", params => "comments5", expect => <<'#4...........', #!/usr/bin/perl -w # an initial hash bang line cannot be deleted with -dp #<<< format skipping of first code can cause an error message in perltidy v20210625 my $rvar = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]; #>>> sub length { return length( $_[0] ) } # a blank will be inserted to prevent forming a hanging side comment sub macro_get_names { # # %name = macro_get_names(); (key=macrohandle, value=macroname) # ##local(%name); # a static block comment without indentation local (%name) = (); # a spaced block comment to test -isbc for ( 0 .. 
$#mac_ver ) { # a very long comment for testing the parameter --nooutdent-long-comments (or -nolc) $name{$_} = $mac_ext[ $idx{ $mac_exti[$_] } ]; $vmsfile =~ s/;[\d\-]*$//; } %name; } @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', ## 'Dec', 'Nov' [a static block comment with indentation] 'Nov', 'Dec' ); { my $IGNORE = 0; # A blank line interrupts the hsc's; this is a block comment } # side comments at different indentation levels should not normally be aligned { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } } } } #<<< do not let perltidy touch this unless -nfs is set my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>>> #<< test alternate format skipping string my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1, ); #>> # some blank lines follow =pod Some pod before __END__ to delete with -dp =cut __END__ # text following __END__, not a comment =pod Some pod after __END__ to delete with -dp and trim with -trp =cut #4........... }, 'braces.braces1' => { source => "braces", params => "braces1", expect => <<'#5...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; #5........... 
}, 'braces.braces2' => { source => "braces", params => "braces2", expect => <<'#6...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; #6........... }, 'braces.braces3' => { source => "braces", params => "braces3", expect => <<'#7...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; #7........... 
}, 'braces.def' => { source => "braces", params => "def", expect => <<'#8...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; #8........... }, 'csc.csc1' => { source => "csc", params => "csc1", expect => <<'#9...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } ## end if ( !defined( $_[0] )) else { print( $_[0], "\n" ); } ## end else [ if ( !defined( $_[0] )) } ## end sub message #9........... }, 'csc.csc2' => { source => "csc", params => "csc2", expect => <<'#10...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } #10........... }, 'csc.def' => { source => "csc", params => "def", expect => <<'#11...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } ## end sub message #11........... }, 'iob.def' => { source => "iob", params => "def", expect => <<'#12...........', return "this is a descriptive error message" if $res->is_error or not length $data; #12........... }, 'iob.iob' => { source => "iob", params => "iob", expect => <<'#13...........', return "this is a descriptive error message" if $res->is_error or not length $data; #13........... }, 'kis.def' => { source => "kis", params => "def", expect => <<'#14...........', dbmclose(%verb_delim); undef %verb_delim; dbmclose(%expanded); undef %expanded; #14........... 
}, 'kis.kis' => { source => "kis", params => "kis", expect => <<'#15...........', dbmclose(%verb_delim); undef %verb_delim; dbmclose(%expanded); undef %expanded; #15........... }, 'maths.def' => { source => "maths", params => "def", expect => <<'#16...........', $tmp = $day - 32075 + 1461 * ( $year + 4800 - ( 14 - $month ) / 12 ) / 4 + 367 * ( $month - 2 + ( ( 14 - $month ) / 12 ) * 12 ) / 12 - 3 * ( ( $year + 4900 - ( 14 - $month ) / 12 ) / 100 ) / 4; return ( $r**$n ) * ( pi**( $n / 2 ) ) / ( sqrt(pi) * factorial( 2 * ( int( $n / 2 ) ) + 2 ) / factorial( int( $n / 2 ) + 1 ) / ( 4**( int( $n / 2 ) + 1 ) ) ); $root = -$b + sqrt( $b * $b - 4. * $a * $c ) / ( 2. * $a ); #16........... }, 'maths.maths1' => { source => "maths", params => "maths1", expect => <<'#17...........', $tmp =$day-32075 +1461*( $year+4800-( 14-$month )/12 )/4 +367*( $month-2+( ( 14-$month )/12 )*12 )/12 -3*( ( $year+4900-( 14-$month )/12 )/100 )/4; return ( $r**$n ) *( pi**( $n/2 ) ) /( sqrt(pi) *factorial( 2*( int( $n/2 ) )+2 ) /factorial( int( $n/2 )+1 ) /( 4**( int( $n/2 )+1 ) ) ); $root=-$b+sqrt( $b*$b-4.*$a*$c )/( 2.*$a ); #17........... }, 'maths.maths2' => { source => "maths", params => "maths2", expect => <<'#18...........', $tmp= $day-32075+ 1461*( $year+4800-( 14-$month )/12 )/4+ 367*( $month-2+( ( 14-$month )/12 )*12 )/12- 3*( ( $year+4900-( 14-$month )/12 )/100 )/4; return ( $r**$n )* ( pi**( $n/2 ) )/ ( sqrt(pi)* factorial( 2*( int( $n/2 ) )+2 )/ factorial( int( $n/2 )+1 )/ ( 4**( int( $n/2 )+1 ) ) ); $root=-$b+sqrt( $b*$b-4.*$a*$c )/( 2.*$a ); #18........... }, 'misc_tests.def' => { source => "misc_tests", params => "def", expect => <<'#19...........', for ( @a = @$ap, $u = shift @a ; @a ; $u = $v ) { ... } # test -sfs $i = 1; # test -sts $i = 0; ## =1; test -ssc ; # test -ndsm my ( $a, $b, $c ) = @_; # test -nsak="my for" #19........... 
        },
    };

    my $ntests = 0 + keys %{$rtests};
    plan tests => $ntests;
}

###############
# EXECUTE TESTS
###############

foreach my $key ( sort keys %{$rtests} ) {
    my $output;
    my $sname  = $rtests->{$key}->{source};
    my $expect = $rtests->{$key}->{expect};
    my $pname  = $rtests->{$key}->{params};
    my $source = $rsources->{$sname};
    my $params = defined($pname) ? $rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',             # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR
"#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets17.t0000644000175000017500000007626214373177245015324 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 align32.def #2 bos.bos #3 bos.def #4 comments.comments1 #5 comments.comments2 #6 comments.comments3 #7 comments.comments4 #8 comments.def #9 long_line.def #10 long_line.long_line #11 pbp6.def #12 pbp6.pbp #13 rperl.def #14 rperl.rperl #15 rt132059.def #16 rt132059.rt132059 #17 signature.def #18 ternary4.def #19 wn7.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'bos' => "-bos", 'comments1' => <<'----------', # testing --fixed-position-side-comment=40, # --ignore-side-comment-lengths, # --noindent-block-comments, # --nohanging-side-comments # --static-side-comments # --trim-pod -fpsc=40 -iscl -nibc -nhsc -ssc -trp ---------- 'comments2' => <<'----------', # testing --minimum-space-to-comment=10, --delete-block-comments, --delete-pod -msc=10 -dbc -dp ---------- 'comments3' => <<'----------', # testing --maximum-consecutive-blank-lines=2 and --indent-spaced-block-comments --no-format-skipping -mbl=2 -isbc -nfs ---------- 'comments4' => <<'----------', # testing --keep-old-blank-lines=2 [=all] and # --nooutdent-long-comments and # --outdent-static-block-comments # --format-skipping-begin and --format-skipping-end -kbl=2 -nolc -osbc -fsb='#<{2,}' -fse='#>{2,}' ---------- 'def' => "", 'long_line' => "-l=0", 'pbp' => "-pbp -nst -nse", 'rperl' => "-pbp -nst --ignore-side-comment-lengths --converge -l=0 -q", 'rt132059' => "-dac", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'align32' => <<'----------', # align just the last two lines my 
$c_sub_khwnd = WindowFromId $k_hwnd, 0x8008; # FID_CLIENT ok $c_sub_khwnd, 'have kids client window'; ok IsWindow($c_sub_khwnd), 'IsWindow works on the client'; # parenless calls mkTextConfig $c, $x, $y, -anchor => 'se', $color; mkTextConfig $c, $x + 30, $y, -anchor => 's', $color; mkTextConfig $c, $x + 60, $y, -anchor => 'sw', $color; mkTextConfig $c, $x, $y + 30, -anchor => 'e', $color; permute_test [ 'a', 'b', 'c' ], '/', '/', [ 'a', 'b', 'c' ]; permute_test [ 'a,', 'b', 'c,' ], '/', '/', [ 'a,', 'b', 'c,' ]; permute_test [ 'a', ',', '#', 'c' ], '/', '/', [ 'a', ',', '#', 'c' ]; permute_test [ 'f_oo', 'b_ar' ], '/', '/', [ 'f_oo', 'b_ar' ]; # issue c093 - broken sub, but align fat commas use constant UNDEF_ONLY => sub { not defined $_[0] }; use constant EMPTY_OR_UNDEF => sub { !@_ or @_ == 1 && !defined $_[0]; }; ---------- 'bos' => <<'----------', $top_label->set_text( gettext("check permissions.") ) ; ---------- 'comments' => <<'----------', #!/usr/bin/perl -w # an initial hash bang line cannot be deleted with -dp #<<< format skipping of first code can cause an error message in perltidy v20210625 my $rvar = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]; #>>> sub length { return length($_[0]) } # side comment # hanging side comment # very longgggggggggggggggggggggggggggggggggggggggggggggggggggg hanging side comment # a blank will be inserted to prevent forming a hanging side comment sub macro_get_names { # # # %name = macro_get_names(); (key=macrohandle, value=macroname) # ##local(%name); # a static block comment without indentation local(%name)=(); ## a static side comment to test -ssc # a spaced block comment to test -isbc for (0..$#mac_ver) { # a very long comment for testing the parameter --nooutdent-long-comments (or -nolc) $name{$_} = $mac_ext[$idx{$mac_exti[$_]}]; $vmsfile =~ s/;[\d\-]*$//; # very long side comment; Clip off version number; we can use a newer version as well } %name; } @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 
'Oct', ## 'Dec', 'Nov' [a static block comment with indentation] 'Nov', 'Dec'); { # this side comment will not align my $IGNORE = 0; # This is a side comment # This is a hanging side comment # And so is this # A blank line interrupts the hsc's; this is a block comment } # side comments at different indentation levels should not normally be aligned { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } #end level 4 } # end level 3 } # end level 2 } # end level 1 #<<< do not let perltidy touch this unless -nfs is set my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>>> #<< test alternate format skipping string my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>> # some blank lines follow =pod Some pod before __END__ to delete with -dp =cut __END__ # text following __END__, not a comment =pod Some pod after __END__ to delete with -dp and trim with -trp =cut ---------- 'long_line' => <<'----------', # This single line should break into multiple lines, even with -l=0 # sub 'tight_paren_follows' should break the do block $body = SOAP::Data->name('~V:Fault')->attr( { 'xmlns' => $SOAP::Constants::NS_ENV } )->value( \SOAP::Data->set_value( SOAP::Data->name( faultcode => qualify( $self->namespace => shift(@parameters) ) ), SOAP::Data->name( faultstring => shift(@parameters) ), @parameters ? SOAP::Data->name( detail => do { my $detail = shift(@parameters); ref $detail ? \$detail : $detail } ) : (), @parameters ? 
SOAP::Data->name( faultactor => shift(@parameters) ) : (), ) ); ---------- 'pbp6' => <<'----------', # These formerly blinked with -pbp return $width1*$common_length*( $W*atan2(1,$W) + $H*atan2(1,$H) - $RTHSQPWSQ*atan2(1,$RTHSQPWSQ) + 0.25*log( ($WSQP1*$HSQP1)/(1+$WSQ+$HSQ) *($WSQ*(1+$WSQ+$HSQ)/($WSQP1*$HSQPWSQ))**$WSQ *($HSQ*(1+$WSQ+$HSQ)/($HSQP1*$HSQPWSQ))**$HSQ ) )/($W*$pi); my $oldSec = ( 60 * $session->{originalStartHour} + $session->{originalStartMin} ) * 60; ---------- 'rperl' => <<'----------', # Some test cases for RPerl, https://github.com/wbraswell/rperl/ # These must not remain as single lines with default formatting and long lines sub multiply_return_F { { my number $RETURN_TYPE }; ( my integer $multiplicand, my number $multiplier ) = @ARG; return $multiplicand * $multiplier; } sub empty_method { { my void::method $RETURN_TYPE }; return 2; } sub foo_subroutine_in_main { { my void $RETURN_TYPE }; print 'Howdy from foo_subroutine_in_main()...', "\n"; return; } ---------- 'rt132059' => <<'----------', # Test deleting comments and pod $1=2; sub f { # a side comment # a hanging side comment # a block comment } =pod bonjour! 
=cut $i++; ---------- 'signature' => <<'----------', # git22: Preserve function signature on a single line # This behavior is controlled by 'sub weld_signature_parens' sub foo($x, $y="abcd") { $x.$y; } # do not break after closing do brace sub foo($x, $y=do{{}}, $z=42, $w=do{"abcd"}) { $x.$y.$z; } # This signature should get put back on one line sub t022 ( $p = do { $z += 10; 222 }, $a = do { $z++; 333 } ) { "$p/$a" } # anonymous sub with signature my $subref = sub ( $cat, $id = do { state $auto_id = 0; $auto_id++ } ) { ...; }; # signature and prototype and attribute sub foo1 ( $x, $y ) : prototype ( $$ ) : shared { } sub foo11 ( $thing, % ) { print $thing } sub animal4 ( $cat, $ = ) { } # second argument is optional *share = sub ( \[$@%] ) { }; # extruded test sub foo2 ( $ first , $ , $ third ) { return "first=$first, third=$third" ; } # valid attributes sub fnord (&\%) : switch(10,foo(7,3)) : expensive; sub plugh () : Ugly('\(") : Bad; ---------- 'ternary4' => <<'----------', # some side comments *{"${callpkg}::$sym"} = $type eq '&' ? \&{"${pkg}::$sym"} # : $type eq '$' ? \${"${pkg}::$sym"} # : $type eq '@' ? \@{"${pkg}::$sym"} : $type eq '%' ? \%{"${pkg}::$sym"} # side comment : $type eq '*' ? 
*{"${pkg}::$sym"} # : do { require Carp; Carp::croak("Can't export symbol: $type$sym") }; ---------- 'wn7' => <<'----------', # do not weld paren to opening one-line non-paren container $Self->_Add($SortOrderDisplay{$Field->GenerateFieldForSelectSQL()}); # this will not get welded with -wn f( do { 1; !!(my $x = bless []); } ); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'align32.def' => { source => "align32", params => "def", expect => <<'#1...........', # align just the last two lines my $c_sub_khwnd = WindowFromId $k_hwnd, 0x8008; # FID_CLIENT ok $c_sub_khwnd, 'have kids client window'; ok IsWindow($c_sub_khwnd), 'IsWindow works on the client'; # parenless calls mkTextConfig $c, $x, $y, -anchor => 'se', $color; mkTextConfig $c, $x + 30, $y, -anchor => 's', $color; mkTextConfig $c, $x + 60, $y, -anchor => 'sw', $color; mkTextConfig $c, $x, $y + 30, -anchor => 'e', $color; permute_test [ 'a', 'b', 'c' ], '/', '/', [ 'a', 'b', 'c' ]; permute_test [ 'a,', 'b', 'c,' ], '/', '/', [ 'a,', 'b', 'c,' ]; permute_test [ 'a', ',', '#', 'c' ], '/', '/', [ 'a', ',', '#', 'c' ]; permute_test [ 'f_oo', 'b_ar' ], '/', '/', [ 'f_oo', 'b_ar' ]; # issue c093 - broken sub, but align fat commas use constant UNDEF_ONLY => sub { not defined $_[0] }; use constant EMPTY_OR_UNDEF => sub { !@_ or @_ == 1 && !defined $_[0]; }; #1........... }, 'bos.bos' => { source => "bos", params => "bos", expect => <<'#2...........', $top_label->set_text( gettext("check permissions.") ) ; #2........... }, 'bos.def' => { source => "bos", params => "def", expect => <<'#3...........', $top_label->set_text( gettext("check permissions.") ); #3........... 
}, 'comments.comments1' => { source => "comments", params => "comments1", expect => <<'#4...........', #!/usr/bin/perl -w # an initial hash bang line cannot be deleted with -dp #<<< format skipping of first code can cause an error message in perltidy v20210625 my $rvar = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]; #>>> sub length { return length( $_[0] ) } # side comment # hanging side comment # very longgggggggggggggggggggggggggggggggggggggggggggggggggggg hanging side comment # a blank will be inserted to prevent forming a hanging side comment sub macro_get_names { # # # %name = macro_get_names(); (key=macrohandle, value=macroname) # ##local(%name); # a static block comment without indentation local (%name) = (); ## a static side comment to test -ssc # a spaced block comment to test -isbc for ( 0 .. $#mac_ver ) { # a very long comment for testing the parameter --nooutdent-long-comments (or -nolc) $name{$_} = $mac_ext[ $idx{ $mac_exti[$_] } ]; $vmsfile =~ s/;[\d\-]*$//; # very long side comment; Clip off version number; we can use a newer version as well } %name; } @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', ## 'Dec', 'Nov' [a static block comment with indentation] 'Nov', 'Dec' ); { # this side comment will not align my $IGNORE = 0; # This is a side comment # This is a hanging side comment # And so is this # A blank line interrupts the hsc's; this is a block comment } # side comments at different indentation levels should not normally be aligned { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } #end level 4 } # end level 3 } # end level 2 } # end level 1 #<<< do not let perltidy touch this unless -nfs is set my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>>> #<< test alternate format skipping string my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1, ); #>> # some blank lines follow =pod Some pod before __END__ to delete with -dp =cut __END__ # text following __END__, not a comment =pod Some pod 
after __END__ to delete with -dp and trim with -trp =cut #4........... }, 'comments.comments2' => { source => "comments", params => "comments2", expect => <<'#5...........', #!/usr/bin/perl -w #<<< format skipping of first code can cause an error message in perltidy v20210625 my $rvar = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]; #>>> sub length { return length( $_[0] ) } # side comment # hanging side comment # very longgggggggggggggggggggggggggggggggggggggggggggggggggggg hanging side comment sub macro_get_names { # local (%name) = (); ## a static side comment to test -ssc for ( 0 .. $#mac_ver ) { $name{$_} = $mac_ext[ $idx{ $mac_exti[$_] } ]; $vmsfile =~ s/;[\d\-]*$// ; # very long side comment; Clip off version number; we can use a newer version as well } %name; } @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec' ); { # this side comment will not align my $IGNORE = 0; # This is a side comment # This is a hanging side comment # And so is this } { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } #end level 4 } # end level 3 } # end level 2 } # end level 1 #<<< do not let perltidy touch this unless -nfs is set my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>>> my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1, ); __END__ # text following __END__, not a comment #5........... 
}, 'comments.comments3' => { source => "comments", params => "comments3", expect => <<'#6...........', #!/usr/bin/perl -w # an initial hash bang line cannot be deleted with -dp #<<< format skipping of first code can cause an error message in perltidy v20210625 my $rvar = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]; #>>> sub length { return length( $_[0] ) } # side comment # hanging side comment # very longgggggggggggggggggggggggggggggggggggggggggggggggggggg hanging side comment # a blank will be inserted to prevent forming a hanging side comment sub macro_get_names { # # # %name = macro_get_names(); (key=macrohandle, value=macroname) # ##local(%name); # a static block comment without indentation local (%name) = (); ## a static side comment to test -ssc # a spaced block comment to test -isbc for ( 0 .. $#mac_ver ) { # a very long comment for testing the parameter --nooutdent-long-comments (or -nolc) $name{$_} = $mac_ext[ $idx{ $mac_exti[$_] } ]; $vmsfile =~ s/;[\d\-]*$// ; # very long side comment; Clip off version number; we can use a newer version as well } %name; } @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', ## 'Dec', 'Nov' [a static block comment with indentation] 'Nov', 'Dec' ); { # this side comment will not align my $IGNORE = 0; # This is a side comment # This is a hanging side comment # And so is this # A blank line interrupts the hsc's; this is a block comment } # side comments at different indentation levels should not normally be aligned { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } #end level 4 } # end level 3 } # end level 2 } # end level 1 #<<< do not let perltidy touch this unless -nfs is set my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1, ); #>>> #<< test alternate format skipping string my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1, ); #>> # some blank lines follow =pod Some pod before __END__ to delete with -dp =cut __END__ # text following __END__, not a comment =pod Some 
pod after __END__ to delete with -dp and trim with -trp =cut #6........... }, 'comments.comments4' => { source => "comments", params => "comments4", expect => <<'#7...........', #!/usr/bin/perl -w # an initial hash bang line cannot be deleted with -dp #<<< format skipping of first code can cause an error message in perltidy v20210625 my $rvar = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]; #>>> sub length { return length( $_[0] ) } # side comment # hanging side comment # very longgggggggggggggggggggggggggggggggggggggggggggggggggggg hanging side comment # a blank will be inserted to prevent forming a hanging side comment sub macro_get_names { # # # %name = macro_get_names(); (key=macrohandle, value=macroname) # ##local(%name); # a static block comment without indentation local (%name) = (); ## a static side comment to test -ssc # a spaced block comment to test -isbc for ( 0 .. $#mac_ver ) { # a very long comment for testing the parameter --nooutdent-long-comments (or -nolc) $name{$_} = $mac_ext[ $idx{ $mac_exti[$_] } ]; $vmsfile =~ s/;[\d\-]*$// ; # very long side comment; Clip off version number; we can use a newer version as well } %name; } @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', ## 'Dec', 'Nov' [a static block comment with indentation] 'Nov', 'Dec' ); { # this side comment will not align my $IGNORE = 0; # This is a side comment # This is a hanging side comment # And so is this # A blank line interrupts the hsc's; this is a block comment } # side comments at different indentation levels should not normally be aligned { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } #end level 4 } # end level 3 } # end level 2 } # end level 1 #<<< do not let perltidy touch this unless -nfs is set my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>>> #<< test alternate format skipping string my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>> # some blank lines follow =pod Some pod before __END__ to delete 
with -dp =cut __END__ # text following __END__, not a comment =pod Some pod after __END__ to delete with -dp and trim with -trp =cut #7........... }, 'comments.def' => { source => "comments", params => "def", expect => <<'#8...........', #!/usr/bin/perl -w # an initial hash bang line cannot be deleted with -dp #<<< format skipping of first code can cause an error message in perltidy v20210625 my $rvar = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]; #>>> sub length { return length( $_[0] ) } # side comment # hanging side comment # very longgggggggggggggggggggggggggggggggggggggggggggggggggggg hanging side comment # a blank will be inserted to prevent forming a hanging side comment sub macro_get_names { # # # %name = macro_get_names(); (key=macrohandle, value=macroname) # ##local(%name); # a static block comment without indentation local (%name) = (); ## a static side comment to test -ssc # a spaced block comment to test -isbc for ( 0 .. $#mac_ver ) { # a very long comment for testing the parameter --nooutdent-long-comments (or -nolc) $name{$_} = $mac_ext[ $idx{ $mac_exti[$_] } ]; $vmsfile =~ s/;[\d\-]*$// ; # very long side comment; Clip off version number; we can use a newer version as well } %name; } @month_of_year = ( 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', ## 'Dec', 'Nov' [a static block comment with indentation] 'Nov', 'Dec' ); { # this side comment will not align my $IGNORE = 0; # This is a side comment # This is a hanging side comment # And so is this # A blank line interrupts the hsc's; this is a block comment } # side comments at different indentation levels should not normally be aligned { { { { { ${msg} = "Hello World!"; print "My message: ${msg}\n"; } } #end level 4 } # end level 3 } # end level 2 } # end level 1 #<<< do not let perltidy touch this unless -nfs is set my @list = (1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1,); #>>> #<< test alternate format skipping string my @list = ( 1, 1, 1, 1, 2, 1, 1, 3, 3, 1, 1, 4, 6, 4, 1, ); #>> # some 
blank lines follow =pod Some pod before __END__ to delete with -dp =cut __END__ # text following __END__, not a comment =pod Some pod after __END__ to delete with -dp and trim with -trp =cut #8........... }, 'long_line.def' => { source => "long_line", params => "def", expect => <<'#9...........', # This single line should break into multiple lines, even with -l=0 # sub 'tight_paren_follows' should break the do block $body = SOAP::Data->name('~V:Fault')->attr( { 'xmlns' => $SOAP::Constants::NS_ENV } ) ->value( \SOAP::Data->set_value( SOAP::Data->name( faultcode => qualify( $self->namespace => shift(@parameters) ) ), SOAP::Data->name( faultstring => shift(@parameters) ), @parameters ? SOAP::Data->name( detail => do { my $detail = shift(@parameters); ref $detail ? \$detail : $detail; } ) : (), @parameters ? SOAP::Data->name( faultactor => shift(@parameters) ) : (), ) ); #9........... }, 'long_line.long_line' => { source => "long_line", params => "long_line", expect => <<'#10...........', # This single line should break into multiple lines, even with -l=0 # sub 'tight_paren_follows' should break the do block $body = SOAP::Data->name('~V:Fault')->attr( { 'xmlns' => $SOAP::Constants::NS_ENV } )->value( \SOAP::Data->set_value( SOAP::Data->name( faultcode => qualify( $self->namespace => shift(@parameters) ) ), SOAP::Data->name( faultstring => shift(@parameters) ), @parameters ? SOAP::Data->name( detail => do { my $detail = shift(@parameters); ref $detail ? \$detail : $detail } ) : (), @parameters ? SOAP::Data->name( faultactor => shift(@parameters) ) : (), ) ); #10........... 
}, 'pbp6.def' => { source => "pbp6", params => "def", expect => <<'#11...........', # These formerly blinked with -pbp return $width1 * $common_length * ( $W * atan2( 1, $W ) + $H * atan2( 1, $H ) - $RTHSQPWSQ * atan2( 1, $RTHSQPWSQ ) + 0.25 * log( ( $WSQP1 * $HSQP1 ) / ( 1 + $WSQ + $HSQ ) * ( $WSQ * ( 1 + $WSQ + $HSQ ) / ( $WSQP1 * $HSQPWSQ ) ) **$WSQ * ( $HSQ * ( 1 + $WSQ + $HSQ ) / ( $HSQP1 * $HSQPWSQ ) )**$HSQ ) ) / ( $W * $pi ); my $oldSec = ( 60 * $session->{originalStartHour} + $session->{originalStartMin} ) * 60; #11........... }, 'pbp6.pbp' => { source => "pbp6", params => "pbp", expect => <<'#12...........', # These formerly blinked with -pbp return $width1 * $common_length * ( $W * atan2( 1, $W ) + $H * atan2( 1, $H ) - $RTHSQPWSQ * atan2( 1, $RTHSQPWSQ ) + 0.25 * log( ( $WSQP1 * $HSQP1 ) / ( 1 + $WSQ + $HSQ ) * ( $WSQ * ( 1 + $WSQ + $HSQ ) / ( $WSQP1 * $HSQPWSQ ) ) **$WSQ * ( $HSQ * ( 1 + $WSQ + $HSQ ) / ( $HSQP1 * $HSQPWSQ ) ) **$HSQ ) ) / ( $W * $pi ); my $oldSec = ( 60 * $session->{originalStartHour} + $session->{originalStartMin} ) * 60; #12........... }, 'rperl.def' => { source => "rperl", params => "def", expect => <<'#13...........', # Some test cases for RPerl, https://github.com/wbraswell/rperl/ # These must not remain as single lines with default formatting and long lines sub multiply_return_F { { my number $RETURN_TYPE }; ( my integer $multiplicand, my number $multiplier ) = @ARG; return $multiplicand * $multiplier; } sub empty_method { { my void::method $RETURN_TYPE }; return 2; } sub foo_subroutine_in_main { { my void $RETURN_TYPE }; print 'Howdy from foo_subroutine_in_main()...', "\n"; return; } #13........... 
}, 'rperl.rperl' => { source => "rperl", params => "rperl", expect => <<'#14...........', # Some test cases for RPerl, https://github.com/wbraswell/rperl/ # These must not remain as single lines with default formatting and long lines sub multiply_return_F { { my number $RETURN_TYPE }; ( my integer $multiplicand, my number $multiplier ) = @ARG; return $multiplicand * $multiplier; } sub empty_method { { my void::method $RETURN_TYPE }; return 2; } sub foo_subroutine_in_main { { my void $RETURN_TYPE }; print 'Howdy from foo_subroutine_in_main()...', "\n"; return; } #14........... }, 'rt132059.def' => { source => "rt132059", params => "def", expect => <<'#15...........', # Test deleting comments and pod $1 = 2; sub f { # a side comment # a hanging side comment # a block comment } =pod bonjour! =cut $i++; #15........... }, 'rt132059.rt132059' => { source => "rt132059", params => "rt132059", expect => <<'#16...........', $1 = 2; sub f { } $i++; #16........... }, 'signature.def' => { source => "signature", params => "def", expect => <<'#17...........', # git22: Preserve function signature on a single line # This behavior is controlled by 'sub weld_signature_parens' sub foo ( $x, $y = "abcd" ) { $x . $y; } # do not break after closing do brace sub foo ( $x, $y = do { {} }, $z = 42, $w = do { "abcd" } ) { $x . $y . $z; } # This signature should get put back on one line sub t022 ( $p = do { $z += 10; 222 }, $a = do { $z++; 333 } ) { "$p/$a" } # anonymous sub with signature my $subref = sub ( $cat, $id = do { state $auto_id = 0; $auto_id++ } ) { ...; }; # signature and prototype and attribute sub foo1 ( $x, $y ) : prototype ( $$ ) : shared { } sub foo11 ( $thing, % ) { print $thing } sub animal4 ( $cat, $ = ) { } # second argument is optional *share = sub ( \[$@%] ) { }; # extruded test sub foo2 ( $first, $, $third ) { return "first=$first, third=$third"; } # valid attributes sub fnord (&\%) : switch(10,foo(7,3)) : expensive; sub plugh () : Ugly('\(") : Bad; #17........... 
}, 'ternary4.def' => { source => "ternary4", params => "def", expect => <<'#18...........', # some side comments *{"${callpkg}::$sym"} = $type eq '&' ? \&{"${pkg}::$sym"} # : $type eq '$' ? \${"${pkg}::$sym"} # : $type eq '@' ? \@{"${pkg}::$sym"} : $type eq '%' ? \%{"${pkg}::$sym"} # side comment : $type eq '*' ? *{"${pkg}::$sym"} # : do { require Carp; Carp::croak("Can't export symbol: $type$sym") }; #18........... }, 'wn7.def' => { source => "wn7", params => "def", expect => <<'#19...........', # do not weld paren to opening one-line non-paren container $Self->_Add( $SortOrderDisplay{ $Field->GenerateFieldForSelectSQL() } ); # this will not get welded with -wn f( do { 1; !!( my $x = bless [] ); } ); #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
      $rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',             # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n";
            }
        }
    }
}
Perl-Tidy-20230309/t/snippets21.t0000644000175000017500000005357014373177245015314 0ustar stevesteve
# Created with: ./make_t.pl

# Contents:
#1 lop.lop
#2 switch_plain.def
#3 switch_plain.switch_plain
#4 sot.def
#5 sot.sot
#6 prune.def
#7 align33.def
#8 gnu7.def
#9 gnu7.gnu
#10 git33.def
#11 git33.git33
#12 rt133130.def
#13 rt133130.rt133130
#14 nib.def
#15 nib.nib1
#16 nib.nib2
#17 scbb-csc.def
#18 scbb-csc.scbb-csc
#19 here_long.def

# To locate test #13 you can search for its name or the string '#13'

use strict;
use Test::More;
use Carp;
use Perl::Tidy;
my $rparams;
my $rsources;
my $rtests;

BEGIN {

    ###########################################
    # BEGIN SECTION 1: Parameter combinations #
    ###########################################
    $rparams = {
        'def'   => "",
        'git33' => <<'----------',
-wls='->' -wrs='->'
----------
        'gnu'  => "-gnu",
        'lop'  => "-nlop",
        'nib1' => "-nnib",
        'nib2' => <<'----------',
-nib -nibp='#\+\+'
----------
        'rt133130' => <<'----------',
# only the method should get a csc:
-csc -cscl=sub -sal=method
----------
        'scbb-csc'     => "-scbb -csc",
        'sot'          => "-sot -sct",
        'switch_plain' => "-nola",
    };

    ############################
    # BEGIN SECTION 2: Sources #
    ############################
    $rsources = {

        'align33' => <<'----------',
$wl = int( $wl * $f + .5 );
$wr = int( $wr * $f + .5 );
$pag = int( $pageh * $f + .5 );
$fe = $opt_F ? "t" : "f";
$cf = $opt_U ? "t" : "f";
$tp = $opt_t ? "t" : "f";
$rm = $numbstyle ? "t" : "f";
$pa = $showurl ? "t" : "f";
$nh = $seq_number ? "t" : "f";
----------

        'git33' => <<'----------',
# test -wls='->' -wrs='->'
use Net::Ping;
my ($ping) = Net::Ping->new();
$ping->ping($host);
----------

        'gnu7' => <<'----------',
# hanging side comments
if ( $seen == 1 ) { # We're the first word so far to have # this abbreviation. $hashref->{$abbrev} = $word; } elsif ( $seen == 2 ) { # We're the second word to have this # abbreviation, so we can't use it.
delete $hashref->{$abbrev}; } else { # We're the third word to have this # abbreviation, so skip to the next word. next WORD; } ---------- 'here_long' => <<'----------', # must not break after here target regardless of maximum-line-length $sth= $dbh->prepare (<<"END_OF_SELECT") or die "Couldn't prepare SQL" ; SELECT COUNT(duration),SUM(duration) FROM logins WHERE username='$user' END_OF_SELECT ---------- 'lop' => <<'----------', # logical padding examples $same = ( ( $aP eq $bP ) && ( $aS eq $bS ) && ( $aT eq $bT ) && ( $a->{'title'} eq $b->{'title'} ) && ( $a->{'href'} eq $b->{'href'} ) ); $bits = $top > 0xffff ? 32 : $top > 0xff ? 16 : $top > 1 ? 8 : 1; lc( $self->mime_attr('content-type') || $self->{MIH_DefaultType} || 'text/plain' ); # Padding can also remove spaces; here the space after the '(' is lost: elsif ( $statement_type =~ /^sub\b/ || $paren_type[$paren_depth] =~ /^sub\b/ ) ---------- 'nib' => <<'----------', { #<<< { #<<< { #++ print "hello world\n"; } } } { #++ { #++ { #<<< print "hello world\n"; } } } ---------- 'prune' => <<'----------', # some tests for 'sub prune_alignment_tree' $request->header( 'User-Agent' => $agent ) if $agent; $request->header( 'From' => $from ) if $from; $request->header( 'Range' => "bytes=0-$max_size" ) if $max_size; for ( [ 'CONSTANT', sub { join "foo", "bar" }, 0, "bar" ], [ 'CONSTANT', sub { join "foo", "bar", 3 }, 1, "barfoo3" ], [ '$var', sub { join $_, "bar" }, 0, "bar" ], [ '$myvar', sub { my $var; join $var, "bar" }, 0, "bar" ], ); [ [ [NewXSHdr], [ NewXSName, NewXSArgs ], "XSHdr" ], [ [NewXSCHdrs], [ NewXSName, NewXSArgs, GlobalNew ], "XSCHdrs" ], [ [DefSyms], [StructName], "MkDefSyms" ], [ [NewXSSymTab], [ DefSyms, NewXSArgs ], "AddArgsyms" ], [ [NewXSLocals], [NewXSSymTab], "Sym2Loc" ], [ [IsAffineFlag], [], sub { return "0" } ], ]; @degen_nums[ 1, 2, 4, 8 ] = ( 'a', 'c', 'g', 't' ); @degen_nums[ 5, 10, 9, 6, 3, 12 ] = ( 'r', 'y', 'w', 's', 'm', 'k' ); @degen_nums[ 14, 13, 11, 7, 15 ] = ( 'b', 'd', 'h', 'v', 'n' 
); $_CreateFile = ff( "k32", "CreateFile", [ P, N, N, N, N, N, N ], N ); $_CloseHandle = ff( "k32", "CloseHandle", [N], N ); $_GetCommState = ff( "k32", "GetCommState", [ N, P ], I ); $_SetCommState = ff( "k32", "SetCommState", [ N, P ], I ); $_SetupComm = ff( "k32", "SetupComm", [ N, N, N ], I ); $_PurgeComm = ff( "k32", "PurgeComm", [ N, N ], I ); $_CreateEvent = ff( "k32", "CreateEvent", [ P, I, I, P ], N ); is_deeply \@t, [ [3], [0], [1], [0], 3, [1], 3, [1], 2, [0], [1], [0], [1], [1], [1], 2, 3, [1], 2, [3], 4, [ 7, 8 ], 9, ["a"], "b", 3, 2, 5, 3, 2, 5, 3, [2], 5, 4, 5, [ 3, 2, 1 ], 1, 2, 3, [ -1, -2, -3 ], [ -1, -2, -3 ], [ -1, -2, -3 ], [ -1, -2 ], 3, [ -1, -2 ], 3, [ -1, -2, -3 ], [ !1 ], [ 8, 7, 6 ], [ 8, 7, 6 ], [4], !!0, ]; ---------- 'rt133130' => <<'----------', method sum_radlinks { my ( $global_radiation_matrix, $local_radiation_matrix, $rngg ) = @_; my ( $i, $j, $n1, $n2, $num ); my $rggij; $num = @$rngg; for ( $i = 0 ; $i < $num ; $i++ ) { $n1 = $rngg->[$i]; for ( $j = 0 ; $j < $num ; $j++ ) { $n2 = $rngg->[$j]; $rggij = $local_radiation_matrix->[$i][$j]; if ( $rggij && ( $n1 != $n2 ) ) { $global_radiation_matrix->[$n1][$n2] += $rggij; } } } } ---------- 'scbb-csc' => <<'----------', sub perlmod_install_advice { my(@mod) = @_; if ($auto_install_cpan) { require AutoInstall::Tk; my $r = AutoInstall::Tk::do_autoinstall_tk(@mod); if ($r > 0) { for my $mod (@mod) { warn "Re-require $mod...\n"; eval "require $mod"; die __LINE__ . ": $@" if $@; }} } else { my $shell = ($os eq 'win' ? M"Eingabeaufforderung" : M"Shell"); status_message ( Mfmt( ( @mod > 1 ? "Die fehlenden Perl-Module können aus der %s mit dem Kommando\n" : "Das fehlende Perl-Modul kann aus der %s mit dem Kommando\n" ), $shell ) . " perl -MCPAN -e \"install " . join(", ", @mod) . "\"\n" . 
"aus dem Internet geholt und installiert werden.\n", "err" ); } } ---------- 'sot' => <<'----------', $opt_c = Text::CSV_XS->new( { binary => 1, sep_char => $opt_c, always_quote => 1, } ); $c->Tk::bind( '' => sub { my ($c) = @_; my $e = $c->XEvent; itemsUnderArea $c; } ); __PACKAGE__->load_components( qw( PK::Auto Core ) ); ---------- 'switch_plain' => <<'----------', # run with -nola to keep default from outdenting use Switch::Plain; my $r = 'fail'; my $x = int rand 100_000; nswitch (1 + $x * 2) { case $x: {} default: { $r = 'ok'; } } my @words = qw(cinnamon ginger nutmeg cloves); my $test = 1; $r = $test ? do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'default case' } } } : 'not ok'; ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'lop.lop' => { source => "lop", params => "lop", expect => <<'#1...........', # logical padding examples $same = ( ( $aP eq $bP ) && ( $aS eq $bS ) && ( $aT eq $bT ) && ( $a->{'title'} eq $b->{'title'} ) && ( $a->{'href'} eq $b->{'href'} ) ); $bits = $top > 0xffff ? 32 : $top > 0xff ? 16 : $top > 1 ? 8 : 1; lc( $self->mime_attr('content-type') || $self->{MIH_DefaultType} || 'text/plain' ); # Padding can also remove spaces; here the space after the '(' is lost: elsif ( $statement_type =~ /^sub\b/ || $paren_type[$paren_depth] =~ /^sub\b/ ) #1........... }, 'switch_plain.def' => { source => "switch_plain", params => "def", expect => <<'#2...........', # run with -nola to keep default from outdenting use Switch::Plain; my $r = 'fail'; my $x = int rand 100_000; nswitch( 1 + $x * 2 ) { case $x: { } default: { $r = 'ok'; } } my @words = qw(cinnamon ginger nutmeg cloves); my $test = 1; $r = $test ? do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'default case' } } } : 'not ok'; #2........... 
}, 'switch_plain.switch_plain' => { source => "switch_plain", params => "switch_plain", expect => <<'#3...........', # run with -nola to keep default from outdenting use Switch::Plain; my $r = 'fail'; my $x = int rand 100_000; nswitch( 1 + $x * 2 ) { case $x: { } default: { $r = 'ok'; } } my @words = qw(cinnamon ginger nutmeg cloves); my $test = 1; $r = $test ? do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'default case' } } } : 'not ok'; #3........... }, 'sot.def' => { source => "sot", params => "def", expect => <<'#4...........', $opt_c = Text::CSV_XS->new( { binary => 1, sep_char => $opt_c, always_quote => 1, } ); $c->Tk::bind( '' => sub { my ($c) = @_; my $e = $c->XEvent; itemsUnderArea $c; } ); __PACKAGE__->load_components( qw( PK::Auto Core ) ); #4........... }, 'sot.sot' => { source => "sot", params => "sot", expect => <<'#5...........', $opt_c = Text::CSV_XS->new( { binary => 1, sep_char => $opt_c, always_quote => 1, } ); $c->Tk::bind( '' => sub { my ($c) = @_; my $e = $c->XEvent; itemsUnderArea $c; } ); __PACKAGE__->load_components( qw( PK::Auto Core ) ); #5........... 
}, 'prune.def' => { source => "prune", params => "def", expect => <<'#6...........', # some tests for 'sub prune_alignment_tree' $request->header( 'User-Agent' => $agent ) if $agent; $request->header( 'From' => $from ) if $from; $request->header( 'Range' => "bytes=0-$max_size" ) if $max_size; for ( [ 'CONSTANT', sub { join "foo", "bar" }, 0, "bar" ], [ 'CONSTANT', sub { join "foo", "bar", 3 }, 1, "barfoo3" ], [ '$var', sub { join $_, "bar" }, 0, "bar" ], [ '$myvar', sub { my $var; join $var, "bar" }, 0, "bar" ], ); [ [ [NewXSHdr], [ NewXSName, NewXSArgs ], "XSHdr" ], [ [NewXSCHdrs], [ NewXSName, NewXSArgs, GlobalNew ], "XSCHdrs" ], [ [DefSyms], [StructName], "MkDefSyms" ], [ [NewXSSymTab], [ DefSyms, NewXSArgs ], "AddArgsyms" ], [ [NewXSLocals], [NewXSSymTab], "Sym2Loc" ], [ [IsAffineFlag], [], sub { return "0" } ], ]; @degen_nums[ 1, 2, 4, 8 ] = ( 'a', 'c', 'g', 't' ); @degen_nums[ 5, 10, 9, 6, 3, 12 ] = ( 'r', 'y', 'w', 's', 'm', 'k' ); @degen_nums[ 14, 13, 11, 7, 15 ] = ( 'b', 'd', 'h', 'v', 'n' ); $_CreateFile = ff( "k32", "CreateFile", [ P, N, N, N, N, N, N ], N ); $_CloseHandle = ff( "k32", "CloseHandle", [N], N ); $_GetCommState = ff( "k32", "GetCommState", [ N, P ], I ); $_SetCommState = ff( "k32", "SetCommState", [ N, P ], I ); $_SetupComm = ff( "k32", "SetupComm", [ N, N, N ], I ); $_PurgeComm = ff( "k32", "PurgeComm", [ N, N ], I ); $_CreateEvent = ff( "k32", "CreateEvent", [ P, I, I, P ], N ); is_deeply \@t, [ [3], [0], [1], [0], 3, [1], 3, [1], 2, [0], [1], [0], [1], [1], [1], 2, 3, [1], 2, [3], 4, [ 7, 8 ], 9, ["a"], "b", 3, 2, 5, 3, 2, 5, 3, [2], 5, 4, 5, [ 3, 2, 1 ], 1, 2, 3, [ -1, -2, -3 ], [ -1, -2, -3 ], [ -1, -2, -3 ], [ -1, -2 ], 3, [ -1, -2 ], 3, [ -1, -2, -3 ], [ !1 ], [ 8, 7, 6 ], [ 8, 7, 6 ], [4], !!0, ]; #6........... }, 'align33.def' => { source => "align33", params => "def", expect => <<'#7...........', $wl = int( $wl * $f + .5 ); $wr = int( $wr * $f + .5 ); $pag = int( $pageh * $f + .5 ); $fe = $opt_F ? "t" : "f"; $cf = $opt_U ? 
"t" : "f"; $tp = $opt_t ? "t" : "f"; $rm = $numbstyle ? "t" : "f"; $pa = $showurl ? "t" : "f"; $nh = $seq_number ? "t" : "f"; #7........... }, 'gnu7.def' => { source => "gnu7", params => "def", expect => <<'#8...........', # hanging side comments if ( $seen == 1 ) { # We're the first word so far to have # this abbreviation. $hashref->{$abbrev} = $word; } elsif ( $seen == 2 ) { # We're the second word to have this # abbreviation, so we can't use it. delete $hashref->{$abbrev}; } else { # We're the third word to have this # abbreviation, so skip to the next word. next WORD; } #8........... }, 'gnu7.gnu' => { source => "gnu7", params => "gnu", expect => <<'#9...........', # hanging side comments if ($seen == 1) { # We're the first word so far to have # this abbreviation. $hashref->{$abbrev} = $word; } elsif ($seen == 2) { # We're the second word to have this # abbreviation, so we can't use it. delete $hashref->{$abbrev}; } else { # We're the third word to have this # abbreviation, so skip to the next word. next WORD; } #9........... }, 'git33.def' => { source => "git33", params => "def", expect => <<'#10...........', # test -wls='->' -wrs='->' use Net::Ping; my ($ping) = Net::Ping->new(); $ping->ping($host); #10........... }, 'git33.git33' => { source => "git33", params => "git33", expect => <<'#11...........', # test -wls='->' -wrs='->' use Net::Ping; my ($ping) = Net::Ping -> new(); $ping -> ping($host); #11........... }, 'rt133130.def' => { source => "rt133130", params => "def", expect => <<'#12...........', method sum_radlinks { my ( $global_radiation_matrix, $local_radiation_matrix, $rngg ) = @_; my ( $i, $j, $n1, $n2, $num ); my $rggij; $num = @$rngg; for ( $i = 0 ; $i < $num ; $i++ ) { $n1 = $rngg->[$i]; for ( $j = 0 ; $j < $num ; $j++ ) { $n2 = $rngg->[$j]; $rggij = $local_radiation_matrix->[$i][$j]; if ( $rggij && ( $n1 != $n2 ) ) { $global_radiation_matrix->[$n1][$n2] += $rggij; } } } } #12........... 
}, 'rt133130.rt133130' => { source => "rt133130", params => "rt133130", expect => <<'#13...........', method sum_radlinks { my ( $global_radiation_matrix, $local_radiation_matrix, $rngg ) = @_; my ( $i, $j, $n1, $n2, $num ); my $rggij; $num = @$rngg; for ( $i = 0 ; $i < $num ; $i++ ) { $n1 = $rngg->[$i]; for ( $j = 0 ; $j < $num ; $j++ ) { $n2 = $rngg->[$j]; $rggij = $local_radiation_matrix->[$i][$j]; if ( $rggij && ( $n1 != $n2 ) ) { $global_radiation_matrix->[$n1][$n2] += $rggij; } } } } ## end sub sum_radlinks #13........... }, 'nib.def' => { source => "nib", params => "def", expect => <<'#14...........', { #<<< { #<<< { #++ print "hello world\n"; } } } { #++ { #++ { #<<< print "hello world\n"; } } } #14........... }, 'nib.nib1' => { source => "nib", params => "nib1", expect => <<'#15...........', { #<<< { #<<< { #++ print "hello world\n"; } } } { #++ { #++ { #<<< print "hello world\n"; } } } #15........... }, 'nib.nib2' => { source => "nib", params => "nib2", expect => <<'#16...........', { #<<< { #<<< { #++ print "hello world\n"; } } } { #++ { #++ { #<<< print "hello world\n"; } } } #16........... }, 'scbb-csc.def' => { source => "scbb-csc", params => "def", expect => <<'#17...........', sub perlmod_install_advice { my (@mod) = @_; if ($auto_install_cpan) { require AutoInstall::Tk; my $r = AutoInstall::Tk::do_autoinstall_tk(@mod); if ( $r > 0 ) { for my $mod (@mod) { warn "Re-require $mod...\n"; eval "require $mod"; die __LINE__ . ": $@" if $@; } } } else { my $shell = ( $os eq 'win' ? M "Eingabeaufforderung" : M "Shell" ); status_message( Mfmt( ( @mod > 1 ? "Die fehlenden Perl-Module können aus der %s mit dem Kommando\n" : "Das fehlende Perl-Modul kann aus der %s mit dem Kommando\n" ), $shell ) . " perl -MCPAN -e \"install " . join( ", ", @mod ) . "\"\n" . "aus dem Internet geholt und installiert werden.\n", "err" ); } } #17........... 
}, 'scbb-csc.scbb-csc' => { source => "scbb-csc", params => "scbb-csc", expect => <<'#18...........', sub perlmod_install_advice { my (@mod) = @_; if ($auto_install_cpan) { require AutoInstall::Tk; my $r = AutoInstall::Tk::do_autoinstall_tk(@mod); if ( $r > 0 ) { for my $mod (@mod) { warn "Re-require $mod...\n"; eval "require $mod"; die __LINE__ . ": $@" if $@; } } ## end if ( $r > 0 ) } ## end if ($auto_install_cpan) else { my $shell = ( $os eq 'win' ? M "Eingabeaufforderung" : M "Shell" ); status_message( Mfmt( ( @mod > 1 ? "Die fehlenden Perl-Module können aus der %s mit dem Kommando\n" : "Das fehlende Perl-Modul kann aus der %s mit dem Kommando\n" ), $shell ) . " perl -MCPAN -e \"install " . join( ", ", @mod ) . "\"\n" . "aus dem Internet geholt und installiert werden.\n", "err" ); } ## end else [ if ($auto_install_cpan)] } ## end sub perlmod_install_advice #18........... }, 'here_long.def' => { source => "here_long", params => "def", expect => <<'#19...........', # must not break after here target regardless of maximum-line-length $sth = $dbh->prepare(<<"END_OF_SELECT") or die "Couldn't prepare SQL"; SELECT COUNT(duration),SUM(duration) FROM logins WHERE username='$user' END_OF_SELECT #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR
"#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets26.t0000644000175000017500000005511414373177245015315 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 bal.bal2 #2 bal.def #3 lpxl.lpxl6 #4 c133.c133 #5 c133.def #6 git93.def #7 git93.git93 #8 c139.def #9 drc.def #10 drc.drc #11 git105.def #12 git106.def #13 git106.git106 #14 c154.def #15 code_skipping.code_skipping #16 c158.def #17 git108.def #18 git108.git108 #19 wtc.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'bal2' => "-bal=2", 'c133' => "-boc", 'code_skipping' => <<'----------', # same as the default but tests -cs -csb and -cse --code-skipping --code-skipping-begin='#< "", 'drc' => "-drc", 'git106' => "-xlp -gnu -xci", 'git108' => "-wn -wfc", 'git93' => <<'----------', -vxl='q' ---------- 'lpxl6' => <<'----------', # equivalent to -lpxl='{ [ F(2' -lp -lpil='f(2' ---------- }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'bal' => <<'----------', { L1: L2: L3: return; }; ---------- 'c133' => <<'----------', # this will make 1 line unless -boc is used return ( $x * cos($a) - $y * sin($a), $x * sin($a) + $y * cos($a) ); # broken list - issue c133 return ( $x * cos($a) - $y * sin($a), $x * sin($a) + $y * cos($a) ); # no parens return $x * cos($a) - $y * sin($a), $x * sin($a) + $y * cos($a); ---------- 'c139' => <<'----------', # The '&' has trailing spaces @l = & _ ( -49, -71 ); # This '$' has trailing spaces my $ b = 40; # this arrow has trailing spaces $r = $c-> sql_set_env_attr( $evh, $SQL_ATTR_ODBC_VERSION, $SQL_OV_ODBC2, 0 ); # spaces and blank line @l = & _ ( -49, -71 ); # spaces and blank line $r = $c-> sql_set_env_attr( $evh, 
$SQL_ATTR_ODBC_VERSION, $SQL_OV_ODBC2, 0 ); ---------- 'c154' => <<'----------', {{{{ for ( $order = $start_order * $nbSubOrderByOrder + $start_suborder ; !exists $level_hash{$level}->{$order} and $order <= $stop_order * $nbSubOrderByOrder + $stop_suborder ; $order++ ) { } # has comma for ( $q = 201 ; print '-' x 79, "\n" ; $g = ( $f ^ ( $w = ( $z = $m . $e ) ^ substr $e, $q ) ^ ( $n = $b ^ $d | $a ^ $l ) ) & ( $w | $z ^ $f ^ $n ) & ( $l | $g ) ) { ...; } for ( $j = 0, $match_j = -1 ; $j < $sub_len && # changed from naive_string_matcher $sub->[$j] eq $big->[ $i + $j ] ; $j++ ) { ...; } }}}} ---------- 'c158' => <<'----------', my $meta = try { $package->meta } or die "$package does not have a ->meta method\n"; my ($curr) = current(); err(@_); ---------- 'code_skipping' => <<'----------', %Hdr=%U2E=%E2U=%Fallback=(); $in_charmap=$nerror=$nwarning=0; $.=0; #<>V my $self=shift; my $cloning=shift; ---------- 'drc' => <<'----------', ignoreSpec( $file, "file",, \%spec,,, \%Rspec ); ---------- 'git105' => <<'----------', use v5.36; use experimental 'for_list'; for my ( $k, $v ) ( 1, 2, 3, 4 ) { say "$k:$v"; } say 'end'; ---------- 'git106' => <<'----------', is( $module->VERSION, $expected, "$main_module->VERSION matches $module->VERSION ($expected)" ); ok( ( $@ eq "" && "@b" eq "1 4 5 9" ), 'redefinition should not take effect during the sort' ); &$f( ( map { $points->slice($_) } @sls1 ), ( map { $n->slice($_) } @sls1 ), ( map { $this->{Colors}->slice($_) } @sls1 ) ); AA( "0123456789012345678901234567890123456789", "0123456789012345678901234567890123456789" ); AAAAAA( "0123456789012345678901234567890123456789", "0123456789012345678901234567890123456789" ); # padded return !( $elem->isa('PPI::Statement::End') || $elem->isa('PPI::Statement::Data') ); for ( $s = $dbobj->seq( $k, $v, R_LAST ) ; $s == 0 ; $s = $dbobj->seq( $k, $v, R_PREV ) ) { print "$k: $v\n"; } # excess without -xci fresh_perl_is( '-C-', <<'abcdefghijklmnopq', {}, "ambiguous unary operator check doesn't 
crash" ); Warning: Use of "-C-" without parentheses is ambiguous at - line 1. abcdefghijklmnopq # excess with -xci { { { $self->privmsg( $to, "One moment please, I shall display the groups with agendas:" ); } } } ---------- 'git108' => <<'----------', elf->call_method( method_name_foo => { some_arg1 => $foo, some_other_arg3 => $bar->{'baz'}, } ); # leading dash my $species = new Bio::Species( -classification => [ qw( sapiens Homo Hominidae Catarrhini Primates Eutheria Mammalia Vertebrata Chordata Metazoa Eukaryota ) ] ); ---------- 'git93' => <<'----------', use Cwd qw[cwd]; use Carp qw(carp); use IPC::Cmd qw{can_run run QUOTE}; use File::Path qw/mkpath/; use File::Temp qw[tempdir]; use Params::Check qw; use Module::Load::Conditional qw#can_load#; use Locale::Maketext::Simple Style => 'gettext'; # does not align # do not align on these 'q' token types - not use statements... my $gene_color_sets = [ [ qw( blue blue blue blue ) => 'blue' ], [ qw( brown blue blue blue ) => 'brown' ], [ qw( brown brown green green ) => 'brown' ], ]; sub quux : PluginKeyword { 'quux' } sub qaax : PluginKeyword(qiix) { die "unimplemented" } use vars qw($curdir); no strict qw(vars); ---------- 'lpxl' => <<'----------', # simple function call my $loanlength = getLoanLength( $borrower->{'categorycode'}, # sc1 $iteminformation->{'itemtype'}, $borrower->{'branchcode'} # sc3 ); # function call, more than one level deep my $o = very::long::class::name->new( { propA => "a", propB => "b", propC => "c", } ); # function call with sublist debug( "Connecting to DB.", "Extra-Parameters: " . join("<->", $extra_parms), "Config: " . join("<->", %config) ); # simple function call with code block $m->command(-label => 'Save', -command => sub { print "DOS\n"; save_dialog($win); }); # function call, ternary in list return OptArgs2::Result->usage( $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . 
"\n" ); # not a function call %blastparam = ( -run => \%runparam, -file => '', -parse => 1, -signif => 1e-5, ); # 'local' is a keyword, not a user function local ( $len, $pts, @colspec, $char, $cols, $repeat, $celldata, $at_text, $after_text ); # square bracket with sublists $data = [ ListElem->new(id => 0, val => 100), ListElem->new(id => 2, val => 50), ListElem->new(id => 1, val => 10), ]; # curly brace with sublists $behaviour = { cat => {nap => "lap", eat => "meat"}, dog => {prowl => "growl", pool => "drool"}, mouse => {nibble => "kibble"}, }; ---------- 'wtc' => <<'----------', # both single and multiple line lists: @LoL = ( [ "fred", "barney", ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart", ], ); # single line ( $name, $body ) = ( $2, $3, ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow', ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green', }; # matches 'i' my @list = ( $xx, $yy ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle' ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1, ); }, )->pack( -side => 'left', ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string }, }, }, }; ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'bal.bal2' => { source => "bal", params => "bal2", expect => <<'#1...........', { L1: L2: L3: return; }; #1........... }, 'bal.def' => { source => "bal", params => "def", expect => <<'#2...........', { L1: L2: L3: return; }; #2........... 
}, 'lpxl.lpxl6' => { source => "lpxl", params => "lpxl6", expect => <<'#3...........', # simple function call my $loanlength = getLoanLength( $borrower->{'categorycode'}, # sc1 $iteminformation->{'itemtype'}, $borrower->{'branchcode'} # sc3 ); # function call, more than one level deep my $o = very::long::class::name->new( { propA => "a", propB => "b", propC => "c", } ); # function call with sublist debug( "Connecting to DB.", "Extra-Parameters: " . join( "<->", $extra_parms ), "Config: " . join( "<->", %config ) ); # simple function call with code block $m->command( -label => 'Save', -command => sub { print "DOS\n"; save_dialog($win); } ); # function call, ternary in list return OptArgs2::Result->usage( $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . "\n" ); # not a function call %blastparam = ( -run => \%runparam, -file => '', -parse => 1, -signif => 1e-5, ); # 'local' is a keyword, not a user function local ( $len, $pts, @colspec, $char, $cols, $repeat, $celldata, $at_text, $after_text ); # square bracket with sublists $data = [ ListElem->new( id => 0, val => 100 ), ListElem->new( id => 2, val => 50 ), ListElem->new( id => 1, val => 10 ), ]; # curly brace with sublists $behaviour = { cat => { nap => "lap", eat => "meat" }, dog => { prowl => "growl", pool => "drool" }, mouse => { nibble => "kibble" }, }; #3........... }, 'c133.c133' => { source => "c133", params => "c133", expect => <<'#4...........', # this will make 1 line unless -boc is used return ( $x * cos($a) - $y * sin($a), $x * sin($a) + $y * cos($a) ); # broken list - issue c133 return ( $x * cos($a) - $y * sin($a), $x * sin($a) + $y * cos($a) ); # no parens return $x * cos($a) - $y * sin($a), $x * sin($a) + $y * cos($a); #4........... 
}, 'c133.def' => { source => "c133", params => "def", expect => <<'#5...........', # this will make 1 line unless -boc is used return ( $x * cos($a) - $y * sin($a), $x * sin($a) + $y * cos($a) ); # broken list - issue c133 return ( $x * cos($a) - $y * sin($a), $x * sin($a) + $y * cos($a) ); # no parens return $x * cos($a) - $y * sin($a), $x * sin($a) + $y * cos($a); #5........... }, 'git93.def' => { source => "git93", params => "def", expect => <<'#6...........', use Cwd qw[cwd]; use Carp qw(carp); use IPC::Cmd qw{can_run run QUOTE}; use File::Path qw/mkpath/; use File::Temp qw[tempdir]; use Params::Check qw; use Module::Load::Conditional qw#can_load#; use Locale::Maketext::Simple Style => 'gettext'; # does not align # do not align on these 'q' token types - not use statements... my $gene_color_sets = [ [ qw( blue blue blue blue ) => 'blue' ], [ qw( brown blue blue blue ) => 'brown' ], [ qw( brown brown green green ) => 'brown' ], ]; sub quux : PluginKeyword { 'quux' } sub qaax : PluginKeyword(qiix) { die "unimplemented" } use vars qw($curdir); no strict qw(vars); #6........... }, 'git93.git93' => { source => "git93", params => "git93", expect => <<'#7...........', use Cwd qw[cwd]; use Carp qw(carp); use IPC::Cmd qw{can_run run QUOTE}; use File::Path qw/mkpath/; use File::Temp qw[tempdir]; use Params::Check qw; use Module::Load::Conditional qw#can_load#; use Locale::Maketext::Simple Style => 'gettext'; # does not align # do not align on these 'q' token types - not use statements... my $gene_color_sets = [ [ qw( blue blue blue blue ) => 'blue' ], [ qw( brown blue blue blue ) => 'brown' ], [ qw( brown brown green green ) => 'brown' ], ]; sub quux : PluginKeyword { 'quux' } sub qaax : PluginKeyword(qiix) { die "unimplemented" } use vars qw($curdir); no strict qw(vars); #7........... 
}, 'c139.def' => { source => "c139", params => "def", expect => <<'#8...........', # The '&' has trailing spaces @l = &_( -49, -71 ); # This '$' has trailing spaces my $b = 40; # this arrow has trailing spaces $r = $c->sql_set_env_attr( $evh, $SQL_ATTR_ODBC_VERSION, $SQL_OV_ODBC2, 0 ); # spaces and blank line @l = & _( -49, -71 ); # spaces and blank line $r = $c-> sql_set_env_attr( $evh, $SQL_ATTR_ODBC_VERSION, $SQL_OV_ODBC2, 0 ); #8........... }, 'drc.def' => { source => "drc", params => "def", expect => <<'#9...........', ignoreSpec( $file, "file",, \%spec,,, \%Rspec ); #9........... }, 'drc.drc' => { source => "drc", params => "drc", expect => <<'#10...........', ignoreSpec( $file, "file", \%spec, \%Rspec ); #10........... }, 'git105.def' => { source => "git105", params => "def", expect => <<'#11...........', use v5.36; use experimental 'for_list'; for my ( $k, $v ) ( 1, 2, 3, 4 ) { say "$k:$v"; } say 'end'; #11........... }, 'git106.def' => { source => "git106", params => "def", expect => <<'#12...........', is( $module->VERSION, $expected, "$main_module->VERSION matches $module->VERSION ($expected)" ); ok( ( $@ eq "" && "@b" eq "1 4 5 9" ), 'redefinition should not take effect during the sort' ); &$f( ( map { $points->slice($_) } @sls1 ), ( map { $n->slice($_) } @sls1 ), ( map { $this->{Colors}->slice($_) } @sls1 ) ); AA( "0123456789012345678901234567890123456789", "0123456789012345678901234567890123456789" ); AAAAAA( "0123456789012345678901234567890123456789", "0123456789012345678901234567890123456789" ); # padded return !( $elem->isa('PPI::Statement::End') || $elem->isa('PPI::Statement::Data') ); for ( $s = $dbobj->seq( $k, $v, R_LAST ) ; $s == 0 ; $s = $dbobj->seq( $k, $v, R_PREV ) ) { print "$k: $v\n"; } # excess without -xci fresh_perl_is( '-C-', <<'abcdefghijklmnopq', {}, "ambiguous unary operator check doesn't crash" ); Warning: Use of "-C-" without parentheses is ambiguous at - line 1. 
abcdefghijklmnopq # excess with -xci { { { $self->privmsg( $to, "One moment please, I shall display the groups with agendas:" ); } } } #12........... }, 'git106.git106' => { source => "git106", params => "git106", expect => <<'#13...........', is($module->VERSION, $expected, "$main_module->VERSION matches $module->VERSION ($expected)"); ok(($@ eq "" && "@b" eq "1 4 5 9"), 'redefinition should not take effect during the sort'); &$f((map { $points->slice($_) } @sls1), (map { $n->slice($_) } @sls1), (map { $this->{Colors}->slice($_) } @sls1)); AA("0123456789012345678901234567890123456789", "0123456789012345678901234567890123456789"); AAAAAA("0123456789012345678901234567890123456789", "0123456789012345678901234567890123456789"); # padded return !( $elem->isa('PPI::Statement::End') || $elem->isa('PPI::Statement::Data')); for ($s = $dbobj->seq($k, $v, R_LAST) ; $s == 0 ; $s = $dbobj->seq($k, $v, R_PREV)) { print "$k: $v\n"; } # excess without -xci fresh_perl_is('-C-', <<'abcdefghijklmnopq', {}, "ambiguous unary operator check doesn't crash"); Warning: Use of "-C-" without parentheses is ambiguous at - line 1. abcdefghijklmnopq # excess with -xci { { { $self->privmsg($to, "One moment please, I shall display the groups with agendas:" ); } } } #13........... }, 'c154.def' => { source => "c154", params => "def", expect => <<'#14...........', { { { { for ( $order = $start_order * $nbSubOrderByOrder + $start_suborder ; !exists $level_hash{$level}->{$order} and $order <= $stop_order * $nbSubOrderByOrder + $stop_suborder ; $order++ ) { } # has comma for ( $q = 201 ; print '-' x 79, "\n" ; $g = ( $f ^ ( $w = ( $z = $m . $e ) ^ substr $e, $q ) ^ ( $n = $b ^ $d | $a ^ $l ) ) & ( $w | $z ^ $f ^ $n ) & ( $l | $g ) ) { ...; } for ( $j = 0, $match_j = -1 ; $j < $sub_len && # changed from naive_string_matcher $sub->[$j] eq $big->[ $i + $j ] ; $j++ ) { ...; } } } } } #14........... 
}, 'code_skipping.code_skipping' => { source => "code_skipping", params => "code_skipping", expect => <<'#15...........', %Hdr = %U2E = %E2U = %Fallback = (); $in_charmap = $nerror = $nwarning = 0; $. = 0; #<>V my $self = shift; my $cloning = shift; #15........... }, 'c158.def' => { source => "c158", params => "def", expect => <<'#16...........', my $meta = try { $package->meta } or die "$package does not have a ->meta method\n"; my ($curr) = current(); err(@_); #16........... }, 'git108.def' => { source => "git108", params => "def", expect => <<'#17...........', elf->call_method( method_name_foo => { some_arg1 => $foo, some_other_arg3 => $bar->{'baz'}, } ); # leading dash my $species = new Bio::Species( -classification => [ qw( sapiens Homo Hominidae Catarrhini Primates Eutheria Mammalia Vertebrata Chordata Metazoa Eukaryota ) ] ); #17........... }, 'git108.git108' => { source => "git108", params => "git108", expect => <<'#18...........', elf->call_method( method_name_foo => { some_arg1 => $foo, some_other_arg3 => $bar->{'baz'}, } ); # leading dash my $species = new Bio::Species( -classification => [ qw( sapiens Homo Hominidae Catarrhini Primates Eutheria Mammalia Vertebrata Chordata Metazoa Eukaryota ) ] ); #18........... 
}, 'wtc.def' => { source => "wtc", params => "def", expect => <<'#19...........', # both single and multiple line lists: @LoL = ( [ "fred", "barney", ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart", ], ); # single line ( $name, $body ) = ( $2, $3, ); # multiline, but not bare $text = $main->Scrolled( TextUndo, $yyy, $zzz, $wwwww, selectbackgroundxxxxx => 'yellow', ); # this will pass for 'h' my $new = { %$item, text => $leaf, color => 'green', }; # matches 'i' my @list = ( $xx, $yy ); # does not match 'h' $c1->create( 'rectangle', 40, 60, 80, 80, -fill => 'red', -tags => 'rectangle' ); $dasm_frame->Button( -text => 'Locate', -command => sub { $target_binary = $fs->Show( -popover => 'cursor', -create => 1, ); }, )->pack( -side => 'left', ); my $no_index_1_1 = { 'map' => { ':key' => { name => \&string, list => { value => \&string }, }, }, }; #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR
"#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/test.t0000644000175000017500000000011307432222315014227 0ustar stevesteveuse strict; use Test; BEGIN { plan tests => 1 } use Perl::Tidy; ok(1); Perl-Tidy-20230309/t/snippets8.t0000644000175000017500000002763614373177246015246 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 rt123749.rt123749 #2 rt123774.def #3 rt124114.def #4 rt124354.def #5 rt124354.rt124354 #6 rt125012.def #7 rt125012.rt125012 #8 rt125506.def #9 rt125506.rt125506 #10 rt126965.def #11 rt15735.def #12 rt18318.def #13 rt18318.rt18318 #14 rt27000.def #15 rt31741.def #16 rt49289.def #17 rt50702.def #18 rt50702.rt50702 #19 rt68870.def #20 rt70747.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'rt123749' => "-wn", 'rt124354' => "-io", 'rt125012' => <<'----------', -mangle -dac ---------- 'rt125506' => "-io", 'rt18318' => <<'----------', -nwrs='A' ---------- 'rt50702' => <<'----------', -wbb='=' ---------- }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'rt123749' => <<'----------', get('http://mojolicious.org')->then( sub { my $mojo = shift; say $mojo->res->code; return get('http://metacpan.org'); } )->then( sub { my $cpan = shift; say $cpan->res->code; } )->catch( sub { my $err = shift; warn "Something went wrong: $err"; } )->wait; ---------- 'rt123774' => <<'----------', # retain any space between backslash and quote to avoid fooling html formatters my $var1 = \ "bubba"; my $var2 = \"bubba"; my $var3 = \ 'bubba'; my $var4 = \'bubba'; my $var5 = \ "bubba"; ---------- 'rt124114' => <<'----------', #!/usr/bin/perl my %h = { a => 2 > 3 ? 
1 : 0, bbbb => sub { my $y = "1" }, c => sub { my $z = "2" }, d => 2 > 3 ? 1 : 0, }; ---------- 'rt124354' => <<'----------', package Foo; use Moose; has a => ( is => 'ro', isa => 'Int' ); has b => ( is => 'ro', isa => 'Int' ); has c => ( is => 'ro', isa => 'Int' ); __PACKAGE__->meta->make_immutable; ---------- 'rt125012' => <<'----------', ++$_ for #one space before eol: values %_; system #one space before eol: qq{}; ---------- 'rt125506' => <<'----------', my $t = ' un deux trois '; ---------- 'rt126965' => <<'----------', my $restrict_customer = shift ? 1 : 0; ---------- 'rt15735' => <<'----------', my $user_prefs = $ref_type eq 'SCALAR' ? _load_from_string( $profile ) : $ref_type eq 'ARRAY' ? _load_from_array( $profile ) : $ref_type eq 'HASH' ? _load_from_hash( $profile ) : _load_from_file( $profile ); ---------- 'rt18318' => <<'----------', # Class::Std attribute list # The token type of the first colon is 'A' so use -nwrs='A' to avoid space # after it my %rank_of : ATTR( :init_arg :get :set ); ---------- 'rt27000' => <<'----------', print add( 3, 4 ), "\n"; print add( 4, 3 ), "\n"; sub add { my ( $term1, $term2 ) = @_; # line 1234 die "$term1 > $term2" if $term1 > $term2; return $term1 + $term2; } ---------- 'rt31741' => <<'----------', $msg //= 'World'; ---------- 'rt49289' => <<'----------', use constant qw{ DEBUG 0 }; ---------- 'rt50702' => <<'----------', if (1) { my $uid = $ENV{ 'ORIG_LOGNAME' } || $ENV{ 'LOGNAME' } || $ENV{ 'REMOTE_USER' } || 'foobar'; } if (2) { my $uid = ($ENV{ 'ORIG_LOGNAME' } || $ENV{ 'LOGNAME' } || $ENV{ 'REMOTE_USER' } || 'foobar'); } ---------- 'rt68870' => <<'----------', s///r; ---------- 'rt70747' => <<'----------', coerce Q2RawStatGroupArray, from ArrayRef [Q2StatGroup], via { [ map { my $g = $_->as_hash; $g->{stats} = [ map { scalar $_->as_array } @{ $g->{stats} } ]; $g; } @$_; ] }; ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 
'rt123749.rt123749' => { source => "rt123749", params => "rt123749", expect => <<'#1...........', get('http://mojolicious.org')->then( sub { my $mojo = shift; say $mojo->res->code; return get('http://metacpan.org'); } )->then( sub { my $cpan = shift; say $cpan->res->code; } )->catch( sub { my $err = shift; warn "Something went wrong: $err"; } )->wait; #1........... }, 'rt123774.def' => { source => "rt123774", params => "def", expect => <<'#2...........', # retain any space between backslash and quote to avoid fooling html formatters my $var1 = \ "bubba"; my $var2 = \"bubba"; my $var3 = \ 'bubba'; my $var4 = \'bubba'; my $var5 = \ "bubba"; #2........... }, 'rt124114.def' => { source => "rt124114", params => "def", expect => <<'#3...........', #!/usr/bin/perl my %h = { a => 2 > 3 ? 1 : 0, bbbb => sub { my $y = "1" }, c => sub { my $z = "2" }, d => 2 > 3 ? 1 : 0, }; #3........... }, 'rt124354.def' => { source => "rt124354", params => "def", expect => <<'#4...........', package Foo; use Moose; has a => ( is => 'ro', isa => 'Int' ); has b => ( is => 'ro', isa => 'Int' ); has c => ( is => 'ro', isa => 'Int' ); __PACKAGE__->meta->make_immutable; #4........... }, 'rt124354.rt124354' => { source => "rt124354", params => "rt124354", expect => <<'#5...........', package Foo; use Moose; has a => ( is => 'ro', isa => 'Int' ); has b => ( is => 'ro', isa => 'Int' ); has c => ( is => 'ro', isa => 'Int' ); __PACKAGE__->meta->make_immutable; #5........... }, 'rt125012.def' => { source => "rt125012", params => "def", expect => <<'#6...........', ++$_ for #one space before eol: values %_; system #one space before eol: qq{}; #6........... }, 'rt125012.rt125012' => { source => "rt125012", params => "rt125012", expect => <<'#7...........', ++$_ for values%_; system qq{}; #7........... }, 'rt125506.def' => { source => "rt125506", params => "def", expect => <<'#8...........', my $t = ' un deux trois '; #8........... 
}, 'rt125506.rt125506' => { source => "rt125506", params => "rt125506", expect => <<'#9...........', my $t = ' un deux trois '; #9........... }, 'rt126965.def' => { source => "rt126965", params => "def", expect => <<'#10...........', my $restrict_customer = shift ? 1 : 0; #10........... }, 'rt15735.def' => { source => "rt15735", params => "def", expect => <<'#11...........', my $user_prefs = $ref_type eq 'SCALAR' ? _load_from_string($profile) : $ref_type eq 'ARRAY' ? _load_from_array($profile) : $ref_type eq 'HASH' ? _load_from_hash($profile) : _load_from_file($profile); #11........... }, 'rt18318.def' => { source => "rt18318", params => "def", expect => <<'#12...........', # Class::Std attribute list # The token type of the first colon is 'A' so use -nwrs='A' to avoid space # after it my %rank_of : ATTR( :init_arg :get :set ); #12........... }, 'rt18318.rt18318' => { source => "rt18318", params => "rt18318", expect => <<'#13...........', # Class::Std attribute list # The token type of the first colon is 'A' so use -nwrs='A' to avoid space # after it my %rank_of :ATTR( :init_arg :get :set ); #13........... }, 'rt27000.def' => { source => "rt27000", params => "def", expect => <<'#14...........', print add( 3, 4 ), "\n"; print add( 4, 3 ), "\n"; sub add { my ( $term1, $term2 ) = @_; # line 1234 die "$term1 > $term2" if $term1 > $term2; return $term1 + $term2; } #14........... }, 'rt31741.def' => { source => "rt31741", params => "def", expect => <<'#15...........', $msg //= 'World'; #15........... }, 'rt49289.def' => { source => "rt49289", params => "def", expect => <<'#16...........', use constant qw{ DEBUG 0 }; #16........... }, 'rt50702.def' => { source => "rt50702", params => "def", expect => <<'#17...........', if (1) { my $uid = $ENV{'ORIG_LOGNAME'} || $ENV{'LOGNAME'} || $ENV{'REMOTE_USER'} || 'foobar'; } if (2) { my $uid = ( $ENV{'ORIG_LOGNAME'} || $ENV{'LOGNAME'} || $ENV{'REMOTE_USER'} || 'foobar' ); } #17........... 
}, 'rt50702.rt50702' => { source => "rt50702", params => "rt50702", expect => <<'#18...........', if (1) { my $uid = $ENV{'ORIG_LOGNAME'} || $ENV{'LOGNAME'} || $ENV{'REMOTE_USER'} || 'foobar'; } if (2) { my $uid = ( $ENV{'ORIG_LOGNAME'} || $ENV{'LOGNAME'} || $ENV{'REMOTE_USER'} || 'foobar' ); } #18........... }, 'rt68870.def' => { source => "rt68870", params => "def", expect => <<'#19...........', s///r; #19........... }, 'rt70747.def' => { source => "rt70747", params => "def", expect => <<'#20...........', coerce Q2RawStatGroupArray, from ArrayRef [Q2StatGroup], via { [ map { my $g = $_->as_hash; $g->{stats} = [ map { scalar $_->as_array } @{ $g->{stats} } ]; $g; } @$_; ] }; #20........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets5.t0000644000175000017500000007357314373177246015244 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 list1.def #2 listop1.def #3 listop2.def #4 lp1.def #5 lp1.lp #6 mangle1.def #7 mangle1.mangle #8 mangle2.def #9 mangle2.mangle #10 mangle3.def #11 mangle3.mangle #12 math1.def #13 math2.def #14 math3.def #15 math4.def #16 nasc.def #17 nasc.nasc #18 nothing.def #19 nothing.nothing #20 otr1.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'lp' => "-lp", 'mangle' => "--mangle", 'nasc' => "-nasc", 'nothing' => "", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'list1' => <<'----------', %height=("letter",27.9, "legal",35.6, "arche",121.9, "archd",91.4, "archc",61, "archb",45.7, "archa",30.5, "flsa",33, "flse",33, "halfletter",21.6, "11x17",43.2, "ledger",27.9); %width=("letter",21.6, "legal",21.6, "arche",91.4, "archd",61, "archc",45.7, "archb",30.5, "archa",22.9, "flsa",21.6, "flse",21.6, "halfletter",14, "11x17",27.9, "ledger",43.2); ---------- 'listop1' => <<'----------', my @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, rand ] } @list; ---------- 'listop2' => <<'----------', my @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, rand ] } @list; ---------- 'lp1' => <<'----------', # a good test problem for -lp; thanks to Ian Stuart push @contents, $c->table( { -border => '1' }, $c->Tr( { -valign => 'top' }, $c->td( " Author ", $c->textfield( -tabindex => "1", -name => "author", -default => "$author", -size => '20' ) ), $c->td( $c->strong(" Publication Date "), $c->textfield( -tabindex => "2", 
-name => "pub_date", -default => "$pub_date", -size => '20' ), ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->strong("Title"), $c->textfield( -tabindex => "3", -name => "title", -default => "$title", -override => '1', -size => '40' ), ) ), $c->Tr( { -valign => 'top' }, $c->td( $c->table( $c->Tr( $c->td( { -valign => 'top' }, $c->strong(" Document Type ") ), $c->td( { -valign => 'top' }, $c->scrolling_list( -tabindex => "4", -name => "doc_type", -values => [@docCodeValues], -labels => \%docCodeLabels, -default => "$doc_type" ) ) ) ) ), $c->td( $c->table( $c->Tr( $c->td( { -valign => 'top' }, $c->strong( " Relevant Discipline ", $c->br(), "Area " ) ), $c->td( { -valign => 'top' }, $c->scrolling_list( -tabindex => "5", -name => "discipline", -values => [@discipValues], -labels => \%discipLabels, -default => "$discipline" ), ) ) ) ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->table( $c->Tr( $c->td( { -valign => 'top' }, $c->strong(" Relevant Subject Area "), $c->br(), "You may select multiple areas", ), $c->td( { -valign => 'top' }, $c->checkbox_group( -tabindex => "6", -name => "subject", -values => [@subjValues], -labels => \%subjLabels, -defaults => [@subject], -rows => "2" ) ) ) ) ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->strong("Location
"), $c->small("(ie, where to find it)"), $c->textfield( -tabindex => "7", -name => "location", -default => "$location", -size => '40' ) ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->table( $c->Tr( $c->td( { -valign => 'top' }, "Description", $c->br(), $c->small("Maximum 750 letters.") ), $c->td( { -valign => 'top' }, $c->textarea( -tabindex => "8", -name => "description", -default => "$description", -wrap => "soft", -rows => '10', -columns => '60' ) ) ) ) ) ), ); ---------- 'mangle1' => <<'----------', # The space after the '?' is essential and must not be deleted print $::opt_m ? " Files: ".my_wrap(""," ",$v) : $v; ---------- 'mangle2' => <<'----------', # hanging side comments - do not remove leading space with -mangle if ( $size1 == 0 || $size2 == 0 ) { # special handling for zero-length if ( $size2 + $size1 == 0 ) { # files. exit 0; } else { # Can't we say 'differ at byte zero' # and so on here? That might make # more sense than this behavior. # Also, this should be made consistent # with the behavior when skip >= # filesize. if ($volume) { warn "$0: EOF on $file1\n" unless $size1; warn "$0: EOF on $file2\n" unless $size2; } exit 1; } } ---------- 'mangle3' => <<'----------', # run with --mangle # Troublesome punctuation variables: $$ and $# # don't delete ws between '$$' and 'if' kill 'ABRT', $$ if $panic++; # Do not remove the space between '$#' and 'eq' $, = "Hello, World!\n"; $#=$,; print "$# "; $# eq $,? print "yes\n" : print "no\n"; # The space after the '?' is essential and must not be deleted print $::opt_m ? " Files: ".my_wrap(""," ",$v) : $v; # must not remove space before 'CAKE' use constant CAKE => atan2(1,1)/2; if ($arc >= - CAKE && $arc <= CAKE) { } # do not remove the space after 'JUNK': print JUNK ("<","&",">")[rand(3)];# make these a bit more likely ---------- 'math1' => <<'----------', my $xyz_shield = [ [ -0.060, -0.060, 0. ], [ 0.060, -0.060, 0. ], [ 0.060, 0.060, 0. ], [ -0.060, 0.060, 0. 
], [ -0.0925, -0.0925, 0.092 ], [ 0.0925, -0.0925, 0.092 ], [ 0.0925, 0.0925, 0.092 ], [ -0.0925, 0.0925, 0.092 ], ]; ---------- 'math2' => <<'----------', $ans = pdl( [0, 0, 0, 0, 0], [0, 0, 2, 0, 0], [0, 1, 5, 2, 0], [0, 0, 4, 0, 0], [0, 0, 0, 0, 0] ); ---------- 'math3' => <<'----------', my ( $x, $y ) = ( $x0 + $index_x * $xgridwidth * $xm + ( $map_x * $xm * $xgridwidth ) / $detailwidth, $y0 - $index_y * $ygridwidth * $ym - ( $map_y * $ym * $ygridwidth ) / $detailheight,); ---------- 'math4' => <<'----------', my$u=($range*$pratio**(1./3.))/$wratio; my$factor=exp(-(18/$u)**4); my$ovp=(1-$factor)*(70-0.655515*$u)+(1000/($u**1.3)+10000/($u**3.3))*$factor; my$impulse=(1-$factor)*(170-$u)+(350/$u**0.65+500/$u**5)*$factor; $ovp=$ovp*$pratio; $impulse=$impulse*$wratio*$pratio**(2/3); ---------- 'nasc' => <<'----------', # will break and add semicolon unless -nasc is given eval { $terminal = Tgetent Term::Cap { TERM => undef, OSPEED => $ospeed } }; ---------- 'nothing' => <<'----------', ---------- 'otr1' => <<'----------', return $pdl->slice( join ',', ( map { $_ eq "X" ? ":" : ref $_ eq "ARRAY" ? join ':', @$_ : !ref $_ ? $_ : die "INVALID SLICE DEF $_" } @_ ) ); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'list1.def' => { source => "list1", params => "def", expect => <<'#1...........', %height = ( "letter", 27.9, "legal", 35.6, "arche", 121.9, "archd", 91.4, "archc", 61, "archb", 45.7, "archa", 30.5, "flsa", 33, "flse", 33, "halfletter", 21.6, "11x17", 43.2, "ledger", 27.9 ); %width = ( "letter", 21.6, "legal", 21.6, "arche", 91.4, "archd", 61, "archc", 45.7, "archb", 30.5, "archa", 22.9, "flsa", 21.6, "flse", 21.6, "halfletter", 14, "11x17", 27.9, "ledger", 43.2 ); #1........... }, 'listop1.def' => { source => "listop1", params => "def", expect => <<'#2...........', my @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, rand ] } @list; #2........... 
}, 'listop2.def' => { source => "listop2", params => "def", expect => <<'#3...........', my @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, rand ] } @list; #3........... }, 'lp1.def' => { source => "lp1", params => "def", expect => <<'#4...........', # a good test problem for -lp; thanks to Ian Stuart push @contents, $c->table( { -border => '1' }, $c->Tr( { -valign => 'top' }, $c->td( " Author ", $c->textfield( -tabindex => "1", -name => "author", -default => "$author", -size => '20' ) ), $c->td( $c->strong(" Publication Date "), $c->textfield( -tabindex => "2", -name => "pub_date", -default => "$pub_date", -size => '20' ), ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->strong("Title"), $c->textfield( -tabindex => "3", -name => "title", -default => "$title", -override => '1', -size => '40' ), ) ), $c->Tr( { -valign => 'top' }, $c->td( $c->table( $c->Tr( $c->td( { -valign => 'top' }, $c->strong(" Document Type ") ), $c->td( { -valign => 'top' }, $c->scrolling_list( -tabindex => "4", -name => "doc_type", -values => [@docCodeValues], -labels => \%docCodeLabels, -default => "$doc_type" ) ) ) ) ), $c->td( $c->table( $c->Tr( $c->td( { -valign => 'top' }, $c->strong( " Relevant Discipline ", $c->br(), "Area " ) ), $c->td( { -valign => 'top' }, $c->scrolling_list( -tabindex => "5", -name => "discipline", -values => [@discipValues], -labels => \%discipLabels, -default => "$discipline" ), ) ) ) ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->table( $c->Tr( $c->td( { -valign => 'top' }, $c->strong(" Relevant Subject Area "), $c->br(), "You may select multiple areas", ), $c->td( { -valign => 'top' }, $c->checkbox_group( -tabindex => "6", -name => "subject", -values => [@subjValues], -labels => \%subjLabels, -defaults => [@subject], -rows => "2" ) ) ) ) ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->strong("Location
"), $c->small("(ie, where to find it)"), $c->textfield( -tabindex => "7", -name => "location", -default => "$location", -size => '40' ) ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->table( $c->Tr( $c->td( { -valign => 'top' }, "Description", $c->br(), $c->small("Maximum 750 letters.") ), $c->td( { -valign => 'top' }, $c->textarea( -tabindex => "8", -name => "description", -default => "$description", -wrap => "soft", -rows => '10', -columns => '60' ) ) ) ) ) ), ); #4........... }, 'lp1.lp' => { source => "lp1", params => "lp", expect => <<'#5...........', # a good test problem for -lp; thanks to Ian Stuart push @contents, $c->table( { -border => '1' }, $c->Tr( { -valign => 'top' }, $c->td( " Author ", $c->textfield( -tabindex => "1", -name => "author", -default => "$author", -size => '20' ) ), $c->td( $c->strong(" Publication Date "), $c->textfield( -tabindex => "2", -name => "pub_date", -default => "$pub_date", -size => '20' ), ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->strong("Title"), $c->textfield( -tabindex => "3", -name => "title", -default => "$title", -override => '1', -size => '40' ), ) ), $c->Tr( { -valign => 'top' }, $c->td( $c->table( $c->Tr( $c->td( { -valign => 'top' }, $c->strong(" Document Type ") ), $c->td( { -valign => 'top' }, $c->scrolling_list( -tabindex => "4", -name => "doc_type", -values => [@docCodeValues], -labels => \%docCodeLabels, -default => "$doc_type" ) ) ) ) ), $c->td( $c->table( $c->Tr( $c->td( { -valign => 'top' }, $c->strong( " Relevant Discipline ", $c->br(), "Area " ) ), $c->td( { -valign => 'top' }, $c->scrolling_list( -tabindex => "5", -name => "discipline", -values => [@discipValues], -labels => \%discipLabels, -default => "$discipline" ), ) ) ) ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->table( $c->Tr( $c->td( { -valign => 'top' }, $c->strong(" Relevant Subject Area "), $c->br(), "You may select multiple areas", ), $c->td( { -valign => 'top' }, 
$c->checkbox_group( -tabindex => "6", -name => "subject", -values => [@subjValues], -labels => \%subjLabels, -defaults => [@subject], -rows => "2" ) ) ) ) ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->strong("Location
"), $c->small("(ie, where to find it)"), $c->textfield( -tabindex => "7", -name => "location", -default => "$location", -size => '40' ) ) ), $c->Tr( { -valign => 'top' }, $c->td( { -colspan => '2' }, $c->table( $c->Tr( $c->td( { -valign => 'top' }, "Description", $c->br(), $c->small("Maximum 750 letters.") ), $c->td( { -valign => 'top' }, $c->textarea( -tabindex => "8", -name => "description", -default => "$description", -wrap => "soft", -rows => '10', -columns => '60' ) ) ) ) ) ), ); #5........... }, 'mangle1.def' => { source => "mangle1", params => "def", expect => <<'#6...........', # The space after the '?' is essential and must not be deleted print $::opt_m ? " Files: " . my_wrap( "", " ", $v ) : $v; #6........... }, 'mangle1.mangle' => { source => "mangle1", params => "mangle", expect => <<'#7...........', # The space after the '?' is essential and must not be deleted print$::opt_m ? " Files: ".my_wrap(""," ",$v):$v; #7........... }, 'mangle2.def' => { source => "mangle2", params => "def", expect => <<'#8...........', # hanging side comments - do not remove leading space with -mangle if ( $size1 == 0 || $size2 == 0 ) { # special handling for zero-length if ( $size2 + $size1 == 0 ) { # files. exit 0; } else { # Can't we say 'differ at byte zero' # and so on here? That might make # more sense than this behavior. # Also, this should be made consistent # with the behavior when skip >= # filesize. if ($volume) { warn "$0: EOF on $file1\n" unless $size1; warn "$0: EOF on $file2\n" unless $size2; } exit 1; } } #8........... }, 'mangle2.mangle' => { source => "mangle2", params => "mangle", expect => <<'#9...........', # hanging side comments - do not remove leading space with -mangle if($size1==0||$size2==0){# special handling for zero-length if($size2+$size1==0){# files. exit 0;}else{# Can't we say 'differ at byte zero' # and so on here? That might make # more sense than this behavior. 
# Also, this should be made consistent # with the behavior when skip >= # filesize. if($volume){warn"$0: EOF on $file1\n" unless$size1; warn"$0: EOF on $file2\n" unless$size2;}exit 1;}} #9........... }, 'mangle3.def' => { source => "mangle3", params => "def", expect => <<'#10...........', # run with --mangle # Troublesome punctuation variables: $$ and $# # don't delete ws between '$$' and 'if' kill 'ABRT', $$ if $panic++; # Do not remove the space between '$#' and 'eq' $, = "Hello, World!\n"; $# = $,; print "$# "; $# eq $, ? print "yes\n" : print "no\n"; # The space after the '?' is essential and must not be deleted print $::opt_m ? " Files: " . my_wrap( "", " ", $v ) : $v; # must not remove space before 'CAKE' use constant CAKE => atan2( 1, 1 ) / 2; if ( $arc >= - CAKE && $arc <= CAKE ) { } # do not remove the space after 'JUNK': print JUNK ( "<", "&", ">" )[ rand(3) ]; # make these a bit more likely #10........... }, 'mangle3.mangle' => { source => "mangle3", params => "mangle", expect => <<'#11...........', # run with --mangle # Troublesome punctuation variables: $$ and $# # don't delete ws between '$$' and 'if' kill 'ABRT',$$ if$panic++; # Do not remove the space between '$#' and 'eq' $,="Hello, World!\n"; $#=$,; print"$# "; $# eq$,?print"yes\n":print"no\n"; # The space after the '?' is essential and must not be deleted print$::opt_m ? " Files: ".my_wrap(""," ",$v):$v; # must not remove space before 'CAKE' use constant CAKE=>atan2(1,1)/2; if($arc>=- CAKE&&$arc<=CAKE){} # do not remove the space after 'JUNK': print JUNK ("<","&",">")[rand(3)];# make these a bit more likely #11........... }, 'math1.def' => { source => "math1", params => "def", expect => <<'#12...........', my $xyz_shield = [ [ -0.060, -0.060, 0. ], [ 0.060, -0.060, 0. ], [ 0.060, 0.060, 0. ], [ -0.060, 0.060, 0. ], [ -0.0925, -0.0925, 0.092 ], [ 0.0925, -0.0925, 0.092 ], [ 0.0925, 0.0925, 0.092 ], [ -0.0925, 0.0925, 0.092 ], ]; #12........... 
}, 'math2.def' => { source => "math2", params => "def", expect => <<'#13...........', $ans = pdl( [ 0, 0, 0, 0, 0 ], [ 0, 0, 2, 0, 0 ], [ 0, 1, 5, 2, 0 ], [ 0, 0, 4, 0, 0 ], [ 0, 0, 0, 0, 0 ] ); #13........... }, 'math3.def' => { source => "math3", params => "def", expect => <<'#14...........', my ( $x, $y ) = ( $x0 + $index_x * $xgridwidth * $xm + ( $map_x * $xm * $xgridwidth ) / $detailwidth, $y0 - $index_y * $ygridwidth * $ym - ( $map_y * $ym * $ygridwidth ) / $detailheight, ); #14........... }, 'math4.def' => { source => "math4", params => "def", expect => <<'#15...........', my $u = ( $range * $pratio**( 1. / 3. ) ) / $wratio; my $factor = exp( -( 18 / $u )**4 ); my $ovp = ( 1 - $factor ) * ( 70 - 0.655515 * $u ) + ( 1000 / ( $u**1.3 ) + 10000 / ( $u**3.3 ) ) * $factor; my $impulse = ( 1 - $factor ) * ( 170 - $u ) + ( 350 / $u**0.65 + 500 / $u**5 ) * $factor; $ovp = $ovp * $pratio; $impulse = $impulse * $wratio * $pratio**( 2 / 3 ); #15........... }, 'nasc.def' => { source => "nasc", params => "def", expect => <<'#16...........', # will break and add semicolon unless -nasc is given eval { $terminal = Tgetent Term::Cap { TERM => undef, OSPEED => $ospeed }; }; #16........... }, 'nasc.nasc' => { source => "nasc", params => "nasc", expect => <<'#17...........', # will break and add semicolon unless -nasc is given eval { $terminal = Tgetent Term::Cap { TERM => undef, OSPEED => $ospeed } }; #17........... }, 'nothing.def' => { source => "nothing", params => "def", expect => <<'#18...........', #18........... }, 'nothing.nothing' => { source => "nothing", params => "nothing", expect => <<'#19...........', #19........... }, 'otr1.def' => { source => "otr1", params => "def", expect => <<'#20...........', return $pdl->slice( join ',', ( map { $_ eq "X" ? ":" : ref $_ eq "ARRAY" ? join ':', @$_ : !ref $_ ? $_ : die "INVALID SLICE DEF $_" } @_ ) ); #20........... 
        },
    };

    my $ntests = 0 + keys %{$rtests};
    plan tests => $ntests;
}

###############
# EXECUTE TESTS
###############

foreach my $key ( sort keys %{$rtests} ) {
    my $output;
    my $sname  = $rtests->{$key}->{source};
    my $expect = $rtests->{$key}->{expect};
    my $pname  = $rtests->{$key}->{params};
    my $source = $rsources->{$sname};
    my $params = defined($pname) ? $rparams->{$pname} : "";
    my $stderr_string;
    my $errorfile_string;
    my $err = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        perltidyrc  => \$params,
        argv        => '',    # for safety; hide any ARGV from perltidy
        stderr      => \$stderr_string,
        errorfile   => \$errorfile_string,    # not used when -se flag is set
    );
    if ( $err || $stderr_string || $errorfile_string ) {
        print STDERR "Error output received for test '$key'\n";
        if ($err) {
            print STDERR "An error flag '$err' was returned\n";
            ok( !$err );
        }
        if ($stderr_string) {
            print STDERR "---------------------\n";
            print STDERR "<<STDERR>>\n$stderr_string\n";
            print STDERR "---------------------\n";
            ok( !$stderr_string );
        }
        if ($errorfile_string) {
            print STDERR "---------------------\n";
            print STDERR "<<.ERR file>>\n$errorfile_string\n";
            print STDERR "---------------------\n";
            ok( !$errorfile_string );
        }
    }
    else {
        if ( !is( $output, $expect, $key ) ) {
            my $leno = length($output);
            my $lene = length($expect);
            if ( $leno == $lene ) {
                print STDERR
"#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n";
            }
            else {
                print STDERR "#> Test '$key' gave unexpected output.
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets19.t0000644000175000017500000003662414373177245015324 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 misc_tests.misc_tests #2 outdent.def #3 outdent.outdent1 #4 sbq.def #5 sbq.sbq0 #6 sbq.sbq2 #7 tightness.def #8 tightness.tightness1 #9 tightness.tightness2 #10 tightness.tightness3 #11 braces.braces4 #12 scbb.def #13 scbb.scbb #14 space_paren.def #15 space_paren.space_paren1 #16 space_paren.space_paren2 #17 braces.braces5 #18 braces.braces6 #19 maths.maths3 # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'braces4' => "-icb", 'braces5' => <<'----------', -bli -blil='if' ---------- 'braces6' => "-ce", 'def' => "", 'maths3' => <<'----------', # test some bizarre spacing around operators -nwls="= / *" -wrs="= / *" -nwrs="+ -" -wls="+ -" ---------- 'misc_tests' => <<'----------', -sts -ssc -sfs -nsak="my for" -ndsm ---------- 'outdent1' => <<'----------', # test -nola -okw -nola -okw ---------- 'sbq0' => "-sbq=0", 'sbq2' => "-sbq=2", 'scbb' => "-scbb", 'space_paren1' => "-sfp -skp", 'space_paren2' => "-sak=push", 'tightness1' => "-pt=0 -sbt=0 -bt=0 -bbt=0", 'tightness2' => <<'----------', -pt=1 -sbt=1 -bt=1 -bbt=1 ---------- 'tightness3' => <<'----------', -pt=2 -sbt=2 -bt=2 -bbt=2 ---------- }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'braces' => <<'----------', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; 
Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; ---------- 'maths' => <<'----------', $tmp = $day - 32075 + 1461 * ( $year + 4800 - ( 14 - $month ) / 12 ) / 4 + 367 * ( $month - 2 + ( ( 14 - $month ) / 12 ) * 12 ) / 12 - 3 * ( ( $year + 4900 - ( 14 - $month ) / 12 ) / 100 ) / 4; return ( $r**$n ) * ( pi**( $n / 2 ) ) / ( sqrt(pi) * factorial( 2 * ( int( $n / 2 ) ) + 2 ) / factorial( int( $n / 2 ) + 1 ) / ( 4**( int( $n / 2 ) + 1 ) ) ); $root=-$b+sqrt($b*$b-4.*$a*$c)/(2.*$a); ---------- 'misc_tests' => <<'----------', for ( @a = @$ap, $u = shift @a; @a; $u = $v ) { ... } # test -sfs $i = 1 ; # test -sts $i = 0; ## =1; test -ssc ;;;; # test -ndsm my ( $a, $b, $c ) = @_; # test -nsak="my for" ---------- 'outdent' => <<'----------', my $i; LOOP: while ( $i = ) { chomp($i); next unless $i; fixit($i); } ---------- 'sbq' => <<'----------', $str1=\"string1"; $str2=\ 'string2'; ---------- 'scbb' => <<'----------', # test -scbb: for $w1 (@w1) { for $w2 (@w2) { for $w3 (@w3) { for $w4 (@w4) { push( @lines, "$w1 $w2 $w3 $w4\n" ); } } } } ---------- 'space_paren' => <<'----------', myfunc ( $a, $b, $c ); # test -sfp push ( @array, $val ); # test -skp and also -sak='push' split( /\|/, $txt ); # test -skp and also -sak='push' my ( $v1, $v2 ) = @_; # test -sak='push' $c-> #sub set_whitespace_flags must look back past side comment bind( $o, $n, [ \&$q, \%m ] ); ---------- 'tightness' => <<'----------', if (( my $len_tab = length( $tabstr ) ) > 0) { } # test -pt $width = $col[ $j + $k ] - $col[ $j ]; # test -sbt $obj->{ $parsed_sql->{ 'table' }[0] }; # test -bt %bf = map { $_ => -M $_ } grep { /\.deb$/ } dirents '.'; # test -bbt ---------- }; #################################### # BEGIN SECTION 3: Expected 
output # #################################### $rtests = { 'misc_tests.misc_tests' => { source => "misc_tests", params => "misc_tests", expect => <<'#1...........', for( @a = @$ap, $u = shift @a ; @a ; $u = $v ) { ... } # test -sfs $i = 1 ; # test -sts $i = 0 ; ## =1; test -ssc ; ; ; ; # test -ndsm my( $a, $b, $c ) = @_ ; # test -nsak="my for" #1........... }, 'outdent.def' => { source => "outdent", params => "def", expect => <<'#2...........', my $i; LOOP: while ( $i = ) { chomp($i); next unless $i; fixit($i); } #2........... }, 'outdent.outdent1' => { source => "outdent", params => "outdent1", expect => <<'#3...........', my $i; LOOP: while ( $i = ) { chomp($i); next unless $i; fixit($i); } #3........... }, 'sbq.def' => { source => "sbq", params => "def", expect => <<'#4...........', $str1 = \"string1"; $str2 = \ 'string2'; #4........... }, 'sbq.sbq0' => { source => "sbq", params => "sbq0", expect => <<'#5...........', $str1 = \"string1"; $str2 = \'string2'; #5........... }, 'sbq.sbq2' => { source => "sbq", params => "sbq2", expect => <<'#6...........', $str1 = \ "string1"; $str2 = \ 'string2'; #6........... }, 'tightness.def' => { source => "tightness", params => "def", expect => <<'#7...........', if ( ( my $len_tab = length($tabstr) ) > 0 ) { } # test -pt $width = $col[ $j + $k ] - $col[$j]; # test -sbt $obj->{ $parsed_sql->{'table'}[0] }; # test -bt %bf = map { $_ => -M $_ } grep { /\.deb$/ } dirents '.'; # test -bbt #7........... }, 'tightness.tightness1' => { source => "tightness", params => "tightness1", expect => <<'#8...........', if ( ( my $len_tab = length( $tabstr ) ) > 0 ) { } # test -pt $width = $col[ $j + $k ] - $col[ $j ]; # test -sbt $obj->{ $parsed_sql->{ 'table' }[ 0 ] }; # test -bt %bf = map { $_ => -M $_ } grep { /\.deb$/ } dirents '.'; # test -bbt #8........... 
}, 'tightness.tightness2' => { source => "tightness", params => "tightness2", expect => <<'#9...........', if ( ( my $len_tab = length($tabstr) ) > 0 ) { } # test -pt $width = $col[ $j + $k ] - $col[$j]; # test -sbt $obj->{ $parsed_sql->{'table'}[0] }; # test -bt %bf = map { $_ => -M $_ } grep {/\.deb$/} dirents '.'; # test -bbt #9........... }, 'tightness.tightness3' => { source => "tightness", params => "tightness3", expect => <<'#10...........', if ((my $len_tab = length($tabstr)) > 0) { } # test -pt $width = $col[$j + $k] - $col[$j]; # test -sbt $obj->{$parsed_sql->{'table'}[0]}; # test -bt %bf = map {$_ => -M $_} grep {/\.deb$/} dirents '.'; # test -bbt #10........... }, 'braces.braces4' => { source => "braces", params => "braces4", expect => <<'#11...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; #11........... }, 'scbb.def' => { source => "scbb", params => "def", expect => <<'#12...........', # test -scbb: for $w1 (@w1) { for $w2 (@w2) { for $w3 (@w3) { for $w4 (@w4) { push( @lines, "$w1 $w2 $w3 $w4\n" ); } } } } #12........... }, 'scbb.scbb' => { source => "scbb", params => "scbb", expect => <<'#13...........', # test -scbb: for $w1 (@w1) { for $w2 (@w2) { for $w3 (@w3) { for $w4 (@w4) { push( @lines, "$w1 $w2 $w3 $w4\n" ); } } } } #13........... 
}, 'space_paren.def' => { source => "space_paren", params => "def", expect => <<'#14...........', myfunc( $a, $b, $c ); # test -sfp push( @array, $val ); # test -skp and also -sak='push' split( /\|/, $txt ); # test -skp and also -sak='push' my ( $v1, $v2 ) = @_; # test -sak='push' $c-> #sub set_whitespace_flags must look back past side comment bind( $o, $n, [ \&$q, \%m ] ); #14........... }, 'space_paren.space_paren1' => { source => "space_paren", params => "space_paren1", expect => <<'#15...........', myfunc ( $a, $b, $c ); # test -sfp push ( @array, $val ); # test -skp and also -sak='push' split ( /\|/, $txt ); # test -skp and also -sak='push' my ( $v1, $v2 ) = @_; # test -sak='push' $c-> #sub set_whitespace_flags must look back past side comment bind ( $o, $n, [ \&$q, \%m ] ); #15........... }, 'space_paren.space_paren2' => { source => "space_paren", params => "space_paren2", expect => <<'#16...........', myfunc( $a, $b, $c ); # test -sfp push ( @array, $val ); # test -skp and also -sak='push' split( /\|/, $txt ); # test -skp and also -sak='push' my ( $v1, $v2 ) = @_; # test -sak='push' $c-> #sub set_whitespace_flags must look back past side comment bind( $o, $n, [ \&$q, \%m ] ); #16........... }, 'braces.braces5' => { source => "braces", params => "braces5", expect => <<'#17...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; #17........... 
}, 'braces.braces6' => { source => "braces", params => "braces6", expect => <<'#18...........', sub message { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } } $myfun = sub { print("Hello, World\n"); }; eval { my $app = App::perlbrew->new( "install-patchperl", "-q" ); $app->run(); } or do { $error = $@; $produced_error = 1; }; Mojo::IOLoop->next_tick( sub { $ua->get( '/' => sub { push @kept_alive, pop->kept_alive; Mojo::IOLoop->next_tick( sub { Mojo::IOLoop->stop } ); } ); } ); $r = do { sswitch( $words[ rand @words ] ) { case $words[0]: case $words[1]: case $words[2]: case $words[3]: { 'ok' } default: { 'wtf' } } }; try { die; } catch { die; }; #18........... }, 'maths.maths3' => { source => "maths", params => "maths3", expect => <<'#19...........', $tmp= $day -32075 + 1461* ( $year +4800 -( 14 -$month )/ 12 )/ 4 + 367* ( $month -2 +( ( 14 -$month )/ 12 )* 12 )/ 12 - 3* ( ( $year +4900 -( 14 -$month )/ 12 )/ 100 )/ 4; return ( $r**$n )* ( pi**( $n/ 2 ) )/ ( sqrt(pi)* factorial( 2* ( int( $n/ 2 ) ) +2 )/ factorial( int( $n/ 2 ) +1 ) / ( 4**( int( $n/ 2 ) +1 ) ) ); $root= -$b +sqrt( $b* $b -4.* $a* $c )/ ( 2.* $a ); #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets7.t0000644000175000017500000002744714373177246015245 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 rt102451.def #2 rt104427.def #3 rt106492.def #4 rt107832.def #5 rt107832.rt107832 #6 rt111519.def #7 rt111519.rt111519 #8 rt112534.def #9 rt113689.def #10 rt113689.rt113689 #11 rt113792.def #12 rt114359.def #13 rt114909.def #14 rt116344.def #15 rt119140.def #16 rt119588.def #17 rt119970.def #18 rt119970.rt119970 #19 rt123492.def #20 rt123749.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'rt107832' => <<'----------', -lp -boc ---------- 'rt111519' => <<'----------', -io -dac ---------- 'rt113689' => <<'----------', -blao=2 -blbc=1 -blaol='*' -blbcl='*' -mbl=2 ---------- 'rt119970' => "-wn", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'rt102451' => <<'----------', # RT#102451 bug test; unwanted spaces added before =head1 on each pass #<<< =head1 NAME =cut my %KA_CACHE; # indexed by uhost currently, points to [$handle...] 
array =head1 NAME =cut #>>> ---------- 'rt104427' => <<'----------', #!/usr/bin/env perl use v5.020; #includes strict use warnings; use experimental 'signatures'; setidentifier(); exit; sub setidentifier ( $href = {} ) { say 'hi'; } ---------- 'rt106492' => <<'----------', my $ct = Courriel::Header::ContentType->new( mime_type => 'multipart/alternative', attributes => { boundary => unique_boundary }, ); ---------- 'rt107832' => <<'----------', my %temp = ( supsup => 123, nested => { asdf => 456, yarg => 'yarp', }, ); ---------- 'rt111519' => <<'----------', use strict; use warnings; my $x = 1; # comment not removed # comment will be removed my $y = 2; # comment also not removed ---------- 'rt112534' => <<'----------', get( on_ready => sub ($worker) { $on_ready->end; return; }, on_exit => sub ( $worker, $status ) { return; }, on_data => sub ($data) { $self->_on_data(@_) if $self; return; } ); ---------- 'rt113689' => <<'----------', $a = sub { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } }; ---------- 'rt113792' => <<'----------', print "hello world\n"; __DATA__ => 1/2 : 0.5 ---------- 'rt114359' => <<'----------', my $x = 2; print $x ** 0.5; ---------- 'rt114909' => <<'----------', #!perl use strict; use warnings; use experimental 'signatures'; sub reader ( $line_sep, $chomp ) { return sub ( $fh, $out ) : prototype(*$) { local $/ = $line_sep; my $content = <$fh>; return undef unless defined $content; chomp $content if $chomp; $$out .= $content; return 1; }; } BEGIN { *get_line = reader( "\n", 1 ); } while ( get_line( STDIN, \my $buf ) ) { print "Got: $buf\n"; } ---------- 'rt116344' => <<'----------', # Rt116344 # Attempting to tidy the following code failed: sub broken { return ref {} ? 
1 : 0; something(); } ---------- 'rt119140' => <<'----------', while (<<>>) { } ---------- 'rt119588' => <<'----------', sub demo { my $self = shift; my $longname = shift // "xyz"; } ---------- 'rt119970' => <<'----------', my $x = [ { fooxx => 1, bar => 1, } ]; ---------- 'rt123492' => <<'----------', if (1) { print <<~EOF; Hello there EOF } ---------- 'rt123749' => <<'----------', get('http://mojolicious.org')->then( sub { my $mojo = shift; say $mojo->res->code; return get('http://metacpan.org'); } )->then( sub { my $cpan = shift; say $cpan->res->code; } )->catch( sub { my $err = shift; warn "Something went wrong: $err"; } )->wait; ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'rt102451.def' => { source => "rt102451", params => "def", expect => <<'#1...........', # RT#102451 bug test; unwanted spaces added before =head1 on each pass #<<< =head1 NAME =cut my %KA_CACHE; # indexed by uhost currently, points to [$handle...] array =head1 NAME =cut #>>> #1........... }, 'rt104427.def' => { source => "rt104427", params => "def", expect => <<'#2...........', #!/usr/bin/env perl use v5.020; #includes strict use warnings; use experimental 'signatures'; setidentifier(); exit; sub setidentifier ( $href = {} ) { say 'hi'; } #2........... }, 'rt106492.def' => { source => "rt106492", params => "def", expect => <<'#3...........', my $ct = Courriel::Header::ContentType->new( mime_type => 'multipart/alternative', attributes => { boundary => unique_boundary }, ); #3........... }, 'rt107832.def' => { source => "rt107832", params => "def", expect => <<'#4...........', my %temp = ( supsup => 123, nested => { asdf => 456, yarg => 'yarp', }, ); #4........... }, 'rt107832.rt107832' => { source => "rt107832", params => "rt107832", expect => <<'#5...........', my %temp = ( supsup => 123, nested => { asdf => 456, yarg => 'yarp', }, ); #5........... 
}, 'rt111519.def' => { source => "rt111519", params => "def", expect => <<'#6...........', use strict; use warnings; my $x = 1; # comment not removed # comment will be removed my $y = 2; # comment also not removed #6........... }, 'rt111519.rt111519' => { source => "rt111519", params => "rt111519", expect => <<'#7...........', use strict; use warnings; my $x = 1; my $y = 2; #7........... }, 'rt112534.def' => { source => "rt112534", params => "def", expect => <<'#8...........', get( on_ready => sub ($worker) { $on_ready->end; return; }, on_exit => sub ( $worker, $status ) { return; }, on_data => sub ($data) { $self->_on_data(@_) if $self; return; } ); #8........... }, 'rt113689.def' => { source => "rt113689", params => "def", expect => <<'#9...........', $a = sub { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } }; #9........... }, 'rt113689.rt113689' => { source => "rt113689", params => "rt113689", expect => <<'#10...........', $a = sub { if ( !defined( $_[0] ) ) { print("Hello, World\n"); } else { print( $_[0], "\n" ); } }; #10........... }, 'rt113792.def' => { source => "rt113792", params => "def", expect => <<'#11...........', print "hello world\n"; __DATA__ => 1/2 : 0.5 #11........... }, 'rt114359.def' => { source => "rt114359", params => "def", expect => <<'#12...........', my $x = 2; print $x **0.5; #12........... }, 'rt114909.def' => { source => "rt114909", params => "def", expect => <<'#13...........', #!perl use strict; use warnings; use experimental 'signatures'; sub reader ( $line_sep, $chomp ) { return sub ( $fh, $out ) : prototype(*$) { local $/ = $line_sep; my $content = <$fh>; return undef unless defined $content; chomp $content if $chomp; $$out .= $content; return 1; }; } BEGIN { *get_line = reader( "\n", 1 ); } while ( get_line( STDIN, \my $buf ) ) { print "Got: $buf\n"; } #13........... 
}, 'rt116344.def' => { source => "rt116344", params => "def", expect => <<'#14...........', # Rt116344 # Attempting to tidy the following code failed: sub broken { return ref {} ? 1 : 0; something(); } #14........... }, 'rt119140.def' => { source => "rt119140", params => "def", expect => <<'#15...........', while ( <<>> ) { } #15........... }, 'rt119588.def' => { source => "rt119588", params => "def", expect => <<'#16...........', sub demo { my $self = shift; my $longname = shift // "xyz"; } #16........... }, 'rt119970.def' => { source => "rt119970", params => "def", expect => <<'#17...........', my $x = [ { fooxx => 1, bar => 1, } ]; #17........... }, 'rt119970.rt119970' => { source => "rt119970", params => "rt119970", expect => <<'#18...........', my $x = [ { fooxx => 1, bar => 1, } ]; #18........... }, 'rt123492.def' => { source => "rt123492", params => "def", expect => <<'#19...........', if (1) { print <<~EOF; Hello there EOF } #19........... }, 'rt123749.def' => { source => "rt123749", params => "def", expect => <<'#20...........', get('http://mojolicious.org')->then( sub { my $mojo = shift; say $mojo->res->code; return get('http://metacpan.org'); } )->then( sub { my $cpan = shift; say $cpan->res->code; } )->catch( sub { my $err = shift; warn "Something went wrong: $err"; } )->wait; #20........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? 
$rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. 
String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/t/snippets24.t0000644000175000017500000005601614373177245015315 0ustar stevesteve# Created with: ./make_t.pl # Contents: #1 git54.def #2 git54.git54 #3 fpva.def #4 fpva.fpva1 #5 fpva.fpva2 #6 lpxl.def #7 lpxl.lpxl1 #8 lpxl.lpxl3 #9 lpxl.lpxl4 #10 lpxl.lpxl5 #11 git63.def #12 align35.def #13 rt136417.def #14 rt136417.rt136417 #15 numbers.def #16 code_skipping.def #17 git51.def #18 git51.git51 #19 pretok.def # To locate test #13 you can search for its name or the string '#13' use strict; use Test::More; use Carp; use Perl::Tidy; my $rparams; my $rsources; my $rtests; BEGIN { ########################################### # BEGIN SECTION 1: Parameter combinations # ########################################### $rparams = { 'def' => "", 'fpva1' => "-sfp", 'fpva2' => <<'----------', -sfp -wls='->' -wrs='->' -nfpva ---------- 'git51' => <<'----------', --maximum-line-length=120 --converge --tabs --entab-leading-whitespace=4 --continuation-indentation=4 --extended-continuation-indentation --no-delete-old-newlines --no-outdent-long-lines --no-outdent-labels --novalign --no-logical-padding --opening-sub-brace-on-new-line --square-bracket-tightness=2 --paren-tightness=2 --brace-tightness=2 --opening-token-right -sal='first any sum sum0 reduce' ---------- 'git54' => "-bbp=3 -bbpi=2 -ci=4 -lp", 'lpxl1' => "-lp", 'lpxl3' => <<'----------', -lp -lpxl='{ [ (' ---------- 'lpxl4' => <<'----------', -lp -lpxl='{ [ W(1' ---------- 'lpxl5' => <<'----------', -lp -lpxl='{ [ F(2' ---------- 'rt136417' => "-vtc=3", }; ############################ # BEGIN SECTION 2: Sources # ############################ $rsources = { 'align35' => <<'----------', # different module names, do not align commas (fixes rt136416) use File::Spec::Functions 'catfile', 'catdir'; use Mojo::Base 'Mojolicious', '-signatures'; # same module names, align fat commas use constant PI => 4 * atan2 1, 1; use constant TWOPI => 2 * PI; use constant 
FOURPI => 4 * PI; # same module names, align commas use TestCounter '3rd-party', 0, '3rd-party no longer visible'; use TestCounter 'replace', 1, 'replacement now visible'; use TestCounter 'root'; # same module name, align fat commas but not commas use constant COUNTDOWN => scalar reverse 1, 2, 3, 4, 5; use constant COUNTUP => reverse 1, 2, 3, 4, 5; use constant COUNTDOWN => scalar reverse 1, 2, 3, 4, 5; ---------- 'code_skipping' => <<'----------', %Hdr=%U2E=%E2U=%Fallback=(); $in_charmap=$nerror=$nwarning=0; $.=0; #<>V my $self=shift; my $cloning=shift; ---------- 'fpva' => <<'----------', log_something_with_long_function( 'This is a log message.', 2 ); Coro::AnyEvent::sleep( 3, 4 ); use Carp (); use File::Spec (); use File::Path (); $self -> method ( 'parameter_0', 'parameter_1' ); $self -> method_with_long_name ( 'parameter_0', 'parameter_1' ); ---------- 'git51' => <<'----------', Type::Libraries->setup_class( __PACKAGE__, qw( Types::Standard Types::Common::Numeric ), # <--- brace here ); ---------- 'git54' => <<'----------', # testing sensitivity to excess commas my $definition => ( { key1 => value1 }, { key2 => value2 }, ); my $definition => ( { key => value } ); my $definition => ( { key => value }, ); my $definition => ( { key => value, }, ); my $list = ( { key => $value, key => $value, key => $value, key => $value, key => $value, }, ) ; my $list = ( { key => $value, key => $value, key => $value, key => $value, key => $value, } ) ; ---------- 'git63' => <<'----------', my $fragment = $parser-> #parse_html_string parse_balanced_chunk($I); ---------- 'lpxl' => <<'----------', # simple function call my $loanlength = getLoanLength( $borrower->{'categorycode'}, # sc1 $iteminformation->{'itemtype'}, $borrower->{'branchcode'} # sc3 ); # function call, more than one level deep my $o = very::long::class::name->new( { propA => "a", propB => "b", propC => "c", } ); # function call with sublist debug( "Connecting to DB.", "Extra-Parameters: " . 
join("<->", $extra_parms), "Config: " . join("<->", %config) ); # simple function call with code block $m->command(-label => 'Save', -command => sub { print "DOS\n"; save_dialog($win); }); # function call, ternary in list return OptArgs2::Result->usage( $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . "\n" ); # not a function call %blastparam = ( -run => \%runparam, -file => '', -parse => 1, -signif => 1e-5, ); # 'local' is a keyword, not a user function local ( $len, $pts, @colspec, $char, $cols, $repeat, $celldata, $at_text, $after_text ); # square bracket with sublists $data = [ ListElem->new(id => 0, val => 100), ListElem->new(id => 2, val => 50), ListElem->new(id => 1, val => 10), ]; # curly brace with sublists $behaviour = { cat => {nap => "lap", eat => "meat"}, dog => {prowl => "growl", pool => "drool"}, mouse => {nibble => "kibble"}, }; ---------- 'numbers' => <<'----------', # valid numbers my @vals = ( 12345, 12345.67, .23E-10, 3.14_15_92, 4_294_967_296, 0xff, 0xdead_beef, 0377, 0b011011, 0x1.999ap-4, 1e34, 1e+34, 1e+034, -1e+034, 0.00000000000000000000000000000000000000000000000000000000000000000001, 0Xabcdef, 0B1101, 0o12_345, # optional 'o' and 'O' added in perl v5.33.5 0O12_345, ); ---------- 'pretok' => <<'----------', # test sub split_pretoken my$s1=$^??"def":"not def"; my$s2=$^ ?"def":"not def"; my$s3=$^if($s2); my$s4=$^Oeq"linux"; my$s5=$ ^One"linux"; my$s6=$ ^One"linux"; my$s7=%^O; my$s8='hi'.'s'x10if(1); my$s9='merci'x0.1e4.$s8; ---------- 'rt136417' => <<'----------', function( # a, b, c); %hash = ( a => b, c => d, ); ---------- }; #################################### # BEGIN SECTION 3: Expected output # #################################### $rtests = { 'git54.def' => { source => "git54", params => "def", expect => <<'#1...........', # testing sensitivity to excess commas my $definition => ( { key1 => value1 }, { key2 => value2 }, ); my $definition => ( { key => value } ); my $definition => ( { key => value }, 
); my $definition => ( { key => value, }, ); my $list = ( { key => $value, key => $value, key => $value, key => $value, key => $value, }, ); my $list = ( { key => $value, key => $value, key => $value, key => $value, key => $value, } ); #1........... }, 'git54.git54' => { source => "git54", params => "git54", expect => <<'#2...........', # testing sensitivity to excess commas my $definition => ( { key1 => value1 }, { key2 => value2 }, ); my $definition => ( { key => value } ); my $definition => ( { key => value }, ); my $definition => ( { key => value, }, ); my $list = ( { key => $value, key => $value, key => $value, key => $value, key => $value, }, ); my $list = ( { key => $value, key => $value, key => $value, key => $value, key => $value, } ); #2........... }, 'fpva.def' => { source => "fpva", params => "def", expect => <<'#3...........', log_something_with_long_function( 'This is a log message.', 2 ); Coro::AnyEvent::sleep( 3, 4 ); use Carp (); use File::Spec (); use File::Path (); $self->method( 'parameter_0', 'parameter_1' ); $self->method_with_long_name( 'parameter_0', 'parameter_1' ); #3........... }, 'fpva.fpva1' => { source => "fpva", params => "fpva1", expect => <<'#4...........', log_something_with_long_function ( 'This is a log message.', 2 ); Coro::AnyEvent::sleep ( 3, 4 ); use Carp (); use File::Spec (); use File::Path (); $self->method ( 'parameter_0', 'parameter_1' ); $self->method_with_long_name ( 'parameter_0', 'parameter_1' ); #4........... }, 'fpva.fpva2' => { source => "fpva", params => "fpva2", expect => <<'#5...........', log_something_with_long_function ( 'This is a log message.', 2 ); Coro::AnyEvent::sleep ( 3, 4 ); use Carp (); use File::Spec (); use File::Path (); $self -> method ( 'parameter_0', 'parameter_1' ); $self -> method_with_long_name ( 'parameter_0', 'parameter_1' ); #5........... 
}, 'lpxl.def' => { source => "lpxl", params => "def", expect => <<'#6...........', # simple function call my $loanlength = getLoanLength( $borrower->{'categorycode'}, # sc1 $iteminformation->{'itemtype'}, $borrower->{'branchcode'} # sc3 ); # function call, more than one level deep my $o = very::long::class::name->new( { propA => "a", propB => "b", propC => "c", } ); # function call with sublist debug( "Connecting to DB.", "Extra-Parameters: " . join( "<->", $extra_parms ), "Config: " . join( "<->", %config ) ); # simple function call with code block $m->command( -label => 'Save', -command => sub { print "DOS\n"; save_dialog($win); } ); # function call, ternary in list return OptArgs2::Result->usage( $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . "\n" ); # not a function call %blastparam = ( -run => \%runparam, -file => '', -parse => 1, -signif => 1e-5, ); # 'local' is a keyword, not a user function local ( $len, $pts, @colspec, $char, $cols, $repeat, $celldata, $at_text, $after_text ); # square bracket with sublists $data = [ ListElem->new( id => 0, val => 100 ), ListElem->new( id => 2, val => 50 ), ListElem->new( id => 1, val => 10 ), ]; # curly brace with sublists $behaviour = { cat => { nap => "lap", eat => "meat" }, dog => { prowl => "growl", pool => "drool" }, mouse => { nibble => "kibble" }, }; #6........... }, 'lpxl.lpxl1' => { source => "lpxl", params => "lpxl1", expect => <<'#7...........', # simple function call my $loanlength = getLoanLength( $borrower->{'categorycode'}, # sc1 $iteminformation->{'itemtype'}, $borrower->{'branchcode'} # sc3 ); # function call, more than one level deep my $o = very::long::class::name->new( { propA => "a", propB => "b", propC => "c", } ); # function call with sublist debug( "Connecting to DB.", "Extra-Parameters: " . join( "<->", $extra_parms ), "Config: " . 
join( "<->", %config ) ); # simple function call with code block $m->command( -label => 'Save', -command => sub { print "DOS\n"; save_dialog($win); } ); # function call, ternary in list return OptArgs2::Result->usage( $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . "\n" ); # not a function call %blastparam = ( -run => \%runparam, -file => '', -parse => 1, -signif => 1e-5, ); # 'local' is a keyword, not a user function local ( $len, $pts, @colspec, $char, $cols, $repeat, $celldata, $at_text, $after_text ); # square bracket with sublists $data = [ ListElem->new( id => 0, val => 100 ), ListElem->new( id => 2, val => 50 ), ListElem->new( id => 1, val => 10 ), ]; # curly brace with sublists $behaviour = { cat => { nap => "lap", eat => "meat" }, dog => { prowl => "growl", pool => "drool" }, mouse => { nibble => "kibble" }, }; #7........... }, 'lpxl.lpxl3' => { source => "lpxl", params => "lpxl3", expect => <<'#8...........', # simple function call my $loanlength = getLoanLength( $borrower->{'categorycode'}, # sc1 $iteminformation->{'itemtype'}, $borrower->{'branchcode'} # sc3 ); # function call, more than one level deep my $o = very::long::class::name->new( { propA => "a", propB => "b", propC => "c", } ); # function call with sublist debug( "Connecting to DB.", "Extra-Parameters: " . join( "<->", $extra_parms ), "Config: " . join( "<->", %config ) ); # simple function call with code block $m->command( -label => 'Save', -command => sub { print "DOS\n"; save_dialog($win); } ); # function call, ternary in list return OptArgs2::Result->usage( $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . 
"\n" ); # not a function call %blastparam = ( -run => \%runparam, -file => '', -parse => 1, -signif => 1e-5, ); # 'local' is a keyword, not a user function local ( $len, $pts, @colspec, $char, $cols, $repeat, $celldata, $at_text, $after_text ); # square bracket with sublists $data = [ ListElem->new( id => 0, val => 100 ), ListElem->new( id => 2, val => 50 ), ListElem->new( id => 1, val => 10 ), ]; # curly brace with sublists $behaviour = { cat => { nap => "lap", eat => "meat" }, dog => { prowl => "growl", pool => "drool" }, mouse => { nibble => "kibble" }, }; #8........... }, 'lpxl.lpxl4' => { source => "lpxl", params => "lpxl4", expect => <<'#9...........', # simple function call my $loanlength = getLoanLength( $borrower->{'categorycode'}, # sc1 $iteminformation->{'itemtype'}, $borrower->{'branchcode'} # sc3 ); # function call, more than one level deep my $o = very::long::class::name->new( { propA => "a", propB => "b", propC => "c", } ); # function call with sublist debug( "Connecting to DB.", "Extra-Parameters: " . join( "<->", $extra_parms ), "Config: " . join( "<->", %config ) ); # simple function call with code block $m->command( -label => 'Save', -command => sub { print "DOS\n"; save_dialog($win); } ); # function call, ternary in list return OptArgs2::Result->usage( $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . "\n" ); # not a function call %blastparam = ( -run => \%runparam, -file => '', -parse => 1, -signif => 1e-5, ); # 'local' is a keyword, not a user function local ( $len, $pts, @colspec, $char, $cols, $repeat, $celldata, $at_text, $after_text ); # square bracket with sublists $data = [ ListElem->new( id => 0, val => 100 ), ListElem->new( id => 2, val => 50 ), ListElem->new( id => 1, val => 10 ), ]; # curly brace with sublists $behaviour = { cat => { nap => "lap", eat => "meat" }, dog => { prowl => "growl", pool => "drool" }, mouse => { nibble => "kibble" }, }; #9........... 
}, 'lpxl.lpxl5' => { source => "lpxl", params => "lpxl5", expect => <<'#10...........', # simple function call my $loanlength = getLoanLength( $borrower->{'categorycode'}, # sc1 $iteminformation->{'itemtype'}, $borrower->{'branchcode'} # sc3 ); # function call, more than one level deep my $o = very::long::class::name->new( { propA => "a", propB => "b", propC => "c", } ); # function call with sublist debug( "Connecting to DB.", "Extra-Parameters: " . join( "<->", $extra_parms ), "Config: " . join( "<->", %config ) ); # simple function call with code block $m->command( -label => 'Save', -command => sub { print "DOS\n"; save_dialog($win); } ); # function call, ternary in list return OptArgs2::Result->usage( $style == OptArgs2::STYLE_FULL ? 'FullUsage' : 'NormalUsage', 'usage: ' . $usage . "\n" ); # not a function call %blastparam = ( -run => \%runparam, -file => '', -parse => 1, -signif => 1e-5, ); # 'local' is a keyword, not a user function local ( $len, $pts, @colspec, $char, $cols, $repeat, $celldata, $at_text, $after_text ); # square bracket with sublists $data = [ ListElem->new( id => 0, val => 100 ), ListElem->new( id => 2, val => 50 ), ListElem->new( id => 1, val => 10 ), ]; # curly brace with sublists $behaviour = { cat => { nap => "lap", eat => "meat" }, dog => { prowl => "growl", pool => "drool" }, mouse => { nibble => "kibble" }, }; #10........... }, 'git63.def' => { source => "git63", params => "def", expect => <<'#11...........', my $fragment = $parser-> #parse_html_string parse_balanced_chunk($I); #11........... 
}, 'align35.def' => { source => "align35", params => "def", expect => <<'#12...........', # different module names, do not align commas (fixes rt136416) use File::Spec::Functions 'catfile', 'catdir'; use Mojo::Base 'Mojolicious', '-signatures'; # same module names, align fat commas use constant PI => 4 * atan2 1, 1; use constant TWOPI => 2 * PI; use constant FOURPI => 4 * PI; # same module names, align commas use TestCounter '3rd-party', 0, '3rd-party no longer visible'; use TestCounter 'replace', 1, 'replacement now visible'; use TestCounter 'root'; # same module name, align fat commas but not commas use constant COUNTDOWN => scalar reverse 1, 2, 3, 4, 5; use constant COUNTUP => reverse 1, 2, 3, 4, 5; use constant COUNTDOWN => scalar reverse 1, 2, 3, 4, 5; #12........... }, 'rt136417.def' => { source => "rt136417", params => "def", expect => <<'#13...........', function( # a, b, c ); %hash = ( a => b, c => d, ); #13........... }, 'rt136417.rt136417' => { source => "rt136417", params => "rt136417", expect => <<'#14...........', function( # a, b, c ); %hash = ( a => b, c => d, ); #14........... }, 'numbers.def' => { source => "numbers", params => "def", expect => <<'#15...........', # valid numbers my @vals = ( 12345, 12345.67, .23E-10, 3.14_15_92, 4_294_967_296, 0xff, 0xdead_beef, 0377, 0b011011, 0x1.999ap-4, 1e34, 1e+34, 1e+034, -1e+034, 0.00000000000000000000000000000000000000000000000000000000000000000001, 0Xabcdef, 0B1101, 0o12_345, # optional 'o' and 'O' added in perl v5.33.5 0O12_345, ); #15........... }, 'code_skipping.def' => { source => "code_skipping", params => "def", expect => <<'#16...........', %Hdr = %U2E = %E2U = %Fallback = (); $in_charmap = $nerror = $nwarning = 0; $. = 0; #<>V my $self = shift; my $cloning = shift; #16........... }, 'git51.def' => { source => "git51", params => "def", expect => <<'#17...........', Type::Libraries->setup_class( __PACKAGE__, qw( Types::Standard Types::Common::Numeric ), # <--- brace here ); #17........... 
}, 'git51.git51' => { source => "git51", params => "git51", expect => <<'#18...........', Type::Libraries->setup_class( __PACKAGE__, qw( Types::Standard Types::Common::Numeric ), # <--- brace here ); #18........... }, 'pretok.def' => { source => "pretok", params => "def", expect => <<'#19...........', # test sub split_pretoken my $s1 = $^? ? "def" : "not def"; my $s2 = $^ ? "def" : "not def"; my $s3 = $^ if ($s2); my $s4 = $^O eq "linux"; my $s5 = $^O ne "linux"; my $s6 = $^O ne "linux"; my $s7 = %^O; my $s8 = 'hi' . 's' x 10 if (1); my $s9 = 'merci' x 0.1e4 . $s8; #19........... }, }; my $ntests = 0 + keys %{$rtests}; plan tests => $ntests; } ############### # EXECUTE TESTS ############### foreach my $key ( sort keys %{$rtests} ) { my $output; my $sname = $rtests->{$key}->{source}; my $expect = $rtests->{$key}->{expect}; my $pname = $rtests->{$key}->{params}; my $source = $rsources->{$sname}; my $params = defined($pname) ? $rparams->{$pname} : ""; my $stderr_string; my $errorfile_string; my $err = Perl::Tidy::perltidy( source => \$source, destination => \$output, perltidyrc => \$params, argv => '', # for safety; hide any ARGV from perltidy stderr => \$stderr_string, errorfile => \$errorfile_string, # not used when -se flag is set ); if ( $err || $stderr_string || $errorfile_string ) { print STDERR "Error output received for test '$key'\n"; if ($err) { print STDERR "An error flag '$err' was returned\n"; ok( !$err ); } if ($stderr_string) { print STDERR "---------------------\n"; print STDERR "<>\n$stderr_string\n"; print STDERR "---------------------\n"; ok( !$stderr_string ); } if ($errorfile_string) { print STDERR "---------------------\n"; print STDERR "<<.ERR file>>\n$errorfile_string\n"; print STDERR "---------------------\n"; ok( !$errorfile_string ); } } else { if ( !is( $output, $expect, $key ) ) { my $leno = length($output); my $lene = length($expect); if ( $leno == $lene ) { print STDERR "#> Test '$key' gave unexpected output. 
Strings differ but both have length $leno\n"; } else { print STDERR "#> Test '$key' gave unexpected output. String lengths differ: output=$leno, expected=$lene\n"; } } } } Perl-Tidy-20230309/bin/0002755000175000017500000000000014401515241013374 5ustar stevestevePerl-Tidy-20230309/bin/perltidy0000755000175000017500000065075714400776172015213 0ustar stevesteve#!/usr/bin/perl package main; use Perl::Tidy; my $arg_string = undef; # give Macs a chance to provide command line parameters if ( $^O =~ /Mac/ ) { $arg_string = MacPerl::Ask( 'Please enter @ARGV (-h for help)', defined $ARGV[0] ? "\"$ARGV[0]\"" : "" ); } # Exit codes returned by perltidy: # 0 = no errors # 1 = perltidy could not run to completion due to errors # 2 = perltidy ran to completion with error messages exit Perl::Tidy::perltidy( argv => $arg_string ); __END__ =head1 NAME perltidy - a perl script indenter and reformatter =head1 SYNOPSIS perltidy [ options ] file1 file2 file3 ... (output goes to file1.tdy, file2.tdy, file3.tdy, ...) perltidy [ options ] file1 -o outfile perltidy [ options ] file1 -st >outfile perltidy [ options ] outfile =head1 DESCRIPTION Perltidy reads a perl script and writes an indented, reformatted script. The formatting process involves converting the script into a string of tokens, removing any non-essential whitespace, and then rewriting the string of tokens with whitespace using whatever rules are specified, or defaults. This happens in a series of operations which can be controlled with the parameters described in this document. Perltidy is a commandline frontend to the module Perl::Tidy. For documentation describing how to call the Perl::Tidy module from other applications see the separate documentation for Perl::Tidy. It is the file Perl::Tidy.pod in the source distribution. Many users will find enough information in L<"EXAMPLES"> to get started. 
New users may benefit from the short tutorial
which can be found at
http://perltidy.sourceforge.net/tutorial.html

A convenient aid to systematically defining a set of style parameters
can be found at
http://perltidy.sourceforge.net/stylekey.html

Perltidy can produce output in either of two modes, depending on the
presence of an B<-html> flag.  Without this flag, the output is passed
through a formatter.  The default formatting tries to follow the
recommendations in perlstyle(1), but it can be controlled in detail with
numerous input parameters, which are described in
L<"FORMATTING OPTIONS">.

When the B<-html> flag is given, the output is passed through an HTML
formatter which is described in L<"HTML OPTIONS">.

=head1 EXAMPLES

  perltidy somefile.pl

This will produce a file F<somefile.pl.tdy> containing the script
reformatted using the default options, which approximate the style
suggested in perlstyle(1).  The source file F<somefile.pl> is unchanged.

  perltidy *.pl

Execute perltidy on all F<.pl> files in the current directory with the
default options.  The output will be in files with an appended F<.tdy>
extension.  For any file with an error, there will be a file with
extension F<.ERR>.

  perltidy -b file1.pl file2.pl

Modify F<file1.pl> and F<file2.pl> in place, and back up the originals to
F<file1.pl.bak> and F<file2.pl.bak>.  If F<file1.pl.bak> and/or
F<file2.pl.bak> already exist, they will be overwritten.

  perltidy -b -bext='/' file1.pl file2.pl

Same as the previous example except that the backup files
F<file1.pl.bak> and F<file2.pl.bak> will be deleted if there are no
errors.

  perltidy -gnu somefile.pl

Execute perltidy on file F<somefile.pl> with a style which approximates
the GNU Coding Standards for C programs.  The output will be
F<somefile.pl.tdy>.

  perltidy -i=3 somefile.pl

Execute perltidy on file F<somefile.pl>, with 3 columns for each level of
indentation (B<-i=3>) instead of the default 4 columns.  There will not
be any tabs in the reformatted script, except for any which already
exist in comments, pod documents, quotes, and here documents.  Output
will be F<somefile.pl.tdy>.
perltidy -i=3 -et=8 somefile.pl Same as the previous example, except that leading whitespace will be entabbed with one tab character per 8 spaces. perltidy -ce -l=72 somefile.pl Execute perltidy on file F with all defaults except use "cuddled elses" (B<-ce>) and a maximum line length of 72 columns (B<-l=72>) instead of the default 80 columns. perltidy -g somefile.pl Execute perltidy on file F and save a log file F which shows the nesting of braces, parentheses, and square brackets at the start of every line. perltidy -html somefile.pl This will produce a file F containing the script with html markup. The output file will contain an embedded style sheet in the section which may be edited to change the appearance. perltidy -html -css=mystyle.css somefile.pl This will produce a file F containing the script with html markup. This output file will contain a link to a separate style sheet file F. If the file F does not exist, it will be created. If it exists, it will not be overwritten. perltidy -html -pre somefile.pl Write an html snippet with only the PRE section to F. This is useful when code snippets are being formatted for inclusion in a larger web page. No style sheet will be written in this case. perltidy -html -ss >mystyle.css Write a style sheet to F and exit. perltidy -html -frm mymodule.pm Write html with a frame holding a table of contents and the source code. The output files will be F (the frame), F (the table of contents), and F (the source code). =head1 OPTIONS - OVERVIEW The entire command line is scanned for options, and they are processed before any files are processed. As a result, it does not matter whether flags are before or after any filenames. However, the relative order of parameters is important, with later parameters overriding the values of earlier parameters. For each parameter, there is a long name and a short name. The short names are convenient for keyboard input, while the long names are self-documenting and therefore useful in scripts. 
It is customary to use two leading dashes for long names, but one may be used. Most parameters which serve as on/off flags can be negated with a leading "n" (for the short name) or a leading "no" or "no-" (for the long name). For example, the flag to outdent long quotes is B<-olq> or B<--outdent-long-quotes>. The flag to skip this is B<-nolq> or B<--nooutdent-long-quotes> or B<--no-outdent-long-quotes>. Options may not be bundled together. In other words, options B<-q> and B<-g> may NOT be entered as B<-qg>. Option names may be terminated early as long as they are uniquely identified. For example, instead of B<--dump-token-types>, it would be sufficient to enter B<--dump-tok>, or even B<--dump-t>, to uniquely identify this command. =head2 I/O Control The following parameters concern the files which are read and written. =over 4 =item B<-h>, B<--help> Show summary of usage and exit. =item B<-o>=filename, B<--outfile>=filename Name of the output file (only if a single input file is being processed). If no output file is specified, and output is not redirected to the standard output (see B<-st>), the output will go to F. [Note: - does not redirect to standard output. Use B<-st> instead.] =item B<-st>, B<--standard-output> Perltidy must be able to operate on an arbitrarily large number of files in a single run, with each output being directed to a different output file. Obviously this would conflict with outputting to the single standard output device, so a special flag, B<-st>, is required to request outputting to the standard output. For example, perltidy somefile.pl -st >somefile.new.pl This option may only be used if there is just a single input file. The default is B<-nst> or B<--nostandard-output>. =item B<-se>, B<--standard-error-output> If perltidy detects an error when processing file F, its default behavior is to write error messages to file F. Use B<-se> to cause all error messages to be sent to the standard error output stream instead. 
This directive may be negated with B<-nse>.  Thus, you may place B<-se>
in a F<.perltidyrc> and override it when desired with B<-nse> on the
command line.

=item B<-oext>=ext, B<--output-file-extension>=ext

Change the extension of the output file to be F<ext> instead of the
default F<tdy> (or F<html> in case the B<-html> option is used).
See L<"Specifying File Extensions">.

=item B<-opath>=path, B<--output-path>=path

When perltidy creates a filename for an output file, by default it merely
appends an extension to the path and basename of the input file.  This
parameter causes the path to be changed to F<path> instead.

The path should end in a valid path separator character, but perltidy will
try to add one if it is missing.

For example

    perltidy somefile.pl -opath=/tmp/

will produce F</tmp/somefile.pl.tdy>.  Otherwise, F<somefile.pl.tdy> will
appear in whatever directory contains F<somefile.pl>.

If the path contains spaces, it should be placed in quotes.

This parameter will be ignored if output is being directed to standard
output, or if it is being specified explicitly with the B<-o=s> parameter.

=item B<-b>, B<--backup-and-modify-in-place>

Modify the input file or files in-place and save the original with the
extension F<.bak>.  Any existing F<.bak> file will be deleted.  See next
item for changing the default backup extension, and for eliminating the
backup file altogether.

B<CAUTION>: Writing back to the input file increases the risk of data
loss or corruption in the event of a software or hardware malfunction.
Before using the B<-b> parameter please be sure to have backups and
verify that it works correctly in your environment and operating system.

A B<-b> flag will be ignored if input is from standard input or goes to
standard output, or if the B<-html> flag is set.

In particular, if you want to use both the B<-b> flag and the B<-pbp>
(--perl-best-practices) flag, then you must put a B<-nst> flag after the
B<-pbp> flag because it contains a B<-st> flag as one of its components,
which means that output will go to the standard output stream.

=item B<-bext>=ext, B<--backup-file-extension>=ext

This parameter serves two purposes: (1) to change the extension of the
backup file to be something other than the default F<.bak>, and (2) to
indicate that no backup file should be saved.

To change the default extension to something other than F<.bak> see
L<"Specifying File Extensions">.

A backup file of the source is always written, but you can request that
it be deleted at the end of processing if there were no errors.  This is
risky unless the source code is being maintained with a source code
control system.

To indicate that the backup should be deleted include one forward slash,
B</>, in the extension.
If any text remains after the slash is removed it will be used to define the backup file extension (which is always created and only deleted if there were no errors). Here are some examples: Parameter Extension Backup File Treatment <-bext=bak> F<.bak> Keep (same as the default behavior) <-bext='/'> F<.bak> Delete if no errors <-bext='/backup'> F<.backup> Delete if no errors <-bext='original/'> F<.original> Delete if no errors =item B<-bm=s>, B<--backup-method=s> This parameter should not normally be used but is available in the event that problems arise as a transition is made from an older implementation of the backup logic to a newer implementation. The newer implementation is the default and is specified with B<-bm='copy'>. The older implementation is specified with B<-bm='move'>. The difference is that the older implementation made the backup by moving the input file to the backup file, and the newer implementation makes the backup by copying the input file. The newer implementation preserves the file system B value. This may avoid problems with other software running simultaneously. This change was made as part of issue B at github. =item B<-w>, B<--warning-output> Setting B<-w> causes any non-critical warning messages to be reported as errors. These include messages about possible pod problems, possibly bad starting indentation level, and cautions about indirect object usage. The default, B<-nw> or B<--nowarning-output>, is not to include these warnings. =item B<-q>, B<--quiet> Deactivate error messages (for running under an editor). For example, if you use a vi-style editor, such as vim, you may execute perltidy as a filter from within the editor using something like :n1,n2!perltidy -q where C represents the selected text. Without the B<-q> flag, any error message may mess up your screen, so be prepared to use your "undo" key. =item B<-log>, B<--logfile> Save the F<.LOG> file, which has many useful diagnostics. 
Perltidy always creates a F<.LOG> file, but by default it is deleted
unless a program bug is suspected.  Setting the B<-log> flag forces the
log file to be saved.

=item B<-g=n>, B<--logfile-gap=n>

Set maximum interval between input code lines in the logfile.  The
purpose of this flag is to assist in debugging nesting errors.  The value
of C<n> is optional.  If you set the flag B<-g> without the value of
C<n>, it will be taken to be 1, meaning that every line will be written
to the log file.  This can be helpful if you are looking for a brace,
paren, or bracket nesting error.

Setting B<-g> also causes the logfile to be saved, so it is not necessary
to also include B<-log>.

If no B<-g> flag is given, a value of 50 will be used, meaning that at
least every 50th line will be recorded in the logfile.  This helps prevent
excessively long log files.

Setting a negative value of C<n> is the same as not setting B<-g> at all.

=item B<-npro> B<--noprofile>

Ignore any F<.perltidyrc> command file.  Normally, perltidy looks first in
your current directory for a F<.perltidyrc> file of parameters.  (The
format is described below.)  If it finds one, it applies those options to
the initial default values, and then it applies any that have been defined
on the command line.  If no F<.perltidyrc> file is found, it looks for one
in your home directory.

If you set the B<-npro> flag, perltidy will not look for this file.

=item B<-pro=filename> or B<--profile=filename>

To simplify testing and switching .perltidyrc files, this command may be
used to specify a configuration file which will override the default
name of .perltidyrc.  There must not be a space on either side of the
'=' sign.  For example, the line

   perltidy -pro=testcfg

would cause file F<testcfg> to be used instead of the
default F<.perltidyrc>.

A pathname which begins with three dots, e.g. ".../.perltidyrc",
indicates that the file should be searched for starting in the current
directory and working upwards.
This makes it easier to have multiple projects each with their own .perltidyrc in their root directories. =item B<-opt>, B<--show-options> Write a list of all options used to the F<.LOG> file. Please see B<--dump-options> for a simpler way to do this. =item B<-f>, B<--force-read-binary> Force perltidy to process binary files. To avoid producing excessive error messages, perltidy skips files identified by the system as non-text. However, valid perl scripts containing binary data may sometimes be identified as non-text, and this flag forces perltidy to process them. =item B<-ast>, B<--assert-tidy> This flag asserts that the input and output code streams are identical, or in other words that the input code is already 'tidy' according to the formatting parameters. If this is not the case, an error message noting this is produced. This error message will cause the process to return a non-zero exit code. The test for this is made by comparing an MD5 hash value for the input and output code streams. This flag has no other effect on the functioning of perltidy. This might be useful for certain code maintenance operations. Note: you will not see this message if you have error messages turned off with the -quiet flag. =item B<-asu>, B<--assert-untidy> This flag asserts that the input and output code streams are different, or in other words that the input code is 'untidy' according to the formatting parameters. If this is not the case, an error message noting this is produced. This flag has no other effect on the functioning of perltidy. =back =head1 FORMATTING OPTIONS =head2 Basic Options =over 4 =item B<--notidy> This flag disables all formatting and causes the input to be copied unchanged to the output except for possible changes in line ending characters and any pre- and post-filters. This can be useful in conjunction with a hierarchical set of F<.perltidyrc> files to avoid unwanted code tidying. 
See also L<"Skipping Selected Sections of Code"> for a way to avoid tidying specific sections of code. =item B<-i=n>, B<--indent-columns=n> Use n columns per indentation level (default n=4). =item B<-l=n>, B<--maximum-line-length=n> The default maximum line length is n=80 characters. Perltidy will try to find line break points to keep lines below this length. However, long quotes and side comments may cause lines to exceed this length. The default length of 80 comes from the past when this was the standard CRT screen width. Many programmers prefer to increase this to something like 120. Setting B<-l=0> is equivalent to setting B<-l=(a very large number)>. But this is not recommended because, for example, a very long list will be formatted in a single long line. =item B<-vmll>, B<--variable-maximum-line-length> A problem arises using a fixed maximum line length with very deeply nested code and data structures because eventually the amount of leading whitespace used for indicating indentation takes up most or all of the available line width, leaving little or no space for the actual code or data. One solution is to use a very long line length. Another solution is to use the B<-vmll> flag, which basically tells perltidy to ignore leading whitespace when measuring the line length. To be precise, when the B<-vmll> parameter is set, the maximum line length of a line of code will be M+L*I, where M is the value of --maximum-line-length=M (-l=M), default 80, I is the value of --indent-columns=I (-i=I), default 4, L is the indentation level of the line of code When this flag is set, the choice of breakpoints for a block of code should be essentially independent of its nesting depth. However, the absolute line lengths, including leading whitespace, can still be arbitrarily large. This problem can be avoided by including the next parameter. The default is not to do this (B<-nvmll>). 
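As a worked illustration of the B<-vmll> formula M+L*I given above, the
following minimal sketch evaluates the limit at a few indentation levels.
The helper name C<vmll_limit> is hypothetical and is not part of
Perl::Tidy; it only restates the arithmetic, assuming the default values
B<-l=80> and B<-i=4>.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; it is not part of Perl::Tidy.
# It evaluates the -vmll limit M + L*I described in the text above.
sub vmll_limit {
    my ( $max_line_length, $indent_columns, $level ) = @_;
    return $max_line_length + $level * $indent_columns;
}

# Limits at a few nesting levels, using the defaults -l=80 and -i=4.
for my $level ( 0, 1, 5, 10 ) {
    printf "level %2d: limit is %d columns\n", $level,
      vmll_limit( 80, 4, $level );
}
```

At level 10, for example, the limit is 80 + 10*4 = 120 columns, of which
40 columns are leading whitespace, so the width left for the code itself
stays at 80 columns.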
=item B<-wc=n>, B<--whitespace-cycle=n> This flag also addresses problems with very deeply nested code and data structures. When the nesting depth exceeds the value B the leading whitespace will be reduced and start at a depth of 1 again. The result is that blocks of code will shift back to the left rather than moving arbitrarily far to the right. This occurs cyclically to any depth. For example if one level of indentation equals 4 spaces (B<-i=4>, the default), and one uses B<-wc=15>, then if the leading whitespace on a line exceeds about 4*15=60 spaces it will be reduced back to 4*1=4 spaces and continue increasing from there. If the whitespace never exceeds this limit the formatting remains unchanged. The combination of B<-vmll> and B<-wc=n> provides a solution to the problem of displaying arbitrarily deep data structures and code in a finite window, although B<-wc=n> may of course be used without B<-vmll>. The default is not to use this, which can also be indicated using B<-wc=0>. =item B Using tab characters will almost certainly lead to future portability and maintenance problems, so the default and recommendation is not to use them. For those who prefer tabs, however, there are two different options. Except for possibly introducing tab indentation characters, as outlined below, perltidy does not introduce any tab characters into your file, and it removes any tabs from the code (unless requested not to do so with B<-fws>). If you have any tabs in your comments, quotes, or here-documents, they will remain. =over 4 =item B<-et=n>, B<--entab-leading-whitespace> This flag causes each B leading space characters produced by the formatting process to be replaced by one tab character. The formatting process itself works with space characters. The B<-et=n> parameter is applied as a last step, after formatting is complete, to convert leading spaces into tabs. 
Before starting to use tabs, it is essential to first get the indentation
controls set as desired without tabs, particularly the two parameters
B<--indent-columns=n> (or B<-i=n>) and B<--continuation-indentation=n>
(or B<-ci=n>).

The value of the integer B<n> can be any value but can be coordinated with
the number of spaces used for indentation.  For example, B<-et=4 -ci=4 -i=4>
will produce one tab for each indentation level and one for each
continuation indentation level.  You may want to coordinate the value of
B<n> with what your display software assumes for the spacing of a tab.

=item B<-t>, B<--tabs>

This flag causes one leading tab character to be inserted for each level
of indentation.  Certain other features are incompatible with this
option, and if these options are also given, then a warning message will
be issued and this flag will be unset.  One example is the B<-lp>
option.  This flag is retained for backwards compatibility, but if you use
tabs, the B<-et=n> flag is recommended.  If both B<-t> and B<-et=n> are
set, the B<-et=n> is used.

=item B<-dt=n>, B<--default-tabsize=n>

If the first line of code passed to perltidy contains leading tabs but no
tab scheme is specified for the output stream then perltidy must guess how
many spaces correspond to each leading tab.  This number of spaces B<n>
corresponding to each leading tab of the input stream may be specified
with B<-dt=n>.  The default is B<n=8>.

This flag has no effect if a tab scheme is specified for the output
stream, because then the input stream is assumed to use the same tab
scheme and indentation spaces as for the output stream (any other
assumption would lead to unstable editing).

=back

=item B<-io>, B<--indent-only>

This flag is used to deactivate all whitespace and line break changes
within non-blank lines of code.  When it is in effect, the only change to
the script will be to the indentation and to the number of blank lines,
and any flags controlling whitespace and newlines will be ignored.
You might want to use this if you are perfectly happy with your
whitespace and line breaks, and merely want perltidy to handle the
indentation.  (This also speeds up perltidy by well over a factor of two,
so it might be useful when perltidy is merely being used to help find a
brace error in a large script).

Setting this flag is equivalent to setting B<--freeze-newlines> and
B<--freeze-whitespace>.

If you also want to keep your existing blank lines exactly
as they are, you can add B<--freeze-blank-lines>.

With this option perltidy is still free to modify the indenting (and
outdenting) of code and comments as it normally would.  If you also want
to prevent long comment lines from being outdented, you can add either
B<-noll> or B<-l=0>.

Setting this flag will prevent perltidy from doing any special operations
on closing side comments.  You may still delete all side comments however
when this flag is in effect.

=item B<-enc=s>, B<--character-encoding=s>

This flag indicates whether the input data stream uses a character
encoding.  Perltidy does not look for the encoding directives in the
source stream, such as B<use utf8>, and instead relies on this flag to
determine the encoding.  (Note that perltidy often works on snippets of
code rather than complete files so it cannot rely on B<use utf8>
directives).

The possible values for B<s> are:

  -enc=none if no encoding is used, or
  -enc=utf8 for encoding in utf8
  -enc=guess if perltidy should guess between these two possibilities.

The value B<none> causes the stream to be processed without special
encoding assumptions.  This is appropriate for files which are written in
single-byte character encodings such as latin-1.

The value B<utf8> causes the stream to be read and written as
UTF-8.  If the input stream cannot be decoded with this encoding then
processing is not done.

The value B<guess> tells perltidy to guess between either utf8 encoding
or no encoding (meaning one character per byte).
The B option uses the Encode::Guess module which has been found to be reliable at detecting if a file is encoded in utf8 or not. The current default is B. The abbreviations B<-utf8> or B<-UTF8> are equivalent to B<-enc=utf8>, and the abbreviation B<-guess> is equivalent to B<-enc=guess>. So to process a file named B which is encoded in UTF-8 you can use: perltidy -utf8 file.pl or perltidy -guess file.pl or simply perltidy file.pl since B<-guess> is the default. To process files with an encoding other than UTF-8, it would be necessary to write a short program which calls the Perl::Tidy module with some pre- and post-processing to handle decoding and encoding. =item B<-eos=s>, B<--encode-output-strings=s> This flag was added to resolve an issue involving the interface between Perl::Tidy and calling programs, and in particular B. If you only run the B binary this flag has no effect. If you run a program which calls the Perl::Tidy module and receives a string in return, then the meaning of the flag is as follows: =over 4 =item * The setting B<-eos> means Perl::Tidy should encode any string which it decodes. This is the default because it makes perltidy behave well as a filter, and is the correct setting for most programs. =item * The setting B<-neos> means that a string should remain decoded if it was decoded by Perl::Tidy. This is only appropriate if the calling program will handle any needed encoding before outputting the string. =back The default was changed from B<-neos> to B<-eos> in versions after 20220217. If this change causes a program to start running incorrectly on encoded files, an emergency fix might be to set B<-neos>. Additional information can be found in the man pages for the B module and also in L. =item B<-gcs>, B<--use-unicode-gcstring> This flag controls whether or not perltidy may use module Unicode::GCString to obtain accurate display widths of wide characters. The default is B<--nouse-unicode-gcstring>. 
If this flag is set, and text is encoded, perltidy will look for the
module Unicode::GCString and, if found, will use it to obtain character
display widths.  This can improve displayed vertical alignment for files
with wide characters.  It is a nice feature but it is off by default to
avoid conflicting formatting when there are multiple developers.
Perltidy installation does not require Unicode::GCString, so users
wanting to use this feature need to set this flag and also install
Unicode::GCString separately.

If this flag is set and perltidy does not find module Unicode::GCString,
a warning message will be produced and processing will continue but
without the potential benefit provided by the module.

Also note that actual vertical alignment depends upon the fonts used by
the text display software, so vertical alignment may not be optimal even
when Unicode::GCString is used.

=item B<-ole=s>, B<--output-line-ending=s>

where s=C<win>, C<dos>, C<unix>, or C<mac>.  This flag tells perltidy
to output line endings for a specific system.  Normally,
perltidy writes files with the line separator character of the host
system.  The C<win> and C<dos> flags have an identical result.

=item B<-ple>, B<--preserve-line-endings>

This flag tells perltidy to write its output files with the same line
endings as the input file, if possible.  It should work for
B<dos>, B<unix>, and B<mac> line endings.  It will only work if perltidy
input comes from a filename (rather than stdin, for example).  If
perltidy has trouble determining the input file line ending, it will
revert to the default behavior of using the line ending of the host
system.

=item B<-atnl>, B<--add-terminal-newline>

This flag, which is enabled by default, allows perltidy to terminate the
last line of the output stream with a newline character, regardless of
whether or not the input stream was terminated with a newline character.
If this flag is negated, with B<-natnl>, then perltidy will add a terminal
newline to the output stream only if the input stream is terminated
with a newline.

Negating this flag may be useful for manipulating one-line scripts
intended for use on a command line.

=item B<-it=n>, B<--iterations=n>

This flag causes perltidy to do B<n> complete iterations.  The reason for
this flag is that code beautification is an iterative process and in some
cases the output from perltidy can be different if it is applied a second
time.  For most purposes the default of B<n=1> should be satisfactory.
However B<n=2> can be useful when a major style change is being made, or
when code is being beautified on check-in to a source code control
system.  It has been found to be extremely rare for the output to change
after 2 iterations.  If a value of B<n> greater than 2 is input then a
convergence test will be used to stop the iterations as soon as possible,
almost always after 2 iterations.  See the next item for a simplified
iteration control.

This flag has no effect when perltidy is used to generate html.

=item B<-conv>, B<--converge>

This flag is equivalent to B<-it=4> and is included to simplify iteration
control.  For all practical purposes one either does or does not want to
be sure that the output is converged, and there is no penalty to using a
large iteration limit since perltidy will check for convergence and stop
iterating as soon as possible.  The default is B<-nconv> (no convergence
check).  Using B<-conv> will approximately double run time since
typically one extra iteration is required to verify convergence.  No
extra iterations are required if no new line breaks are made, and two
extra iterations are occasionally needed when reformatting complex code
structures, such as deeply nested ternary statements.

=back

=head2 Code Indentation Control

=over 4

=item B<-ci=n>, B<--continuation-indentation=n>

Continuation indentation is extra indentation spaces applied when
a long line is broken.
The default is n=2, illustrated here: my $level = # -ci=2 ( $max_index_to_go >= 0 ) ? $levels_to_go[0] : $last_output_level; The same example, with n=0, is a little harder to read: my $level = # -ci=0 ( $max_index_to_go >= 0 ) ? $levels_to_go[0] : $last_output_level; The value given to B<-ci> is also used by some commands when a small space is required. Examples are commands for outdenting labels, B<-ola>, and control keywords, B<-okw>. When default values are not used, it is recommended that either (1) the value B given with B<-ci=n> be no more than about one-half of the number of spaces assigned to a full indentation level on the B<-i=n> command, or (2) the flag B<-extended-continuation-indentation> is used (see next section). =item B<-xci>, B<--extended-continuation-indentation> This flag allows perltidy to use some improvements which have been made to its indentation model. One of the things it does is "extend" continuation indentation deeper into structures, hence the name. The improved indentation is particularly noticeable when the flags B<-ci=n> and B<-i=n> use the same value of B. There are no significant disadvantages to using this flag, but to avoid disturbing existing formatting the default is not to use it, B<-nxci>. Please see the section L<"B<-pbp>, B<--perl-best-practices>"> for an example of how this flag can improve the formatting of ternary statements. It can also improve indentation of some multi-line qw lists as shown below. 
# perltidy foreach $color ( qw( AntiqueWhite3 Bisque1 Bisque2 Bisque3 Bisque4 SlateBlue3 RoyalBlue1 SteelBlue2 DeepSkyBlue3 ), qw( LightBlue1 DarkSlateGray1 Aquamarine2 DarkSeaGreen2 SeaGreen1 Yellow1 IndianRed1 IndianRed2 Tan1 Tan4 ) ) # perltidy -xci foreach $color ( qw( AntiqueWhite3 Bisque1 Bisque2 Bisque3 Bisque4 SlateBlue3 RoyalBlue1 SteelBlue2 DeepSkyBlue3 ), qw( LightBlue1 DarkSlateGray1 Aquamarine2 DarkSeaGreen2 SeaGreen1 Yellow1 IndianRed1 IndianRed2 Tan1 Tan4 ) ) =item B<-sil=n> B<--starting-indentation-level=n> By default, perltidy examines the input file and tries to determine the starting indentation level. While it is often zero, it may not be zero for a code snippet being sent from an editing session. To guess the starting indentation level perltidy simply assumes that indentation scheme used to create the code snippet is the same as is being used for the current perltidy process. This is the only sensible guess that can be made. It should be correct if this is true, but otherwise it probably won't. For example, if the input script was written with -i=2 and the current perltidy flags have -i=4, the wrong initial indentation will be guessed for a code snippet which has non-zero initial indentation. Likewise, if an entabbing scheme is used in the input script and not in the current process then the guessed indentation will be wrong. If the default method does not work correctly, or you want to change the starting level, use B<-sil=n>, to force the starting level to be n. =item B using B<--line-up-parentheses>, B<-lp> or B<--extended--line-up-parentheses> , B<-xlp> These flags provide an alternative indentation method for list data. The original flag for this is B<-lp>, but it has some limitations (explained below) which are avoided with the newer B<-xlp> flag. So B<-xlp> is probably the better choice for new work, but the B<-lp> flag is retained to minimize changes to existing formatting. 
If you enter both B<-lp> and B<-xlp>, then B<-xlp> will be used.

In the default indentation method perltidy indents lists with 4 spaces, or
whatever value is specified with B<-i=n>.  Here is a small list formatted in
this way:

    # perltidy (default)
    @month_of_year = (
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
    );

The B<-lp> or B<-xlp> flags add extra indentation to cause the data to begin
past the opening parentheses of a sub call or list, or opening square bracket
of an anonymous array, or opening curly brace of an anonymous hash.  With this
option, the above list would become:

    # perltidy -lp or -xlp
    @month_of_year = (
                       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
    );

If the available line length (see B<-l=n> ) does not permit this much
space, perltidy will use less.  For alternate placement of the
closing paren, see the next section.

These flags have no effect on code BLOCKS, such as if/then/else blocks,
which always use whatever is specified with B<-i=n>.

Some limitations on these flags are:

=over 4

=item *

A limitation on B<-lp>, but not B<-xlp>, occurs in situations where perltidy
does not have complete freedom to choose line breaks.  Then it may temporarily
revert to its default indentation method.  This can occur for example if there
are blank lines, block comments, multi-line quotes, or side comments between
the opening and closing parens, braces, or brackets.  It will also occur if a
multi-line anonymous sub occurs within a container since that will impose
specific line breaks (such as line breaks after statements).

=item *

For both the B<-lp> and B<-xlp> flags, any parameter which significantly
restricts the ability of perltidy to choose newlines will conflict with these
flags and will cause them to be deactivated.  These include B<-io>, B<-fnl>,
B<-nanl>, and B<-ndnl>.

=item *

The B<-lp> and B<-xlp> options may not be used together with the B<-t> tabs option.
They may, however, be used with the B<-et=n> tab method.

=back

There are some potential disadvantages of this indentation method compared to
the default method that should be noted:

=over 4

=item *

The available line length can quickly be used up if variable names are long.
This can cause deeply nested code to quickly reach the line length limit, and
become badly formatted, much sooner than would occur with the default
indentation method.

=item *

Since the indentation depends on the lengths of variable names, small changes
in variable names can cause changes in indentation over many lines in a file.
This means that minor name changes can produce significant file differences.
This can be annoying and does not occur with the default indentation method.

=back

Some things that can be done to minimize these problems are:

=over 4

=item *

Increase B<--maximum-line-length=n> above the default B<n=80> characters if
necessary.

=item *

If you use B<-xlp> then long side comments can limit the indentation over
multiple lines.  Consider adding the flag B<--ignore-side-comment-lengths> to
prevent this, or minimizing the use of side comments.

=item *

Apply this style in a limited way.  By default, it applies to all list
containers (not just lists in parentheses).  The next section describes how to
limit this style to, for example, just function calls.  The default indentation
method will be applied elsewhere.

=back

=item B<-lpil=s>, B<--line-up-parentheses-inclusion-list> and B<-lpxl=s>, B<--line-up-parentheses-exclusion-list>

The following discussion is written for B<-lp> but applies equally to the newer
B<-xlp> version.  By default, the B<-lp> flag applies to as many containers as
possible.  The set of containers to which the B<-lp> style applies can be
reduced by either one of these two flags:

Use B<-lpil=s> to specify the containers to which B<-lp> applies, or

use B<-lpxl=s> to specify the containers to which B<-lp> does NOT apply.

Only one of these two flags may be used.
Both flags can achieve the same result, but the B<-lpil=s> flag is much easier
to describe and use and is recommended.  The B<-lpxl=s> flag was the original
implementation and is only retained for backwards compatibility.

This list B<s> for these parameters is a string with space-separated items.
Each item consists of up to three pieces of information in this order: (1) an
optional letter code (2) a required container type, and (3) an optional numeric
code.

The only required piece of information is a container type, which is one of
'(', '[', or '{'.  For example the string

  -lpil='('

means use -lp formatting only on lists within parentheses, not lists in
square-brackets or braces.  The same thing could alternatively be specified
with

  -lpxl = '[ {'

which says to exclude lists within square-brackets and braces.  So what remains
is lists within parentheses.

A second optional item of information which can be given for parentheses is an
alphanumeric letter which is used to limit the selection further depending on
the type of token immediately before the paren.  The possible letters are
currently 'k', 'K', 'f', 'F', 'w', and 'W', with these meanings for matching
whatever precedes an opening paren:

 'k' matches if the previous nonblank token is a perl built-in keyword (such as 'if', 'while'),
 'K' matches if 'k' does not, meaning that the previous token is not a keyword.
 'f' matches if the previous token is a function other than a keyword.
 'F' matches if 'f' does not.
 'w' matches if either 'k' or 'f' match.
 'W' matches if 'w' does not.

For example:

  -lpil = 'f('

means only apply -lp to function calls, and

  -lpil = 'w('

means only apply -lp to parenthesized lists which follow a function or a
keyword.

This last example could alternatively be written using the B<-lpxl=s> flag as

  -lpxl = '[ { W('

which says exclude B<-lp> for lists within square-brackets, braces, and parens
NOT preceded by a keyword or function.  Clearly, the B<-lpil=s> method is
easier to understand.

An optional numeric code may follow any of the container types to further
refine the selection based on container contents.  The numeric codes are:

  '0' or blank: no check on contents is made
  '1' exclude B<-lp> unless the contents is a simple list without sublists
  '2' exclude B<-lp> unless the contents is a simple list without sublists,
      without code blocks, and without ternary operators

For example,

  -lpil = 'f(2'

means only apply -lp to function call lists which do not contain any sublists,
code blocks or ternary expressions.

=item B<-cti=n>, B<--closing-token-indentation>

The B<-cti=n> flag controls the indentation of a line beginning with a C<)>,
C<]>, or a non-block C<}>.  Such a line receives:

 -cti = 0 no extra indentation (default)
 -cti = 1 extra indentation such that the closing token
        aligns with its opening token.
 -cti = 2 one extra indentation level if the line looks like:
        );  or  ];  or  };
 -cti = 3 one extra indentation level always

The flags B<-cti=1> and B<-cti=2> work well with the B<-lp> flag (previous
section).

    # perltidy -lp -cti=1
    @month_of_year = (
                       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
                     );

    # perltidy -lp -cti=2
    @month_of_year = (
                       'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
                       );

These flags are merely hints to the formatter and they may not always be
followed.  In particular, if -lp is not being used, the indentation for
B<-cti=1> is constrained to be no more than one indentation level.

If desired, this control can be applied independently to each of the
closing container token types.  In fact, B<-cti=n> is merely an
abbreviation for B<-cpi=n -csbi=n -cbi=n>, where:
B<-cpi> or B<--closing-paren-indentation> controls B<)>'s,
B<-csbi> or B<--closing-square-bracket-indentation> controls B<]>'s,
B<-cbi> or B<--closing-brace-indentation> controls non-block B<}>'s.

=item B<-icp>, B<--indent-closing-paren>

The B<-icp> flag is equivalent to B<-cti=2>, described in the previous section.
The B<-nicp> flag is equivalent to B<-cti=0>.  They are included for backwards
compatibility.

=item B<-icb>, B<--indent-closing-brace>

The B<-icb> option gives one extra level of indentation to a brace which
terminates a code block.  For example,

        if ($task) {
            yyy();
            }    # -icb
        else {
            zzz();
            }

The default is not to do this, indicated by B<-nicb>.

=item B<-nib>, B<--non-indenting-braces>

Normally, lines of code contained within a pair of block braces receive one
additional level of indentation.  This flag, which is enabled by default,
causes perltidy to look for opening block braces which are followed by a
special side comment.  This special side comment is B<#<<<> by default.  If
found, the code between this opening brace and its corresponding closing brace
will not be given the normal extra indentation level.  For example:

        { #<<<   a closure to contain lexical vars

        my $var;  # this line does not get one level of indentation
        ...

        }

        # this line does not 'see' $var;

This can be useful, for example, when combining code from different files.
Different sections of code can be placed within braces to keep their lexical
variables from being visible to the end of the file.  To keep the new braces
from causing all of their contained code to be indented if you run perltidy,
and possibly introducing new line breaks in long lines, you can mark the
opening braces with this special side comment.

Only the opening brace needs to be marked, since perltidy knows where the
closing brace is.  Braces contained within marked braces may also be marked
as non-indenting.

If your code happens to have some opening braces followed by '#<<<', and you
don't want this behavior, you can use B<-nnib> to deactivate it.  To make it
easy to remember, the default string is the same as the string for starting a
B<format-skipping> section.  There is no confusion because in that case it is
for a block comment rather than a side-comment.

The special side comment can be changed with the next parameter.
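
As a runnable illustration of the point above, the following sketch marks a
bare block with the default '#<<<' side comment.  The names C<$next_id> and
C<$count> are invented for this example.  Perl itself treats the marker as an
ordinary comment, so the braces still limit the scope of the lexical variable;
only the indentation that perltidy would give to the contained lines is
affected.

    use strict;
    use warnings;

    my $next_id;

    { #<<< closure: $count is private to this block

    my $count = 0;
    $next_id = sub { return ++$count };

    }

    # $count is out of scope here, but the closure still updates it
    print $next_id->(), "\n";
    print $next_id->(), "\n";

Without the marker, running perltidy on this file would be expected to indent
the two statements inside the braces by one level.
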

=item B<-nibp=s>, B<--non-indenting-brace-prefix=s>

The B<-nibp=string> parameter may be used to change the marker for
non-indenting braces.  The default is equivalent to -nibp='#<<<'.  The string
that you enter must begin with a # and should be in quotes as necessary to get
past the command shell of your system.  This string is the leading text of a
regex pattern that is constructed by prepending a '^' and appending a '\s', so
you must also include backslashes for characters to be taken literally rather
than as patterns.

For example, to match the side comment '#++', the parameter would be

  -nibp='#\+\+'

=item B<-olq>, B<--outdent-long-quotes>

When B<-olq> is set, lines which are a quoted string longer than the
value B<maximum-line-length> will have their indentation removed to make
them more readable.  This is the default.  To prevent such out-denting,
use B<-nolq> or B<--nooutdent-long-quotes>.

=item B<-oll>, B<--outdent-long-lines>

This command is equivalent to B<--outdent-long-quotes> and
B<--outdent-long-comments>, and it is included for compatibility with previous
versions of perltidy.  The negation of this also works, B<-noll> or
B<--nooutdent-long-lines>, and is equivalent to setting B<-nolq> and B<-nolc>.

=item B<Outdenting Labels:> B<-ola>, B<--outdent-labels>

This command will cause labels to be outdented by 2 spaces (or whatever B<-ci>
has been set to), if possible.  This is the default.  For example:

        my $i;
      LOOP: while ( $i = <FOO> ) {
            chomp($i);
            next unless $i;
            fixit($i);
        }

Use B<-nola> to not outdent labels.  To control line breaks after labels see
L<"-bal=n, --break-after-labels=n">.

=item B<Outdenting Keywords>

=over 4

=item B<-okw>, B<--outdent-keywords>

The command B<-okw> will cause certain leading control keywords to
be outdented by 2 spaces (or whatever B<-ci> has been set to), if
possible.  By default, these keywords are C<redo>, C<next>, C<last>,
C<goto>, and C<return>.  The intention is to make these control keywords
easier to see.  To change this list of keywords being outdented, see
the next section.

For example, using C<perltidy -okw> on the previous example gives:

        my $i;
      LOOP: while ( $i = <FOO> ) {
            chomp($i);
          next unless $i;
            fixit($i);
        }

The default is not to do this.

=item B<Outdenting Control Keywords> B<-okwl=string>, B<--outdent-keyword-list=string>

This command can be used to change the keywords which are outdented with
the B<-okw> command.  The parameter B<string> is a required list of perl
keywords, which should be placed in quotes if there are more than one.
By itself, it does not cause any outdenting to occur, so the B<-okw>
command is still required.

For example, the commands C<-okwl="next last redo goto" -okw> will cause
those four keywords to be outdented.  It is probably simplest to place
any B<-okwl> command in a F<.perltidyrc> file.

=back

=back

=head2 Whitespace Control

Whitespace refers to the blank space between variables, operators,
and other code tokens.

=over 4

=item B<-fws>, B<--freeze-whitespace>

This flag causes your original whitespace to remain unchanged, and
causes the rest of the whitespace commands in this section, the
Code Indentation section, and
the Comment Control section to be ignored.

=item B<Tightness of curly braces, parentheses, and square brackets>

Here the term "tightness" will mean the closeness with which
pairs of enclosing tokens, such as parentheses, contain the quantities
within.  A numerical value of 0, 1, or 2 defines the tightness, with
0 being least tight and 2 being most tight.  Spaces within containers
are always symmetric, so if there is a space after a C<(> then there
will be a space before the corresponding C<)>.

The B<-pt=n> or B<--paren-tightness=n> parameter controls the space within
parens.  The example below shows the effect of the three possible
values, 0, 1, and 2:

 if ( ( my $len_tab = length( $tabstr ) ) > 0 ) {  # -pt=0
 if ( ( my $len_tab = length($tabstr) ) > 0 ) {    # -pt=1 (default)
 if ((my $len_tab = length($tabstr)) > 0) {        # -pt=2

When n is 0, there is always a space to the right of a '(' and to the left
of a ')'.  For n=2 there is never a space.
For n=1, the default, there
is a space unless the quantity within the parens is a single token, such
as an identifier or quoted string.

Likewise, the parameter B<-sbt=n> or B<--square-bracket-tightness=n>
controls the space within square brackets, as illustrated below.

 $width = $col[ $j + $k ] - $col[ $j ];  # -sbt=0
 $width = $col[ $j + $k ] - $col[$j];    # -sbt=1 (default)
 $width = $col[$j + $k] - $col[$j];      # -sbt=2

Curly braces which do not contain code blocks are controlled by
the parameter B<-bt=n> or B<--brace-tightness=n>.

 $obj->{ $parsed_sql->{ 'table' }[0] };    # -bt=0
 $obj->{ $parsed_sql->{'table'}[0] };      # -bt=1 (default)
 $obj->{$parsed_sql->{'table'}[0]};        # -bt=2

And finally, curly braces which contain blocks of code are controlled by the
parameter B<-bbt=n> or B<--block-brace-tightness=n> as illustrated in the
example below.

 %bf = map { $_ => -M $_ } grep { /\.deb$/ } dirents '.'; # -bbt=0 (default)
 %bf = map { $_ => -M $_ } grep {/\.deb$/} dirents '.';   # -bbt=1
 %bf = map {$_ => -M $_} grep {/\.deb$/} dirents '.';     # -bbt=2

To simplify input in the case that all of the tightness flags have the same
value B<n>, the parameter B<-act=n> or B<--all-containers-tightness=n> is an
abbreviation for the combination B<-pt=n -sbt=n -bt=n -bbt=n>.

=item B<-tso>, B<--tight-secret-operators>

The flag B<-tso> causes certain perl token sequences (secret operators)
which might be considered to be a single operator to be formatted "tightly"
(without spaces).  The operators currently modified by this flag are:

     0+  +0  ()x!! ~~<>  ,=>   =( )=

For example the sequence B<0 +>, which converts a string to a number,
would be formatted without a space: B<0+> when the B<-tso> flag is set.  This
flag is off by default.

=item B<-sts>, B<--space-terminal-semicolon>

Some programmers prefer a space before all terminal semicolons.  The
default is for no such space, and is indicated with B<-nsts> or
B<--nospace-terminal-semicolon>.

        $i = 1 ;     #  -sts
        $i = 1;      #  -nsts   (default)

=item B<-sfs>, B<--space-for-semicolon>

Semicolons within B<for> loops may sometimes be hard to see,
particularly when commas are also present.  This option places spaces on
both sides of these special semicolons, and is the default.  Use
B<-nsfs> or B<--nospace-for-semicolon> to deactivate it.

 for ( @a = @$ap, $u = shift @a ; @a ; $u = $v ) {  # -sfs (default)
 for ( @a = @$ap, $u = shift @a; @a; $u = $v ) {    # -nsfs

=item B<-asc>, B<--add-semicolons>

Setting B<-asc> allows perltidy to add any missing optional semicolon at the
end of a line which is followed by a closing curly brace on the next line.
This is the default, and may be deactivated with B<-nasc> or
B<--noadd-semicolons>.

=item B<-dsm>, B<--delete-semicolons>

Setting B<-dsm> allows perltidy to delete extra semicolons which are
simply empty statements.  This is the default, and may be deactivated
with B<-ndsm> or B<--nodelete-semicolons>.  (Such semicolons are not
deleted, however, if they would promote a side comment to a block
comment).

=item B<-aws>, B<--add-whitespace>

Setting this option allows perltidy to add certain whitespace to improve
code readability.  This is the default.  If you do not want any
whitespace added, but are willing to have some whitespace deleted, use
B<-naws>.  (Use B<-fws> to leave whitespace completely unchanged).

=item B<-dws>, B<--delete-old-whitespace>

Setting this option allows perltidy to remove some old whitespace
between characters, if necessary.  This is the default.  If you
do not want any old whitespace removed, use B<-ndws> or
B<--nodelete-old-whitespace>.

=item B<Detailed whitespace controls around tokens>

For those who want more detailed control over the whitespace around
tokens, there are four parameters which can directly modify the default
whitespace rules built into perltidy for any token.  They are:

B<-wls=s> or B<--want-left-space=s>,

B<-nwls=s> or B<--nowant-left-space=s>,

B<-wrs=s> or B<--want-right-space=s>,

B<-nwrs=s> or B<--nowant-right-space=s>.

These parameters are each followed by a quoted string, B<s>, containing a
list of token types.  No more than one of each of these parameters
should be specified, because repeating a command-line parameter
always overwrites the previous one before perltidy ever sees it.

To illustrate how these are used, suppose it is desired that there be no
space on either side of the token types B<= + - / *>.  The following two
parameters would specify this desire:

  -nwls="= + - / *"    -nwrs="= + - / *"

(Note that the token types are in quotes, and that they are separated by
spaces).  With these modified whitespace rules, the following line of math:

  $root = -$b + sqrt( $b * $b - 4. * $a * $c ) / ( 2. * $a );

becomes this:

  $root=-$b+sqrt( $b*$b-4.*$a*$c )/( 2.*$a );

These parameters should be considered to be hints to perltidy rather
than fixed rules, because perltidy must try to resolve conflicts that
arise between them and all of the other rules that it uses.  One
conflict that can arise is if, between two tokens, the left token wants
a space and the right one doesn't.  In this case, the token not wanting
a space takes priority.

It is necessary to have a list of all token types in order to create
this type of input.  Such a list can be obtained by the command
B<--dump-token-types>.  Also try the B<-D> flag on a short snippet of code
and look at the .DEBUG file to see the tokenization.

B<WARNING> Be sure to put these tokens in quotes to avoid having them
misinterpreted by your command shell.

=item B<Note1: Perltidy does not always follow whitespace controls>

The various parameters controlling whitespace within a program are requests
which perltidy follows as well as possible, but there are a number of
situations where changing whitespace could change program behavior and is not
done.  Some of these are obvious; for example, we should not remove the space
between the two plus symbols in '$x+ +$y' to avoid creating a '++' operator.
Some are more subtle and involve the whitespace around bareword symbols and
locations of possible filehandles.
For example, consider the problem of
formatting the following subroutine:

   sub print_div {
      my ($x,$y)=@_;
      print $x/$y;
   }

Suppose the user requests that / signs have a space to the left but not to the
right.  Perltidy will refuse to do this, but if this were done the result would
be

   sub print_div {
       my ($x,$y)=@_;
       print $x /$y;
   }

If formatted in this way, the program will not run (at least with recent
versions of perl) because the $x is taken to be a filehandle and / is assumed
to start a quote.  In a complex program, there might happen to be a / which
terminates the multiline quote without a syntax error, allowing the program to
run, but not as intended.

Related issues arise with other binary operator symbols, such as + and -, and
in older versions of perl there could be problems with ternary operators.  So
to avoid changing program behavior, perltidy has the simple rule that
whitespace around possible filehandles is left unchanged.  Likewise, whitespace
around barewords is left unchanged.  The reason is that if the barewords are
defined in other modules, or in code that has not even been written yet,
perltidy will not have seen their prototypes and must treat them cautiously.

In perltidy this is implemented in the tokenizer by marking the token following
a B<print> keyword as a special type B<Z>.  When formatting is being done,
whitespace following this token type is generally left unchanged as a
precaution against changing program behavior.  This is excessively conservative
but simple and easy to implement.  Keywords which are treated similarly to
B<print> include B<printf>, B<sort>, B<exec>, B<system>.  Changes in spacing
around parameters following these keywords may have to be made manually.  For
example, the space, or lack of space, after the parameter $foo in the following
line will be unchanged in formatting.

   system($foo );
   system($foo);

To find if a token is of type B<Z> you can use B<perltidy -DEBUG>.  For the
first line above the result is

   1: system($foo );
   1: kkkkkk{ZZZZb};

which shows that B<system> is type B<k> (keyword) and $foo is type B<Z>.
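
The filehandle ambiguity described above can be checked directly with a string
C<eval>.  This is a minimal sketch: the helper name C<compiles> is made up for
the example, and the result for the second statement depends on the perl
version, as noted above.

    use strict;
    use warnings;

    # Return 1 if the statement compiles inside an anonymous sub, else 0.
    # The sub is only compiled, never called, so nothing is printed by it.
    sub compiles {
        my ($code) = @_;
        local $@;
        my $ok = eval "no strict; no warnings; sub { $code }; 1";
        return $ok ? 1 : 0;
    }

    for my $stmt ( 'print $x/$y;', 'print $x /$y;' ) {
        printf "%-16s %s\n", $stmt,
          compiles($stmt) ? 'compiles' : 'does not compile';
    }

On a recent perl the second statement is expected to be rejected, because
C<$x> is taken to be a filehandle and the slash starts an unterminated pattern.
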

=item B<Note2: Perltidy's whitespace rules are not perfect>

Despite these precautions, it is still possible to introduce syntax errors with
some asymmetric whitespace rules, particularly when call parameters are not
placed in containing parens or braces.  For example, the following two lines
will be parsed by perl without a syntax error:

  # original programming, syntax ok
  my @newkeys = map $_-$nrecs+@data, @oldkeys;

  # perltidy default, syntax ok
  my @newkeys = map $_ - $nrecs + @data, @oldkeys;

But the following will give a syntax error:

  # perltidy -nwrs='-'
  my @newkeys = map $_ -$nrecs + @data, @oldkeys;

For another example, the following two lines will be parsed without syntax
error:

  # original programming, syntax ok
  for my $severity ( reverse $SEVERITY_LOWEST+1 .. $SEVERITY_HIGHEST ) { ... }

  # perltidy default, syntax ok
  for my $severity ( reverse $SEVERITY_LOWEST + 1 .. $SEVERITY_HIGHEST ) { ... }

But the following will give a syntax error:

  # perltidy -nwrs='+', syntax error:
  for my $severity ( reverse $SEVERITY_LOWEST +1 .. $SEVERITY_HIGHEST ) { ... }

To avoid subtle parsing problems like this, it is best to avoid spacing a
binary operator asymmetrically with a space on the left but not on the right.

=item B<Space between specific keywords and opening paren>

When an opening paren follows a Perl keyword, no space is introduced after the
keyword, unless it is (by default) one of these:

   my local our and or xor eq ne if else elsif until unless
   while for foreach return switch case given when

These defaults can be modified with two commands:

B<-sak=s> or B<--space-after-keyword=s> adds keywords.

B<-nsak=s> or B<--nospace-after-keyword=s> removes keywords.

where B<s> is a list of keywords (in quotes if necessary).  For example,

  my ( $a, $b, $c ) = @_;    # default
  my( $a, $b, $c ) = @_;     # -nsak="my local our"

The abbreviation B<-nsak='*'> is equivalent to including all of the
keywords in the above list.

When both B<-nsak=s> and B<-sak=s> commands are included, the B<-nsak=s>
command is executed first.
For example, to have space after only the
keywords (my, local, our) you could use B<-nsak="*" -sak="my local our">.

To put a space after all keywords, see the next item.

=item B<Space between all keywords and opening parens>

When an opening paren follows a function or keyword, no space is introduced
after the keyword except for the keywords noted in the previous item.  To
always put a space between a function or keyword and its opening paren,
use the command:

B<-skp> or B<--space-keyword-paren>

You may also want to use the flag B<-sfp> (next item).

=item B<Space between all function names and opening parens>

When an opening paren follows a function the default and recommended formatting
is not to introduce a space.  To cause a space to be introduced use:

B<-sfp> or B<--space-function-paren>

  myfunc( $a, $b, $c );    # default
  myfunc ( $a, $b, $c );   # -sfp

You will probably also want to use the flag B<-skp> (previous item).

The parameter is not recommended because spacing a function paren can make a
program vulnerable to parsing problems by Perl.  For example, the following
two-line program will run as written but will have a syntax error if
reformatted with -sfp:

  if ( -e filename() ) { print "I'm here\n"; }
  sub filename { return $0 }

In this particular case the syntax error can be removed if the line order is
reversed, so that Perl parses 'sub filename' first.

=item B<-fpva> or B<--function-paren-vertical-alignment>

A side-effect of using the B<-sfp> flag is that the parens may become
vertically aligned.  For example,

    # perltidy -sfp
    myfun     ( $aaa, $b, $cc );
    mylongfun ( $a, $b, $c );

This is the default behavior.  To prevent this alignment use B<-nfpva>:

    # perltidy -sfp -nfpva
    myfun ( $aaa, $b, $cc );
    mylongfun ( $a, $b, $c );

=item B<-spp=n> or B<--space-prototype-paren=n>

This flag can be used to control whether a function prototype is preceded by a
space.  For example, the following prototype does not have a space.

      sub usage();

This integer B<n> may have the value 0, 1, or 2 as follows:

    -spp=0 means no space before the paren
    -spp=1 means follow the example of the source code [DEFAULT]
    -spp=2 means always put a space before the paren

The default is B<-spp=1>, meaning that a space will be used if and only if
there is one in the source code.  Given the above line of code, the result of
applying the different options would be:

        sub usage();    # n=0 [no space]
        sub usage();    # n=1 [default; follows input]
        sub usage ();   # n=2 [space]

=item B<-kpit=n> or B<--keyword-paren-inner-tightness=n>

The space inside of an opening paren, which itself follows a certain keyword,
can be controlled by this parameter.  The space on the inside of the
corresponding closing paren will be treated in the same (balanced) manner.
This parameter has precedence over any other paren spacing rules.  The values
of B<n> are as follows:

   -kpit=0 means always put a space (not tight)
   -kpit=1 means ignore this parameter [default]
   -kpit=2 means never put a space (tight)

To illustrate, the following snippet is shown formatted in three ways:

    if ( seek( DATA, 0, 0 ) ) { ... }    # perltidy (default)
    if (seek(DATA, 0, 0)) { ... }        # perltidy -pt=2
    if ( seek(DATA, 0, 0) ) { ... }      # perltidy -pt=2 -kpit=0

In the second case the -pt=2 parameter makes all of the parens tight.  In the
third case the -kpit=0 flag puts a space inside the 'if' parens, since 'if' is
one of the keywords to which the -kpit flag applies by default.  The remaining
parens are still tight because of the -pt=2 parameter.

By default, this parameter applies to these keywords:

   if elsif unless while until for foreach

These can be changed with the parameter B<-kpitl=s> described in the next
section.

=item B<-kpitl=string> or B<--keyword-paren-inner-tightness-list=string>

This command can be used to change the keywords to which the B<-kpit=n>
command applies.
The parameter B<string> is a required list of either keywords or
functions, which should be placed in quotes if there are more than one.  By
itself, this parameter does not cause any change in spacing, so the B<-kpit=n>
command is still required.

For example, the commands C<-kpitl="if else while" -kpit=2> will cause just
the spaces inside parens following 'if', 'else', and 'while' keywords to follow
the tightness value indicated by the B<-kpit=2> flag.

=item B<-lop> or B<--logical-padding>

In the following example some extra space has been inserted on the second
line between the two open parens.  This extra space is called "logical padding"
and is intended to help align similar things vertically in some logical
or ternary expressions.

    # perltidy [default formatting]
    $same =
      (      ( $aP eq $bP )
          && ( $aS eq $bS )
          && ( $aT eq $bT )
          && ( $a->{'title'} eq $b->{'title'} )
          && ( $a->{'href'} eq $b->{'href'} ) );

Note that this is considered to be a different operation from "vertical
alignment" because space at just one line is being adjusted, whereas in
"vertical alignment" the spaces at all lines are being adjusted.  So it is sort
of a local version of vertical alignment.

Here is an example involving a ternary operator:

    # perltidy [default formatting]
    $bits =
        $top > 0xffff ? 32
      : $top > 0xff   ? 16
      : $top > 1      ? 8
      :                 1;

This behavior is controlled with the flag B<--logical-padding>, which is set
'on' by default.  If it is not desired it can be turned off using
B<--nological-padding> or B<-nlop>.  The above two examples become, with
B<-nlop>:

    # perltidy -nlop
    $same =
      ( ( $aP eq $bP )
          && ( $aS eq $bS )
          && ( $aT eq $bT )
          && ( $a->{'title'} eq $b->{'title'} )
          && ( $a->{'href'} eq $b->{'href'} ) );

    # perltidy -nlop
    $bits =
      $top > 0xffff ? 32
      : $top > 0xff ? 16
      : $top > 1    ? 8
      :               1;

=item B<Trimming whitespace around C<qw> quotes>

B<-tqw> or B<--trim-qw> provide the default behavior of trimming
spaces around multi-line C<qw> quotes and indenting them appropriately.

B<-ntqw> or B<--notrim-qw> cause leading and trailing whitespace around
multi-line C<qw> quotes to be left unchanged.  This option will not
normally be necessary, but was added for testing purposes, because in
some versions of perl, trimming C<qw> quotes changes the syntax tree.

=item B<-sbq=n> or B<--space-backslash-quote=n>

Lines like

       $str1=\"string1";
       $str2=\'string2';

can confuse syntax highlighters unless a space is included between the
backslash and the single or double quotation mark.

This can be controlled with the value of B<n> as follows:

    -sbq=0 means no space between the backslash and quote
    -sbq=1 means follow the example of the source code
    -sbq=2 means always put a space between the backslash and quote

The default is B<-sbq=1>, meaning that a space will be used if there is one in
the source code.

=item B<Trimming trailing whitespace from lines of POD>

B<-trp> or B<--trim-pod> will remove trailing whitespace from lines of POD.
The default is not to do this.

=back

=head2 Comment Controls

Perltidy has a number of ways to control the appearance of both block comments
and side comments.  The term B<block comment> here refers to a full-line
comment, whereas B<side comment> will refer to a comment which appears on a
line to the right of some code.

=over 4

=item B<-ibc>, B<--indent-block-comments>

Block comments normally look best when they are indented to the same
level as the code which follows them.  This is the default behavior, but
you may use B<-nibc> to keep block comments left-justified.  Here is an
example:

             # this comment is indented      (-ibc, default)
             if ($task) { yyy(); }

The alternative is B<-nibc>:

 # this comment is not indented              (-nibc)
             if ($task) { yyy(); }

See also the next item, B<-isbc>, as well as B<-sbc>, for other ways to
have some indented and some outdented block comments.

=item B<-isbc>, B<--indent-spaced-block-comments>

If there is no leading space on the line, then the comment will not be
indented, and otherwise it may be.

If both B<-ibc> and B<-isbc> are set, then B<-isbc> takes priority.

=item B<-olc>, B<--outdent-long-comments>

When B<-olc> is set, lines which are full-line (block) comments longer
than the value B<maximum-line-length> will have their indentation
removed.  This is the default; use B<-nolc> to prevent outdenting.

=item B<-msc=n>, B<--minimum-space-to-comment=n>

Side comments look best when lined up several spaces to the right of
code.  Perltidy will try to keep comments at least n spaces to the
right.  The default is n=4 spaces.

=item B<-fpsc=n>, B<--fixed-position-side-comment=n>

This parameter tells perltidy to line up side comments in column number B<n>
whenever possible.  The default, n=0, will not do this.

=item B<-iscl>, B<--ignore-side-comment-lengths>

This parameter causes perltidy to ignore the length of side comments when
setting line breaks.  The default, B<-niscl>, is to include the length of
side comments when breaking lines to stay within the length prescribed
by the B<-l=n> maximum line length parameter.  For example, the following
long single line would remain intact with -l=80 and -iscl:

     perltidy -l=80 -iscl
        $vmsfile =~ s/;[\d\-]*$//; # Clip off version number; we can use a newer version as well

whereas without the -iscl flag the line will be broken:

     perltidy -l=80
        $vmsfile =~ s/;[\d\-]*$//
          ;    # Clip off version number; we can use a newer version as well

=item B<-hsc>, B<--hanging-side-comments>

By default, perltidy tries to identify and align "hanging side
comments", which are something like this:

        my $IGNORE = 0;    # This is a side comment
                           # This is a hanging side comment
                           # And so is this

A comment is considered to be a hanging side comment if (1) it immediately
follows a line with a side comment, or another hanging side comment, and
(2) there is some leading whitespace on the line.
To deactivate this feature, use B<-nhsc> or B<--nohanging-side-comments>.
If block comments are preceded by a blank line, or have no leading
whitespace, they will not be mistaken as hanging side comments.
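
The two conditions above can be written out as a small classifier.  This is
only an illustrative sketch of the stated rule, not perltidy's actual
implementation; the function name C<classify_comments> and the labels it
returns are invented for the example, and the test for a side comment is
deliberately naive (it would be fooled by a '#' inside a string).

    use strict;
    use warnings;

    # Label each line as 'blank', 'code', 'side', 'hanging', or 'block'.
    sub classify_comments {
        my @lines = @_;
        my @kinds;
        my $prev = '';
        for my $line (@lines) {
            my $kind;
            if ( $line =~ /^\s*$/ ) {
                $kind = 'blank';
            }
            elsif ( $line =~ /^(\s*)#/ ) {

                # a full-line comment: hanging only if both conditions hold
                my $has_leading_space = length($1) > 0;
                my $follows_side = $prev eq 'side' || $prev eq 'hanging';
                $kind = ( $has_leading_space && $follows_side )
                  ? 'hanging'
                  : 'block';
            }
            elsif ( $line =~ /\S\s+#/ ) {
                $kind = 'side';    # naive: code followed by a comment
            }
            else {
                $kind = 'code';
            }
            push @kinds, $kind;
            $prev = $kind;
        }
        return @kinds;
    }

    my @kinds = classify_comments(
        'my $IGNORE = 0;    # This is a side comment',
        '                   # This is a hanging side comment',
        '                   # And so is this',
        '',
        '    # a block comment after a blank line',
        '# a block comment with no leading whitespace',
    );
    print join( ' ', @kinds ), "\n";

A blank line resets the previous state, which is why the indented comment that
follows it is classified as a block comment.
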
=item B<Closing Side Comments>

A closing side comment is a special comment which perltidy can automatically create and place after the closing brace of a code block. They can be useful for code maintenance and debugging. The command B<-csc> (or B<--closing-side-comments>) adds or updates closing side comments. For example, here is a small code snippet

        sub message {
            if ( !defined( $_[0] ) ) {
                print("Hello, World\n");
            }
            else {
                print( $_[0], "\n" );
            }
        }

And here is the result of processing with C<perltidy -csc>:

        sub message {
            if ( !defined( $_[0] ) ) {
                print("Hello, World\n");
            }
            else {
                print( $_[0], "\n" );
            }
        } ## end sub message

A closing side comment was added for C<sub message> in this case, but not for the C<if> and C<else> blocks, because they were below the 6 line cutoff limit for adding closing side comments. This limit may be changed with the B<-csci> command, described below.

The command B<-dcsc> (or B<--delete-closing-side-comments>) reverses this process and removes these comments.

Several commands are available to modify the behavior of these two basic commands, B<-csc> and B<-dcsc>:

=over 4

=item B<-csci=n>, or B<--closing-side-comment-interval=n>

where C<n> is the minimum number of lines that a block must have in order for a closing side comment to be added. The default value is C<n=6>. To illustrate:

        # perltidy -csci=2 -csc
        sub message {
            if ( !defined( $_[0] ) ) {
                print("Hello, World\n");
            } ## end if ( !defined( $_[0] ))
            else {
                print( $_[0], "\n" );
            } ## end else [ if ( !defined( $_[0] ))
        } ## end sub message

Now the C<if> and C<else> blocks are commented. However, now this has become very cluttered.

=item B<-cscp=string>, or B<--closing-side-comment-prefix=string>

where string is the prefix used before the name of the block type. The default prefix, shown above, is C<## end>. This string will be added to closing side comments, and it will also be used to recognize them in order to update, delete, and format them. Any comment identified as a closing side comment will be placed just a single space to the right of its closing brace.

=item B<-cscl=string>, or B<--closing-side-comment-list>

where C<string> is a list of block types to be tagged with closing side comments. By default, all code block types preceded by a keyword or label (such as C<if>, C<sub>, and so on) will be tagged. The B<-cscl> command changes the default list to be any selected block types; see L<"Specifying Block Types">. For example, the following command requests that only C<sub>'s, labels, C<BEGIN>, and C<END> blocks be affected by any B<-csc> or B<-dcsc> operation:

   -cscl="sub : BEGIN END"

=item B<-csct=n>, or B<--closing-side-comment-maximum-text=n>

The text appended to certain block types, such as an C<if> block, is whatever lies between the keyword introducing the block, such as C<if>, and the opening brace. Since this might be too much text for a side comment, there needs to be a limit, and that is the purpose of this parameter. The default value is C<n=20>, meaning that no additional tokens will be appended to this text after its length reaches 20 characters. Omitted text is indicated with C<...>. (Tokens, including sub names, are never truncated, however, so actual lengths may exceed this). To illustrate, in the above example, the appended text of the first block is C< ( !defined( $_[0] )...>. The existing limit of C<n=20> caused this text to be truncated, as indicated by the C<...>. See the next flag for additional control of the abbreviated text.

=item B<-cscb>, or B<--closing-side-comments-balanced>

As discussed in the previous item, when the closing-side-comment-maximum-text limit is exceeded the comment text must be truncated. Older versions of perltidy terminated with three dots, and this can still be achieved with -ncscb:

  perltidy -csc -ncscb
  } ## end foreach my $foo (sort { $b cmp $a ...

However this causes a problem with editors which cannot recognize comments or are not configured to do so because they cannot "bounce" around in the text correctly.
The B<-cscb> flag has been added to help them by appending appropriate balancing structure:

  perltidy -csc -cscb
  } ## end foreach my $foo (sort { $b cmp $a ... })

The default is B<-cscb>.

=item B<-csce=n>, or B<--closing-side-comment-else-flag=n>

The default, B<n=0>, places the text of the opening C<if> statement after any terminal C<else>.

If B<n=2> is used, then each C<elsif> is also given the text of the opening C<if> statement. Also, an C<else> will include the text of a preceding C<elsif> statement. Note that this may result in some long closing side comments.

If B<n=1> is used, the results will be the same as B<n=2> whenever the resulting line length is less than the maximum allowed.

=item B<-cscb>, or B<--closing-side-comments-balanced>

When using closing-side-comments, and the closing-side-comment-maximum-text limit is exceeded, then the comment text must be abbreviated. It is terminated with three dots if the B<-cscb> flag is negated:

  perltidy -csc -ncscb
  } ## end foreach my $foo (sort { $b cmp $a ...

This causes a problem with older editors which do not recognize comments because they cannot "bounce" around in the text correctly. The B<-cscb> flag tries to help them by appending appropriate terminal balancing structures:

  perltidy -csc -cscb
  } ## end foreach my $foo (sort { $b cmp $a ... })

The default is B<-cscb>.

=item B<-cscw>, or B<--closing-side-comment-warnings>

This parameter is intended to help make the initial transition to the use of closing side comments. It causes two things to happen if a closing side comment replaces an existing, different closing side comment: first, an error message will be issued, and second, the original side comment will be placed alone on a new specially marked comment line for later attention.

The intent is to avoid clobbering existing hand-written side comments which happen to match the pattern of closing side comments. This flag should only be needed on the first run with B<-csc>.

=back

B<Important Notes on Closing Side Comments:>

=over 4

=item *

Closing side comments are only placed on lines terminated with a closing brace. Certain closing styles, such as the use of cuddled elses (B<-ce>), preclude the generation of some closing side comments.

=item *

Please note that adding or deleting of closing side comments takes place only through the commands B<-csc> or B<-dcsc>. The other commands, if used, merely modify the behavior of these two commands.

=item *

It is recommended that the B<-cscw> flag be used along with B<-csc> on the first use of perltidy on a given file. This will prevent loss of any existing side comment data which happens to have the csc prefix.

=item *

Once you use B<-csc>, you should continue to use it so that any closing side comments remain correct as code changes. Otherwise, these comments will become incorrect as the code is updated.

=item *

If you edit the closing side comments generated by perltidy, you must also change the prefix to be different from the closing side comment prefix. Otherwise, your edits will be lost when you rerun perltidy with B<-csc>. For example, you could simply change C<## end> to be C<## End>, since the test is case sensitive. You may also want to use the B<-ssc> flag to keep these modified closing side comments spaced the same as actual closing side comments.

=item *

Temporarily generating closing side comments is a useful technique for exploring and/or debugging a perl script, especially one written by someone else. You can always remove them with B<-dcsc>.

=back

=item B<Static Block Comments>

Static block comments are block comments with a special leading pattern, C<##> by default, which will be treated slightly differently from other block comments. They effectively behave as if they had glue along their left and top edges, because they stick to the left edge and previous line when there are no blank spaces in those places. This option is particularly useful for controlling how commented code is displayed.

=over 4

=item B<-sbc>, B<--static-block-comments>

When B<-sbc> is used, a block comment with a special leading pattern, C<##> by default, will be treated specially.

Comments so identified are treated as follows:

=over 4

=item *

If there is no leading space on the line, then the comment will not be indented, and otherwise it may be,

=item *

no new blank line will be inserted before such a comment, and

=item *

such a comment will never become a hanging side comment.

=back

For example, assuming C<@month_of_year> is left-adjusted:

    @month_of_year = (    # -sbc (default)
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct',
    ##  'Dec', 'Nov'
        'Nov', 'Dec');

Without this convention, the above code would become

    @month_of_year = (   # -nsbc
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct',
        ##  'Dec', 'Nov'
        'Nov', 'Dec'
    );

which is not as clear. The default is to use B<-sbc>. This may be deactivated with B<-nsbc>.

=item B<-sbcp=string>, B<--static-block-comment-prefix=string>

This parameter defines the prefix used to identify static block comments when the B<-sbc> parameter is set. The default prefix is C<##>, corresponding to C<-sbcp=##>. The prefix is actually part of a perl pattern used to match lines and it must either begin with C<#> or C<^#>. In the first case a prefix ^\s* will be added to match any leading whitespace, while in the second case the pattern will match only comments with no leading whitespace. For example, to identify all comments as static block comments, one would use C<-sbcp=#>. To identify all left-adjusted comments as static block comments, use C<-sbcp='^#'>.

Please note that B<-sbcp> merely defines the pattern used to identify static block comments; it will not be used unless the switch B<-sbc> is set. Also, please be aware that since this string is used in a perl regular expression which identifies these comments, it must enable a valid regular expression to be formed.

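To make the two cases concrete, here is a minimal Perl sketch of how a prefix string could be turned into a line-matching pattern, following the description above. The helper name C<sbc_pattern> is hypothetical and the construction is an assumption based on this text; perltidy's internal code may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper, following the rule described above: a prefix starting
# with '^#' is used as given (left-adjusted comments only); a prefix starting
# with '#' gets '^\s*' prepended so that leading whitespace is allowed.
sub sbc_pattern {
    my ($prefix) = @_;
    die "prefix must begin with '#' or '^#'\n" unless $prefix =~ /^\^?#/;
    my $pattern = $prefix =~ /^\^/ ? $prefix : '^\s*' . $prefix;
    return qr/$pattern/;    # dies here if the string is not a valid regex
}

my $default = sbc_pattern('##');    # like -sbcp=##
my $left    = sbc_pattern('^#');    # like -sbcp='^#'

for my $line (
    '## commented-out code',
    '    ## indented',
    '# plain',
    '    # indented plain'
  )
{
    printf "%-24s default:%d left-adjusted:%d\n", "'$line'",
      ( $line =~ $default ? 1 : 0 ), ( $line =~ $left ? 1 : 0 );
}
```

With the default prefix both C<##> lines match, whether or not they are indented. With C<^#> only the comments which begin in the first column match.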
A pattern which can be useful is:

    -sbcp=^#{2,}[^\s#]

This pattern requires a static block comment to begin with at least two '#' characters followed by at least one character which is neither a # nor a space. It allows a line containing only '#' characters to be rejected as a static block comment. Such lines are often used at the start and end of header information in subroutines and should not be separated from the intervening comments, which typically begin with just a single '#'.

=item B<-osbc>, B<--outdent-static-block-comments>

The command B<-osbc> will cause static block comments to be outdented by 2 spaces (or whatever B<-ci=n> has been set to), if possible.

=back

=item B<Static Side Comments>

Static side comments are side comments with a special leading pattern. This option can be useful for controlling how commented code is displayed when it is a side comment.

=over 4

=item B<-ssc>, B<--static-side-comments>

When B<-ssc> is used, a side comment with a static leading pattern, which is C<##> by default, will be spaced only a single space from the previous character, and it will not be vertically aligned with other side comments.

The default is B<-nssc>.

=item B<-sscp=string>, B<--static-side-comment-prefix=string>

This parameter defines the prefix used to identify static side comments when the B<-ssc> parameter is set. The default prefix is C<##>, corresponding to C<-sscp=##>.

Please note that B<-sscp> merely defines the pattern used to identify static side comments; it will not be used unless the switch B<-ssc> is set. Also, note that this string is used in a perl regular expression which identifies these comments, so it must enable a valid regular expression to be formed.

=back

=back

=head2 Skipping Selected Sections of Code

Selected lines of code may be passed verbatim to the output without any formatting by marking the starting and ending lines with special comments. There are two options for doing this. The first option is called B<--format-skipping> or B<-fs>, and the second option is called B<--code-skipping> or B<-cs>.
In both cases the lines of code will be output without any changes. The difference is that in B<--format-skipping> perltidy will still parse the marked lines of code and check for errors, whereas in B<--code-skipping> perltidy will simply pass the lines to the output without any checking.

Both of these features are enabled by default and are invoked with special comment markers. B<--format-skipping> uses starting and ending markers '#<<<' and '#>>>', like this:

 #<<<  format skipping: do not let perltidy change my nice formatting
    my @list = (1,
                1, 1,
                1, 2, 1,
                1, 3, 3, 1,
                1, 4, 6, 4, 1,);
 #>>>

B<--code-skipping> uses starting and ending markers '#<<V' and '#>>V', like this:

 #<<V  code skipping: perltidy will pass this verbatim without error checking

    token ident_digit {
        [ [ <?word> | _ | <?digit> ] <?ident_digit>
        |   <''>
        ]
    };

 #>>V

Additional text may appear on the special comment lines provided that it is separated from the marker by at least one space, as in the above examples.

Any number of code-skipping or format-skipping sections may appear in a file. If an opening code-skipping or format-skipping comment is not followed by a corresponding closing comment, then skipping continues to the end of the file. If a closing code-skipping or format-skipping comment appears in a file but does not follow a corresponding opening comment, then it is treated as an ordinary comment without any special meaning.

It is recommended to use B<--code-skipping> only if you need to hide a block of an extended syntax which would produce errors if parsed by perltidy, and use B<--format-skipping> otherwise. This is because the B<--format-skipping> option provides the benefits of error checking, and there are essentially no limitations on the lines to which it can be applied. The B<--code-skipping> option, on the other hand, does not do error checking and its use is more restrictive because the code which remains, after skipping the marked lines, must be syntactically correct code with balanced containers.

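The rules for unmatched markers can be sketched as a simple line scanner in Perl. This is an illustration only, not perltidy's implementation: the helper name C<mark_skipped> is hypothetical, the markers are matched as literal strings rather than as patterns, and it is assumed that a marker may be indented and must be followed by whitespace or the end of the line.

```perl
use strict;
use warnings;

# Hypothetical helper: return one flag per line, 1 if the line lies inside a
# skipped section (marker lines included), 0 otherwise.
sub mark_skipped {
    my ( $lines, $begin, $end ) = @_;
    my $begin_re = qr/^\s*\Q$begin\E(?:\s|$)/;
    my $end_re   = qr/^\s*\Q$end\E(?:\s|$)/;
    my $in_skip  = 0;
    my @flags;
    for my $line (@$lines) {
        if ( !$in_skip && $line =~ $begin_re ) {
            $in_skip = 1;    # an opening marker starts a section
            push @flags, 1;
        }
        elsif ( $in_skip && $line =~ $end_re ) {
            $in_skip = 0;    # a closing marker ends it
            push @flags, 1;
        }
        else {
            # A closing marker seen outside a section falls through to here
            # and is treated like any ordinary comment.
            push @flags, $in_skip;
        }
    }

    # If $in_skip is still set, the last section ran to the end of the file.
    return @flags;
}

my @source = (
    '#>>> stray closing marker: an ordinary comment',
    'my $x = 1;',
    '#<<< format skipping',
    'my @list = (1,',
    '            1, 1,);',
    '#>>>',
    'my $y = 2;',
    '#<<< never closed',
    'my $z = 3;',
);
my @flags = mark_skipped( \@source, '#<<<', '#>>>' );
print join( '', @flags ), "\n";
```

In this example the stray closing marker on the first line has no effect, the first section covers the four lines from its opening marker to its closing marker, and the unclosed section at the end extends to the last line.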
These features should be used sparingly to avoid littering code with markers, but they can be helpful for working around occasional problems.

Note that it may be possible to avoid the use of B<--format-skipping> for the specific case of a comma-separated list of values, as in the above example, by simply inserting a blank or comment somewhere between the opening and closing parens. See the section L<"Controlling List Formatting">.

The following sections describe the available controls for these options. They should not normally be needed.

=over 4

=item B<-fs>, B<--format-skipping>

As explained above, this flag, which is enabled by default, causes any code between special beginning and ending comment markers to be passed to the output without formatting. The code between the comments is still checked for errors however. The default beginning marker is #<<< and the default ending marker is #>>>.

Format skipping begins when a format skipping beginning comment is seen and continues until a format-skipping ending comment is found.

This feature can be disabled with B<-nfs>. This should not normally be necessary.

=item B<-fsb=string>, B<--format-skipping-begin=string>

This and the next parameter allow the special beginning and ending comments to be changed. However, it is recommended that they only be changed if there is a conflict between the default values and some other use. If they are used, it is recommended that they only be entered in a B<.perltidyrc> file, rather than on a command line. This is because properly escaping these parameters on a command line can be difficult.

If changed comment markers do not appear to be working, use the B<-log> flag and examine the F<.LOG> file to see if and where they are being detected.

The B<-fsb=string> parameter may be used to change the beginning marker for format skipping. The default is equivalent to -fsb='#<<<'.
The string that you enter must begin with a # and should be in quotes as necessary to get past the command shell of your system. It is actually the leading text of a pattern that is constructed by appending a '\s', so you must also include backslashes for characters to be taken literally rather than as patterns.

Some examples show how example strings become patterns:

 -fsb='#\{\{\{' becomes /^#\{\{\{\s/  which matches  #{{{ but not #{{{{
 -fsb='#\*\*'   becomes /^#\*\*\s/    which matches  #** but not #***
 -fsb='#\*{2,}' becomes /^#\*{2,}\s/  which matches  #** and #*****

=item B<-fse=string>, B<--format-skipping-end=string>

The B<-fse=string> is the corresponding parameter used to change the ending marker for format skipping. The default is equivalent to -fse='#>>>'.

The beginning and ending strings may be the same, but it is preferable to make them different for clarity.

=item B<-cs>, B<--code-skipping>

As explained above, this flag, which is enabled by default, causes any code between special beginning and ending comment markers to be directly passed to the output without any error checking or formatting. Essentially, perltidy treats it as if it were a block of arbitrary text. The default beginning marker is #<<V and the default ending marker is #>>V.

This feature can be disabled with B<-ncs>. This should not normally be necessary.

=item B<-csb=string>, B<--code-skipping-begin=string>

This may be used to change the beginning comment for a B<--code-skipping> section, and its use is similar to the B<-fsb=string>. The default is equivalent to -csb='#<<V'.

=item B<-cse=string>, B<--code-skipping-end=string>

This may be used to change the ending comment for a B<--code-skipping> section, and its use is similar to the B<-fse=string>. The default is equivalent to -cse='#>>V'.

=back

=head2 Line Break Control

The parameters in this and the next sections control breaks after non-blank lines of code. Blank lines are controlled separately by parameters in the section L<"Blank Line Control">.

=over 4

=item B<-dnl>, B<--delete-old-newlines>

By default, perltidy first deletes all old line break locations, and then it looks for good break points to match the desired line length. Use B<-ndnl> or B<--nodelete-old-newlines> to force perltidy to retain all old line break points.

=item B<-anl>, B<--add-newlines>

By default, perltidy will add line breaks when necessary to create continuations of long lines and to improve the script appearance. Use B<-nanl> or B<--noadd-newlines> to prevent any new line breaks.

This flag does not prevent perltidy from eliminating existing line breaks; see B<--freeze-newlines> to completely prevent changes to line break points.

=item B<-fnl>, B<--freeze-newlines>

If you do not want any changes to the line breaks within lines of code in your script, set B<-fnl>, and they will remain fixed, and the rest of the commands in this section and sections L<"Controlling List Formatting">, L<"Retaining or Ignoring Existing Line Breaks"> will be ignored. You may want to use B<-noll> with this.

Note: If you also want to keep your blank lines exactly as they are, you can use the B<-fbl> flag which is described in the section L<"Blank Line Control">.

=back

=head2 Controlling Breaks at Braces, Parens, and Square Brackets

=over 4

=item B<-ce>, B<--cuddled-else>

Enable the "cuddled else" style, in which C<else> and C<elsif> follow immediately after the curly brace closing the previous block. The default is not to use cuddled elses, and is indicated with the flag B<-nce> or B<--nocuddled-else>. Here is a comparison of the alternatives:

  # -ce
  if ($task) {
      yyy();
  } else {
      zzz();
  }

  # -nce (default)
  if ($task) {
      yyy();
  }
  else {
      zzz();
  }

In this example the keyword B<else> is placed on the same line which begins with the preceding closing block brace and is followed by its own opening block brace on the same line. Other keywords and function names which are formatted with this "cuddled" style are B<elsif>, B<continue>, B<catch>, B<finally>.
Other block types can be formatted by specifying their names on a separate parameter B<-cbl>, described in a later section.

Cuddling between a pair of code blocks requires that the closing brace of the first block start a new line. If this block is entirely on one line in the input file, it is necessary to decide if it should be broken to allow cuddling. This decision is controlled by the flag B<-cbo=n> discussed below. The default and recommended value of B<-cbo=1> bases this decision on the first block in the chain. If it spans multiple lines then cuddling is made and continues along the chain, regardless of the sizes of subsequent blocks. Otherwise, short lines remain intact.

So for example, the B<-ce> flag would not have any effect if the above snippet is rewritten as

  if ($task) { yyy() }
  else {    zzz() }

If the first block spans multiple lines, then cuddling can be done and will continue for the subsequent blocks in the chain, as illustrated in the previous snippet.

If there are blank lines between cuddled blocks they will be eliminated. If there are comments after the closing brace where cuddling would occur then cuddling will be prevented. If this occurs, cuddling will restart later in the chain if possible.

=item B<-cb>, B<--cuddled-blocks>

This flag is equivalent to B<-ce>.

=item B<-cbl>, B<--cuddled-block-list>

The built-in default cuddled block types are B<else, elsif, continue, catch, finally>.

Additional block types to which the B<--cuddled-blocks> style applies can be defined by this parameter. This parameter is a character string, giving a list of block types separated by commas or spaces. For example, to cuddle code blocks of type sort, map and grep, in addition to the default types, the string could be set to

  -cbl="sort map grep"

or equivalently

  -cbl=sort,map,grep

Note however that these particular block types are typically short so there might not be much opportunity for the cuddled format style.

Using commas avoids the need to protect spaces with quotes.

As a diagnostic check, the flag B<--dump-cuddled-block-list> or B<-dcbl> can be used to view the hash of values that are generated by this flag.

Finally, note that the B<-cbl> flag by itself merely specifies which blocks are formatted with the cuddled format. It has no effect unless this formatting style is activated with B<-ce>.

=item B<-cblx>, B<--cuddled-block-list-exclusive>

When cuddled else formatting is selected with B<-ce>, setting this flag causes perltidy to ignore its built-in defaults and rely exclusively on the block types specified on the B<-cbl> flag described in the previous section. For example, to avoid using cuddled B<catch> and B<finally>, which are among the defaults, the following set of parameters could be used:

  perltidy -ce -cbl='else elsif continue' -cblx

=item B<-cbo=n>, B<--cuddled-break-option=n>

Cuddled formatting is only possible between a pair of code blocks if the closing brace of the first block starts a new line. If a block is encountered which is entirely on a single line, and cuddled formatting is selected, it is necessary to make a decision as to whether or not to "break" the block, meaning to cause it to span multiple lines. This parameter controls that decision. The options are:

   cbo=0  Never force a short block to break.
   cbo=1  If the first of a pair of blocks is broken in the input file,
          then break the second [DEFAULT].
   cbo=2  Break open all blocks for maximal cuddled formatting.

The default and recommended value is B<cbo=1>. With this value, if the starting block of a chain spans multiple lines, then a cascade of breaks will occur for remaining blocks causing the entire chain to be cuddled.

The option B<cbo=0> can produce erratic cuddling if there are numerous one-line blocks.

The option B<cbo=2> produces maximal cuddling but will not allow any short blocks.

=item B<-bl>, B<--opening-brace-on-new-line>, or B<--brace-left>

Use the flag B<-bl> to place an opening block brace on a new line:

  if ( $input_file eq '-' )
  {
      ...
  }

By default it applies to all structural blocks except B<sort map grep eval> and anonymous subs.

The default is B<-nbl> which places an opening brace on the same line as the keyword introducing it if possible. For example,

  # default
  if ( $input_file eq '-' ) {
     ...
  }

When B<-bl> is set, the blocks to which this applies can be controlled with the parameters B<--brace-left-list> and B<--brace-left-exclusion-list> described in the next sections.

=item B<-bll=s>, B<--brace-left-list=s>

Use this parameter to change the types of block braces for which the B<-bl> flag applies; see L<"Specifying Block Types">. For example, B<-bll='if elsif else sub'> would apply it to only C<if/elsif/else> and named sub blocks. The default is all blocks, B<-bll='*'>.

=item B<-blxl=s>, B<--brace-left-exclusion-list=s>

Use this parameter to exclude types of block braces for which the B<-bl> flag applies; see L<"Specifying Block Types">. For example, the default settings B<-bll='*'> and B<-blxl='sort map grep eval asub'> mean all blocks except B<sort map grep eval> and anonymous sub blocks.

Note that the lists B<-bll=s> and B<-blxl=s> control the behavior of the B<-bl> flag but have no effect unless the B<-bl> flag is set.

=item B<-sbl>, B<--opening-sub-brace-on-new-line>

The flag B<-sbl> provides a shortcut way to turn on B<-bl> just for named subs. The same effect can be achieved by turning on B<-bl> with the block list set as B<-bll='sub'>.

For example,

 perltidy -sbl

produces this result:

 sub message
 {
    if (!defined($_[0])) {
        print("Hello, World\n");
    }
    else {
        print($_[0], "\n");
    }
 }

This flag is negated with B<-nsbl>, which is the default.

=item B<-asbl>, B<--opening-anonymous-sub-brace-on-new-line>

The flag B<-asbl> is like the B<-sbl> flag except that it applies to anonymous sub's instead of named subs. For example

 perltidy -asbl

produces this result:

 $a = sub
 {
     if ( !defined( $_[0] ) ) {
         print("Hello, World\n");
     }
     else {
         print( $_[0], "\n" );
     }
 };

This flag is negated with B<-nasbl>, and the default is B<-nasbl>.

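The way the B<-bl> flag combines with the lists B<-bll=s> and B<-blxl=s> described above can be pictured as a simple set test. The following Perl sketch is only an illustration under stated assumptions: the helper name C<bl_applies> is hypothetical, list entries are treated as plain words plus C<*>, and an exclusion is assumed to win over an inclusion. The real block type syntax is richer; see L<"Specifying Block Types">.

```perl
use strict;
use warnings;

# Hypothetical helper: does -bl apply to a block of the given type?
# The default lists are the ones quoted in the text above.
sub bl_applies {
    my ( $type, %opt ) = @_;
    return 0 unless $opt{bl};    # the lists have no effect without -bl
    my %include = map { $_ => 1 } split ' ', ( $opt{bll} // '*' );
    my %exclude = map { $_ => 1 } split ' ',
      ( $opt{blxl} // 'sort map grep eval asub' );
    return 0 if $exclude{$type};    # assumption: exclusion wins
    return ( $include{'*'} || $include{$type} ) ? 1 : 0;
}

for my $type (qw(if sub sort asub)) {
    printf "%-5s default lists: %d   -bll='sub': %d\n", $type,
      bl_applies( $type, bl => 1 ),
      bl_applies( $type, bl => 1, bll => 'sub' );
}
```

In this picture, C<< bll => 'sub' >> selects only named subs, which corresponds to the B<-sbl> shortcut.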
=item B<-bli>, B<--brace-left-and-indent>

The flag B<-bli> is similar to the B<-bl> flag but in addition it causes one unit of continuation indentation (see B<-ci>) to be placed before the opening and closing block braces.

For example, perltidy -bli gives

        if ( $input_file eq '-' )
          {
            important_function();
          }

By default, this extra indentation occurs for block types: B<if>, B<elsif>, B<else>, B<unless>, B<while>, B<for>, B<foreach>, B<do>, and also B<named subs> and blocks preceded by a B<label>