pax_global_header00006660000000000000000000000064130016757130014515gustar00rootroot0000000000000052 comment=75c6e3ee093237524e24dfdf8df31464b9482b98 pwget-2016.1019+git75c6e3e/000077500000000000000000000000001300167571300147475ustar00rootroot00000000000000pwget-2016.1019+git75c6e3e/COPYING000066400000000000000000000431031300167571300160030ustar00rootroot00000000000000 GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. pwget-2016.1019+git75c6e3e/ChangeLog000066400000000000000000001042011300167571300165170ustar00rootroot000000000000002016-10-19 Wed Jari Aalto * bin/pwget.pl (POD): Fx spelling. 2014-04-09 Wed Jari Aalto * bin/pwget.pl (FileExists): Add .xz extension. (FileSimpleCompressed): Add .xz extension. (FileDeCompressedCmd): Add .xz support. (UrlHttp): Remove internal debug variable setting. 2014-02-21 Fri Jari Aalto * bin/pwget.pl (POD): Fix typo. (FileRootDirNeedeed): Correct Perl warning about missing function parens in call to DateYYYY_MM_DD(). (HandleCommandLineArgs): remove defined() tests from LISTs. 2012-02-09 Thu Jari Aalto * doc/examples/emacs.conf: Update old name to pwget. * bin/pwget.pl (Boot): Add dheck against empty command line args. (UrlHttGetWget): Remove which(1). Use --long options in wget(1). * README (README::Depends): Require PErl 5.10+. The LWP::UserAgent is included in latest Perl, so it's not extenal CPAN dependency any more. 2012-01-30 Mon Jari Aalto * bin/pwget.pl (LatestVersion): Improve internal error message. (Main): Improve HTTPS Crypt::SSLeay.pm message. (LatestVersion): use qr() and \E..\Q to quote $post part of found file name. Like in Debian archive: +dfsg.orig.tar => \+dfsg\.orig\.tar 2010-10-31 Sun Jari Aalto * bin/pwget.pl (--Tag): replace all occurrences with --tag. (HandleCommandLineArgs): Make -t and -v options non-ambiguous. (HELP::--regexp): Correct short option capitalization in example. Add caveat: currently works only for http:// URLs. 2010-10-12 Tue Jari Aalto * Makefile (PERL): New. (doc/manual/index.txt): Change from 'pwget.txt'. (doc/manual/index.html): Change from 'pwget.html'. * bin/pwget.pl (HELP): Correct spelling. (HELP::head1 OPTIONS): add missing '=over 4'. (Help): Correct load of module Pod::Man. 2010-05-01 Sat Jari Aalto * pwget.pl (HandleCommandLineArgs): Remove "require_order" from Getopt::Long::config(). 2010-04-17 Thu Jari Aalto * pwget.pl (Help): Change word 'kit' to 'package. Remove option categories and arrange all options alphabetically. 2010-04-15 Thu Jari Aalto * pwget.pl (Help): rewrite eval Pod::Man call. 2010-04-13 Tue Jari Aalto * pwget.pl: Add 'use 5.10' because "Named Capture Buffers" are used. 2010-03-13 Sat Jari Aalto * pwget.pl (top level): rearrange globals, use comamnds. (Initialize): change CONTACT to AUTHOR. 2010-02-22 Mon Jari Aalto * pwget.pl (HandleCommandLineArgs): Change command line options to follow GNU standards: -v, --version; not -v|--version; all long options names are in lowercase. 2009-10-01 Jari Aalto * pwget.pl (UrlHttGetPerl): Set User-Agent to Firefox 3.5.3 2009-09-23 Jari Aalto * pwget.pl (Help::LIST OF DIRECTIVES::new): Improve documentation. (Help::LIST OF DIRECTIVES::page): Improve documentation. (Help::LIST OF DIRECTIVES::pregexp): New. (ListUnique): New. (FileListFilter): Change to hash notation. 2009-09-22 Jari Aalto * pwget.pl (UrlHttp): Comment out '-->' which announce initial file name, not the final. (IsSourceforgeDownload): New. (UrlHttpParseHref): Convert args to HASH syntax. Add parameter unique. (FileExists): Convert args to HASH syntax. Add more archive extensions. Add Sourceforge support. (HandleCommandLineArgs): Correct setting of $debug and $verb. (UrlHttGetPerl): Chnage $ARG to my($content). (UrlHttpSearchPage): Complete rewrite. Use UrlHttGet(). (UrlHttGetPerl): Se user_agent. Work around berlios.de. 2009-09-21 Jari Aalto * pwget.pl (LatestVersion): Change all grouped references to names regular expressions and references. (UrlManipulateMain): Add Sourceforge /download support (SourceforgeProjectName): Add sf.net/projects/ support. (SourceforgeProjectId): Update SF group_id regexp. (UrlManipulateSfOld): New. Move old SF support. (UrlManipulateSf): New SF support. (UrlManipulateMain): Add 'mirror' parameter. (UrlHttp): Pass $mirror to UrlManipulateMain(). (UrlHttGetPerl): New. (UrlHttGetWget): New. (FileNameFix): Add sourceforge support. (HandleCommandLineArgs): Correct --verbose value setting with 'defined'. 2009-09-16 Jari Aalto * pwget.pl (LatestVersion): Add 'rc' to regexp, like in 'release candidate'. 2009-09-12 Jari Aalto * pwget.pl (HELP::--Regexp-content): Correct spelling. 2009-08-03 Jari Aalto * pwget.pl (HandleCommandLineArgs): Do not set $verb to default value 1. Be silent by default. 2009-03-26 Jari Aalto * pwget.pl (Help::LIST OF DIRECTIVES IN CONFIGURATION FILE): Adjust discussion about 'rename:' 2009-03-25 Jari Aalto * pwget.pl (UrlHttpDownload): Change savefile foo.txt?format=mode so that the rest after question mark are removed. Thus foo.txt. 2009-03-19 Jari Aalto * pwget.pl (UrlHttpParseHref): Add www.emacswiki.org filetr regexp. (DirectiveLcd): in mkdir(), announce if test mode is active. 2009-03-18 Jari Aalto * pwget.pl (UrlHttpParseHref): Add code.google.com directory filtering regexp. (HandleCommandLineArgs): Add new option --sleep SECONDS. (Help): Document option --sleep SECONDS. (UrlHttpDownload): Use new variable $SLEEP_SECONDS. 2009-03-08 Jari Aalto * pwget.pl (UrlHttpParseHref): Add more debug. 2009-02-20 Jari Aalto * pwget.pl (LIST OF DIRECTIVES IN CONFIGURATION FILE): Adjust right hand column to 60 in indented examples. (xopt:rm): Adjust indentation. 2009-02-11 Jari Aalto * pwget.pl (Help): Workaround Perl v5.10.0 bug in Pod::Text, which defined non-workable pod2text() function: Can't use string ("") as a symbol ref while "strict refs" in use at /usr/share/perl/5.10/Pod/Text.pm line 249. 2008-09-16 Jari Aalto * pwget.pl: Rename to shorter name 'perl wget'. Was mywebget.pl (help::POD): Document short options. (HandleCommandLineArgs): Add new option --dry-run. Add short option -c for --config. (DirectiveLcd): Do not create directories if --test option is in effect. (UrlManipulateSf): New. (UrlManipulateMain): New. (SopurceforgeParseDownloadPage): New. (SourceforgeProjectName): New. (SourceforgeProjectId): New. (UrlHttp): Handle SF url with UrlManipulateMain(). (UrlHttGet): New. (LatestVersion): Adjust postfix variable $post to detect exotic URLs (sourceforge) like filename=foo-0.7.1.tar.gz&abmode= (UrlHttpParseHref): Filter sourceforge 'mirror_picker'. (FileNameFix): Only fix for 'viewcvs'. (TestDriverSfMirror): Remove. (UrlHttpManipulate): Remove. (UrlHttp): Set correct $file, ignore exotic [?&] PHP paths as save filename. 2008-02-26 Jari Aalto * mywebget.pl (LatestVersion): Adjust 'add' variable. (UrlHttpSearchNewest): Call LatestVersion() only if there are @urls. 2008-02-12 Jari Aalto * mywebget-emacs.conf: (maclennan-sean): New. 2008-02-08 Jari Aalto * mywebget.pl (UrlHttp): Remoe filename from URL, when searching for newer files. (UrlHttpManipulate): Fix Sourceforge URL manipulation code. 2008-01-29 Jari Aalto * mywebget.pl (LatestVersion): Reformat error message. Suggest using in the configuration line. * mywebget-emacs.conf: (lua): Add lua-mode download. 2008-01-25 Jari Aalto * mywebget-emacs.conf: (jsp): change download directory from www/jsp to www/ (niksic-hrvoje): Download all *.el files. 2007-12-01 Jari Aalto * mywebget.pl (HELP): Removed heading VERSION. Can't expand variabled inside POD section. (Unpack): Move $newDir definition inside if. (FileDeCompressedRootDir): Correct detection of root dir. 2007-09-19 Jari Aalto * pwget.pl (LatestVersion): Quote special characters in regexp. The name may contain special characters, like in 'aewm++'. 2006-03-09 Jari Aalto * pwget.pl (UrlHttpSearchNewest): Impreved --Regexp search option. 2006-02-21 Jari Aalto * pwget.pl (HandleCommandLineArgs): Missing =s spec from --mirror (did not accept an argument). Fixed. 2006-02-09 Thu Jari Aalto * mywebget-emacs.conf: 1.52 (mackall-matt): New. quilt mode. (widhopf-fenk-robert): New. 2006-01-22 Sun Jari Aalto * pwget.pl (sub FileListFilter): 1.90 Files in sites using ftp procol were not scrutinized to new: file test. Added $getFile test after regexp test. (sub Main): 1.90 Use $fileName (new: tag content) for $origFile when passing it to UrlFtp(). This makes scanning new files take into effect. 2005-12-01 Thu Jari Aalto * pwget.pl (sub FileDeCompressedCmd): 1.85 incorrect $decompress binary 'bzip' => 'bzip2. Check ERRNO after external shell call. Changed backquotes to more reable qx(). (sub Unpack): 1.85 Changed backquotes to more reable qx(). (sub UrlSfMirrorParse): 1.85 New. (sub TestDriverSfMirror): 1.85 New. (Help::LIST OF DIRECTIVES IN CONFIGURATION FILE): Added new directive 'mirror:'. (Help::General options): Added option --mirrir SITE. for sourceforge downloads. (sub UrlHttpManipulate): 1.85 New. Handle Sourceforge's project downloads correctly. 2005-11-29 Tue Jari Aalto * mywebget-emacs.conf: 1.51 (warsaw-barry): URLs updated. (python-mode): New tag. Point people to sourceforge. 2005-10-16 Sun Jari Aalto * pwget.pl (LatestVersion): 1.84 Increased debug messages so that level 2 is needed. 2005-09-29 Thu Jari Aalto * pwget.pl (sub LatestVersion): 1.84 Added `tbz2' to variable $ext. * mywebget-emacs.conf: 1.50 Chnaged all `belnet' sourceforge download URLS to prdownloads.sourceforge.net (emacs-jabber):G New. (mitchell-lawrence): Added lisppaste.el 2005-08-13 Sat Jari Aalto * pwget.pl (sub ConfigVariableParse): 1.81 Ignore some URLS, that look like variable assignments: print http://example.com/viewcvs/vc-svn.el?rev=HEAD (sub Boot): 1.81 Raised debug from 2 => 4 before printing configuration file contents. (sub ConfigRead): 1.81 Fixed 'already flag' debug output. 2005-04-06 Wed Jari Aalto * mywebget-emacs.conf: 1.49 (corneli-joe): New. 2005-02-16 Jari Aalto Update Copright year in all files 2005-02-06 Sun Jari Aalto * mywebget-emacs.conf: 1.47 (two-mode): New. 2005-02-07 Jari Aalto * mywebget-emacs.conf: (buhl-josh): Tag corrected. Due to misunderstanding the tag was named 'ahlfeld-jorg'. Bug reported by jbuhl users sourceforgenet. 2005-02-04 Jari Aalto * mywebget-emacs.conf (svn): 1.45 Subversion tag disabled. Point people to use Stefan's tag reichor-stefan. 2005-02-02 Wed Jari Aalto * pwget.pl - There were serious problems with . Now accepts full perl code. (sub ExpandVars): 1.77 Commented out `PrintHash', so that environemtn varibales are not listed any more on error (too long listing). (sub ExpandVars): 1.77 Added new parameter 'origline'. (sub ConfigVariableParse): 1.77 Added 'next' to skip 'rename:' directive. (sub EvalCode): 1.77 Added debug. (Help::LIST OF DIRECTIVES IN CONFIGURATION FILE): 1.77 Added much more complicated example to directive. (sub EvalCode): 1.77 Added private block. (sub MonthToNumber): 1.77 New. (UrlHttpDownload): Moved saveFile setting further up, because 'on disk' checking was too early. (sub FileExists): 1.77 Missing -e check added. 2005-01-30 Sun Jari Aalto * mywebget-emacs.conf: 1.45 (dyke-neil): Added quake.el 2005-01-18 Tue Jari * mywebget-emacs.conf (hughes-graham): 1.44 New. Added rc4.el which implements encrypt in pure elisp. 2005-01-04 Tue Jari Aalto * pwget.pl (sub UrlHttpFileCheck): 1.77 Removed unnecessary development line: 'debug=5'. (sub UrlHttp): 1.77 Added input ARG $overwrite. (sub UrlFtp): 1.77 Added input ARG $overwrite. (sub UrlHttpFileCheck): 1.77 Converted input arguments to HASH notation. Added input ARG $overwrite. (sub UrlHttpDownload): 1.77 Added input ARG $overwrite. (sub UrlFile): 1.77 Converted input arguments to HASH notation. Added input ARG $overwrite. * mywebget-emacs.conf: 1.44 (elmes-damien): New (chua-sandra): New; remember.el. 2004-11-19 Fri Jari Aalto * mywebget-emacs.conf (arch): 1.43 download disabled. Instruct to use CVS instead. 2004-11-07 Sun Jari Aalto * pwget.pl (sub UrlHttpParseHref): 1.75 HREF can also use single quote. Added. * mywebget-emacs.conf: 1.41 (mitchell-lawrence): New. 2004-11-06 Sat Jari Aalto * mywebget-emacs.conf: 1.41 (kruse-peter): New. (oconnor-edward): Renamed. Was `oconor-edward' 2004-10-13 Jari Aalto * pwget.pl (DirectiveLcd): Changer input parameter to HASH. (HandleCommandLineArgs): Incorrect option name skip--version => skip-version. (Boot): Removed global $ARG, and used for-loop local my $arg. (HandleCommandLineArgs): New global $CFG_FILE_NEEDED. If the re is no --Tag or --regexp options, there is no need to read and parse configuration file. This will noticeably make program start faster to retrieve URLs. 2004-09-29 Wed Jari Aalto * mywebget-emacs.conf: 1.40 (yuji-minejima): New. 2004-08-27 Fri Jari Aalto * mywebget-emacs.conf: 1.39 (brown-jeremy): New. 2004-08-25 Wed Jari Aalto * mywebget-emacs.conf: 1.37 Removed all 'include' statements. They conflick different setups. * mywebget-emacs.conf: 1.37 (wiegley-john): URL updated. 2004-08-24 Tue Jari Aalto * pwget.pl (Getopt::Long): 1.72 Set verbose to 5 if debug is on. (sub ConfigRead): 1.72 Lowered $verb check to print a warning if configuration file cannot be read. 2004-08-19 Thu Jari Aalto * mywebget-emacs-vars.conf: 1.7 Changed wording in comments. * mywebget-emacs.conf: (The overall recommended site-lisp structure) 1.35 Movex xemacs to separate hierarchy: /usr/share/xemacs/site-lisp/ 2004-04-10 Sat Jari Aalto * mywebget-emacs.conf: 1.33 (triggs-mark): New. 2004-04-01 Thu Jari Aalto * pwget.pl (sub LatestVersion): 1.67 Didn't parse packages names with embedded numbers. like4this-1.1.tar.gz; Fixed. * mywebget-emacs.conf: (jabber): Added emacs-jabber from Sourceforge. (mp3): Added emacsmp3player from Sourceforge. (docbook-xml): Added docbookxml from Sourceforge. (svn): Added vc-svn.el (italk): Added from Sourceforge. (bibletools): Added from Sourceforge. (elisp-other): jtags, jdc-el added from sourceforge. 2004-03-31 Wed Jari Aalto * pwget.pl: 1.64 (top level): REmoved extra newlines. Code beatifying session. (sub LatestVersion): 1.64 Correct version detecton bug, when the version used leading zeroes: treat foo-1.002 as foo-1.0.0.2 (sub UrlHttpDownload): 1.66 Didn't respect user giver save: directive. Fixed. * mywebget-emacs.conf: (arneson-erik) 1.29 New. Includes mixmacter.el. (elisp-other): Added SF project 'table' by Takaaki Ota. 2004-02-08 Sun Jari Aalto * mywebget-emacs.conf: 1.28 (matsushita-akihisa): Added URL http://www.bookshelf.jp/elc/ 2004-02-03 Tue Jari Aalto * mywebget-emacs.conf: (berndl-klaus) 1.28 Added alternative tag `cygwin-mount'. (wright-francis): Added alternative tag `w32-symlinks'. 2004-01-25 Sun Jari Aalto * mywebget-emacs.conf: 1.27 (monnier-stefan): Removed /rum.cs.yale.edu FTP URL link. Not accessible. 2004-01-24 Sat Jari Aalto * mywebget-emacs.conf: 1.27 (elisp-other): Removed NTEmacs faq downloads epop and gnuserv. They are obsolete. Added clearcase.el download. (schroeder-alex) Wrong regexp-no, didn't filter out sql.el which is included in Emacs. Fixed. (pearson-dave): URL updated. 2003-09-01 Mon Jari Aalto [RELEASED 2003.0901 to sourceforge] * pwget.pl: 1.60 Use !/bin/perl, not !/usr/local/bin/perl (HandleCommandLineArgs): 1.60 --help-html and --help-man are now --Help-html and --Help-man (Help): 1.60 Exit 0, not 1. Needed for Makefile. 2003-08-11 Mon Jari Aalto * mywebget-emacs.conf: 1.26 (php-mode): Added sourceforge project. It does not use CVS. 2003-08-10 Sun Jari Aalto * mywebget-emacs.conf: 1.26 (zenirc): New 2003-08-09 Jari Aalto * mywebget-emacs.conf: (breton-peter): Added tag3 'pbreton' (oconnor-edward): New. (linkov-juri): New. 2003-08-08 Jari Aalto Copyright statement year updated to all files. * mywebget-emacs.conf: (goel-deepak): added additional tag3 'deego'. (schroeder-alex): additional tag3 'kensanata' 2003-08-04 Mon Jari Aalto * pwget.pl (Getopt::Long): 1.59 Corrected --verbose and --debug option to accept no arguments and still activating the option. 2003-08-01 Fri Jari Aalto * mywebget-emacs.conf: 1.22 (mccrossan-fraser): New. 2003-07-03 Thu Jari Aalto * mywebget-emacs.conf: 1.21 (sepulveda-rafael): New. (kapur-nevin): regexp-no:gnus-grepmail|msn.el, too old. 2003-06-26 Thu Jari Aalto * mywebget-emacs.conf: 1.21 do not download `bibfind', It's not Kyle's. (bini-michele): Do not download diff.el (ponce-david): commented out, the files are at sourceforge `emacshacks'. (grigni-michelangelo): Added noregexp `ff-path'. It's a file from galbraith-peter. (lopez-emilio): regexp-no `prosper'. It's Phillip Lord's (jump-theodore): regexp-no `prosper|nnir (zundel-detlev): rpm.el conflicts with cvs-packages/sourceforge/cedet/speedbar/rpm.el. Renamed to rpm2.el (rush-david): regexp-no:surl 2003-06-17 Tue Jari Aalto * pwget.pl (sub UrlHttpDownload): 1.56 Added debug calls. (LWP::UserAgent;): 1.56 Incorrectly passed $file if was given. Now respect @list with . That it, it dind't download the test file because the savefilename as wrong. (sub ConfigRead): 1.56 Changed few calls from $debug to $debug > 1 and $debu > 2 to reduce debug display in lower settings. 2003-06-10 Tue Jari Aalto * mywebget-emacs.conf: 1.19 (englen-stephen): Added ell.el download. Was at section theberge-jean. (theberge-jean): updated hachette.el according Jean-Philippe's recent comments * pwget.pl (UrlHttp): Parameter passing error in callt o `UrlHttpSearchNewest'. Fixed. (UrlHttpSearchNewest): Added $ua `die' check. (UrlHttpSearchPage): Added $ua `die' check. 2003-06-08 Sun Jari Aalto * mywebget-emacs.conf: 1.18 (akimichi-tatsukawa): New. EmacsWiki download (hodgson-kahlil): New. EmacsWiki download (alcorn-doug):New. EmacsWiki download (grossjohan-kai): Adde dlonglines.el from EmacsWiki (lang-mario): New. EmacsWiki download (matsushita-akihisa): New. EmacsWiki download (hodges-matthew): New. EmacsWiki download (bini-michele): New. EmacsWiki download (scholz-oliver): New. EmacsWiki download (anderson-patrick): New. EmacsWiki download (josefsson-simon): New link to AES. Rijndael implementation in Emacs Lisp (oconor-edward); O'Connor. New. EmacsWiki download (corcoran-travis): New. EmacsWiki download (koomen-hans): New. EmacsWiki download (zajcev-evgeny): New. EmacsWiki download * pwget.pl (Help): -- NEW FEATURE: Download according to content match. Added option --Regexp-content -- Massive code logic rewrite of function `UrlHttpFile'. (FileContentAnalyze): New. (UrlHttpFileCheck): New. Excerpted from `UrlHttpFile' (UrlHttpSearchNewest): New. Excerpted from `UrlHttpFile' (UrlHttpSearchPage): New. Excerpted from `UrlHttpFile' (Help): 1.52 Added directive documentation. (Main): Code logic fixes. Separate my-definitions moved to the point of usage. 2003-06-07 Sat Jari Aalto * pwget.pl (UrlHttp): 1.52 Added filtering out duplicate files in FILE LIST with temporary hash. (UrlHttpParseHref): Added support for HTML tag BASE. 2003-06-06 Fri Jari Aalto * mywebget-emacs.conf: 1.15 (gorrell-harley): footnote.el clashes with Emacs and XEmacs footnote.el. Is now saved as jhg-footnotee.el (tramp): Update print: to direct people to GNU savannah project. * pwget.pl (UrlHttp): 1.51 Incorrect test for if-statement in `Clearing FILE:'. Was @list > 0, is now @list > 1. This bug caused save: directive never to take place. 2003-06-04 Wed Jari Aalto * mywebget-emacs.conf: 1.15 -- All `elisp-users' URLs checked. (belanger-jay): Fixed changed link (berndl-klaus): URL to cygwin-mount.el updated. (blaak-ray): Uncommented all and added 'print' to say that delphi.el is included in latest Emacs. (breton-peter): Disabled invalid homepage ULR (davidson-kevin): Disabled invalid homepage ULR (galbraith-peter): rule updated, do not download word-help.el, it is not Peter's (goel-deepak): Updated all URLs (kemp-steve): URLs No longer available. Commented out. 2003-06-03 Tue Jari Aalto * mywebget-emacs.conf: 1.14 (gorrell-harley). Harley Gorrell sent url update OLD: http://www.hgsc.bcm.tmc.edu/~harley/elisp/ NEW: http://www.mahalito.net/~harley/elisp/ 2003-05-23 Fri Jari Aalto * mywebget-emacs.conf: 1.14 (ponce-david): Added print statement that the the files are available at SF project 'emhacks' 2003-05-18 Jari Aalto * mywebget-emacs.conf: (wright-francis) Commented out downloading package woman.el. Included in Emacs. 2003-02-08 Sat Jari Aalto * pwget.pl (sub Main): 1.50 Added https support. Needs to load module Crypt::SSLeay dynamically. 2002-12-22 Jari Aalto * pwget.pl (FileExists): Added check for complex URLs download.php?file=this.tar.gz => file=this.tar.gz * pwget.pl (FileNameFix): Smarter filename fix code. 2002-12-13 Jari Aalto * mywebget-emacs.conf (jde): Changed jde.sunsite.dk => jdee.sunsite.dk. Changed jde-beta.zip => jde-latest.zip. 2002-12-11 Jari Aalto * pwget.pl (UrlHttp): Added extra check for `not $new'. must not clear the $file variable model. 2002-08-31 Sat Jari Aalto * admin.bashrc (function sfperlwebget_ask ()): 1.3 New. (function sfperlwebget_release_check ()): 1.3 New. (function sfperlwebget_release ()): 1.3 Call `sfperlwebget_release_check' (function sfperlwebgetdoc ()): 1.3 Generate mywebget.1 and not pwget.pl.1 unix manual page. 2002-08-29 Thu Jari Aalto mywebget-emacs.conf: (rodgers-kevin): Added `print' commands to direct people to search igrep.el wtih google. The gnu.emacs.sources carries the latest version. 2002-08-22 Thu Jari Aalto * mywebget-emacs.conf: 1.11 Massive cleanup. Run Emacs tinypath.el tinypath-cache-problem-report to find offending packages. (wright-francis): woman.el is in Emacs 21.2 (vaidheeswarran-rajesh): whitespace is in Emacs 21.2 (jde-contrib): Ignore jsee - see `ponce-david'. Ognore jserial - see `lord-philip'. (ponce-david): jjar.zip moved to net/packages directory. (belanger-jay): ignore httpd.el (zimmermann-reto): vhdl commented. In Emacs. (Foreword): Added new topic about Emacs and tinypath.el (guess-lang): Removed. See tag `drieu-benjamin' (schwenke-martin): Removed mms.tgz due to conflicting packages. todo-mode is in Emacs 21.2 (blaak-ray): regexp-no:delphi.el (breton-peter): Ignored find-lisp, generic, locate, net-utils; in Emacs 21.2 (wiegley-john): elign.el, timeclock are in Emacs 21.2. Ignore httpd.el - see `marsden-eric' (belanger-jay): Ignore httpd.el, in `marsden-eric' (shulman-michael): Ignore fortune.el, in Emacs 21.2 (antlr-mode): Removed. In Emacs 21.2 (schroeder-alex): regexp-no:ansi-color, in Emacs 21.2 (dampel-herbert): regexp-no:battery, info-look; in Emacs 21.2 (shinn-alex): regexp-no:battery, in Emacs 21.2. Ignore lynx - see `sebold-charles' (jones-kyle): ignore nnir - see `grossjohan-kai' (bbdb-expire): Removed. Included in BBDB. (sylvester-olaf): regexp-no:bs\.e, in Emacs 21.2 (barzilay-eli): regexp-no:calculator, in Emacs 21.2 (cperl): Removed CPAN load, Emacs 21.2 ships newer. (monnier-stefan): exclude diff-mode.el, newcomment, in Emacs 21.2 (eshell): Removed. Included in Emacs 21.2 (ttn): Removed ttn-pers-elisp tar.gz because it includes too many files that alreay come in other packages, like eval-expr.el (galbraith-peter): Ignore ffap and ff-paths, in Emacs 21.2 (hirose-yuuji): Ignore id3.el (idlwave): Removed, included in latest Emacs. 2002-08-14 ke Jari Aalto * mywebget-emacs-vars.conf: 1.2 Fixes. 2002-08-14 Jari Aalto * mywebget-emacs.conf: 1.10 Documentation cleaned in comments. * mywebget-emacs-vars.conf: 1.2 Documentation cleaned in comments. 2002-08-13 Jari Aalto * mywebget-emacs.conf: (waider-ronan): New. ($EUSR/friedman-noah): Emacs 21.2 already has packages whitespace, type-break. Added to no-regexp. 2002-08-12 Jari Aalto * mywebget-emacs.conf: (barzilay-eli): Fixed URL. (curtin-matt): New. Added (pedersen-jesper): URL updated. (waldman-charles): New. (wiborg-espen): New. (zimmermann-reto): New. * pwget.pl (Main): Ignore look-a-like words in print: commands. like if you suggest connecting to a "cvs -d :pserver:..." that looked like a directive , when it wasn't. * mywebget-emacs.conf: (lord-philip): URL Updated. 2002-08-06 Jari Aalto * mywebget-emacs.conf: (galbraith-peter): Debian URL renowed. Temporary problem as Peter explained in email. Removed old site ftp://ftp.phys.ocean.dal.ca/users/rhogee/elisp/ which no longer contains Peter's files. 2002-08-04 Sun Jari Aalto * admin.bashrc (function sfperlwebgetdoc ()): New. (function sfperlwebgetcmd ()): New. * pwget.pl (Help): Added directive. (UrlHttp): Correcte dbug where @LIST was ('') triggering `multiple file noticed' -check. This set directive save: to empty and no fiel was saved anywhere. * mywebget-emacs.conf: (antlr-mode): New. (cparse): Removed. This is replaced by semantic.el (ede): Removed. Replaced by CEDET sourceforge project. (eieio): Removed. Replaced by CEDET sourceforge project. (semantic): Removed. Replaced by CEDET sourceforge project. (speedbar): Removed. Replaced by CEDET sourceforge project. (x-pkg): This file does not exist any more. Removed ftp://ftp.ultranet.com/pub/zappo/public_html/download/X-0.1a.tar.gz (gnus-junk): Site does not exist. Removed. http://stud2.tuwien.ac.at/~e9426626/gnus-junk.el (breton-tom): Removed. URL invalid. http://world.std.com/~tob/resume.html (drieu-benjamin): URL updated. New module guess-lang.el. Ignore pong.el, in Emacs 21.2 (pearson-dave): Ignore quickurl, 5x5; in Emacs 21.2 (fouts-martin): URL invalid. Removed ftp://ftp.fogey.com/fouts/elisp/ (galbraith-peter): URL invalid. Removed http://people.debian.org/~psg/elisp/ (wiborg-espen): Removed. No known addresses. (lauri-gian): Removed. No known addresses: visual-basic-mode.el.gz (lord-philip): Removed. No known addresses. http://bioinf.man.ac.uk/~lord/applications/emacs/emacs-packages.html (jde-contrib): NEW. (marsden-eric): URL updated. (moody-ray): removed. No known addresses: rmime.el (perry-william): URLs deactivated, not valid. email sent. (riebel-rob): Removed. Files are include din Emacs: tpu-edt and sql-mode. (socha-robin): removed. No known addresses. (theberge-jean): Removed multiple HTTP calls with only one. Made regexp more smarter to find the files. (tziperman-eli): use to find .el file instead of direct link. (urban-reini): removed. No known addresses. (vaidheeswarran-rajesh): P4 project moved to sourceforge. 2002-08-03 Sat Jari Aalto * pwget.pl (LatestVersion): Support CDex packaging numbers: cdex_150b6_enu.exe. Didn't recognize `b6' ending. Rearranged debugging. The -d does not turn on full debug any more, but -d 10 does. (Main): Expand URL variables too. Now you can say $HTTP_URL/directory/file.html * mywebget-emacs.conf: TAG html-helper is no longer active. The site has disappeared. (breton-peter): URL corrected. (minar-nelson): Added downloading html-helper.mode (idlwave): Site has moved. (template): Site moved to sourceforge project 'emacs-template' but code is not in CVS. (x-symbol): Site moved to sourceforge project 'x-symbol' (w3): Removed. latest code is in GNU savannah CVS. 2002-07-22 Jari Aalto * pwget.pl (UrlHttp): Filter out FRAGMENTs that are not part of the file names. This fixes bug, where you retrieved URLS from page using -R (regexp) iption. http://localhost/index.html#section1 => http://localhost/index.html 2002-07-12 Jari Aalto * mywebget-emacs.conf: (hirose-yuuji): URLs updated. (theberge-jean): All URLs fixed. (yavner-jonathan): All URLs fixed. (dirson-yann): All URLs fixed. 2002-07-06 Jari Aalto * mywebget-emacs.conf: (galbraith-peter) Added DEbian development packages. (ramakrishnan) New member added. 2002-02-13 Wed Jari Aalto * mywebget-emacs.conf: (zundel-detlev) URL address updated. 2002-01-19 Sat Jari Aalto * mywebget-emacs.conf: (walters-colin): Added checkout for `ibuf-macs.el' hat is required by latest ibuffer. (jde-docindex): New. by Kevin Burton. 2002-01-14 Mon Jari Aalto * pwget.pl: (sub FileExists): 1.41 New. If you download a file.txt.gz and instruct to extract it, it will become file.txt. We can't therefore check the existence of file.txt.gz, but file.txt as well. (sub Main): Send `unpack' information to HTTP and FTP handlers. (Getopt::Long): set $verb to 10 if DEBUG on. (sub FileSimpleCompressed): 1.41 New. (sub UrlHttp): Better overwrite checking with FileExists() and FileSimpleCompressed(). (Net::FTP->new): Better overwrite checking with FileExists() and FileSimpleCompressed(). 2002-01-12 Sat Jari Aalto * mywebget-emacs.conf: (volker-franz): New files. (yamaoka-katsumi): New files and packages. (mmm-mode): Removed. project is now at sourceforge under the same name. * pwget.pl (Filter): Added Pre-Filter Which removes unwanted files before the LatestVersion is called. (Getopt::Long): Added --chdir option. 2002-01-11 Fri Jari Aalto * mywebget-emacs.conf: (eudc): Rmeoved. Available from sourceforge. 2002-01-04 Fri Jari Aalto * pwget.pl (Initialize): Change incorrect environment variable MYMYWEBGET_PL_CFG into MYWEBGET_PL_CFG pwget-2016.1019+git75c6e3e/INSTALL000066400000000000000000000010211300167571300157720ustar00rootroot00000000000000INSTALL: Pwget - A Perl derivation of wget ------------------------------------------ System wide install Run makefile with appropriate parameters. The program is installed without any file extension. An example: make [-n] DESTDIR= prefix=/usr/local install Manual install: 1. Copy bin/*.pl somewhere along $PATH 2. Copy bin/*.1 somewhere along $MANPATH 3. Look into doc/examples/ and apply ideas how to write configuration files (optional; not required) End of file pwget-2016.1019+git75c6e3e/Makefile000066400000000000000000000114271300167571300164140ustar00rootroot00000000000000#!/usr/bin/make -f # # Copyright information # # Copyright (C) 2002-2016 Jari Aalto # # License # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . ifneq (,) This makefile requires GNU Make. endif PACKAGE = pwget DESTDIR = prefix = /usr exec_prefix = $(prefix) man_prefix = $(prefix)/share mandir = $(man_prefix)/man bindir = $(exec_prefix)/bin sharedir = $(prefix)/share BINDIR = $(DESTDIR)$(bindir) DOCDIR = $(DESTDIR)$(sharedir)/doc/$(PACKAGE) LOCALEDIR = $(DESTDIR)$(sharedir)/locale SHAREDIR = $(DESTDIR)$(sharedir)/$(PACKAGE) LIBDIR = $(DESTDIR)$(prefix)/lib/$(PACKAGE) SBINDIR = $(DESTDIR)$(exec_prefix)/sbin ETCDIR = $(DESTDIR)/etc/$(PACKAGE) # 1 = regular, 5 = conf, 6 = games, 8 = daemons MANDIR = $(DESTDIR)$(mandir) MANDIR1 = $(MANDIR)/man1 MANDIR5 = $(MANDIR)/man5 MANDIR6 = $(MANDIR)/man6 MANDIR8 = $(MANDIR)/man8 TAR = tar TAR_OPT_NO = --exclude='.build' \ --exclude='.sinst' \ --exclude='.inst' \ --exclude='tmp' \ --exclude='*.bak' \ --exclude='*[~\#]' \ --exclude='.\#*' \ --exclude='CVS' \ --exclude='.svn' \ --exclude='.git' \ --exclude='.bzr' \ --exclude='*.tar*' \ --exclude='*.tgz' INSTALL = /usr/bin/install INSTALL_BIN = $(INSTALL) -m 755 INSTALL_DATA = $(INSTALL) -m 644 INSTALL_SUID = $(INSTALL) -m 4755 DIST_DIR = ../build-area DATE = `date +"%Y.%m%d"` VERSION = $(DATE) RELEASE = $(PACKAGE)-$(VERSION) PERL = perl BIN = $(PACKAGE) PL_SCRIPT = bin/$(BIN).pl INSTALL_OBJS_BIN = $(PL_SCRIPT) INSTALL_OBJS_DOC = README COPYING INSTALL_OBJS_MAN = bin/*.1 all: @echo "Nothing to compile." @echo "Try 'make help' or 'make -n DESTDIR= prefix=/usr/local install'" # Rule: help - display Makefile rules help: grep "^# Rule:" Makefile | sort # Rule: clean - remove temporary files clean: # clean -rm -f *[#~] *.\#* \ *.x~~ pod*.tmp rm -rf tmp distclean: clean realclean: clean dist-git: doc test rm -f $(DIST_DIR)/$(RELEASE)* git archive --format=tar --prefix=$(RELEASE)/ master | \ gzip --best > $(DIST_DIR)/$(RELEASE).tar.gz chmod 644 $(DIST_DIR)/$(RELEASE).tar.gz tar -tvf $(DIST_DIR)/$(RELEASE).tar.gz | sort -k 5 ls -la $(DIST_DIR)/$(RELEASE).tar.gz # The "gt" is maintainer's program frontend to Git # Rule: dist-snap - [maintainer] release snapshot from Git repository dist-snap: doc test @echo gt tar -q -z -p $(PACKAGE) -c -D master # Rule: dist - [maintainer] release from Git repository dist: dist-git dist-ls: @ls -1tr $(DIST_DIR)/$(PACKAGE)* # Rule: dist - [maintainer] list of release files ls: dist-ls bin/$(PACKAGE).1: $(PL_SCRIPT) $(PERL) $< --help-man > $@ @-rm -f *.x~~ pod*.tmp doc/manual/index.html: $(PL_SCRIPT) $(PERL) $< --help-html > $@ @-rm -f *.x~~ pod*.tmp doc/manual/index.txt: $(PL_SCRIPT) $(PERL) $< --help > $@ @-rm -f *.x~~ pod*.tmp doc/conversion/index.html: doc/conversion/index.txt perl -S t2html.pl --Auto-detect --Out --print-url $< # Rule: man - Generate or update manual page man: bin/$(PACKAGE).1 html: doc/manual/index.html txt: doc/manual/index.txt # Rule: doc - Generate or update all documentation doc: man html txt # Rule: perl-test - Check program syntax perl-test: # perl-test - Check syntax perl -cw $(PL_SCRIPT) podchecker $(PL_SCRIPT) # Rule: test - Run tests test: perl-test install-doc: # Rule install-doc - Install documentation $(INSTALL_BIN) -d $(DOCDIR) [ ! "$(INSTALL_OBJS_DOC)" ] || \ $(INSTALL_DATA) $(INSTALL_OBJS_DOC) $(DOCDIR) $(TAR) -C doc $(TAR_OPT_NO) --create --file=- . | \ $(TAR) -C $(DOCDIR) --extract --file=- install-man: man # install-man - Install manual pages $(INSTALL_BIN) -d $(MANDIR1) $(INSTALL_DATA) $(INSTALL_OBJS_MAN) $(MANDIR1) install-bin: # install-bin - Install programs $(INSTALL_BIN) -d $(BINDIR) for f in $(INSTALL_OBJS_BIN); \ do \ dest=$$(basename $$f | sed -e 's/\.pl$$//' -e 's/\.py$$//' ); \ $(INSTALL_BIN) $$f $(BINDIR)/$$dest; \ done # Rule: install - Standard install install: install-bin install-man install-doc # Rule: install-test - for Maintainer only install-test: rm -rf tmp make DESTDIR=`pwd`/tmp prefix=/usr install find tmp | sort .PHONY: clean distclean realclean .PHONY: install install-bin install-man .PHONY: all man doc test install-test perl-test .PHONY: dist dist-git dist-ls ls # End of file pwget-2016.1019+git75c6e3e/README000066400000000000000000000026111300167571300156270ustar00rootroot00000000000000README: Pwget - A Perl derivation of wget ----------------------------------------- Pwget is similar to wget[1] but it can use categorized configuration files, analyze Web pages, and "search" for download links as instructed. Instead of absolute links, it contains heuristics to track newer versions of files. The source package directories: bin/ The program and system manual page (*.1) doc/ Documentation Important files COPYING GPL v2 or later Licence INSTALL Install instructions bin/ChangeLog Project change records Project details Homepage http://freecode.com/projects/perlwebget To report bugs See freecode page Source code repository See freecode page Depends Perl 5.10+ Standard Perl libraries: Net::FTP [Debian package: perl-modules] LWP::UserAgent [Debian package: libwww-perl] wget(1) for Sourceforge downloads [1]. t2html for generating documentation [2] from scratch (optional). References [1] http://www.gnu.org/software/wget [2] http://freecode.com/projects/perl-text2html Copyright Copyright (C) 1996-2016 Jari Aalto License This program is free software; you can redistribute and/or modify program under the terms of GNU General Public license either version 2 of the License, or (at your option) any later version. End of file pwget-2016.1019+git75c6e3e/bin/000077500000000000000000000000001300167571300155175ustar00rootroot00000000000000pwget-2016.1019+git75c6e3e/bin/pwget.1000066400000000000000000001222121300167571300167270ustar00rootroot00000000000000.\" Automatically generated by Pod::Man 2.28 (Pod::Simple 3.32) .\" .\" Standard preamble: .\" ======================================================================== .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. \*(C+ will .\" give a nicer C++. Capital omega is used to do unbreakable dashes and .\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, .\" nothing in troff, for use with C<>. .tr \(*W- .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` "" . ds C' "" 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' . ds C` . ds C' 'br\} .\" .\" Escape single quotes in literal strings from groff's Unicode transform. .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" .\" If the F register is turned on, we'll generate index entries on stderr for .\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index .\" entries marked with X<> in POD. Of course, you'll have to process the .\" output yourself in some meaningful fashion. .\" .\" Avoid warning from groff about undefined register 'F'. .de IX .. .nr rF 0 .if \n(.g .if rF .nr rF 1 .if (\n(rF:(\n(.g==0)) \{ . if \nF \{ . de IX . tm Index:\\$1\t\\n%\t"\\$2" .. . if !\nF==2 \{ . nr % 0 . nr F 2 . \} . \} .\} .rr rF .\" .\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2). .\" Fear. Run. Save yourself. No user-serviceable parts. . \" fudge factors for nroff and troff .if n \{\ . ds #H 0 . ds #V .8m . ds #F .3m . ds #[ \f1 . ds #] \fP .\} .if t \{\ . ds #H ((1u-(\\\\n(.fu%2u))*.13m) . ds #V .6m . ds #F 0 . ds #[ \& . ds #] \& .\} . \" simple accents for nroff and troff .if n \{\ . ds ' \& . ds ` \& . ds ^ \& . ds , \& . ds ~ ~ . ds / .\} .if t \{\ . ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u" . ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u' . ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u' . ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u' . ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u' . ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u' .\} . \" troff and (daisy-wheel) nroff accents .ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V' .ds 8 \h'\*(#H'\(*b\h'-\*(#H' .ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#] .ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H' .ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u' .ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#] .ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#] .ds ae a\h'-(\w'a'u*4/10)'e .ds Ae A\h'-(\w'A'u*4/10)'E . \" corrections for vroff .if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u' .if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u' . \" for low resolution devices (crt and lpr) .if \n(.H>23 .if \n(.V>19 \ \{\ . ds : e . ds 8 ss . ds o a . ds d- d\h'-1'\(ga . ds D- D\h'-1'\(hy . ds th \o'bp' . ds Th \o'LP' . ds ae ae . ds Ae AE .\} .rm #[ #] #H #V #F C .\" ======================================================================== .\" .IX Title "PWGET 1" .TH PWGET 1 "2016-10-19" "perl v5.22.2" "Perl pwget URL fetch utility" .\" For nroff, turn off justification. Always turn off hyphenation; it makes .\" way too many mistakes in technical documents. .if n .ad l .nh .SH "NAME" pwget \- Perl Web URL fetch program .SH "SYNOPSIS" .IX Header "SYNOPSIS" .Vb 5 \& pwget http://example.com/ [URL ...] \& pwget \-\-config $HOME/config/pwget.conf \-\-tag linux \-\-tag emacs .. \& pwget \-\-verbose \-\-overwrite http://example.com/ \& pwget \-\-verbose \-\-overwrite \-\-Output ~/dir/ http://example.com/ \& pwget \-\-new \-\-overwrite http://example.com/package\-1.1.tar.gz .Ve .SH "DESCRIPTION" .IX Header "DESCRIPTION" Automate periodic downloads of files and packages. .PP If you retrieve latest versions of certain program blocks periodically, this is the Perl script for you. Run from cron job or once a week to upload newest versions of files around the net. Note: .SS "Wget and this program" .IX Subsection "Wget and this program" At this point you may wonder, where would you need this perl program when \fIwget\fR\|(1) C\-program has been the standard for ages. Well, 1) Perl is cross platform and more easily extendable 2) You can record file download criteria to a configuration file and use perl regular epxressions to select downloads 3) the program can anlyze web-pages and \*(L"search\*(R" for the download only links as instructed 4) last but not least, it can track newest packages whose name has changed since last downlaod. There are heuristics to determine the newest file or package according to file name skeleton defined in configuration. .PP This program does not replace \fIpwget\fR\|(1) because it does not offer as many options as wget, like recursive downloads and date comparing. Use wget for ad hoc downloads and this utility for files that change (new releases of archives) or which you monitor periodically. .SS "Short introduction" .IX Subsection "Short introduction" This small utility makes it possible to keep a list of URLs in a configuration file and periodically retrieve those pages or files with simple commands. This utility is best suited for small batch jobs to download e.g. most recent versions of software files. If you use an \s-1URL\s0 that is already on disk, be sure to supply option \fB\-\-overwrite\fR to allow overwriting existing files. .PP While you can run this program from command line to retrieve individual files, program has been designed to use separate configuration file via \&\fB\-\-config\fR option. In the configuration file you can control the downloading with separate directives like \f(CW\*(C`save:\*(C'\fR which tells to save the file under different name. The simplest way to retrieve the latest version of apackage from a \s-1FTP\s0 site is: .PP .Vb 2 \& pwget \-\-new \-\-overwite \-\-verbose \e \& http://www.example.com/package\-1.00.tar.gz .Ve .PP Do not worry about the filename \f(CW\*(C`package\-1.00.tar.gz\*(C'\fR. The latest version, say, \f(CW\*(C`package\-3.08.tar.gz\*(C'\fR will be retrieved. The option \&\fB\-\-new\fR instructs to find newer version than the provided \s-1URL.\s0 .PP If the \s-1URL\s0 ends to slash, then directory list at the remote machine is stored to file: .PP .Vb 1 \& !path!000root\-file .Ve .PP The content of this file can be either index.html or the directory listing depending on the used http or ftp protocol. .SH "OPTIONS" .IX Header "OPTIONS" .IP "\fB\-A, \-\-regexp\-content \s-1REGEXP\s0\fR" 4 .IX Item "-A, --regexp-content REGEXP" Analyze the content of the file and match \s-1REGEXP.\s0 Only if the regexp matches the file content, then download file. This option will make downloads slow, because the file is read into memory as a single line and then a match is searched against the content. .Sp For example to download Emacs lisp file (.el) written by Mr. Foo in case insensitive manner: .Sp .Vb 2 \& pwget \-v \-r \*(Aq\e.el$\*(Aq \-A "(?i)Author: Mr. Foo" \e \& http://www.emacswiki.org/elisp/index.html .Ve .IP "\fB\-C, \-\-create\-paths\fR" 4 .IX Item "-C, --create-paths" Create paths that do not exist in \f(CW\*(C`lcd:\*(C'\fR directives. .Sp By default, any \s-1LCD\s0 directive to non-existing directory will interrupt program. With this option, local directories are created as needed making it possible to re-create the exact structure as it is in configuration file. .IP "\fB\-c, \-\-config \s-1FILE\s0\fR" 4 .IX Item "-c, --config FILE" This option can be given multiple times. All configurations are read. .Sp Read URLs from configuration file. If no configuration file is given, file pointed by environment variable is read. See \s-1ENVIRONMENT.\s0 .Sp The configuration file layout is envlained in section \s-1CONFIGURATION FILE\s0 .IP "\fB\-\-chdir \s-1DIRECTORY\s0\fR" 4 .IX Item "--chdir DIRECTORY" Do a \fIchdir()\fR to \s-1DIRECTORY\s0 before any \s-1URL\s0 download starts. This is like doing: .Sp .Vb 2 \& cd DIRECTORY \& pwget http://example.com/index.html .Ve .IP "\fB\-d, \-\-debug [\s-1LEVEL\s0]\fR" 4 .IX Item "-d, --debug [LEVEL]" Turn on debug with positive \s-1LEVEL\s0 number. Zero means no debug. This option turns on \fB\-\-verbose\fR too. .IP "\fB\-e, \-\-extract\fR" 4 .IX Item "-e, --extract" Unpack any files after retrieving them. The command to unpack typical archive files are defined in a program. Make sure these programs are along path. Win32 users are encouraged to install the Cygwin utilities where these programs come standard. Refer to section \s-1SEE ALSO.\s0 .Sp .Vb 6 \& .tar => tar \& .tgz => tar + gzip \& .gz => gzip \& .bz2 => bzip2 \& .xz => xz \& .zip => unzip .Ve .IP "\fB\-F, \-\-firewall \s-1FIREWALL\s0\fR" 4 .IX Item "-F, --firewall FIREWALL" Use \s-1FIREWALL\s0 when accessing files via ftp:// protocol. .IP "\fB\-h, \-\-help\fR" 4 .IX Item "-h, --help" Print help page in text. .IP "\fB\-\-help\-html\fR" 4 .IX Item "--help-html" Print help page in \s-1HTML.\s0 .IP "\fB\-\-help\-man\fR" 4 .IX Item "--help-man" Print help page in Unix manual page format. You want to feed this output to c in order to read it. .Sp Print help page. .IP "\fB\-m, \-\-mirror \s-1SITE\s0\fR" 4 .IX Item "-m, --mirror SITE" If \s-1URL\s0 points to Sourcefoge download area, use mirror \s-1SITE\s0 for downloading. Alternatively the full full \s-1URL\s0 can include the mirror information. And example: .Sp .Vb 1 \& \-\-mirror kent http://downloads.sourceforge.net/foo/foo\-1.0.0.tar.gz .Ve .IP "\fB\-n, \-\-new\fR" 4 .IX Item "-n, --new" Get newest file. This applies to datafiles, which do not have extension \&.asp or .html. When new releases are announced, the version number in filename usually tells which is the current one so getting hardcoded file with: .Sp .Vb 1 \& pwget \-o \-v http://example.com/dir/program\-1.3.tar.gz .Ve .Sp is not usually practical from automation point of view. Adding \&\fB\-\-new\fR option to the command line causes double pass: a) the whole http://example.com/dir/ is examined for all files and b) files matching approximately filename program\-1.3.tar.gz are examined, heuristically sorted and file with latest version number is retrieved. .IP "\fB\-\-no\-lcd\fR" 4 .IX Item "--no-lcd" Ignore \f(CW\*(C`lcd:\*(C'\fR directives in configuration file. .Sp In the configuration file, any \f(CW\*(C`lcd:\*(C'\fR directives are obeyed as they are seen. But if you do want to retrieve \s-1URL\s0 to your current directory, be sure to supply this option. Otherwise the file will end to the directory pointer by \f(CW\*(C`lcd:\*(C'\fR. .IP "\fB\-\-no\-save\fR" 4 .IX Item "--no-save" Ignore \f(CW\*(C`save:\*(C'\fR directives in configuration file. If the URLs have \&\f(CW\*(C`save:\*(C'\fR options, they are ignored during fetch. You usually want to combine \fB\-\-no\-lcd\fR with \fB\-\-no\-save\fR .IP "\fB\-\-no\-extract\fR" 4 .IX Item "--no-extract" Ignore \f(CW\*(C`x:\*(C'\fR directives in configuration file. .IP "\fB\-O, \-\-output \s-1DIR\s0\fR" 4 .IX Item "-O, --output DIR" Before retrieving any files, chdir to \s-1DIR.\s0 .IP "\fB\-o, \-\-overwrite\fR" 4 .IX Item "-o, --overwrite" Allow overwriting existing files when retrieving URLs. Combine this with \fB\-\-skip\-version\fR if you periodically update files. .IP "\fB\-\-proxy \s-1PROXY\s0\fR" 4 .IX Item "--proxy PROXY" Use \s-1PROXY\s0 server for \s-1HTTP. \s0(See \fB\-\-Firewall\fR for \s-1FTP.\s0). The port number is optional in the call: .Sp .Vb 2 \& \-\-proxy http://example.com.proxy.com \& \-\-proxy example.com.proxy.com:8080 .Ve .IP "\fB\-p, \-\-prefix \s-1PREFIX\s0\fR" 4 .IX Item "-p, --prefix PREFIX" Add \s-1PREFIX\s0 to all retrieved files. .IP "\fB\-P, \-\-postfix \s-1POSTFIX \s0\fR" 4 .IX Item "-P, --postfix POSTFIX " Add \s-1POSTFIX\s0 to all retrieved files. .IP "\fB\-D, \-\-prefix\-date\fR" 4 .IX Item "-D, --prefix-date" Add iso8601 \*(L":YYYY\-MM\-DD\*(R" prefix to all retrieved files. This is added before possible \fB\-\-prefix\-www\fR or \fB\-\-prefix\fR. .IP "\fB\-W, \-\-prefix\-www\fR" 4 .IX Item "-W, --prefix-www" Usually the files are stored with the same name as in the \s-1URL\s0 dir, but if you retrieve files that have identical names you can store each page separately so that the file name is prefixed by the site name. .Sp .Vb 2 \& http://example.com/page.html \-\-> example.com::page.html \& http://example2.com/page.html \-\-> example2.com::page.html .Ve .IP "\fB\-r, \-\-regexp \s-1REGEXP\s0\fR" 4 .IX Item "-r, --regexp REGEXP" Retrieve file matching at the destination \s-1URL\s0 site. This is like \*(L"Connect to the \s-1URL\s0 and get all files matching \s-1REGEXP\*(R".\s0 Here all gzip compressed files are found form \s-1HTTP\s0 server directory: .Sp .Vb 1 \& pwget \-v \-r "\e.gz" http://example.com/archive/ .Ve .Sp Caveat: currently works only for http:// URLs. .IP "\fB\-R, \-\-config\-regexp \s-1REGEXP\s0\fR" 4 .IX Item "-R, --config-regexp REGEXP" Retrieve URLs matching \s-1REGEXP\s0 from configuration file. This cancels \&\fB\-\-tag\fR options in the command line. .IP "\fB\-s, \-\-selftest\fR" 4 .IX Item "-s, --selftest" Run some internal tests. For maintainer or developer only. .IP "\fB\-\-sleep \s-1SECONDS\s0\fR" 4 .IX Item "--sleep SECONDS" Sleep \s-1SECONDS\s0 before next \s-1URL\s0 request. When using regexp based downlaods that may return many hits, some sites disallow successive requests in within short period of time. This options makes program sleep for number of \s-1SECONDS\s0 between retrievals to overcome 'Service unavailable'. .IP "\fB\-\-stdout\fR" 4 .IX Item "--stdout" Retrieve \s-1URL\s0 and write to stdout. .IP "\fB\-\-skip\-version\fR" 4 .IX Item "--skip-version" Do not download files that have version number and which already exists on disk. Suppose you have these files and you use option \fB\-\-skip\-version\fR: .Sp .Vb 2 \& package.tar.gz \& file\-1.1.tar.gz .Ve .Sp Only file.txt is retrieved, because file\-1.1.tar.gz contains version number and the file has not changed since last retrieval. The idea is, that in every release the number in in distribution increases, but there may be distributions which do not contain version number. In regular intervals you may want to load those packages again, but skip versioned files. In short: This option does not make much sense without additional option \fB\-\-new\fR .Sp If you want to reload versioned file again, add option \fB\-\-overwrite\fR. .IP "\fB\-t, \-\-test, \-\-dry\-run\fR" 4 .IX Item "-t, --test, --dry-run" Run in test mode. .IP "\fB\-T, \-\-tag \s-1NAME\s0 [\s-1NAME\s0] ...\fR" 4 .IX Item "-T, --tag NAME [NAME] ..." Search tag \s-1NAME\s0 from the config file and download only entries defined under that tag. Refer to \fB\-\-config \s-1FILE\s0\fR option description. You can give Multiple \fB\-\-tag\fR switches. Combining this option with \fB\-\-regexp\fR does not make sense and the concequencies are undefined. .IP "\fB\-v, \-\-verbose [\s-1NUMBER\s0]\fR" 4 .IX Item "-v, --verbose [NUMBER]" Print verbose messages. .IP "\fB\-V, \-\-version\fR" 4 .IX Item "-V, --version" Print version information. .SH "EXAMPLES" .IX Header "EXAMPLES" Get files from site: .PP .Vb 1 \& pwget http://www.example.com/dir/package.tar.gz .. .Ve .PP Display copyright file for package \s-1GNU\s0 make from Debian pages: .PP .Vb 1 \& pwget \-\-stdout \-\-regexp \*(Aqcopyright$\*(Aq http://packages.debian.org/unstable/make .Ve .PP Get all mailing list archive files that match \*(L"gz\*(R": .PP .Vb 1 \& pwget \-\-regexp gz http://example.com/mailing\-list/archive/download/ .Ve .PP Read a directory and store it to filename \s-1YYYY\-MM\-DD::\s0!dir!000root\-file. .PP .Vb 1 \& pwget \-\-prefix\-date \-\-overwrite \-\-verbose http://www.example.com/dir/ .Ve .PP To update newest version of the package, but only if there is none at disk already. The \fB\-\-new\fR option instructs to find newer packages and the filename is only used as a skeleton for files to look for: .PP .Vb 2 \& pwget \-\-overwrite \-\-skip\-version \-\-new \-\-verbose \e \& ftp://ftp.example.com/dir/packet\-1.23.tar.gz .Ve .PP To overwrite file and add a date prefix to the file name: .PP .Vb 2 \& pwget \-\-prefix\-date \-\-overwrite \-\-verbose \e \& http://www.example.com/file.pl \& \& \-\-> YYYY\-MM\-DD::file.pl .Ve .PP To add date and \s-1WWW\s0 site prefix to the filenames: .PP .Vb 2 \& pwget \-\-prefix\-date \-\-prefix\-www \-\-overwrite \-\-verbose \e \& http://www.example.com/file.pl \& \& \-\-> YYYY\-MM\-DD::www.example.com::file.pl .Ve .PP Get all updated files under cnfiguration file's tag updates: .PP .Vb 2 \& pwget \-\-verbose \-\-overwrite \-\-skip\-version \-\-new \-\-tag updates \& pwget \-v \-o \-s \-n \-T updates .Ve .PP Get files as they read in the configuration file to the current directory, ignoring any \f(CW\*(C`lcd:\*(C'\fR and \f(CW\*(C`save:\*(C'\fR directives: .PP .Vb 3 \& pwget \-\-config $HOME/config/pwget.conf / \& \-\-no\-lcd \-\-no\-save \-\-overwrite \-\-verbose \e \& http://www.example.com/file.pl .Ve .PP To check configuration file, run the program with non-matching regexp and it parses the file and checks the \f(CW\*(C`lcd:\*(C'\fR directives on the way: .PP .Vb 1 \& pwget \-v \-r dummy\-regexp \& \& \-\-> \& \& pwget.DirectiveLcd: LCD [$EUSR/directory ...] \& is not a directory at /users/foo/bin/pwget line 889. .Ve .SH "CONFIGURATION FILE" .IX Header "CONFIGURATION FILE" .SS "Comments" .IX Subsection "Comments" The configuration file is \s-1NOT\s0 Perl code. Comments start with hash character (#). .SS "Variables" .IX Subsection "Variables" At this point, variable expansions happen only in \fBlcd:\fR. Do not try to use them anywhere else, like in URLs. .PP Path variables for \fBlcd:\fR are defined using following notation, spaces are not allowed in \s-1VALUE\s0 part (no directory names with spaces). Variable names are case sensitive. Variables substitute environment variabales with the same name. Environment variables are immediately available. .PP .Vb 3 \& VARIABLE = /home/my/dir # define variable \& VARIABLE = $dir/some/file # Use previously defined variable \& FTP = $HOME/ftp # Use environment variable .Ve .PP The right hand can refer to previously defined variables or existing environment variables. Repeat, this is not Perl code although it may look like one, but just an allowed syntax in the configuration file. Notice that there is dollar to the right hand> when variable is referred, but no dollar to the left hand side when variable is defined. Here is example of a possible configuration file contant. The tags are hierarchically ordered without a limit. .PP Warning: remember to use different variables names in separate include files. All variables are global. .SS "Include files" .IX Subsection "Include files" It is possible to include more configuration files with statement .PP .Vb 1 \& INCLUDE .Ve .PP Variable expansions are possible in the file name. There is no limit how many or how deep include structure is used. Every file is included only once, so it is safe to to have multiple includes to the same file. Every include is read, so put the most importat override includes last: .PP .Vb 2 \& INCLUDE # Global \& INCLUDE <$HOME/config/pwget.conf> # HOME overrides it .Ve .PP A special \f(CW\*(C`THIS\*(C'\fR tag means relative path of the current include file, which makes it possible to include several files form the same directory where a initial include file resides .PP .Vb 1 \& # Start of config at /etc/pwget.conf \& \& # THIS = /etc, current location \& include \& \& # Refers to directory where current user is: the pwd \& include \& \& # end .Ve .SS "Configuraton file example" .IX Subsection "Configuraton file example" The configuration file can contain many , where each directive end to a colon. The usage of each directory is best explained by examining the configuration file below and reading the commentary near each directive. .PP .Vb 1 \& # $HOME/config/pwget.conf F\- Perl pwget configuration file \& \& ROOT = $HOME # define variables \& CONF = $HOME/config \& UPDATE = $ROOT/updates \& DOWNL = $ROOT/download \& \& # Include more configuration files. It is possible to \& # split a huge file in pieces and have "linux", \& # "win32", "debian", "emacs" configurations in separate \& # and manageable files. \& \& INCLUDE <$CONF/pwget\-other.conf> \& INCLUDE <$CONF/pwget\-more.conf> \& \& tag1: local\-copies tag1: local # multiple names to this category \& \& lcd: $UPDATE # chdir directive \& \& # This is show to user with option \-\-verbose \& print: Notice, this site moved YYYY\-MM\-DD, update your bookmarks \& \& file://absolute/dir/file\-1.23.tar.gz \& \& tag1: external \& \& lcd: $DOWNL \& \& tag2: external\-http \& \& http://www.example.com/page.html \& http://www.example.com/page.html save:/dir/dir/page.html \& \& tag2: external\-ftp \& \& ftp://ftp.com/dir/file.txt.gz save:xx\-file.txt.gz login:foo pass:passwd x: \& \& lcd: $HOME/download/package \& \& ftp://ftp.com/dir/package\-1.1.tar.gz new: \& \& tag2: package\-x \& \& lcd: $DOWNL/package\-x \& \& # Person announces new files in his homepage, download all \& # announced files. Unpack everything (x:) and remove any \& # existing directories (xopt:rm) \& \& http://example.com/~foo pregexp:\e.tar\e.gz$ x: xopt:rm \& \& # End of configuration file pwget.conf .Ve .SH "LIST OF DIRECTIVES IN CONFIGURATION FILE" .IX Header "LIST OF DIRECTIVES IN CONFIGURATION FILE" All the directives must in the same line where the \s-1URL\s0 is. The programs scans lines and determines all options given in line for the \s-1URL.\s0 Directives can be overridden by command line options. .IP "\fBcnv:CONVERSION\fR" 4 .IX Item "cnv:CONVERSION" Currently only \fBconv:text\fR is available. .Sp Convert downloaded page to text. This option always needs either \fBsave:\fR or \fBrename:\fR, because only those directives change filename. Here is an example: .Sp .Vb 2 \& http://example.com/dir/file.html cnv:text save:file.txt \& http://example.com/dir/ pregexp:\e.html cnv:text rename:s/html/txt/ .Ve .Sp A \fBtext:\fR shorthand directive can be used instead of \fBcnv:text\fR. .IP "\fBcregexp:REGEXP\fR" 4 .IX Item "cregexp:REGEXP" Download file only if the content matches \s-1REGEXP.\s0 This is same as option \&\fB\-\-Regexp\-content\fR. In this example directory listing Emacs lisp packages (.el) are downloaded but only if their content indicates that the Author is Mr. Foo: .Sp .Vb 1 \& http://example.com/index.html cregexp:(?i)author:.*Foo pregexp:\e.el$ .Ve .IP "\fBlcd:DIRECTORY\fR" 4 .IX Item "lcd:DIRECTORY" Set local download directory to \s-1DIRECTORY \s0(chdir to it). Any environment variables are substituted in path name. If this tag is found, it replaces setting of \fB\-\-Output\fR. If path is not a directory, terminate with error. See also \fB\-\-Create\-paths\fR and \fB\-\-no\-lcd\fR. .IP "\fBlogin:LOGIN\-NAME\fR" 4 .IX Item "login:LOGIN-NAME" Ftp login name. Default value is \*(L"anonymous\*(R". .IP "\fBmirror:SITE\fR" 4 .IX Item "mirror:SITE" This is relevant to Sourceforge only which does not allow direct downloads with links. Visit project's Sourceforge homepage and see which mirrors are available for downloading. .Sp An example: .Sp .Vb 1 \& http://sourceforge.net/projects/austrumi/files/austrumi/austrumi\-1.8.5/austrumi\-1.8.5.iso/download new: mirror:kent .Ve .IP "\fBnew:\fR" 4 .IX Item "new:" Get newest file. This variable is reset to the value of \fB\-\-new\fR after the line has been processed. Newest means, that an \f(CW\*(C`ls\*(C'\fR command is run in the ftp, and something equivalent in \s-1HTTP \s0\*(L"ftp directories\*(R", and any files that resemble the filename is examined, sorted and heurestically determined according to version number of file which one is the latest. For example files that have version information in \s-1YYYYMMDD\s0 format will most likely to be retrieved right. .Sp Time stamps of the files are not checked. .Sp The only requirement is that filename \f(CW\*(C`must\*(C'\fR follow the universal version numbering standard: .Sp .Vb 1 \& FILE\-VERSION.extension # de facto VERSION is defined as [\ed.]+ \& \& file\-19990101.tar.gz # ok \& file\-1999.0101.tar.gz # ok \& file\-1.2.3.5.tar.gz # ok \& \& file1234.txt # not recognized. Must have "\-" \& file\-0.23d.tar.gz # warning, letters are problematic .Ve .Sp Files that have some alphabetic version indicator at the end of \&\s-1VERSION\s0 may not be handled correctly. Contact the developer and inform him about the de facto standard so that files can be retrieved more intelligently. .Sp \&\fI\s-1NOTE:\s0\fR In order the \fBnew:\fR directive to know what kind of files to look for, it needs a file tamplate. You can use a direct link to some filename. Here the location \*(L"http://www.example.com/downloads\*(R" is examined and the filename template used is took as \*(L"file\-1.1.tar.gz\*(R" to search for files that might be newer, like \*(L"file\-9.1.10.tar.gz\*(R": .Sp .Vb 1 \& http://www.example.com/downloads/file\-1.1.tar.gz new: .Ve .Sp If the filename appeard in a named page, use directive \fBfile:\fR for template. In this case the \*(L"download.html\*(R" page is examined for files looking like \*(L"file.*tar.gz\*(R" and the latest is searched: .Sp .Vb 1 \& http://www.example.com/project/download.html file:file\-1.1.tar.gz new: .Ve .IP "\fBoverwrite:\fR \fBo:\fR" 4 .IX Item "overwrite: o:" Same as turning on \fB\-\-overwrite\fR .IP "\fBpage:\fR" 4 .IX Item "page:" Read web page and apply commands to it. An example: contact the root page and save it: .Sp .Vb 1 \& http://example.com/~foo page: save:foo\-homepage.html .Ve .Sp In order to find the correct information from the page, other directives are usually supplied to guide the searching. .Sp 1) Adding directive \f(CW\*(C`pregexp:ARCHIVE\-REGEXP\*(C'\fR matches the A \s-1HREF\s0 links in the page. .Sp 2) Adding directive \fBnew:\fR instructs to find newer \s-1VERSIONS\s0 of the file. .Sp 3) Adding directive \f(CW\*(C`file:DOWNLOAD\-FILE\*(C'\fR tells what template to use to construct the downloadable file name. This is needed for the \&\f(CW\*(C`new:\*(C'\fR directive. .Sp 4) A directive \f(CW\*(C`vregexp:VERSION\-REGEXP\*(C'\fR matches the exact location in the page from where the version information is extracted. The default regexp looks for line that says \*(L"The latest version ... is ... N.N\*(R". The regexp must return submatch 2 for the version number. .Sp \&\s-1AN EXAMPLE\s0 .Sp Search for newer files from a \s-1HTTP\s0 directory listing. Examine page http://www.example.com/download/dir for model \f(CW\*(C`package\-1.1.tar.gz\*(C'\fR and find a newer file. E.g. \f(CW\*(C`package\-4.7.tar.gz\*(C'\fR would be downloaded. .Sp .Vb 1 \& http://www.example.com/download/dir/package\-1.1.tar.gz new: .Ve .Sp \&\s-1AN EXAMPLE\s0 .Sp Search for newer files from the content of the page. The directive \&\fBfile:\fR acts as a model for filenames to pay attention to. .Sp .Vb 1 \& http://www.example.com/project/download.html new: pregexp:tar.gz file:package\-1.1.tar.gz .Ve .Sp \&\s-1AN EXAMPLE\s0 .Sp Use directive \fBrename:\fR to change the filename before soring it on disk. Here, the version number is attached to the actila filename: .Sp .Vb 2 \& file.txt\-1.1 \& file.txt\-1.2 .Ve .Sp The directived needed would be as follows; entries have been broken to separate lines for legibility: .Sp .Vb 6 \& http://example.com/files/ \& pregexp:\e.el\-\ed \& vregexp:(file.el\-([\ed.]+)) \& file:file.el\-1.1 \& new: \& rename:s/\-[\ed.]+// .Ve .Sp This effectively reads: \*(L"See if there is new version of something that looks like file.el\-1.1 and save it under name file.el by deleting the extra version number at the end of original filename\*(R". .Sp \&\s-1AN EXAMPLE\s0 .Sp Contact absolute \fBpage:\fR at http://www.example.com/package.html and search A \s-1HREF\s0 urls in the page that match \fBpregexp:\fR. In addition, do another scan and search the version number in the page from thw position that match \fBvregexp:\fR (submatch 2). .Sp After all the pieces have been found, use template \fBfile:\fR to make the retrievable file using the version number found from \fBvregexp:\fR. The actual download location is combination of \fBpage:\fR and A \s-1HREF \&\s0\fBpregexp:\fR location. .Sp The directived needed would be as follows; entries have been broken to separate lines for legibility: .Sp .Vb 7 \& http://www.example.com/~foo/package.html \& page: \& pregexp: package.tar.gz \& vregexp: ((?i)latest.*?version.*?\eb([\ed][\ed.]+).*) \& file: package\-1.3.tar.gz \& new: \& x: .Ve .Sp An example of web page where the above would apply: .Sp .Vb 2 \& \& \& \& The latest version of package is 2.4.1 It can be \& downloaded in several forms: \& \& Tar file \& ZIP file \& \& \& .Ve .Sp For this example, assume that \f(CW\*(C`package.tar.gz\*(C'\fR is a symbolic link pointing to the latest release file \f(CW\*(C`package\-2.4.1.tar.gz\*(C'\fR. Thus the actual download location would have been \&\f(CW\*(C`http://www.example.com/~foo/download/files/package\-2.4.1.tar.gz\*(C'\fR. .Sp Why not simply download \f(CW\*(C`package.tar.gz\*(C'\fR? Because then the program can't decide if the version at the page is newer than one stored on disk from the previous download. With version numbers in the file names, the comparison is possible. .IP "\fBpage:find\fR" 4 .IX Item "page:find" \&\s-1FIXME:\s0 This opton is obsolete. do not use. .Sp \&\s-1THIS IS FOR HTTP\s0 only. Use Use directive \fBregexp:\fR for \s-1FTP\s0 protocls. .Sp This is a more general instruction than the \fBpage:\fR and \fBvregexp:\fR explained above. .Sp Instruct to download every \s-1URL\s0 on \s-1HTML\s0 page matching \fBpregexp:RE\fR. In typical situation the page maintainer lists his software in the development page. This example would download every tar.gz file in the page. Note, that the \s-1REGEXP\s0 is matched against the A \s-1HREF\s0 link content, not the actual text that is displayed on the page: .Sp .Vb 1 \& http://www.example.com/index.html page:find pregexp:\e.tar.gz$ .Ve .Sp You can also use additional \fBregexp-no:\fR directive if you want to exclude files after the \fBpregexp:\fR has matched a link. .Sp .Vb 1 \& http://www.example.com/index.html page:find pregexp:\e.tar.gz$ regexp\-no:desktop .Ve .IP "\fBpass:PASSWORD\fR" 4 .IX Item "pass:PASSWORD" For \s-1FTP\s0 logins. Default value is \f(CW\*(C`nobody@example.com\*(C'\fR. .IP "\fBpregexp:RE\fR" 4 .IX Item "pregexp:RE" Search A \s-1HREF\s0 links in page matching a regular expression. The regular expression must be a single word with no whitespace. This is incorrect: .Sp .Vb 1 \& pregexp:(this regexp ) .Ve .Sp It must be written as: .Sp .Vb 1 \& pregexp:(this\es+regexp\es) .Ve .IP "\fBprint:MESSAGE\fR" 4 .IX Item "print:MESSAGE" Print associated message to user requesting matching tag name. This directive must in separate line inside tag. .Sp .Vb 1 \& tag1: linux \& \& print: this download site moved 2002\-02\-02, check your bookmarks. \& http://new.site.com/dir/file\-1.1.tar.gz new: .Ve .Sp The \f(CW\*(C`print:\*(C'\fR directive for tag is shown only if user turns on \-\-verbose mode: .Sp .Vb 1 \& pwget \-v \-T linux .Ve .IP "\fBrename:PERL\-CODE\fR" 4 .IX Item "rename:PERL-CODE" Rename each file using PERL-CODE. The PERL-CODE must be full perl program with no spaces anywhere. Following variables are available during the \&\fIeval()\fR of code: .Sp .Vb 3 \& $ARG = current file name \& $url = complete url for the file \& The code must return $ARG which is used for file name .Ve .Sp For example, if page contains links to .html files that are in fact text files, following statement would change the file extensions: .Sp .Vb 1 \& http://example.com/dir/ page:find pregexp:\e.html rename:s/html/txt/ .Ve .Sp You can also call function \f(CW\*(C`MonthToNumber($string)\*(C'\fR if the filename contains written month name, like <2005\-February.mbox>.The function will convert the name into number. Many mailing list archives can be downloaded cleanly this way. .Sp .Vb 2 \& # This will download SA\-Exim Mailing list archives: \& http://lists.merlins.org/archives/sa\-exim/ pregexp:\e.txt$ rename:$ARG=MonthToNumber($ARG) .Ve .Sp Here is a more complicated example: .Sp .Vb 1 \& http://www.contactor.se/~dast/svnusers/mbox.cgi pregexp:mbox.*\ed$ rename:my($y,$m)=($url=~/year=(\ed+).*month=(\ed+)/);$ARG="$y\-$m.mbox" .Ve .Sp Let's break that one apart. You may spend some time with this example since the possiblilities are limitless. .Sp .Vb 2 \& 1. Connect to page \& http://www.contactor.se/~dast/svnusers/mbox.cgi \& \& 2. Search page for URLs matching regexp \*(Aqmbox.*\ed$\*(Aq. A \& found link could match hrefs like this: \& http://svn.haxx.se/users/mbox.cgi?year=2004&month=12 \& \& 3. The found link is put to $ARG (same as $_), which can be used \& to extract suitable mailbox name with a perl code that is \& evaluated. The resulting name must apear in $ARG. Thus the code \& effectively extract two items from the link to form a mailbox \& name: \& \& my ($y, $m) = ( $url =~ /year=(\ed+).*month=(\ed+)/ ) \& $ARG = "$y\-$m.mbox" \& \& => 2004\-12.mbox .Ve .Sp Just remember, that the perl code that follows \f(CW\*(C`rename:\*(C'\fR directive \&\fBmust\fR must not contain any spaces. It all must be readable as one string. .IP "\fBregexp:REGEXP\fR" 4 .IX Item "regexp:REGEXP" Get all files in ftp directory matching regexp. Directive \fBsave:\fR is ignored. .IP "\fBregexp\-no:REGEXP\fR" 4 .IX Item "regexp-no:REGEXP" After the \f(CW\*(C`regexp:\*(C'\fR directive has matched, exclude files that match directive \fBregexp-no:\fR .IP "\fBRegexp:REGEXP\fR" 4 .IX Item "Regexp:REGEXP" This option is for interactive use. Retrieve all files from \s-1HTTP\s0 or \s-1FTP\s0 site which match \s-1REGEXP.\s0 .IP "\fBsave:LOCAL\-FILE\-NAME\fR" 4 .IX Item "save:LOCAL-FILE-NAME" Save file under this name to local disk. .IP "\fBtagN:NAME\fR" 4 .IX Item "tagN:NAME" Downloads can be grouped under \f(CW\*(C`tagN\*(C'\fR so that e.g. option \fB\-\-tag1\fR would start downloading files from that point on until next \f(CW\*(C`tag1\*(C'\fR is found. There are currently unlimited number of tag levels: tag1, tag2 and tag3, so that you can arrange your downlods hierarchially in the configuration file. For example to download all Linux files rhat you monitor, you would give option \fB\-\-tag linux\fR. To download only the \s-1NT\s0 Emacs latest binary, you would give option \fB\-\-tag emacs-nt\fR. Notice that you do not give the \&\f(CW\*(C`level\*(C'\fR in the option, program will find it out from the configuration file after the tag name matches. .Sp The downloading stops at next tag of the \f(CW\*(C`same level\*(C'\fR. That is, tag2 stops only at next tag2, or when upper level tag is found (tag1) or or until end of file. .Sp .Vb 1 \& tag1: linux # All Linux downlods under this category \& \& tag2: sunsite tag2: another\-name\-for\-this\-spot \& \& # List of files to download from here \& \& tag2: ftp.funet.fi \& \& # List of files to download from here \& \& tag1: emacs\-binary \& \& tag2: emacs\-nt \& \& tag2: xemacs\-nt \& \& tag2: emacs \& \& tag2: xemacs .Ve .IP "\fBx:\fR" 4 .IX Item "x:" Extract (unpack) file after download. See also option \fB\-\-unpack\fR and \&\fB\-\-no\-extract\fR The archive file, say .tar.gz will be extracted the file in current download location. (see directive \fBlcd:\fR) .Sp The unpack procedure checks the contents of the archive to see if the package is correctly formed. The de facto archive format is .Sp .Vb 1 \& package\-N.NN.tar.gz .Ve .Sp In the archive, all files are supposed to be stored under the proper subdirectory with version information: .Sp .Vb 4 \& package\-N.NN/doc/README \& package\-N.NN/doc/INSTALL \& package\-N.NN/src/Makefile \& package\-N.NN/src/some\-code.java .Ve .Sp \&\f(CW\*(C`IMPORTANT:\*(C'\fR If the archive does not have a subdirectory for all files, a subdirectory is created and all items are unpacked under it. The default subdirectory name in constructed from the archive name with currect date stamp in format: .Sp .Vb 1 \& package\-YYYY.MMDD .Ve .Sp If the archive name contains something that looks like a version number, the created directory will be constructed from it, instead of current date. .Sp .Vb 1 \& package\-1.43.tar.gz => package\-1.43 .Ve .IP "\fBxx:\fR" 4 .IX Item "xx:" Like directive \fBx:\fR but extract the archive \f(CW\*(C`as is\*(C'\fR, without checking content of the archive. If you know that it is ok for the archive not to include any subdirectories, use this option to suppress creation of an artificial root package\-YYYY.MMDD. .IP "\fBxopt:rm\fR" 4 .IX Item "xopt:rm" This options tells to remove any previous unpack directory. .Sp Sometimes the files in the archive are all read-only and unpacking the archive second time, after some period of time, would display .Sp .Vb 2 \& tar: package\-3.9.5/.cvsignore: Could not create file: \& Permission denied \& \& tar: package\-3.9.5/BUGS: Could not create file: \& Permission denied .Ve .Sp This is not a serious error, because the archive was already on disk and tar did not overwrite previous files. It might be good to inform the archive maintainer, that the files have wrong permissions. It is customary to expect that distributed packages have writable flag set for all files. .SH "ERRORS" .IX Header "ERRORS" Here is list of possible error messages and how to deal with them. Turning on \fB\-\-debug\fR will help to understand how program has interpreted the configuration file or command line options. Pay close attention to the generated output, because it may reveal that a regexp for a site is too lose or too tight. .IP "\fB\s-1ERROR\s0 {\s-1URL\-HERE\s0} Bad file descriptor\fR" 4 .IX Item "ERROR {URL-HERE} Bad file descriptor" This is \*(L"file not found error\*(R". You have written the filename incorrectly. Double check the configuration file's line. .SH "BUGS AND LIMITATIONS" .IX Header "BUGS AND LIMITATIONS" \&\f(CW\*(C`Sourceforge note\*(C'\fR: To download archive files from Sourceforge requires some trickery because of the redirections and load balancers the site uses. The Sourceforge page have also undergone many changes during their existence. Due to these changes there exists an ugly hack in the program to use \fIwget\fR\|(1) to get certain information from the site. This could have been implemented in pure Perl, but as of now the developer hasn't had time to remove the \fIwget\fR\|(1) dependency. No doubt, this is an ironic situation to use \fIwget\fR\|(1). You you have Perl skills, go ahead and look at \fIUrlHttGet()\fR. \fIUrlHttGetWget()\fR and sen patches. .PP The program was initially designed to read options from one line. It is unfortunately not possible to change the program to read configuration file directives from multiple lines, e.g. by using backslashes (\e) to indicate contuatinued line. .SH "ENVIRONMENT" .IX Header "ENVIRONMENT" Variable \f(CW\*(C`PWGET_CFG\*(C'\fR can point to the root configuration file. The configuration file is read at startup if it exists. .PP .Vb 2 \& export PWGET_CFG=$HOME/conf/pwget.conf # /bin/hash syntax \& setenv PWGET_CFG $HOME/conf/pwget.conf # /bin/csh syntax .Ve .SH "EXIT STATUS" .IX Header "EXIT STATUS" Not defined. .SH "DEPENDENCIES" .IX Header "DEPENDENCIES" External utilities: .PP .Vb 2 \& wget(1) only needed for Sourceforge.net downloads \& see BUGS AND LIMITATIONS .Ve .PP Non-core Perl modules from \s-1CPAN:\s0 .PP .Vb 2 \& LWP::UserAgent \& Net::FTP .Ve .PP The following modules are loaded in run-time only if directive \&\fBcnv:text\fR is used. Otherwise these modules are not loaded: .PP .Vb 3 \& HTML::Parse \& HTML::TextFormat \& HTML::FormatText .Ve .PP This module is loaded in run-time only if \s-1HTTPS\s0 scheme is used: .PP .Vb 1 \& Crypt::SSLeay .Ve .SH "SEE ALSO" .IX Header "SEE ALSO" \&\fIlwp\-download\fR\|(1) \&\fIlwp\-mirror\fR\|(1) \&\fIlwp\-request\fR\|(1) \&\fIlwp\-rget\fR\|(1) \&\fIwget\fR\|(1) .SH "AUTHOR" .IX Header "AUTHOR" Jari Aalto .SH "LICENSE AND COPYRIGHT" .IX Header "LICENSE AND COPYRIGHT" Copyright (C) 1996\-2016 Jari Aalto .PP This program is free software; you can redistribute and/or modify program under the terms of \s-1GNU\s0 General Public license either version 2 of the License, or (at your option) any later version. pwget-2016.1019+git75c6e3e/bin/pwget.pl000077500000000000000000005435611300167571300172230ustar00rootroot00000000000000#!/usr/bin/perl # # pwget -- batch download files possibly with configuration file # # Copyright # # Copyright (C) 1996-2016 Jari Aalto # # License # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . # # Documentation # # To read manual, start this program with option: --help # "Named Capture Buffers" are used use 5.10.0; # **************************************************************************** # # Globals # # **************************************************************************** use vars qw ( $VERSION ); # This is for use of Makefile.PL and ExtUtils::MakeMaker # # The following variable is updated by Emacs setup whenever # this file is saved. $VERSION = '2016.1019.1354'; # **************************************************************************** # # Standard perl modules # # **************************************************************************** use strict; use autouse 'Carp' => qw( croak carp cluck confess ); use autouse 'Text::Tabs' => qw( expand ); use autouse 'File::Copy' => qw( copy move ); use autouse 'File::Path' => qw( mkpath rmtree ); use autouse 'Pod::Html' => qw( pod2html ); #use autouse 'Pod::Text' => qw( pod2text ); use Cwd; use Env; use English; use File::Basename; use Getopt::Long; use Net::FTP; IMPORT: { use Env; use vars qw ( $PATH $HOME $TEMP $TEMPDIR $SHELL ); } # **************************************************************************** # # Modules from CPAN # # **************************************************************************** use LWP::UserAgent; # **************************************************************************** # # DESCRIPTION # # Set global variables for the program # # INPUT PARAMETERS # # none # # RETURN VALUES # # none # # **************************************************************************** sub Initialize () { use vars qw ( $PROGNAME $LIB $LICENSE $AUTHOR $URL $WIN32 $CYGWIN_PERL ); $LIB = basename $PROGRAM_NAME; $PROGNAME = $LIB; $LICENSE = "GPL-2+"; $AUTHOR = "Jari Aalto"; $URL = "http://freecode.com/projects/perl-webget"; $WIN32 = 1 if $OSNAME =~ /win32/i; if ( $OSNAME =~ /cygwin/i ) { # We need to know if this perl is Cygwin native perl? use vars qw( %Config ); eval "use Config"; $EVAL_ERROR and die "$EVAL_ERROR"; if ( $main::Config{osname} =~ /cygwin/i ) { $CYGWIN_PERL = 1; } } $OUTPUT_AUTOFLUSH = 1; # This variable holds the current tag line being used. use vars qw( $CURRENT_TAG_LINE ); } # ***************************************************************** &help **** # # DESCRIPTION # # Print help and exit. # # INPUT PARAMETERS # # $msg [optional] Reason why function was called. # # RETURN VALUES # # none # # **************************************************************************** =pod =head1 NAME pwget - Perl Web URL fetch program =head1 SYNOPSIS pwget http://example.com/ [URL ...] pwget --config $HOME/config/pwget.conf --tag linux --tag emacs .. pwget --verbose --overwrite http://example.com/ pwget --verbose --overwrite --Output ~/dir/ http://example.com/ pwget --new --overwrite http://example.com/package-1.1.tar.gz =head1 DESCRIPTION Automate periodic downloads of files and packages. If you retrieve latest versions of certain program blocks periodically, this is the Perl script for you. Run from cron job or once a week to upload newest versions of files around the net. Note: =head2 Wget and this program At this point you may wonder, where would you need this perl program when wget(1) C-program has been the standard for ages. Well, 1) Perl is cross platform and more easily extendable 2) You can record file download criteria to a configuration file and use perl regular epxressions to select downloads 3) the program can anlyze web-pages and "search" for the download only links as instructed 4) last but not least, it can track newest packages whose name has changed since last downlaod. There are heuristics to determine the newest file or package according to file name skeleton defined in configuration. This program does not replace pwget(1) because it does not offer as many options as wget, like recursive downloads and date comparing. Use wget for ad hoc downloads and this utility for files that change (new releases of archives) or which you monitor periodically. =head2 Short introduction This small utility makes it possible to keep a list of URLs in a configuration file and periodically retrieve those pages or files with simple commands. This utility is best suited for small batch jobs to download e.g. most recent versions of software files. If you use an URL that is already on disk, be sure to supply option B<--overwrite> to allow overwriting existing files. While you can run this program from command line to retrieve individual files, program has been designed to use separate configuration file via B<--config> option. In the configuration file you can control the downloading with separate directives like C which tells to save the file under different name. The simplest way to retrieve the latest version of apackage from a FTP site is: pwget --new --overwite --verbose \ http://www.example.com/package-1.00.tar.gz Do not worry about the filename C. The latest version, say, C will be retrieved. The option B<--new> instructs to find newer version than the provided URL. If the URL ends to slash, then directory list at the remote machine is stored to file: !path!000root-file The content of this file can be either index.html or the directory listing depending on the used http or ftp protocol. =head1 OPTIONS =over 4 =item B<-A, --regexp-content REGEXP> Analyze the content of the file and match REGEXP. Only if the regexp matches the file content, then download file. This option will make downloads slow, because the file is read into memory as a single line and then a match is searched against the content. For example to download Emacs lisp file (.el) written by Mr. Foo in case insensitive manner: pwget -v -r '\.el$' -A "(?i)Author: Mr. Foo" \ http://www.emacswiki.org/elisp/index.html =item B<-C, --create-paths> Create paths that do not exist in C directives. By default, any LCD directive to non-existing directory will interrupt program. With this option, local directories are created as needed making it possible to re-create the exact structure as it is in configuration file. =item B<-c, --config FILE> This option can be given multiple times. All configurations are read. Read URLs from configuration file. If no configuration file is given, file pointed by environment variable is read. See ENVIRONMENT. The configuration file layout is envlained in section CONFIGURATION FILE =item B<--chdir DIRECTORY> Do a chdir() to DIRECTORY before any URL download starts. This is like doing: cd DIRECTORY pwget http://example.com/index.html =item B<-d, --debug [LEVEL]> Turn on debug with positive LEVEL number. Zero means no debug. This option turns on B<--verbose> too. =item B<-e, --extract> Unpack any files after retrieving them. The command to unpack typical archive files are defined in a program. Make sure these programs are along path. Win32 users are encouraged to install the Cygwin utilities where these programs come standard. Refer to section SEE ALSO. .tar => tar .tgz => tar + gzip .gz => gzip .bz2 => bzip2 .xz => xz .zip => unzip =item B<-F, --firewall FIREWALL> Use FIREWALL when accessing files via ftp:// protocol. =item B<-h, --help> Print help page in text. =item B<--help-html> Print help page in HTML. =item B<--help-man> Print help page in Unix manual page format. You want to feed this output to c in order to read it. Print help page. =item B<-m, --mirror SITE> If URL points to Sourcefoge download area, use mirror SITE for downloading. Alternatively the full full URL can include the mirror information. And example: --mirror kent http://downloads.sourceforge.net/foo/foo-1.0.0.tar.gz =item B<-n, --new> Get newest file. This applies to datafiles, which do not have extension .asp or .html. When new releases are announced, the version number in filename usually tells which is the current one so getting hardcoded file with: pwget -o -v http://example.com/dir/program-1.3.tar.gz is not usually practical from automation point of view. Adding B<--new> option to the command line causes double pass: a) the whole http://example.com/dir/ is examined for all files and b) files matching approximately filename program-1.3.tar.gz are examined, heuristically sorted and file with latest version number is retrieved. =item B<--no-lcd> Ignore C directives in configuration file. In the configuration file, any C directives are obeyed as they are seen. But if you do want to retrieve URL to your current directory, be sure to supply this option. Otherwise the file will end to the directory pointer by C. =item B<--no-save> Ignore C directives in configuration file. If the URLs have C options, they are ignored during fetch. You usually want to combine B<--no-lcd> with B<--no-save> =item B<--no-extract> Ignore C directives in configuration file. =item B<-O, --output DIR> Before retrieving any files, chdir to DIR. =item B<-o, --overwrite> Allow overwriting existing files when retrieving URLs. Combine this with B<--skip-version> if you periodically update files. =item B<--proxy PROXY> Use PROXY server for HTTP. (See B<--Firewall> for FTP.). The port number is optional in the call: --proxy http://example.com.proxy.com --proxy example.com.proxy.com:8080 =item B<-p, --prefix PREFIX> Add PREFIX to all retrieved files. =item B<-P, --postfix POSTFIX > Add POSTFIX to all retrieved files. =item B<-D, --prefix-date> Add iso8601 ":YYYY-MM-DD" prefix to all retrieved files. This is added before possible B<--prefix-www> or B<--prefix>. =item B<-W, --prefix-www> Usually the files are stored with the same name as in the URL dir, but if you retrieve files that have identical names you can store each page separately so that the file name is prefixed by the site name. http://example.com/page.html --> example.com::page.html http://example2.com/page.html --> example2.com::page.html =item B<-r, --regexp REGEXP> Retrieve file matching at the destination URL site. This is like "Connect to the URL and get all files matching REGEXP". Here all gzip compressed files are found form HTTP server directory: pwget -v -r "\.gz" http://example.com/archive/ Caveat: currently works only for http:// URLs. =item B<-R, --config-regexp REGEXP> Retrieve URLs matching REGEXP from configuration file. This cancels B<--tag> options in the command line. =item B<-s, --selftest> Run some internal tests. For maintainer or developer only. =item B<--sleep SECONDS> Sleep SECONDS before next URL request. When using regexp based downlaods that may return many hits, some sites disallow successive requests in within short period of time. This options makes program sleep for number of SECONDS between retrievals to overcome 'Service unavailable'. =item B<--stdout> Retrieve URL and write to stdout. =item B<--skip-version> Do not download files that have version number and which already exists on disk. Suppose you have these files and you use option B<--skip-version>: package.tar.gz file-1.1.tar.gz Only file.txt is retrieved, because file-1.1.tar.gz contains version number and the file has not changed since last retrieval. The idea is, that in every release the number in in distribution increases, but there may be distributions which do not contain version number. In regular intervals you may want to load those packages again, but skip versioned files. In short: This option does not make much sense without additional option B<--new> If you want to reload versioned file again, add option B<--overwrite>. =item B<-t, --test, --dry-run> Run in test mode. =item B<-T, --tag NAME [NAME] ...> Search tag NAME from the config file and download only entries defined under that tag. Refer to B<--config FILE> option description. You can give Multiple B<--tag> switches. Combining this option with B<--regexp> does not make sense and the concequencies are undefined. =item B<-v, --verbose [NUMBER]> Print verbose messages. =item B<-V, --version> Print version information. =back =head1 EXAMPLES Get files from site: pwget http://www.example.com/dir/package.tar.gz .. Display copyright file for package GNU make from Debian pages: pwget --stdout --regexp 'copyright$' http://packages.debian.org/unstable/make Get all mailing list archive files that match "gz": pwget --regexp gz http://example.com/mailing-list/archive/download/ Read a directory and store it to filename YYYY-MM-DD::!dir!000root-file. pwget --prefix-date --overwrite --verbose http://www.example.com/dir/ To update newest version of the package, but only if there is none at disk already. The B<--new> option instructs to find newer packages and the filename is only used as a skeleton for files to look for: pwget --overwrite --skip-version --new --verbose \ ftp://ftp.example.com/dir/packet-1.23.tar.gz To overwrite file and add a date prefix to the file name: pwget --prefix-date --overwrite --verbose \ http://www.example.com/file.pl --> YYYY-MM-DD::file.pl To add date and WWW site prefix to the filenames: pwget --prefix-date --prefix-www --overwrite --verbose \ http://www.example.com/file.pl --> YYYY-MM-DD::www.example.com::file.pl Get all updated files under cnfiguration file's tag updates: pwget --verbose --overwrite --skip-version --new --tag updates pwget -v -o -s -n -T updates Get files as they read in the configuration file to the current directory, ignoring any C and C directives: pwget --config $HOME/config/pwget.conf / --no-lcd --no-save --overwrite --verbose \ http://www.example.com/file.pl To check configuration file, run the program with non-matching regexp and it parses the file and checks the C directives on the way: pwget -v -r dummy-regexp --> pwget.DirectiveLcd: LCD [$EUSR/directory ...] is not a directory at /users/foo/bin/pwget line 889. =head1 CONFIGURATION FILE =head2 Comments The configuration file is NOT Perl code. Comments start with hash character (#). =head2 Variables At this point, variable expansions happen only in B. Do not try to use them anywhere else, like in URLs. Path variables for B are defined using following notation, spaces are not allowed in VALUE part (no directory names with spaces). Variable names are case sensitive. Variables substitute environment variabales with the same name. Environment variables are immediately available. VARIABLE = /home/my/dir # define variable VARIABLE = $dir/some/file # Use previously defined variable FTP = $HOME/ftp # Use environment variable The right hand can refer to previously defined variables or existing environment variables. Repeat, this is not Perl code although it may look like one, but just an allowed syntax in the configuration file. Notice that there is dollar to the right hand> when variable is referred, but no dollar to the left hand side when variable is defined. Here is example of a possible configuration file contant. The tags are hierarchically ordered without a limit. Warning: remember to use different variables names in separate include files. All variables are global. =head2 Include files It is possible to include more configuration files with statement INCLUDE Variable expansions are possible in the file name. There is no limit how many or how deep include structure is used. Every file is included only once, so it is safe to to have multiple includes to the same file. Every include is read, so put the most importat override includes last: INCLUDE # Global INCLUDE <$HOME/config/pwget.conf> # HOME overrides it A special C tag means relative path of the current include file, which makes it possible to include several files form the same directory where a initial include file resides # Start of config at /etc/pwget.conf # THIS = /etc, current location include # Refers to directory where current user is: the pwd include # end =head2 Configuraton file example The configuration file can contain many , where each directive end to a colon. The usage of each directory is best explained by examining the configuration file below and reading the commentary near each directive. # $HOME/config/pwget.conf F- Perl pwget configuration file ROOT = $HOME # define variables CONF = $HOME/config UPDATE = $ROOT/updates DOWNL = $ROOT/download # Include more configuration files. It is possible to # split a huge file in pieces and have "linux", # "win32", "debian", "emacs" configurations in separate # and manageable files. INCLUDE <$CONF/pwget-other.conf> INCLUDE <$CONF/pwget-more.conf> tag1: local-copies tag1: local # multiple names to this category lcd: $UPDATE # chdir directive # This is show to user with option --verbose print: Notice, this site moved YYYY-MM-DD, update your bookmarks file://absolute/dir/file-1.23.tar.gz tag1: external lcd: $DOWNL tag2: external-http http://www.example.com/page.html http://www.example.com/page.html save:/dir/dir/page.html tag2: external-ftp ftp://ftp.com/dir/file.txt.gz save:xx-file.txt.gz login:foo pass:passwd x: lcd: $HOME/download/package ftp://ftp.com/dir/package-1.1.tar.gz new: tag2: package-x lcd: $DOWNL/package-x # Person announces new files in his homepage, download all # announced files. Unpack everything (x:) and remove any # existing directories (xopt:rm) http://example.com/~foo pregexp:\.tar\.gz$ x: xopt:rm # End of configuration file pwget.conf =head1 LIST OF DIRECTIVES IN CONFIGURATION FILE All the directives must in the same line where the URL is. The programs scans lines and determines all options given in line for the URL. Directives can be overridden by command line options. =over 4 =item B Currently only B is available. Convert downloaded page to text. This option always needs either B or B, because only those directives change filename. Here is an example: http://example.com/dir/file.html cnv:text save:file.txt http://example.com/dir/ pregexp:\.html cnv:text rename:s/html/txt/ A B shorthand directive can be used instead of B. =item B Download file only if the content matches REGEXP. This is same as option B<--Regexp-content>. In this example directory listing Emacs lisp packages (.el) are downloaded but only if their content indicates that the Author is Mr. Foo: http://example.com/index.html cregexp:(?i)author:.*Foo pregexp:\.el$ =item B Set local download directory to DIRECTORY (chdir to it). Any environment variables are substituted in path name. If this tag is found, it replaces setting of B<--Output>. If path is not a directory, terminate with error. See also B<--Create-paths> and B<--no-lcd>. =item B Ftp login name. Default value is "anonymous". =item B This is relevant to Sourceforge only which does not allow direct downloads with links. Visit project's Sourceforge homepage and see which mirrors are available for downloading. An example: http://sourceforge.net/projects/austrumi/files/austrumi/austrumi-1.8.5/austrumi-1.8.5.iso/download new: mirror:kent =item B Get newest file. This variable is reset to the value of B<--new> after the line has been processed. Newest means, that an C command is run in the ftp, and something equivalent in HTTP "ftp directories", and any files that resemble the filename is examined, sorted and heurestically determined according to version number of file which one is the latest. For example files that have version information in YYYYMMDD format will most likely to be retrieved right. Time stamps of the files are not checked. The only requirement is that filename C follow the universal version numbering standard: FILE-VERSION.extension # de facto VERSION is defined as [\d.]+ file-19990101.tar.gz # ok file-1999.0101.tar.gz # ok file-1.2.3.5.tar.gz # ok file1234.txt # not recognized. Must have "-" file-0.23d.tar.gz # warning, letters are problematic Files that have some alphabetic version indicator at the end of VERSION may not be handled correctly. Contact the developer and inform him about the de facto standard so that files can be retrieved more intelligently. I In order the B directive to know what kind of files to look for, it needs a file tamplate. You can use a direct link to some filename. Here the location "http://www.example.com/downloads" is examined and the filename template used is took as "file-1.1.tar.gz" to search for files that might be newer, like "file-9.1.10.tar.gz": http://www.example.com/downloads/file-1.1.tar.gz new: If the filename appeard in a named page, use directive B for template. In this case the "download.html" page is examined for files looking like "file.*tar.gz" and the latest is searched: http://www.example.com/project/download.html file:file-1.1.tar.gz new: =item B B Same as turning on B<--overwrite> =item B Read web page and apply commands to it. An example: contact the root page and save it: http://example.com/~foo page: save:foo-homepage.html In order to find the correct information from the page, other directives are usually supplied to guide the searching. 1) Adding directive C matches the A HREF links in the page. 2) Adding directive B instructs to find newer VERSIONS of the file. 3) Adding directive C tells what template to use to construct the downloadable file name. This is needed for the C directive. 4) A directive C matches the exact location in the page from where the version information is extracted. The default regexp looks for line that says "The latest version ... is ... N.N". The regexp must return submatch 2 for the version number. AN EXAMPLE Search for newer files from a HTTP directory listing. Examine page http://www.example.com/download/dir for model C and find a newer file. E.g. C would be downloaded. http://www.example.com/download/dir/package-1.1.tar.gz new: AN EXAMPLE Search for newer files from the content of the page. The directive B acts as a model for filenames to pay attention to. http://www.example.com/project/download.html new: pregexp:tar.gz file:package-1.1.tar.gz AN EXAMPLE Use directive B to change the filename before soring it on disk. Here, the version number is attached to the actila filename: file.txt-1.1 file.txt-1.2 The directived needed would be as follows; entries have been broken to separate lines for legibility: http://example.com/files/ pregexp:\.el-\d vregexp:(file.el-([\d.]+)) file:file.el-1.1 new: rename:s/-[\d.]+// This effectively reads: "See if there is new version of something that looks like file.el-1.1 and save it under name file.el by deleting the extra version number at the end of original filename". AN EXAMPLE Contact absolute B at http://www.example.com/package.html and search A HREF urls in the page that match B. In addition, do another scan and search the version number in the page from thw position that match B (submatch 2). After all the pieces have been found, use template B to make the retrievable file using the version number found from B. The actual download location is combination of B and A HREF B location. The directived needed would be as follows; entries have been broken to separate lines for legibility: http://www.example.com/~foo/package.html page: pregexp: package.tar.gz vregexp: ((?i)latest.*?version.*?\b([\d][\d.]+).*) file: package-1.3.tar.gz new: x: An example of web page where the above would apply: The latest version of package is 2.4.1 It can be downloaded in several forms: Tar file ZIP file For this example, assume that C is a symbolic link pointing to the latest release file C. Thus the actual download location would have been C. Why not simply download C? Because then the program can't decide if the version at the page is newer than one stored on disk from the previous download. With version numbers in the file names, the comparison is possible. =item B FIXME: This opton is obsolete. do not use. THIS IS FOR HTTP only. Use Use directive B for FTP protocls. This is a more general instruction than the B and B explained above. Instruct to download every URL on HTML page matching B. In typical situation the page maintainer lists his software in the development page. This example would download every tar.gz file in the page. Note, that the REGEXP is matched against the A HREF link content, not the actual text that is displayed on the page: http://www.example.com/index.html page:find pregexp:\.tar.gz$ You can also use additional B directive if you want to exclude files after the B has matched a link. http://www.example.com/index.html page:find pregexp:\.tar.gz$ regexp-no:desktop =item B For FTP logins. Default value is C. =item B Search A HREF links in page matching a regular expression. The regular expression must be a single word with no whitespace. This is incorrect: pregexp:(this regexp ) It must be written as: pregexp:(this\s+regexp\s) =item B Print associated message to user requesting matching tag name. This directive must in separate line inside tag. tag1: linux print: this download site moved 2002-02-02, check your bookmarks. http://new.site.com/dir/file-1.1.tar.gz new: The C directive for tag is shown only if user turns on --verbose mode: pwget -v -T linux =item B Rename each file using PERL-CODE. The PERL-CODE must be full perl program with no spaces anywhere. Following variables are available during the eval() of code: $ARG = current file name $url = complete url for the file The code must return $ARG which is used for file name For example, if page contains links to .html files that are in fact text files, following statement would change the file extensions: http://example.com/dir/ page:find pregexp:\.html rename:s/html/txt/ You can also call function C if the filename contains written month name, like <2005-February.mbox>.The function will convert the name into number. Many mailing list archives can be downloaded cleanly this way. # This will download SA-Exim Mailing list archives: http://lists.merlins.org/archives/sa-exim/ pregexp:\.txt$ rename:$ARG=MonthToNumber($ARG) Here is a more complicated example: http://www.contactor.se/~dast/svnusers/mbox.cgi pregexp:mbox.*\d$ rename:my($y,$m)=($url=~/year=(\d+).*month=(\d+)/);$ARG="$y-$m.mbox" Let's break that one apart. You may spend some time with this example since the possiblilities are limitless. 1. Connect to page http://www.contactor.se/~dast/svnusers/mbox.cgi 2. Search page for URLs matching regexp 'mbox.*\d$'. A found link could match hrefs like this: http://svn.haxx.se/users/mbox.cgi?year=2004&month=12 3. The found link is put to $ARG (same as $_), which can be used to extract suitable mailbox name with a perl code that is evaluated. The resulting name must apear in $ARG. Thus the code effectively extract two items from the link to form a mailbox name: my ($y, $m) = ( $url =~ /year=(\d+).*month=(\d+)/ ) $ARG = "$y-$m.mbox" => 2004-12.mbox Just remember, that the perl code that follows C directive B must not contain any spaces. It all must be readable as one string. =item B Get all files in ftp directory matching regexp. Directive B is ignored. =item B After the C directive has matched, exclude files that match directive B =item B This option is for interactive use. Retrieve all files from HTTP or FTP site which match REGEXP. =item B Save file under this name to local disk. =item B Downloads can be grouped under C so that e.g. option B<--tag1> would start downloading files from that point on until next C is found. There are currently unlimited number of tag levels: tag1, tag2 and tag3, so that you can arrange your downlods hierarchially in the configuration file. For example to download all Linux files rhat you monitor, you would give option B<--tag linux>. To download only the NT Emacs latest binary, you would give option B<--tag emacs-nt>. Notice that you do not give the C in the option, program will find it out from the configuration file after the tag name matches. The downloading stops at next tag of the C. That is, tag2 stops only at next tag2, or when upper level tag is found (tag1) or or until end of file. tag1: linux # All Linux downlods under this category tag2: sunsite tag2: another-name-for-this-spot # List of files to download from here tag2: ftp.funet.fi # List of files to download from here tag1: emacs-binary tag2: emacs-nt tag2: xemacs-nt tag2: emacs tag2: xemacs =item B Extract (unpack) file after download. See also option B<--unpack> and B<--no-extract> The archive file, say .tar.gz will be extracted the file in current download location. (see directive B) The unpack procedure checks the contents of the archive to see if the package is correctly formed. The de facto archive format is package-N.NN.tar.gz In the archive, all files are supposed to be stored under the proper subdirectory with version information: package-N.NN/doc/README package-N.NN/doc/INSTALL package-N.NN/src/Makefile package-N.NN/src/some-code.java C If the archive does not have a subdirectory for all files, a subdirectory is created and all items are unpacked under it. The default subdirectory name in constructed from the archive name with currect date stamp in format: package-YYYY.MMDD If the archive name contains something that looks like a version number, the created directory will be constructed from it, instead of current date. package-1.43.tar.gz => package-1.43 =item B Like directive B but extract the archive C, without checking content of the archive. If you know that it is ok for the archive not to include any subdirectories, use this option to suppress creation of an artificial root package-YYYY.MMDD. =item B This options tells to remove any previous unpack directory. Sometimes the files in the archive are all read-only and unpacking the archive second time, after some period of time, would display tar: package-3.9.5/.cvsignore: Could not create file: Permission denied tar: package-3.9.5/BUGS: Could not create file: Permission denied This is not a serious error, because the archive was already on disk and tar did not overwrite previous files. It might be good to inform the archive maintainer, that the files have wrong permissions. It is customary to expect that distributed packages have writable flag set for all files. =back =head1 ERRORS Here is list of possible error messages and how to deal with them. Turning on B<--debug> will help to understand how program has interpreted the configuration file or command line options. Pay close attention to the generated output, because it may reveal that a regexp for a site is too lose or too tight. =over 4 =item B This is "file not found error". You have written the filename incorrectly. Double check the configuration file's line. =back =head1 BUGS AND LIMITATIONS C: To download archive files from Sourceforge requires some trickery because of the redirections and load balancers the site uses. The Sourceforge page have also undergone many changes during their existence. Due to these changes there exists an ugly hack in the program to use wget(1) to get certain information from the site. This could have been implemented in pure Perl, but as of now the developer hasn't had time to remove the wget(1) dependency. No doubt, this is an ironic situation to use wget(1). You you have Perl skills, go ahead and look at UrlHttGet(). UrlHttGetWget() and sen patches. The program was initially designed to read options from one line. It is unfortunately not possible to change the program to read configuration file directives from multiple lines, e.g. by using backslashes (\) to indicate contuatinued line. =head1 ENVIRONMENT Variable C can point to the root configuration file. The configuration file is read at startup if it exists. export PWGET_CFG=$HOME/conf/pwget.conf # /bin/hash syntax setenv PWGET_CFG $HOME/conf/pwget.conf # /bin/csh syntax =head1 EXIT STATUS Not defined. =head1 DEPENDENCIES External utilities: wget(1) only needed for Sourceforge.net downloads see BUGS AND LIMITATIONS Non-core Perl modules from CPAN: LWP::UserAgent Net::FTP The following modules are loaded in run-time only if directive B is used. Otherwise these modules are not loaded: HTML::Parse HTML::TextFormat HTML::FormatText This module is loaded in run-time only if HTTPS scheme is used: Crypt::SSLeay =head1 SEE ALSO lwp-download(1) lwp-mirror(1) lwp-request(1) lwp-rget(1) wget(1) =head1 AUTHOR Jari Aalto =head1 LICENSE AND COPYRIGHT Copyright (C) 1996-2016 Jari Aalto This program is free software; you can redistribute and/or modify program under the terms of GNU General Public license either version 2 of the License, or (at your option) any later version. =cut sub Help (;$ $) { my $id = "$LIB.Help"; my $msg = shift; # optional arg, why are we here... my $type = shift; # optional arg, type if ( $type eq -html ) { pod2html $PROGRAM_NAME; } elsif ( $type eq -man ) { eval { require Pod::Man; 1 } or die "$id: Cannot generate Man: $EVAL_ERROR"; # Other option: name, section, release # my %options; $options{center} = 'Perl pwget URL fetch utility'; my $parser = Pod::Man->new(%options); $parser->parse_from_file ($PROGRAM_NAME); } else { system "pod2text $PROGRAM_NAME"; } exit 0; } # **************************************************************************** # # DESCRIPTION # # # # INPUT PARAMETERS # # $path Path name # $type -win32 convert to win32 path. # -cygwin convert to cygwin path. # # RETURN VALUES # # # # **************************************************************************** sub PathConvert ($;$) { my $id = "$LIB.PathConvertDosToCygwin"; local $ARG = shift; my $type = shift; my $ret = $ARG; if ( /^([a-z]):(.*)/i and $type eq -cygwin ) { my $dir = $1; my $path = $2; $ret = "/cygdrive/\L$dir\E$path"; $ret =~ s,\\,/,g; } $ret; } # **************************************************************************** # # DESCRIPTION # # Determine OS and convert path to correct Win32 environment. # Cygwin perl or to Win32 Activestate perl # # INPUT PARAMETERS # # # # RETURN VALUES # # # # **************************************************************************** sub PathConvertSmart ($) { my $id = "$LIB.PathConvertSmart"; local ($ARG) = @ARG; if ( $CYGWIN_PERL ) { # In win32, you could define environment variables as # C:\something\like, but that's not understood under cygwin. $ARG = PathConvert $ARG, -cygwin; } my $home = $ENV{HOME}; s,~,$home,; s,\Q$HOME,$home,; $ARG; } # **************************************************************************** # # DESCRIPTION # # Return version string # # INPUT PARAMETERS # # none # # RETURN VALUES # # string # # **************************************************************************** sub Version () { "$VERSION"; } sub VersionInfo () { Version() . " $AUTHOR $LICENSE $URL" } sub VersionPrint () { print( VersionInfo() . "\n"); exit 0; } # ************************************************************** &args ******* # # DESCRIPTION # # Read and interpret command line arguments # # INPUT PARAMETERS # # none # # RETURN VALUES # # none # # **************************************************************************** sub HandleCommandLineArgs () { # ............................................... local variables ... my $id = "$LIB.HandleCommandLineArgs"; # .......................................... command line options ... use vars qw # declare global variables ( $CFG_FILE_NEEDED $CHECK_NEWEST $CONTENT_REGEXP $DIR_DATE $EXTRACT $FIREWALL $LCD_CREATE $MIRROR $NO_EXTRACT $NO_LCD $NO_SAVE $OUT_DIR $OVERWRITE $POSTFIX $PREFIX $PREFIX_DATE $PREFIX_WWW $PROXY $PWGET_CFG $SITE_REGEXP $SKIP_VERSION $SLEEP_SECONDS $STDOUT $TAG_REGEXP $URL_REGEXP @CFG_FILE @TAG_LIST $debug $test $verb ); $CFG_FILE_NEEDED = 0; $FIREWALL = ""; $OVERWRITE = 0; # .................................................... read args ... Getopt::Long::config( qw ( no_ignore_case no_ignore_case_always )); my ( $version, $help, $helpHTML, $helpMan, $selfTest, $chdir ); GetOptions # Getopt::Long ( "A|regexp-content=s" => \$CONTENT_REGEXP , "chdir=s" => \$chdir , "c|config:s" => \@CFG_FILE , "C|create-paths" => \$LCD_CREATE , "dry-run" => \$test , "d|debug:i" => \$debug , "D|prefix-date" => \$PREFIX_DATE , "extract" => \$EXTRACT , "firewall=s" => \$FIREWALL , "help-html" => \$helpHTML , "help-man" => \$helpMan , "h|help" => \$help , "mirror=s" => \$MIRROR , "no-extract" => \$NO_EXTRACT , "no-lcd" => \$NO_LCD , "no-save" => \$NO_SAVE , "n|new" => \$CHECK_NEWEST , "output:s" => \$OUT_DIR , "overwrite" => \$OVERWRITE , "postfix:s" => \$POSTFIX , "prefix:s" => \$PREFIX , "proxy=s" => \$PROXY , "r|regexp=s" => \$SITE_REGEXP , "R|config-regexp=s" => \$URL_REGEXP , "selftest" => \$selfTest , "skip-version" => \$SKIP_VERSION , "sleep:i" => \$SLEEP_SECONDS , "stdout" => \$STDOUT , "t|tag=s" => \@TAG_LIST , "T|test" => \$test , "v|verbose:i" => \$verb , "V|version" => \$version , "W|prefix-www" => \$PREFIX_WWW ); if ( defined $debug ) { $debug = 1 unless $debug; } if ( defined $verb ) { $verb = 1 unless $verb; } $verb = 5 if $debug; # set verbose to 1, if debug is on. Set to full verbose if # debug is higher than 2. $debug and $verb == 0 and $verb = 1; $debug > 2 and $verb = 10; $version and VersionPrint(); $helpHTML and Help( undef, -html ); $helpMan and Help( undef, -man ); $help and Help(); $selfTest and SelfTest(); $NO_LCD = 0 unless defined $NO_LCD; $NO_SAVE = 0 unless defined $NO_SAVE; $NO_EXTRACT = 0 unless defined $NO_EXTRACT; if ( $chdir ) { unless ( chdir $chdir ) { die "$id: CHDIR [$chdir] fail. $ERRNO"; } } if ( defined $URL_REGEXP or @TAG_LIST ) { $CFG_FILE_NEEDED = -yes; } if ( defined $URL_REGEXP and @TAG_LIST ) { die "You can't use both --tag and --regexp options."; } if ( defined $PROXY ) { $ARG = $PROXY; if ( not m,^http://, ) { $debug and print "$id: Adding http:// to proxy $PROXY\n"; $ARG = "http://" . $ARG; } if ( not m,/$, ) { $debug and print "$id: Adding trailing / to proxy $PROXY\n"; $ARG .= "/"; } $PROXY = $ARG; $debug and print "$id: PROXY $PROXY\n"; } if ( @TAG_LIST ) { # -s -t -n tag --> whoopos.... if ( grep /^-/ , @TAG_LIST ) { die "$id: --tag option argument was an option: @TAG_LIST\n"; } $TAG_REGEXP = '\btag(\d+):\s*(\S+)'; } if ( not @CFG_FILE and ( @TAG_LIST or $URL_REGEXP ) ) { unless ( defined $PWGET_CFG ) { die "$id: No environment variable PWGET_CFG defined. " , "Need --config FILE where to search." ; } my $file = PathConvertSmart $PWGET_CFG; unless ( -r $file ) { die "$id: PWGET_CFG is not readable [$file]" . " for regexp match 'URL_REGEXP'"; } $verb and print "$id: Using default config file $file\n"; push @CFG_FILE, $file; } $debug and @CFG_FILE and print "$id: Config file [@CFG_FILE]\n"; # Do not remove this comment, it is for Emacs font-lock-mode # to handle hairy perl fontification right. #font-lock * s/*/ } # **************************************************************************** # # DESCRIPTION # # Convert month names to numbers in string. # # INPUT PARAMETERS # # $str Like "2005-February.txt" # # RETURN VALUES # # $ Like "2005-02.txt" is no changes. # # **************************************************************************** sub MonthToNumber ($) { local $ARG = shift; my %hash = ( 'Jan(uary)?' => '01' , 'Feb(ruary)?' => '02' , 'Mar(ch)?' => '03' , 'Apr(il)?' => '04' , 'May' => '05' , 'Jun(e)?' => '06' , 'Jul(y)?' => '07' , 'Aug(ust)?' => '08' , 'Oct(ober)?' => '09' , 'Sep(tember)?' => '10' , 'Nov(ember)?' => '11' , 'Dec(ember)?' => '12' ); while ( my($re, $month) = each %hash ) { s/$re/$month/i; } $ARG; } # **************************************************************************** # # DESCRIPTION # # Find out the temporary directory # # INPUT PARAMETERS # # none # # RETURN VALUES # # $ temporary directory # # **************************************************************************** sub TempDir () { my $id = "$LIB.TempDir"; local $ARG; if ( defined $TEMPDIR and -d $TEMPDIR) { $ARG = $TEMPDIR; } elsif ( defined $TEMP and -d $TEMP) { $ARG = $TEMP; } elsif ( -d "/tmp" ) { $ARG = "/tmp"; } elsif ( -d "c:/temp" ) { $ARG = "c:/temp" } elsif ( -d "$HOME/temp" ) { $verb and print "$id: [WARNING] using HOME/tmp, make sure you have disk space"; $ARG = "$HOME/temp"; } else { die "$id: Can't find temporary directory. Please set TEMPDIR." } if ( $ARG and not -d ) { die "$id: Temporary directory found is invalid: [$ARG]"; } s,[\\/]$,,; # Delete trailing slash s,\\,/,g; # Unix slashes in this Perl code $debug and print "$id: $ARG\n"; $ARG; } # **************************************************************************** # # DESCRIPTION # # Remove duplicate entries from list. Empty values are removed too. # # INPUT PARAMETERS # # @list # # RETURN VALUES # # @list # # **************************************************************************** sub ListRemoveDuplicates (@) { my $id = "$LIB.FilterDuplicates"; my @list = @ARG; $debug and print "$id: [@list]\n"; if ( @list ) { my %hash; @hash{ @list }++; @list = grep /\S/, keys %hash; } @list; } # **************************************************************************** # # DESCRIPTION # # Return temporary process file # # INPUT PARAMETERS # # none # # RETURN VALUES # # $ temporary filename # # **************************************************************************** sub TempFile () { my $id = "$LIB.TempFile"; my $ret = TempDir() . basename($PROGRAM_NAME) . "-" . $PROCESS_ID; $debug and print "$id: $ret\n"; $ret; } # **************************************************************************** # # DESCRIPTION # # Write file to stdout # # INPUT PARAMETERS # # $file # # RETURN VALUES # # none # # **************************************************************************** sub Stdout ( $ ) { my $id = "$LIB.Stdout"; my($file) = @ARG; local *FILE; unless ( open FILE, "< $file" ) { warn "$id: Can't STDOUT $file $ERRNO"; } else { print ; close FILE; } } # **************************************************************************** # # DESCRIPTION # # Fix the filename to correct OS version ( win32 /Cygwin / DOS ) # This is needed when calling external programs that take file arguments. # # INPUT PARAMETERS # # $file # # RETURN VALUES # # $file Converted file # # **************************************************************************** sub MakeOSfile ( $ ) { my $id = "$LIB.MakeOSfile"; local($ARG) = @ARG; if ( $WIN32 ) { if ( defined $SHELL ) { $debug and print "$id: SHELL = $SHELL\n"; if ( $SHELL =~ /sh/i ) # bash.exe { # This is Win32/Cygwin, which needs c:/ --> /cygdrive/c/ if ( /^(.):(.*)/ ) #font s/ { $ARG = "/cygdrive/$1/$2"; s,\\,/,g; s,//,/,g; } } } else { s,/,\\,g; # Win32 likes backslashes more } } $debug and print "$id: $ARG\n"; $ARG; } # **************************************************************************** # # DESCRIPTION # # Return ISO 8601 date YYYY-MM-DD # # INPUT PARAMETERS # # $format [optional] If "-version", return in format YYYY.MMDD # # RETURN VALUES # # $str Date string # # **************************************************************************** sub DateYYYY_MM_DD (; $) { my $id = "$LIB.DateYYYY_MM_DD"; my ($format) = @ARG; my (@time) = localtime(time); my $YY = 1900 + $time[5]; my ($DD, $MM) = @time[3,4]; # my ($mm, $hh) = @time[1,2]; $debug > 3 and print "$id: @time\n"; # Month(MM) counts from zero my $ret; if ( defined $format and $format eq -version ) { $ret = sprintf "%d.%02d%02d", $YY, $MM + 1, $DD; } else { $ret = sprintf "%d-%02d-%02d", $YY, $MM + 1, $DD; } $debug > 3 and print "$id: RET $ret\n"; $ret; } # **************************************************************************** # # DESCRIPTION # # Print variables in hash # # INPUT PARAMETERS # # $name name of the hash # %hash content of hash # # RETURN VALUES # # none # # **************************************************************************** sub PrintHash ( $ % ) { my $id = "$LIB.PrintHash"; my ($name, %hash ) = @ARG; print "$id: hash [$name] contents\n"; for my $key ( sort keys %hash ) { my $val = $hash{ $key }; printf "%-20s = %s\n", $key, $val; } } # **************************************************************************** # # DESCRIPTION # # Remove duplicates. # # INPUT PARAMETERS # # @ List of values. # # RETURN VALUES # # @ List of values. # # **************************************************************************** sub ListUnique ( @ ) { my $id = "$LIB.ListUnique"; $debug > 2 and print "$id: INPUT\n", join("\n", @ARG), "\n"; my %hash; local $ARG; # does no longer work in latest Perl: # @hash{ @ARG }++; for (@ARG) { $hash{$ARG} = 1; } my @ret = sort keys %hash; $debug > 1 and print "$id: RET\n", join("\n", @ret), "\n"; @ret; } # **************************************************************************** # # DESCRIPTION # # Print download progress # # INPUT PARAMETERS # # $url Site from where to download # $prefix String to print # $index current count # $total total # # RETURN VALUES # # string indicator message # # **************************************************************************** { my %staticDone; sub DownloadProgress ($$ $$ $) { my $id = "$LIB.DownloadProgress"; my ( $site, $url, $prefix, $index, $total ) = @ARG; if ( $verb ) { if ( $total > 1 ) { sprintf $prefix . " %3d%% (%2d/%d) " , int ( $index * 100 / $total ) , $index , $total ; } else { unless ( exists $staticDone{$site} ) { $staticDone{$site} = 1; $prefix; } } } }} # **************************************************************************** # # DESCRIPTION # # Expand variable by substituting any Environment variables in it. # # INPUT PARAMETERS # # $string Path information, like $HOME/.example # $str Original line; full string from configuration file. # # RETURN VALUES # # string Expanded path. # # **************************************************************************** sub ExpandVars ($; $) { my $id = "$LIB.ExpandVars"; local $ARG = shift; my $origline = shift; $debug > 2 and print "$id: input $ARG [$origline]\n"; return $ARG unless /\$[a-z]/i; # nothing to do my $orig = $ARG; # We must substitute environment variables so that the # longest are handled first. An example of the problem using # variable: $FTP_DIR_THIS/here. # # FTP_DIR = one # FTP_DIR_THIS = two # # --> one_THIS/here my @keys = sort { length $b <=> length $a } keys %ENV; my $value; for my $key ( @keys ) { next unless $key =~ /[a-z]/i; # ignore odd "__" env vars $value = $ENV{$key}; if ( /$key/ and $value ne "" ) #font s/ { $debug > 2 and print "$id $ARG substituting key $key => $value\n"; s/\$$key/$value/; #font s/; } } # The env variables may contain leading slashes, get rid of them. # Or there may be "doubles"; fix them. # # [$ENV = /dir/ ] # $ENV/path --> /dir//path # s,//+,/, unless /(http|ftp):/i; $debug > 2 and print "$id: after loop $orig ==> $ARG\n"; if ( /\$/ ) { # PrintHash "ENV", %ENV; die "$id: [ERROR]. Check environment. Expansion did not " , "find variable(s): $orig\n"; } # Convert to Unix paths s,\\,/,g; $ARG; } # **************************************************************************** # # DESCRIPTION # # Evaluate perl code and return result. # # INPUT PARAMETERS # # $url text to put variable $url # $text The text to be placed to variable $ARG # $code Perl code to manipulate $ARG # $flag non-empty: Do not return empty values, if the perl # code didn]t set ARG at all, then return original TEXT # # RETURN VALUES # # $text # # **************************************************************************** sub EvalCode ($ $ ; $) { my $id = "$LIB.EvalCode"; my ($url, $text, $code, $flag ) = @ARG; my $ret = $text; # Variable $url is seen to CODE if it wants to use it $debug and print "$id: ARG $ARG EVAL $code\n"; return $ret unless $code; # Wrap this inside private block, so that user defined # $code can execute in safe environment. E.g he can # define his own variables and they willnot affect the program # afterwards. { local $ARG = $text; eval $code; if ( $EVAL_ERROR ) { warn "$id: eval-fail ARG [$ARG] CODE [$code] $EVAL_ERROR"; $ARG = $text; } if ( not $ARG and $flag ) { $debug and print "$id: ARG [$ARG] is empty [$code]\n"; $ARG = $text; } elsif ( $ARG ) { $ret = $ARG; } } $debug and print "$id: RET $ret\n"; $ret; } # **************************************************************************** # # DESCRIPTION # # Check sourceforge special. The sourceofrge site does not publish # direct download links. Check if this is one of those. And example: # # INPUT PARAMETERS # # string # # RETURN VALUES # # 0 No # 1 Yes # # **************************************************************************** sub IsSourceforgeDownload ($) { my $id = "$LIB.IsSourceforgeDownload"; local $ARG = shift; m,(?:sourceforge|sf)\.net.*/download$,; } # **************************************************************************** # # DESCRIPTION # # Check if HTML::Parse and HTML::FormatText libraries are available # # INPUT PARAMETERS # # none # # RETURN VALUES # # 0 Error # 1 Ok, support present # # **************************************************************************** sub IsLibHTML () { my $id = "$LIB.IsLibHTML"; my $error = 0; $EVAL_ERROR = ''; local *LoadLib = sub ($) { my $lib = shift; eval "use $lib"; if ( $EVAL_ERROR ) { warn "$id: $lib not available [$EVAL_ERROR]\n"; $error++; } }; LoadLib( "HTML::Parse"); LoadLib( "HTML::FormatText"); return 0 if $error; 1; } # **************************************************************************** # # DESCRIPTION # # convert html into ascii # # INPUT PARAMETERS # # @lines # # RETURN VALUES # # @txt # # **************************************************************************** { my $staticLibCheck = 0; my $staticLibOk = 0; sub Html2txt (@) { my $id = "$LIB.Html2txt"; my (@list) = @ARG; my @ret = @list; unless ( $staticLibCheck ) { $staticLibOk = IsLibHTML(); $staticLibCheck = 1; unless ( $staticLibOk ) { warn "$id: No HTML to TEXT conversion available."; } } # Library was not found, nothing to do return unless $staticLibOk; if ( not @list ) { $verb and print "$id: Empty content"; } elsif ( $staticLibCheck ) { my $formatter = new HTML::FormatText ( leftmargin => 0, rightmargin => 76); # my $parser = HTML::Parser->new(); # $parser->parse( join '', @list ); # $parser-eof(); # $HTML::Parse::WARN = 1; my $html = parse_html( join '', @list ); $verb and print "$id: Making conversion\n"; @ret = ( $formatter->format($html) ); $html->delete(); # mandatory to free memory } @ret; }} # **************************************************************************** # # DESCRIPTION # # Return File content # # INPUT PARAMETERS # # $file # $join [optional] If set, then the file is read as one big # string. The value is the first argument in return array. # # RETURN VALUES # # (\@lines, $status) # # **************************************************************************** sub FileRead ( $; $ ) { my $id = "$LIB.FileRead"; my $file = shift; my $join = shift; my $status = 0; my @ret; local *FILE; # Convert path to Unix format $file =~ s,\\,/,g; if ( $file =~ /^~/ ) { # must expand ~/dir and ~user/dir contructs with shell my $expanded = qx(echo $file); $debug and print "$id: EXPANDED by shell: $file => $expanded\n"; $file = $expanded; } unless ( open FILE, "< $file" ) { $status = $ERRNO; warn "$id: FILE [$file] $ERRNO"; } else { if ( $join ) { $ret[0] = join '', ; } else { @ret = ; } close FILE; } $debug and print "$id: [$file] status [$status]\n"; \@ret, $status; } # **************************************************************************** # # DESCRIPTION # # Write File content # # INPUT PARAMETERS # # $file # @lines # # RETURN VALUES # # $status true, ERROR # # **************************************************************************** sub FileWrite ( $ @) { my $id = "$LIB.FileWrite"; my ($file, @lines ) = @ARG; my $status = 0; local *FILE; unless ( open FILE, "> $file" ) { $status = $ERRNO; warn "$id: FILE [$file] $ERRNO"; } else { print FILE @lines; close FILE; } $debug and print "$id: [$file] status [$status]\n"; $status; } # **************************************************************************** # # DESCRIPTION # # Convert HTML file to text # # INPUT PARAMETERS # # $file # # RETURN VALUES # # $status # # **************************************************************************** sub FileHtml2txt ($) { my $id = "$LIB.FileHtml2txt"; my $file = shift; my( $lineArrRef, $status ) = FileRead $file; if ( $status ) { $debug and print "$id: Can't convert\n"; } else { my @text = Html2txt @$lineArrRef; $status = FileWrite $file, @text; } $debug and print "$id: [$file] status [$status]\n"; $status; } # **************************************************************************** # # DESCRIPTION # # analyze FILE to match to match content. If content is not found, # delete file. # # INPUT PARAMETERS # # $file # $regexp [optional] content match regexp. # # RETURN VALUES # # $status True if file is okay # # **************************************************************************** sub FileContentAnalyze ($ $) { my $id = "$LIB.FileContentAnalyze"; my $file = shift; my $re = shift; return -noregexp unless $re; my( $lineArrRef, $status ) = FileRead $file, '-join'; my $ret; if ( $status ) { $debug and print "$id: Can't Analyze $file\n"; } else { local $ARG = $$lineArrRef[0]; my $match = $MATCH if /$re/; if ( $match ) { $ret = -matched; } else { unless ( $test ) { $debug > 1 and print "$id: [$re] content not found, deleting $file\n"; unless ( unlink $file ) { $verb and print "$id: $ERRNO"; } } } $debug and print "$id: [$file] status [$ret] RE [$re]=[$match]\n"; } $ret; } # **************************************************************************** # # DESCRIPTION # # Append slash to the end. Optionally remove # # INPUT PARAMETERS # # $path Add slash to path # $flag [optional] Remove slash # # RETURN VALUES # # $path # # **************************************************************************** sub Slash ($; $) { my $id = "$LIB.Slash"; local $ARG = shift; my $remove = shift; if ( $remove ) { s,/$,,; } { $ARG .= '/' unless m,/$,; } $ARG; } # **************************************************************************** # # DESCRIPTION # # Split url to components. http://some.com/1/2page.html would be seen as # # http some.com 1/2 page.html # # INPUT PARAMETERS # # $url # # RETURN VALUES # # @list Component list # # **************************************************************************** sub SplitUrl ($) { my $id = "$LIB.SplitUrl"; local $ARG = shift; my($protocol, $site, $dir, $file ) = ("") x 4; $protocol = lc $1 if m,^([a-z][a-z]+):/,i; $site = lc $1 if m,://?([^/]+),i; $dir = lc $1 if m,://?[^/]+(/.*/),i; $file = lc $1 if m,^.*/(.*),i; if ( $file and $file !~ /[.]/ ) { $debug and print "$id: [WARNING] ambiguous [$ARG], dir or file?\n"; unless ( $dir ) { $dir = $file; $file = ""; } } $debug and print "$id:" , " PROTOCOL <$protocol>" , " SITE <$site>" , " DIR <$dir>" , " FILE <$file>" , "\n" ; $protocol, $site, $dir, $file; } # **************************************************************************** # # DESCRIPTION # # Return basename from URL. This drops the possible # filename from the end. The extra file is dtected # from the file extension, perriod(.) # # http://some.com/~foo ok # http://some.com/foo ok treated as directory # http://some.com/page.html nok # # INPUT PARAMETERS # # $base # # RETURN VALUES # # $string The url will not contain trailing slash # # **************************************************************************** sub BaseUrl ($) { my $id = "$LIB.BaseUrl"; local $ARG = shift; if ( m,/~[^/]+$, ) { $debug and print "$id: ~foo\n"; # ok } elsif ( m,^(.*/)([^/]+)$, ) { my ( $base, $rest ) = ( $1, $2 ); $debug and print "$id: [$base] [$rest]\n"; $ARG = $base if $rest =~ /[.]/; } s,/$,,; $ARG; } # **************************************************************************** # # DESCRIPTION # # Return basename of the archive. # # file.tar,gz => file # file-1.2.tar.gz => file-1.2 # file-1_2.tar.gz => file-1.2 # # INPUT PARAMETERS # # $file # # RETURN VALUES # # $string # # **************************************************************************** sub BaseArchive ($) { my $id = "$LIB.BaseArchive"; local $ARG = shift; if ( /^(.*-\d+[-_.]\d+[-_.\d]*)/ ) { # delete last trailing - or . or _ ( $ARG = $1 ) =~ s/[-_.]$//; } elsif ( /^(.*)\.[a-z].*$/ ) { # some-archive.bz2 --> some-archive # some-archive.zip --> some-archive $ARG = $1; } s/_/-/g; $ARG; } # **************************************************************************** # # DESCRIPTION # # Try to make sense of relative paths when Base is known. # This function is very simplistic. # # INPUT PARAMETERS # # $base # $relative # # RETURN VALUES # # $path # # **************************************************************************** sub RelativePath ($ $) { my $id = "$LIB.RelativePath"; my $base = shift; local $ARG = shift; $base = Slash $base; my $ret = $base; unless ( $ARG ) { $debug and print "$id: second arg is empty [$base]"; } else { if ( m,^/.*, ) # /root/somewhere/file.txt { my ($proto, $site, $dir, $file) = SplitUrl $base; $ret = "$proto://$site$ARG"; } elsif ( m,^\./(.*), ) # ./somewhere/file.txt { $ret = $base . $1; } elsif ( m,^[^/\\#?=], ) # this/path/file.txt { $ret = $base . $ARG; } else { chomp; # make warn display line number, remove \n warn "$id: [ERROR] Can't resolve relative $base + $ARG"; } } $debug and print "$id: BASE $base ARG $ARG RET $ret\n"; $ret; } # **************************************************************************** # # DESCRIPTION # # Return decompress command for file. # # INPUT PARAMETERS # # $file # $type -list return listng command # -extract return unpack command # # RETURN VALUES # # lines as listed in file # # **************************************************************************** sub FileDeCompressedCmd ($; $) { my $id = "$LIB.FileDeCompressedCmd"; local $ARG = shift; my $type = shift; $type = '-extract' unless defined $type; my $opt = "--decompress --stdout"; my $cmd; my $decompress; if ( /\.rar$/ ) { die "$id: $ARG Can't handle. No free rar uncompress program exists."; } if ( $type eq -extract ) { if ( /\.(tar|tgz)(.*)/ ) { my $ext = $2; /\.(gz|tgz)$/i and $decompress = "gzip $opt"; /\.(bzip|bz2)$/i and $decompress = "bzip2 $opt"; /\.(xz)$/i and $decompress = "xz $opt"; $cmd = "$decompress $ARG | tar xvf -"; if ( length $ext and not $decompress) { warn "[WARN] Unknown compress extension: $ext ($ARG)"; $cmd = ""; } } elsif ( /\.gz$/ ) { $cmd = "gzip -f -d $ARG"; } elsif ( /\.(bz2|bzip)$/ ) { $cmd = "bzip2 -f -d $ARG"; } elsif ( /\.zip$/ ) { $cmd = $decompress = "unzip -o $ARG"; } } else { if ( /tar/ ) { SWITCH: { /\.(gz|tgz)$/i and $decompress = "gzip $opt", last; /\.(bzip|bz2)$/i and $decompress = "bzip2 $opt", last; /\.(xz)$/i and $decompress = "xz $opt", last; } if ( defined $decompress ) { $cmd = "$decompress $ARG | tar tvf -"; } else { $cmd = "tar tvf $ARG"; } } elsif ( /\.zip$/ ) { $cmd = "unzip -l $ARG"; } } $debug and print "$id:\n\tARG = $ARG\n" , "\tTYPE $type\n" , "\tRET [$cmd]\n" ; $cmd; } # **************************************************************************** # # DESCRIPTION # # Return decompress file listing # # INPUT PARAMETERS # # $file # # RETURN VALUES # # \@files Files from the archive # $error If "-noarchive" , then file is not an archive. # # **************************************************************************** sub FileDeCompressedListing ( $ ) { my $id = "$LIB.FileDeCompressedListing"; my $file = shift; $debug and print "$id: BEGIN $file CWD ", cwd(), "\n"; my ($cmd, @result, $status); if ( -f $file ) { $cmd = FileDeCompressedCmd $file, -list ; $debug and print "$id: running [$cmd] CWD ", cwd(), "\n"; $ERRNO = 0; @result = qx($cmd) if $cmd; if ( $ERRNO ) { $verb and print "$id: [WARN] $ERRNO\n"; $status = -externalerror; } $debug and print "$id: CMD [$cmd] => \n[@result]\n"; } else { $verb and warn "[ERROR] file not found ", cwd(), "$file"; $status = -file-not-found; } my @ret = (); local $ARG; if ( $status ) { # Nothing to do here. Error } elsif ( $file =~ /tar/ ) { # Get last elements in the line # # .. 0 2000-11-18 16:18 semantic-1.3.2 # .. 23688 2000-11-18 16:18 semantic-1.3.2/semantic-bnf.el # .. 50396 2000-11-18 16:18 semantic-1.3.2/semantic.el # .. 36176 2000-11-18 16:18 semantic-1.3.2/semantic-util.el # # This comment fixes Emacs fontification: m/// for ( @result ) { my $file = (reverse split)[0]; chomp $file; push @ret, (reverse split)[0]; } $debug and print "$id: TAR [@result]\n"; } elsif ( $file =~ /zip/ ) { # Length Date Time Name # ------ ---- ---- ---- # 4971 03-22-00 21:14 1/gnus-ml.el # 0 10-03-99 01:33 tmp/1/tpu/ # ------ ------- # 25036 8 files for ( @result ) { my @split = reverse split; chomp $split[0]; next unless /^\s+\d+\s/; push @ret, $split[0] if @split == 4; } $debug and print "$id: ZIP [@result]\n"; } else { $debug and print "$id: -noarchive $file\n"; $status = -noarchive; } $debug and print "$id: RET $file status [$status] [@ret]\n"; \@ret, $status; } # **************************************************************************** # # DESCRIPTION # # Return the subdirectory where the files are in compressed archive. # There may not be any directory or there may be several direcotries # that do not share one ROOT directory. # # INPUT PARAMETERS # # $file # # RETURN VALUES # # $dir The topmost COMMON root directory. If not all files # have common root, return nothing. # # $status -noarchive The file was not archive. # \@file reference to file list # # **************************************************************************** sub FileDeCompressedRootDir ( $ ) { my $id = "$LIB.FileDeCompressedRootDir"; my $file = shift; $debug and print "$id: BEGIN $file CWD ", cwd(), "\n"; my ( $fileArrRef, $status ) = FileDeCompressedListing $file; # If there is directory it must be in front of every file local $ARG; my %seen; my @nodir; for ( @$fileArrRef ) { if ( m,^([^/]+)/, ) { # Do not accept "./" directory $seen{ $1 } = 1 if $1 ne "."; } else { push @nodir, $ARG; } } my @roots = keys %seen; my $ret; if ( @roots ) { my $root = $roots[0]; if ( @roots == 1 and @nodir == 1 and $root eq $nodir[0]) { # Special case. The directory itself is always "alone" entry # drwxrwxrwx foo/users 0 2006-07-22 14:18 package-0.5.6 $ret = $root } elsif ( @roots == 1 and @nodir == 0 ) { $ret = $root; } } $debug and print "$id: RET [$ret] status [$status]; " . "roots [@roots] no-dirs [@nodir]\n"; $ret, $status, $fileArrRef ; } # **************************************************************************** # # DESCRIPTION # # If archive does not have root directory, return the # filename which is bet used for archive root dir. # # package.tar.gz --> package-YYYY.MMDD # # INPUT PARAMETERS # # $file # # RETURN VALUES # # $root Returned, If archive does not have natural ROOT # # **************************************************************************** sub FileRootDirNeedeed ( $ ) { my $id = "$LIB.FileRootDirNeedeed"; my $file = shift; $debug and print "$id: BEGIN $file CWD ", cwd(), "\n"; my ($root, $status, $fileArrRef) = FileDeCompressedRootDir $file; local $ARG; if ( $status eq -noarchive ) # Single file.txt.gz, not package { $debug and print "$id: -noarchive $file\n"; $ARG = ''; } elsif ( @$fileArrRef == 0 ) { $debug and print "$id: EMPTY $file\n"; $ARG = ''; } elsif ( @$fileArrRef == 1 ) { $debug and print "$id: SINGLE FILE $file\n"; $ARG = ''; } elsif ( $root eq '' ) { $ARG = basename $file; my $base = BaseArchive $ARG; # If there is no numbers left, assume that we got barebones # and not name like "package-1.11". Add date postfix unless ( $base =~ /\d/ ) { $base .= "-" . DateYYYY_MM_DD(-version); } $ARG = $base; } $debug and print "$id: $file --> need dir [$ARG]\n"; $ARG; } # **************************************************************************** # # DESCRIPTION # # Create root directory if it is necessary in order to unpack # the file. If the archive does not contain ROOT, contruct one # from the filename and current date. # # If directory was created or it already exists, return full path # # INPUT PARAMETERS # # $file # $path Under this directory the creation # $opt -rm Delete previous unpack directory # # RETURN VALUES # # $path If directory was created # # **************************************************************************** sub FileRootDirCreate ( $ $; $ ) { my $id = "$LIB.FileRootDirCreate"; my ($file, $path, $opt) = @ARG; not defined $opt and $opt = ''; $debug and print "$id: BEGIN $file PATH $path\n"; local $ARG = FileRootDirNeedeed $file; my $ret = ''; if ( $ARG ) { $ARG = "$path/$ARG"; if ( -e ) { $verb and print "$id: Unpack dir already exists $ARG\n"; if ( $opt =~ /rm/i ) { $verb and print "$id: deleting old unpack dir\n"; unless ( rmtree($ARG, $debug) ) { warn "$id: Could not rmtree() $ARG\n"; } } } unless ( -e ) { unless ( $test ) { mkpath( $ARG ) or die "$id: mkdir() fail $ARG $ERRNO"; $verb and warn "$id: [WARNING] archive $file" , " has no root-N.NN directory." , " Report this to archive maintainer. CREATED $ARG" , "\n" ; } } $ret = $ARG; } $debug and print "$id:\n\tFILE $file\n" , "\tPATH $path\n" , "\tRET --> created [$ret]\n" ; $ret; } # **************************************************************************** # # DESCRIPTION # # Unpack list of files recursively (If package contains more # archives) # # INPUT PARAMETERS # # \@array List of files # \%hash Unpack command hash table: REGEXP => COMMAND, where # %s is substituted for filename # $check "-noroot", will not check the archive content # $opt "-rm", will remove any existing unpack dir # # RETURN VALUES # # none # # **************************************************************************** sub Unpack($ $; $ $); # must be introdced due to recursion sub Unpack ($ $; $ $) { my $id = "$LIB.Unpack"; my ( $filesArray, $cmdHash, $check, $opt ) = @ARG; $check = 1 if not defined $check; $check = 0 if $check eq -noroot; $opt = '' if not defined $opt; local $ARG; my $origCwd = cwd(); $debug and print "$id: entry cwd = $origCwd OPT [$opt]\n"; my @array = sort { length $b <=> length $a } keys %$cmdHash; $debug and print "$id: SORTED decode array @array\n"; for ( @$filesArray ) { $debug and warn "$id: unpacking $ARG\n"; if ( -d ) { $verb and print "$id: $ARG is directory, skipped.\n"; next; } elsif ( not -f ) { $verb and print "$id: $ARG is not a file (not exist), skipped.\n"; next; } # The filename may look lik test/tar.gz my $gocwd = dirname($ARG) || '.' ; chdir $gocwd or die "$id: [for] Can't chdir [$gocwd] $ERRNO"; # Check only archives that do not contains some kind # of numbering for missing ROOT directories. my $cwd = cwd(); my $chdir = 0; my $file = basename $ARG; # ............................................ check ... # Must contain root directory in archive # We check every archive. Regexp \d would have # skipped names looking like package-1.34.tar.gz if ( $check ) # and not /-[\d]/ ) #font s/ { $debug and print "##\n"; my $newDir = FileRootDirCreate basename($ARG), $cwd, $opt; $debug and print "## $newDir\n"; if ( $newDir ) { $debug and print "$id: cd newdir $newDir\n"; unless ( chdir $newDir ) { print "$id: [ERROR] chdir $newDir $ERRNO\n"; next; } $file = "../$file"; $chdir = 1; } } # ........................................... unpack ... $debug and print ">>\n"; my $cmd = FileDeCompressedCmd $file; $debug and print "$id: unpacking CWD ", cwd(), " [$cmd]\n"; my @response = qx($cmd) unless $test; print "@response\n" if $verb; # ........................................ recursive ... for my $entry ( @response ) # Make this recursive { local $ARG = $entry; chomp; if ( /\.(bz2|gz|z|zip)$/i ) # s/ { $debug and print "$id: >> RESCURSIVE [$ARG]\n"; Unpack( [ $ARG ], $cmdHash, -noroot, $opt ); } } chdir $cwd if $chdir; # Go back to original dir } $debug and print "EXIT chdir $origCwd\n"; chdir $origCwd or die "$id: [exit] Can't chdir [$origCwd] $ERRNO"; } # **************************************************************************** # # DESCRIPTION # # Read directory content # # INPUT PARAMETERS # # $path # # RETURN VALUES # # @ list of files # # **************************************************************************** sub DirContent ($) { my $id = "$LIB.DirContent"; my ( $path ) = @ARG; $debug and warn "$id: $path\n"; local *DIR; unless ( opendir DIR, $path ) { print "$id: can't read $path $ERRNO\n"; next; } my @tmp = readdir DIR; closedir DIR; $debug > 1 and warn "$id: @tmp"; @tmp; } # **************************************************************************** # # DESCRIPTION # # Scan until valid tag line shows up. Return line if it is under the # TAG # # INPUT PARAMETERS # # $line line content # $tag Tag name to look for # $reset If set, do nothing but reset state variables. # You should call with this if you start a new round. # # RETURN VALUES # # ($LINE, $stop) The $LINE is non-empty if it belongs to the TAG. # The $stop flag is non-zero if TAG has ended. # # **************************************************************************** { my ( $staticTagLevel , $staticTagName , $staticTagFound ); sub TagHandle ($$ ; $) { my $id = "$LIB.TagHandle"; local $ARG = shift; my ( $tag , $reset) = @ARG; my $info = ( $verb > 1 || $debug > 1 ) ? 1 : 0; # ........................................................ reset ... if ( $reset ) { $debug > 1 and print "$id: RESET\n"; $staticTagLevel = $staticTagName = $staticTagFound = ""; return $ARG; } # ...................................................... tag ... my $stop; # The line may have multiple tags and the $1 is number, second # is the tag name. However we can't put them in that order # to the hash, because the number is "key". Use reverse here # # tag2: A tag2: B # # 2 => A # 2 => B # | # The key, only last would be in hash my %choices = reverse /$TAG_REGEXP/go; if( $info and keys %choices > 0 ) { print "$id: TAG CHOICES: ", join( ' ', %choices), "\n" } unless ( $staticTagFound ) { while ( my($tagN, $tagNbr) = each %choices ) { if ( $info and $tagNbr ) { print "$id: [$tagNbr] '$tagN' eq '$tag'\n"; } if ( $tagNbr and $tagN eq $tag ) { $staticTagLevel = $tagNbr; $staticTagName = $tagN; $staticTagFound = 1; $debug > 0 and warn "$id: TAG FOUND [$staticTagName] $ARG\n" } } $ARG = "" unless $staticTagFound; # Read until TAG } else { # We're reading lines after the tag was found. # Terminate on next found tag name while ( my($tagN, $tagNbr) = each %choices ) { if ( $tagNbr and $tagNbr <= $staticTagLevel ) { $info and print "$id: End at [$staticTagName] $ARG\n"; $stop = 1; } } } $debug > 1 and print "$id: RETURN [$ARG] stop [$stop]\n"; $ARG, $stop; }} # **************************************************************************** # # DESCRIPTION # # Handle Local directory change and die if can't checnge to # directory. # # INPUT PARAMETERS # # $dir Where to chdir # $make [optional] Flag, if allowed to create directory # # RETURN VALUES # # none # # **************************************************************************** sub DirectiveLcd (%) { my $id = "$LIB.DirectiveLcd"; my %arg = @ARG; my $dir = $arg{-dir} or die "$id: No DIR argument"; my $mkdir = $arg{-mkdir} || 0; $verb > 2 and PrintHash "$id: input", %arg; my $lcd = ExpandVars $dir; $verb > 2 and print "$id: LCD original $lcd\n"; $lcd = PathConvertSmart $lcd; $verb > 2 and print "$id: LCD converted $lcd\n"; my $isDir = -d $lcd ? 1 : 0 ; unless ( $isDir ) { my $TEST = "[test mode] " if $test; $verb and print STDERR "$id: ${TEST}Creating directory $lcd\n"; not $mkdir and die "$id: [$dir] => lcd [$lcd] is not a directory"; unless ($test) { mkpath($lcd, $verb) or die "$id: mkpath $lcd failed $ERRNO"; } } if ( -d $lcd ) { $debug > 2 and print "$id: chdir $lcd\n"; chdir $lcd or die "$id: chdir $lcd $ERRNO"; } } # **************************************************************************** # # DESCRIPTION # # Examine list of files and return the newest file that match FILE # the idea is that we divide the filename into 3 parts # # PREFIX VERSION REST # # So that for example filename # # emacs-20.3.5.1-lisp.tar.gz # # is exploded to parts # # emacs -20.3.5.1- lisp.tar.gz # # After this, the VERSION part is examined and all the numbers from # it are read and converted to zero filled keys, so that sorting # between versions is possible: # # (20 3 5 1) --> "0020.0003.0005.0001" # # A hash table for each file is build according to this version key # # VERSION-KEY => FILE-NAME # # When we finally sort the has by key, we get the latest version number # and the associated file. # # INPUT PARAMETERS # # $file File to use as base. # \@files List of files to compare. # # RETURN VALUES # # $file File that is newest, based on version number. # # **************************************************************************** sub LatestVersion ( $ $ ) { my $id = "$LIB.LatestVersion"; my ( $file , $array ) = @ARG; $debug > 1 and print "$id: INPUT file [$file] ARRAY =>\n", join ("\n", @$array), "\n"; ! $file and die "$id: argument missing"; # ................................................ write regexps ... # NN.NN YYYY-MM-DD # 1.2beta23 # 1.1-beta1 # 1.1a # file_150b6_en.zip # # Difficult names are like4this-1.1.gz # # Prevent 1.1.tar.gz --> "1.1.t" with negative lookahead my $ext = '(?!(?i)tar|gz|bzip|bz2|tgz|tbz2|zip|rar|z$)'; my $add = '(?:[-_]?(?:alpha|beta|rc)\d*|' . $ext . '[a-z])'; my $regexp = '^(?.*?[-_]|\D*\d+\D+|\D+)' # $1 . '(?' . '[-_.\db]*\d' # $2 . $add . '?)' . '(?\S+)' # $3 ; $debug and print "$id: file [$file] REGEXP /$regexp/ " , "ARRAY OF FILENAMES TO EXAMINE:\n" , join("\n", @$array) , "\n" ; @$array = ListUnique @$array; # .......................................................... sub ... my ( %hash, %hash2, $max ); local *VersionPush = sub ( $ $ ) { my $id = "$id.VersionPush"; local $ARG = shift; # filename my $verStr = shift; my $key = ""; my @v = /(\d+)/g ; # explode all digits if ( $verStr =~ /(?[a-z])$/ ) # "1.1a" { # 1.1a => 1.1.97 use a's ascii code # 1.1 => 1.1.0 push @v, ord $+{ascii}; # get character ASCII code } $debug > 1 and print "$id: [Version pure] \@v = @v\n"; # Sometimes the version number is really unorthodox # like foo-1.020.el.gz When compared with 1.2.3, that # would say: # # 1. 20 # 1. 2 # # And giving false prioroty to "020". The leading # zeroes must be treated as it the number was: # # 1. 0 20 LOCALIZE: { # Use inner block to localize @ver my @ver; for my $nbr ( @v ) { if ( $nbr =~ /^(?0+)(?\d+)/ ) { push @ver, (0) x length $+{zeros}; push @ver, $+{digits}; } else { push @ver, $nbr; } } @v = @ver; } $debug > 1 and print "$id: [Version fixed] \@v = @v\n"; # Record how many separate digits was found. $max = @v if @v > $max; # fill until 8 version digit elements in array push @v, 0 while @v < 8 ; for my $version ( @v ) { # 1.0 --> 0001.0000.0000.0000.0000.0000 $key .= sprintf "%015d.", $version; } $hash { $key } = $ARG; $hash2 { $v[0] } = $ARG; }; # .......................................................... sub ... local *DebugHash = sub () { if ( $debug > 1 ) { while ( my($key, $val) = each %hash ) { printf "$id: HASH1 $key => $val\n"; } while ( my($key, $val) = each %hash2 ) { printf "$id: HASH2 $key => $val\n"; } } }; # .......................................................... sub ... local *ParseVersion = sub ($ $ $) { my $id = "$id.ParseVersion"; my ($pfx, $post, $ver) = @ARG; my $ret; # If date is used as version number: # # wemi-199802080907.tar.gz # wemi-19980804.tar.gz # wemi-199901260856.tar.gz # wemi-199901262204.tar.gz # # then sort directly by the %hash2, which only contains direct # NUMBER key without prefixed zeroes. For multiple numbers we # sort according to %hash my @try; if ( $max == 1 ) { @try = sort { $b <=> $a } keys %hash2; %hash = %hash2; } else { @try = sort { $b cmp $a } keys %hash; } if ( $debug ) { print "$id: Sorted choices: $ver $pfx.*$post\n"; for my $arg ( @try ) { print "\t$hash{$arg}\n"; } } # If SINGLE answer, then use that. Or if we grepped versioned # files, take the sorted one from the beginning if ( @try ) { $ret = $hash{ $try[0] }; } $ret; }; # ........................................... search version [1] ... # APACHE project uses underscores in filenames: # apache_1_3_9_win32.exe local $ARG = $file; my $ret = $file; if ( /$regexp/o ) { my $pfx = $+{prefix}; my $version = $+{version}; my $rest = $+{rest}; $pfx =~ s/[-_]$//; # package-name- => package-name $pfx =~ s,([][{}+.?*]),\\$1,g; # In RE, quote special characters. # Examine 150b6, 1.50, 1_15 my $ver = '[-_]([-._\db]+ ' . $add . '?)'; # NOTE: Sourceforge is on format /file.tar.gz/download my $post = qr/\Q$rest\E/i . '($|[&?][a-z]|/download)'; $debug and print "$id: ORIGINAL ARG '$ARG'" . " INITIAL PFX: [$pfx]" . " MIDDLE: [$version] regexp /$ver/" . " POSTFIX: [$rest] regexp /$post/" . "\n" ; # .................................................. arrange ... # If there are version numbers, then sort all according # to version. for ( @$array ) { $debug > 1 and print "$id: FOR [$ARG]\n"; # If the filename is like file.txt-1.1, then there is # no point to use $post, because it would reject files # because it requires extension "1$", but file.txt-1.2 # has extension "2$" unless ( ( ( /\.[a-z]+[^-.]+$/ and /$pfx.*$post/) or /$pfx/ ) and /$regexp/o ) { $debug > 1 and print "$id: REJECTED\t\t$ARG\n"; next; } unless ( /$pfx.*$post/ ) { $debug > 1 and print "$id: REJECTED no ", "prefix '$pfx'", "postfix '$post'", "\t$ARG\n" ; next; } unless ( /$regexp/o ) { $debug > 1 and print "$id: REJECTED, no regexp match\t$ARG\n"; } my ($BEG, $vver, $END) = ($+{prefix}, $+{version}, $+{rest}); $debug > 1 and print "$id: MATCH: [$BEG] [$vver] [$END]\n"; VersionPush( $ARG, $vver); } DebugHash(); $ret = ParseVersion( $pfx, $post, $ver ); } elsif ( /(?.*)-[\d.]+$/ ) { $debug and print "$id: plan B, non-standard version-N.NN\n"; my $pfx = $+{REST}; my $ver = '(?-[\d.]+)$'; my $post = ""; $debug > 1 and print "$id: (B) PFX: [$pfx] POSTFIX: [$post] [$ARG]\n"; for ( @$array ) { unless ( /(?$pfx)-$ver/ ) { $debug and print "$id: REJECTED\t\t$ARG\n"; next; } my ($BEG, $vver) = ($+{beg}, $+{version}); $debug > 1 and print "$id: MATCH: [$BEG] [$vver]\n"; VersionPush( $ARG, $vver); } DebugHash(); $ret = ParseVersion( $pfx, $post, $ver ); } elsif ( /^(?\D+)[\d.]+[a-h]+[\d.]+(?.*)/ ) # WinCvs11b14.zip { $debug > 1 and print "$id: plan C, non-standard version-N.NN\n"; my $pfx = $+{beg}; my $ver = '(?[\d.]+[a-h]+[\d.]+)'; my $post = $+{REST}; $debug > 1 and print "$id: (C) PFX: [$pfx] POSTFIX: [$post] [$ARG]\n"; for ( @$array ) { unless ( /(?$pfx)$ver$post/ ) { $debug > 1 and print "$id: REJECTED\t\t$ARG\n"; next; } my ($BEG, $vver) = ($+{beg}, $+{version}); $debug > 1 and print "$id: MATCH: [$BEG] [$vver] $ARG\n"; VersionPush( $ARG, $vver); } DebugHash(); $ret = ParseVersion( $pfx, $post, $ver ); } elsif ( /^(?\D+)[\d.]+/ ) # WinCvs136.zip { $debug > 1 and print "$id: plan D, non-standard version-N.N\n"; my $pfx = $+{prefix}; my $ver = '(?[\d.]+)'; my $post = ""; $debug > 1 and print "$id: (D) PFX: [$pfx] POSTFIX: [$post]\n"; for ( @$array ) { unless ( /(?$pfx)$ver/ ) { $debug > 1 and print "$id: REJECTED\t\t$ARG\n"; next; } my ($BEG, $vver) = ($+{beg}, $+{version}); $debug > 1 and print "$id: MATCH: [$BEG] [$vver] $ARG\n"; VersionPush( $ARG, $vver); } DebugHash(); $ret = ParseVersion( $pfx, $post, $ver ); } else { if ( $verb ) { print << "EOF"; $id: Unknown version format in filename. Cannot parse according to skeleton [$ARG] The most usual reason for this error is, that you have supplied and . Please examine your URL and try removing Another reason may be that the filename is in format that is not standard NAME-VERSION.EXTENSION, like package-1.34.4.tar.gz. In that case please contact the developer of the package and let him know about the de facto packaging format. EOF } } $debug and warn "$id: RETURN model was:$file --> [$ret]\n"; if ( $ret eq '' ) { die << "EOF"; $id: Internal error, Run with --debug on to pinpoint the details. Cannot find anything suitable for download. This may be due to non-matching file or is too limiting or filtered everything. If you used , it may be possible that the heuristics could not determine what were the links to examine. In that case, please let the program know what kind of file it should search by providing template directive . Check also that the file extension looks the same as what found from the page. [$CURRENT_TAG_LINE] EOF } $ret; } # **************************************************************************** # # DESCRIPTION # # Check if file is compressed once. This means that # if file is decompressed it would possibly overwrite original file. # # file.txt.gz => is simple compressed # package.tar.gz => is NOT simple compressed. # # INPUT PARAMETERS # # $ FILENAME # # RETURN VALUES # # $ True value if file is simple compressed # # **************************************************************************** sub FileSimpleCompressed ( $ ) { my $id = "FileSimpleCompressed"; my ($file) = @ARG; my @list = ( '.gz' , '.bz2' , '.Z' , '.z' , '.xz' ); my $ret; unless ( $file =~ /\.(tar|zip|rar)/ ) # must not be multi-archive format { my @suffixlist = map { my $f = $ARG; $f =~ s/\./\\./g; $f } @list; my ($name,$path,$suffix) = fileparse( $file , @suffixlist); $ret = $suffix; } $debug and print "$id: [$file] RET [$ret]\n"; $ret; } # **************************************************************************** # # DESCRIPTION # # Check If file exists. Checks also the name without compression # extension. # # INPUT PARAMETERS # # $ FILENAME # $ [optional] UNPACK. If set, then the file will be # $ decompressed and the original file under it should be checked. # That is, if FILENAME is "file.txt.gz", check also "file.txt" # # RETURN VALUES # # @ List of files that exist on disk # # **************************************************************************** sub FileExists ( % ) { my $id = "FileExists"; my %arg = @ARG; local $ARG = $arg{file}; # REQUIRED my $unpack = $arg{unpack} || 0; my $file = $ARG; if ( /^(.+)\?use_mirror/ ) { $debug and print "$id: FILE [$file] sf fixed => [$1]\n"; $file = $1; } elsif ( /\?.*=(.+[a-zA-Z])/ ) { # download.php?file=this.tar.gz $debug and print "$id: FILE [$file] fixed => [$1]\n"; $file = $1; } my @list = qw ( .bz2 .gz .lzma .lzop .rar .rzip .z .Z .zip .xz ); my @suffixlist = map { my $f = $ARG; $f =~ s/\./\\./g; $f } @list; my ($name, $path, $suffix) = fileparse( $file , @suffixlist); my %ret; # hash filters out duplicates if ( $suffix ) { my @try = ( $suffix ); push @try, '' if $unpack; # '' => file itself for my $try ( @try ) { $debug and print "$id: trying: [$path] + [$name] + [$try]\n"; my $file = $path . $name . $try; $ret{$file} = 1 if -e $file; } } elsif ( not $unpack and -e $file ) { $ret{$file} = 1; } $debug and print "$id: ", cwd(), " [$file] RET [", keys %ret, "]\n"; keys %ret; } # **************************************************************************** # # DESCRIPTION # # Check Invalid filename characters # # INPUT PARAMETERS # # $ filename # # RETURN VALUES # # $ URL file # # **************************************************************************** sub FileNameFix ( $ ) { my $id = "FileNameFix"; local ($ARG) = @ARG; $debug and print "$id: INPUT [$ARG]\n"; if ( /\?.+=(.+\.(?:gz|bz2|tar|zip|rar|pak|lhz|iso))/ ) { # download.php?id=file.tar.gz $debug and print "$id: A: chararcter [?] - fixing [$ARG] => [$1]\n"; $ARG = $1; } elsif ( m,^(?:sourceforge|sf)\.net.*/([^/?]+), ) { $ARG = $1; } elsif ( /^(.*viewcvs.*)\?/ ) { # http://cvs.someorg/cgi-bin/viewcvs.cgi/~checkout~/file?rev=HEAD # => file "file?rev=HEAD" $debug and print "$id: B: chararcter [?] - fixing [$ARG] => [$1]\n"; $ARG = $1; } $debug and print "$id: RET [$ARG]\n"; $ARG; } # **************************************************************************** # # DESCRIPTION # # Make latest filename with possible version numbers # # INPUT PARAMETERS # # $file Template, how the file looks like # @ Array of possible verion numbers # # RETURN VALUES # # @ Versioned files # # **************************************************************************** sub MakeLatestFiles ( $ @ ) { my $id = "$LIB.MakeLatestFiles"; local $ARG = shift; my @versions = @ARG; my @ret; if ( /^(.*?)-([\d.]+[\d])(.*)/ ) { my ( $pre, $middle, $rest ) = ( $1, $2, $3 ); $debug and print "$id: Exploded [$pre] [$middle] [$rest]\n"; for my $ver ( @versions ) { my $file = $pre . "-" . $ver . $rest; push @ret, $file; } } else { $verb and print "$id: Can't parse version from FILE $ARG\n"; # Suppose that all the files in @versions are versioned # # file.txt-1.2 file.txt-1.3 and the model file was # file.txt @ret = ( LatestVersion $versions[0], \@versions ); } $debug and print "$id: FILE $ARG RET => [@ret]\n"; @ret; } # **************************************************************************** # # DESCRIPTION # # Selct file or files from LIST. GETFILE and REGEXP are # mutually exclusive # # INPUT PARAMETERS # # regexp Select files according to regexp. # regexpno Files not to include after REGEXP has matched. # # getfile If newest file is wanted, here is sample. # If this variable is empty; then no newest file is searched. # # list @, candidate file list # # RETURN VALUES # # @ List of selected files # # **************************************************************************** sub FileListFilter ( % ) { my $id = "$LIB.FileListFilter"; my %arg = @ARG; my $regexp = $arg{regexp}; my $regexpNo = $arg{regexpno}; my $getFile = $arg{getfile}; my $list = $arg{list}; my @list = @$list; $debug and print "$id: INPUT REGEXP [$regexp]" , " REGEXPNO [$regexpNo]" , " GETFILE [$getFile]" , " LIST => " , join("\n", @list) , "\n" ; # ......................................................... args ... local *Filter = sub ( $ @) { my ($regexpNo, @list) = @ARG; my @new = grep ! /$regexpNo/, @list; $debug and print "$id: [$regexpNo] FILTERED " , join(' ', grep /$regexpNo/, @list), "\n" ; if ( $verb and not @new ) { print "$id: [WARNING] regexpNo [$regexpNo] rejected everything\n"; } @list = @new; }; @list = Filter( $regexpNo, @list ) if $regexpNo and @list; if ( $regexp ) { @list = sort grep /$regexp/, @list; } $debug and print "$id: after regexp\n", join("\n", @list), "\n"; if ( $getFile ) { my $name = basename $getFile; $debug and print "$id: getfile basename [$name]\n"; my $file = LatestVersion $name, \@list; if ( $verb ) { print "$id: ... Getting latest version: $file DIR: ", cwd(), "\n"; } @list = ( $file ); } @list = Filter( $regexpNo, @list ) if $regexpNo and @list; $debug and print "$id: RET\n", join("\n", @list), "\n"; @list; } # **************************************************************************** # # DESCRIPTION # # Get file via FTP # # INPUT PARAMETERS # # $site Dite to connect # $path dir in SITE # # $getFile File to get # $saveFile File to save on local disk # $regexp # $regexpNo # # $firewall # # $new Flag, Should only the newest file retrieved? # $stdout Print to stdout # # RETURN VALUES # # () RETURN LIST whose elements are: # # $stat Error reason or "" => ok # @ list of retrieved files # # **************************************************************************** sub UrlFtp ( % ) { my $id = "$LIB.UrlFtp"; my %arg = @ARG; # ......................................................... args ... # check mandatory not exists $arg{site} and die "$id: SITE missing"; not exists $arg{path} and die "$id: PATH missing"; not exists $arg{getFile} and die "$id: FILE missing"; not exists $arg{saveFile} and die "$id: SAVE missing"; # Defaults. Note: login 'ftp' is still not known to every # FTP server. not $arg{login} and $arg{login} = 'anonymous'; not $arg{pass} and $arg{pass} = 'nobody@example.com'; # Read values my $url = $arg{url}; my $site = $arg{site}; my $path = $arg{path}; my $getFile = $arg{getFile}; my $saveFile = $arg{saveFile}; my $regexp = $arg{regexp}; my $regexpNo = $arg{regexpNo}; my $firewall = $arg{firewall}; my $login = $arg{login}; my $pass = $arg{pass}; my $new = $arg{new} || 0; my $stdout = $arg{stdout} || 0 ; my $conversion = $arg{conversion} || ''; my $rename = $arg{rename} || ''; my $origLine = $arg{origLine} || ''; my $unpack = $arg{unpack} || ''; my $overwrite = $arg{overwrite} || ''; # ............................................ private functions ... my @files; local *PUSH = sub ($) { local ( $ARG ) = @ARG; if ( $stdout ) { Stdout $ARG; } else { unless ( m,[/\\], ) { $ARG = cwd() . "/" . $ARG ; } push @files, $ARG if not $stdout; } }; # ............................................ private variables ... my $timeout = 120; my $singleTransfer; if ( (not defined $regexp or $regexp eq '') and ! $new ) { $singleTransfer = 1; } local $ARG; $stdout and $saveFile = TempFile(); if ( $debug ) { print "$id:\n" , "\tsingleTransfer: $singleTransfer\n" , "\tSITE : $site\n" , "\tPATH : $path\n" , "\tLOGIN : $login PASS $pass\n" , "\tgetFile : $getFile\n" , "\tsaveFile : $saveFile\n" , "\trename : $rename\n" , "\tconversion : $conversion\n" , "\tregexp : $regexp\n" , "\tregexp-no : $regexpNo\n" , "\tfirewall : $firewall\n" , "\tnew : $new\n" , "\tcwd : ", cwd(), "\n" , "\tOVERWRITE : $overwrite\n" , "\tSKIP_VERSION: $SKIP_VERSION\n" , "\tstdout : $stdout\n" ; } $verb and print "$id: Connecting to ftp://$site$getFile --> $saveFile\n"; $debug and print "$id:\n" , "REGEXP: $regexp \n" , "LOGIN : $login\n" , "PASSWD: $pass\n" , "SITE : $site\n" , "PATH : $path\n" ; # One file would be transferred, but it already exists and # we are not allowed to overwrite --> do nothing. if ( $singleTransfer and -e $saveFile and not $overwrite and not $stdout ) { $verb and print "$id: [ignored, exists] $saveFile\n"; return; } # .................................................. make object ... my $ftp; if ( $firewall ne '' ) { $ftp = Net::FTP->new ( $site, ( Firewall => $firewall, Timeout => $timeout ) ); } else { $ftp = Net::FTP->new ( $site, ( Timeout => $timeout ) ); } unless ( defined $ftp ) { print "$id: Cannot make route to $site $ERRNO\n"; return; } # ........................................................ login ... $debug and print "$id: Login to $site ..\n"; unless ( $ftp->login($login, $pass) ) { print "$id: Login failed $login, $pass\n"; goto QUIT; } $ftp->binary(); my $cd = $path; $cd = dirname $path unless $path =~ m,/$, ; if ( $cd ne '' ) { unless ( $ftp->cwd($cd) ) { print "$id: Remote cd $cd failed [$path]\n"; goto QUIT; } } # .......................................................... get ... my $stat; $ftp->binary(); $ftp->hash( $verb ? "on" : undef ); # m" if ( $singleTransfer ) { $verb and print "$id: Getting file... [$getFile]\n"; # m: $rename and do{$saveFile = EvalCode $url, $saveFile, $rename}; unless ( $ftp->get($getFile, $saveFile) ) { warn "$id: ** [ERROR] SingleFile [$getFile] $ERRNO\n" , "\tMaybe the URL on the line is invalid? [$origLine]" ; } else { PUSH($saveFile); } } else { my (@list, $i); $verb and print "$id: Getting list of files $site ...\n"; $i = 0; $debug and warn "$id: Running ftp dir ls()\n"; @list = $ftp->ls(); @list = FileListFilter regexp => $regexp, regexpno => $regexpNo, getfile => $getFile, list => [@list]; $debug and warn "$id: List length ", scalar @list, " --> @list\n"; if ( $verb and not @list ) { print "$id: No files to download." , " Run with debug to investigate the problem.\n" ; } for ( @list ) { $i++; my $progress = DownloadProgress $site . $cd, $ARG, "$id: ..." , $i, scalar @list; print $progress if $progress; my $saveFile = $ARG; $saveFile = TempFile() if $stdout; $rename and do{$saveFile = EvalCode $url, $saveFile, $rename}; $verb and print " $ARG [$saveFile]\n"; unless ( $stdout ) { my $onDisk; my $simpleZ = FileSimpleCompressed $saveFile; if ( $simpleZ ) { ($onDisk) = FileExists file => $saveFile, unpack => -forceUnpackCheck; if ( $verb > 1 ) { print "$id: Uncompressed file; use --overwrite\n"; } } else { ($onDisk) = FileExists file => $saveFile, unpack => $unpack; } $debug and print "$id: On disk? [$ARG] [save $saveFile] .. " , -e $onDisk ? "[yes]" : "[no]" , cwd() , "\n" ; if ( $onDisk ) { if ( $SKIP_VERSION and /-\d[\d.]*\D+/ ) { $verb and print "$id: [skip version/already on disk] " , " $ARG => $onDisk\n"; next; } elsif ( not $overwrite ) { $verb and print "$id: [no overwrite/already on disk] " , " $ARG => $onDisk\n"; next; } } } unless ( $stat = $ftp->get($saveFile) ) { print "$id: ... ** error $ARG $ERRNO $stat\n"; } else { PUSH($saveFile); } sleep $SLEEP_SECONDS if $SLEEP_SECONDS; } } QUIT: { $ftp->quit() if defined $ftp; } ($stat, @files); } # **************************************************************************** # # DESCRIPTION # # Download URL using external program wget(1) # # INPUT PARAMETERS # # $ URL # # RETURN VALUES # # $ content in string if success # # **************************************************************************** sub UrlHttGetWget ( $ ) { my $id = "$LIB.UrlHttpGetWget"; my $url = shift; $debug and print "$id: GET $url ...\n"; my $ret = qx(wget --quiet --output-document=- '$url' 2> /dev/null); return $ret; } # **************************************************************************** # # DESCRIPTION # # Download URL by using Perl only. # # INPUT PARAMETERS # # $ URL # # RETURN VALUES # # $ Content in string if success # $ Headers # # **************************************************************************** sub UrlHttGetPerl ( $ ) { my $id = "$LIB.UrlHttpGetPerl"; my $url = shift; # http://www.useragentstring.com/pages/Firefox/ my $agent = "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.3) Gecko/20090913 Firefox/3.5.3"; my $ua = new LWP::UserAgent; $debug and print "$id: GET $url ...\n"; my $request = new HTTP::Request( 'GET' => $url ); $request->user_agent($agent); my $obj = $ua->request($request); my $stat = $obj->is_success; unless ( $stat ) { print " ** error: $url ", $obj->message, "\n"; return; } my $content = $obj->content(); my $head = $obj->headers_as_string(); $debug > 4 and print "$id: RET head [$head]\n"; $debug > 4 and print "$id: RET ARG [$content]\n"; $debug and print "$id: RET success [$stat]\n"; $content, $head; } # **************************************************************************** # # DESCRIPTION # # Download URL by using Perl only. # # INPUT PARAMETERS # # $ URL # # RETURN VALUES # # $ Content as string if success # $ [optional] Header (not always available) # # **************************************************************************** sub UrlHttGet ( $ ) { my $id = "$LIB.UrlHttpGet"; my $url = shift; $debug and print "$id: INPUT: $url\n"; # Sourceforge is tricky, it automatically tries to start download # and pure Perl method doesn't do it. We need to get page content # only, not to start the file download. # # FIXME: if there is a way to do this with LWP::UserAgent, please # let me know and this wget(1) dependency is gladly removed. if ( m,(?:sourceforge|sf)\.net.*/download, ) { UrlHttGetWget $url; } else { UrlHttGetPerl $url; } } # **************************************************************************** # # DESCRIPTION # # Try to find the latest version number from the page. # Normally indicated by "The latest version of XXX is N.N.N" # # INPUT PARAMETERS # # $ String, the Url page # $ [optional] regexp, what words to lok for # # RETURN VALUES # # % ver => string, List of versions and text matches # # **************************************************************************** sub UrlHttPageParse ( $ ; $ ) { my $id = "$LIB.UrlHttpPageParse"; local $ARG = shift; my $regexp = shift; my %hash; if ( defined $regexp and $regexp ne '' ) { while ( /$regexp/g ) { $hash{$2} = $1 if $1 and $2; } unless ( scalar keys %hash ) { print "$id: [ERROR] version regexp [$regexp] " , " didn't find versions. " , "Please check or define correct \n"; } } elsif ( 0 and /(latest.*?version.*?\b([\d][\d.]+[\d]).*)/ ) { $debug and print "$id: Using DEFAULT page regexp => [$MATCH]\n"; $hash{$2} = $1; } $debug and print "$id: RET regexp = [$regexp] HASH = [" , join ( ' => ', %hash) , "]\n" ; %hash; } # **************************************************************************** # # DESCRIPTION # # Parse all HREFs in the page and return the locations. If there is # tag, it is always obeyed and not filtered # # INPUT PARAMETERS # # content The html page # regexp [optional] Return only HREFs matching regexp. # unique [optional] Filter out duplicates. # # RETURN VALUES # # @urls # # **************************************************************************** sub UrlHttpParseHref ( % ) { my $id = "$LIB.UrlHttpParseHref"; my %arg = @ARG; local $ARG = $arg{content}; # REQUIRED my $regexp = $arg{regexp} || ''; my $unique = $arg{unique} || 0; $debug > 3 and print "$id: INPUT ARG [$ARG]\n"; $debug and print "$id: INPUT regexp [$regexp]\n"; $debug and print "$id: INPUT unique [$unique]\n"; # Some HTML pages do not use double quotes # # # # The strict way is to use double quotes # # # # URLs do not necessarily stop after HREF # # my (@ret, $base); if ( /<\s*BASE\s+href\s*=[\s\"]*([^\">]+)/i ) { $base = $1; $base =~ s,/$,,; # Remove trailing slash. $debug and print "$id: BASE $base\n"; } while ( /HREF\s*=[\s\"']*([^\"'>]+)/ig ) { my $file = $1; $debug and print "$id: FILE $file\n"; if ( $base and $file eq $base ) { $debug and print "$id: FILTERED BY BASE $base\n"; next; } if ( $base and $file !~ m,//, ) { $file = "$base/$file"; } if ( $regexp ne '' and $file !~ /$regexp/ ) { $debug and print "$id: FILTERED BY REGEXP $file\n"; next; } if ( $file =~ /^#/i ) { $debug and print "$id: FILTERED [#] $file\n"; next; } # code.google.com: detail?archive.tar.bz2&can=2&q= if ( $file =~ /;q=$/i ) { $debug and print "$id: FILTERED [google] $file\n"; next; } # http://www.emacswiki.org/emacs?action=admin;id=twit.el if ( $file =~ /emacswiki.*;.*=/ ) { $debug and print "$id: FILTERED [emacswiki] $file\n"; next; } if ( $file =~ m,^\?|/$|mailto, ) { $debug and print "$id: FILTERED OTHER [mailto] $file\n"; next; } if ( $file =~ m,mirror_picker, ) # Sourceforge { $debug and print "$id: FILTERED OTHER [mirror_picker] $file\n"; next; } push @ret, $file; } @ret = ListUnique @ret if $unique; $debug and print "$id: RET REGEXP = [$regexp] " , " LIST =>\n" , join("\n", @ret), "\n"; @ret; } # **************************************************************************** # # DESCRIPTION # # If you connect to http page, that shows directory, this # Function tries to parse the HTML and extract the filenames # # INPUT PARAMETERS # # $ String, The Url page # $ [optional] boolean, if non-zero, filter out # non-interesting files like directories. # # RETURN VALUES # # @ List of files # # **************************************************************************** sub UrlHttpDirParse ( $ ; $ ) { my $id = "$LIB.UrlHttpDirParse"; local $ARG = shift; my $filter = shift; $debug and print "$id: $filter\n"; my @files; if ( /Server:\s+apache/i ) { # Date: Wed, 16 Feb 2000 16:26:08 GMT # Server: Apache/1.3.11 (Win32) # Connection: close # Content-Type: text/html m: # # [DIR] # [IMG] # [TXT] # Anything special to know? No? } # Filter out directories and non interesting files # # ?N=D ?M=A # manual/ @files = UrlHttpParseHref content => $ARG; @files; } # **************************************************************************** # # DESCRIPTION # # Parse list of mirrors from sourceforge's download page. # # INPUT PARAMETERS # # $ HTML # # RETURN VALUES # # % Hash of hash. Mirrors and their locations # MIRROR_NAME => { location, continent } # # **************************************************************************** sub UrlSfMirrorParse ($) { my $id = "$LIB.UrlSfMirrorParse"; local($ARG) = @ARG; my %hash; $debug > 3 and print "$id: INPUT $ARG\n"; while ( m,\s*([a-z].*?) \s+ (.*?) \s+ .*?use_mirror=([a-z]+) ,gsmix ) { my( $location, $continent, $mirror) = ($1, $2, $3); $debug > 2 and print "$id: location [$location] " , "continent [$continent] " , "mirror $mirror\n" ; $hash{$mirror} = { location => $location, continent => $continent }; } %hash; } # **************************************************************************** # # DESCRIPTION # # Determine Sourceforge project ID based on project name. # # INPUT PARAMETERS # # $ project name # # RETURN VALUES # # $ URL # # **************************************************************************** sub SourceforgeProjectId ($) { my $id = "$LIB.SourceforgeProjectId"; my($name) = @ARG; $debug and print "$id: INPUT name [$name]\n"; my $url = "http://sourceforge.net/projects/$name"; local ($ARG) = UrlHttGet $url; # href="/project/showfiles.php?group_id=88346#downloads">Jump to downloads for FOO my $ret; if ( m,href\s*=.*group_id=(?\d+),i ) { $ret = $+{id}; } $debug and print "$id: RET name [$name] id [$ret]\n"; $ret; } # **************************************************************************** # # DESCRIPTION # # Parse Sourceforge project name from URL # # INPUT PARAMETERS # # $ URL # # RETURN VALUES # # $ Project name # # **************************************************************************** sub SourceforgeProjectName ($) { my $id = "$LIB.SourceforgeProjectName"; local($ARG) = @ARG; $debug and print "$id: INPUT $ARG\n"; my $name; # http://sourceforge.net/projects/emacs-jabber # http://prdownloads.sourceforge.net/emacs-jabber/emacs-jabber-0.6.1.tar.gz if ( m,(?:sourceforge|sf)\.net/project/([^/]+), or m,downloads\.(?:sourceforge|sf)\.net/([^/]+), ) { $name = $1; } elsif (m, http://(?:www\.)?(?:sourceforge|sf).net/([^/]+), ) { $name = $1; } $debug and print "$id: RET [$name]\n"; $name; } # **************************************************************************** # # DESCRIPTION # # Parse downloads from Sourceforge page # # INPUT PARAMETERS # # $ URL # # RETURN VALUES # # % hash: "string" => URL # # **************************************************************************** sub SopurceforgeParseDownloadPage ($) { my $id = "$LIB.UrlSfManipulate"; local($ARG) = @ARG; # 0.7.1 Notes (2007-01-31 22:46) # emacs-jabber-0.7.1.tar.gz # Emacs font-lock no-op comment "' } # **************************************************************************** # # DESCRIPTION # # Manipulate http://prdownloads.sourceforge.net address # # INPUT PARAMETERS # # $ URL # # RETURN VALUES # # $ URL # # **************************************************************************** sub UrlManipulateSfOld ($) { my $id = "$LIB.UrlManipulateSfOld"; my ($url) = @ARG; $debug and print "$id: INPUT [$url]\n"; my $project = SourceforgeProjectName $url; unless ( $project ) { die "$id: [FATAL] Cannot parse project name from $url\n"; } my $gid = SourceforgeProjectId $project; unless ( $gid ) { die "$id: [FATAL] Cannot get group ID for [$project] $url\n"; } # Download URL my $base = "http://sourceforge.net"; my $durl = "$base/project/platformdownload.php?group_id=$gid"; local ($ARG) = UrlHttGet $durl; unless ( $ARG ) { die "$id: [FATAL] Cannot get SF page [$durl]"; } # Download if ( m,class\s*=\s*.download.\s.*?href\s*=\s*.([^\"\'<>]+),ism ) { $durl = $base . $1; } else { die "$id: [FATAL] Cannot parse SF page [$durl]"; } local ($ARG) = UrlHttGet $durl; unless ( $ARG ) { die "$id: [FAIL] Cannot read SF page [$durl]"; } # 0.7.1 # /project/showfiles.php?group_id=88346&package_id=92339&release_id=482983 if ( m,\shref\s*=\s*[\"\']\s* ( /project/ showfiles.php\? group_id=$gid[^;]+; package_id=\d+[^;]+; release_id=(\d+) ) ,isx ) { $durl = $base . $1; } else { die "$id: [FATAL] Cannot parse release_id (for final page) [$durl]"; } $durl =~ s/&/&/g; $debug and print "$id: RET [$durl]\n"; $durl; } # **************************************************************************** # # DESCRIPTION # # Manipulate address like: # http://sourceforge.net/projects/clonezilla/files/clonezilla_live_testing/clonezilla-live-1.2.2-30.iso/download # # INPUT PARAMETERS # # $ URL # # RETURN VALUES # # $ URL # # **************************************************************************** sub UrlManipulateSf ($ ; $) { my $id = "$LIB.UrlManipulateSf"; my ($url, $mirror ) = @ARG; $debug and print "$id: INPUT url [$url]\n"; local ($ARG) = UrlHttGet $url; my $ret; if ( m,a \s+ href \s* = \"([^\"\']+) .* direct \s+ link \s* ,x ) { $ret = $1; $debug > 1 and print "$id: SF parsed direct link: $ret\n"; if ( $mirror and $ret =~ m,^(.*use_mirror=)(.*), ) { $ret = $1 . $mirror; $debug > 1 and print "$id: SF use mirror: $mirror\n"; } } else { die "$id: [FATAL] Cannot parse direct download from page $url"; } $debug and print "$id: RET $ret\n"; return $ret; } # **************************************************************************** # # DESCRIPTION # # Manipulate URLs; redirect if necessary # # INPUT PARAMETERS # # $ URL # # RETURN VALUES # # $ URL # # **************************************************************************** sub UrlManipulateMain ($ ; $ ) { my $id = "$LIB.UrlManipulateMain"; my ($url, $mirror ) = @ARG; $debug and print "$id: INPUT $url MIRROR $mirror\n"; # http://downloads.sourceforge.net/project/clonezilla/clonezilla_live_testing/clonezilla-live-1.2.2-30.iso?use_mirror=sunet if ( $url =~ m,prdownloads\.(?:sourceforge|sf)\.net/, ) { $url = UrlManipulateSfOld $url; } if ( $url =~ m,(?:sourceforge|sf)\.net.*/download, ) { $url = UrlManipulateSf $url, $mirror; } $debug and print "$id: RET $url\n"; $url; } # **************************************************************************** # # DESCRIPTION # # Check if file is already on disk and do not overwrite if that is # not allowed. Check also if there is skip version option active. # # GLOBAL VARIABLES # # $ARG # $SKIP_VERSION # # INPUT PARAMETERS # # $file # $unpack # # RETURN VALUES # # True if it's ok to download file. # -skipversion # -noovrt # # **************************************************************************** sub UrlHttpFileCheck ( % ) { my $id = "$LIB.UrlHttpFileCheck"; my %arg = @ARG; my $saveFile = $arg{savefile}; my $unpack = $arg{unpack}; my $overwrite = $arg{overwrite}; my $ret; my $onDisk; my $simpleZ = FileSimpleCompressed $saveFile; if ( $simpleZ ) { ($onDisk) = FileExists file => $saveFile, unpack => -forceUnpackCheck; if ( $verb > 1 ) { print "$id: Uncompressed file found (use --overwrite)\n"; } } else { ($onDisk) = FileExists file => $saveFile, unpack => $unpack; } $debug and print "$id: file on disk? .. " , -e($saveFile) ? "[yes]" : "[no]" , "\n" ; if ( $onDisk ) { # If the filename contains version number # AND skipping is on, then ignore downoad if ( $SKIP_VERSION and /-\d[\d.]*\D+/ ) { $verb and print "$id: [already on disk]" , " $ARG => $onDisk\n"; $ret = -skipversion; } elsif ( not $overwrite ) { $verb and print "$id: [no overwrite/already on disk]" , " $ARG => $onDisk\n"; $ret = -noovrt; } } $debug and print "$id: RET [$ret]\n"; $ret; } # **************************************************************************** # # DESCRIPTION # # Search download ULRs from page. # # INPUT PARAMETERS # # % # # RETURN VALUES # # @list List of URLs found # # **************************************************************************** sub UrlHttpSearchPage ( % ) { my $id = "$LIB.UrlHttpSearchPage"; my %arg = @ARG; my $url = $arg{url} || die "$id: missing arg URL"; my $regexpNo = $arg{regexpno}; my $baseUrl = $arg{baseurl}; my $thisPageRegexp = $arg{pageregexp}; if ( $debug ) { print "$id: INPUT\n" , "\turl : $url\n" , "\tregexpNo : $regexpNo\n" , "\tbaseUrl : $baseUrl\n" , "\tthisPageRegexp: $thisPageRegexp\n" ; } my ($content, $head) = UrlHttGet $url; my @list; if ( $content ) { @list = UrlHttpParseHref content => $content, regexp => $thisPageRegexp; if ( $regexpNo ) { $debug > 2 and print "$id: filter before [$regexpNo] [@list]\n"; @list = grep ! /$regexpNo/, @list; $debug > 2 and print "$id: filter after [@list]\n"; } # Filter out FRAGMENTs that are not part of the file names: # # http://localhost/index.html#section1 local $ARG; for ( @list ) { if ( /#.*/ ) { $debug and print "$id: filtering out FRAGMENT-SPEC $ARG\n"; s/#.*//; } } $debug and print "$id: -find regexpNo [$regexpNo] @list\n"; } $debug and print "$id: RET [@list]\n"; @list; } # **************************************************************************** # # DESCRIPTION # # Search Newest file form page # # INPUT PARAMETERS # # % # # RETURN VALUES # # @list List of URLs found # # **************************************************************************** sub UrlHttpSearchNewest ( % ) { my $id = "$LIB.UrlHttpSearchNewest"; my %arg = @ARG; my $ua = $arg{useragent} || die "No UA object"; my $getPage = $arg{page}; my $thisPage = $arg{flag}; my $file = $arg{file}; my $getFile = $arg{getfile}; my $baseUrl = $arg{baseurl}; my $versionRegexp = $arg{versionRE}; my $thisPageRegexp = $arg{pageRE}; my $regexp = $arg{RE}; my $regexpNo = $arg{REno}; my @list; $debug > 1 and PrintHash "$id: INPUT", %arg; $debug and print "$id: Getting list of files $getPage\n"; if ( $getPage =~ /\.(gz|bz2|lzma|zip|tar|jar|iso)$/ ) { die "[ERROR] The URL must not contain filename: $getPage"; } my ($content, $head) = UrlHttGet $getPage or return; if ( $thisPage ) { $debug and print "$id: THISPAGE START $file\n"; $getFile = $file; my %hash = UrlHttPageParse $content, $versionRegexp; my @keys = keys %hash; my @urls = UrlHttpParseHref content => $content, regexp => $thisPageRegexp, unique => 'unique' ; if ( @urls == 1 ) # only one match { @list = @urls; goto EXIT; } my @files; # The filename may contain the version information, # UNLESS this is page search condition. if ( $getFile !~ /\d/ ) { # Nope, this is "download.html" search with # possible "--Regexp SEARCH" option. $getFile = $urls[0]; $file = $getFile; $debug and print "$id: thispage changed file [$file]"; } $debug and print "$id: THISPAGE file [$file] " , "getFile [$getFile] " , "urls [@urls] " , "version urls [@keys]\n"; if ( @keys ) { $debug and print "$id: THISPAGE if KEYS file [$file]\n"; @files = MakeLatestFiles $file, keys %hash ; if ( @files == 1 ) { @list = ( RelativePath dirname($urls[0]), $files[0] ); # for my $path ( @urls ) # { # push @list, RelativePath # ( dirname($path), $files[0] ); # } # } else { $debug and print "$id: THISPAGE Latest files > 1\n"; @list = ( LatestVersion $file, [@urls, @files] ) ; # @list > 1 and $file = ''; } } else { # Try old fashioned. The filename may contain the # version information, $debug and print "$id: EXAMINE latest URL model [$file] list [@urls]\n"; @list = ( LatestVersion $file, \@urls ) if @urls ; # $file = ''; } $debug and print "$id: FILES [@files] URLS [@urls]\n"; unless ( @urls == 1 ) { $verb > 2 and warn "$id: Can't parse precise latest version location [@urls] "; @list = @urls; } } else { $debug and print "$id: NOT THISPAGE else statement\n"; @list = UrlHttpDirParse $head . $content, "clean"; } EXIT: @list = FileListFilter regexp => $regexp, regexpno => $regexpNo, getfile => $getFile, list => [@list]; $debug and print "$id: RET", join("\n", @list), "\n"; @list; } # **************************************************************************** # # DESCRIPTION # # Download files # # INPUT PARAMETERS # # % # # RETURN VALUES # # @list List of URLs found # # **************************************************************************** sub UrlHttpDownload ( % ) { my $id = "$LIB.UrlHttpDownload"; my %arg = @ARG ; my $ua = $arg{useragent}; my $list = $arg{list}; my $file = $arg{file}; my $stdout = $arg{stdout}; my $find = $arg{find}; my $saveopt = $arg{saveopt}; my $baseUrl = $arg{baseurl}; my $unpack = $arg{unpack}; my $rename = $arg{rename}; my $overwrite = $arg{overwrite}; my $mirror = $arg{mirror}; my $contentRegexp = $arg{contentre} || ''; my $errUrlHashRef = $arg{errhash}; my $errExplanationHashRef = $arg{errtext}; if ( $debug ) { print < 1 or $file eq '' or ($find and not $saveopt) ) { # Sourceforge special my $tmp = $ARG; $tmp =~ s,/download$,,; $saveFile = basename $tmp; $debug and print "$id: SAVEFILE-1c $saveFile [@list]\n"; } my $relative = $ARG || $baseUrl; $debug and print "$id: SAVEFILE-2 $saveFile RELATIVE $relative\n"; if ( $ARG and not m,://, ) { # If the ARG is NOT ABSOLUTE reference ftp:// or http:// # Then glue together the base site + relative reference found # from page $debug and print "$id: glue [$baseUrl] + [$ARG]\n"; $relative = RelativePath BaseUrl($baseUrl), $ARG; # The whole URL is now known, strip PATH from savefile. $saveFile = basename $saveFile; } unless ( $relative ) { warn "$id: [ERROR] Can't resolve relative $baseUrl + [$ARG]"; next; } my $url = $relative; $url = UrlManipulateMain $url, $mirror; if ( $rename ) { $saveFile = EvalCode $url, $saveFile, $rename } $saveFile = FileNameFix $saveFile; unless ( $stdout ) { next if UrlHttpFileCheck savefile => $saveFile , unpack => $unpack , overwrite => $overwrite ; } my $progress = DownloadProgress $baseUrl, $ARG, "$id: ..." , $i, scalar @list; my $request = new HTTP::Request('GET' => $url ); my $obj = $ua->request($request , $saveFile ); my $stat = $obj->is_success; if ( $debug ) { print "$id: content-type:\n\t", $obj->content_type, "\n" , "\tsuccess status ", $stat, "\n" , map { $ARG = "\t$ARG\n" } $obj->headers_as_string ; } # ........................................... file downloaded ... if ( $stat ) { PUSH($saveFile); my $contentStatus = FileContentAnalyze $saveFile, $contentRegexp; my $err; $err = "[no match] " unless $contentStatus; if ( (not $contentStatus and $verb > 1) or ($contentStatus and $verb) ) { $verb and print "$progress ${err}$url => $saveFile\n"; } } else { $verb and print "$progress $url => $saveFile\n"; $errUrlHashRef->{ $url } = $obj->code; # There is new error code, record it. if ( not defined $errUrlHashRef->{ $obj->code } ) { $errExplanationHashRef->{ $obj->code } = $obj->message; } $ret = $errUrlHashRef->{ $obj->code }; warn " ** error: $url ", $obj->message, "\n"; } } $ret, @files; } # **************************************************************************** # # DESCRIPTION # # Get content of URL # # INPUT PARAMETERS # # $url The URL pointer # $file # $regexp # $regexpNo # $proxy # \%errUrlHashRef Hahs where to store the URL-ERROR_CODE # \%errExplanationHashRef Hash where to store ERROR_CODE-EXPLANATION # $new Get never file # $stdout Write to stdout # $versionRegexp How to find the version number from page # # RETURN VALUES # # () RETURN LIST whose elements are # # $stat Error reason or "" => ok # @ list of retrieved files # # **************************************************************************** sub UrlHttp ( % ) { my $id = "$LIB.UrlHttp"; my %arg = @ARG ; # .............................................. input arguments ... # check mandatory not exists $arg{url} and die "$id: URL missing"; not exists $arg{file} and die "$id: FILE missing"; not exists $arg{errUrlHashRef} and die "$id: HashRef missing"; not exists $arg{errExplanationHashRef} and die "$id: errHashRef missing"; # Read values my $url = $arg{url}; my $file = $arg{file}; my $errUrlHashRef = $arg{errUrlHashRef}; my $errExplanationHashRef = $arg{errExplanationHashRef}; my $proxy = $arg{proxy} || ''; my $regexp = $arg{regexp} || ''; my $regexpNo = $arg{regexpNo} || ''; my $new = $arg{new} || 0; my $stdout = $arg{stdout} || 0;; my $versionRegexp = $arg{versionRegexp} || ''; my $thisPage = $arg{plainPage} || 0; my $thisPageRegexp = $arg{pageRegexp} || ''; my $contentRegexp = $arg{contentRegexp} || ''; my $conversion = $arg{conversion} || ''; my $rename = $arg{rename} || ''; my $saveopt = $arg{save} || ''; my $unpack = $arg{unpack} || ''; my $overwrite = $arg{overwrite} || ''; my $mirror = $arg{mirror} || ''; my $find = $thisPage eq -find ? 1 : 0; # ......................................................... code ... if ( $debug ) { print "$id: INPUT\n" , "\turl : $url\n" , "\tfile : $file\n" , "\trename : $rename\n" , "\tconversion: $conversion\n" , "\tregexp : $regexp\n" , "\tregexp-no : $regexpNo\n" , "\tthis page : $thisPage\n" , "\tfind : $find\n" , "\tvregexp : $versionRegexp\n" , "\tpregexp : $thisPageRegexp\n" , "\tcregexp : $contentRegexp\n" , "\tproxy : $proxy\n" , "\tnew : $new\n" , "\tstdout : $stdout\n" , "\tcwd : ", cwd(), "\n" , "\toverwrite : $overwrite\n" } # FIXME: remove, this is not the final save name # $verb and print "$id: $url --> $file\n"; my $ua = new LWP::UserAgent; if ( defined $proxy ) { $debug and $proxy and print "$id: Using PROXY $proxy\n"; $ua->proxy( "http", "$proxy" ); } my ($baseUrl, $getFile) = ($url,""); unless ( $thisPage ) { ($baseUrl, $getFile) = ( $url =~ m,^(.*/)(.*), ); } $baseUrl = UrlManipulateMain $url, $mirror; if ( $getFile eq '' and ($regexp eq '' or $thisPageRegexp eq '') and not $thisPage ) { die "$id: [ERROR] invalid URL $url. No file name part found." , " Did you forgot to use or ?" ; } my @list = ( $getFile ); if ( $new ) # Directory lookup { my $getPage = $thisPage ? $url : $baseUrl ; $debug and print "$id: getPage 1 $getPage\n"; if ( $file ) { $getPage =~ s/\Q$file//; $debug and print "$id: getPage 2 file [$file] $getPage\n"; } @list = UrlHttpSearchNewest useragent => $ua , page => $getPage , flag => $thisPage , file => $file , baseurl => $baseUrl , getfile => $getFile , versionRE => $versionRegexp , pageRE => $thisPageRegexp , RE => $regexp , REno => $regexpNo ; } elsif ( $find ) { @list = UrlHttpSearchPage url => $url , regexpno => $regexpNo , baseurl => $baseUrl , pageregexp => $thisPageRegexp ; } # ............................................ get list of files ... # Multiple links to the same destination @list = ListRemoveDuplicates @list; local $ARG; $debug and print "$id: FILE LIST [@list]\n"; $verb and !@list and print "$id: No matching files [$regexp]\n"; if ( @list > 1 and $file and not $new) { $file = ''; $debug and print "$id: Clearing FILE: [$file] because many/new" , " files to load. " , "\@list = count, ", scalar @list, ", [@list]\n" ; } $file = $getFile; if ( $new ) { local $ARG = $list[0]; $file = $ARG unless /[?&]/; # Ignore PHP and exotic paths } my ($ret, @files) = UrlHttpDownload useragent => $ua , list => \@list , file => $file , stdout => $stdout , find => $find , saveopt => $saveopt , baseurl => $baseUrl , unpack => $unpack , rename => $rename , errhash => $errUrlHashRef , contentre => $contentRegexp , errtext => $errExplanationHashRef , overwrite => $overwrite , mirror => $mirror ; $ret, @files; } # **************************************************************************** # # DESCRIPTION # # Copy content of PATH to FILE. # # INPUT PARAMETERS # # $path From where to read. If this is directory, read files # in directory. If this is file, copy file. # # $file Where to put resuts. # $prefix [optional] Filename prefix # $postfif [optional] postfix # # RETURN VALUES # # () RETURN LIST whose elements are: # # $stat Error reason or "" => ok # @ list of retrieved files # # # **************************************************************************** sub UrlFile (%) { my $id = "$LIB.UrlFile"; my %arg = @ARG; my $path = $arg{path} || die "$id: Missing arg PATH"; my $file = $arg{file} || die "$id: Missing arg FILE"; my $prefix = $arg{prefix} || ''; my $postfix = $arg{postfix} || ''; my $overwrite = $arg{overwrite} || ''; my ( $stat, @files ); $debug and warn "$id: PATH $path, FILE $file\n"; if ( -f $path and not -d $path ) { if ( $CHECK_NEWEST ) { my @dir = DirContent dirname( $path ); if ( @dir ) { my $base = dirname($path); $file = LatestVersion basename($path) , \@dir; $path = $base . "/" . $file; } else { $verb and print "$id: Can't set newest $file"; } } $file = $prefix . $file . $postfix; $debug and warn "$id: FileCopy $path => $file\n"; unless ( copy($path, $file) ) { $verb and print "$id: FileCopy $path => $file $ERRNO"; } else { push @files, $file; } } else { my @tmp = DirContent $path; local *FILE; $file =~ s,/,!,g; if ( -e $file and not $overwrite ) { $verb and print "$id: [ignored, exists] $file\n"; return; } unless ( open FILE, "> $file" ) { warn "$id: can't write $file $ERRNO\n"; return; } print FILE join "\n", @tmp; close FILE; push @files, $file; } ( $stat, @files ); } # **************************************************************************** # # DESCRIPTION # # Run Some self tests. This is for developer only # # INPUT PARAMETERS # # none # # RETURN VALUES # # none # # **************************************************************************** sub TestDriverSfMirror () { my $id = "$LIB.TestDriverSfMirror"; $debug = 3 unless $debug; my $str = << "EOF"; Download Mirrors Host Location Continent Download optusnet logo
Sydney, Australia Australia Download mesh logo Duesseldorf, Germany Europe Download kent logo Kent, UK Europe Download heanet logo Dublin, Ireland Europe Download ovh logo Paris, France Europe Download puzzle logo Bern, Switzerland Europe Download EOF print "$id: UrlSfMirrorParse\n"; UrlSfMirrorParse $str; } sub SelfTest () { my $id = "$LIB.SelfTest"; $debug = 1 unless $debug; my (@files, $file, $i); local $ARG; # ............................................................ X ... $i++; $file = "artist-1.1-beta1.tar.gz", print "$id: [$i] LatestVersion ", "." x 40, "\n" ; @files = qw ( mailto:tab@lysator.liu.se emacs-shapes.gif emacs-shapes.html emacs-a.gif emacs-a.html emacs-rydmap.gif emacs-rydmap.html COPYING artist-1.2.3.tar.gz artist.el mailto:kj@lysator.liu.se mailto:jdoe@example.com http://st-www.cs.uiuc.edu/~chai/figlet.html artist-1.2.1.tar.gz artist-1.2.tar.gz artist-1.1.tar.gz artist-1.1a.tar.gz artist-1.1-beta1.tar.gz artist-1.0.tar.gz artist-1.0-11.tar.gz mailto:tab@lysator.liu.se ); LatestVersion $file, \@files; # ............................................................ X ... $i++; $file = "irchat-900625.tar.gz"; print "$id: [$i] LatestVersion ", "." x 40, "\n" ; @files = qw ( ./dist/irchat/irchat-20001203.tar.gz ./dist/irchat/irchat-19991105.tar.gz ./dist/irchat/irchat-980625-2.tar.gz ./dist/irchat/irchat-980128.tar.gz ./dist/irchat/irchat-971212.tar.gz ./dist/irchat/irchat-3.04.tar.gz ./dist/irchat/irchat-3.03.tar.gz ./dist/irchat/irchat-3.02.tar.gz ./dist/irchat/irchat-3.01.tar.gz ./dist/irchat/irchat-3.00.tar.gz ); LatestVersion $file, \@files; # ............................................................ X ... $i++; $file = "bogofilter-0.9.1.tar.gz"; print "$id: [$i] LatestVersion ", "." x 40, "\n" ; @files = qw ( http://freecode.com http://newsletters.osdn.com http://ads.osdn.com/?ad_id=2435&alloc_id=5907&op=click http://sourceforge.net http://sf.net http://sf.net/support/getsupport.php /bogofilter/?sort_by=name&sort=desc /bogofilter/?sort_by=size /bogofilter/?sort_by=date /bogofilter/.. /bogofilter/ANNOUNCE /bogofilter/ANNOUNCE-0.94.12 /bogofilter/Judy-0.0.2-1.i386.rpm /bogofilter/Judy-0.04-1.i386.rpm /bogofilter/Judy-0.04-1.src.rpm /bogofilter/Judy-devel-0.04-1.i386.rpm /bogofilter/Judy_trial.0.0.4.src.tar.gz /bogofilter/NEWS /bogofilter/NEWS-0.10 /bogofilter/NEWS-0.10.0 /bogofilter/NEWS-0.11 /bogofilter/bogofilter-0.10.0-1.i586.rpm /bogofilter/bogofilter-0.10.0-1.src.rpm /bogofilter/bogofilter-0.10.0.tar.gz /bogofilter/bogofilter-0.96.6-1.src.rpm /bogofilter/bogofilter-0.96.6.tar.bz2 /bogofilter/bogofilter-0.96.6.tar.gz /bogofilter/bogofilter-1.0.0-1.i586.rpm /bogofilter/bogofilter-1.0.0-1.src.rpm /bogofilter/bogofilter-1.0.0.tar.bz2 /bogofilter/bogofilter-1.0.0.tar.gz /bogofilter/bogofilter-faq.html /bogofilter/bogofilter-static-0.13.1-1.i586.rpm /bogofilter/bogofilter-static-0.13.2-1.i586.rpm /bogofilter/bogofilter-static-0.13.2.1-1.i586.rpm ); LatestVersion $file, \@files; # ............................................................ X ... $i++; print "$id: [$i] FileDeCompressedCmd ", "." x 40, "\n" ; for ( qw ( 1.tar 1.tar.gz 1.tgz 2.bz2 2.tar.bz2 3.zip 3.rar )) { eval { FileDeCompressedCmd $ARG }; print $EVAL_ERROR if $EVAL_ERROR; } exit; } # **************************************************************************** # # DESCRIPTION # # # # INPUT PARAMETERS # # \@data Configuration file content # # # RETURN VALUES # # none # # **************************************************************************** sub Main ($ $) { my $id = "$LIB.Main"; my ( $TAG_NAME, $data ) = @ARG; if ( $TAG_NAME ) { $debug and warn "$id: Tag name search [$TAG_NAME]\n"; } # This is an old relict and not used my %EXTRACT_HASH = ( '\.tar\.gz$' => "gzip -d -c %s | tar xvf -" , '\.gz$' => "gzip -f -d %s" , '\.bz2$' => "bzip2 -f -d %s" , '\.tar$' => "tar xvf %s" , '\.tgz$' => "tar -zxvf %s" # GNU TAR , '\.zip$' => "unzip %s" , '\.xz$' => "xz -d -c %s | tar xvf -" ); # ............................................... prepare output ... if ( $OUT_DIR ) { $verb and print "$id: chdir $OUT_DIR\n"; chdir $OUT_DIR or die "$id: chdir $OUT_DIR $ERRNO"; } my $date = DateYYYY_MM_DD(); my $count = 0; my ( %URL_ERROR_HASH , %URL_ERROR_REASON_HASH ); my $TagLine; local $ARG; for ( @$data ) { chomp; my $line = $ARG; s/^\s*[#].*$//; # Kill comments next if /^\s*$/; # ignore empty lines # ............................................ Variable defs ... # todo: should be removed, this was for gz = 'command' my %variables; %variables = /'(\S+)'\s*=\s*(.*)/g; while ( my($var, $val) = each %variables ) { $debug and warn "$id:\t\t$var = $val\n"; $EXTRACT_HASH{ $var } = $val; } # ............................................... directives ... my $LINE = $ARG; # make a secure copy my $new = $CHECK_NEWEST; my $unpack = $EXTRACT; my $overwrite = $OVERWRITE; my $contentRegexp = $CONTENT_REGEXP; my $mirror = $MIRROR; $TagLine = $ARG if /tag\d+:/; # Remember tag name my $pass = ExpandVars($1) if /\bpass:\s*(\S+)/; my $login = $1 if /\blogin:\s*(\S+)/; my $regexp = $1 if /\bregexp:\s*(\S+)/; my $regexpNo = $1 if /\bregexp-no:\s*(\S+)/; $new = 1 if /\bnew:/; $unpack = 1 if /\bx:/; $unpack = -noroot if /\bxx:/; my $xopt = $1 if /\bxopt:\s*(\S+)/; my $lcd = $1 if /lcd:\s*(\S+)/; $overwrite = 1 if /\bo(verwrite)?:/; my $vregexp = $1 if /\bvregexp:\s*(\S+)/; my $fileName = $1 if /\bfile:\s*(\S+)/; my $rename = $1 if /\brename:\s*(\S+)/; $mirror = $1 if /\bmirror:\s*(\S+)/; my $pageRegexp = $1 if /\bpregexp:\s*(\S+)/; $contentRegexp = $1 if /\bcregexp:\s*(\S+)/; my $conversion = -text if /\btext:/; if ( /\bco?nv:\s*(\S+)/ ) # old implementation used tag { local $ARG = $1; if ( /te?xt/i ) { $conversion = -text } else { warn "$id: Unknown conversion [$ARG] [$line]"; } } my $plainPage; if ( /\bpage:/ ) { $plainPage = 1; if ( /\bpage:\s*find/i ) { $plainPage = -find; } } # "lcd-ohio" is valid tag name, but "lcd" is our # directive. Accept word names after OUR directives. if ( $verb and not /print:/ and /(?:^|\s)(:[-a-z]+)\b/ ) { print "$id: [WARNING] directive, leading colon? [$1] $ARG\n"; } if ( $lcd ) { $debug > 2 and print "$id: LCD $lcd\n"; DirectiveLcd -dir => $lcd , -mkdir => $LCD_CREATE unless $NO_LCD; } # ................................................... regexp ... if ( defined $URL_REGEXP ) { if ( /$URL_REGEXP/o ) { $debug and warn "$id: REGEXP match [$URL_REGEXP] $ARG\n" } else { $debug > 3 and warn "$id: [regexp ignored] $ARG\n"; next; } } if ( defined $TAG_REGEXP ) { my $stop; ($ARG, $stop) = TagHandle $ARG, $TAG_NAME; last if $stop; next if $ARG eq ''; } # ................................................. grab url ... $ARG = ExpandVars $ARG if /\$/ and ! $rename; if ( $verb and /(print:\s*)(.+)/ ) # Print user messages { print "$TagLine: $2\n"; next; } m,^\s*((https?|ftp|file):/?(/([^/\s]+)(\S*))),; unless ( $1 and defined $2 ) { if ( /https/ ) { warn "$id: https is not supported, just http://"; } $debug and warn "$id: [skipped] not URL: $line [$ARG]\n"; next; } # ............................................... components ... my $urlOrig = $1; my $url = $urlOrig; # may be changed my $type = $2; my $path = $3; my $site = $4; my $sitePath = $5; # Remove leading slash if we log with real username. # The path is usually relative to the directory under LOGIN. # # For anonymous, the path is absolute. $sitePath =~ s,^/,, if $login; my $origFile = $sitePath; if ( $type eq 'https' ) { eval "use Crypt::SSLeay"; if ( $EVAL_ERROR ) { warn "HTTPS requires Crypt::SSLeay.pm [$EVAL_ERROR]"; next; } } my $file; # The page:find command may instruct to search # # http://some.com/~foo # http://some.com/ # # Do not consider those to contain filename part if ( $plainPage ne -find or $url !~ m,/$, or $url !~ m,/[~][^/]+$, ) { ($file = $url) =~ s,^\s*\S+/,,; $file = $fileName if $fileName ne ''; } if ( /http/ and $file eq '' and not $plainPage ) { $file = $path . "000root-file"; } $debug and print "$id: VARIABLES\n" , "\tURL = $url\n" , "\tFILE = $file\n" , "\tFILE_NAME = $fileName\n" , "\tTYPE = $type\n" , "\tPATH = $path\n" , "\tSITE = $site\n" , "\tSITE_PATH = $sitePath\n" , "\tCONVERSION = $conversion\n" ; my $saveopt; if ( $NO_SAVE == 0 and /save:\s*(\S+)/ ) { $saveopt = $1; $file = $1; } my $postfix = $POSTFIX if defined $POSTFIX; my $prefix = $PREFIX if defined $PREFIX; $prefix = $site . "::" . $prefix if $PREFIX_WWW; $prefix = $date . "::" . $prefix if $PREFIX_DATE; $file = $prefix . $file . $postfix; # Sourceforge special if ( IsSourceforgeDownload $url ) { $url = $urlOrig; # reset everything $file = ""; } # .................................................... do-it ... $debug and print "$id: type <$type> site <$site>" . "path <$path> url <$url> file <$file>\n"; $ARG = $type; my ($stat, @files); # ****************************************************************** # Set global which is used in error messages or if program must die. $CURRENT_TAG_LINE = $line; # ******************************************************************* $verb and print "$id: DIRECTORY ", cwd(), "\n"; if ( /http/ ) { $count++; if ( $plainPage eq -find and not $pageRegexp ) { die "$id: no directive" , " LINE => [$line]" ; } if ( $pageRegexp and not $plainPage ) { $debug and print "$id: Implicit [$line]\n"; $plainPage = -find; } if ( ($plainPage ne -find) and $pageRegexp and not $file ) { $debug and print "$id: Expecting [page:find]", , " for non-named download file" , " [$url]" , " LINE => [$line]" ; $plainPage = -find; } elsif ( $plainPage ne -find and $pageRegexp ) { $plainPage = -find; } if ( $saveopt and $pageRegexp and $verb > 1 ) { chomp; warn "$id: [WARNING] mixing and " , " May give multiple answers. Use absolute filename" , " URL with LINE => [$line]" ; } if ( $pageRegexp and not $plainPage ) { warn "$id: [WARNING] no page: directive [$ARG]\n"; } if ( $pageRegexp and not $file and ($plainPage ne -find)) { warn "$id: [WARNING] no file: directive. [$ARG]\n"; } if ( $pageRegexp and not $file and ($plainPage ne -find)) { warn "$id: [WARNING] no file: directive. [$ARG]\n"; } ($stat, @files) = UrlHttp url => $url , file => $file , regexp => $regexp , regexpNo => $regexpNo , proxy => $PROXY , errUrlHashRef => \%URL_ERROR_HASH , errExplanationHashRef => \%URL_ERROR_REASON_HASH , new => $new , stdout => $STDOUT , versionRegexp => $vregexp , plainPage => $plainPage , pageRegexp => $pageRegexp , contentRegexp => $contentRegexp , conversion => $conversion , rename => $rename , save => $saveopt , origLine => $line , unpack => $unpack , overwrite => $overwrite , mirror => $mirror ; } elsif ( /ftp/ ) { $count++; my ($pproto, $ssite, $ddir, $ffile) = SplitUrl $url; if ( $ffile and $ffile !~ /[.]/ ) { # ftp://some.com/dir/dir warn "$id: Did you forgot trailing slash? [$line]"; } if ( $regexp ) { # There can't be serched "file" if regexp is used. $origFile = ''; $file = ''; $sitePath = Slash $sitePath; } if ( $fileName and ! $origFile ) { $origFile = $fileName; # the "new" search. } if ( $pageRegexp ) { chomp; warn "$id: [WARNING] mixing ftp:// and " , " Did you mean instead?" , " LINE => [$line]" ; } # Directory path given, so reset the file $origFile = '' if $origFile =~ m,/$,; ($stat, @files ) = UrlFtp site => $site , url => $url , path => $sitePath , getFile => $origFile , saveFile => $file , regexp => $regexp , regexpNo => $regexpNo , firewall => $FIREWALL , login => $login , pass => $pass , new => $new , stdout => $STDOUT , conversion => $conversion , rename => $rename , origLine => $line , unpack => $unpack , overwrite => $overwrite ; } elsif ( /file/ ) { ($stat, @files) = UrlFile path => $path , file => $origFile , prefix => $prefix , postfix => $postfix , overwrite => $overwrite ; $count++; } # .............................................. conversion ... if ( $conversion eq -text ) { for my $file ( @files ) { FileHtml2txt $file; } } elsif ( $conversion ) { warn "$id: Unknown conversion [$conversion]"; } # .................................................. &unpack ... if ( $unpack and not $NO_EXTRACT ) { $debug and print "$id: extracting [@files]\n"; @files and Unpack \@files, \%EXTRACT_HASH, $unpack, $xopt; } } if ( not $count and $verb) { $URL_REGEXP and printf "$id: No labels matching regexp [%s]\n", $URL_REGEXP; @TAG_LIST and printf "$id: Nothing to do. No tag matching [%s]\n", join(' ', @TAG_LIST); if ( @CFG_FILE == 0 ) { print "$id: Nothing to do. Use config file or give URL? ", "Did you mean --tag for [@ARGV]?\n" ; } } } # **************************************************************************** # # DESCRIPTION # # Parse VAR = VALUE statements. The values are put to %ENV # # INPUT PARAMETERS # # @lines # # RETURN VALUES # # none # # **************************************************************************** sub ConfigVariableParse (@) { my $id = "$LIB.ConfigVariableParse"; my @data = @ARG; local $ARG; for ( @data ) { s/#.*//; next unless /\S/; next if /rename:/; # Skip Perl command line, which may contain '=' my %variables = /(\S+)\s*=\s*(\S+)/g; while ( my($var, $val) = each %variables ) { # Save values to environment. # Ignore some URLS, that look like variable assignments: # print http://example.com/viewcvs/vc-svn.el?rev=HEAD $debug > 2 and print "$id:[$var] = [$val] [$ARG]\n"; if ( $var =~ m![?,.&:]!i ) { $debug > 2 and print "$id: IGNORED. Wasn't a variable\n"; next; } $ARG = $val; $val = ExpandVars $ARG; $debug > 2 and print "$id: assigning ENV $var => $val\n"; $ENV{ $var } = $val; } } } # **************************************************************************** # # DESCRIPTION # # Read Configuration file contents. The directive can be in format: # # include Read from User's current dir # include <$HOME/file> Expand $HOME # include Absolute path # include Read from same directory where the # current file with includes reside. # # INPUT PARAMETERS # # $file # # RETURN VALUES # # @lines # # **************************************************************************** sub ConfigRead ( $ ); # Recursive call needs prototyping { my %staticInclude; # already included files, do not read again my $staticPwd; # Current pwd sub ConfigRead ( $ ) { my $id = "$LIB.ConfigRead"; my $file = shift; $verb > 2 and print "$id: Reading config [$file]\n"; $file = PathConvertSmart $file; $verb > 2 and print "$id: Reading config CONVERSION [$file]\n"; if ( $debug > 1 ) { print "$id: input FILE $file " , ExpandVars($file) , " [" , join(' ', %staticInclude) , "]\n" ; } # .............................................. already included ... # In windows c:/dir is same as C:/DIR my $check = $file; $check = lc $file if $WIN32; if ( exists $staticInclude{$check} ) { $debug and print "$id: skipped, already included $file\n"; return; } # .......................................................... pwd ... my $dir = dirname $file; $staticPwd = cwd() unless $staticPwd; # set inital value $debug > 2 and print "$id: PWD $staticPwd\n"; if ( -f $file and not $dir =~ /^.$|THIS/ ) { my $orig = cwd(); # Peek where are we going, handles ../../ cases too if ( chdir $dir ) { $staticPwd = cwd(); # ok, set "THIS" location chdir $orig; # Back to original directory } } elsif ( $dir eq "THIS" ) { $file = "$staticPwd/" . basename $file; $debug and print "$id: THIS set to $file\n"; } # .......................................................... read ... my ($lineArrRef, $status); if ( -f $file ) { ($lineArrRef, $status) = FileRead $file; } else { $status = "File does not exist $file"; } $staticInclude{ $file } = 1; if ( $status ) { $verb > 0 and warn "$id: SKIPPED, Can't include $file"; return; } if ( @$lineArrRef ) { ConfigVariableParse @$lineArrRef; local $ARG; my @lines; for my $line ( @$lineArrRef ) { push @lines, $line; # Skip INCLUDE statements that have been commented out. $ARG = $line; s/#.*//; next unless /[a-z]/i; # include # include # include < this/here > # include < c:/Progrm Files/this/here > if ( /include\s+<\s*(.*[^\s]+)\s*>/i ) { my $inc = $1; my $path = ExpandVars $inc; my $already = exists $staticInclude{$path}; $debug > 1 and print "$id: RECURSIVE INCLUDE [$path] [$inc]" , " already flag [$already]\n"; unless ( $already ) { push @lines, ConfigRead $path; $path = lc $path if $WIN32; $staticInclude{ $path } = 1; } } } @$lineArrRef = @lines; $debug > 4 and print "$id: READ config $file\n@lines\n\n"; } else { $debug and print "$id: Nothing found from $file\n"; } @$lineArrRef; }} # }}} # {{{ more # **************************************************************************** # # DESCRIPTION # # Start, the start of the program. # # INPUT PARAMETERS # # None # # RETURN VALUES # # None # # **************************************************************************** sub Boot () { Initialize(); HandleCommandLineArgs(); my $id = "$LIB.Boot"; # ......................................................... args ... $debug > 2 and PrintHash "$id: begin ENV", %ENV; unless ( @ARGV ) { $debug > 1 && warn "$id: No plain command line arguments\n"; } # Convert any command line arguments as if they would appear # in configuration file: # # --site-regexp gz http://there.at/ # # --> http://there.at page: pregexp:gz for my $arg ( @ARGV ) { if ( $SITE_REGEXP ) { if ( /ftp/i ) { $arg .= " regexp:$SITE_REGEXP"; } else { $arg .= " page: pregexp:$SITE_REGEXP"; } } } my @data; if ( $CFG_FILE_NEEDED and @CFG_FILE ) { for my $arg ( @CFG_FILE ) { my @lines = ConfigRead $arg; push @data, @lines; } if ( $debug > 4 ) { print "$id: CONFIG-FILE-CONTENT-BEGIN\n" , @data , "$id: CONFIG-FILE-CONTENT-END\n" ; } } $debug > 4 and PrintHash "$id: end ENV", %ENV; push @data, @ARGV if @ARGV; # Add command line URLs if ( @TAG_LIST ) { for my $arg ( @TAG_LIST ) { TagHandle undef, undef, "1-reset"; Main $arg, \@data; } } else { Main "", \@data; } } sub Test () { my $str = join '', <>; $debug = 1; print UrlHttpParseHref content => $str, regexp => "tar.gz"; } Boot(); # }}} 0; __END__ pwget-2016.1019+git75c6e3e/doc/000077500000000000000000000000001300167571300155145ustar00rootroot00000000000000pwget-2016.1019+git75c6e3e/doc/announce/000077500000000000000000000000001300167571300173225ustar00rootroot00000000000000pwget-2016.1019+git75c6e3e/doc/announce/announce-mail.txt000066400000000000000000000012771300167571300226200ustar00rootroot00000000000000From: Jari Aalto Newsgroups: comp.emacs,comp.emacs.xemacs,gnu.emacs.help Subject: ANNOUNCE: Lisp Developer tracking system --text follows this line-- Hi, I'm pleased to announce utility `pwget', which can fetch URLs around the globe. pwget - Perl webget Project http://freecode.com/projects/perlwebget Now, what does that has to do with Emacs Lisp? The utility includes a configuration files that contain few records of URLs of Emacs Lisp developers. If you're an Emacs Lisp developer and you're not yet in the configuration file yet, please spend a moment and send email including the URL of your web page where the code can be found. Jari pwget-2016.1019+git75c6e3e/doc/announce/announce.mail000066400000000000000000000022031300167571300217710ustar00rootroot00000000000000From: Jari Aalto Subject: ANNOUNCE: Emacs lisp developer tracking system (site-lisp builder) Newsgroups: comp.emacs, gnu.emacs.help, comp.emacs.xemacs Hi I've been maintaining a list which includes site address for Emacs Lisp Developers who have released code in their web or ftp sites. The list is not complete yet. If you're developing lisp code, please make yourself known by sending following information to me: - Your full name and preferred email address - URL address of a web (or FTP) page that lists your software If you keep the code under version control which is publicly available, then please send URL to the web page. Where is the list used? The list is actually a configuration file for a utility `pwget.pl' which can automatically retrieve and update files based on the rules. With the stored information the utillity can download lisp packages around the globe and build a site-lisp/net/users and site-lisp/net/packages hierarchies on local disk. Who or what projects are tracked? See http://git.savannah.gnu.org/cgit/perl-webget.git/tree/doc/examples/emacs.conf pwget-2016.1019+git75c6e3e/doc/examples/000077500000000000000000000000001300167571300173325ustar00rootroot00000000000000pwget-2016.1019+git75c6e3e/doc/examples/emacs-vars.conf000066400000000000000000000055321300167571300222470ustar00rootroot00000000000000# emacs-vars.conf -- configuration file # # File id # # Copyright (C) 2000-2016 Jari Aalto # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License as # published by the Free Software Foundation; either version 2 of the # License, or (at your option) any later version # # This program is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU # General Public License for more details. # # Visit # # How to use this file # # 1) Create $HOME/confing/pwget/pwget.conf and set environment # variable PWGET_CFG to point to that location. Let the # file read: # # # pwget.conf # # CONF = $HOME/config/pwget # # include <$CONF/emacs-vars.conf> # include <$CONF/emacs.conf> # # # End of file # # 2) Make the necessary directory location changes. You MUST change # the ROOT, where you want to store files. A good candidate # for site wide installation is /usr/share. In Windows environment # set this to something similar, like c:/share/site-lisp # # 3) Make sure perl finds pwebget along $PATH # # If you just want to see the layout, without actually downloading # anything, use command: # # perl -S pwebget -r no-match --Create-paths # # After the directories are in place, leave this command running # for few hours and you get all the latest versions of Emacs # packages known to this configuration file. # # perl -S pwget --verbose --overwrite --Tag elisp # # ........................................................................ # Root directory of all downloads. # In site wide Unix this could be something like: # # ROOT = /usr/share/emacs # # !! YOU MUST CHANGE THIS VALUE, unless you're testing first. ROOT = $HOME/tmp # These "E" variables are used for Emacs downloads. # ESITE_LISP is the root directory under all Emacs Lisp packages are stored. ESITE_LISP = $ROOT/site-lisp # The preferred Sourcefoforge download site. # CHANGE this value to reflect mirror closest to you. EHTTP_SF = http://belnet.dl.sourceforge.net/sourceforge EPKG_EMACS = $ESITE_LISP/emacs/packages EPKG_XEMACS = $ESITE_LISP/xemacs/packages ECOMMON = $ESITE_LISP/common ENET = $ESITE_LISP/net EPKG_NET = $ENET/packages # Xemacs and Emacs compatible EUSR = $ENET/users ELCD = $ECOMMON/lcd # lisp code directory EOTHER = $ECOMMON/other EDOC = $ECOMMON/doc # Info files etc. ELANG = $ECOMMON/programming EWIN32 = $ECOMMON/windows # End of file pwget-2016.1019+git75c6e3e/doc/examples/emacs.conf000066400000000000000000002125511300167571300212770ustar00rootroot00000000000000# emacs.conf -- configuration file # # File id # # Copyright (C) 2000-2016 Jari Aalto # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License as # published by the Free Software Foundation; either version 2 of the # License, or (at your option) any later version # # This program is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU # General Public License for more details. # # You should have received a copy of the GNU General Public License # along with program; see the file COPYING. If not, # visit # # Foreword # # WARNING: This file is only an example. Many links have changed # since this file was written so it cannot be used "as is". # # The examples below list download locations of Emacs Lisp # packages. If you don't know what Emacs is, you can stop # reading. # # Description # # You can't use this configuration file, unless the directory # structure is already in place. To create these paths to your # disk, change the ROOT directory in the variable configuration # file `emacs-vars.conf' See furher instructions there. # # Some packages are commented out, because they are available # via other means, like version control. # # The following special markings have been used int his file: # # #e = File is part of Emacs # #xe = File is part of XEmacs # #cvs = Person has an accessible CVS repository # # The overall site-lisp structure # # This configuration file creates the "Net" portion of this general # Emacs `site-lisp' installation structure. # # ................................................................ # # ROOT/ # common/ # | Files that can be used in Emacs and XEmacs, # | these files have been picked from # | gnu.emacs.sources or from mailing lists. # | Files do not have a homepage. # | # | doc Emacs lisp manual # | other # | programming # | win32 win32 only files # | # emacs/ Files that work only in EMACS # | users By person name # | packages By package name # | other Miscellaneous # | # net/ Packages available from net, URL exists. # cvs-packages Packages that can be updated via CVS pserver # | # | ILISP $SFORGE/ilisp # | apel :pserver:anonymous@cvs.m17n.org:/cvs/root co apel # | bbdb $SFORGE/bbdb # | cc-mode $SFORGE/cc-mode # | cedet http://cedet.sourceforge.net # | includes speedbar, EDE, quickpeek, semantic, EIEIO # | devel emacro/devel -- See Emacro # | ecb $SFORGE/ecb # | eicq $SFORGE/eicq # | emacro $SFORGE/emacro # | gnus :pserver:anoncvs@anoncvs.gnu.org:/gd/gnu/anoncvsroot co url # | jde :pserver:cvs@sunsite.auc.dk:/pack/anoncvs login # | login: cvs / checkout jde # | jess-mode $SFORGE/jess-mode # | http://www.unixuser.org/~ueno/liece/ # | lookup $SFORGE/lookup # | mailcrypt $SFORGE/mailcrypt # | tnt $SFORGE/tnt # | url CVS www.savannah.org # | w3 CVS www.savannah.org # | xslt-process $SFORGE/xslt-process # | # packages/ Packages, that consist of multiple files # | # users/ Individual user packages from their homepages. # # ROOT/ # | # +-- users By person name # +-- packages By package name # +-- other Miscellaneous # Note: The variable like $ESITE_LISP are defined in another configuration # file lcd: $ESITE_LISP tag1: elisp tag2: elisp-packages-other ####################################################################### # # # special packages # # ####################################################################### # ...................................................... &gnus ... tag3: gnus lcd: $EPKG_NET print: use CVS :pserver:gnus@cvs.gnus.org:/usr/local/cvsroot co gnus # .......................................................... &mime ... tag3: mime-semi # lcd: $ECOMMON/mime # apel is now in CVS # ftp://ftp.jaist.ac.jp/pub/GNU/elisp/semi/apel-9.12.tar.gz new: # Do not use any more. Gnus has MIME # ftp://ftp.jaist.ac.jp/pub/GNU/elisp/semi/semi-current/flim-1.12.3.tar.gz new # ftp://ftp.jaist.ac.jp/pub/GNU/elisp/semi/semi-current/semi-1.12.1.tar.gz new: tag3: mime-wemi # ftp://ftp.jaist.ac.jp/pub/GNU/elisp/semi/wemi/snapshots/wemi-199902081310.tar.gz new: tag3: chao-gnus # ftp://ftp.jaist.ac.jp/pub/GNU/elisp/semi/semi-current/chao-gnus-6.12 new: tag3: dgnus # ftp://ftp.jaist.ac.jp/pub/GNU/elisp/dgnus/gnus.tar.gz # ftp://ftp.jaist.ac.jp/pub/GNU/elisp/dgnus/hott.tar.gz # ftp://ftp.jaist.ac.jp/pub/GNU/elisp/dgnus/etc.tar.gz tag3: semi-gnus # ftp://ftp.jaist.ac.jp:/pub/GNU/elisp/semi-gnus/ ####################################################################### # # # Emacs and XEmacs compatible packages # # ####################################################################### lcd: $EPKG_NET tag4: elisp-packages-net tag5: antlr-mode # Included in Emacs 21.2 # $EHTTP_SF/antlr-mode/antlr-mode-2.2a.tar.gz new: x: tag5: arch tag5: gnuarch tag5: xtla print: http://wiki.gnuarch.org/moin.cgi/xtla print: https://gna.org/projects/xtla-el http://download.gna.org/xtla-el/xtla-0.9.tar.gz new: x: # http://xsteve.nit.at/prg/emacs/ pregexp:xtla.el tag5: auto-insert # This package does not have a good internal archive structure http://www.geocities.com/SiliconValley/Foothills/9093/files/auto-insert-tkld-1.23.tar.gz new: x: tag5: artist print: Part of Emacs 21.1. There is no homepage any more tag5: auctex print: Get AUCTex from CVS tree print: cvs -d :pserver:cvs@sunsite.auc.dk:/pack/anoncvs co auctex tag5: bibletools http://prdownloads.sourceforge.net/sourceforge/bibletools/BibleTools-0.13.tar.gz x: tag5: bbdb # Use CVS tag5: bbdb-expire # Included in BBDB now # http://www.esperi.demon.co.uk/nix/downloads/bbdb-expire-1.4.tar.gz new: x: tag5: bakel-tijs tag5: c-mode-addon http://vengeance.et.tudelft.nl/~smoke/pub/emacs pregexp:(tar.gz|.el$) x: tag5: bugtrack http://jdee.sunsite.dk/developerscorner.html pregexp:bug.*-\d+\.[\d\.]+tar.gz x: tag5: cc-mode # http://cc-mode.sourceforge.net/ # http://www.python.org/emacs/cc-mode/cc-mode.tar.gz tag5: color-mate # vregexp:color-mate-\d+[\d.]+.tar.gz # file:color-mate-10.1.tar.gz # pregexp:\d\.tar.gz http://www.netlab.is.tsukuba.ac.jp/~yokota/izumi/color_mate/ pregexp:color-mate-\d+\.[\d\.]+tar.gz file:color-mate-1.10.tar.gz new: x: tag5: dcsh-mode # 2001-03 Link is not longer there # http://www.emacs.org/hdl/dcsh-mode.html page: pregexp:\d\.tar.gz vregexp:((?i)dcsh\s+mode\s+([\d.]+)) file:dcsh-mode-1.2.tar.gz new: x: tag5: dired-dd http://www.asahi-net.or.jp/~pi9s-nnb/dired-dd-home.html http://www.asahi-net.or.jp/~pi9s-nnb/dired-dd-home.html page:find pregexp:\.tar.gz x: tag5: docbook-xml http://prdownloads.sourceforge.net/sourceforge/docbookxml/docbook-xml-mode.tar.gz x: tag5: ede print: See cedet.sourceforge.net and use CVS to get code. tag5: eicq # This project is now available at sourceforge. Use CVS # http://sourceforge.net/projects/eicq/ # Old: 2000-03 # http://www.sfu.ca/~stephent/eicq/ pregexp:tar.gz x: # 2001-03 no longer exists # http://users.ozlinx.com.au/~youngs_s/eicq/ page:find pregexp:eicq-0.2.5.tar.gz vregexp:((?i)current\s+version\s+-\s-[^\d]+([\d.]+) file:eicq-0.2.5.tar.gz x: tag5: eieio print: See cedet.sourceforge.net and use CVS to get code. tag5: elib # Required by pcl-cvs ftp://ftp.lysator.liu.se/pub/emacs/elib-1.0.tar.gz new: x: tag5: emacs-jabber http://prdownloads.sourceforge.net/emacs-jabber/emacs-jabber-0.6.1.tar.gz new: x: tag5: esheet http://esheet.tripod.com/index.html page: pregexp:tar.gz x: tag5: eshell # Includes pcomplete # the vregexp is not needed, because the DEFUALT regexp will # match string "The latest version of Eshell is 2.4.1". # # The is mandatory so that correct name template # is known to program. print: eshell is included in latest Emacs, download site was print: http://www.gci-net.com/users/j/johnw/ # http://www.gci-net.com/users/j/johnw/eshell.html page: pregexp:eshell.tar.gz file:eshell-1.3.tar.gz new: x: tag5: eudc # Now available at sourceforge CVS # http://lspwww.epfl.ch/~figueire/Software/eudc/ page: save:emacs-figueiredo-oscar-eudc.heml # http://lspwww.epfl.ch/~figueire/Software/eudc/index.html page: pageregexp:eudc-1.30b.tar.gz vregexp:((?i)new\s+in\s+version\s+([\d.]+)) file:eudc-1.30b.tar.gz new: x: print eudc - emacs unified Directory client, is available sourceforge tag5: flymake http://prdownloads.sourceforge.net/flymake/flymake-0.1.zip new: x: tag5: guess-lang print. See tag `drieu-benjamin' # http://www.grassouille.org/emacs/index.en.html page: save:emacs-drieu-benjamin.html pregexp:guess.*\.tar.gz x: tag5: gnuplot http://leonardo.phys.washington.edu/~ravel/software/gnuplot-mode/Welcome.html save:emacs-gnuplot-mode.html http://leonardo.phys.washington.edu/~ravel/software/gnuplot-mode/Welcome.html page: pregexp:\.tar.gz x: tag5: gnyognyo #todo http://www.gentei.org/~yuuji/software/ pregexp:gnyognyo x: tag5: html-helper print: available at tag "minar-nelson" tag5: hm-html-menus http://www.tnt.uni-hannover.de/~muenkel/software/own/hm--html-menus/overview.html save:hm-overview.html ftp://ftp.tnt.uni-hannover.de/pub/editors/xemacs/contrib/ regexp:hm-- regexp-no:dired x: tag5: id3 http://www.gentei.org/~yuuji/software/mpg123el/ pregexp:id3el-[\d.]+tar.gz x: tag5: idlwave print: Included in latest Emacs # http://idlwave.org/download/idlwave.tar.gz tag5: italk http://prdownloads.sourceforge.net/sourceforge/italk/italk-el-1.03.tar.gz x: new: tag5: irchat # patches http://www.iki.fi/azure/tmp/ # http://people.ssh.fi/tri/irchat/ # # irchat latest snapshots are in format -YYYYMMDD and the older # release kits are in format -NN.NN We search here the # very latest kits irchat-980625.tar.gz http://people.ssh.fi/tri/irchat/index.html page: pregexp:/irchat-\d{8} vregexp:((?i)irchat-(\d{8})) file:irchat-980625.tar.gz new: x: tag5: jabber http://prdownloads.sourceforge.net/sourceforge/emacs-jabber/emacs-jabber-0.3.1.tar.gz new: x: tag5: jde tag5: jdee # The "latest" is last production release, "beta" is # bleading edge. Unfortunately there is no inforamtion # on the page about the latest version number which we # could use. This simply downlodas the file again. # # http://jde.sunsite.auc.dk/jde-latest.zip new: overwrite: x: # Search from page following string in one line: # # "JDEE 2.2.9beta6, the latest beta, is also available in # zip or zipped tar format." # # http://jdee.sunsite.dk/rootpage.html page: pregexp:jde-beta.zip vregexp:((?i)JDEE\s+([\d.]+)) file:jde-1.1.zip new: x: print: CVS :pserver:cvs@sunsite.auc.dk:/pack/anoncvs login http://jdee.sunsite.dk/jde-latest.zip overwrite: x: tag5: jde-contrib lcd: $EPKG_NET/jde-contrib http://jdee.sunsite.dk/contributions.html save: jde-contributions.html http://jdee.sunsite.dk/contributions.html pregexp:\.el regexp-no:jsee|jserial lcd: $EPKG_NET tag5: jde-docindex # 2005-09-29 invalid URL # http://relativity.yi.org/jde-docindex/download/jde-docindex-0.9.0.tar.gz new: x: tag5: jde-usages http://prdownloads.sourceforge.net/sourceforge/jde-usages/jde-usages-0.1.zip x: tag5: lout http://www.chez.com/emarsden/lout/ page:find pregexp:lout.*((?i)faq.*html|tar.gz) x: tag4: mirror http://people.netscape.com/drush/emacs/ pregexp:\.tar.gz x: tag5: notes-mode #todo: error on loading! http://www.isi.edu/~johnh/SOFTWARE/NOTES_MODE/index.html pregexp:notes-mode-[\d.]+.\d.tar.gz file:notes-mode-1.11.tar.gz new: x: tag5: marche # http://www.gentei.org/~yuuji/software/ page:find pregexp:marche-[\d.]+.tar.gz vregexp:(marche-([\d.]+)) file:marche-1.11.tar.gz new: x: http://www.gentei.org/~yuuji/software/ page:find pregexp:marche-[\d.]+.tar.gz x: tag5: mkhtml tag5: adams-drew # 2000-12 No personal homepage ftp://ftp.cis.ohio-state.edu/pub/emacs-lisp/archive/mkhtml-1.0.tar.gz new: x: tag5: mmm-mode print: See mmm-mode.sourceforge.net and use CVS to get code. # The project is now at sourceforge # ftp://download.sourceforge.net/pub/sourceforge/mmm-mode/mmm-mode-0.4.6.tar.gz new: x: tag5: two-mode http://www.dedasys.com/freesoftware/files/two-mode-mode.el tag5: mp3 http://prdownloads.sourceforge.net/sourceforge/emacsmp3player/mp3player-1.8.tar.gz new: x: tag5: psgml # Do not download this, the Emacs loaddefs.el sets autoload html-mode # to psgml-mode.el, but the latest package does not # do that any more. => Emacs doesn't find html-mode. # And the package is no longer actively maintained. # ftp://ftp.lysator.liu.se/pub/sgml/psgml-1.0.3.tar.gz new: x: print: MOTE that sgml-mode.el is included in XEmacs print: See http://www.lysator.liu.se/projects/about_psgml.html tag5: php-mode http://prdownloads.sourceforge.net/sourceforge/php-mode/php-mode.el tag5: python-mode # Official page is at http://wiki.python.org/moin/EmacsPythonMode print: See sourceforge project: python-mode tag5: rcpp-mode http://www.interhack.net/projects/rcpp-mode/ pregexp:tgz x: tag5: records-mode print: see records.sourceforge.net and use CVS to get code # http://www.cse.ogi.edu/~ashvin/software.html pregexp:records-[\d.]+.tar.gz file:records-1.2.2.tar.gz new: x: tag5: semantic print: See cedet.sourceforge.net and use CVS to get code. tag5: ses # SES21 - Simple Emacs Spreadsheet # Jonathan Yavner http://mywebpages.comcast.net/jyavner/ses/ pregexp:ses21-020426.tgz tag5: sml-mode ftp://rum.cs.yale.edu/pub/monnier/sml-mode/sml-mode-3.9.5.tar.gz new: xopt:rm x: tag5: semantic print: See cedet.sourceforge.net and use CVS to get code. tag5: svn tag5: subversion print Note: At Savannat there is also subversion code print http://savannah.gnu.org/cgi-bin/viewcvs/emacs/emacs/lisp/vc-svn.el?rev=HEAD save:vc-svn.el print See psvn.el from 'reichor-stefan' tag5: template print: See project page http://emacs-template.sourceforge.net/ # No CVS! $EHTTP_SF/emacs-template/template-3.1a.tar.gz new: tag5: ttn # Too many files that conflict other sources (core Emacs etc) # http://www.glug.org/people/ttn/software/ttn-pers-elisp/ pregexp:tar.gz x: http://www.glug.org/people/ttn/software/minor-mode-survey/ pregexp:(survey|README) tag5: vera-mode # 2001-12 No more http://www.emacs.org/hdl/vera-mode.html page: pregexp:\d\.tar.gz vregexp:((?i)vera\s+mode\s+([\d.]+)) file:vera-mode-2.3.tar.gz new: x: tag5: vhdl-mode # in Emacs # http://www.emacs.org/hdl/vhdl-mode.html page: pregexp:\d\.tar.gz vregexp:((?i)vhdl\s+mode\s+([\d.]+)) file:vhdl-mode-3.30.tar.gz new: x: tag5: w3 print: latest versions available at savannah.gnu.org CVS tree tag5: x-symbol $EHTTP_SF/x-symbol/x-symbol-4.40-src.tar.gz new: x: tag5: xslt print: See xslt-process.sourceforge.net and use CVS to get code. print: 2002-08-01 old page was at http://www.geocities.com/SiliconValley/Monitor/7464/emacs tag5: zenirc http://www.zenirc.org/download.html pregexp:\.tar\.gz x: ####################################################################### # # # Misc packages # # ####################################################################### tag4: elisp-packages-misc http://www.asahi-net.or.jp/~pi9s-nnb/myelisp.html page:find pregexp:\.tar.gz x: ####################################################################### # # # Emacs only packages # # ####################################################################### tag4: elisp-packages-emacs lcd: $EPKG_EMACS tag5: pcl-cvs print: Note, pcl-cvs.el is part of Eamcs 21 under name pcvs.el print: Old pcl-cvs can be found at http://savannah.nongnu.org/projects/elisp-code # pcl-cvs-2.0b2.tar.gz # ftp://ftp.weird.com/pub/local/pcl-cvs-2.9.9.tar.gz new: # ftp://rum.cs.yale.edu/pub/monnier/pcl-cvs/pcl-cvs-2.9.9.tar.gz new: x: # ftp://rum.cs.yale.edu/pub/monnier/pcl-cvs.tar.gz overwrite: ####################################################################### # # # Misc files # # ####################################################################### tag3: emacs-elisp-intro lcd: $ECOMMON/doc ftp://ftp.gnu.org/gnu/emacs/emacs-lisp-intro-2.04.tar.gz new: tag3: emacs-elisp-manual lcd: $ECOMMON/doc ftp://ftp.gnu.org/gnu/emacs/elisp-manual-21-2.8.tar.gz new: tag3: jsp lcd: $ECOMMON/www # jsp-html-helper-mode.el -- JSP add-on for html-helper-mode.el # # Author: Ben Tindale # Maintainer: Ben Tindale # Created: 19 April 2000 https://sourceforge.net/snippet/download.php?type=snippet&id=100284 save:jsp-html-helperl-mode.el tag3: visual-basic lcd: $ECOMMON/programming/vb ####################################################################### # # # Emacs user packages from net # # ####################################################################### tag2: elisp-users lcd: $EUSR tag3: abrahamsen-per # lcd: $EUSR/abrahamsen-per print: See tag 'auctex' tag3: buhl-josh lcd: $EUSR/buhl-josh # 2000-12 No personal homepage ftp://ftp.cis.ohio-state.edu/pub/emacs-lisp/archive/screenline.el tag3: akimichi-tatsukawa lcd: $EUSR/akimichi-tatsukawa http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Akimichi pregexp:\.el$ tag3: alcorn-doug lcd: $EUSR/alcorn-doug http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Alcorn pregexp:\.el$ tag3: anderson-patrick lcd: $EUSR/anderson-patrick http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Patrick.*Andersen tag3: arneson-erik lcd $EUSR/arneson-erik http://erik.arneson.org/elisp.html pregexp:\.el$ tag3: bakhash-david lcd: $EUSR/bakhash-david # Has no central page for all the lisp, impossible to find http://mit.edu/cadet/www/cl-array.el # Included in Emacs: http://www.mit.edu/people/cadet/strokes.el http://www.mit.edu/people/cadet/.strokes save:emacs-bakhash-david-strokes-dotfile.el http://www.mit.edu/people/cadet/strokes-abc.el http://www.mit.edu/people/cadet/strokes-help.html tag3: barzilay-eli # lcd: $EUSR/barzilay-eli #todo: print: barzilay-eli Emacs download site unknown # http://www.cs.cornell.edu/eli/ # ftp://ftp.cs.cornell.edu/pub/eli/ regexp:\.el$ # # calculator.el is in Emacs 21.2 # http://www.cs.cornell.edu/eli/interests.html pregexp:\.el$ regexp-no:calculator.el tag3: belanger-jay lcd: $EUSR/belanger-jay http://vh213601.truman.edu/~belanger/Emacs.html save:emacs-belanger-jay.html http://vh213601.truman.edu/~belanger/Emacs.html pregexp:\.el$ regecp-no:httpd tag3: berndl-klaus tag3: cygwin-mount lcd: $EUSR/berndl-klaus # No homepage, but one important code is here. http://www.emacswiki.org/elisp/index.html pregexp:cygwin-mount.el$ tag3: berry-karl tag3: crypt lcd: $EUSR/berry-karl print: http://fink.sourceforge.net/pdb/package.php/crypt++el # ftp://ftp.cs.umb.edu/pub/misc/crypt++.el tag3: bihlmeyer-robert tag3: gnus-junk # lcd: $EUSR/bihlmeyer-robert print: 2002-08-02 URL and email address unknown. tag3: bini-michele lcd: $EUSR/bini-michele http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Michele.*Bini pregexp:\.el$ regexp-no:diff.el tag3: bjorlykke-stig lcd: $EUSR/bjorlykke-stig http://www.xemacs.org/~stigb page: pregexp:\.el tag3: blaak-ray # lcd: $EUSR/blaak-ray print: Latest Emacs includes delphi.el # http://www.infomatch.com/~blaak pregexp:\.el$ regexp-no:delphi.el tag3: boukanov-igor lcd: $EUSR/boukanov-igor http://www.fi.uib.no/~boukanov/emacs/ page: save:emacs-boukanov-igor.html # http://www.fi.uib.no/~boukanov/emacs/ pregexp:\.el tag3: breton-peter tag3: pbreton lcd: $EUSR/breton-peter # #todo: 2001-08 connection error # most of the files are in Emacs now. # http://pbreton.ne.client2.attbi.com/emacs/ pregexp:\.el regexp-no:generic|net-util|dirtrack|net-util|locate|generic|dirtrack|find-lisp|generic|locate tag3: breton-tom # lcd: $EUSR/breton-tom print: 2002-08-02 Address unknown. tag3: brillant-alexandre lcd: $EUSR/brillant-alexandre http://www.djefer.com/~jtemplate/ page:find pregexp:^jtemplate\.el tag3: broubaker-heddy # lcd: $EUSR/broubaker-heddy print: 2002-08-02 Address unknown. tag3: brodie-bill lcd: $EUSR/brodie-bill #todo: no home page ftp://ls6-ftp.cs.uni-dortmund.de/pub/src/emacs/ regexp:(linemenu).*el$ tag3: brown-jeremy lcd: $EUSR/brown-jeremy http://www.ai.mit.edu/~jhbrown/ifile-gnus.html pregexp:\.el$ http://www.ai.mit.edu/~jhbrown/ifile-gnus.html pregexp:\.tar.gz$ x: tag3: burgett-steve # URL does not exist #lcd: $EUSR/burgett-steve #http://robotics.eecs.berkeley.edu/~burgett tag3: burton-brent lcd: $EUSR/burton-brent http://www.io.com/~brentb/emacs/ page: save:emacs-burton-brent.html http://www.io.com/~brentb/emacs/ page:find pregexp:(\.el|(?i)faq) tag3: burton-kevin lcd: $EUSR/burton-kevin print: CVS at http://www.peerfear.org/cgi-bin/cvsweb/el/ http://relativity.yi.org/emacs/ page: save:emacs-burton-kevin.html http://relativity.yi.org/emacs/ page:find pregexp:\.el$ regexp-no:NAME http://relativity.yi.org/el/ page:find pregexp:etail tag3: caoile-clifford-escobar # lcd: $EUSR/caoile-clifford-escobar # http://www2.odn.ne.jp/piyokun/emacs/ tag3: carpenter-bill # feedmail is now part of Emacs # http://www.carpenter.org/ tag3: casadonte-joe lcd: $EUSR/casadonte-joe # old was http://www.netaxs.com/~joc/emacs.html http://www.northbound-train.com/emacs.html page:find pregexp:\.el$ regexp-no:cperl tag3: chen-gongquan lcd: $EUSR/chen-gongquan # Jerry G. Chen http://www.geocities.com/SiliconValley/Bridge/7750/xemacs/ page: save:emacs-chen-gongquan.html http://www.geocities.com/SiliconValley/Bridge/7750/xemacs/ page:find pregexp:\.el.gz$ x: tag3: chua-sandra lcd: $EUSR/chua-sandra lcd: $EUSR/chua-sandra/planner http://sacha.free.net.ph/notebook/emacs/ pregexp:planner(-[a-z]+)?\.el$ # Sandra Jean Chua lcd: $EUSR/chua-sandra/remember http://sacha.free.net.ph/notebook/emacs/remember/ pregexp:(\.el|texi|info) tag3: chen-jerry lcd: $EUSR/chen-jerry http://www.geocities.com/SiliconValley/Bridge/7750/xemacs/ page: save:emacs-chen-jerry.html http://www.geocities.com/SiliconValley/Bridge/7750/xemacs/ pregexp:\.el(\.gz)? x: tag3: clausen-lars lcd: $EUSR/clausen-lars http://shasta.cs.uiuc.edu/~lrclause/tc.html pregexp:\.el$ tag3: conrad-christoph lcd: $EUSR/conrad-christoph # todo: where is the homepage? This is not ideal URL # which.el is part of the "emacro". See CVS # http://www.gnusoftware.com/Emacs/Lisp/ page:find pregexp:^which.el tag3: corcoran-travis lcd: $EUSR/corcoran-travis http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Corcoran pregexp:\.el$ tag3: corneli-joe lcd: $EUSR/corneli-joe http://www.ma.utexas.edu/~jcorneli/a/elisp/ pregexp:\.el tag3: curtin-matt print: see package tag 'rcpp-mode' tag3: daiki-ueno tag3: liece print: Use CVS at :pserver:anonymous@cvs.m17n.org:/cvs/root co liece tag3: dasmohapatra-vivek lcd: $EUSR/dasmohapatra-vivek print: Cannot parse web page, ownload manually from http://rtfm.etla.org/emacs # http://rtfm.etla.org/cgi-bin/htmlfontify.cgi tag3: dampel-herbert lcd: $EUSR/dampel-herbert # No email address or homepage known ftp://ftp.ul.bawue.de/pub/purple/emacs/ regexp:\.el x: regexp-no:battery|info-look tag3: davidson-kevin lcd: $EUSR/davidson-kevin # Those listed in no regexp are sperate "packages" # # http://www.geocities.com/SiliconValley/Foothills/9093/files/ page:find pregexp:. regexp-no: auto-insert|setcdblk|proxy-cmsd|^dt\.|^dtbin x: # http://www.geocities.com/SiliconValley/Foothills/9093/files/mail-signature.gz save:mail-signature.el.gz x: # http://www.geocities.com/SiliconValley/Foothills/9093/files/clean.gz save:clean.pl.gz x: tag3: deleuze-christophe lcd: $EUSR/deleuze-christophe http://www-rp.lip6.fr/~deleuze/emacs-en.html save:emacs-deleuze-christophe.html http://www-rp.lip6.fr/~deleuze/emacs-en.html page: pregexp:\.el tag3: dickow-ulrik lcd: $EUSR/dickow-ulrik http://www.nbi.dk/~dickow/emacs page: save:emacs-dickow-ulrik.html http://www.nbi.dk/~dickow/emacs page:find pregexp:Xdefaults http://www.nbi.dk/~dickow/skel/.emacs save:dickow-ulrik-dotemacs.el tag3: dirson-yann #todo: lcd: $EUSR/dirson-yann http://ydirson.free.fr/ pregexp:\.el tag3: drieu-benjamin lcd: $EUSR/drieu-benjamin # The fortune.el is maintained by shulman-michael # fortune.el is included in Emacs 21.2 http://www.grassouille.org/emacs/index.en.html page: save:emacs-drieu-benjamin.html http://www.grassouille.org/emacs/index.en.html pregexp:\.el$ regexp-no:fortune|folding|pong tag3: dyke-neil tag3: junkbuster tag3: spamprod lcd: $EUSR/dyke-neil http://www.neilvandyke.org/spamprod/ page:find pregexp:\.el$ http://www.neilvandyke.org/junkbust-emacs/ page:find pregexp:\.el$ #e http://www.neilvandyke.org/webjump/ page:find pregexp:\.el$ http://www.neilvandyke.org/jasmin-emacs/ page:find pregexp:\.el$ http://www.neilvandyke.org/bbdbpalm/ page:find pregexp:\.el$ http://www.neilvandyke.org/perkymbf/ page:find pregexp:\.el$ http://www.neilvandyke.org/kbdraw/ page:find pregexp:\.el$ http://www.neilvandyke.org/padr/ page:find pregexp:\.el$ http://www.neilvandyke.org/quack/ page:find pregexp:\.el$ http://www.neilvandyke.org/revbufs/ page:find pregexp:\.el$ http://www.neilvandyke.org/noticeify/ page:find pregexp:\.el$ http://www.neilvandyke.org/nsamail/ page:find pregexp:\.el$ tag3: edmonds-brian #todo: lcd: $EUSR/edmonds-brian http://www.gweep.ca/~edmonds/usenet page: save:emacs-edmonds-brian.html tag3: eide-eric lcd: $EUSR/eide-eric http://www.cs.utah.edu/~eeide/emacs/index.html save:emacs-eide-eric.html http://www.cs.utah.edu/~eeide/emacs/index.html pregexp:\.el regexp-no:filladapt x: tag3: ellison-gary lcd: $EUSR/ellison-gary http://www.interhack.net/people/gfe/ pregexp:\.el tag3: elmes-damien lcd: $EUSR/elmes-damien print: Current maintainer http://mwolson.org/projects/EmacsWiki.html print: Old page http://repose.cx/emacs/wiki/index.html tag3: englen-stephen lcd: $EUSR/englen-stephen # In Emacs: mspools.el iswitchb.el http://www.anc.ed.ac.uk/~stephen/emacs/index.html save:emacs-englen-stephen.html http://www.anc.ed.ac.uk/~stephen/emacs/ page:find pregexp:\.el$ regexp-no:(mspool|switch) http://www.anc.ed.ac.uk/~stephen/emacs/ell.el http://anc.ed.ac.uk/~stephen/emacs/ell.html # This file is in list of theberge-jean # http://anc.ed.ac.uk/~stephen/emacs/ell.html page:find pregexp:ell\.el$ tag3: ernst-michael lcd: $EUSR/ernst-michael # Other packages are very old. The EDB is Emacs database, # Which seems to be old too. http://sdg.lcs.mit.edu/~mernst/software/edb/ pregexp:gz x: tag3: fouts-martin # lcd: $EUSR/fouts-martin print: 2002-08-02 Address unknown. tag3: friedman-noah lcd: $EUSR/friedman-noah # ftp://ftp.splode.com/pub/users/friedman/emacs-lisp/ # In Emacs: eldoc.el rlogin.el rsz-mini.el type-break.el vcard.el #todo ftp://ftp.splode.com/pub/users/friedman/emacs-lisp/ http://www.splode.com/~friedman/software/emacs-lisp/index.html save:emacs-noah.html http://www.splode.com/~friedman/software/emacs-lisp/ pregexp:\.el$ regexp-no:(eldoc|rlogin|rsz|suggbind|type-break|vcard|whitespace|type-break|vcard) tag3: galbraith-peter lcd: $EUSR/galbraith-peter http://people.debian.org/~psg/elisp/ pregexp:\.*el regexp-no:ff-paths|ffap|word-help| tag3: garshol-lars tag3: css-mode http://www.garshol.priv.no/download/software/css-mode/ pregexp:\.el$ tag3: gaston-pierre lcd: $EUSR/gaston-pierre http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Gaston pregexp:\.el$ tag3: glickstein-bob # lcd: $EUSR/glickstein-bob # Now in Emacs sregex.el ishl.el (isearch-lazy-highlight) # 2001-12 Page has disappeared # http://www.zanshin.com/~bobg/software.html page: save:emacs-glickstein-bob.html # http://www.zanshin.com/~bobg/source/emacs/ishl.el # http://www.zanshin.com/~bobg/source/emacs/unscroll.el # http://www.zanshin.com/~bobg/source/emacs/live-mode.el # http://www.zanshin.com/~bobg/source/emacs/wcount.el # http://www.zanshin.com/~bobg/source/emacs/winhist.el tag3: goel-ashvin print: See sourceforge project 'records'. tag3: goel-deepak tag3: deego # Todo: this site has difficult download structure. lcd: $EUSR/goel-deepak http://gnufans.net/~deego/emacs.html page: save:emacs-goel-deepak.html print: Page is too difficult to parse. Please browse manually http://gnufans.net/~deego/emacs.html http://gnufans.net/~deego/DeegoWiki/WelcomePage.html pregexp:lisp-mine.tar x: # http://gnufans.net/~deego/emacspub/gnusfop/alpha/gnusfop.el # http://gnufans.net/~deego/emacspub/autoview/alpha/ pregexp:(\.el|\.tex)$ # http://gnufans.net/~deego/emacspub/miniedit/alpha/ pregexp:\.el$ # http://gnufans.net/~deego/emacspub/clel/alpha/ pregexp:\.el$ # http://gnufans.net/~deego/emacspub/choose/alpha/ pregexp:\.el$ # This pacage does not have standard kit naming # NAME-VERSION.VERSION.tar.gz # impossible to download latest # http://gnufans.net/~deego/emacspub/elder/2.7.1release.tar.gz x: # http://gnufans.net/~deego/emacspub/faith/alpha/faith.el # http://gnufans.net/~deego/emacspub/notworking/newmail/alpha/newmail.el # http://gnufans.net/~deego/emacspub/timerfunctions/alpha/timerfunctions.el tag3: gorrell-harley lcd: $EUSR/gorrell-harley # footnote.el in not "The footnote", which comes with XEmacs and # inserts [1] references to mail messages. #http://www.mahalito.net/~harley/elisp/ page: save:emacs-gorrell-harley.html #http://www.mahalito.net/~harley/elisp/ page:find pregexp:\.el regexp-no:footnote http://www.mahalito.net/~harley/elisp/footnote.el save:jhg-footnote.el tag3: grigni-michelangelo lcd: $EUSR/grigni-michelangelo ftp://ftp.mathcs.emory.edu/pub/mic/emacs regexp:(\.el$|(?i)readme) regexp-no:pp.el|ffap|ff-path tag3: grossjohan-kai lcd: $EUSR/grossjohan-kai # This regexp-no files are not Kai's, they are just # available at this ftp directory or the file is part of emacs # # ssh is old, it is now tramp package. # ftp://ls6-ftp.cs.uni-dortmund.de/pub/src/emacs/ regexp:(\.el$|rcp.el) regexp-no:(nndb|rect-mark|setnu|todo|worklog|cyclebuffer|winmgr|linemenu|lib-complete|locate|ssh) x: # http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Grossjohann pregexp:\.el$ http://www.emacswiki.org/elisp/index.html pregexp:longlines.el$ print: See savannah.gnu.org/projects/tramp and use CVS to get code. print: See tamp customisations at ftp://ftp.comlab.ox.ac.uk/tmp/Joe.Stoy/ tag3: heideman-john lcd: $EUSR/heideman-john # note-mode is in the packages section. http://www.isi.edu/~johnh/SOFTWARE/index.html page: save:emacs-heideman-john.html http://www.isi.edu/~johnh/SOFTWARE/index.html page:find pregexp:\.el regexp-no:crypt http://www.isi.edu/~johnh/SOFTWARE/NOTES_MODE/index.html pregexp:\.el tag3: hiroshi-yokota # http://www.netlab.is.tsukuba.ac.jp/~yokota/izumi/color_mate/ pregexp:color-mate-10.3.tar.gz new: tag3: hirose-yuuji lcd: $EUSR/hirose-yuuji http://www.gentei.org/~yuuji/software/biff-el.html pregexp:\.el regexp-no:base64|idl3 http://www.gentei.org/~yuuji/software/mpg123el/ pregexp:\.el regexp-no:base64|idl3 # See also 2001-12 Access problems # http://www.yatex.org/ tag3: hodges-matthew lcd: $EUSR/hodges-matthew http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Matthew.*Hodges pregexp:\.el$ http://vegemite.chem.nott.ac.uk/~matt page:find pregexp:\.el$ tag3: hodgson-kahlil lcd: $EUSR/hodgson-kahlil http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Hodgson pregexp:\.el$ tag3: hughes-graham lcd: $EUSR/hughes-graham http://www.xs4all.nl/~cg/ciphersaber/comp/emacs-lisp.txt save:rc4.el tag3: howe-dennis lcd: $EUSR/howe-dennis # In Emacs browse-url.el http://wombat.doc.ic.ac.uk/emacs/dot-emacs.el save:emacs-howe-dennis-dotemacs.html tag3: ingalls-bruce #cvs http://emacro.sourceforge.net/ #cvs http://www.sourceforge.net/projects/emacro/ # An OLDER version of a2ps-print.el is bundled with a2ps ftp://ftp.cppsig.org/pub/tools/emacs/ regexp:a2ps-print regexp-no:old tag3: ingrand-francois lcd: $EUSR/ingrand-francois http://www.laas.fr/~felix/despam.html save:ingrand-francois-despam.html http://www.laas.fr/~felix/despam.html pregexp:\.el$ tag3: jackson-trey lcd: $EUSR/jackson-trey http://bmrc.berkeley.edu/~trey/emacs/src/sig-quote.el http://bmrc.berkeley.edu/~trey/emacs/src/rmail-extras.el tag3: johnson-bryan lcd: $EUSR/johnson-bryan http://www.comsecmilnavpac.net/elisp/ page: save:emacs-johnson-bryan.html http://www.comsecmilnavpac.net/elisp/ page:find pregexp:\.el$ tag3: jones-kyle lcd: $EUSR/jones-kyle http://www.wonderworks.com/ page: save:emacs-wonderworks.html http://www.wonderworks.com/ page:find pregexp:\.el regexp-no:crypt|nnir|prosper|bibfind x: # Not maintained by kyle # http://www.wonderworks.com/download/crypt++.el # ftp://www.uni-mainz.de/pub/gnu/vm/vm.tar.gz new: x: tag3: josefsson-simon lcd: $EUSR/josefsson-simon # nnimap.el, base64.el, dig.el are part of Gnus http://josefsson.org/ page: save:emacs-josefsson-simon.html http://josefsson.org/aes/ pregexp:\.el$ tag3: jump-theodore lcd: $EUSR/jump-theodore # http://www.tertius.com/projects/library/ page: save:emacs-jump-theodore.html http://www.tertius.com/projects/library/emacs/_emacs.gz save:jump-theodore-dotemacs.el.gz x: # http://www.tertius.com/projects/library/emacs/ page:find pregexp:. regexp-no:(site-lisp|^emacs-|^elisp-|^_ema|elc|nnir|prosper) x: tag3: kadlec-albrecht # 2001-12 Site disappeared # lcd: $EUSR/kadlec-albrecht # http://www.auto.tuwien.ac.at/~albrecht tag3: kapur-nevin http://www.mts.jhu.edu/~kapur/hacks.html pregexp:\.el$ regexp-no:gnus-grepmail|msn.el tag3: karunakaran-rajeev lcd: $EUSR/karunakaran-rajeev http://www.mayura.com/misc/ page: save:emacs-karunakaran-rajeev.html http://www.mayura.com/misc/ page:find pregexp:\.el http://www.mayura.com/misc/java-path.txt tag3: keane-joe lcd: $EUSR/keane-joe http://www.jgk.org/src/elisp.html pregexp:\.el tag3: kemp-steve lcd: $EUSR/kemp-steve # 2003-06-02 No longer available # Not all files are by Steve kemp, select only some # http://www.gnusoftware.com/Emacs/Lisp/ page:find pregexp:(auto-diff|digest|dired-fns|lisp-index|macro-mode|keep-buf|mp3|require|slashdot|small-func|htmlpp|version|w32-faq|w32-set|word).*\.el # # http://www.gnusoftware.com/Emacs/Lisp/modes.el save:emacs-kemp-steve-modes.el # http://www.gnusoftware.com/Emacs/Lisp/gnuemacs.el save:emacs-kemp-steve-dotemacs.el # http://www.gnusoftware.com/Emacs/Lisp/setup-emacs.el save:emacs-kemp-steve-dotemacs2.el tag3: kifer-michael tag3: ediff # lcd: $EUSR/kifer-michael # These are part of Emacs now # ftp://ftp.cs.sunysb.edu/pub/TechReports/kifer/viper.tar.Z # ftp://ftp.cs.sunysb.edu/pub/TechReports/kifer/ediff.tar.Z tag3: kleinpaste-karl lcd: $EUSR/kleinpaste-karl http://www.cs.cmu.edu/~karl page:find pregexp:\.el tag3: kline-christopher print: See sourceforge project 'starteam-el' tag3: koomen-hans lcd: $EUSR/koomen-hans http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Koomen pregexp:\.el$ tag3: knuth-don lcd: $EUSR/knuth-don http://www-cs-faculty.stanford.edu/~knuth/programs.html page:find pregexp:\.el$ tag3: kruse-peter lcd: $EUSR/kruse-peter http://www.brigadoon.de/peter pregexp:\.el tag3: konerding-david lcd: $EUSR/konerding-david ftp://ls6-ftp.cs.uni-dortmund.de/pub/src/emacs/ regexp:winmgr-mode.el tag3: koppelman-david lcd: $EUSR/latorre-vinicius http://www.ee.lsu.edu/koppel/emacs.html page:find pregexp:\.el tag3: lang-mario lcd: $EUSR/lang-mario http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Mario.*Lang pregexp:\.el$ tag3: lannerback-anders lcd: $EUSR/lannerback-anders http://www.student.nada.kth.se/~f95-ala/emacs/auto-arg-0.3.el new: save:auto-arg.el tag3: latorre-vinicius lcd: $EUSR/latorre-vinicius # Exluded files are now in Emacs #todo: See also ftp://ftp.cpqd.com.br/pub/users/vinicius/ # http://kin.hypermart.net/emacs/emacs/highline.el http://www.cpqd.com.br/~vinicius/emacs/ page:find pregexp:\.el regexp-no:(delim-col|ebnf|lpr|ps-print) x: tag3: lepied-frederick tag3: flepied lcd: $EUSR/lepied-frederick # expand is part of emacs http://www.teaser.fr/~flepied/ page:find pregexp:\.el regexp-no:expand x: tag3: liljenberg-peter lcd: $EUSR/liljenberg-peter #todo: not exist # http://www.lysator.liu.se/~petli/elisp/elint.tar.gz # http://www.lysator.liu.se/~petli/elisp/byte-record.el # http://www.lysator.liu.se/~petli/elisp/closure.el # http://www.lysator.liu.se/~petli/elisp/getcomics.el # http://www.lysator.liu.se/~petli/elisp/lyskom.el # http://www.lysator.liu.se/~petli/elisp/termo.el # http://www.lysator.liu.se/~petli/elisp/webster.el tag3: lindgren-anders lcd: $EUSR/lindgren-anders # fdb was replaced by latest Emacs releases # autorevert.el and follow.el are part of the Emacs # More recent Folding is distributed in Tiny Tools kit http://www.csd.uu.se/~andersl/emacs.shtml save:emacs-lindgren-anders.html http://www.csd.uu.se/~andersl/emacs.shtml page: pregexp:\.el regexp-no:(follow|fdb|folding|autorevert|my-) ftp://ftp.csd.uu.se/pub/users/andersl/emacs/my-init.el save:emacs-lindgren-anders-dotemacs.el tag3: link-thomas lcd: $EUSR/link-thomas http://members.a1.net/t.link/CompEmacsFilesets.html save:emacs-link-thomas.html http://members.a1.net/t.link/CompEmacsFilesets.html pregexp:\.el tag3: linkov-juri # Mail: lcd: $EUSR/linkov-juri http://www.jurta.org/emacs/ee/index.en.html pregexp:\.tar\.gz x: tag3: lopez-emilio lcd: $EUSR/lopez-emilio # http://WWW.Physik.TU-Muenchen.DE/~ecl/emacs.html save:emacs-lopez-emilio.html http://WWW.Physik.TU-Muenchen.DE/~ecl/emacs.html page: pregexp:\.html regexp-no:(gpl|emacs|contact|prosper|ecl\.) text: rename:s/html/el/ tag3: lord-philip # lcd: $EUSR/lord-philip # 2001-12 URL invalid # the labbook is replaced by records.el from different author. http://www.russet.org.uk/emacs.html save: emacs-lord-philip.html http://www.russet.org.uk/emacs.html pregexp:\.el regexp-no:(#|labbook|jde-auto-abbrev|jde-import|jde-stack) tag3: lorentey-karoly print http://lorentey.hu/project/emacs.html tag3: ludlam-eric tag3: zappo # lcd: $EUSR/ludlam-eric # Now at sourceforge http://cedet.sourceforge.net/ # http://choochoo.ultranet.com/~zappo tag3: ludlam-eric-ftp lcd: $EPKG_NET # ftp://ftp.ultranet.com/pub/zappo/etalk-0.11.a10.tar.gz new: # These are in sourceforge project CEDET: speedbar, EDE # ftp://ftp.ultranet.com/pub/zappo/cparse-0.4.tar.gz new: tag3: lundin-daniel # Lispmeralda designer lcd: $EUSR/lundin-daniel http://www.daemoncode.com/emacs/ pregexp:\.el$ regexp-no:xml http://www.daemoncode.com/emacs/ pregexp:dot.emacs save:dot-emacs.el http://www.daemoncode.com/emacs/ pregexp:dot.gnus save:dot-gnus.el # ftp://codefactory.se/pub/people/daniel/elisp/ pregexp:\.el$ # ftp://codefactory.se/pub/people/daniel/elisp/ pregexp:dot.emacs save:dot-emacs.el # ftp://codefactory.se/pub/people/daniel/elisp/ pregexp:dot.gnus save:dot-gnus.el tag3: mackall-matt tag3: quilt lcd: $EUSR/mackall-matt http://www.selenic.com/quilt/quilt.el tag3: manning-carl lcd: $EUSR/manning-carl http://www.ai.mit.edu/people/caroma/tools/ page:find pregexp:java.*20.*\.el tag3: marsden-eric lcd: $EUSR/marsden-eric # todo: Which are Eric's? http://purl.org/net/emarsden/home/downloads pregexp:\.el tag3: matsushita-akihisa lcd: $EUSR/matsushita-akihisa http://www.bookshelf.jp/elc/ cregexp:;;.*(?i)author:.*Akihisa pregexp:\.el$ http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Akihisa pregexp:\.el$ tag3: maclennan-sean lcd: $EUSR/maclennan-sean http://xemacs.seanm.ca/ page: save:emacs-maclennan-sean.html http://xemacs.seanm.ca/lisp/ pregexp:\.el regexp-no:_pkg.el tag3: mengarini-will #todo: connect error lcd: $EUSR/mengarini-will http://www.eskimo.com/~seldon page: save:emacs-mengarini-will.html http://www.eskimo.com/~seldon page:find pregexp:\.el regexp-no:(dotemacs|vi-dot|filemenu) # http://www.eskimo.com/~seldon/dotemacs.el save:dotemacs-mengarini-will-dotemacs.el tag3: mitchell-lawrence lcd: $EUSR/mitchell-lawrence http://www.ph.ed.ac.uk/~p0198183/ pregexp:\.(el|patch)$ regexp-no:css-mode|google|ogg http://purl.org/NET/wence/lisppaste.el tag3: mccrossan-fraser lcd: $EUSR/mccrossan-fraser http://joat.ca/software/smbmode.html pregexp:\.el$ tag3: minar-nelson lcd: $EUSR/minar-nelson # html-helper was maintained by Gian Umberto Lauri http://www.santafe.edu/~nelson/hhm-beta/ page:find pregexp:(html-helper|]adding-tag|contrib-) http://www.santafe.edu/~nelson/hhm-beta/testmodule.el save:hhm-testmodule.el http://www.santafe.edu/~nelson/hhm-beta/tables.el save:hhm-tables.el tag3: minejima-yuji lcd: $EUSR/minejima-yuji http://homepage1.nifty.com/bmonkey/emacs/index-en.html page: save:emacs-minejima-yuji.html http://homepage1.nifty.com/bmonkey/emacs/index-en.html page:find pregexp:\.el$ tag3: milliken-peter tag3: ELSE lcd: $EUSR/milliken-peter # http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*milliken pregexp:\.el$ http://www.zipworld.com.au/~peterm/ page: save:emacs-milliken-peter.html http://www.zipworld.com.au/~peterm/ pregexp:. regexp-no:setnu tag3: monnier-stefan lcd: $EUSR/monnier-stefan print: Note, pcl-cvs.el is now part of Eamcs 21 under name pcvs.el # ftp://rum.cs.yale.edu/pub/monnier/misc/ regexp:\.el$ regexp-no:\d|ish.el|diff-mode|newcomment tag3: moshin-ahmed lcd: $EUSR/moshin-ahmed http://www.cs.albany.edu/~mosh/Elisp/ pregexp:\.el regexp-no:align tag3: muenkel-heiko # Heiko Muenkel lcd: $EUSR/muenkel-heiko ftp://ftp.tnt.uni-hannover.de/pub/editors/xemacs/contrib/balloon-help.el.gz http://www.tnt.uni-hannover.de/~muenkel/software/own/hm--html-menus/overview.html save:hm-overview.html tag5: nguyen-thien-thi print: See package tag 'ttn' tag3: nickelsen-jorgen lcd: $EUSR/nickelsen-jorgen # 'recent-files.el' is part of XEmacs http://home.snafu.de/jn/emacs/ pregexp:\.el regexp-no:recent-files tag3: niksic-hrvoje #todo: no homepage, must know the filenames lcd: $EUSR/niksic-hrvoje # http://jagor.srce.hr/~hniksic # http://fly.srk.fer.hr/~hniksic/emacs/htmlize.el # http://fly.srk.fer.hr/~hniksic/emacs/blackbook.el http://fly.srk.fer.hr/~hniksic/emacs pregexp: \.el tag3: naess-rolf lcd: $EUSR/naess-rolf http://www.pvv.org/~rolfn/ pregexp:\.(el|html)$ tag3: nix print: bbdb-expire is now included in BBDB core. print: See site wide XEmacs configuration at http://www.esperi.demon.co.uk/nix/ tag3: oconnor-edward # Mail: Edward O'Connor lcd: $EUSR/oconnor-edward http://oconnor.cx/elisp/ pregexp:\.el$ tag3: scholz-oliver lcd: $EUSR/scholz-oliver http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*oliver.*scholz pregexp:\.el$ tag3: oconnor-edward lcd: $EUSR/oconnor-edward # print: See also http://oconnor.cx/elisp http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Edward.*Connor pregexp:\.el$ tag3: osterlund-peter lcd: $EUSR/osterlund-peter # p4.el is by vaidheeswarran-rajesh http://home.swipnet.se/~w-15919 page:find pregexp:\.el regexp-no:p4 tag3: owen-gareth lcd: $EUSR/owen-gareth # These may give ISP troubles, but the links are # okay according to Gareth # # http://www.gwowen.freeservers.com/lisp/ page: save:emacs-owen-gareth.html # http://www.gwowen.freeservers.com/lisp/ pregexp:\.el(\.gz)?$ x: # The backup site is here http://www.geocities.com/drgazowen/lisp/ pregexp:\.el(\.gz)?$ x: tag3: padioleau-yoann lcd: $EUSR/padioleau-yoann # 2000-12 no homepage ftp://ftp.cis.ohio-state.edu/pub/emacs-lisp/archive/dircolors.el tag3: pearson-dave lcd: $EUSR/pearson-dave http://www.davep.org/emacs/ pregexp:\.el$ regexp-no:quickurl|5x5|slashdot tag3: pedersen-jesper lcd: $EUSR/pedersen-jesper # The older version: poer-macros-1.0.el, new is power-macros.el http://www.blackie.dk/emacs/ pregexp:\.el regexp-no:macros- tag3: perry-william # lcd: $EUSR/perry-william print: 'w3' ann 'url' projects are at CVS tree savannah.gnu.org print: :pserver:anoncvs@subversions.gnu.org:/cvsroot/w3 co w3 # ftp://ftp.cs.indiana.edu/pub/elisp/cddb/freedb.el # ftp://ftp.cs.indiana.edu/pub/elisp/mwheel.el # ftp://ftp.cs.indiana.edu/pub/elisp/netrek/add-to-dot-emacs # ftp://ftp.cs.indiana.edu/pub/elisp/netrek/meta-server.doc # ftp://ftp.cs.indiana.edu/pub/elisp/netrek/meta-server.el # ftp://ftp.cs.indiana.edu/pub/elisp/netrek/todo save:meta-server.todo # In Emacs ftp://ftp.cs.indiana.edu/pub/elisp/socks/socks.el tag3: pierce-tom lcd: $EUSR/pierce-tom http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Tom.*Pierce pregexp:\.el$ tag3: pinard-francois lcd: $EUSR/pinard-francois http://www.iro.umontreal.ca/contrib/libit/dist/misc/ pregexp:rebox.el tag3: ponce-david lcd: $EUSR/ponce-david print: Use CVS, files are at Sourceforge project 'emhacks' # senator.el is distributed with CVS package CEDET # Recentf is part of Emacs 21.1 http://www.dponce.com/more-elisp.html page: save:emacs-ponce-david.html # http://perso.wanadoo.fr/david.ponce/more-elisp.html page:find pregexp:\.(el|zip) regexp-no:senator|recentf|swbuff|tree-widget xx: lcd: $EPKG_NET http://perso.wanadoo.fr/david.ponce/ page:find pregexp:\.zip xx: # http://perso.wanadoo.fr/david.ponce/ pregexp:jjar x: tag3: predescu-ovidiu lcd: $EUSR/predescu-ovidiu # Gecities does not allow uploding .el files, so all files # end to .txt http://www.geocities.com/SiliconValley/Monitor/7464/emacs/index.html page:find pregexp:\.txt rename:s/\.txt/.el/ tag3: ramakrishnan lcd: $EUSR/ramakrishnan # See also http://www.advogato.org/person/rkrishnan/ http://www.hackgnu.org/hacks.html pregexp:advogato.el tag3: ravel-bruce lcd: $EUSR/ravel-bruce http://leonardo.phys.washington.edu/~ravel/software/gnuplot-mode/Welcome.html page: pregexp:dotemacs rename:s/.*/emacs-ravel-bruce-dotemacs.html http://leonardo.phys.washington.edu/~ravel/software/compjuga/Welcome.html save:emacs-ravel-bruce-compjuga.html http://leonardo.phys.washington.edu/~ravel/software/compjuga/Welcome.html page: pregexp:\.el tag3: reichor-stefan tag3: svn tag3: subversion lcd: $EUSR/reichor-stefan http://xsteve.nit.at/prg/emacs pregexp:\.el$ regexp-no:emacs-functions tag3: riel-joe lcd: $EUSR/riel-joe http://k-online.com/~joer/mws/mws.htm pregexp:\.el$ http://k-online.com/~joer/maplev/maplev.htm pregexp:maplev http://k-online.com/~joer/maplev/maplev.htm pregexp:INSTAL save:maplev-install.txt tag3: riepel-rob print: sql-mode.el and tpu-edt.el are part of Emacs tag3: rodgers-kevin tag3: igrep lcd: $EUSR/rodgers-kevin # 2001-01 No homepage where list of his files can be found. # Use Google search http://groups.google.com/groups?group=gnu.emacs.sources http://www.emacswiki.org/cgi-bin/emacs/download/igrep.el http://www.emacswiki.org/cgi-bin/wiki/download/ifind.el tag3: rush-david lcd: $EUSR/rush-david http://people.netscape.com/drush/emacs/ pregexp:\.el$ regexp-no:surl # This is more later version http://people.netscape.com/drush/emacs/ssl-hacks.el tag3: sasser-dewey # lcd: $EUSR/sasser-dewey tag3: schaeffer-timothy lcd: $EUSR/schaeffer-timothy # Tab-size independent c-mode indentation for Emacs http://apache.bsilabs.com/~tim/cc-mode/description.html page: save:emacs-cc-mode-indent.html http://apache.bsilabs.com/~tim/cc-mode/description.html pregexp:cc-mode tag3: schroeder-alex tag3: kensanata lcd: $EUSR/schroeder-alex # http://hammer.prohosting.com/~gumbart # http://www.geocities.com/TimesSquare/6120/ # http://www.geocities.com/kensanata/emacs.html save:emacs-schroeder-alex.html # http://www.geocities.com/kensanata/emacs-colors.html # http://www.geocities.com/kensanata/emacs-colors.html page:find pregexp:\.el$ rename:s/el.*/el/ regexp-no:ansi-color # http://www.geocities.com/kensanata/emacs-games.html # http://www.geocities.com/kensanata/emacs-games.html page:find pregexp:\.el$ rename:s/el.*/el/ # http://www.geocities.com/kensanata/bbdb-funcs.html # # print: Getting file from EmacsWiki, this may not be accurate guess. # print: Searching 'author:.*Schroeder' from every file # http://www.emacswiki.org/elisp/index.html cregexp:(?i);;.*author:.*Schroeder pregexp:\.el$ regexp-no:ansi-color|^sql.el tag3: schwenke-martin lcd: $EUSR/schwenke-martin http://meltin.net/hacks/emacs/ page: save: emacs-schwenke-martin.html http://meltin.net/hacks/emacs/ pregexp:\.el$ regexp-no:todo # Not good, mms.tar.gz includes lot of packages that # are already shipped by Emacs or CEDET project etc. # http://meltin.net/hacks/emacs/ pregexp:(\.tgz|tar.gz) regexp-no:lisp rename:s/tgz/tar.gz/ x: tag3: sebold-charles lcd: $EUSR/sebold-charles http://messengers-of-messiah.org/~csebold/emacs/ page: save:emacs-sebold-charles.html http://messengers-of-messiah.org/~csebold/emacs/dot_emacs.html save:sebold-charles-dotemacs.html http://messengers-of-messiah.org/~csebold/emacs/dot.gnus save:sebold-charles-dotgnus.html http://messengers-of-messiah.org/~csebold/emacs/answers.phtml save:emacs-sebold-charles-answers.html http://messengers-of-messiah.org/~csebold/emacs/ page:find pregexp:\.el regexp-no:(vfaq|backup|mutt) tag3: seiichi-namba lcd: $EUSR/seiichi-namba http://www.asahi-net.or.jp/~pi9s-nnb/myelisp.html save:emacs-seiichi-namba.html http://www.asahi-net.or.jp/~pi9s-nnb/dired-dd-home.html tag3: sepulveda-rafael lcd: $EUSR/sepulveda-rafael http://www.gnulinux.org.mx/~drs/emacs pregexp:\.el$ tag3: serrano-manuel lcd: $EUSR/serrano-manuel # Part of emacs # http://kaolin.unice.fr/~serrano/emacs/flyspell save:flyspell.el http://kaolin.unice.fr/~serrano/emacs/emacs.html save:emacs-serrano-manuel.html http://kaolin.unice.fr/~serrano/emacs/emacs.html page: pregexp:xinfo rename:s/$/.el/ http://kaolin.unice.fr/~serrano/emacs/emacs.html page: pregexp:case rename:s/$/.el/ tag3: sharman-richard lcd: $EUSR/sharman-richard # Change-mode.el is part of the emacs # ls-mode.el is superceded by dired-x.el / dired-virtual-mode http://www.pobox.com/~rsharman/emacs page: pregexp:\.el$ regexp-no:ls-|change-|sh-|shin|^hilit tag3: shinn-alex lcd: $EUSR/shinn-alex #todo: Lars Garsholm also has made css-mode.el http://synthcode.com/emacs pregexp:\.el regexp-no:gnus|emacs|vm|css|battery|lynx|todo http://synthcode.com/emacs pregexp:\.emacs$ save:dot-emacs.el http://synthcode.com/emacs pregexp:gnus.el$ save:dot-gnus.el http://synthcode.com/emacs pregexp:vm.el$ save:dot-vm.el tag3: shulman-michael # mmm-mode is in "packages" lcd: $EUSR/shulman-michael # 2002-07-11 invalid # http://kurukshetra.cjb.net/elisp/ page: save:emacs-shulman-michael.html # http://kurukshetra.cjb.net/elisp/ page:find pregexp:\.el$ regexp-no:folding|fortune tag3: sjodin-mikael lcd: $EUSR/sjodin-mikael http://www.docs.uu.se/~mic/emacs.shtml save:emacs-sjodin-mikael.html http://www.docs.uu.se/~mic/emacs.shtml page:find pregexp:\D\.el$ x: tag3: sprenger-karel lcd: $EUSR/sprenger-karel http://paddington.ic.uva.nl/public/sql-modes.zip xx: tag3: staats-michael lcd: $EUSR/staats-michael # pc-select is in Emacs ftp://ftp.thp.Uni-Duisburg.DE/pub/source/elisp/ regexp:. regexp-no:pc-select x: tag3: staflin-lennart print: See tag 'psgml' http://www.emacswiki.org/elisp/index.html pregexp:corba.el$ # http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*Staflin pregexp:\.el$ tag3: steverud-jonas lcd: $EUSR/steverud-jonas http://www.dtek.chalmers.se/~d4jonas/Emacs page:find pregexp:\.el.gz$ regexp-no:gnus.el http://www.dtek.chalmers.se/~d4jonas/Emacs/gnus.el.gz save:dot-gnus.el tag3: storm-kim lcd: $EUSR/storm-kim http://www.cua.dk/emacs.html page: save:emacs-storm-kim.html http://www.cua.dk/emacs.html pregexp:\.el$|(cua|ido)\.html$ tag3: sylvester-olaf lcd: $EUSR/sylvester-olaf http://www.geekware.de/software/emacs/ page: save:emacs-sylvester-olaf.html http://www.geekware.de/software/emacs/ page:find pregexp:\.el regexp-no:bs\.el tag3: teixeira-roberto # lcd: $EUSR/teixeira-roberto tag3: theberge-jean # lcd: $EUSR/theberge-jean print: See Sourceforge CVS project emacsmp3player # homepage 2003-06 http://www.emacswiki.org/cgi-bin/emacs-fr.pl?JeanPhilippeTheberge # 2003-06-10 # - bofh.el and taelm.el used servers that are now down so are now useless # - hachette.el is broken because the hachette website changed their html layout. # The package will be placed on the emacswiki # - ell.el is now at http://www.anc.ed.ac.uk/~stephen/emacs/ell.el # and maintained by Stephen Eglen # http://godzilla.act.oda.fr/ # Strange directory structure # /ell.el/ell.el # /thumbs/thumbs.el # /taelm/taelm.el # /thumbs/thumbs.el # # ([a-z]+(?:\.el)?)/\1(?:\.el)? # | | # ell.el ell.el => ell.el/ell.el # taelm taelm => taelm/taelm.el # # # http://aroy.net/emacslisp.org/mypackages/ pregexp:bofh|mp3player-|([a-z]+(?:\.el)?)/\1(?:\.el)? regexp-no:emacs.html http://www.emacswiki.org/elisp/index.html pregexp:(thumbs|hachette).el$ tag3: triggs-mark lcd: $EUSR/triggs-mark http://members.iinet.net.au/~mtriggs/code.html pregexp:dot.emacs save:tiggs-mark-dot-emacs.el http://members.iinet.net.au/~mtriggs/code.html pregexp:\.el$ tag3: tomohiko-morioka # lcd: $EUSR/tomohiko-morioka # FLIM, SEMI, APEL tag3: tse-stephen # 2001-10 This was old eicq.el package, do not use. See sourceforge. # lcd: $EUSR/tse-stephen # # http://www.sfu.ca/~stephent/emacs/ pregexp:\.el print: see sourceforge project 'eicq' tag3: tziperman-eli tag3: rmail-spam-filter lcd: $EUSR/tziperman-eli http://www.weizmann.ac.il/home/eli/Downloads/rmail-spam-filter/ page: save:rmail-spam-filter.html http://www.weizmann.ac.il/home/eli/Downloads/rmail-spam-filter/ pregexp:\.el tag3: vaidheeswarran-rajesh # lcd: $EUSR/vaidheeswarran-rajesh print: P4 package has moved to p4el.sourceforge.net. Use CVS # http://www.dsmit.com/lisp/ page:find pregexp:\.el regexp-no:vc|whitespace tag3: voelker-geoff print: Voelker no longer maintain NT Emacs print: See http://www-cse.ucsd.edu/users/voelker/ tag3: volker-franz lcd: $EUSR/volker-franz http://www.kyb.tuebingen.mpg.de/bu/people/vf/xemacs-packages.html save:xemacs-volker-franz.html http://www.kyb.tuebingen.mpg.de/bu/people/vf/xemacs-packages.html pregexp:\.el tag3: vroonhof-jan # lcd: $EUSR/vroonhof-jan # http://www.math.ethz.ch/~vroonhof tag3: waider-ronan lcd: $EUSR/waider-ronan http://www.waider.ie/hacks/emacs/mud.html pregexp:\.el$ http://www.waider.ie/hacks/emacs/ pregexp:\.el$ tag3: waldman-charles lcd: $EUSR/waldman-charles http://home.fnal.gov/~cgw/xemacs/ pregexp:dot-emacs.html tag3: walters-colin lcd: $EUSR/walters-colin # ibuffer is now distributed in Emacs 21.2, do not try to load # from here, because the code will not work in other Emacs versions. # http://cvs.verbum.org/cgi-bin/viewcvs.cgi/~checkout~/ibuffer/ibuffer.el?rev=HEAD # http://cvs.verbum.org/cgi-bin/viewcvs.cgi/~checkout~/ibuffer/ibuf-macs.el?rev=HEAD http://cvs.verbum.org/cgi-bin/viewcvs.cgi/~checkout~/browse-kill-ring/browse-kill-ring.el?rev=HEAD http://cvs.verbum.org/cgi-bin/viewcvs.cgi/~checkout~/ascii-display/ascii-display.el?rev=HEAD # This location is obsolete # http://www.cis.ohio-state.edu/~walters pregexp:\.el tag3: warsaw-barry lcd: $EUSR/warsaw-barry # In Emacs already: elp.el reporter.el supercite.tar.gz http://barry.warsaw.us/elisp/ page:find pregexp:\.el$ regexp-no:(elp|reporter|supercite|python|winring) # Old information # http://www.python.org/emacs/python-mode/ page:find pregexp:\.el$ # http://www.python.org/emacs/winring/ page:find pregexp:\.el$ tag3: wedler-cristoph lcd: $EUSR/wedler-cristoph http://www.fmi.uni-passau.de/~wedler/session/ page: save:emacs-wedler-cristoph-session.html http://www.fmi.uni-passau.de/~wedler/session/ page:find pregexp:\.el.gz x: tag3: wiborg-espen lcd: $EUSR/wiborg-espen http://www.empolis.no/~espenhw/emacs/ pregexp:\.el$ tag3: widhopf-fenk-robert lcd: $EUSR/widhopf-fenk-robert http://www.robf.de/Hacking/elisp/ pregexp:\.el$ tag3: wiegley-john # (Esehell + pcomplete), scheduler are in PACKETS section lcd: $EUSR/wiegley-john print: homepage (2003-08) http://emacswiki.org/johnw/emacs.html http://emacswiki.org/johnw/Emacs pregexp:\.el regexp-no:\.(gnus|emacs|align|httpd|timeclock|planner|remember) http://emacswiki.org/johnw/emacs.html pregexp:\.emacs save:dot-emacs.el http://emacswiki.org/johnw/emacs.html pregexp:\.gnus save:dot-gnus.el tag3: wright-francis tag3: w32-symlinks lcd: $EUSR/wright-francis # In Emacs ls-lisp.el http://centaur.maths.qmw.ac.uk/Emacs/index.html page:find pregexp:\.el$ regexp-no:ls-lisp|woman|remote # http://centaur.maths.qmw.ac.uk/Emacs/WoMan/index.html page:find pregexp:\.el$ tag3: yamaoka-katsumi lcd: $EUSR/yamaoka-katsumi ftp://ftp.jpl.org/pub/elisp/ regexp:\.el\.gz$ regexp-no:remote x: lcd: $EPKG_NET ftp://ftp.jpl.org/pub/elisp/ regexp:(checkgroups|select|super|x-pgp).*tar.gz x: tag3: yankowski-fred lcd: $EUSR/yankowski-fred # http://www.ontosys.com/reports/PHP.html page: save:emacs-yankowski-fred-php.html http://www.ontosys.com/reports/PHP.html pregexp:(\.el|doc)$ tag3: yavner-jonathan lcd: $EUSR/yavner-jonathan http://mywebpages.comcast.net/jyavner/unsafep/ pregexp:\.el$ http://mywebpages.comcast.net/jyavner/ pregexp:\.el$ tag3: youngs-steve # Package has been moved to sourceforge project eicq, please # use cvs tag3: yuji-minejima lcd: $EUSR/yuji-minejima http://homepage1.nifty.com/bmonkey/lisp/index.html pregexp:\.el$ tag3: cperl # lcd: $EUSR/zakharevich-ilya # http://www.perl.com/CPAN-local/misc/emacs/cperl-mode/ pregexp:cperl-mode.el$ print: Please see tag `zakharevich-ilya' tag3: zajcev-evgeny lcd: $EUSR/zajcev-evgeny http://www.emacswiki.org/elisp/index.html cregexp:;;.*(?i)author:.*zajcev pregexp:\.el$ tag3: zakharevich-ilya lcd: $EUSR/zakharevich-ilya # in Emacs: options.el tmm.el cperl-mode.el # todo: Should look from CPAN? print: cperl-mode.el is included in Emacs. Not fetched. ftp://ftp.math.ohio-state.edu/pub/users/ilya/emacs/ regexp:\.el regexp-no:(tmm|options|cperl) tag3: zhu-shenghuo lcd: $EUSR/zhu-shenghuo http://www.cs.rochester.edu/~zsh/hacks/emacs.html page: save:emacs-zhu-shenghuo.html http://www.cs.rochester.edu/~zsh/hacks/emacs.html page:find pregexp:\.el$ regexp-no:bbdb-edit tag3: zundel-detlev lcd: $EUSR/zundel-detlev # rpm.el conflicts with # cvs-packages/sourceforge/cedet/speedbar/rpm.el http://www.akk.org/~dzu/download/ pregexp:\.el regexp-no:rpm.el http://www.akk.org/~dzu/download/ pregexp:rpm\.el save:rpm2.el tag3: zimmermann-reto # lcd: $EUSR/zimmermann-reto # http://opensource.ethz.ch/emacs/dcsh-mode.html pregexp:\.gz$ x: # http://opensource.ethz.ch/emacs/vera-mode.html pregexp:\.gz$ x: # http://opensource.ethz.ch/emacs/vhdl-mode.html pregexp:\d\.tar.gz file:vhdl-mode-3.32.1.tar.gz new: x: # http://www.verilog.com/verilog-mode.html pregepx:\.gz$ x: tag3: 000-misc lcd: $EUSR/000-misc # ftp://ls6-ftp.cs.uni-dortmund.de/pub/src/emacs/ regexp:(winmgr|lib-complete).*el$ ftp://ftp.mathworks.com/pub/contrib/emacs_add_ons/INDEX.emacs_add_ons save:matlab-files.txt ftp://ftp.mathworks.com/pub/contrib/emacs_add_ons/readme.txt save:matlab-readme.txt ftp://ftp.mathworks.com/pub/contrib/emacs_add_ons/ regexp:\.el$ tag2: elisp-lcd # ........................................................... &lcd: ... # the Ancient Lisp Code Directory tag3: lcd-ohio lcd: $ELCD # http://www.cs.indiana.edu/LCD/cover.html?make-regexp save:make-regexp.el http://www.cs.indiana.edu/LCD/cover.html?timecard save:timecard-mode.el # ######################################################### &other ### tag2: elisp-doc lcd: $EDOC ftp://ftp.cs.columbia.edu/archives/gnu/prep/emacs/emacs-lisp-intro-2.04.tar.gz new: regexp-no:README x: tag2: elisp-other lcd: $EOTHER tag3: elisp-other # WWW ftp://ftp.inria.fr/gnu/emacs-lisp/rect-mark-1.4.el.gz new: http://members.verizon.net/~vze24fr2/EmacsClearCase/clearcase.el http://prdownloads.sourceforge.net/table/table-1.5.54.el.gz new: x: http://prdownloads.sourceforge.net/sourceforge/table/table-1.5.54.el.gz new: x: save:table.el.gz http://prdownloads.sourceforge.net/sourceforge/jtags/jtags.el http://prdownloads.sourceforge.net/sourceforge/jdc-el/jdc.el tag3: ntemacs-contrib lcd: $EWIN32 # NTEmacs, gnuserv, epop3 # http://www.gnu.org/software/emacs/windows/big.html pregexp:(gnuserv|epop).*zip$ tag3: sql lcd: $ECOMMON/programming/sql http://paddington.ic.uva.nl/public/sql-modes.zip tag3: jde-other lcd: $ECOMMON/programming/java/jde/contrib http://www.sunsite.auc.dk/jde/contrib/ pregexp:\.el tag3: idl lcd: $ELANG/idl http://cc-mode.sourceforge.net/contrib.php pregexp:\.el noregexp:cc-mode tag3: lua lcd: $ELANG/lua http://luaforge.net/frs/?group_id=185&release_id=869 pregexp:\.tar\.gz file:lua-mode-20061122.tar.gz new: x: # End of file pwget-2016.1019+git75c6e3e/doc/index.html000066400000000000000000000245131300167571300175160ustar00rootroot00000000000000 Perl Webget Project
Perl Webget Project

Links and download

Note: that the most up to date version is in version control. See development page instructions how get latest source code. Other pages

Description

The pwget is a tool that can help to keep track of programs and package releases around the Internet. The major differences to well known wget utility are as follows:
  • Configuration file(s). Program supports file downloads as with wget by specifying the URLs in the command line. But the advantage is to automate the periodic downloads from your favourite locations by adding entries to configuration file(s). You refer to named tag in order to download new files from developer's site.

    The configuration file can use unlimited number of hierarchy levels: like a) "linux" b) sub-level "linux-gnome". If you request tag "linux", everything under that hierarchy level is downloaded.

  • Heuristics. As developers release new packages, the package name is subject to changes. The de facto packaging standard is package-version.suffix, like in package-12.1.tar.gz. You do not need to know the exact URL, because program is able to determine latest version. If you had instructed to download emacs-19.28.tar.gz and supplied option option --new, the newest kit would have been found. Record the initial URL to a configuration file, instruct to monitor newer versions, and you can start keeping your favourite programs up to date

  • Content check. If a page contains many files (like directory listing), it is possible to download only files whose content match a regular expression. For example, it is possible to request author's name in the file with option --Regexp-content='(?i)Author:\s*Mr.\s*Foo'

  • Extract. After download, program can unpack the downloaded package in a civilized manner. If archive includes a root directory, then everything is ok, if not, root directory is created upon unpack. This is desireable, because some archives may not include proper package-version/ root directory.

All in all, pwget is designed to be your companion to keep in touch with the changing package releases in the software release world. The aim was to develop an utility which can go to an URL, sniff around a bit to make the decision what to download. Debian has developed .deb install archives and Red Hat a .rpm format respectively, but there is a need to handle .tar.gz, .tgz, .tar.bz2 and other file downloads as well. pwget may help with tracking URLs easier.

Test run - creating Emacs site-lisp

There is an example configuration which contains URLs of Emacs Lisp packages. Emacs is a development environment that can bextended with additional packages (an example). The example configuration contains presets for /usr/share/*/site-lisp/net hierachy (the install location is configurable). It contains instructions how to download many custom Emacs packages. The results, what generated directory structure would be, can be seen at log file.

    perl pwget [--dry-run] --verbose --Tag elisp
    

Contact

There is no mailing list for the project. See Development page how to contact maintainer and submit feature requests and bug reports.


GNU GPL All files in this project are licensed under GNU GPL. Savannah Logo This project, as well as many other projects is hosted by Savannah.
W3C HTML valid W3C CSS valid W3 validated.
pwget-2016.1019+git75c6e3e/doc/log/000077500000000000000000000000001300167571300162755ustar00rootroot00000000000000pwget-2016.1019+git75c6e3e/doc/log/emacs-directory-list.txt000066400000000000000000000400721300167571300231040ustar00rootroot00000000000000Generated directories after running tag "elisp". The directory structure is typically stored under /usr/share/*/site-lisp/net (you can change variable ROOT, which points in default installation dir /tmp) The command used was: perl -S pwget --verbose --Create-paths --Tag elisp . +--cvs-packages/ | +--gnus/ | | +--contrib/ | | +--etc/ | | | +--gnus/ | | | +--smilies/ | | +--lisp/ | | +--texi/ | | | +--ps/ | | | +--etc/ | | | +--herds/ | | | +--misc/ | | | +--picons/ | | | +--screen/ | | | +--smilies/ | | | +--xface/ | +--bbdb/ | | +--bits/ | | | +--bbdb-filters/ | | | | +--doc/ | | | | +--formatted/ | | +--html/ | | | +--images/ | | | +--patches/ | | +--lisp/ | | +--misc/ | | +--tex/ | | +--texinfo/ | | +--utils/ | +--mailcrypt/ | +--gnu/ | | +--url/ | | | +--lisp/ | | | +--texi/ | +--cc-mode/ | | +--admin/ | | +--tests/ | +--tnt/ | | +--sounds/ | +--emacro/ | | +--contrib/ | | +--doc/ | | +--i18n/ | | +--packages/ | | +--programmer/ | +--devel/ | +--jess-mode/ | | +--jess-mode/ | +--ILISP/ | | +--docs/ | | +--extra/ | | +--pictures/ | | +--debian/ | +--lookup/ | | +--lisp/ | | +--texi/ | | +--web/ | | | +--images/ | | | +--info/ | | | +--ja/ | | | | +--howto/ | | | | +--images/ | | | | +--info/ | | | | +--lists/ | | +--fonts/ | +--liece/ | | +--contrib/ | | +--dcc/ | | +--doc/ | | +--etc/ | | | +--icons/ | | | +--po/ | | | +--styles/ | | +--lisp/ | +--cedet/ | | +--ede/ | | +--eieio/ | | +--quickpeek/ | | +--semantic/ | | | +--tests/ | | | +--wisent/ | | +--speedbar/ | | +--cogre/ | | +--www/ | +--eicq/ | | +--etc/ | | | +--eicq/ | +--ecb/ | | +--html/ | | +--ecb2/ | +--w3/ | | +--contrib/ | | +--etc/ | | | +--w3/ | | | | +--pixmaps/ | | +--lisp/ | | +--tests/ | | +--texi/ | +--jde/ | | +--doc/ | | | +--jdebug/ | | | | +--ug/ | | | | +--images/ | | | +--src/ | | | | +--css/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--styles/ | | | | | +--html/ | | | | +--ug/ | | | | +--images/ | | | +--tli_rbl/ | | | | +--au/ | | | | +--img/ | | | | +--txt/ | | | +--ug/ | | | +--images/ | | +--java/ | | | +--bsh-commands/ | | | | +--bsh/ | | | | +--commands/ | | | +--src/ | | | +--jde/ | | | +--debugger/ | | | | +--command/ | | | | +--expr/ | | | | +--spec/ | | | +--parser/ | | | | +--syntaxtree/ | | | | +--visitor/ | | | +--util/ | | | +--wizards/ | | +--lisp/ | +--xslt-process/ | | +--bin/ | | +--doc/ | | +--etc/ | | +--java/ | | | +--xslt/ | | | +--debugger/ | | | +--cmdline/ | | | +--saxon/ | | | +--xalan/ | | +--lisp/ | | +--samples/ | | +--tests/ | | +--fo/ | +--xslide/ | +--tramp/ | | +--lisp/ | | +--test/ | | +--texi/ | | +--contrib/ | | +--tramp2/ | +--jtags/ | | +--lisp/ | | +--test/ | | +--pkg/ | | +--hupp/ | +--pcl-cvs/ | | +--releases/ | +--erc/ | | +--debian/ | | | +--maint/ | | | +--scripts/ | +--records/ | +--distributions/ | +--mailcrypt/ | +--proposals/ | +--nextgen/ +--packages/ | +--tmp-eicq-0.2.5/ | | +--etc/ | +--tmp-eicq-0.2.2/ | +--crapstation-0.0.19/ | | +--Data/ | | +--video/ | | +--font/ | | +--gui/ | | +--pixeler/ | | +--system/ | | +--main/ | +--dired-dd-0.9.1.19/ | | +--spinout/ | | +--non-dired-drop/ | +--template/ | | +--templates/ | | +--lisp/ | | +--examples/ | +--eudc-1.30b/ | +--X/ | +--gnyognyo2.1.tar/ | +--artist-1.2.3/ | +--dcsh-mode-1.2/ | +--eshell/ | | +--tmp-pcomplete-1.1.6/ | | +--eshell-2.4.2/ | | +--pcomplete-1.1.7/ | | +--tmp-eshell-2.4.1/ | +--records-1.4.8/ | +--notes-mode-1.16/ | | +--HTML/ | +--tmp-irchat-20001203/ | +--elib-1.0/ | +--vera-mode-2.6/ | +--index-2000.1218/ | +--mirror.tar.gz-2001.0104/ | +--lout-mode/ | +--tmp-jde-beta-2001.0303/ | | +--jde-2.2.7beta2/ | | +--lisp/ | | +--doc/ | | | +--src/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--css/ | | | | +--styles/ | | | | +--html/ | | | +--html/ | | | | +--css/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--jdb-ug/ | | | | +--bsh-ug/ | | | | | +--images/ | | | | +--jdebug-ug/ | | | | +--images/ | | | +--tli_rbl/ | | | +--au/ | | | +--img/ | | | +--txt/ | | +--java/ | | +--classes/ | | | +--jde/ | | | +--wizards/ | | | +--parser/ | | | +--util/ | | +--lib/ | | +--src/ | | | +--jde/ | | | +--wizards/ | | | +--debugger/ | | | | +--command/ | | | | +--expr/ | | | | +--spec/ | | | +--parser/ | | | | +--syntaxtree/ | | | | +--visitor/ | | | +--util/ | | +--bsh-commands/ | | +--bsh/ | | +--commands/ | +--mmm-mode-0.4.7/ | +--sml-mode-3.9.5/ | +--x-symbol-beta/ | | +--etc/ | | +--fonts/ | | +--man/ | | +--www/ | | +--lisp/ | | +--origfonts/ | | +--info/ | | +--staging/ | +--ep-2.0/ | +--basic4e/ | | +--readmes/ | +--marche-1.12/ | +--gnuplot-mode/ | | +--Win9x/ | +--hm--html-menus/ | | +--doc/ | +--view-process-mode/ | +--vm-folder/ | +--auto-insert-tkld-1.23/ | | +--insert/ | +--mmm-mode-0.4.6/ | +--tmp-ttn-pers-elisp-1.36/ | | +--info/ | | +--lisp/ | | | +--bofh/ | | | +--import/ | | | +--low-stress/ | | | +--editing/ | | | +--core/ | | | +--term/ | | | +--prog-env/ | | | +--mail-n-news/ | | | +--diversions/ | | +--doc/ | +--c-mode-addons-0.1/ | +--esheet/ | +--html-helper-mode.tar-2001.0303/ | +--tmp-jde-2.2.6beta7/ | | +--lisp/ | | +--doc/ | | | +--ug/ | | | | +--images/ | | | +--src/ | | | | +--ug/ | | | | +--images/ | | | +--jdebug/ | | | | +--ug/ | | | | +--images/ | | | +--tli_rbl/ | | | +--au/ | | | +--img/ | | | +--txt/ | | +--java/ | | +--classes/ | | | +--jde/ | | | +--wizards/ | | | +--parser/ | | | +--util/ | | +--lib/ | | +--doc/ | | | +--jde/ | | | +--debugger/ | | | +--command/ | | | +--expr/ | | | +--spec/ | | +--src/ | | | +--jde/ | | | +--wizards/ | | | +--debugger/ | | | | +--expr/ | | | | +--spec/ | | | | +--interpret/ | | | | | +--syntaxtree/ | | | | | +--visitor/ | | | | +--command/ | | | +--parser/ | | | | +--syntaxtree/ | | | | +--visitor/ | | | +--util/ | | +--bsh-commands/ | | +--bsh/ | | +--commands/ | +--psgml-1.2.2/ | +--xslt-process-1.2.1/ | | +--doc/ | | +--java/ | | | +--xslt/ | | +--lisp/ | +--color-mate-10.3/ | | +--kanakan-cursor/ | | | +--org-doc/ | | +--contrib/ | | +--theme/ | +--tmp-jde-2.2.8beta4/ | | +--lisp/ | | +--doc/ | | | +--src/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--css/ | | | | +--styles/ | | | | +--html/ | | | +--html/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--jdebug-ug/ | | | | | +--images/ | | | | +--bsh-ug/ | | | | | +--images/ | | | | +--css/ | | | +--tli_rbl/ | | | +--au/ | | | +--img/ | | | +--txt/ | | +--java/ | | +--src/ | | | +--jde/ | | | +--debugger/ | | | | +--command/ | | | | +--expr/ | | | | +--spec/ | | | +--util/ | | | +--wizards/ | | +--lib/ | | +--classes/ | | +--bsh-commands/ | | +--bsh/ | | +--commands/ | +--bbdb-expire-1.4/ | +--tmp-irchat-20010608/ | +--color-mate-10.5/ | | +--contrib/ | | +--theme/ | | +--kanakan-cursor/ | | | +--org-doc/ | +--tmp-jde-2.2.8/ | | +--lisp/ | | +--doc/ | | +--src/ | | | +--jde-ug/ | | | | +--images/ | | | +--css/ | | | +--styles/ | | | +--html/ | | +--html/ | | +--jde-ug/ | +--bugtrack-1.3.1/ | +--ttn-pers-elisp-1.37/ | | +--info/ | | +--lisp/ | | | +--bofh/ | | | +--import/ | | | +--low-stress/ | | | +--editing/ | | | +--core/ | | | +--term/ | | | +--prog-env/ | | | +--mail-n-news/ | | | +--diversions/ | | +--doc/ | +--jde-beta--2.2.9beta6-2001.1127/ | | +--jde-2.2.9beta6/ | | +--lisp/ | | +--doc/ | | | +--src/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--css/ | | | | +--styles/ | | | | +--html/ | | | +--html/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--jdebug-ug/ | | | | | +--images/ | | | | +--bsh-ug/ | | | | | +--images/ | | | | +--css/ | | | +--tli_rbl/ | | | +--au/ | | | +--img/ | | | +--txt/ | | +--java/ | | +--src/ | | | +--jde/ | | | +--debugger/ | | | | +--command/ | | | | +--expr/ | | | | +--spec/ | | | +--util/ | | | +--wizards/ | | +--lib/ | | +--classes/ | | +--bsh-commands/ | | +--bsh/ | | +--commands/ | +--mkhtml-1.7a/ | +--tmp-mkhtml-1.7/ | +--id3el-0.05/ | +--artist-1.2.4/ | +--dired-dd-0.9.1.23/ | | +--spinout/ | | +--non-dired-drop/ | +--html-helper-mode.tar-2001.1226/ | +--idlwave.info.tar-2001.1226/ | +--idlwave-4.9/ | +--irchat-20011104-2/ | +--jde-beta-2001.1226/ | | +--jde-2.2.9beta7/ | | +--lisp/ | | +--doc/ | | | +--src/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--css/ | | | | +--styles/ | | | | +--html/ | | | +--html/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--jdebug-ug/ | | | | | +--images/ | | | | +--bsh-ug/ | | | | | +--images/ | | | | +--css/ | | | +--tli_rbl/ | | | +--au/ | | | +--img/ | | | +--txt/ | | +--java/ | | +--src/ | | | +--jde/ | | | +--debugger/ | | | | +--command/ | | | | +--expr/ | | | | +--spec/ | | | +--util/ | | | +--wizards/ | | +--lib/ | | +--classes/ | | +--bsh-commands/ | | +--bsh/ | | +--commands/ | +--mirror.tar-2001.1226/ | +--ssl-hacks-v2.tar/ | +--tenson-20010411/ | +--notes-mode-1.21/ | | +--HTML/ | +--mkhtml-1.7/ | +--ttn-pers-elisp-1.38/ | | +--doc/ | | +--info/ | | +--lisp/ | | | +--bofh/ | | | +--core/ | | | +--diversions/ | | | +--editing/ | | | +--import/ | | | +--low-stress/ | | | +--mail-n-news/ | | | +--prog-env/ | | | +--term/ | +--basic4d.tar/ | | +--readmes/ | +--html-helper-mode.tar-2001.1230/ | +--idlwave.info.tar-2001.1230/ | +--jde-beta-2001.1230/ | | +--jde-2.2.9beta7/ | | +--lisp/ | | +--doc/ | | | +--src/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--css/ | | | | +--styles/ | | | | +--html/ | | | +--html/ | | | | +--jde-ug/ | | | | | +--images/ | | | | +--jdebug-ug/ | | | | | +--images/ | | | | +--bsh-ug/ | | | | | +--images/ | | | | +--css/ | | | +--tli_rbl/ | | | +--au/ | | | +--img/ | | | +--txt/ | | +--java/ | | +--src/ | | | +--jde/ | | | +--debugger/ | | | | +--command/ | | | | +--expr/ | | | | +--spec/ | | | +--util/ | | | +--wizards/ | | +--lib/ | | +--classes/ | | +--bsh-commands/ | | +--bsh/ | | +--commands/ | +--mirror.tar-2001.1230/ | +--rcpp-mode/ +--users/ +--riebel-rob/ +--sjodin-mikael/ | +--RCS/ +--owen-gareth/ +--heideman-john/ +--howe-dennis/ +--ludlam-eric/ +--walters-colin/ | +--RCS/ +--fouts-martin/ +--theberge-jean/ +--lauri-gian/ +--niksic-hrvoje/ +--warsaw-barry/ +--blaak-ray/ +--000-misc/ +--johnson-bryan/ +--hirose-yuuji/ +--moshin-ahmed/ +--zundel-detlev/ +--marsden-eric/ +--bakhash-david/ +--jones-kyle/ +--dirson-yann/ +--schwenke-martin/ | +--desire/ | | +--appt/ | | +--bbdb/ | | +--browse-url/ | | +--calendar/ | | +--diary/ | | +--gnus/ | | +--gnuserv/ | | +--html-helper-mode/ | | +--hugs-mode/ | | +--latex/ | | +--lispdir/ | | +--message/ | | +--sendmail/ | | +--shell/ | | +--sqlplus-mode/ | | +--supercite/ | | +--tex/ | | +--todo-mode/ | | +--vm/ | | +--w3/ | | +--mailcrypt/ | | +--haskell-mode/ | | +--psgml/ | +--mms/ | +--gnuserv-3.12.4/ +--serrano-manuel/ +--jump-theodore/ | +--bin/ | +--VisEmacs-2.1.1/ | | +--Debug/ | | +--Release/ | | +--res/ | +--regsetup.zip-2000.1225/ | +--w32-dev-env-0.zip/ | | +--w32hhelp/ | | +--samples/ | +--w32-print-1-13-4/ | | +--bin/ | | +--samples/ | +--w32-winmsg-1-6-3.zip/ | | +--mingw32/ | | +--nmake/ +--wiegley-john/ +--burton-brent/ +--barzilay-eli/ +--friedman-noah/ +--ravel-bruce/ +--lord-philip/ +--muenkel-heiko/ +--lopez-emilio/ +--casadonte-joe/ +--pearson-dave/ +--clausen-lars/ +--grossjohan-kai/ +--bihlmeyer-robert/ +--tziperman-eli/ +--dyke-neil/ +--jackson-trey/ +--moody-ray/ +--garshol-lars/ +--breton-tom/ +--lepied-frederick/ +--knuth-don/ +--predescu-ovidiu/ +--zakharevich-ilya/ +--latorre-vinicius/ +--englen-stephen/ +--wright-francis/ +--milliken-peter/ +--dominik-carsten/ +--ponce-david/ | +--jjar.zip-2001.0303/ | +--jjar-1.7/ +--manning-carl/ +--dickow-ulrik/ +--minar-nelson/ +--monnier-stefan/ +--karunakaran-rajeev/ +--chen-gongquan/ +--burton-kevin/ +--shulman-michael/ +--yankowski-fred/ +--konerding-david/ +--goel-deepak/ | +--2.7.1release/ | | +--help/ | | | +--ex_article/ | | | +--usrguide/ | +--RCS/ | +--lisp-mine-2001.1223/ | +--lisp-mine-2001.1226/ | +--lisp-mine-2001.1230/ +--minejima-yuji/ +--sebold-charles/ +--kemp-steve/ +--glickstein-bob/ +--hodges-matt/ +--zhu-shenghuo/ +--youngs-steve/ +--galbraith-peter/ +--osterlund-peter/ +--wedler-cristoph/ +--mengarini-will/ +--davidson-kevin/ +--lindgren-anders/ +--sharman-richard/ +--teixeira-roberto/ +--storm-kim/ +--pedersen-jesper/ +--grigni-michelangelo/ +--staats-michael/ +--broubaker-heddy/ +--berry-karl/ +--kleinpaste-karl/ +--carpenter-bill/ +--conrad-christoph/ +--edmonds-brian/ +--sylvester-olaf/ +--vaidheeswarran-rajesh/ +--perry-william/ +--kadlec-albrecht/ +--lannerback-anders/ +--liljenberg-peter/ +--boukanov-igor/ +--burgett-steve/ +--steverud-jonas/ +--gorrell-harley/ +--seiichi-namba/ +--nickelsen-jorgen/ +--sasser-dewey/ +--schroeder-alex/ +--breton-peter/ +--deleuze-christophe/ +--bjorlykke-stig/ +--wiborg-espen/ +--sprenger-karel/ +--buhl-josh/ +--padioleau-yoann/ +--brodie-bill/ +--drieu-benjamin/ +--belanger-jay/ +--ingrand-francois/ +--josefsson-simon/ +--rush-david/ +--rodgers-kevin/ +--urban-reini/ +--ernst-michael/ | +--doc/ | +--edb-1.21/ | +--examples/ | +--btxdb-0.6/ | | +--examples/ +--tse-stephen/ +--naess-rolf/ +--brillant-alexandre/ +--reichor-stefan/ +--riel-joe/ +--eide-eric/ +--schaeffer-timothy/ +--lundin-daniel/ +--pinard-francois/ +--yavner-jonathan/ +--berndl-klaus/ +--chen-jerry/ +--dampel-herbert/ +--shinn-alex/ +--keane-joe/ +--socha-robin/ +--ellison-gary/ pwget-2016.1019+git75c6e3e/doc/manual/000077500000000000000000000000001300167571300167715ustar00rootroot00000000000000pwget-2016.1019+git75c6e3e/doc/manual/index.html000066400000000000000000001162451300167571300207770ustar00rootroot00000000000000

NAME

pwget - Perl Web URL fetch program

SYNOPSIS

    pwget http://example.com/ [URL ...]
    pwget --config $HOME/config/pwget.conf --tag linux --tag emacs ..
    pwget --verbose --overwrite http://example.com/
    pwget --verbose --overwrite --Output ~/dir/ http://example.com/
    pwget --new --overwrite http://example.com/package-1.1.tar.gz

DESCRIPTION

Automate periodic downloads of files and packages.

If you retrieve latest versions of certain program blocks periodically, this is the Perl script for you. Run from cron job or once a week to upload newest versions of files around the net. Note:

Wget and this program

At this point you may wonder, where would you need this perl program when wget(1) C-program has been the standard for ages. Well, 1) Perl is cross platform and more easily extendable 2) You can record file download criteria to a configuration file and use perl regular epxressions to select downloads 3) the program can anlyze web-pages and "search" for the download only links as instructed 4) last but not least, it can track newest packages whose name has changed since last downlaod. There are heuristics to determine the newest file or package according to file name skeleton defined in configuration.

This program does not replace pwget(1) because it does not offer as many options as wget, like recursive downloads and date comparing. Use wget for ad hoc downloads and this utility for files that change (new releases of archives) or which you monitor periodically.

Short introduction

This small utility makes it possible to keep a list of URLs in a configuration file and periodically retrieve those pages or files with simple commands. This utility is best suited for small batch jobs to download e.g. most recent versions of software files. If you use an URL that is already on disk, be sure to supply option --overwrite to allow overwriting existing files.

While you can run this program from command line to retrieve individual files, program has been designed to use separate configuration file via --config option. In the configuration file you can control the downloading with separate directives like save: which tells to save the file under different name. The simplest way to retrieve the latest version of apackage from a FTP site is:

    pwget --new --overwite --verbose \
       http://www.example.com/package-1.00.tar.gz

Do not worry about the filename package-1.00.tar.gz. The latest version, say, package-3.08.tar.gz will be retrieved. The option --new instructs to find newer version than the provided URL.

If the URL ends to slash, then directory list at the remote machine is stored to file:

    !path!000root-file

The content of this file can be either index.html or the directory listing depending on the used http or ftp protocol.

OPTIONS

-A, --regexp-content REGEXP

Analyze the content of the file and match REGEXP. Only if the regexp matches the file content, then download file. This option will make downloads slow, because the file is read into memory as a single line and then a match is searched against the content.

For example to download Emacs lisp file (.el) written by Mr. Foo in case insensitive manner:

    pwget -v -r '\.el$' -A "(?i)Author: Mr. Foo" \
      http://www.emacswiki.org/elisp/index.html
-C, --create-paths

Create paths that do not exist in lcd: directives.

By default, any LCD directive to non-existing directory will interrupt program. With this option, local directories are created as needed making it possible to re-create the exact structure as it is in configuration file.

-c, --config FILE

This option can be given multiple times. All configurations are read.

Read URLs from configuration file. If no configuration file is given, file pointed by environment variable is read. See ENVIRONMENT.

The configuration file layout is envlained in section CONFIGURATION FILE

--chdir DIRECTORY

Do a chdir() to DIRECTORY before any URL download starts. This is like doing:

    cd DIRECTORY
    pwget http://example.com/index.html
-d, --debug [LEVEL]

Turn on debug with positive LEVEL number. Zero means no debug. This option turns on --verbose too.

-e, --extract

Unpack any files after retrieving them. The command to unpack typical archive files are defined in a program. Make sure these programs are along path. Win32 users are encouraged to install the Cygwin utilities where these programs come standard. Refer to section SEE ALSO.

  .tar => tar
  .tgz => tar + gzip
  .gz  => gzip
  .bz2 => bzip2
  .xz  => xz
  .zip => unzip
-F, --firewall FIREWALL

Use FIREWALL when accessing files via ftp:// protocol.

-h, --help

Print help page in text.

--help-html

Print help page in HTML.

--help-man

Print help page in Unix manual page format. You want to feed this output to c<nroff -man> in order to read it.

Print help page.

-m, --mirror SITE

If URL points to Sourcefoge download area, use mirror SITE for downloading. Alternatively the full full URL can include the mirror information. And example:

    --mirror kent http://downloads.sourceforge.net/foo/foo-1.0.0.tar.gz
-n, --new

Get newest file. This applies to datafiles, which do not have extension .asp or .html. When new releases are announced, the version number in filename usually tells which is the current one so getting hardcoded file with:

    pwget -o -v http://example.com/dir/program-1.3.tar.gz

is not usually practical from automation point of view. Adding --new option to the command line causes double pass: a) the whole http://example.com/dir/ is examined for all files and b) files matching approximately filename program-1.3.tar.gz are examined, heuristically sorted and file with latest version number is retrieved.

--no-lcd

Ignore lcd: directives in configuration file.

In the configuration file, any lcd: directives are obeyed as they are seen. But if you do want to retrieve URL to your current directory, be sure to supply this option. Otherwise the file will end to the directory pointer by lcd:.

--no-save

Ignore save: directives in configuration file. If the URLs have save: options, they are ignored during fetch. You usually want to combine --no-lcd with --no-save

--no-extract

Ignore x: directives in configuration file.

-O, --output DIR

Before retrieving any files, chdir to DIR.

-o, --overwrite

Allow overwriting existing files when retrieving URLs. Combine this with --skip-version if you periodically update files.

--proxy PROXY

Use PROXY server for HTTP. (See --Firewall for FTP.). The port number is optional in the call:

    --proxy http://example.com.proxy.com
    --proxy example.com.proxy.com:8080
-p, --prefix PREFIX

Add PREFIX to all retrieved files.

-P, --postfix POSTFIX

Add POSTFIX to all retrieved files.

-D, --prefix-date

Add iso8601 ":YYYY-MM-DD" prefix to all retrieved files. This is added before possible --prefix-www or --prefix.

-W, --prefix-www

Usually the files are stored with the same name as in the URL dir, but if you retrieve files that have identical names you can store each page separately so that the file name is prefixed by the site name.

    http://example.com/page.html    --> example.com::page.html
    http://example2.com/page.html   --> example2.com::page.html
-r, --regexp REGEXP

Retrieve file matching at the destination URL site. This is like "Connect to the URL and get all files matching REGEXP". Here all gzip compressed files are found form HTTP server directory:

    pwget -v -r "\.gz" http://example.com/archive/

Caveat: currently works only for http:// URLs.

-R, --config-regexp REGEXP

Retrieve URLs matching REGEXP from configuration file. This cancels --tag options in the command line.

-s, --selftest

Run some internal tests. For maintainer or developer only.

--sleep SECONDS

Sleep SECONDS before next URL request. When using regexp based downlaods that may return many hits, some sites disallow successive requests in within short period of time. This options makes program sleep for number of SECONDS between retrievals to overcome 'Service unavailable'.

--stdout

Retrieve URL and write to stdout.

--skip-version

Do not download files that have version number and which already exists on disk. Suppose you have these files and you use option --skip-version:

    package.tar.gz
    file-1.1.tar.gz

Only file.txt is retrieved, because file-1.1.tar.gz contains version number and the file has not changed since last retrieval. The idea is, that in every release the number in in distribution increases, but there may be distributions which do not contain version number. In regular intervals you may want to load those packages again, but skip versioned files. In short: This option does not make much sense without additional option --new

If you want to reload versioned file again, add option --overwrite.

-t, --test, --dry-run

Run in test mode.

-T, --tag NAME [NAME] ...

Search tag NAME from the config file and download only entries defined under that tag. Refer to --config FILE option description. You can give Multiple --tag switches. Combining this option with --regexp does not make sense and the concequencies are undefined.

-v, --verbose [NUMBER]

Print verbose messages.

-V, --version

Print version information.

EXAMPLES

Get files from site:

    pwget http://www.example.com/dir/package.tar.gz ..

Display copyright file for package GNU make from Debian pages:

    pwget --stdout --regexp 'copyright$' http://packages.debian.org/unstable/make

Get all mailing list archive files that match "gz":

    pwget --regexp gz  http://example.com/mailing-list/archive/download/

Read a directory and store it to filename YYYY-MM-DD::!dir!000root-file.

    pwget --prefix-date --overwrite --verbose http://www.example.com/dir/

To update newest version of the package, but only if there is none at disk already. The --new option instructs to find newer packages and the filename is only used as a skeleton for files to look for:

    pwget --overwrite --skip-version --new --verbose \
        ftp://ftp.example.com/dir/packet-1.23.tar.gz

To overwrite file and add a date prefix to the file name:

    pwget --prefix-date --overwrite --verbose \
       http://www.example.com/file.pl

    --> YYYY-MM-DD::file.pl

To add date and WWW site prefix to the filenames:

    pwget --prefix-date --prefix-www --overwrite --verbose \
       http://www.example.com/file.pl

    --> YYYY-MM-DD::www.example.com::file.pl

Get all updated files under cnfiguration file's tag updates:

    pwget --verbose --overwrite --skip-version --new --tag updates
    pwget -v -o -s -n -T updates

Get files as they read in the configuration file to the current directory, ignoring any lcd: and save: directives:

    pwget --config $HOME/config/pwget.conf /
        --no-lcd --no-save --overwrite --verbose \
        http://www.example.com/file.pl

To check configuration file, run the program with non-matching regexp and it parses the file and checks the lcd: directives on the way:

    pwget -v -r dummy-regexp

    -->

    pwget.DirectiveLcd: LCD [$EUSR/directory ...]
    is not a directory at /users/foo/bin/pwget line 889.

CONFIGURATION FILE

Comments

The configuration file is NOT Perl code. Comments start with hash character (#).

Variables

At this point, variable expansions happen only in lcd:. Do not try to use them anywhere else, like in URLs.

Path variables for lcd: are defined using following notation, spaces are not allowed in VALUE part (no directory names with spaces). Variable names are case sensitive. Variables substitute environment variabales with the same name. Environment variables are immediately available.

    VARIABLE = /home/my/dir         # define variable
    VARIABLE = $dir/some/file       # Use previously defined variable
    FTP      = $HOME/ftp            # Use environment variable

The right hand can refer to previously defined variables or existing environment variables. Repeat, this is not Perl code although it may look like one, but just an allowed syntax in the configuration file. Notice that there is dollar to the right hand> when variable is referred, but no dollar to the left hand side when variable is defined. Here is example of a possible configuration file contant. The tags are hierarchically ordered without a limit.

Warning: remember to use different variables names in separate include files. All variables are global.

Include files

It is possible to include more configuration files with statement

    INCLUDE <path-to-file-name>

Variable expansions are possible in the file name. There is no limit how many or how deep include structure is used. Every file is included only once, so it is safe to to have multiple includes to the same file. Every include is read, so put the most importat override includes last:

    INCLUDE <etc/pwget.conf>             # Global
    INCLUDE <$HOME/config/pwget.conf>    # HOME overrides it

A special THIS tag means relative path of the current include file, which makes it possible to include several files form the same directory where a initial include file resides

    # Start of config at /etc/pwget.conf

    # THIS = /etc, current location
    include <THIS/pwget-others.conf>

    # Refers to directory where current user is: the pwd
    include <pwget-others.conf>

    # end

Configuraton file example

The configuration file can contain many <directoves:>, where each directive end to a colon. The usage of each directory is best explained by examining the configuration file below and reading the commentary near each directive.

    #   $HOME/config/pwget.conf F- Perl pwget configuration file

    ROOT   = $HOME                      # define variables
    CONF   = $HOME/config
    UPDATE = $ROOT/updates
    DOWNL  = $ROOT/download

    #   Include more configuration files. It is possible to
    #   split a huge file in pieces and have "linux",
    #   "win32", "debian", "emacs" configurations in separate
    #   and manageable files.

    INCLUDE <$CONF/pwget-other.conf>
    INCLUDE <$CONF/pwget-more.conf>

    tag1: local-copies tag1: local      # multiple names to this category

        lcd:  $UPDATE                   # chdir directive

        #  This is show to user with option --verbose
        print: Notice, this site moved YYYY-MM-DD, update your bookmarks

        file://absolute/dir/file-1.23.tar.gz

    tag1: external

      lcd:  $DOWNL

      tag2: external-http

        http://www.example.com/page.html
        http://www.example.com/page.html save:/dir/dir/page.html

      tag2: external-ftp

        ftp://ftp.com/dir/file.txt.gz save:xx-file.txt.gz login:foo pass:passwd x:

        lcd: $HOME/download/package

        ftp://ftp.com/dir/package-1.1.tar.gz new:

      tag2: package-x

        lcd: $DOWNL/package-x

        #  Person announces new files in his homepage, download all
        #  announced files. Unpack everything (x:) and remove any
        #  existing directories (xopt:rm)

        http://example.com/~foo pregexp:\.tar\.gz$ x: xopt:rm

    # End of configuration file pwget.conf

LIST OF DIRECTIVES IN CONFIGURATION FILE

All the directives must in the same line where the URL is. The programs scans lines and determines all options given in line for the URL. Directives can be overridden by command line options.

cnv:CONVERSION

Currently only conv:text is available.

Convert downloaded page to text. This option always needs either save: or rename:, because only those directives change filename. Here is an example:

    http://example.com/dir/file.html cnv:text save:file.txt
    http://example.com/dir/ pregexp:\.html cnv:text rename:s/html/txt/

A text: shorthand directive can be used instead of cnv:text.

cregexp:REGEXP

Download file only if the content matches REGEXP. This is same as option --Regexp-content. In this example directory listing Emacs lisp packages (.el) are downloaded but only if their content indicates that the Author is Mr. Foo:

    http://example.com/index.html cregexp:(?i)author:.*Foo pregexp:\.el$
lcd:DIRECTORY

Set local download directory to DIRECTORY (chdir to it). Any environment variables are substituted in path name. If this tag is found, it replaces setting of --Output. If path is not a directory, terminate with error. See also --Create-paths and --no-lcd.

login:LOGIN-NAME

Ftp login name. Default value is "anonymous".

mirror:SITE

This is relevant to Sourceforge only which does not allow direct downloads with links. Visit project's Sourceforge homepage and see which mirrors are available for downloading.

An example:

  http://sourceforge.net/projects/austrumi/files/austrumi/austrumi-1.8.5/austrumi-1.8.5.iso/download new: mirror:kent
new:

Get newest file. This variable is reset to the value of --new after the line has been processed. Newest means, that an ls command is run in the ftp, and something equivalent in HTTP "ftp directories", and any files that resemble the filename is examined, sorted and heurestically determined according to version number of file which one is the latest. For example files that have version information in YYYYMMDD format will most likely to be retrieved right.

Time stamps of the files are not checked.

The only requirement is that filename must follow the universal version numbering standard:

    FILE-VERSION.extension      # de facto VERSION is defined as [\d.]+

    file-19990101.tar.gz        # ok
    file-1999.0101.tar.gz       # ok
    file-1.2.3.5.tar.gz         # ok

    file1234.txt                # not recognized. Must have "-"
    file-0.23d.tar.gz           # warning, letters are problematic

Files that have some alphabetic version indicator at the end of VERSION may not be handled correctly. Contact the developer and inform him about the de facto standard so that files can be retrieved more intelligently.

NOTE: In order the new: directive to know what kind of files to look for, it needs a file tamplate. You can use a direct link to some filename. Here the location "http://www.example.com/downloads" is examined and the filename template used is took as "file-1.1.tar.gz" to search for files that might be newer, like "file-9.1.10.tar.gz":

  http://www.example.com/downloads/file-1.1.tar.gz new:

If the filename appeard in a named page, use directive file: for template. In this case the "download.html" page is examined for files looking like "file.*tar.gz" and the latest is searched:

  http://www.example.com/project/download.html file:file-1.1.tar.gz new:
overwrite: o:

Same as turning on --overwrite

page:

Read web page and apply commands to it. An example: contact the root page and save it:

   http://example.com/~foo page: save:foo-homepage.html

In order to find the correct information from the page, other directives are usually supplied to guide the searching.

1) Adding directive pregexp:ARCHIVE-REGEXP matches the A HREF links in the page.

2) Adding directive new: instructs to find newer VERSIONS of the file.

3) Adding directive file:DOWNLOAD-FILE tells what template to use to construct the downloadable file name. This is needed for the new: directive.

4) A directive vregexp:VERSION-REGEXP matches the exact location in the page from where the version information is extracted. The default regexp looks for line that says "The latest version ... is ... N.N". The regexp must return submatch 2 for the version number.

AN EXAMPLE

Search for newer files from a HTTP directory listing. Examine page http://www.example.com/download/dir for model package-1.1.tar.gz and find a newer file. E.g. package-4.7.tar.gz would be downloaded.

    http://www.example.com/download/dir/package-1.1.tar.gz new:

AN EXAMPLE

Search for newer files from the content of the page. The directive file: acts as a model for filenames to pay attention to.

    http://www.example.com/project/download.html new: pregexp:tar.gz file:package-1.1.tar.gz

AN EXAMPLE

Use directive rename: to change the filename before soring it on disk. Here, the version number is attached to the actila filename:

    file.txt-1.1
    file.txt-1.2

The directived needed would be as follows; entries have been broken to separate lines for legibility:

    http://example.com/files/
    pregexp:\.el-\d
    vregexp:(file.el-([\d.]+))
    file:file.el-1.1
    new:
    rename:s/-[\d.]+//

This effectively reads: "See if there is new version of something that looks like file.el-1.1 and save it under name file.el by deleting the extra version number at the end of original filename".

AN EXAMPLE

Contact absolute page: at http://www.example.com/package.html and search A HREF urls in the page that match pregexp:. In addition, do another scan and search the version number in the page from thw position that match vregexp: (submatch 2).

After all the pieces have been found, use template file: to make the retrievable file using the version number found from vregexp:. The actual download location is combination of page: and A HREF pregexp: location.

The directived needed would be as follows; entries have been broken to separate lines for legibility:

    http://www.example.com/~foo/package.html
    page:
    pregexp: package.tar.gz
    vregexp: ((?i)latest.*?version.*?\b([\d][\d.]+).*)
    file: package-1.3.tar.gz
    new:
    x:

An example of web page where the above would apply:

    <HTML>
    <BODY>

    The latest version of package is <B>2.4.1</B> It can be
    downloaded in several forms:

        <A HREF="download/files/package.tar.gz">Tar file</A>
        <A HREF="download/files/package.zip">ZIP file

    </BODY>
    </HTML>

For this example, assume that package.tar.gz is a symbolic link pointing to the latest release file package-2.4.1.tar.gz. Thus the actual download location would have been http://www.example.com/~foo/download/files/package-2.4.1.tar.gz.

Why not simply download package.tar.gz? Because then the program can't decide if the version at the page is newer than one stored on disk from the previous download. With version numbers in the file names, the comparison is possible.

page:find

FIXME: This opton is obsolete. do not use.

THIS IS FOR HTTP only. Use Use directive regexp: for FTP protocls.

This is a more general instruction than the page: and vregexp: explained above.

Instruct to download every URL on HTML page matching pregexp:RE. In typical situation the page maintainer lists his software in the development page. This example would download every tar.gz file in the page. Note, that the REGEXP is matched against the A HREF link content, not the actual text that is displayed on the page:

    http://www.example.com/index.html page:find pregexp:\.tar.gz$

You can also use additional regexp-no: directive if you want to exclude files after the pregexp: has matched a link.

    http://www.example.com/index.html page:find pregexp:\.tar.gz$ regexp-no:desktop
pass:PASSWORD

For FTP logins. Default value is nobody@example.com.

pregexp:RE

Search A HREF links in page matching a regular expression. The regular expression must be a single word with no whitespace. This is incorrect:

    pregexp:(this regexp )

It must be written as:

    pregexp:(this\s+regexp\s)
print:MESSAGE

Print associated message to user requesting matching tag name. This directive must in separate line inside tag.

    tag1: linux

      print: this download site moved 2002-02-02, check your bookmarks.
      http://new.site.com/dir/file-1.1.tar.gz new:

The print: directive for tag is shown only if user turns on --verbose mode:

    pwget -v -T linux
rename:PERL-CODE

Rename each file using PERL-CODE. The PERL-CODE must be full perl program with no spaces anywhere. Following variables are available during the eval() of code:

    $ARG = current file name
    $url = complete url for the file
    The code must return $ARG which is used for file name

For example, if page contains links to .html files that are in fact text files, following statement would change the file extensions:

    http://example.com/dir/ page:find pregexp:\.html rename:s/html/txt/

You can also call function MonthToNumber($string) if the filename contains written month name, like <2005-February.mbox>.The function will convert the name into number. Many mailing list archives can be downloaded cleanly this way.

    #  This will download SA-Exim Mailing list archives:
    http://lists.merlins.org/archives/sa-exim/ pregexp:\.txt$ rename:$ARG=MonthToNumber($ARG)

Here is a more complicated example:

    http://www.contactor.se/~dast/svnusers/mbox.cgi pregexp:mbox.*\d$ rename:my($y,$m)=($url=~/year=(\d+).*month=(\d+)/);$ARG="$y-$m.mbox"

Let's break that one apart. You may spend some time with this example since the possiblilities are limitless.

    1. Connect to page
       http://www.contactor.se/~dast/svnusers/mbox.cgi

    2. Search page for URLs matching regexp 'mbox.*\d$'. A
       found link could match hrefs like this:
       http://svn.haxx.se/users/mbox.cgi?year=2004&month=12

    3. The found link is put to $ARG (same as $_), which can be used
       to extract suitable mailbox name with a perl code that is
       evaluated. The resulting name must apear in $ARG. Thus the code
       effectively extract two items from the link to form a mailbox
       name:

        my ($y, $m) = ( $url =~ /year=(\d+).*month=(\d+)/ )
        $ARG = "$y-$m.mbox"

        => 2004-12.mbox

Just remember, that the perl code that follows rename: directive must must not contain any spaces. It all must be readable as one string.

regexp:REGEXP

Get all files in ftp directory matching regexp. Directive save: is ignored.

regexp-no:REGEXP

After the regexp: directive has matched, exclude files that match directive regexp-no:

Regexp:REGEXP

This option is for interactive use. Retrieve all files from HTTP or FTP site which match REGEXP.

save:LOCAL-FILE-NAME

Save file under this name to local disk.

tagN:NAME

Downloads can be grouped under tagN so that e.g. option --tag1 would start downloading files from that point on until next tag1 is found. There are currently unlimited number of tag levels: tag1, tag2 and tag3, so that you can arrange your downlods hierarchially in the configuration file. For example to download all Linux files rhat you monitor, you would give option --tag linux. To download only the NT Emacs latest binary, you would give option --tag emacs-nt. Notice that you do not give the level in the option, program will find it out from the configuration file after the tag name matches.

The downloading stops at next tag of the same level. That is, tag2 stops only at next tag2, or when upper level tag is found (tag1) or or until end of file.

    tag1: linux             # All Linux downlods under this category

        tag2: sunsite    tag2: another-name-for-this-spot

        #   List of files to download from here

        tag2: ftp.funet.fi

        #   List of files to download from here

    tag1: emacs-binary

        tag2: emacs-nt

        tag2: xemacs-nt

        tag2: emacs

        tag2: xemacs
x:

Extract (unpack) file after download. See also option --unpack and --no-extract The archive file, say .tar.gz will be extracted the file in current download location. (see directive lcd:)

The unpack procedure checks the contents of the archive to see if the package is correctly formed. The de facto archive format is

    package-N.NN.tar.gz

In the archive, all files are supposed to be stored under the proper subdirectory with version information:

    package-N.NN/doc/README
    package-N.NN/doc/INSTALL
    package-N.NN/src/Makefile
    package-N.NN/src/some-code.java

IMPORTANT: If the archive does not have a subdirectory for all files, a subdirectory is created and all items are unpacked under it. The default subdirectory name in constructed from the archive name with currect date stamp in format:

    package-YYYY.MMDD

If the archive name contains something that looks like a version number, the created directory will be constructed from it, instead of current date.

    package-1.43.tar.gz    =>  package-1.43
xx:

Like directive x: but extract the archive as is, without checking content of the archive. If you know that it is ok for the archive not to include any subdirectories, use this option to suppress creation of an artificial root package-YYYY.MMDD.

xopt:rm

This options tells to remove any previous unpack directory.

Sometimes the files in the archive are all read-only and unpacking the archive second time, after some period of time, would display

    tar: package-3.9.5/.cvsignore: Could not create file:
    Permission denied

    tar: package-3.9.5/BUGS: Could not create file:
    Permission denied

This is not a serious error, because the archive was already on disk and tar did not overwrite previous files. It might be good to inform the archive maintainer, that the files have wrong permissions. It is customary to expect that distributed packages have writable flag set for all files.

ERRORS

Here is list of possible error messages and how to deal with them. Turning on --debug will help to understand how program has interpreted the configuration file or command line options. Pay close attention to the generated output, because it may reveal that a regexp for a site is too lose or too tight.

ERROR {URL-HERE} Bad file descriptor

This is "file not found error". You have written the filename incorrectly. Double check the configuration file's line.

BUGS AND LIMITATIONS

Sourceforge note: To download archive files from Sourceforge requires some trickery because of the redirections and load balancers the site uses. The Sourceforge page have also undergone many changes during their existence. Due to these changes there exists an ugly hack in the program to use wget(1) to get certain information from the site. This could have been implemented in pure Perl, but as of now the developer hasn't had time to remove the wget(1) dependency. No doubt, this is an ironic situation to use wget(1). You you have Perl skills, go ahead and look at UrlHttGet(). UrlHttGetWget() and sen patches.

The program was initially designed to read options from one line. It is unfortunately not possible to change the program to read configuration file directives from multiple lines, e.g. by using backslashes (\) to indicate contuatinued line.

ENVIRONMENT

Variable PWGET_CFG can point to the root configuration file. The configuration file is read at startup if it exists.

    export PWGET_CFG=$HOME/conf/pwget.conf     # /bin/hash syntax
    setenv PWGET_CFG $HOME/conf/pwget.conf     # /bin/csh syntax

EXIT STATUS

Not defined.

DEPENDENCIES

External utilities:

    wget(1)   only needed for Sourceforge.net downloads
              see BUGS AND LIMITATIONS

Non-core Perl modules from CPAN:

    LWP::UserAgent
    Net::FTP

The following modules are loaded in run-time only if directive cnv:text is used. Otherwise these modules are not loaded:

    HTML::Parse
    HTML::TextFormat
    HTML::FormatText

This module is loaded in run-time only if HTTPS scheme is used:

    Crypt::SSLeay

SEE ALSO

lwp-download(1) lwp-mirror(1) lwp-request(1) lwp-rget(1) wget(1)

AUTHOR

Jari Aalto

LICENSE AND COPYRIGHT

Copyright (C) 1996-2016 Jari Aalto

This program is free software; you can redistribute and/or modify program under the terms of GNU General Public license either version 2 of the License, or (at your option) any later version.

pwget-2016.1019+git75c6e3e/doc/manual/index.txt000066400000000000000000001051651300167571300206510ustar00rootroot00000000000000NAME pwget - Perl Web URL fetch program SYNOPSIS pwget http://example.com/ [URL ...] pwget --config $HOME/config/pwget.conf --tag linux --tag emacs .. pwget --verbose --overwrite http://example.com/ pwget --verbose --overwrite --Output ~/dir/ http://example.com/ pwget --new --overwrite http://example.com/package-1.1.tar.gz DESCRIPTION Automate periodic downloads of files and packages. If you retrieve latest versions of certain program blocks periodically, this is the Perl script for you. Run from cron job or once a week to upload newest versions of files around the net. Note: Wget and this program At this point you may wonder, where would you need this perl program when wget(1) C-program has been the standard for ages. Well, 1) Perl is cross platform and more easily extendable 2) You can record file download criteria to a configuration file and use perl regular epxressions to select downloads 3) the program can anlyze web-pages and "search" for the download only links as instructed 4) last but not least, it can track newest packages whose name has changed since last downlaod. There are heuristics to determine the newest file or package according to file name skeleton defined in configuration. This program does not replace pwget(1) because it does not offer as many options as wget, like recursive downloads and date comparing. Use wget for ad hoc downloads and this utility for files that change (new releases of archives) or which you monitor periodically. Short introduction This small utility makes it possible to keep a list of URLs in a configuration file and periodically retrieve those pages or files with simple commands. This utility is best suited for small batch jobs to download e.g. most recent versions of software files. If you use an URL that is already on disk, be sure to supply option --overwrite to allow overwriting existing files. While you can run this program from command line to retrieve individual files, program has been designed to use separate configuration file via --config option. In the configuration file you can control the downloading with separate directives like "save:" which tells to save the file under different name. The simplest way to retrieve the latest version of apackage from a FTP site is: pwget --new --overwite --verbose \ http://www.example.com/package-1.00.tar.gz Do not worry about the filename "package-1.00.tar.gz". The latest version, say, "package-3.08.tar.gz" will be retrieved. The option --new instructs to find newer version than the provided URL. If the URL ends to slash, then directory list at the remote machine is stored to file: !path!000root-file The content of this file can be either index.html or the directory listing depending on the used http or ftp protocol. OPTIONS -A, --regexp-content REGEXP Analyze the content of the file and match REGEXP. Only if the regexp matches the file content, then download file. This option will make downloads slow, because the file is read into memory as a single line and then a match is searched against the content. For example to download Emacs lisp file (.el) written by Mr. Foo in case insensitive manner: pwget -v -r '\.el$' -A "(?i)Author: Mr. Foo" \ http://www.emacswiki.org/elisp/index.html -C, --create-paths Create paths that do not exist in "lcd:" directives. By default, any LCD directive to non-existing directory will interrupt program. With this option, local directories are created as needed making it possible to re-create the exact structure as it is in configuration file. -c, --config FILE This option can be given multiple times. All configurations are read. Read URLs from configuration file. If no configuration file is given, file pointed by environment variable is read. See ENVIRONMENT. The configuration file layout is envlained in section CONFIGURATION FILE --chdir DIRECTORY Do a chdir() to DIRECTORY before any URL download starts. This is like doing: cd DIRECTORY pwget http://example.com/index.html -d, --debug [LEVEL] Turn on debug with positive LEVEL number. Zero means no debug. This option turns on --verbose too. -e, --extract Unpack any files after retrieving them. The command to unpack typical archive files are defined in a program. Make sure these programs are along path. Win32 users are encouraged to install the Cygwin utilities where these programs come standard. Refer to section SEE ALSO. .tar => tar .tgz => tar + gzip .gz => gzip .bz2 => bzip2 .xz => xz .zip => unzip -F, --firewall FIREWALL Use FIREWALL when accessing files via ftp:// protocol. -h, --help Print help page in text. --help-html Print help page in HTML. --help-man Print help page in Unix manual page format. You want to feed this output to c in order to read it. Print help page. -m, --mirror SITE If URL points to Sourcefoge download area, use mirror SITE for downloading. Alternatively the full full URL can include the mirror information. And example: --mirror kent http://downloads.sourceforge.net/foo/foo-1.0.0.tar.gz -n, --new Get newest file. This applies to datafiles, which do not have extension .asp or .html. When new releases are announced, the version number in filename usually tells which is the current one so getting hardcoded file with: pwget -o -v http://example.com/dir/program-1.3.tar.gz is not usually practical from automation point of view. Adding --new option to the command line causes double pass: a) the whole http://example.com/dir/ is examined for all files and b) files matching approximately filename program-1.3.tar.gz are examined, heuristically sorted and file with latest version number is retrieved. --no-lcd Ignore "lcd:" directives in configuration file. In the configuration file, any "lcd:" directives are obeyed as they are seen. But if you do want to retrieve URL to your current directory, be sure to supply this option. Otherwise the file will end to the directory pointer by "lcd:". --no-save Ignore "save:" directives in configuration file. If the URLs have "save:" options, they are ignored during fetch. You usually want to combine --no-lcd with --no-save --no-extract Ignore "x:" directives in configuration file. -O, --output DIR Before retrieving any files, chdir to DIR. -o, --overwrite Allow overwriting existing files when retrieving URLs. Combine this with --skip-version if you periodically update files. --proxy PROXY Use PROXY server for HTTP. (See --Firewall for FTP.). The port number is optional in the call: --proxy http://example.com.proxy.com --proxy example.com.proxy.com:8080 -p, --prefix PREFIX Add PREFIX to all retrieved files. -P, --postfix POSTFIX Add POSTFIX to all retrieved files. -D, --prefix-date Add iso8601 ":YYYY-MM-DD" prefix to all retrieved files. This is added before possible --prefix-www or --prefix. -W, --prefix-www Usually the files are stored with the same name as in the URL dir, but if you retrieve files that have identical names you can store each page separately so that the file name is prefixed by the site name. http://example.com/page.html --> example.com::page.html http://example2.com/page.html --> example2.com::page.html -r, --regexp REGEXP Retrieve file matching at the destination URL site. This is like "Connect to the URL and get all files matching REGEXP". Here all gzip compressed files are found form HTTP server directory: pwget -v -r "\.gz" http://example.com/archive/ Caveat: currently works only for http:// URLs. -R, --config-regexp REGEXP Retrieve URLs matching REGEXP from configuration file. This cancels --tag options in the command line. -s, --selftest Run some internal tests. For maintainer or developer only. --sleep SECONDS Sleep SECONDS before next URL request. When using regexp based downlaods that may return many hits, some sites disallow successive requests in within short period of time. This options makes program sleep for number of SECONDS between retrievals to overcome 'Service unavailable'. --stdout Retrieve URL and write to stdout. --skip-version Do not download files that have version number and which already exists on disk. Suppose you have these files and you use option --skip-version: package.tar.gz file-1.1.tar.gz Only file.txt is retrieved, because file-1.1.tar.gz contains version number and the file has not changed since last retrieval. The idea is, that in every release the number in in distribution increases, but there may be distributions which do not contain version number. In regular intervals you may want to load those packages again, but skip versioned files. In short: This option does not make much sense without additional option --new If you want to reload versioned file again, add option --overwrite. -t, --test, --dry-run Run in test mode. -T, --tag NAME [NAME] ... Search tag NAME from the config file and download only entries defined under that tag. Refer to --config FILE option description. You can give Multiple --tag switches. Combining this option with --regexp does not make sense and the concequencies are undefined. -v, --verbose [NUMBER] Print verbose messages. -V, --version Print version information. EXAMPLES Get files from site: pwget http://www.example.com/dir/package.tar.gz .. Display copyright file for package GNU make from Debian pages: pwget --stdout --regexp 'copyright$' http://packages.debian.org/unstable/make Get all mailing list archive files that match "gz": pwget --regexp gz http://example.com/mailing-list/archive/download/ Read a directory and store it to filename YYYY-MM-DD::!dir!000root-file. pwget --prefix-date --overwrite --verbose http://www.example.com/dir/ To update newest version of the package, but only if there is none at disk already. The --new option instructs to find newer packages and the filename is only used as a skeleton for files to look for: pwget --overwrite --skip-version --new --verbose \ ftp://ftp.example.com/dir/packet-1.23.tar.gz To overwrite file and add a date prefix to the file name: pwget --prefix-date --overwrite --verbose \ http://www.example.com/file.pl --> YYYY-MM-DD::file.pl To add date and WWW site prefix to the filenames: pwget --prefix-date --prefix-www --overwrite --verbose \ http://www.example.com/file.pl --> YYYY-MM-DD::www.example.com::file.pl Get all updated files under cnfiguration file's tag updates: pwget --verbose --overwrite --skip-version --new --tag updates pwget -v -o -s -n -T updates Get files as they read in the configuration file to the current directory, ignoring any "lcd:" and "save:" directives: pwget --config $HOME/config/pwget.conf / --no-lcd --no-save --overwrite --verbose \ http://www.example.com/file.pl To check configuration file, run the program with non-matching regexp and it parses the file and checks the "lcd:" directives on the way: pwget -v -r dummy-regexp --> pwget.DirectiveLcd: LCD [$EUSR/directory ...] is not a directory at /users/foo/bin/pwget line 889. CONFIGURATION FILE Comments The configuration file is NOT Perl code. Comments start with hash character (#). Variables At this point, variable expansions happen only in lcd:. Do not try to use them anywhere else, like in URLs. Path variables for lcd: are defined using following notation, spaces are not allowed in VALUE part (no directory names with spaces). Variable names are case sensitive. Variables substitute environment variabales with the same name. Environment variables are immediately available. VARIABLE = /home/my/dir # define variable VARIABLE = $dir/some/file # Use previously defined variable FTP = $HOME/ftp # Use environment variable The right hand can refer to previously defined variables or existing environment variables. Repeat, this is not Perl code although it may look like one, but just an allowed syntax in the configuration file. Notice that there is dollar to the right hand> when variable is referred, but no dollar to the left hand side when variable is defined. Here is example of a possible configuration file contant. The tags are hierarchically ordered without a limit. Warning: remember to use different variables names in separate include files. All variables are global. Include files It is possible to include more configuration files with statement INCLUDE Variable expansions are possible in the file name. There is no limit how many or how deep include structure is used. Every file is included only once, so it is safe to to have multiple includes to the same file. Every include is read, so put the most importat override includes last: INCLUDE # Global INCLUDE <$HOME/config/pwget.conf> # HOME overrides it A special "THIS" tag means relative path of the current include file, which makes it possible to include several files form the same directory where a initial include file resides # Start of config at /etc/pwget.conf # THIS = /etc, current location include # Refers to directory where current user is: the pwd include # end Configuraton file example The configuration file can contain many , where each directive end to a colon. The usage of each directory is best explained by examining the configuration file below and reading the commentary near each directive. # $HOME/config/pwget.conf F- Perl pwget configuration file ROOT = $HOME # define variables CONF = $HOME/config UPDATE = $ROOT/updates DOWNL = $ROOT/download # Include more configuration files. It is possible to # split a huge file in pieces and have "linux", # "win32", "debian", "emacs" configurations in separate # and manageable files. INCLUDE <$CONF/pwget-other.conf> INCLUDE <$CONF/pwget-more.conf> tag1: local-copies tag1: local # multiple names to this category lcd: $UPDATE # chdir directive # This is show to user with option --verbose print: Notice, this site moved YYYY-MM-DD, update your bookmarks file://absolute/dir/file-1.23.tar.gz tag1: external lcd: $DOWNL tag2: external-http http://www.example.com/page.html http://www.example.com/page.html save:/dir/dir/page.html tag2: external-ftp ftp://ftp.com/dir/file.txt.gz save:xx-file.txt.gz login:foo pass:passwd x: lcd: $HOME/download/package ftp://ftp.com/dir/package-1.1.tar.gz new: tag2: package-x lcd: $DOWNL/package-x # Person announces new files in his homepage, download all # announced files. Unpack everything (x:) and remove any # existing directories (xopt:rm) http://example.com/~foo pregexp:\.tar\.gz$ x: xopt:rm # End of configuration file pwget.conf LIST OF DIRECTIVES IN CONFIGURATION FILE All the directives must in the same line where the URL is. The programs scans lines and determines all options given in line for the URL. Directives can be overridden by command line options. cnv:CONVERSION Currently only conv:text is available. Convert downloaded page to text. This option always needs either save: or rename:, because only those directives change filename. Here is an example: http://example.com/dir/file.html cnv:text save:file.txt http://example.com/dir/ pregexp:\.html cnv:text rename:s/html/txt/ A text: shorthand directive can be used instead of cnv:text. cregexp:REGEXP Download file only if the content matches REGEXP. This is same as option --Regexp-content. In this example directory listing Emacs lisp packages (.el) are downloaded but only if their content indicates that the Author is Mr. Foo: http://example.com/index.html cregexp:(?i)author:.*Foo pregexp:\.el$ lcd:DIRECTORY Set local download directory to DIRECTORY (chdir to it). Any environment variables are substituted in path name. If this tag is found, it replaces setting of --Output. If path is not a directory, terminate with error. See also --Create-paths and --no-lcd. login:LOGIN-NAME Ftp login name. Default value is "anonymous". mirror:SITE This is relevant to Sourceforge only which does not allow direct downloads with links. Visit project's Sourceforge homepage and see which mirrors are available for downloading. An example: http://sourceforge.net/projects/austrumi/files/austrumi/austrumi-1.8.5/austrumi-1.8.5.iso/download new: mirror:kent new: Get newest file. This variable is reset to the value of --new after the line has been processed. Newest means, that an "ls" command is run in the ftp, and something equivalent in HTTP "ftp directories", and any files that resemble the filename is examined, sorted and heurestically determined according to version number of file which one is the latest. For example files that have version information in YYYYMMDD format will most likely to be retrieved right. Time stamps of the files are not checked. The only requirement is that filename "must" follow the universal version numbering standard: FILE-VERSION.extension # de facto VERSION is defined as [\d.]+ file-19990101.tar.gz # ok file-1999.0101.tar.gz # ok file-1.2.3.5.tar.gz # ok file1234.txt # not recognized. Must have "-" file-0.23d.tar.gz # warning, letters are problematic Files that have some alphabetic version indicator at the end of VERSION may not be handled correctly. Contact the developer and inform him about the de facto standard so that files can be retrieved more intelligently. *NOTE:* In order the new: directive to know what kind of files to look for, it needs a file tamplate. You can use a direct link to some filename. Here the location "http://www.example.com/downloads" is examined and the filename template used is took as "file-1.1.tar.gz" to search for files that might be newer, like "file-9.1.10.tar.gz": http://www.example.com/downloads/file-1.1.tar.gz new: If the filename appeard in a named page, use directive file: for template. In this case the "download.html" page is examined for files looking like "file.*tar.gz" and the latest is searched: http://www.example.com/project/download.html file:file-1.1.tar.gz new: overwrite: o: Same as turning on --overwrite page: Read web page and apply commands to it. An example: contact the root page and save it: http://example.com/~foo page: save:foo-homepage.html In order to find the correct information from the page, other directives are usually supplied to guide the searching. 1) Adding directive "pregexp:ARCHIVE-REGEXP" matches the A HREF links in the page. 2) Adding directive new: instructs to find newer VERSIONS of the file. 3) Adding directive "file:DOWNLOAD-FILE" tells what template to use to construct the downloadable file name. This is needed for the "new:" directive. 4) A directive "vregexp:VERSION-REGEXP" matches the exact location in the page from where the version information is extracted. The default regexp looks for line that says "The latest version ... is ... N.N". The regexp must return submatch 2 for the version number. AN EXAMPLE Search for newer files from a HTTP directory listing. Examine page http://www.example.com/download/dir for model "package-1.1.tar.gz" and find a newer file. E.g. "package-4.7.tar.gz" would be downloaded. http://www.example.com/download/dir/package-1.1.tar.gz new: AN EXAMPLE Search for newer files from the content of the page. The directive file: acts as a model for filenames to pay attention to. http://www.example.com/project/download.html new: pregexp:tar.gz file:package-1.1.tar.gz AN EXAMPLE Use directive rename: to change the filename before soring it on disk. Here, the version number is attached to the actila filename: file.txt-1.1 file.txt-1.2 The directived needed would be as follows; entries have been broken to separate lines for legibility: http://example.com/files/ pregexp:\.el-\d vregexp:(file.el-([\d.]+)) file:file.el-1.1 new: rename:s/-[\d.]+// This effectively reads: "See if there is new version of something that looks like file.el-1.1 and save it under name file.el by deleting the extra version number at the end of original filename". AN EXAMPLE Contact absolute page: at http://www.example.com/package.html and search A HREF urls in the page that match pregexp:. In addition, do another scan and search the version number in the page from thw position that match vregexp: (submatch 2). After all the pieces have been found, use template file: to make the retrievable file using the version number found from vregexp:. The actual download location is combination of page: and A HREF pregexp: location. The directived needed would be as follows; entries have been broken to separate lines for legibility: http://www.example.com/~foo/package.html page: pregexp: package.tar.gz vregexp: ((?i)latest.*?version.*?\b([\d][\d.]+).*) file: package-1.3.tar.gz new: x: An example of web page where the above would apply: The latest version of package is 2.4.1 It can be downloaded in several forms: Tar file ZIP file For this example, assume that "package.tar.gz" is a symbolic link pointing to the latest release file "package-2.4.1.tar.gz". Thus the actual download location would have been "http://www.example.com/~foo/download/files/package-2.4.1.tar.gz". Why not simply download "package.tar.gz"? Because then the program can't decide if the version at the page is newer than one stored on disk from the previous download. With version numbers in the file names, the comparison is possible. page:find FIXME: This opton is obsolete. do not use. THIS IS FOR HTTP only. Use Use directive regexp: for FTP protocls. This is a more general instruction than the page: and vregexp: explained above. Instruct to download every URL on HTML page matching pregexp:RE. In typical situation the page maintainer lists his software in the development page. This example would download every tar.gz file in the page. Note, that the REGEXP is matched against the A HREF link content, not the actual text that is displayed on the page: http://www.example.com/index.html page:find pregexp:\.tar.gz$ You can also use additional regexp-no: directive if you want to exclude files after the pregexp: has matched a link. http://www.example.com/index.html page:find pregexp:\.tar.gz$ regexp-no:desktop pass:PASSWORD For FTP logins. Default value is "nobody@example.com". pregexp:RE Search A HREF links in page matching a regular expression. The regular expression must be a single word with no whitespace. This is incorrect: pregexp:(this regexp ) It must be written as: pregexp:(this\s+regexp\s) print:MESSAGE Print associated message to user requesting matching tag name. This directive must in separate line inside tag. tag1: linux print: this download site moved 2002-02-02, check your bookmarks. http://new.site.com/dir/file-1.1.tar.gz new: The "print:" directive for tag is shown only if user turns on --verbose mode: pwget -v -T linux rename:PERL-CODE Rename each file using PERL-CODE. The PERL-CODE must be full perl program with no spaces anywhere. Following variables are available during the eval() of code: $ARG = current file name $url = complete url for the file The code must return $ARG which is used for file name For example, if page contains links to .html files that are in fact text files, following statement would change the file extensions: http://example.com/dir/ page:find pregexp:\.html rename:s/html/txt/ You can also call function "MonthToNumber($string)" if the filename contains written month name, like <2005-February.mbox>.The function will convert the name into number. Many mailing list archives can be downloaded cleanly this way. # This will download SA-Exim Mailing list archives: http://lists.merlins.org/archives/sa-exim/ pregexp:\.txt$ rename:$ARG=MonthToNumber($ARG) Here is a more complicated example: http://www.contactor.se/~dast/svnusers/mbox.cgi pregexp:mbox.*\d$ rename:my($y,$m)=($url=~/year=(\d+).*month=(\d+)/);$ARG="$y-$m.mbox" Let's break that one apart. You may spend some time with this example since the possiblilities are limitless. 1. Connect to page http://www.contactor.se/~dast/svnusers/mbox.cgi 2. Search page for URLs matching regexp 'mbox.*\d$'. A found link could match hrefs like this: http://svn.haxx.se/users/mbox.cgi?year=2004&month=12 3. The found link is put to $ARG (same as $_), which can be used to extract suitable mailbox name with a perl code that is evaluated. The resulting name must apear in $ARG. Thus the code effectively extract two items from the link to form a mailbox name: my ($y, $m) = ( $url =~ /year=(\d+).*month=(\d+)/ ) $ARG = "$y-$m.mbox" => 2004-12.mbox Just remember, that the perl code that follows "rename:" directive must must not contain any spaces. It all must be readable as one string. regexp:REGEXP Get all files in ftp directory matching regexp. Directive save: is ignored. regexp-no:REGEXP After the "regexp:" directive has matched, exclude files that match directive regexp-no: Regexp:REGEXP This option is for interactive use. Retrieve all files from HTTP or FTP site which match REGEXP. save:LOCAL-FILE-NAME Save file under this name to local disk. tagN:NAME Downloads can be grouped under "tagN" so that e.g. option --tag1 would start downloading files from that point on until next "tag1" is found. There are currently unlimited number of tag levels: tag1, tag2 and tag3, so that you can arrange your downlods hierarchially in the configuration file. For example to download all Linux files rhat you monitor, you would give option --tag linux. To download only the NT Emacs latest binary, you would give option --tag emacs-nt. Notice that you do not give the "level" in the option, program will find it out from the configuration file after the tag name matches. The downloading stops at next tag of the "same level". That is, tag2 stops only at next tag2, or when upper level tag is found (tag1) or or until end of file. tag1: linux # All Linux downlods under this category tag2: sunsite tag2: another-name-for-this-spot # List of files to download from here tag2: ftp.funet.fi # List of files to download from here tag1: emacs-binary tag2: emacs-nt tag2: xemacs-nt tag2: emacs tag2: xemacs x: Extract (unpack) file after download. See also option --unpack and --no-extract The archive file, say .tar.gz will be extracted the file in current download location. (see directive lcd:) The unpack procedure checks the contents of the archive to see if the package is correctly formed. The de facto archive format is package-N.NN.tar.gz In the archive, all files are supposed to be stored under the proper subdirectory with version information: package-N.NN/doc/README package-N.NN/doc/INSTALL package-N.NN/src/Makefile package-N.NN/src/some-code.java "IMPORTANT:" If the archive does not have a subdirectory for all files, a subdirectory is created and all items are unpacked under it. The default subdirectory name in constructed from the archive name with currect date stamp in format: package-YYYY.MMDD If the archive name contains something that looks like a version number, the created directory will be constructed from it, instead of current date. package-1.43.tar.gz => package-1.43 xx: Like directive x: but extract the archive "as is", without checking content of the archive. If you know that it is ok for the archive not to include any subdirectories, use this option to suppress creation of an artificial root package-YYYY.MMDD. xopt:rm This options tells to remove any previous unpack directory. Sometimes the files in the archive are all read-only and unpacking the archive second time, after some period of time, would display tar: package-3.9.5/.cvsignore: Could not create file: Permission denied tar: package-3.9.5/BUGS: Could not create file: Permission denied This is not a serious error, because the archive was already on disk and tar did not overwrite previous files. It might be good to inform the archive maintainer, that the files have wrong permissions. It is customary to expect that distributed packages have writable flag set for all files. ERRORS Here is list of possible error messages and how to deal with them. Turning on --debug will help to understand how program has interpreted the configuration file or command line options. Pay close attention to the generated output, because it may reveal that a regexp for a site is too lose or too tight. ERROR {URL-HERE} Bad file descriptor This is "file not found error". You have written the filename incorrectly. Double check the configuration file's line. BUGS AND LIMITATIONS "Sourceforge note": To download archive files from Sourceforge requires some trickery because of the redirections and load balancers the site uses. The Sourceforge page have also undergone many changes during their existence. Due to these changes there exists an ugly hack in the program to use wget(1) to get certain information from the site. This could have been implemented in pure Perl, but as of now the developer hasn't had time to remove the wget(1) dependency. No doubt, this is an ironic situation to use wget(1). You you have Perl skills, go ahead and look at UrlHttGet(). UrlHttGetWget() and sen patches. The program was initially designed to read options from one line. It is unfortunately not possible to change the program to read configuration file directives from multiple lines, e.g. by using backslashes (\) to indicate contuatinued line. ENVIRONMENT Variable "PWGET_CFG" can point to the root configuration file. The configuration file is read at startup if it exists. export PWGET_CFG=$HOME/conf/pwget.conf # /bin/hash syntax setenv PWGET_CFG $HOME/conf/pwget.conf # /bin/csh syntax EXIT STATUS Not defined. DEPENDENCIES External utilities: wget(1) only needed for Sourceforge.net downloads see BUGS AND LIMITATIONS Non-core Perl modules from CPAN: LWP::UserAgent Net::FTP The following modules are loaded in run-time only if directive cnv:text is used. Otherwise these modules are not loaded: HTML::Parse HTML::TextFormat HTML::FormatText This module is loaded in run-time only if HTTPS scheme is used: Crypt::SSLeay SEE ALSO lwp-download(1) lwp-mirror(1) lwp-request(1) lwp-rget(1) wget(1) AUTHOR Jari Aalto LICENSE AND COPYRIGHT Copyright (C) 1996-2016 Jari Aalto This program is free software; you can redistribute and/or modify program under the terms of GNU General Public license either version 2 of the License, or (at your option) any later version.