kiwix-zim-updater-3.3/.gitignore
# Ignore possible download log
download.log
purge.log
screenshot.png
/zims
kiwix-index
kiwix-zim-updater-3.3/LICENSE
GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. 
(Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. 
c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. 
You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. 
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. 
If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. 
The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail. 
If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License.
kiwix-zim-updater-3.3/README.md
# kiwix-zim-updater.sh

A script to check `download.kiwix.org` for updates to your local ZIM library.

Just pass this script your ZIM directory and away it goes. *(see Usage below)*

Tested on PopOS! 22.04, and should work out of the box on most Debian systems, but I have not tested that.

# DEPRECATION WARNING

`kiwix-zim.sh` has been deprecated in favor of the more descriptive `kiwix-zim-updater.sh`.
A hard link from `kiwix-zim.sh` to `kiwix-zim-updater.sh` will be maintained on this repo until at least `2025-01-01T00:00:00Z` for compatibility with CRON users.

***CALLING `kiwix-zim.sh` WILL RESULT IN AN EXIT CODE OF 2 ON SUCCESSFUL EXECUTION. THIS BEHAVIOR IS EXPECTED. CALL `kiwix-zim-updater.sh` INSTEAD TO GET A 0 EXIT CODE***

## What It Does

I wanted an easy way to ensure my ZIM library was kept updated without actually needing to check every ZIM individually. I looked for a tool to do this, but didn't find anything... so I put on my amateur BASH hat and made one.

Some people run this script via a scheduled cron job where they store their ZIM library and host a Kiwix server.

It works for me. Your mileage may vary: there is no warranty, see [the license](./LICENSE) for more info.

This script will parse a list of all ZIM(s) found in the ZIM directory passed to it. It then checks each ZIM against what is on the `download.kiwix.org` website by comparing the Year-Month part of the file name. Any zims with newer versions online will then be replaced by default. There is an option to verify the downloaded checksums automatically, and options to set the maximum and minimum zim size to download. Although the default behavior is to purge the old zim if the new zim passes inspection, purging can be disabled if you would like to keep an archive of old zims.

```text
Note: Due to the nature of ZIM sizes and internet connection speeds,
      you should expect the download process to take a while.

      This script will output the download progress bar during the
      download process just so you can see that the script hasn't
      frozen or locked up.

      Download status is also logged in real-time for monitoring
      from outside this script.
```

### Special Note 1

For data safety reasons, I have coded this script to "dry-run" by default.
This means that this script will not download or purge anything; however, it will "go through the motions" and output what it would have actually done, allowing you to review the "effects" before committing to them.

Once you are good with the "dry-run" results and wish to commit to them, simply re-run the script like you did the first time, but this time, add the "dry-run" override flag (`-d`) to the end.

```text
Bonus: A dry-run/simulation run is not required. If you like to live
       dangerously, feel free to just call the script with the override
       flag right from the start. It's your ZIM library... not mine.
```

### Special Note 2

Creates `download.log` for the following reasons:

1. History of what was done. Just good to have.
2. Because downloads can take a really long time, if you were to run this script in the background, you'd have no real way of monitoring the status of any downloads it may be running... `download.log` can be monitored for real-time status of any downloads taking place. You could use a very simple `tail -f download.log` to watch those download stats in real-time from outside of the script.

## Limitations

- This script is only for ZIM(s) hosted by `download.kiwix.org` due to the file naming standard they use. If you have self-made ZIM(s) or ZIM(s) downloaded from somewhere else, they most likely do not use the same naming standards and will not be processed by this script.
- If you have ZIM(s) from `download.kiwix.org`, but you have changed their file names, this script will treat them like the previous limitation explains.
- This script does not attempt to update any `library.xml` that may or may not exist/be needed for your install/setup of Kiwix. If needed, you'll need to handle this part on your own.

## Requirements

This script does not need root; however, it does need the same rights as your ZIM directory or it won't be able to download and/or purge ZIMs.
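
The permission requirement above can be checked by hand before a real run. The following is a minimal sketch and not part of the script: `check_zim_dir` is a hypothetical helper name, and the directory you pass is whatever you use as your ZIM directory. The script's own `flags` function performs a similar `-d`/`-w` test and stays in dry-run when the directory is not writable.

```shell
# Hypothetical helper (an illustration, not shipped with kiwix-zim-updater.sh):
# reports whether the current user could download into / purge from a directory.
check_zim_dir() {
    if [ ! -d "$1" ]; then
        echo "not a directory"
        return 2
    elif [ -w "$1" ]; then
        echo "writable"
        return 0
    else
        echo "read-only"
        return 1
    fi
}

# Check the directory given as the first argument (defaults to the current one).
check_zim_dir "${1:-.}" || true
```

If this reports `read-only`, run the updater as the user that owns the ZIM directory rather than reaching for root.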
Not checked or installed via script:

- Git *(only needed for the self-update process to work.)*

## Install

This script is self-updating. The self-update routine uses git commands to make the update, so this script should be "installed" with the command below.

```shell
git clone https://github.com/jojo2357/kiwix-zim-updater.git
```

UPDATE: If you decide not to install via a git clone, you can still use this script; however, it will just skip the update check and continue on.

NOTE: If you are not tracking the `main` branch, the update check will be skipped. So if you do not want to get updates, but like git, just track the commit of your choosing.

## Usage

```text
Usage: ./kiwix-zim-updater.sh /full/path/

    /full/path/                Full path to ZIM directory

Universal Options:
    -h, --help                 Show this usage and exit.
    -d, --disable-dry-run      Dry-Run Override.
                               *** Caution ***
    -u, --skip-update          Skips checking for script updates (very useful for development).
    -g, --get-index            Forces using remote index rather than cached index. Cache auto clears after one day
    -n , --min-size            Minimum ZIM Size to be downloaded.
                               Specify units include M Mi G Gi, etc. See `man numfmt`
    -x , --max-size            Maximum ZIM Size to be downloaded.
                               Specify units include M Mi G Gi, etc. See `man numfmt`

Action Method Options:
    -w, --web                  Downloads zims over http(s).
    -t, --torrent              Downloads `.torrent` files. REQUIRES ADDITIONAL SOFTWARE TO EXECUTE DOWNLOAD.

Web Download Options:
    -c, --calculate-checksum   Verifies that the downloaded files were not corrupted, but can take a while for large downloads.
    -p, --skip-purge           Skips purging any replaced ZIMs.
    -l , --location            Country Code to prefer mirrors from
    -f, --verify-library       Verifies that the entire library has the correct checksums as found online.
                               Expected behavior is to create sha256 files during a normal run so this option can be used at a later date without internet.
                               Disable this using -S
    -S, --no-sha               Disables saving the zim checksum for future reference.
                               Does not delete present checksums.
```
kiwix-zim-updater-3.3/kiwix-zim-updater.sh
#!/bin/bash

VER="3.3"

# This array will contain all of the local zims, with the file extension
LocalZIMArray=()

# This array will contain all of the local zims, without the file extension
LocalZIMNameArray=()

# This array will map the local zim to the index in the remote arrays that contains the same base file name
LocalZIMRemoteIndexArray=()

# This array is a boolean array which remembers if a given local zim should be processed in the download loop
LocalRequiresDownloadArray=()

# After updating, this array will be used to store hanging locks and to deal with them
HangingFileLocks=();

# This array stores the file names that kiwix has to offer, with .zim extensions
RemoteFiles=()

# Ditto, without the YYYY-MM (note a trailing _)
Basenames=()

# Contains the absolute path to this file, from /zims/
RemotePaths=()

# Contains the folder this file is in relative to /zims/
RemoteCategory=()

# Set Script Strings
SCRIPT="$(readlink -f "$0")"
SCRIPTFILE="$(basename "$SCRIPT")"
SCRIPTPATH="$(dirname "$SCRIPT")"
CALLEDSCRIPTNAME="$0"
CALLEDSCRIPTFILE="$(basename "$CALLEDSCRIPTNAME")"
CALLEDSCRIPTPATH="$(dirname "$CALLEDSCRIPTNAME")"
ARGS=("$@")
BRANCH="main"
SKIP_UPDATE=0
DEBUG=1 # This forces the script to default to "dry-run/simulation mode"
MIN_SIZE=0
MAX_SIZE=0
CALCULATE_CHECKSUM=0
CHECKSUM_FILES=1
VERIFY_LIBRARY=0
FORCE_FETCH_INDEX=0
DOWNLOAD_METHOD=1 # 1: web 2: torrent
BaseURL="https://download.kiwix.org/zim/"
ZIMPath=""

RED_REGULAR="\033[0;31m"
RED_BOLD="\033[1;31m"
YELLOW_REGULAR="\033[0;33m"
YELLOW_BOLD="\033[1;33m"
GREEN_REGULAR="\033[0;32m"
GREEN_BOLD="\033[1;32m"
BLUE_REGULAR="\033[0;34m"
BLUE_BOLD="\033[1;34m"
CLEAR="\033[0m"

# This will ask the api what files it has to offer and store them in arrays
master_scrape() {
    unset RemoteFiles
    unset Basenames
    unset RemotePaths
    unset RemoteCategory

    indexIsValid=1

    if [[ ! -f kiwix-index ]]; then
        indexIsValid=0
    else
        indexDate="$(head -1 "kiwix-index")"
        if [[ -z "${indexDate}" ]] || [[ "$(date -u -d "$indexDate" +%s)" -lt "$(date -u -d "1 day ago" +%s)" ]]; then
            indexIsValid=0
        fi
    fi

    if [[ FORCE_FETCH_INDEX -eq 1 ]] || [[ $indexIsValid -eq 0 ]]; then
        # both write the file timestamp to the index file and save all of the links to RawLibrary
        RawLibrary="$(wget --show-progress -q -O - "https://library.kiwix.org/catalog/v2/entries?count=-1" | tee --output-error=warn-nopipe >(grep -ioP "(?<=)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?=Z)" | head -1 > kiwix-index) | grep -i 'application/x-zim' | grep -ioP "^\s+\K.*$")"
        echo "$RawLibrary" >> kiwix-index
    else
        RawLibrary=$(grep -i '> download.log
        exit 0
    else
        echo -e "${GREEN_BOLD} ✓ Found ${#RemoteFiles[@]} files online"
        echo "✓ Found ${#RemoteFiles[@]} files online" >> download.log
    fi

    # Housekeeping...
    unset RawLibrary
    unset hrefs
}

# self_update - Script Self-Update Function
self_update() {
    echo -e "${YELLOW_BOLD}1. Checking for Script Updates...${CLEAR}"
    echo "1. Checking for Script Updates..." >> download.log
    echo
    # Check if script path is a git clone.
    # If true, then check for update.
    # If false, skip self-update check/function.
    if [ $SKIP_UPDATE -eq 1 ]; then
        echo -e "${YELLOW_REGULAR} Check Skipped${CLEAR}"
        echo "Check Skipped" >> download.log
    elif [[ -d "$SCRIPTPATH/.git" ]]; then
        echo -e "${GREEN_BOLD} ✓ Git Clone Detected: Checking Script Version...${CLEAR}"
        echo "✓ Git Clone Detected: Checking Script Version..." >> download.log
        cd "$SCRIPTPATH" || exit 1
        [[ $(timeout 1s git rev-parse --abbrev-ref HEAD) != "$BRANCH" ]] && echo -e "${YELLOW_BOLD} You appear to be on a different branch so I will assume you are developing and do not want an update${CLEAR}" && echo && return
        timeout 1s git fetch --quiet
        timeout 1s git diff --quiet --exit-code "origin/$BRANCH" "$SCRIPTFILE"
        [ $? -eq 1 ] && {
            echo -e "${RED_REGULAR} ✗ Version: Mismatched${CLEAR}"
            echo "✗ Version: Mismatched" >> download.log
            echo
            echo -e "${YELLOW_BOLD}1a. Fetching Update...${CLEAR}"
            echo "1a. Fetching Update..." >> download.log
            echo
            if [ -n "$(git status --porcelain)" ]; then
                git stash push -m 'local changes stashed before self update' --quiet
            fi
            git pull --force --quiet
            git checkout $BRANCH --quiet
            git pull --force --quiet
            echo -e "${GREEN_BOLD} ✓ Update Complete. Running New Version. Standby...${CLEAR}"
            echo "✓ Update Complete. Running New Version. Standby..." >> download.log
            sleep 3
            cd - >/dev/null || exit 1
            exec "$CALLEDSCRIPTNAME" "${ARGS[@]}"
            # Exit this old instance of the script
            exit 1
        }
        echo -e "${GREEN_BOLD} ✓ Version: Current${CLEAR}"
    else
        echo -e "${RED_REGULAR} ✗ Git Clone Not Detected: Skipping Update Check${CLEAR}"
    fi
    echo
}

# usage_example - Show Usage and Exit
usage_example() {
    echo 'Usage: ./kiwix-zim-updater.sh /full/path/'
    echo ''
    echo ' /full/path/ Full path to ZIM directory'
    echo ''
    echo 'Universal Options:'
    echo ' -h, --help Show this usage and exit.'
    echo ' -u, --skip-update Skips checking for script updates (very useful for development).'
    echo ' -g, --get-index Forces using remote index rather than cached index. Cache auto clears after one day'
    echo ' -n , --min-size Minimum ZIM Size to be downloaded.'
    echo ' Specify units include M Mi G Gi, etc. See `man numfmt`'
    echo ' -x , --max-size Maximum ZIM Size to be downloaded.'
    echo ' Specify units include M Mi G Gi, etc. See `man numfmt`'
    echo ' -S, --no-sha Disables saving the zim checksum for future reference. Does not delete present checksums.'
    echo ' '
    echo 'Action Method Options:'
    echo ' -w, --web Downloads zims over http(s).'
    echo ' -t, --torrent Downloads `.torrent` files. REQUIRES ADDITIONAL SOFTWARE TO EXECUTE DOWNLOAD.'
    echo ' '
    echo ' -f, --verify-library Verifies that the entire library has the correct checksums as found online.'
    echo ' Expected behavior is to create sha256 files during a normal run so this option can be used at a later date without internet.'
    echo ' Disable this using -S'
    echo ''
    echo 'Web Download Options:'
    echo ' -c, --calculate-checksum Verifies that the downloaded files were not corrupted, but can take a while for large downloads.'
    echo ' -p, --skip-purge Skips purging any replaced ZIMs.'
    echo ' -l , --location Country Code to prefer mirrors from'
    echo ' -d, --disable-dry-run Dry-Run Override.'
    echo ' *** Caution ***'
    echo
    exit 0
}

# flags - Flag and ZIM Processing Functions
flags() {
    echo -e "${YELLOW_BOLD}2. Preprocessing...${CLEAR}"
    echo "2. Preprocessing..." >> download.log
    echo
    echo -e "${BLUE_BOLD} -Validating ZIM directory...${CLEAR}"
    echo "Validating ZIM directory..." >> download.log

    # Let's identify which argument is the ZIM directory path and if it's an actual directory.
    if [[ -d ${1} ]]; then
        if [[ -w ${1} ]]; then
            ZIMPath=$1
        else
            ZIMPath=$1
            echo -e "${RED_REGULAR} ✗ Cannot write to '${1}', continuing in dry-run${CLEAR}"
            echo "✗ Cannot write to '${1}', continuing in dry-run" >> download.log
            echo
            DEBUG=1
        fi
    else
        # Um... no ZIM directory path provided? Okay, let's show the usage and exit.
        if [[ -z ${1} ]]; then
            echo -e "${RED_REGULAR} ✗ Kiwix ZIM Directory not provided${CLEAR}"
            echo "✗ Kiwix ZIM Directory not provided" >> download.log
        else
            echo -e "${RED_REGULAR} ✗ '$1' is not a directory${CLEAR}"
            echo "✗ '$1' is not a directory" >> download.log
        fi
        echo
        usage_example
    fi

    # Check for and add if missing, trailing slash.
    [[ "${ZIMPath}" != */ ]] && ZIMPath="${ZIMPath}/"

    # Now we need to check for ZIM files.
    shopt -s nullglob # This is in case there are no matching files

    # Load all found ZIM(s) w/path into LocalZIMArray
    IFS=$'\n' read -r -d '' -a LocalZIMArray < <(ls -1 "$ZIMPath" | grep -iP "\.zim$")
    unset IFS

    for index in "${!LocalZIMArray[@]}" ; do
        duplicated=0
        myBasename=$(echo "${LocalZIMArray[$index]}" | grep -ioP "^.*(?=_\d{4}-\d{2}\.zim$)")
        for scanIndex in "${!LocalZIMArray[@]}"; do
            if [[ -f "${ZIMPath}.~lock.${LocalZIMArray[$index]}" ]]; then
                if [[ $index -le $scanIndex ]] || [[ -f "${ZIMPath}.~lock.${LocalZIMArray[$scanIndex]}" ]]; then
                    continue;
                fi
            else
                if [[ $index -ge $scanIndex ]] || [[ -f "${ZIMPath}.~lock.${LocalZIMArray[$scanIndex]}" ]]; then
                    continue;
                fi
            fi
            scanBasename=$(echo "${LocalZIMArray[$scanIndex]}" | grep -ioP "^.*(?=_\d{4}-\d{2}\.zim$)")
            if [[ "$myBasename" == "$scanBasename" ]]; then
                if [[ -f "${ZIMPath}.~lock.${LocalZIMArray[$index]}" ]]; then
                    echo "Disregarding ${LocalZIMArray[$index]} because it was interrupted by ${LocalZIMArray[$scanIndex]}" >> download.log
                elif [[ -f "${ZIMPath}${LocalZIMArray[$index]}.torrent" ]]; then
                    echo "Disregarding ${LocalZIMArray[$index]} because new torrent exists ${LocalZIMArray[$scanIndex]}" >> download.log
                else
                    echo "Disregarding ${LocalZIMArray[$index]} because it is shadowed by ${LocalZIMArray[$scanIndex]}" >> download.log
                fi
                duplicated=1
                break
            fi
        done
        [[ $duplicated -eq 1 ]] && unset -v 'LocalZIMArray[$index]'
    done

    LocalZIMArray=("${LocalZIMArray[@]}")

    # Check that ZIM(s) were actually found/loaded.
    if [ ${#LocalZIMArray[@]} -eq 0 ]; then
        # No ZIM(s) were found in the directory... I guess there's nothing else for us to do, so we'll Exit.
        echo -e "${RED_REGULAR} ✗ No ZIMs found. Exiting...${CLEAR}"
        exit 0
    else
        echo -e "${GREEN_BOLD} ✓ Valid ZIM Directory ${CLEAR}"
    fi
    echo
    echo -e "${BLUE_BOLD} -Building online ZIM list...${CLEAR}"
    # Build online ZIM list.
    master_scrape
    echo

    # Populate ZIM arrays from found ZIM(s)
    echo -e "${BLUE_BOLD} -Parsing ZIM(s)...${CLEAR}"

    for ((i = 0; i < ${#LocalZIMArray[@]}; i++)); do # Loop through local ZIM(s).
        LocalZIMNameArray[$i]=$(basename "${LocalZIMArray[$i]}") # Extract file name.
        filename=$(basename "${LocalZIMArray[$i]}" | grep -ioP "[\w:\/\-.]+(?=\d{4}-\d{2}\.zim$)") # Extract the base name (everything before the YYYY-MM date part).
        # IFS='_' read -ra fields <<< "${LocalZIMNameArray[$i]}"; unset IFS # Break the filename into fields delimited by the underscore '_'

        # Search the remote Basenames array for the current local ZIM to find its index in the remote arrays
        for ((z = 0; z < ${#Basenames[@]}; z++)); do
            if [[ ${Basenames[$z]} == "$filename" ]]; then # Match Found (ignore the filename datepart).
                LocalZIMRemoteIndexArray[$i]="$z"
                break
            else # No Match Found.
                LocalZIMRemoteIndexArray[$i]="-1"
            fi
        done
        if [[ ${LocalZIMRemoteIndexArray[$i]} -eq -1 ]]; then
            echo -e "${RED_REGULAR} ✗ ${LocalZIMNameArray[$i]} No online match found.${CLEAR}"
        else
            echo -e "${GREEN_BOLD} ✓ ${LocalZIMNameArray[$i]} [${RemoteCategory[${LocalZIMRemoteIndexArray[$i]}]}]${CLEAR}"
        fi
    done
    echo
    echo -e "${GREEN_REGULAR} ${#LocalZIMNameArray[*]} ZIM(s) found.${CLEAR}"
}

# mirror_search - Find ZIM URL Priority #1 mirror from meta4 Function
mirror_search() {
    IsMirror=0
    DownloadURL=""
    RemotePath="${RemotePaths[${LocalZIMRemoteIndexArray[$z]}]}"
    ExpectedSize="${FileSizes[${LocalZIMRemoteIndexArray[$z]}]}"
    # If we need the checksum, we need both a link and the hash, and the .meta4 file provides both.
    # Silently fetch (via wget) the associated meta4 xml and extract the mirror URL marked priority="1"
    MetaInfo=$(wget -q -O - "$BaseURL$RemotePath.meta4?country=$COUNTRY_CODE")
    ExpectedSize=$(echo "$MetaInfo" | grep '<size>' | grep -Po '\d+')
    ExpectedHash=$(echo "$MetaInfo" | grep '<hash type="sha-256">' | grep -Poi '(?<="sha-256">)[a-f\d]{64}(?=<)')
    # still run the last wget so that the logic is simple for verification.
    if [[ $DOWNLOAD_METHOD -eq 2 ]]; then
        DownloadURL="$BaseURL$RemotePath.torrent" # Set the torrent URL as our download URL
        return
    fi
    RawMirror=$(echo "$MetaInfo" | grep 'priority="1"' | grep -Po 'https?://[^ ")]+(?=</url>)')
    # Check that we actually got a URL (this could probably be done better). If no mirror URL, default back to direct URL.
    if [[ $RawMirror == *"http"* ]]; then # Mirror URL found
        DownloadURL="$RawMirror" # Set the mirror URL as our download URL
        IsMirror=1
    else # Mirror URL not found
        DownloadURL="$BaseURL$RemotePath" # Set the direct download URL as our download URL
    fi
}

#########################
# Begin Script Execute
#########################

while [[ $# -gt 0 ]]; do
    case $1 in
    -h | --help)
        usage_example
        ;;
    -d | --disable-dry-run)
        DEBUG=0
        shift # discard argument
        ;;
    -v | --version)
        echo "$VER"
        exit 0
        ;;
    -p | --skip-purge)
        SKIP_PURGE=1
        shift # discard argument
        ;;
    -n | --min-size)
        shift # discard -n argument
        MIN_SIZE=$(numfmt --from=auto "$1") # convert passed arg to bytes
        shift # discard value
        ;;
    -x | --max-size)
        shift # discard -x argument
        MAX_SIZE=$(numfmt --from=auto "$1") # convert passed arg to bytes
        shift # discard value
        ;;
    -l | --location)
        shift # discard -l argument
        if [[ "$1" =~ ^[A-Z]{2}$ ]]; then
            COUNTRY_CODE=$1 # keep the two-letter country code
        else
            COUNTRY_CODE=""
            echo "Invalid country code, falling back to default kiwix behavior" >> download.log
        fi
        shift # discard value
        ;;
    -c | --calculate-checksum)
        CALCULATE_CHECKSUM=1
        shift
        ;;
    -f | --verify-library | --verfiy-library) # the misspelled form is kept so existing invocations still work
        VERIFY_LIBRARY=1
        CALCULATE_CHECKSUM=1
        shift
        ;;
    -u | --skip-update)
        SKIP_UPDATE=1
        shift
        ;;
    -g | --get-index)
        FORCE_FETCH_INDEX=1
        shift
        ;;
    -w | --web)
        DOWNLOAD_METHOD=1
        shift
        ;;
    -t | --torrent)
        DOWNLOAD_METHOD=2
        shift
        ;;
    -S | --no-sha)
        CHECKSUM_FILES=0
        shift
        ;;
    *)
        # We can either parse the arg here, or just tuck it away for safekeeping
        POSITIONAL_ARGS+=("$1") # save positional arg
        shift # past argument
        ;;
    esac
done

set -- "${POSITIONAL_ARGS[@]}" # restore positional parameters that we skipped earlier

clear # Clear screen

if [[ $CALCULATE_CHECKSUM -ne 0 ]] && [[ $DOWNLOAD_METHOD -ne 1 ]]; then
    echo -e "${RED_BOLD}Calculating Checksum not available with torrenting. Aborting.${CLEAR}"
fi

# Display Header
echo "=========================================="
echo " kiwix-zim-updater"
if [[ "$CALLEDSCRIPTFILE" == "kiwix-zim.sh" ]]; then
    echo " WARNING: kiwix-zim.sh has been deprecated "
    echo " in favor of kiwix-zim-updater "
    echo "WARNING: kiwix-zim has been deprecated in favor of kiwix-zim-updater. kiwix-zim will be a hard link to the new script name until at least 2025-01-01" >> download.log
fi
echo " download.kiwix.org ZIM Updater"
echo
echo " v$VER by DocDrydenn and jojo2357"
echo "=========================================="
echo
echo " DRY-RUN/SIMULATION"
[[ $DEBUG -eq 1 ]] && echo " - ENABLED -"
[[ $DEBUG -eq 1 ]] && echo
[[ $DEBUG -eq 1 ]] && echo " Use '-d' to disable."
[[ $DEBUG -eq 0 ]] && echo " - DISABLED -"
[[ $DEBUG -eq 0 ]] && echo
[[ $DEBUG -eq 0 ]] && echo " !!! Caution !!!"
echo
echo "=========================================="
echo

# First, Self-Update Check.
# Shouldn't this be first? It is not dependent on anything else and resets everything, so may as well reset it before getting all invested?
self_update

# Second, Flag Check.
flags "$@"

echo
echo -e "${YELLOW_BOLD}3. Processing ZIM(s)...${CLEAR}"
echo "3. Processing ZIM(s)..." >> download.log
echo

AnyDownloads=0
for ((i = 0; i < ${#LocalZIMNameArray[@]}; i++)); do
    RemoteIndex=${LocalZIMRemoteIndexArray[$i]}
    FileName=${LocalZIMNameArray[$i]} # Set before first use; the no-match branch below reads it
    if [[ $RemoteIndex -eq -1 ]]; then
        # No online match: the file can only be verified against a checksum cached next to it in the ZIM directory.
        if [[ $VERIFY_LIBRARY -eq 1 ]] && [[ -f "$ZIMPath$FileName.sha256" ]]; then
            LocalRequiresDownloadArray+=(3)
            AnyDownloads=1
            echo -e "${BLUE_BOLD} - $FileName:${CLEAR}"
            echo -e "${GREEN_REGULAR} Cached Checksum Found${CLEAR}"
        else
            LocalRequiresDownloadArray+=(0)
        fi
        continue
    fi
    echo -e "${BLUE_BOLD} - $FileName:${CLEAR}"
    [[ -f "$ZIMPath.~lock.$FileName" ]] && echo -e "${YELLOW_REGULAR} Incomplete download detected\n${GREEN_BOLD} ✓ Online Version Found${CLEAR}\n" && LocalRequiresDownloadArray+=(1) && AnyDownloads=1 && continue
    MatchingSize=${FileSizes[$RemoteIndex]}
    MatchingFileName=${RemoteFiles[$RemoteIndex]}
    MatchingFullPath=${RemotePaths[$RemoteIndex]}
    MatchingCategory=${RemoteCategory[$RemoteIndex]}
    [[ -f "$ZIMPath$MatchingFileName.torrent" ]] && [[ ! -f "$ZIMPath$MatchingFileName" ]] && echo -e "${YELLOW_REGULAR} Torrent already downloaded\n${GREEN_BOLD} ✓ Online Version Found${CLEAR}\n" && LocalRequiresDownloadArray+=(0) && continue
    MatchedDate="$(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')"
    MatchedYear="$(echo "$MatchedDate" | grep -oP '\d{4}(?=-\d{2})')"
    MatchedMonth="$(echo "$MatchedDate" | grep -oP '(?<=\d{4}-)\d{2}')"
    LocalDate="$(echo "$FileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')"
    LocalYear="$(echo "$LocalDate" | grep -oP '\d{4}(?=-\d{2})')"
    LocalMonth="$(echo "$LocalDate" | grep -oP '(?<=\d{4}-)\d{2}')"
    FileTooSmall=0
    [[ $MIN_SIZE -gt 0 ]] && [[ $MatchingSize -lt $MIN_SIZE ]] && FileTooSmall=1
    FileTooLarge=0
    [[ $MAX_SIZE -gt 0 ]] && [[ $MatchingSize -gt $MAX_SIZE ]] && FileTooLarge=1
    FileSizeAcceptable=0
    [ $FileTooSmall -eq 0 ] && [ $FileTooLarge -eq 0 ] && FileSizeAcceptable=1
    if [ $VERIFY_LIBRARY -eq 1 ] && [ $FileSizeAcceptable -eq 0 ]; then
        if [ $FileTooSmall -eq 1 ]; then
            LocalRequiresDownloadArray+=(0)
            [[ $DEBUG -eq 0 ]] && echo -e
"${GREEN_REGULAR} ✓ Verification skipped due to file size (minimum: $(numfmt --to=iec-i $MIN_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize"))${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ *** Simulated *** Verification skipped due to file size (minimum: $(numfmt --to=iec-i $MIN_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize"))${CLEAR}" elif [ $FileTooLarge -eq 1 ]; then LocalRequiresDownloadArray+=(0) [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Verification skipped due to file size (maximum: $(numfmt --to=iec-i $MAX_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize"))${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ *** Simulated *** Verification skipped due to file size (maximum: $(numfmt --to=iec-i $MAX_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize"))${CLEAR}" fi elif [[ "$MatchingFileName" == "$FileName" ]]; then if [ $VERIFY_LIBRARY -eq 1 ]; then LocalRequiresDownloadArray+=(1) AnyDownloads=1 echo -e "${GREEN_BOLD} ✓ Online Version Found${CLEAR}" else LocalRequiresDownloadArray+=(0) echo " ✗ No new update" fi else if [ $VERIFY_LIBRARY -eq 1 ]; then if [[ -f "$ZIMPath$FileName.sha256" ]]; then LocalRequiresDownloadArray+=(2) AnyDownloads=1 echo -e "${GREEN_REGULAR} Cached Checksum Found${CLEAR}" else echo " Checking for online checksum..." if wget -S --spider -q -O - "$BaseURL$MatchingCategory/$FileName.sha256" >/dev/null 2>&1; then LocalRequiresDownloadArray+=(1) AnyDownloads=1 echo -e "${GREEN_BOLD} ✓ Online Version Found${CLEAR}" else LocalRequiresDownloadArray+=(0) echo " ✗ Online Version Not Found" fi fi else if [ $FileTooSmall -eq 1 ]; then LocalRequiresDownloadArray+=(0) [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Update skipped (minimum: $(numfmt --to=iec-i $MIN_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize")). 
New version: $(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ *** Simulated *** Update skipped (minimum: $(numfmt --to=iec-i $MIN_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize")). New version: $(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')${CLEAR}" elif [ $FileTooLarge -eq 1 ]; then LocalRequiresDownloadArray+=(0) [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Update skipped (maximum: $(numfmt --to=iec-i $MAX_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize")). New version: $(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ *** Simulated *** Update skipped (maximum: $(numfmt --to=iec-i $MAX_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize")). New version: $(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')${CLEAR}" elif [ "$MatchedYear" -lt "$LocalYear" ]; then LocalRequiresDownloadArray+=(0) echo " ✗ No new update" elif [ "$MatchedYear" -eq "$LocalYear" ] && [ "$MatchedMonth" -le "$LocalMonth" ]; then LocalRequiresDownloadArray+=(0) echo " ✗ No new update" else LocalRequiresDownloadArray+=(1) AnyDownloads=1 echo -e "${GREEN_BOLD} ✓ Update found! --> $MatchedDate${CLEAR}" fi fi fi echo done echo -e "${YELLOW_BOLD}4. Downloading New ZIM(s)...${CLEAR}" echo -e "4. Downloading New ZIM(s)..." >> download.log echo # Let's clear out any possible duplicates # Let's Start the download process, but only if we have actual downloads to do. if [ $AnyDownloads -eq 1 ]; then for ((z = 0; z < ${#LocalZIMNameArray[@]}; z++)); do # Iterate through the download queue. 
[[ ${LocalRequiresDownloadArray[$z]} -eq 0 ]] && continue OldZIM=${LocalZIMNameArray[$z]} OldZIMPath=$ZIMPath$OldZIM echo -e "${BLUE_BOLD} Processing $OldZIM${CLEAR}" if [[ ${LocalRequiresDownloadArray[$z]} -eq 3 ]]; then ExpectedHash=$(grep -ioP "^[0-9a-f]{64}" <"$OldZIMPath.sha256") NewZIM="$OldZIM" NewZIMPath="$OldZIMPath" else mirror_search # Let's look for a mirror URL first. if [[ ${LocalRequiresDownloadArray[$z]} -eq 2 ]]; then ExpectedHash=$(grep -ioP "^[0-9a-f]{64}" <"$OldZIMPath.sha256") fi NewZIM=${RemoteFiles[${LocalZIMRemoteIndexArray[$z]}]} NewZIMPath=$ZIMPath$NewZIM fi FilePath=$ZIMPath$NewZIM # Set destination path with file name LockFilePath="$ZIMPath.~lock.$NewZIM" # Set destination path with file name TorrentFilePath="$ZIMPath$NewZIM.torrent" # Set destination path with file name RequiresDownload=0 if [ $VERIFY_LIBRARY -eq 0 ]; then if [[ $DOWNLOAD_METHOD -eq 2 ]]; then if [[ -f $NewZIM ]]; then [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : ZIM already exists on disk. Skipping download.${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** ZIM already exists on disk. Skipping download.${CLEAR}" echo elif [[ -f $TorrentFilePath ]]; then [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : Torrent already exists on disk. Skipping download.${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** Torrent already exists on disk. Skipping download.${CLEAR}" echo else [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : Torrent doesn't exist on disk. Downloading...${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** Torrent doesn't exist on disk. Downloading...${CLEAR}" RequiresDownload=1 fi elif [[ -f $NewZIM ]] && ! [[ -f $LockFilePath ]]; then # New ZIM already found, and no interruptions, we don't need to download it. [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : ZIM already exists on disk. 
Skipping download.${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** ZIM already exists on disk. Skipping download.${CLEAR}" echo else # New ZIM not found, so we'll go ahead and download it. RequiresDownload=1 if [[ -f $LockFilePath ]]; then [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : ZIM download was interrupted. Continuing...${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** ZIM download was interrupted. Continuing...${CLEAR}" else [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : ZIM doesn't exist on disk. Downloading...${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** ZIM doesn't exist on disk. Downloading...${CLEAR}" fi echo fi else # lockfile implies an incomplete download if [[ -f $LockFilePath ]]; then echo "Incomplete download detected, resuming" >>download.log echo -e "${YELLOW_REGULAR} Status: Incomplete download detected, resuming${CLEAR}" # [[ $IsMirror -eq 0 ]] && echo -e "${BLUE_BOLD} Download (direct) : $DownloadURL${CLEAR}" # [[ $IsMirror -eq 1 ]] && echo -e "${BLUE_BOLD} Download (mirror) : $DownloadURL${CLEAR}" RequiresDownload=1 elif [[ -f $TorrentFilePath ]]; then echo -e "${YELLOW_REGULAR} Status: Torrent exists, skipping${CLEAR}" echo "Skipping checksum for torrent : $OldZIM. If the download has been completed, please delete the .torrent if you want me to verify the checksum." 
>>download.log else # actually verify the file echo -e "${BLUE_REGULAR} Calculating checksum for : $OldZIM${CLEAR}" echo "Calculating checksum for : $OldZIM" >>download.log [[ ${LocalRequiresDownloadArray[$z]} -ne 2 ]] && echo "$ExpectedHash $OldZIM" >"$OldZIMPath.sha256" if [[ ${LocalRequiresDownloadArray[$z]} -ne 2 ]] && [[ $(du -b "$OldZIMPath" | grep -ioP "^\d+") -ne "$ExpectedSize" ]]; then RequiresDownload=1 if [[ $DEBUG -eq 0 ]]; then if [[ $SKIP_PURGE -eq 0 ]]; then echo -e "${RED_BOLD} ✗ Status : File size verification failed, removing corrupt file${CLEAR}" echo "✗ Status : File size verification failed, removing corrupt file" >>download.log rm "$OldZIMPath" else echo -e "${RED_BOLD} ✗ Status : File size verification failed but purge skipped${CLEAR}" echo "✗ Status : File size verification failed but purge skipped" >>download.log fi else [[ $SKIP_PURGE -eq 0 ]] && echo -e "${RED_BOLD} ✗ Status : *** Simulated *** File size verification failed, removing corrupt file ($FilePath)${CLEAR}" [[ $SKIP_PURGE -eq 1 ]] && echo -e "${RED_BOLD} ✗ Status : *** Simulated *** File size verification failed but purge skipped ($FilePath)${CLEAR}" fi echo elif [[ ${#ExpectedHash} -ne 64 ]]; then echo " This hash doesn't look quite right...skipping" elif (cd "$ZIMPath" && ! sha256sum --status -c "$OldZIM.sha256"); then # we checked a very old file, but we will choose to not replace it because we cannot, so we will leave it. 
A regular update will purge it naturally [[ ${LocalRequiresDownloadArray[$z]} -eq 2 ]] && echo -e "${RED_BOLD} Checksum failed, online file not found, continuing${CLEAR}" && echo && continue if [[ $DEBUG -eq 0 ]]; then if [[ $SKIP_PURGE -eq 0 ]]; then echo -e "${RED_BOLD} ✗ Status : Checksum failed, removing corrupt file${CLEAR}" echo "✗ Status : Checksum failed, removing corrupt file" >>download.log rm "$OldZIMPath" rm "$OldZIMPath.sha256" 2>/dev/null else echo -e "${RED_BOLD} ✗ Status : Checksum failed but purge was skipped${CLEAR}" echo "✗ Status : Checksum failed but purge was skipped" >>download.log echo continue fi else echo -e "${RED_BOLD} ✗ Status : *** Simulated *** Checksum failed, removing corrupt file ($FilePath)${CLEAR}" fi RequiresDownload=1 echo else echo -e "${GREEN_BOLD} ✓ Status : Checksum passed${CLEAR}" echo "✓ Status : Checksum passed" >>download.log [[ $DEBUG -eq 0 ]] && echo "End : $(date -u)" >>download.log [[ $DEBUG -eq 1 ]] && echo "End : $(date -u) *** Simulation ***" >>download.log # rm "$OldZIMPath.sha256" echo continue fi [[ ${LocalRequiresDownloadArray[$z]} -eq 2 ]] && continue # rm "$OldZIMPath.sha256" fi fi echo >>download.log [[ $DEBUG -eq 0 ]] && echo "Start : $(date -u)" >>download.log [[ $DEBUG -eq 1 ]] && echo "Start : $(date -u) *** Simulation ***" >>download.log # Here is where we actually download the files and log to the download.log file. if [[ $RequiresDownload -eq 1 ]]; then if [[ $DOWNLOAD_METHOD -eq 2 ]]; then FilePath="$FilePath.torrent" if [[ -f "$LockFilePath" ]]; then [[ $DEBUG -eq 0 ]] && wget -q --show-progress --progress=bar:force -c -O "$FilePath" "$DownloadURL" 2>&1 |& tee -a download.log # Download new ZIM [[ $DEBUG -eq 1 ]] && echo "Continue Download : $FilePath" >>download.log elif [[ -f $FilePath ]]; then # New ZIM already found, we don't need to download it. [[ $DEBUG -eq 1 ]] && echo "Download : New Torrent already exists on disk. Skipping download." 
>>download.log else # New ZIM not found, so we'll go ahead and download it. [[ $DEBUG -eq 0 ]] && wget -q --show-progress --progress=bar:force -c -O "$FilePath" "$DownloadURL" 2>&1 |& tee -a download.log # Download new ZIM [[ $DEBUG -eq 1 ]] && echo "Download : $FilePath" >>download.log fi continue else [[ $IsMirror -eq 0 ]] && echo -e "${BLUE_REGULAR} Download (direct) : $DownloadURL${CLEAR}" [[ $IsMirror -eq 1 ]] && echo -e "${BLUE_REGULAR} Download (mirror) : $DownloadURL${CLEAR}" echo >>download.log echo "=======================================================================" >>download.log echo "File : $NewZIM" >>download.log [[ $IsMirror -eq 0 ]] && echo "URL (direct) : $DownloadURL" >>download.log [[ $IsMirror -eq 1 ]] && echo "URL (mirror) : $DownloadURL" >>download.log echo >>download.log # Before we actually download, let's just check to see that it isn't already in the folder. if [[ -f "$LockFilePath" ]]; then [[ $DEBUG -eq 0 ]] && wget -q --show-progress --progress=bar:force -c -O "$FilePath" "$DownloadURL" 2>&1 |& tee -a download.log # Download new ZIM [[ $DEBUG -eq 1 ]] && echo "Continue Download : $FilePath" >>download.log elif [[ -f $FilePath ]]; then # New ZIM already found, we don't need to download it. [[ $DEBUG -eq 1 ]] && echo "Download : New ZIM already exists on disk. Skipping download." >>download.log else # New ZIM not found, so we'll go ahead and download it. 
                    [[ $DEBUG -eq 0 ]] && touch "$LockFilePath"
                    [[ $DEBUG -eq 0 ]] && wget -q --show-progress --progress=bar:force -c -O "$FilePath" "$DownloadURL" 2>&1 |& tee -a download.log # Download new ZIM
                    [[ $DEBUG -eq 1 ]] && echo "Download : $FilePath" >>download.log
                fi
            fi
        fi
        echo "$ExpectedHash $NewZIM" 2>/dev/null 1>"$NewZIMPath.sha256"
        if [[ $CALCULATE_CHECKSUM -eq 1 ]]; then
            echo -e "${BLUE_REGULAR} Calculating checksum for : $NewZIMPath${CLEAR}"
            if [[ $(du -b "$NewZIMPath" 2>/dev/null | grep -ioP "^\d+") -ne "$ExpectedSize" ]]; then
                if [[ $DEBUG -eq 0 ]]; then
                    echo -e "${RED_BOLD} ✗ Status : File size verification failed, removing corrupt file${CLEAR}"
                    echo "✗ Status : File size verification failed, removing corrupt file" >>download.log
                    rm "$NewZIMPath"
                else
                    echo -e "${GREEN_BOLD} ✓ *** Simulated *** Checksum passed${CLEAR}"
                fi
            elif [[ ${#ExpectedHash} -ne 64 ]]; then
                echo -e "${YELLOW_BOLD} This hash doesn't look quite right...skipping${CLEAR}"
            elif [[ $DEBUG -eq 0 ]] && (cd "$ZIMPath" && ! sha256sum --status -c "$NewZIM.sha256"); then
                echo -e "${RED_BOLD} ✗ Checksum failed, removing corrupt file${CLEAR}"
                rm "$NewZIMPath"
                touch "$NewZIMPath"
                DownloadFailed=1
            else
                if [[ $DEBUG -eq 0 ]]; then
                    echo -e "${GREEN_BOLD} ✓ Checksum passed${CLEAR}"
                else
                    echo -e "${GREEN_BOLD} ✓ *** Simulated *** Checksum passed${CLEAR}"
                fi
            fi
            # rm "$NewZIMPath.sha256"
            # rm "$LockFilePath"
            echo
        fi
        echo >> download.log
        [[ $DownloadFailed -eq 1 ]] && echo " !!! DOWNLOAD FAILED !!!" >>download.log

        # in all of these cases, we will not re-purge and will leave the lockfile so we know to resume later
        if [[ $DownloadFailed -eq 1 ]] || [[ $SKIP_PURGE -eq 1 ]] || [[ $VERIFY_LIBRARY -eq 1 ]]; then
            [[ $DEBUG -eq 0 ]] && echo "End : $(date -u)" >>download.log
            [[ $DEBUG -eq 1 ]] && echo "End : $(date -u) *** Simulation ***" >>download.log
            [[ $SKIP_PURGE -eq 1 ]] && [[ $DownloadFailed -ne 1 ]] && rm "$LockFilePath"
            [[ $VERIFY_LIBRARY -eq 1 ]] && [[ $RequiresDownload -eq 1 ]] && rm "$LockFilePath"
            continue
        fi

        [[ $RequiresDownload -eq 1 ]] && [[ $DEBUG -eq 0 ]] && rm "$LockFilePath"

        ########################################

        echo -e "${BLUE_REGULAR} Old : $OldZIM${CLEAR}"
        echo "Old : $OldZIM" >>download.log
        echo -e "${BLUE_BOLD} New : $NewZIM${CLEAR}"
        echo "New : $NewZIM" >>download.log
        # Check for the new ZIM on disk.
        if [[ -f "$NewZIMPath" ]]; then # New ZIM found
            if [[ $DEBUG -eq 0 ]]; then
                if [[ "$OldZIMPath" == "$NewZIMPath" ]]; then
                    echo -e "${GREEN_BOLD} ✓ Status : New ZIM downloaded successfully.${CLEAR}"
                    echo "✓ Status : New ZIM downloaded successfully." >>download.log
                    # rm "$OldZIMPath.sha256" 2>/dev/null # Purge old ZIM
                else
                    echo -e "${GREEN_BOLD} ✓ Status : New ZIM downloaded successfully. Old ZIM purged.${CLEAR}"
                    echo "✓ Status : New ZIM downloaded successfully. Old ZIM purged." >>download.log
                    [[ -f "$OldZIMPath" ]] && rm "$OldZIMPath" && rm "$OldZIMPath.sha256" 2>/dev/null # Purge old ZIM
                fi
            else
                echo -e "${GREEN_BOLD} ✓ Status : *** Simulated ***${CLEAR}"
                echo "✓ Status : *** Simulated ***" >>download.log
            fi
        else # New ZIM not found. Something went wrong, so we will skip this purge.
            if [[ $DEBUG -eq 0 ]]; then
                echo -e "${RED_BOLD} ✗ Status : New ZIM failed verification. Old ZIM not purged.${CLEAR}"
                echo "✗ Status : New ZIM failed verification. Old ZIM not purged."
>>download.log else if [[ $RequiresDownload -eq 1 ]]; then echo -e "${GREEN_BOLD} ✓ Status : *** Simulated *** New zim exists, old zim purged${CLEAR}" echo "✓ Status : *** Simulated *** New zim exists, old zim purged" >>download.log else echo -e "${YELLOW_BOLD} ✗ Status : *** Simulated *** Zim was skipped, and will not be purged${CLEAR}" echo "✗ Status : *** Simulated *** Zim not purged" >>download.log fi fi fi echo echo >>download.log ######################################### [[ $DEBUG -eq 0 ]] && echo "End : $(date -u)" >>download.log [[ $DEBUG -eq 1 ]] && echo "End : $(date -u) *** Simulation ***" >>download.log done if [[ $DEBUG -eq 0 ]]; then IFS=$'\n' HangingFileLocks=("$ZIMPath".~lock.*.zim) unset IFS if [[ ${#HangingFileLocks[@]} -gt 0 ]]; then echo -e "${YELLOW_BOLD}5. Cleaning up...${CLEAR}" echo "5. Cleaning up..." >> download.log echo for ((i = 0; i < ${#HangingFileLocks[@]}; i++)); do baseFileName=$(echo "${HangingFileLocks[$i]}" | grep -ioP "(?<=\.~lock\.).*$") echo -e " ${RED_BOLD}Found broken lock: ${HangingFileLocks[$i]}${CLEAR}" echo "Found broken lock: ${HangingFileLocks[$i]}" >> download.log if [[ -f "$ZIMPath$baseFileName" ]]; then echo -e " ${BLUE_BOLD}Found abandoned download: $ZIMPath$baseFileName${CLEAR}" echo -e "Found abandoned download: $ZIMPath$baseFileName" >> download.log rm "$ZIMPath$baseFileName" fi if [[ -f "$ZIMPath$baseFileName.sha256" ]]; then echo -e " ${BLUE_BOLD}Found abandoned checksum: $ZIMPath$baseFileName.sha256${CLEAR}" echo -e "Found abandoned checksum: $ZIMPath$baseFileName.sha256" >> download.log rm "$ZIMPath$baseFileName.sha256" fi rm "${HangingFileLocks[$i]}" echo done fi fi else echo -e "${GREEN_REGULAR} ✓ Download: Nothing to download.${CLEAR}" echo "✓ Download: Nothing to download." >> download.log echo fi if [[ "$CALLEDSCRIPTFILE" == "kiwix-zim.sh" ]]; then echo "WARNING: kiwix-zim has been deprecated in favor of kiwix-zim-updater. 
kiwix-zim will be a hard link to the new script name until at least 2025-01-01.\n This exit code is nonzero, but the script has not errored out. This is simply so you may notice and switch to using kiwix-zim-updater " >> download.log
    exit 2
fi
kiwix-zim-updater-3.3/kiwix-zim.sh
#!/bin/bash

VER="3.3"

# This array will contain all of the local zims, with the file extension
LocalZIMArray=()

# This array will contain all of the local zims, without the file extension
LocalZIMNameArray=()

# This array will map the local zim to the index in the remote arrays that contains the same base file name
LocalZIMRemoteIndexArray=()

# This array remembers, per local zim, whether it should be processed in the download loop (0 means skip; nonzero values select how it is handled)
LocalRequiresDownloadArray=()

# After updating, this array will be used to store hanging locks and to deal with them
HangingFileLocks=();

# This array stores the file names that kiwix has to offer, with .zim extensions
RemoteFiles=()

# Ditto, without the YYYY-MM (note a trailing _)
Basenames=()

# Contains the absolute path to this file, from /zims/
RemotePaths=()

# Contains the folder this file is in relative to /zims/
RemoteCategory=()

# Set Script Strings
SCRIPT="$(readlink -f "$0")"
SCRIPTFILE="$(basename "$SCRIPT")"
SCRIPTPATH="$(dirname "$SCRIPT")"
CALLEDSCRIPTNAME="$0"
CALLEDSCRIPTFILE="$(basename "$CALLEDSCRIPTNAME")"
CALLEDSCRIPTPATH="$(dirname "$CALLEDSCRIPTNAME")"
ARGS=("$@")
BRANCH="main"
SKIP_UPDATE=0
DEBUG=1 # This forces the script to default to "dry-run/simulation mode"
MIN_SIZE=0
MAX_SIZE=0
CALCULATE_CHECKSUM=0
CHECKSUM_FILES=1
VERIFY_LIBRARY=0
FORCE_FETCH_INDEX=0
DOWNLOAD_METHOD=1 # 1: web 2: torrent
BaseURL="https://download.kiwix.org/zim/"
ZIMPath=""

RED_REGULAR="\033[0;31m"
RED_BOLD="\033[1;31m"
YELLOW_REGULAR="\033[0;33m"
YELLOW_BOLD="\033[1;33m"
GREEN_REGULAR="\033[0;32m"
GREEN_BOLD="\033[1;32m"
BLUE_REGULAR="\033[0;34m"
BLUE_BOLD="\033[1;34m"
CLEAR="\033[0m" # This will ask the api what files it has to offer and store them in arrays master_scrape() { unset RemoteFiles unset Basenames unset RemotePaths unset RemoteCategory indexIsValid=1 if [[ ! -f kiwix-index ]]; then indexIsValid=0 else indexDate="$(head -1 "kiwix-index")" if [[ -z "${indexDate}" ]] || [[ "$(date -u -d "$indexDate" +%s)" -lt "$(date -u -d "1 day ago" +%s)" ]]; then indexIsValid=0 fi fi if [[ FORCE_FETCH_INDEX -eq 1 ]] || [[ $indexIsValid -eq 0 ]]; then # both write the file timestamp to the index file and save all of the links to RawLibrary RawLibrary="$(wget --show-progress -q -O - "https://library.kiwix.org/catalog/v2/entries?count=-1" | tee --output-error=warn-nopipe >(grep -ioP "(?<=)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?=Z)" | head -1 > kiwix-index) | grep -i 'application/x-zim' | grep -ioP "^\s+\K.*$")" echo "$RawLibrary" >> kiwix-index else RawLibrary=$(grep -i '> download.log exit 0 else echo -e "${GREEN_BOLD} ✓ Found ${#RemoteFiles[@]} files online" echo "✓ Found ${#RemoteFiles[@]} files online" >> download.log fi # Housekeeping... unset RawLibrary unset hrefs } # self_update - Script Self-Update Function self_update() { echo -e "${YELLOW_BOLD}1. Checking for Script Updates...${CLEAR}" echo "1. Checking for Script Updates..." >> download.log echo # Check if script path is a git clone. # If true, then check for update. # If false, skip self-update check/funciton. if [ $SKIP_UPDATE -eq 1 ]; then echo -e "${YELLOW_REGULAR} Check Skipped${CLEAR}" echo "Check Skipped" >> download.log elif [[ -d "$SCRIPTPATH/.git" ]]; then echo -e "${GREEN_BOLD} ✓ Git Clone Detected: Checking Script Version...${CLEAR}" echo "✓ Git Clone Detected: Checking Script Version..." 
>> download.log cd "$SCRIPTPATH" || exit 1 [[ $(timeout 1s git rev-parse --abbrev-ref HEAD) != "$BRANCH" ]] && echo -e "${YELLOW_BOLD} You appear to be on a different branch so I will assume you are developing and do not want an update${CLEAR}" && echo && return timeout 1s git fetch --quiet timeout 1s git diff --quiet --exit-code "origin/$BRANCH" "$SCRIPTFILE" [ $? -eq 1 ] && { echo -e "${RED_REGULAR} ✗ Version: Mismatched${CLEAR}" echo "✗ Version: Mismatched" >> download.log echo echo -e "${YELLOW_BOLD}1a. Fetching Update...${CLEAR}" echo "1a. Fetching Update..." >> download.log echo if [ -n "$(git status --porcelain)" ]; then git stash push -m 'local changes stashed before self update' --quiet fi git pull --force --quiet git checkout $BRANCH --quiet git pull --force --quiet echo -e "${GREEN_BOLD} ✓ Update Complete. Running New Version. Standby...${CLEAR}" echo "✓ Update Complete. Running New Version. Standby..." >> download.log sleep 3 cd - >/dev/null || exit 1 exec "$CALLEDSCRIPTNAME" "${ARGS[@]}" # Exit this old instance of the script exit 1 } echo -e "${GREEN_BOLD} ✓ Version: Current${CLEAR}" else echo -e "${RED_REGULAR} ✗ Git Clone Not Detected: Skipping Update Check${CLEAR}" fi echo } # usage_example - Show Usage and Exit usage_example() { echo 'Usage: ./kiwix-zim-updater.sh /full/path/' echo '' echo ' /full/path/ Full path to ZIM directory' echo '' echo 'Universal Options:' echo ' -h, --help Show this usage and exit.' echo ' -u, --skip-update Skips checking for script updates (very useful for development).' echo ' -g, --get-index Forces using remote index rather than cached index. Cache auto clears after one day' echo ' -n , --min-size Minimum ZIM Size to be downloaded.' echo ' Specify units include M Mi G Gi, etc. See `man numfmt`' echo ' -x , --max-size Maximum ZIM Size to be downloaded.' echo ' Specify units include M Mi G Gi, etc. See `man numfmt`' echo ' -S, --no-sha Disables saving the zim checksum for future reference. 
Does not delete present checksums.' echo ' ' echo 'Action Method Options:' echo ' -w, --web Downloads zims over http(s).' echo ' -t, --torrent Downloads `.torrent` files. REQUIRES ADDITIONAL SOFTWARE TO EXECUTE DOWNLOAD.' echo ' ' echo ' -f, --verify-library Verifies that the entire library has the correct checksums as found online.' echo ' Expected behavior is to create sha256 files during a normal run so this option can be used at a later date without internet.' echo ' Disable this using -S' echo '' echo 'Web Download Options:' echo ' -c, --calculate-checksum Verifies that the downloaded files were not corrupted, but can take a while for large downloads.' echo ' -p, --skip-purge Skips purging any replaced ZIMs.' echo ' -l , --location Country Code to prefer mirrors from' echo ' -d, --disable-dry-run Dry-Run Override.' echo ' *** Caution ***' echo exit 0 } # flags - Flag and ZIM Processing Functions flags() { echo -e "${YELLOW_BOLD}2. Preprocessing...${CLEAR}" echo "2. Preprocessing..." >> download.log echo echo -e "${BLUE_BOLD} -Validating ZIM directory...${CLEAR}" echo "Validating ZIM directory..." >> download.log # Let's identify which argument is the ZIM directory path and if it's an actual directory. if [[ -d ${1} ]]; then if [[ -w ${1} ]]; then ZIMPath=$1 else ZIMPath=$1 echo -e "${RED_REGULAR} ✗ Cannot write to '${1}', continuing in dry-run${CLEAR}" echo "✗ Cannot write to '${1}', continuing in dry-run" >> download.log echo DEBUG=1 fi else # Um... no ZIM directory path provided? Okay, let's show the usage and exit. if [[ -z ${1} ]]; then echo -e "${RED_REGULAR} ✗ Kiwix ZIM Directory not provided${CLEAR}" echo "✗ Kiwix ZIM Directory not provided" >> download.log else echo -e "${RED_REGULAR} ✗ '$1' is not a directory${CLEAR}" echo "✗ '$1' is not a directory" >> download.log fi echo usage_example fi # Check for and add if missing, trailing slash. [[ "${ZIMPath}" != */ ]] && ZIMPath="${ZIMPath}/" # Now we need to check for ZIM files. 
shopt -s nullglob # This is in case there are no matching files # Load all found ZIM(s) w/path into LocalZIMArray IFS=$'\n' read -r -d '' -a LocalZIMArray < <(ls -1 "$ZIMPath" | grep -iP "\.zim$") unset IFS for index in "${!LocalZIMArray[@]}" ; do duplicated=0 myBasename=$(echo "${LocalZIMArray[$index]}" | grep -ioP "^.*(?=_\d{4}-\d{2}\.zim$)") for scanIndex in "${!LocalZIMArray[@]}"; do if [[ -f "${ZIMPath}.~lock.${LocalZIMArray[$index]}" ]]; then if [[ $index -le $scanIndex ]] || [[ -f "${ZIMPath}.~lock.${LocalZIMArray[$scanIndex]}" ]]; then continue; fi else if [[ $index -ge $scanIndex ]] || [[ -f "${ZIMPath}.~lock.${LocalZIMArray[$scanIndex]}" ]]; then continue; fi fi scanBasename=$(echo "${LocalZIMArray[$scanIndex]}" | grep -ioP "^.*(?=_\d{4}-\d{2}\.zim$)") if [[ "$myBasename" == "$scanBasename" ]]; then if [[ -f "${ZIMPath}.~lock.${LocalZIMArray[$index]}" ]]; then echo "Disregarding ${LocalZIMArray[$index]} because it was interrupted by ${LocalZIMArray[$scanIndex]}" >> download.log elif [[ -f "${ZIMPath}${LocalZIMArray[$index]}.torrent" ]]; then echo "Disregarding ${LocalZIMArray[$index]} because new torrent exists ${LocalZIMArray[$scanIndex]}" >> download.log else echo "Disregarding ${LocalZIMArray[$index]} because it is shadowed by ${LocalZIMArray[$scanIndex]}" >> download.log fi duplicated=1 break fi done [[ $duplicated -eq 1 ]] && unset -v 'LocalZIMArray[$index]' done LocalZIMArray=("${LocalZIMArray[@]}") # Check that ZIM(s) were actually found/loaded. if [ ${#LocalZIMArray[@]} -eq 0 ]; then # No ZIM(s) were found in the directory... I guess there's nothing else for us to do, so we'll Exit. echo -e "${RED_REGULAR} ✗ No ZIMs found. Exiting...${CLEAR}" exit 0 else echo -e "${GREEN_BOLD} ✓ Valid ZIM Directory ${CLEAR}" fi echo echo -e "${BLUE_BOLD} -Building online ZIM list...${CLEAR}" # Build online ZIM list. 
  master_scrape
  echo

  # Populate ZIM arrays from found ZIM(s)
  echo -e "${BLUE_BOLD} -Parsing ZIM(s)...${CLEAR}"

  for ((i = 0; i < ${#LocalZIMArray[@]}; i++)); do # Loop through local ZIM(s).
    LocalZIMNameArray[$i]=$(basename "${LocalZIMArray[$i]}") # Extract file name.
    filename=$(basename "${LocalZIMArray[$i]}" | grep -ioP "[\w:\/\-.]+(?=\d{4}-\d{2}\.zim$)") # Extract file name without its date part.
    # IFS='_' read -ra fields <<< "${LocalZIMNameArray[$i]}"; unset IFS # Break the filename into fields delimited by the underscore '_'

    # Search MasterZIMArray for the current local ZIM to discover the online Root (directory) for the URL
    for ((z = 0; z < ${#Basenames[@]}; z++)); do
      if [[ ${Basenames[$z]} == "$filename" ]]; then # Match Found (ignore the filename datepart).
        LocalZIMRemoteIndexArray[$i]="$z"
        break
      else # No Match Found.
        LocalZIMRemoteIndexArray[$i]="-1"
      fi
    done

    if [[ ${LocalZIMRemoteIndexArray[$i]} -eq -1 ]]; then
      echo -e "${RED_REGULAR} ✗ ${LocalZIMNameArray[$i]} No online match found.${CLEAR}"
    else
      echo -e "${GREEN_BOLD} ✓ ${LocalZIMNameArray[$i]} [${RemoteCategory[${LocalZIMRemoteIndexArray[$i]}]}]${CLEAR}"
    fi
  done

  echo
  echo -e "${GREEN_REGULAR} ${#LocalZIMNameArray[*]} ZIM(s) found.${CLEAR}"
}

# mirror_search - Find ZIM URL Priority #1 mirror from meta4 Function
mirror_search() {
  IsMirror=0
  DownloadURL=""
  RemotePath="${RemotePaths[${LocalZIMRemoteIndexArray[$z]}]}"
  ExpectedSize="${FileSizes[${LocalZIMRemoteIndexArray[$z]}]}"
  # If we need the checksum, we need a link and the hash, which we can get both by using .meta4, otherwise we only need
  # Silently fetch (via wget) the associated meta4 xml and extract the mirror URL marked priority="1"
  MetaInfo=$(wget -q -O - "$BaseURL$RemotePath.meta4?country=$COUNTRY_CODE")
  ExpectedSize=$(echo "$MetaInfo" | grep '' | grep -Po '\d+')
  ExpectedHash=$(echo "$MetaInfo" | grep '' | grep -Poi '(?<="sha-256">)[a-f\d]{64}(?=<)')

  # still run the last wget so that the logic is simple for verification.
  if [[ $DOWNLOAD_METHOD -eq 2 ]]; then
    DownloadURL="$BaseURL$RemotePath.torrent" # Set the torrent URL as our download URL
    return
  fi

  RawMirror=$(echo "$MetaInfo" | grep 'priority="1"' | grep -Po 'https?://[^ ")]+(?=)')

  # Check that we actually got a URL (this could probably be done better). If no mirror URL, default back to direct URL.
  if [[ $RawMirror == *"http"* ]]; then # Mirror URL found
    DownloadURL="$RawMirror" # Set the mirror URL as our download URL
    IsMirror=1
  else # Mirror URL not found
    DownloadURL="$BaseURL$RemotePath" # Set the direct download URL as our download URL
  fi
}

#########################
# Begin Script Execute
#########################

while [[ $# -gt 0 ]]; do
  case $1 in
  -h | --help)
    usage_example
    ;;
  -d | --disable-dry-run)
    DEBUG=0
    shift # discard argument
    ;;
  -v | --version)
    echo "$VER"
    exit 0
    ;;
  -p | --skip-purge)
    SKIP_PURGE=1
    shift # discard argument
    ;;
  -n | --min-size)
    shift # discard -n argument
    MIN_SIZE=$(numfmt --from=auto "$1") # convert passed arg to bytes
    shift # discard value
    ;;
  -x | --max-size)
    shift # discard -x argument
    MAX_SIZE=$(numfmt --from=auto "$1") # convert passed arg to bytes
    shift # discard value
    ;;
  -l | --location)
    shift # discard -l argument
    if [[ "$1" =~ ^[A-Z]{2}$ ]]; then
      COUNTRY_CODE=$1 # store the two-letter, upper-case country code
    else
      COUNTRY_CODE=""
      echo "Invalid country code, falling back to default kiwix behavior" >> download.log
    fi
    shift # discard value
    ;;
  -c | --calculate-checksum)
    CALCULATE_CHECKSUM=1
    shift
    ;;
  -f | --verify-library | --verfiy-library) # the misspelled form is kept so existing invocations still work
    VERIFY_LIBRARY=1
    CALCULATE_CHECKSUM=1
    shift
    ;;
  -u | --skip-update)
    SKIP_UPDATE=1
    shift
    ;;
  -g | --get-index)
    FORCE_FETCH_INDEX=1
    shift
    ;;
  -w | --web)
    DOWNLOAD_METHOD=1
    shift
    ;;
  -t | --torrent)
    DOWNLOAD_METHOD=2
    shift
    ;;
  -S | --no-sha)
    CHECKSUM_FILES=0
    shift
    ;;
  *)
    # We can either parse the arg here, or just tuck it away for safekeeping
    POSITIONAL_ARGS+=("$1") # save positional arg
    shift # past argument
    ;;
  esac
done

set -- "${POSITIONAL_ARGS[@]}" # restore positional parameters that we
# skipped earlier

clear # Clear screen

if [[ $CALCULATE_CHECKSUM -ne 0 ]] && [[ $DOWNLOAD_METHOD -ne 1 ]]; then
  echo -e "${RED_BOLD}Calculating Checksum not available with torrenting. Aborting.${CLEAR}"
fi

# Display Header
echo "=========================================="
echo " kiwix-zim-updater"
if [[ "$CALLEDSCRIPTFILE" == "kiwix-zim.sh" ]]; then
  echo " WARNING: kiwix-zim.sh has been deprecated "
  echo " in favor of kiwix-zim-updater "
  echo "WARNING: kiwix-zim has been deprecated in favor of kiwix-zim-updater. kiwix-zim will be a hard link to the new script name until at least 2025-01-01" >> download.log
fi
echo " download.kiwix.org ZIM Updater"
echo
echo " v$VER by DocDrydenn and jojo2357"
echo "=========================================="
echo
echo " DRY-RUN/SIMULATION"
[[ $DEBUG -eq 1 ]] && echo " - ENABLED -"
[[ $DEBUG -eq 1 ]] && echo
[[ $DEBUG -eq 1 ]] && echo " Use '-d' to disable."
[[ $DEBUG -eq 0 ]] && echo " - DISABLED -"
[[ $DEBUG -eq 0 ]] && echo
[[ $DEBUG -eq 0 ]] && echo " !!! Caution !!!"
echo
echo "=========================================="
echo

# First, Self-Update Check.
# Shouldn't this be first? It does not depend on anything else and resets everything, so we may as well reset before getting all invested.
self_update

# Second, Flag Check.
flags "$@"

echo
echo -e "${YELLOW_BOLD}3. Processing ZIM(s)...${CLEAR}"
echo "3. Processing ZIM(s)..." \
>> download.log
echo

AnyDownloads=0
for ((i = 0; i < ${#LocalZIMNameArray[@]}; i++)); do
  RemoteIndex=${LocalZIMRemoteIndexArray[$i]}
  FileName=${LocalZIMNameArray[$i]} # Set before first use so the no-match branch sees the current file.
  if [[ $RemoteIndex -eq -1 ]]; then
    if [[ $VERIFY_LIBRARY -eq 1 ]] && [[ -f "$ZIMPath$FileName.sha256" ]]; then
      LocalRequiresDownloadArray+=(3)
      AnyDownloads=1
      echo -e "${BLUE_BOLD} - $FileName:${CLEAR}"
      echo -e "${GREEN_REGULAR} Cached Checksum Found${CLEAR}"
    else
      LocalRequiresDownloadArray+=(0)
    fi
    continue
  fi

  echo -e "${BLUE_BOLD} - $FileName:${CLEAR}"
  [[ -f "$ZIMPath.~lock.$FileName" ]] && echo -e "${YELLOW_REGULAR} Incomplete download detected\n${GREEN_BOLD} ✓ Online Version Found${CLEAR}\n" && LocalRequiresDownloadArray+=(1) && AnyDownloads=1 && continue

  MatchingSize=${FileSizes[$RemoteIndex]}
  MatchingFileName=${RemoteFiles[$RemoteIndex]}
  MatchingFullPath=${RemotePaths[$RemoteIndex]}
  MatchingCategory=${RemoteCategory[$RemoteIndex]}

  [[ -f "$ZIMPath$MatchingFileName.torrent" ]] && [[ ! -f "$ZIMPath$MatchingFileName" ]] && echo -e "${YELLOW_REGULAR} Torrent already downloaded\n${GREEN_BOLD} ✓ Online Version Found${CLEAR}\n" && LocalRequiresDownloadArray+=(0) && continue

  MatchedDate="$(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')"
  MatchedYear="$(echo "$MatchedDate" | grep -oP '\d{4}(?=-\d{2})')"
  MatchedMonth="$(echo "$MatchedDate" | grep -oP '(?<=\d{4}-)\d{2}')"

  LocalDate="$(echo "$FileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')"
  LocalYear="$(echo "$LocalDate" | grep -oP '\d{4}(?=-\d{2})')"
  LocalMonth="$(echo "$LocalDate" | grep -oP '(?<=\d{4}-)\d{2}')"

  FileTooSmall=0
  [[ $MIN_SIZE -gt 0 ]] && [[ $MatchingSize -lt $MIN_SIZE ]] && FileTooSmall=1
  FileTooLarge=0
  [[ $MAX_SIZE -gt 0 ]] && [[ $MatchingSize -gt $MAX_SIZE ]] && FileTooLarge=1
  FileSizeAcceptable=0
  [ $FileTooSmall -eq 0 ] && [ $FileTooLarge -eq 0 ] && FileSizeAcceptable=1

  if [ $VERIFY_LIBRARY -eq 1 ] && [ $FileSizeAcceptable -eq 0 ]; then
    if [ $FileTooSmall -eq 1 ]; then
      LocalRequiresDownloadArray+=(0)
      [[ $DEBUG -eq 0 ]] && echo -e \
"${GREEN_REGULAR} ✓ Verification skipped due to file size (minimum: $(numfmt --to=iec-i $MIN_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize"))${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ *** Simulated *** Verification skipped due to file size (minimum: $(numfmt --to=iec-i $MIN_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize"))${CLEAR}" elif [ $FileTooLarge -eq 1 ]; then LocalRequiresDownloadArray+=(0) [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Verification skipped due to file size (maximum: $(numfmt --to=iec-i $MAX_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize"))${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ *** Simulated *** Verification skipped due to file size (maximum: $(numfmt --to=iec-i $MAX_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize"))${CLEAR}" fi elif [[ "$MatchingFileName" == "$FileName" ]]; then if [ $VERIFY_LIBRARY -eq 1 ]; then LocalRequiresDownloadArray+=(1) AnyDownloads=1 echo -e "${GREEN_BOLD} ✓ Online Version Found${CLEAR}" else LocalRequiresDownloadArray+=(0) echo " ✗ No new update" fi else if [ $VERIFY_LIBRARY -eq 1 ]; then if [[ -f "$ZIMPath$FileName.sha256" ]]; then LocalRequiresDownloadArray+=(2) AnyDownloads=1 echo -e "${GREEN_REGULAR} Cached Checksum Found${CLEAR}" else echo " Checking for online checksum..." if wget -S --spider -q -O - "$BaseURL$MatchingCategory/$FileName.sha256" >/dev/null 2>&1; then LocalRequiresDownloadArray+=(1) AnyDownloads=1 echo -e "${GREEN_BOLD} ✓ Online Version Found${CLEAR}" else LocalRequiresDownloadArray+=(0) echo " ✗ Online Version Not Found" fi fi else if [ $FileTooSmall -eq 1 ]; then LocalRequiresDownloadArray+=(0) [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Update skipped (minimum: $(numfmt --to=iec-i $MIN_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize")). 
New version: $(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ *** Simulated *** Update skipped (minimum: $(numfmt --to=iec-i $MIN_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize")). New version: $(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')${CLEAR}" elif [ $FileTooLarge -eq 1 ]; then LocalRequiresDownloadArray+=(0) [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Update skipped (maximum: $(numfmt --to=iec-i $MAX_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize")). New version: $(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ *** Simulated *** Update skipped (maximum: $(numfmt --to=iec-i $MAX_SIZE), download size: $(numfmt --to=iec-i "$MatchingSize")). New version: $(echo "$MatchingFileName" | grep -oP '\d{4}-\d{2}(?=\.zim$)')${CLEAR}" elif [ "$MatchedYear" -lt "$LocalYear" ]; then LocalRequiresDownloadArray+=(0) echo " ✗ No new update" elif [ "$MatchedYear" -eq "$LocalYear" ] && [ "$MatchedMonth" -le "$LocalMonth" ]; then LocalRequiresDownloadArray+=(0) echo " ✗ No new update" else LocalRequiresDownloadArray+=(1) AnyDownloads=1 echo -e "${GREEN_BOLD} ✓ Update found! --> $MatchedDate${CLEAR}" fi fi fi echo done echo -e "${YELLOW_BOLD}4. Downloading New ZIM(s)...${CLEAR}" echo -e "4. Downloading New ZIM(s)..." >> download.log echo # Let's clear out any possible duplicates # Let's Start the download process, but only if we have actual downloads to do. if [ $AnyDownloads -eq 1 ]; then for ((z = 0; z < ${#LocalZIMNameArray[@]}; z++)); do # Iterate through the download queue. 
    [[ ${LocalRequiresDownloadArray[$z]} -eq 0 ]] && continue

    OldZIM=${LocalZIMNameArray[$z]}
    OldZIMPath=$ZIMPath$OldZIM

    echo -e "${BLUE_BOLD} Processing $OldZIM${CLEAR}"

    if [[ ${LocalRequiresDownloadArray[$z]} -eq 3 ]]; then
      ExpectedHash=$(grep -ioP "^[0-9a-f]{64}" <"$OldZIMPath.sha256")
      NewZIM="$OldZIM"
      NewZIMPath="$OldZIMPath"
    else
      mirror_search # Let's look for a mirror URL first.
      if [[ ${LocalRequiresDownloadArray[$z]} -eq 2 ]]; then
        ExpectedHash=$(grep -ioP "^[0-9a-f]{64}" <"$OldZIMPath.sha256")
      fi
      NewZIM=${RemoteFiles[${LocalZIMRemoteIndexArray[$z]}]}
      NewZIMPath=$ZIMPath$NewZIM
    fi

    FilePath=$ZIMPath$NewZIM # Set destination path with file name
    LockFilePath="$ZIMPath.~lock.$NewZIM" # Set lock file path for this download
    TorrentFilePath="$ZIMPath$NewZIM.torrent" # Set torrent file path for this download

    RequiresDownload=0
    if [ $VERIFY_LIBRARY -eq 0 ]; then
      if [[ $DOWNLOAD_METHOD -eq 2 ]]; then
        # Check the ZIM inside the ZIM directory, not relative to the current working directory.
        if [[ -f $FilePath ]]; then
          [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : ZIM already exists on disk. Skipping download.${CLEAR}"
          [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** ZIM already exists on disk. Skipping download.${CLEAR}"
          echo
        elif [[ -f $TorrentFilePath ]]; then
          [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : Torrent already exists on disk. Skipping download.${CLEAR}"
          [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** Torrent already exists on disk. Skipping download.${CLEAR}"
          echo
        else
          [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : Torrent doesn't exist on disk. Downloading...${CLEAR}"
          [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** Torrent doesn't exist on disk. Downloading...${CLEAR}"
          RequiresDownload=1
        fi
      elif [[ -f $FilePath ]] && ! [[ -f $LockFilePath ]]; then # New ZIM already found, and no interruptions, we don't need to download it.
        [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : ZIM already exists on disk. 
Skipping download.${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** ZIM already exists on disk. Skipping download.${CLEAR}" echo else # New ZIM not found, so we'll go ahead and download it. RequiresDownload=1 if [[ -f $LockFilePath ]]; then [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : ZIM download was interrupted. Continuing...${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** ZIM download was interrupted. Continuing...${CLEAR}" else [[ $DEBUG -eq 0 ]] && echo -e "${GREEN_REGULAR} ✓ Status : ZIM doesn't exist on disk. Downloading...${CLEAR}" [[ $DEBUG -eq 1 ]] && echo -e "${GREEN_REGULAR} ✓ Status : *** Simulated *** ZIM doesn't exist on disk. Downloading...${CLEAR}" fi echo fi else # lockfile implies an incomplete download if [[ -f $LockFilePath ]]; then echo "Incomplete download detected, resuming" >>download.log echo -e "${YELLOW_REGULAR} Status: Incomplete download detected, resuming${CLEAR}" # [[ $IsMirror -eq 0 ]] && echo -e "${BLUE_BOLD} Download (direct) : $DownloadURL${CLEAR}" # [[ $IsMirror -eq 1 ]] && echo -e "${BLUE_BOLD} Download (mirror) : $DownloadURL${CLEAR}" RequiresDownload=1 elif [[ -f $TorrentFilePath ]]; then echo -e "${YELLOW_REGULAR} Status: Torrent exists, skipping${CLEAR}" echo "Skipping checksum for torrent : $OldZIM. If the download has been completed, please delete the .torrent if you want me to verify the checksum." 
>>download.log else # actually verify the file echo -e "${BLUE_REGULAR} Calculating checksum for : $OldZIM${CLEAR}" echo "Calculating checksum for : $OldZIM" >>download.log [[ ${LocalRequiresDownloadArray[$z]} -ne 2 ]] && echo "$ExpectedHash $OldZIM" >"$OldZIMPath.sha256" if [[ ${LocalRequiresDownloadArray[$z]} -ne 2 ]] && [[ $(du -b "$OldZIMPath" | grep -ioP "^\d+") -ne "$ExpectedSize" ]]; then RequiresDownload=1 if [[ $DEBUG -eq 0 ]]; then if [[ $SKIP_PURGE -eq 0 ]]; then echo -e "${RED_BOLD} ✗ Status : File size verification failed, removing corrupt file${CLEAR}" echo "✗ Status : File size verification failed, removing corrupt file" >>download.log rm "$OldZIMPath" else echo -e "${RED_BOLD} ✗ Status : File size verification failed but purge skipped${CLEAR}" echo "✗ Status : File size verification failed but purge skipped" >>download.log fi else [[ $SKIP_PURGE -eq 0 ]] && echo -e "${RED_BOLD} ✗ Status : *** Simulated *** File size verification failed, removing corrupt file ($FilePath)${CLEAR}" [[ $SKIP_PURGE -eq 1 ]] && echo -e "${RED_BOLD} ✗ Status : *** Simulated *** File size verification failed but purge skipped ($FilePath)${CLEAR}" fi echo elif [[ ${#ExpectedHash} -ne 64 ]]; then echo " This hash doesn't look quite right...skipping" elif (cd "$ZIMPath" && ! sha256sum --status -c "$OldZIM.sha256"); then # we checked a very old file, but we will choose to not replace it because we cannot, so we will leave it. 
A regular update will purge it naturally [[ ${LocalRequiresDownloadArray[$z]} -eq 2 ]] && echo -e "${RED_BOLD} Checksum failed, online file not found, continuing${CLEAR}" && echo && continue if [[ $DEBUG -eq 0 ]]; then if [[ $SKIP_PURGE -eq 0 ]]; then echo -e "${RED_BOLD} ✗ Status : Checksum failed, removing corrupt file${CLEAR}" echo "✗ Status : Checksum failed, removing corrupt file" >>download.log rm "$OldZIMPath" rm "$OldZIMPath.sha256" 2>/dev/null else echo -e "${RED_BOLD} ✗ Status : Checksum failed but purge was skipped${CLEAR}" echo "✗ Status : Checksum failed but purge was skipped" >>download.log echo continue fi else echo -e "${RED_BOLD} ✗ Status : *** Simulated *** Checksum failed, removing corrupt file ($FilePath)${CLEAR}" fi RequiresDownload=1 echo else echo -e "${GREEN_BOLD} ✓ Status : Checksum passed${CLEAR}" echo "✓ Status : Checksum passed" >>download.log [[ $DEBUG -eq 0 ]] && echo "End : $(date -u)" >>download.log [[ $DEBUG -eq 1 ]] && echo "End : $(date -u) *** Simulation ***" >>download.log # rm "$OldZIMPath.sha256" echo continue fi [[ ${LocalRequiresDownloadArray[$z]} -eq 2 ]] && continue # rm "$OldZIMPath.sha256" fi fi echo >>download.log [[ $DEBUG -eq 0 ]] && echo "Start : $(date -u)" >>download.log [[ $DEBUG -eq 1 ]] && echo "Start : $(date -u) *** Simulation ***" >>download.log # Here is where we actually download the files and log to the download.log file. if [[ $RequiresDownload -eq 1 ]]; then if [[ $DOWNLOAD_METHOD -eq 2 ]]; then FilePath="$FilePath.torrent" if [[ -f "$LockFilePath" ]]; then [[ $DEBUG -eq 0 ]] && wget -q --show-progress --progress=bar:force -c -O "$FilePath" "$DownloadURL" 2>&1 |& tee -a download.log # Download new ZIM [[ $DEBUG -eq 1 ]] && echo "Continue Download : $FilePath" >>download.log elif [[ -f $FilePath ]]; then # New ZIM already found, we don't need to download it. [[ $DEBUG -eq 1 ]] && echo "Download : New Torrent already exists on disk. Skipping download." 
>>download.log else # New ZIM not found, so we'll go ahead and download it. [[ $DEBUG -eq 0 ]] && wget -q --show-progress --progress=bar:force -c -O "$FilePath" "$DownloadURL" 2>&1 |& tee -a download.log # Download new ZIM [[ $DEBUG -eq 1 ]] && echo "Download : $FilePath" >>download.log fi continue else [[ $IsMirror -eq 0 ]] && echo -e "${BLUE_REGULAR} Download (direct) : $DownloadURL${CLEAR}" [[ $IsMirror -eq 1 ]] && echo -e "${BLUE_REGULAR} Download (mirror) : $DownloadURL${CLEAR}" echo >>download.log echo "=======================================================================" >>download.log echo "File : $NewZIM" >>download.log [[ $IsMirror -eq 0 ]] && echo "URL (direct) : $DownloadURL" >>download.log [[ $IsMirror -eq 1 ]] && echo "URL (mirror) : $DownloadURL" >>download.log echo >>download.log # Before we actually download, let's just check to see that it isn't already in the folder. if [[ -f "$LockFilePath" ]]; then [[ $DEBUG -eq 0 ]] && wget -q --show-progress --progress=bar:force -c -O "$FilePath" "$DownloadURL" 2>&1 |& tee -a download.log # Download new ZIM [[ $DEBUG -eq 1 ]] && echo "Continue Download : $FilePath" >>download.log elif [[ -f $FilePath ]]; then # New ZIM already found, we don't need to download it. [[ $DEBUG -eq 1 ]] && echo "Download : New ZIM already exists on disk. Skipping download." >>download.log else # New ZIM not found, so we'll go ahead and download it. 
[[ $DEBUG -eq 0 ]] && touch "$LockFilePath" [[ $DEBUG -eq 0 ]] && wget -q --show-progress --progress=bar:force -c -O "$FilePath" "$DownloadURL" 2>&1 |& tee -a download.log # Download new ZIM [[ $DEBUG -eq 1 ]] && echo "Download : $FilePath" >>download.log fi fi fi echo "$ExpectedHash $NewZIM" 2>/dev/null 1>"$NewZIMPath.sha256" if [[ $CALCULATE_CHECKSUM -eq 1 ]]; then echo -e "${BLUE_REGULAR} Calculating checksum for : $NewZIMPath${CLEAR}" if [[ $(du -b "$NewZIMPath" 2>/dev/null | grep -ioP "^\d+") -ne "$ExpectedSize" ]]; then if [[ $DEBUG -eq 0 ]]; then echo -e "${RED_BOLD} ✗ Status : File size verification failed, removing corrupt file${CLEAR}" echo "✗ Status : File size verification failed, removing corrupt file" >>download.log rm "$NewZIMPath" else echo -e "${GREEN_BOLD} ✓ *** Simulated *** Checksum passed${CLEAR}" fi elif [[ ${#ExpectedHash} -ne 64 ]]; then echo -e "${YELLOW_BOLD} This hash doesn't look quite right...skipping${CLEAR}" elif [[ $DEBUG -eq 0 ]] && (cd "$ZIMPath" && ! sha256sum --status -c "$NewZIM.sha256"); then echo -e "${RED_BOLD} ✗ Checksum failed, removing corrupt file${CLEAR}" rm "$NewZIMPath" touch "$NewZIMPath" DownloadFailed=1 else if [[ $DEBUG -eq 0 ]]; then echo -e "${GREEN_BOLD} ✓ Checksum passed${CLEAR}" else echo -e "${GREEN_BOLD} ✓ *** Simulated *** Checksum passed${CLEAR}" fi fi # rm "$NewZIMPath.sha256" # rm "$LockFilePath" echo fi echo >> download.log [[ $DownloadFailed -eq 1 ]] && echo " !!! DOWNLOAD FAILED !!!" 
>>download.log

    # In all of these cases, we will not re-purge and will leave the lockfile so we know to resume later
    if [[ $DownloadFailed -eq 1 ]] || [[ $SKIP_PURGE -eq 1 ]] || [[ $VERIFY_LIBRARY -eq 1 ]]; then
      [[ $DEBUG -eq 0 ]] && echo "End : $(date -u)" >>download.log
      [[ $DEBUG -eq 1 ]] && echo "End : $(date -u) *** Simulation ***" >>download.log
      [[ $SKIP_PURGE -eq 1 ]] && [[ $DownloadFailed -ne 1 ]] && rm "$LockFilePath"
      [[ $VERIFY_LIBRARY -eq 1 ]] && [[ $RequiresDownload -eq 1 ]] && rm "$LockFilePath"
      continue
    fi

    [[ $RequiresDownload -eq 1 ]] && [[ $DEBUG -eq 0 ]] && rm "$LockFilePath"

    ########################################

    echo -e "${BLUE_REGULAR} Old : $OldZIM${CLEAR}"
    echo "Old : $OldZIM" >>download.log
    echo -e "${BLUE_BOLD} New : $NewZIM${CLEAR}"
    echo "New : $NewZIM" >>download.log

    # Check for the new ZIM on disk.
    if [[ -f "$NewZIMPath" ]]; then # New ZIM found
      if [[ $DEBUG -eq 0 ]]; then
        if [[ "$OldZIMPath" == "$NewZIMPath" ]]; then
          echo -e "${GREEN_BOLD} ✓ Status : New ZIM downloaded successfully.${CLEAR}"
          echo "✓ Status : New ZIM downloaded successfully." >>download.log
          # rm "$OldZIMPath.sha256" 2>/dev/null # Purge old ZIM
        else
          echo -e "${GREEN_BOLD} ✓ Status : New ZIM downloaded successfully. Old ZIM purged.${CLEAR}"
          echo "✓ Status : New ZIM downloaded successfully. Old ZIM purged." >>download.log
          [[ -f "$OldZIMPath" ]] && rm "$OldZIMPath" && rm "$OldZIMPath.sha256" 2>/dev/null # Purge old ZIM
        fi
      else
        echo -e "${GREEN_BOLD} ✓ Status : *** Simulated ***${CLEAR}"
        echo "✓ Status : *** Simulated ***" >>download.log
      fi
    else # New ZIM not found. Something went wrong, so we will skip this purge.
      if [[ $DEBUG -eq 0 ]]; then
        echo -e "${RED_BOLD} ✗ Status : New ZIM failed verification. Old ZIM not purged.${CLEAR}"
        echo "✗ Status : New ZIM failed verification. Old ZIM not purged." \
>>download.log else if [[ $RequiresDownload -eq 1 ]]; then echo -e "${GREEN_BOLD} ✓ Status : *** Simulated *** New zim exists, old zim purged${CLEAR}" echo "✓ Status : *** Simulated *** New zim exists, old zim purged" >>download.log else echo -e "${YELLOW_BOLD} ✗ Status : *** Simulated *** Zim was skipped, and will not be purged${CLEAR}" echo "✗ Status : *** Simulated *** Zim not purged" >>download.log fi fi fi echo echo >>download.log ######################################### [[ $DEBUG -eq 0 ]] && echo "End : $(date -u)" >>download.log [[ $DEBUG -eq 1 ]] && echo "End : $(date -u) *** Simulation ***" >>download.log done if [[ $DEBUG -eq 0 ]]; then IFS=$'\n' HangingFileLocks=("$ZIMPath".~lock.*.zim) unset IFS if [[ ${#HangingFileLocks[@]} -gt 0 ]]; then echo -e "${YELLOW_BOLD}5. Cleaning up...${CLEAR}" echo "5. Cleaning up..." >> download.log echo for ((i = 0; i < ${#HangingFileLocks[@]}; i++)); do baseFileName=$(echo "${HangingFileLocks[$i]}" | grep -ioP "(?<=\.~lock\.).*$") echo -e " ${RED_BOLD}Found broken lock: ${HangingFileLocks[$i]}${CLEAR}" echo "Found broken lock: ${HangingFileLocks[$i]}" >> download.log if [[ -f "$ZIMPath$baseFileName" ]]; then echo -e " ${BLUE_BOLD}Found abandoned download: $ZIMPath$baseFileName${CLEAR}" echo -e "Found abandoned download: $ZIMPath$baseFileName" >> download.log rm "$ZIMPath$baseFileName" fi if [[ -f "$ZIMPath$baseFileName.sha256" ]]; then echo -e " ${BLUE_BOLD}Found abandoned checksum: $ZIMPath$baseFileName.sha256${CLEAR}" echo -e "Found abandoned checksum: $ZIMPath$baseFileName.sha256" >> download.log rm "$ZIMPath$baseFileName.sha256" fi rm "${HangingFileLocks[$i]}" echo done fi fi else echo -e "${GREEN_REGULAR} ✓ Download: Nothing to download.${CLEAR}" echo "✓ Download: Nothing to download." >> download.log echo fi if [[ "$CALLEDSCRIPTFILE" == "kiwix-zim.sh" ]]; then echo "WARNING: kiwix-zim has been deprecated in favor of kiwix-zim-updater. 
kiwix-zim will be a hard link to the new script name until at least 2025-01-01. This exit code is nonzero, but the script has not errored out. This is simply so you may notice and switch to using kiwix-zim-updater." >> download.log
  exit 2
fi
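
# Illustrative sketch only (not part of the original script): the update check
# in step 3 compares year and month separately. Because the dates are
# zero-padded YYYY-MM strings, a plain string comparison should give the same
# ordering. The helper name zim_date_is_newer is hypothetical and nothing in
# this script calls it.
zim_date_is_newer() {
  # Succeeds (returns 0) when remote date $1 is strictly newer than local date $2.
  local remote_date=$1 local_date=$2
  [[ $remote_date > $local_date ]]
}
# Possible usage: zim_date_is_newer "$MatchedDate" "$LocalDate" && echo "update available"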