waymore-3.7/.gitignore

results/
build/
dist/
waymore.egg-info
__pycache__
collinfo.json

waymore-3.7/CHANGELOG.md

## Changelog
- v3.7
- Changed
- Fix a bug that can occur in some situations where the error `ERROR processResponses 1: [Errno 2] No such file or directory: 'testing/responses.tmp'` is shown. The required directories weren't being created correctly.
- Remove a debug print line I left in!
- Remove this script, which archive.org now includes in downloaded responses:
``
- Remove the comment `` from the downloaded responses.
- Clarify that `-nlf` argument is only relevant to `mode U`.
- v3.6
- Changed
- Added `-ko` to the suggestions displayed for Responses when the `-co`/`--check-only` option is used and there is a huge number of requests to be made.
- Remove `-ko` from the suggestion displayed for URLs when the `-co`/`--check-only` option is used, because it doesn't affect the request count; `-ko` is applied after the links are retrieved.
- Add a statement to `setup.py` to show where `config.yml` is created if it doesn't already exist. This is to help in figuring out Issue #41.
- v3.5
- Changed
- Change `README` descriptions of `-oU` and `-oR` to reference recent `DEFAULT_OUTPUT_DIR`.
- Change description of `-ra` arg in code and on `README` to say all sources.
- Other small improvements to `README`.
- v3.4
- New
- Add `DEFAULT_OUTPUT_DIR` to the `config.yml` file. This will be used to specify the default directory where output will be written if the `-oU` and `-oR` options aren't used. If blank, this defaults to the directory where `config.yml` is stored (typically `~/.config/waymore/`).
- v3.3
- New
- Add `WEBHOOK_DISCORD` to `config.yml` to provide a webhook to be notified when `waymore` has finished, because in some cases it can take a looooooong time!
- Add arg `-nd`/`--notify-discord` to send a notification to the specified Discord webhook in `config.yml` when `waymore` completes. This is useful because `waymore` can take a looooong time to complete for some targets.
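A Discord webhook notification is just a JSON POST. The sketch below is illustrative only (the function names and message format are assumptions, not waymore's actual code):

```python
import json
import urllib.request

def build_discord_payload(target: str, link_count: int) -> dict:
    # Discord webhooks accept a JSON body with a "content" field.
    return {"content": f"waymore finished for {target}: {link_count} links found"}

def notify_discord(webhook_url: str, payload: dict) -> None:
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # Discord returns 204 No Content on success
```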
- v3.2
- New
- When getting the Common Crawl index file, if the response is 503, then let the user know it's unavailable. If anything other than 429 or 403, then print an error.
- Changed
- Don't show the coffee link if the output is piped out to something else.
- Remove the `mimetypes` library as it turned out to be quite inaccurate compared to just getting the path extension and using content type of response. Also improve the extension logic.
- If the input has a path, then make sure it is treated as if no subdomains are wanted, i.e. don't prefix with `*.`. Previously this stopped links coming back from archive.org.
- Change the messaging to make more sense when multiple sources are used, showing `Links found...` for the first, but `Extra links found...` for the rest.
- v3.1
- Changed
- Make the identification of extension type better when creating the archived hash files. First try the `mimetypes` library that guesses the extension based on the mimetype. If that doesn't work, try to get the extension from the path. If the extension cannot be retrieved from the path, it will be derived from the `content-type` header. If a generic type still can't be obtained, it will be set to the 2nd part of the `content-type` after the `/`. If still unknown, it will be set to `.unknown`. There will be no more `.xnl` extensions by default.
- Updated `README` and images to reflect the most recent version.
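That fallback chain can be sketched roughly as below. This is a hypothetical helper, not waymore's exact implementation (the two content-type steps are collapsed into one here):

```python
import mimetypes
import os
from urllib.parse import urlparse

def derive_extension(url: str, content_type: str) -> str:
    ctype = content_type.split(";")[0].strip().lower()
    ext = mimetypes.guess_extension(ctype)           # 1. guess from the MIME type
    if ext:
        return ext
    path_ext = os.path.splitext(urlparse(url).path)[1]
    if path_ext:
        return path_ext                              # 2. extension from the URL path
    if "/" in ctype and ctype.split("/")[1]:
        return "." + ctype.split("/")[1]             # 3. 2nd part of the content-type
    return ".unknown"                                # 4. give up
```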
- v3.0
- New
- Allow `waymore` to be installed using `pip` or `pipx` so it can be run from any directory.
- Show the current version of the tool in the banner, and whether it is the latest, or outdated.
- When installing `waymore`, if the `config.yml` already exists then it will keep that one and create `config.yml.NEW` in case you need to replace the old config.
- Add reference to VirusTotal v2 API in `README.md`.
- Fix a bug where the `results/target` folder was being created every time, even if the `-oU` and `-oR` arguments were passed.
- Include "Buy Me a Coffee" link at the end of output.
- Changed
- Change installation instructions in `README.md`.
- If `--check-only` was passed and it looks like it will take a long time, include the `-ko` argument in the message describing arguments to consider.
- v2.6
- New
- Use `tldextract` library to determine whether the input is a subdomain, or just a domain.
- Include `tldextract` in `setup.py`
- Changed
- Fix a bug that causes Alien Vault to not return any links if a subdomain is passed as input. This happens because the API is called with `/indicators/domain/`. If a URL is passed, it will use `/indicators/hostname/` instead and return links successfully.
- Fix a bug that causes URLScan to fail with error `[ 400 ] Unable to get links from urlscan.io`. This happens when a URL is sent as input because URLScan.io can only retrieve information for hosts. Also, if a host is sent with a trailing `/` then it will be stripped for URLScan.io so it doesn't think there is a path.
- Fix a bug that causes Alien Vault to fail with runtime error `ERROR getAlienVaultUrls 1: 'full_size'`. This happens when a URL is sent as input. This will now successfully return links for passed URLs.
- v2.5
- New
- Show a warning if the user may be passing a subdomain. The chances are they want all subs of a domain, so they should just pass the domain only.
- v2.4
- New
- Add lots of extra search terms to the `DEFAULT_FILTER_KEYWORDS` and `FILTER_KEYWORDS` in `config.yml`
- Changed
- The Common Crawl HTTPAdapter retry strategy will now only be applied for code 503. There was an issue with 504 errors happening, and then waymore effectively froze because of the retry strategy. The Common Crawl documentation (https://commoncrawl.org/blog/oct-nov-2023-performance-issues) just says to retry on 503.
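A mounted adapter with a 503-only retry policy might look like this sketch (the retry count and backoff value here are illustrative, not waymore's exact settings):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry only on 503, per Common Crawl's guidance; retrying 504s as well
# previously made the tool appear to freeze during Common Crawl outages.
cc_retry = Retry(total=20, backoff_factor=1.1, status_forcelist=[503])
session = requests.Session()
session.mount("https://index.commoncrawl.org", HTTPAdapter(max_retries=cc_retry))
```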
- v2.3
- New
- Add `jira` as a search term to the `DEFAULT_FILTER_KEYWORDS` and `FILTER_KEYWORDS` in `config.yml`
- v2.2
- New
- Add `-lcy` argument. This lets you limit the number of Common Crawl index collections searched by the year of the index data. The earliest index has data from 2008. Setting to 0 (default) will search collections for any year (in conjunction with `-lcc`). For example, if you are only interested in data from 2015 and after, pass `-lcy 2015`. This will override the value of `-lcc` if passed.
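Common Crawl collection IDs embed the year (e.g. `CC-MAIN-2015-11`), so the filter can be sketched as below (helper name hypothetical; assumes the `collinfo.json` list is newest-first):

```python
def limit_collections(collections, lcy=0, lcc=0):
    # collections: entries from collinfo.json, each with an "id" like "CC-MAIN-2015-11"
    if lcy:
        # -lcy overrides -lcc: keep only collections from that year onwards
        return [c for c in collections if int(c["id"].split("-")[2]) >= lcy]
    if lcc:
        return collections[:lcc]  # only the N latest indexes
    return collections
```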
- v2.1
- New
- When the responses are downloaded from archive.org they include some archive.org code such as scripts and stylesheets. This is usually removed, but archive.org appears to have changed their code, so it was being included again. This change ensures the new code is removed so the response doesn't include the archive.org code.
- v2.0
- New
- Add VirusTotal as a source for URLs. We will get URLs from the v2 API domain report. This can include sub domains, detected URLs, and undetected URLs in the response. It does not give you the status code or MIME type of the links, so we will just check against extension.
- Show a specific message for Wayback Machine if there is a Connection Refused error. This happens when they have blocked the users IP.
- Add some pointless celebration messages to the banner for a few different dates!
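The v2 domain report nests URLs in a couple of shapes; a parsing sketch based on the public v2 API response format (the helper name is an assumption):

```python
def extract_vt_urls(report: dict) -> set:
    # In the VirusTotal v2 domain report, detected URLs are dicts with a
    # "url" key, while undetected URLs are lists whose first element is the URL.
    urls = set()
    for entry in report.get("detected_urls", []):
        if entry.get("url"):
            urls.add(entry["url"])
    for entry in report.get("undetected_urls", []):
        if entry:
            urls.add(entry[0])
    return urls
```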
- v1.37
- New
- Add argument `-co`/`--check-only`. If passed, then it will just get the count of requests that need to be made to get URLs from the sources, and how many archived responses will be downloaded. It will try to give an idea of the time the tool could take with the settings given.
- v1.36
- New
- Add argument `-wrlr`/`--wayback-rate-limit-retry` which is the number of minutes the user wants to wait for a rate limit pause on Wayback Machine (archive.org) instead of stopping with a `429` error. This defaults to 3 minutes, which seems to work for a while afterwards.
- Add some additional User-Agents to use when making requests to the API providers.
- Add new MIME exclusions `video/x-ms-wmv`,`image/x-png`,`video/quicktime`,`image/x-ms-bmp`,`font/opentype`,`application/x-font-opentype`,`application/x-woff` and `audio/aiff`.
- Changed
- Change the default `-p`/`--processes` to 1 instead of 3. This is to help with the rate limiting now put in place by web.archive.org. If set to 1 we can also ensure that the pages are processed in order and save where we stopped.
- Change the `backoff_factor` on `HTTP_ADAPTER` from 1 to 1.1 to help with the rate limiting now put in place by web.archive.org.
- Change the `pages` set to a list to ensure pages are processed in order (only applies if `--processes` is 1).
- v1.35
- New
- I had a specific problem with my ISP blocking archive.org for adult content (!) which resulted in a large and confusing error message. This has been replaced by a more useful message if this happens for anyone else.
- v1.34
- Changed
- Any scheme, port number, query string, or URL fragment will be removed from the input values.
- Only show the warning `No value for "URLSCAN_API_KEY" in config.yml - consider adding (you can get a FREE api key at urlscan.io)` if the `-xus` argument wasn't passed.
- If the input has a domain AND path, then it will still be searched for links, and the mode will not be forced to R.
- When the input value is validated and `` is used, just assume one line is a domain/URL, and multiple lines are treated as a file (so the correct description is shown).
- v1.33
- Changed
- A bug existed that would cause any site that only had one page of links to not be completely retrieved. Change the processing for Wayback Machine that gets the number of pages. If the total number of pages is 1, then don't pass a page number at all.
- In the `getSPACER` function, add 5 to the length instead of taking 1 away, so text artifacts aren't left.
- v1.32
- Changed
- Changes to prevent `SyntaxWarning: invalid escape sequence` errors when Python 3.12 is used.
- v1.31
- New
- Add new argument `-urlr`/`--urlscan-rate-limit-retry` to pass the number of minutes that you want to wait between each rate-limit pause from URLScan.
- Add new MIME exclusions `application/x-msdownload` and `application/x-ms-application`.
- Changed
- When getting URLs from the results of URLScan, also get the `[task][url]` values. Thanks to @Ali45598547 for highlighting this!
- When the URLScan rate limits, it says how many seconds you need to wait until you can try again. If less than 1 minute, the program will wait automatically to get more results. If more than 1 minute, then the code will wait for the length of time specified by the `-urlr`/`--urlscan-rate-limit-retry` argument, if passed.
- For Common Crawl, do at least 20 retries. This helps reduce the problem of `503` errors; doing many retries was suggested by Common Crawl themselves to deal with the problem.
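The rate-limit decision for URLScan can be sketched as a pure function (name and return convention are assumptions for illustration):

```python
def urlscan_wait_seconds(retry_after: int, urlr_minutes: int) -> int:
    # URLScan's rate-limit response says how many seconds to wait.
    # Under a minute: just wait what the API asked for.
    if retry_after < 60:
        return retry_after
    # A minute or more: only wait if the user opted in via -urlr (minutes);
    # 0 means don't wait at all.
    return urlr_minutes * 60 if urlr_minutes > 0 else 0
```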
- v1.30
- Changed
- If there are any `+` in the MIME types, e.g. `image/svg+xml`, then replace the `+` with a `.` otherwise the wayback API does not recognise it.
- Add `application/font-otf` to the `FILTER_MIME` value in `config.yml`.
- v1.29
- New
- Check for specific text in response code of 503 (which usually means the site is down for maintenance or not available) and return a specific message instead of the full response.
- v1.28
- New
- Added `application/font-otf` to `DEFAULT_FILTER_MIME`
- Changed
- Fix a bug that overwrites the output URLs file if the input is a file that contains different hosts.
- v1.27
- Changed
- Set the default for `-lcc` to 3 instead of 0 to only search the 3 latest indexes for Common Crawl instead of all of them.
- v1.26
- Changed
- Allow an input value of just a TLD, e.g. `.mil`. If a TLD is passed then resources for all domains with that TLD will be retrieved. NOTE: If a TLD is passed then the Alien Vault OTX source is excluded because it needs a full domain.
- v1.25
- Changed
- Fix a bug that always strips the port number from URLs found. It should only remove the port if it is `:80` or `:443`.
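The corrected behaviour can be sketched with the standard library (helper name is illustrative):

```python
from urllib.parse import urlparse, urlunparse

def strip_default_port(url: str) -> str:
    # Only drop the port when it is a scheme default (80/443);
    # non-standard ports are kept.
    parts = urlparse(url)
    if parts.port in (80, 443):
        parts = parts._replace(netloc=parts.netloc.rsplit(":", 1)[0])
    return urlunparse(parts)
```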
- v1.24
- Changed
- Handle errors with the config file better. Display a specific message to say if the file isn't found or if there is a formatting error. If there is any other kind of error, the error message will be displayed. The default values will be used in the case of any of these errors.
- v1.23
- Changed
- The `-ko`/`--keywords-only` argument can now be passed without a value, which will use the `FILTER_KEYWORDS` in `config.yml` as before, or passed with a Regex value that will be used instead. For example, `-ko "admin"` to only get links containing the word `admin`, or `-ko "\.js(\?|$)"` to only get JS files. The Regex check is NOT case sensitive.
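The two `-ko` behaviours can be sketched in one function (name and argument shapes are assumptions):

```python
import re

def keyword_match(url: str, ko_value: str, keywords: list) -> bool:
    # -ko with a value: treat it as a case-insensitive regex.
    if ko_value:
        return re.search(ko_value, url, re.IGNORECASE) is not None
    # -ko with no value: fall back to the FILTER_KEYWORDS substring list.
    return any(k.lower() in url.lower() for k in keywords)
```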
- v1.22
- Changed
- Fix issue https://github.com/xnl-h4ck3r/waymore/issues/23. If a file is passed as input, an error would occur if any of the domains in the file contained a capital letter or ended with a full stop. The regex in `validateArgInput` has been amended to fix this, and any `.` on the end of a domain is stripped and the domain converted to lowercase before processing.
- v1.21
- Changed
- Fix issue https://github.com/xnl-h4ck3r/waymore/issues/24. If the `FILTER_CODE` in `config.yml` is set to one status code then it needs to be explicitly set to a string in `getConfig()`.
- v1.20
- New
- Add argument `-fc` for filtering HTTP status codes. Using this will override the `FILTER_CODE` value from `config.yml`. This is for specifying HTTP status codes you want to exclude from the results, and are provided in a comma separated list.
- Add argument `-mc` for matching HTTP status codes. Using this will override the `FILTER_CODE` value from `config.yml` AND the `-fc` argument. This is for specifying HTTP status codes you want to match from the results, and are provided in a comma separated list.
- Changed
- Changed how filters are specified in the request to the Common Crawl API. Remove the regex negative lookahead, which is not needed when using `filter=!`.
- v1.19
- Changed
- Bug fix - ignore any blank lines in the input file when validating if input is in the correct format
- v1.18
- Changed
- Cache the Common Crawl `collinfo.json` file locally. The file is only updated a few times per year so there is no point in requesting it every time **waymore** is run. Common Crawl can struggle with volume against its API, which can cause timeouts, and currently, about 10% of all requests they get are for the `collinfo.json`!
- Add a HTTPAdapter specifically for Common Crawl to have `retries` and `backoff_factor` increased which seems to reduce the errors and maximize the results found.
- v1.17
- Changed
- If an input file has a sub domain starting with \_ or - then an error was raised, but these are valid. This bug has been fixed.
- In addition to the fix above, the error message will show what line was flagged in error so the user can raise an issue on Github about it if they believe it is an error.
- v1.16
- Changed
- Fix a bug that raises `ERROR processURLOutput 6: [Errno 2] No such file or directory: ''` if the value passed to `-oU` has no directory specified as part of the file name.
- v1.15
- Changed
- Fix bug that shows an error when `-v` is passed and `-oU` does not specify a directory, just a filename
- v1.14
- Changed
- Fix a bug with the `-c`/`--config` option
- v1.13
- New
- Added argument `-oU`/`--output-urls` to allow the user to specify a filename (including path) for the URL links file when `-mode U` (or `B`oth) is used. If not passed, then the file `waymore.txt` will be created in the `results/{target.domain}` directory as normal. If a path is passed with the file, then any directories will be created. For example: `-oU ~/Recon/Redbull/waymoreUrls.txt`
- Added argument `-oR`/`--output-responses` to allow the user to specify a directory (or path) where the archived responses and `index.txt` file is written when `-mode R` (or `B`oth) is used. If any directories in the path do not exist they will be created. For example: `-oR ~/Recon/Redbull/waymoreResponses`
- Changed
- When removing all web archive references in the downloaded archived response, there were a few occasions this wasn't working so the regex has been changed to be more specific to ensure this works.
- v1.12
- New
- Added argument `-c` / `--config` to specify the full path of a YML config file. If not passed, it looks for file `config.yml` in the same directory as runtime file `waymore.py`
- v1.11
- New
- Added argument `-nlf`/`--new-links-file`. If passed, and you run `-mode U` or `-mode B` to get URLs more than once for the same target, the `waymore.txt` will still be appended with new links (unless `-ow` is passed), but a new output file called `waymore.new` will also be written. If there are no new links, the empty file will still be created. This can be used for continuous monitoring of a target.
- Added a `waymore` folder containing a new `__init__.py` file that contains the `__version__` value.
- Added argument `--version` to display the current version.
- Show better error messages if the archive.org site returns a `Blocked Site Error`.
- Changed
- If a file of domains is passed as input, make sure spaces are stripped from the lines.
- Change `.gitignore` to include `__pycache__`.
- Move images to `waymore/images` folder.
- v1.10
- New
- If `-mode U` is run for the same target again, by default new links found will be added to the `waymore.txt` file and duplicates removed.
- Added argument `-ow`/`--output-overwrite` that can be passed to force the `waymore.txt` file to be overwritten with newly found links instead of being appended.
- Changed
- Change the README.md to reflect new changes
- v1.9
- New
- Add functionality to continue downloading archived responses if it does not complete for any reason. When downloading archived responses, a file called `responses.tmp` will be created with the links of all responses that will be downloaded. There will also be a `continueresp.tmp` that will store the index of the current response being saved. If these files exist when run again, the user will be prompted whether to continue a previous run (so new filters will be ignored) or start a new one.
- Add `CONTINUE_RESPONSES_IF_PIPED` to `config.yml`. If `stdin` or `stdout` is piped from another process, the user is not prompted whether they want a previous run of downloading responses. This value will determine whether to continue a previous run, or start a new one, in that situation.
- Changed
- Corrected the total pages shown when getting wayback URLs
- Included missing packages in the `requirements.txt` document.
- Fix Issue #16 (https://github.com/xnl-h4ck3r/waymore/issues/16)
- v1.8
- Changed
- When archived responses are saved as files, the extension `.xnl` will no longer be used if `-url-filename` is passed. If `-url-filename` is not passed then the filename is represented by a hash value. The extension of these files will be set to `.xnl` only if the original file type cannot be derived from the original URL.
- v1.7
- New
- Added `-xwm` parameter to exclude getting URL's from Wayback Machine (archive.org)
- Added `-lr`/`--limit-requests` that can be used to limit the number of requests made per source (excluding Common Crawl) when getting URL's. For example, if you run **waymore** for `-i twitter.com` it says there are 28,903,799 requests to archive.org that need to be made (that could take almost 1000 days for some people!!!). The default value for the argument is 0 (Zero) which will apply no limit as before. There is also a problem with the Wayback Machine CDX API where the number of pages returned is not correct when filters are applied and can cause issues. Setting this parameter to a sensible value can relieve that issue (it may take a while for archive.org to address the problem).
- Changed
- Make sure that filters in API URL's are escaped correctly
- Add error handling to `getMemory()` to avoid any errors if `psutil` is not installed
- v1.6
- New
- Add a docker option to run `waymore`. Include instructions in `README.md` and a new `Dockerfile`
- Changed
- If multiple domain/URLs are passed by file or STDIN, remove `*.` from the start of any input values.
- Change the default `FILTER_KEYWORDS` to include more useful words.
- If a link found from an API has port 80 or 443 specified, e.g. `https://example.com:80/example` then remove the `:80`. Many links have this in archive.org so this could reduce the number of similar links reported.
- Amend `setup.py` to include `urlparse3` that is now used to get the domain and port of links found
- v1.5
- New
- Add argument `-ko`/`--keywords-only` which if passed, will only get Links (unless `-f` is passed) that have a specified keyword in the URL, and will only download responses (regardless of `-f`) where the keyword is in the URL. These multiple keywords can be specified in `config.yml` in a comma separated list.
- Add a `FILTER_KEYWORDS` key/pair to `config.yml` (and default value in code) initially set to `admin,login,logon,signin,register,dashboard,portal,ftp,cpanel`
- Changed
- Only add to the MIME type list if the `-v` option is used because they are not displayed otherwise.
- Warn the user if there is a value missing from the config.yml file
- Fixed small bug in `getURLScanUrls` that raised an error for `getSPACER`
- v1.4
- New
- Added `-m`/`--memory-threshold` argument to set the memory threshold percentage. If the machine's memory goes above the threshold, the program will be stopped and ended gracefully before running out of memory (default: 95)
- If `-v` verbose output was used, memory stats will be output at the end, and also shown on the progress bar when downloading responses.
- Included `psutil` in `setup.py`
- Changed
- Fix some display issues not completely done in v1.3, regarding trailing spaces when errors are displayed.
- Remove line `os.kill(os.getpid(),SIGINT)` from `processArchiveUrl` which isn't needed and just causes more errors if a user does press Ctrl-C.
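The memory-threshold check, combined with the graceful handling of a missing `psutil` (mentioned for v1.7), might look like this sketch (function name is an assumption):

```python
try:
    import psutil  # optional; the check degrades gracefully without it
except ImportError:
    psutil = None

def memory_exceeded(threshold: float = 95.0) -> bool:
    # True when system memory usage has crossed the threshold percentage,
    # so the caller can stop and save results before running out of memory.
    if psutil is None:
        return False
    return psutil.virtual_memory().percent >= threshold
```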
- v1.3
- New
- Added functionality to allow the Links output to be piped to another program (the output file will still be written). Errors and progress bar are written to STDERR. No information about archived responses will be piped.
- Added functionality to allow input to be piped to waymore. This will be the same as passing to `-i` argument.
- Changed
- Use a better way to add trailing spaces to strings to cover up other strings (like progress bar), regardless of terminal width.
- Change the README to mention `-xus` argument and how to get a URLScan API key to add to the config file.
- v1.2
- Changed
- Removed User-Agent `Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0.1` because it caused problems on some domains in `xnLinkFinder` tool, so removing from here too.
- Base the length of the progress bar (show when downloading archived responses) on the width of the terminal so it displays better and you don't get multiple lines on smaller windows.
- Amend `.gitignore` to include other unwanted files
- v1.1
- New
- Allow a file of domains/URL's to be passed as input with `-i` instead of just one.
- Changed
- Remove version numbers from `requirements.txt` as these aren't really needed and may cause some issues.
- v1.0
- New
- Added URLScan as a source of URL's. Waymore now has all the same sources for URLs as [gau](https://github.com/lc/gau)
- Added `-xus` parameter to exclude URLScan when getting URL's.
- Added `-r` parameter to specify the number of times requests are retried if they return 429, 500, 502, 503 or 504 (default: 1).
- Made requests use a retry strategy using the `-r` value, and also a `backoff_factor` of 1 for Too Many Requests (429) responses.
- General bug fixes.
- Changed
- Fixed a bug that was preventing HTTP Status Code filtering from working on Alien Vault requests.
- Fixed a bug that was preventing MIME type filtering from working on Common Crawl requests.
- Correctly escape all characters in strings compared in regex with `re.escape` instead of just changing `.` to `\.`
- Changed default MIME type filter to include: `video/webm,video/3gpp,application/font-ttf,audio/mp3,audio/x-wav,image/pjpeg,audio/basic`
- Changed default URL filter to include: `/jquery,/bootstrap`
- If Ctrl-C is used to end the program, try to ensure that results at that point are still saved before ending.
- v0.3
- New
- Added Alien Vault OTX as a source of URL's. Results cannot be checked against MIME filters though because that info is not available in the API response.
- Added `-xav` parameter to exclude Alien Vault when getting URL's.
- Changed
- Improved regex for the `-i` input value to ensure it's a valid domain, with or without sub domains and path, but no query string or fragments.
- General tidying up and improvements
- v0.2
- New
- Added to the TODO list on README.md of changes coming soon
- Changed
- When getting URL's from archive.org it now uses pagination. Instead of one API call (how waybackurls does it), it makes one call per page of URL's (how gau does it). This actually results in a lot more URL's being returned even though the archive.org API docs seem to imply it should be the same. So in comparison to gau it now returns the same number of URL's from archive.org.
- Ensure input domain/path is URL encoded when added to the API call URL's
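The paginated CDX approach can be sketched as below. The page count would come from a prior request with `&showNumPages=true`; the query parameters shown are typical CDX options, and the helper name is an assumption:

```python
from urllib.parse import quote

CDX_API = "https://web.archive.org/cdx/search/cdx"

def build_cdx_page_urls(domain: str, num_pages: int) -> list:
    # URL-encode the input before placing it in the query string,
    # then build one request URL per page.
    base = f"{CDX_API}?url={quote(domain, safe='')}/*&output=json&fl=original&collapse=urlkey"
    return [f"{base}&page={p}" for p in range(num_pages)]
```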
- v0.1 - Initial release
waymore-3.7/Dockerfile

FROM python:3.10
WORKDIR /app
COPY . .
RUN mkdir -p results
RUN python3 setup.py install
waymore-3.7/LICENSE

MIT License
Copyright (c) 2022 /XNL-h4ck3r
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
waymore-3.7/README.md
## About - v3.7
The idea behind **waymore** is to find even more links from the Wayback Machine than other existing tools.
👉 The biggest difference between **waymore** and other tools is that it can also **download the archived responses** for URLs on wayback machine so that you can then search these for even more links, developer comments, extra parameters, etc. etc.
👉 Also, other tools do not currently deal with the rate limiting now in place by the sources, and will often just stop with incomplete results and not let you know they are incomplete.
Anyone who does bug bounty will have likely used the amazing [waybackurls](https://github.com/tomnomnom/waybackurls) by @TomNomNom. This tool gets URLs from [web.archive.org](https://web.archive.org) and additional links (if any) from one of the index collections on [index.commoncrawl.org](http://index.commoncrawl.org/).
You would have also likely used the amazing [gau](https://github.com/lc/gau) by @hacker\_, which also finds URL's from the wayback archive and Common Crawl, but also from Alien Vault and URLScan.
Now **waymore** gets URL's from ALL of those sources too (with ability to filter more to get what you want):
- Wayback Machine (web.archive.org)
- Common Crawl (index.commoncrawl.org)
- Alien Vault OTX (otx.alienvault.com)
- URLScan (urlscan.io)
- Virus Total (virustotal.com)
👉 It's a point that many seem to miss, so I'll just add it again :) ... The biggest difference between **waymore** and other tools is that it can also **download the archived responses** for URLs on wayback machine so that you can then search these for even more links, developer comments, extra parameters, etc. etc.
👉 **PLEASE READ ALL OF THE INFORMATION ON THIS PAGE TO MAKE THE MOST OF THIS TOOL, AND ESPECIALLY BEFORE RAISING ANY ISSUES** 🤘
👉 **THIS TOOL CAN BE VERY SLOW, BUT IT IS MEANT FOR COVERAGE, NOT SPEED**
⚠️ **A common mistake that is made is passing a file of subdomains to get everything for a domain. DON'T DO IT! Just pass the domain only to get all subs for that domain. It will be SO much quicker, and you won't miss anything.**
## Installation
**NOTE: If you already have a `config.yml` file, it will not be overwritten. The file `config.yml.NEW` will be created in the same directory. If you need the new config, remove `config.yml` and rename `config.yml.NEW` back to `config.yml`.**
`waymore` supports **Python 3**.
Install `waymore` in default(global) python environment.
```bash
pip install git+https://github.com/xnl-h4ck3r/waymore.git -v
```
### pipx
Quick setup in isolated python environment using [pipx](https://pypa.github.io/pipx/)
```bash
pipx install git+https://github.com/xnl-h4ck3r/waymore.git
```
## Usage
| Arg | Long Arg | Description |
| ------------- | -------------------------- | ----------- |
| -i | --input | The target domain (or file of domains) to find links for. This can be a domain only, or a domain with a specific path. If it is a domain only to get everything for that domain, don't prefix with `www.`. You can also specify a TLD only by prefixing with a period, e.g. `.mil`, which will get all subs for all domains with that TLD (NOTE: The Alien Vault OTX source is excluded if searching for a TLD because it requires a full domain). **NOTE: Any scheme, port number, query string, or URL fragment will be removed from the input values. However, if you provide a path, this will be specifically searched for, so will limit your results.** |
| -mode | | The mode to run: `U` (retrieve URLs only), `R` (download Responses only) or `B` (Both). If `-i` is a domain only, then `-mode` will default to `B`. If `-i` is a domain with path then `-mode` will default to `R`. |
| -oU | --output-urls | The file to save the Links output to, including path if necessary. If the `-oR` argument is not passed, a `/results` directory will be created in the path specified by the `DEFAULT_OUTPUT_DIR` key in `config.yml` file (typically defaults to `~/.config/waymore/`). Within that, a directory will be created with target domain (or domain with path) passed with `-i` (or for each line of a file passed with `-i`). For example: `-oU ~/Recon/Redbull/waymoreUrls.txt` |
| -oR | --output-responses | The directory to save the response output files to, including path if necessary. If the argument is not passed, a `/results` directory will be created in the path specified by the `DEFAULT_OUTPUT_DIR` key in `config.yml` file (typically defaults to `~/.config/waymore/`). Within that, a directory will be created with target domain (or domain with path) passed with `-i` (or for each line of a file passed with `-i`). For example: `-oR ~/Recon/Redbull/waymoreResponses` |
| -n | --no-subs | Don't include subdomains of the target domain (only used if input is not a domain with a specific path). |
| -f | --filter-responses-only | The initial links from sources will not be filtered, only the responses that are downloaded, e.g. it may be useful to still see all available paths from the links, even if you don't want to check the content. |
| -fc | | Filter HTTP status codes for retrieved URLs and responses. Comma separated list of codes (default: the `FILTER_CODE` values from `config.yml`). Passing this argument will override the value from `config.yml` |
| -mc | | Only Match HTTP status codes for retrieved URLs and responses. Comma separated list of codes. Passing this argument overrides the config `FILTER_CODE` and `-fc`. |
| -l | --limit | How many responses will be saved (if `-mode R` or `-mode B` is passed). A positive value will get the **first N** results, a negative value will get the **last N** results. A value of 0 will get **ALL** responses (default: 5000) |
| -from | --from-date | What date to get responses from. If not specified it will get from the earliest possible results. A partial value can be passed, e.g. `2016`, `201805`, etc. |
| -to | --to-date | What date to get responses to. If not specified it will get to the latest possible results. A partial value can be passed, e.g. `2021`, `202112`, etc. |
| -ci | --capture-interval | Filters the search on archive.org to only get at most 1 capture per hour (`h`), day (`d`) or month (`m`). This filter is used for responses only. The default is `d` but can also be set to `none` to not filter anything and get all responses. |
| -ra | --regex-after | RegEx for filtering purposes against links found from all sources of URLs AND responses downloaded. Only positive matches will be output. |
| -url-filename | | Set the file name of downloaded responses to the URL that generated the response, otherwise it will be set to the hash value of the response. Using the hash value means multiple URLs that generated the same response will only result in one file being saved for that response. |
| -xwm | | Exclude checks for links from Wayback Machine (archive.org) |
| -xcc | | Exclude checks for links from commoncrawl.org |
| -xav | | Exclude checks for links from alienvault.com |
| -xus | | Exclude checks for links from urlscan.io |
| -xvt | | Exclude checks for links from virustotal.com |
| -lcc | | Limit the number of Common Crawl index collections searched, e.g. `-lcc 10` will just search the latest `10` collections (default: 3). As of July 2023 there are currently 95 collections. Setting to `0` will search **ALL** collections. If you don't want to search Common Crawl at all, use the `-xcc` option. |
| -lcy | | Limit the number of Common Crawl index collections searched by the year of the index data. The earliest index has data from 2008. Setting to 0 (default) will search collections of any year (but in conjunction with `-lcc`). For example, if you are only interested in data from 2015 and after, pass `-lcy 2015`. This will override the value of `-lcc` if passed. If you don't want to search Common Crawl at all, use the `-xcc` option. |
| -t | --timeout | This is for archived responses only! How many seconds to wait for the server to send data before giving up (default: 30) |
| -p | --processes | Basic multithreading is done when getting requests for a file of URLs. This argument determines the number of processes (threads) used (default: 1) |
| -r | --retries | The number of retries for requests that get connection error or rate limited (default: 1). |
| -m | --memory-threshold | The memory threshold percentage. If the machine's memory goes above the threshold, the program will be gracefully stopped before running out of memory (default: 95) |
| -ko | --keywords-only | Only return links and responses that contain keywords that you are interested in. This can reduce the time it takes to get results. If you provide the flag with no value, Keywords are taken from the comma separated list in the `config.yml` file (typically in `~/.config/waymore/`) with the `FILTER_KEYWORDS` key, otherwise you can pass a specific Regex value to use, e.g. `-ko "admin"` to only get links containing the word `admin`, or `-ko "\.js(\?\|$)"` to only get JS files. The Regex check is NOT case sensitive. |
| -lr | --limit-requests | Limit the number of requests that will be made when getting links from a source (this doesn't apply to Common Crawl). Some targets can require a huge number of requests that just aren't feasible to make, so this can be used to manage that situation. This defaults to 0 (Zero) which means there is no limit. |
| -ow | --output-overwrite | If the URL output file (default `waymore.txt`, or specified by `-oU`) already exists, it will be overwritten instead of being appended to. |
| -nlf | --new-links-file | If this argument is passed, a `waymore.new` file (or if `-oU` is used it will be the name of that file suffixed with `.new`) will also be written, and will contain links for the latest run. This can be used for continuous monitoring of a target (only for `mode U`, not `mode R`). |
| -c | --config | Path to the YML config file. If not passed, it looks for file `config.yml` in the default directory, typically `~/.config/waymore`. |
| -wrlr | --wayback-rate-limit-retry | The number of minutes the user wants to wait for a rate limit pause on Wayback Machine (archive.org) instead of stopping with a `429` error (default: 3). |
| -urlr | --urlscan-rate-limit-retry | The number of minutes the user wants to wait for a rate limit pause on URLScan.io instead of stopping with a `429` error (default: 1). |
| -co | --check-only | This will make a few minimal requests to show you how many requests, and roughly how long it could take, to get URLs from the sources and downloaded responses from Wayback Machine. |
| -nd | --notify-discord | Whether to send a notification to Discord when waymore completes. It requires `WEBHOOK_DISCORD` to be provided in the `config.yml` file. |
| -v | --verbose | Verbose output |
| | --version | Show current version number. |
| -h | --help | Show the help message and exit. |
## Run with docker
Install [docker](https://docs.docker.com/get-docker/)
```bash
git clone https://github.com/xnl-h4ck3r/waymore.git
cd waymore
```
Build image:
```bash
docker build -t waymore .
```
Run waymore with this command:
```bash
docker run -it --rm -v $PWD/results:/app/results waymore:latest waymore -i example.com -oU example.com.links -oR results/example.com/
```
## Input and Mode
The input `-i` can either be a domain only, e.g. `redbull.com` or a specific domain and path, e.g. `redbull.com/robots.txt`. You can also pass a file of domains/URLs to process (or pass values in by piping from another program on the command line). **NOTE: Any scheme, port number, query string, or URL fragment will be removed from the input values. However, if you provide a path, this will be specifically searched for, so will limit your results.**
There are different modes that waymore can run in. The `-mode` argument can be one of 3 values:
- `U` - URLs will be retrieved from archive.org (if `-xwm` is not passed), commoncrawl.org (if `-xcc` is not passed), otx.alienvault.com (if `-xav` is not passed), urlscan.io (if `-xus` is not passed) and virustotal.com (if `-xvt` is not passed)
- `R` - Responses will be downloaded from archive.org
- `B` - Both URLs and Responses will be retrieved
If the input was a specific URL, e.g. `redbull.com/robots.txt` then the `-mode` defaults to `R`. Only responses will be downloaded. You cannot change the mode to `U` or `B` for a domain with path because it isn't necessary to retrieve URLs for a specific URL.
If the input is just a domain, e.g. `redbull.com` then the `-mode` defaults to `B`. It can be changed to `U` or `R` if required. When a domain only is passed then all URLs/responses are retrieved for that domain (and sub domains unless `-n` is passed). If the no sub domain option `-n` is passed then the `www` sub domain is still included by default.
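As a rough sketch (not waymore's actual code), the input clean-up and mode defaulting described above could look like this in Python — `normalize_input` and `default_mode` are hypothetical helper names:

```python
from urllib.parse import urlparse

def normalize_input(value):
    # Strip any scheme, port number, query string and URL fragment,
    # keeping just the host and (optional) path - a sketch of how
    # waymore cleans its -i input, not the tool's own implementation
    if '://' not in value:
        value = 'http://' + value
    parts = urlparse(value)
    return (parts.hostname or '') + parts.path.rstrip('/')

def default_mode(value):
    # A bare domain defaults to mode B; a domain with a path to mode R
    return 'R' if '/' in normalize_input(value) else 'B'
```

For example, `default_mode('redbull.com')` gives `B` while `default_mode('redbull.com/robots.txt')` gives `R`.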
## config.yml
The `config.yml` file (typically in `~/.config/waymore/`) has values that can be updated to suit your needs. Filters are all provided as comma separated lists:
- `FILTER_CODE` - HTTP response code exclusions used to filter the responses we will try to get from web.archive.org, e.g. `301,302`. This can be overridden with the `-fc` argument. Passing the `-mc` argument (to match status codes instead of filter) will override any value in `FILTER_CODE` or `-fc`
- `FILTER_MIME` - MIME Content-Type exclusions used to filter links and responses from web.archive.org through their API, e.g. `text/css,image/jpeg`
- `FILTER_URL` - URL exclusions (extensions or path fragments) used to filter links and responses from web.archive.org through their API, e.g. `.css,.jpg`
- `FILTER_KEYWORDS` - Only links and responses will be returned that contain the specified keywords if the `-ko`/`--keywords-only` argument is passed (without providing an explicit value on the command line), e.g. `admin,portal`
- `URLSCAN_API_KEY` - You can sign up to [urlscan.io](https://urlscan.io/user/signup) to get a **FREE** API key (there are also paid subscriptions available). It is recommended you get a key and put it into the config file so that you can get more back (and quicker) from their API. NOTE: You will get rate limited unless you have a full paid subscription.
- `CONTINUE_RESPONSES_IF_PIPED` - If retrieving archived responses doesn't complete, you will be prompted next time whether you want to continue with the previous run. However, if `stdout` is piped to another process it is assumed you don't want an interactive prompt. A value of `True` (default) will ensure the previous run is continued. If you want a fresh run every time, set it to `False`.
- `WEBHOOK_DISCORD` - If the `--notify-discord` argument is passed, `waymore` will send a notification to this Discord webhook when it completes.
- `DEFAULT_OUTPUT_DIR` - This is the default location of any output files written if the `-oU` and `-oR` arguments are not used. If the value of this key is blank, then it will default to the location of the `config.yml` file.
**NOTE: The MIME types cannot be filtered for Alien Vault results because they do not return that in the API response.**
## Output
In the default output directory specified in the `config.yml` file with `DEFAULT_OUTPUT_DIR` (if that is blank, it will default to the location of the `config.yml` file itself, typically `~/.config/waymore/`), a `results` directory will be created. Within that, a directory will be created with target domain (or domain with path) passed with `-i` (or for each line of a file passed with `-i`). You can alternatively use argument `-oU` to specify where the URL links file will be output (and the name of the file). You can also use argument `-oR` to specify a directory (or path) where the archived responses will be output.
When run, the following files are created in the target directory:
- `waymore.txt` - If `-mode` is `U` or `B`, this file will contain links from selected sources. Links will be retrieved from archive.org Wayback Machine (unless `-xwm` was passed), commoncrawl.org (unless `-xcc` was passed), otx.alienvault.com (unless `-xav` was passed) and urlscan.io (unless `-xus` was passed). If the `-ow` option was also passed, any existing `waymore.txt` file in the target results directory will be overwritten, otherwise new links will be appended and duplicates removed.
- `index.txt` - If `-mode` is `R` or `B`, and `-url-filename` was not passed, then archived responses will be downloaded and hash values will be used for the saved file names. This file contains a comma separated list of `{hash},{Wayback Machine URL},{timestamp}` in case you need to know which URLs produced which response.
- `ALL OTHER FILES` - These archived response files will be created if `-mode` was `R` or `B`. If `-url-filename` was passed then the file names will be the archive URL that generated the response, e.g. `https--example.com-robots.txt`, otherwise the file name will be a hash value, e.g. `7960113391501.{EXT}` where `{EXT}` will be the extension derived from the path, otherwise it will be derived from the response `content-type`. Sometimes a generic extension will be given, e.g. `.js` for any content type containing the word `javascript`, otherwise it may be set to the last part of the type (e.g. if it's `application/java-archive` the file will be `7960113391501.java-archive`). Using hash values means that fewer files will be written as there will only be one file per unique response. These archived responses are edited, before being saved, to remove any reference to `web.archive.org`.
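To illustrate how a response file name might be derived (hash of the body, extension from the path or content type), here is a hedged Python sketch — `response_filename` is a hypothetical helper, and the truncated SHA-1 shown is illustrative rather than waymore's actual hash format:

```python
import hashlib

def response_filename(content, url_path, content_type):
    # One file per unique response: name it from a hash of the body
    # (illustrative hashing - not waymore's actual hash format)
    name = hashlib.sha1(content).hexdigest()[:13]
    last = url_path.rsplit('/', 1)[-1]
    if '.' in last:
        ext = last.rsplit('.', 1)[-1]        # extension from the URL path
    elif 'javascript' in content_type:
        ext = 'js'                           # generic extension for any JS type
    else:
        ext = content_type.split('/')[-1]    # last part of the MIME type
    return name + '.' + ext
```

For example, a `application/java-archive` response with no extension in its path would be saved as `{hash}.java-archive`, and two URLs returning the same body would map to the same file.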
## Info and Suggestions
The number of links found, and then potentially the number of archived response files you will download, could be **HUGE** for many domains. This tool isn't about speed, it's about finding more, so be patient.
There is a `-p` option to increase the number of processes (used when retrieving links from all sources, and downloading archived responses from archive.org). However, although it may not be as fast as you'd like, I would suggest leaving `-p` at its default because I personally found issues with getting responses at higher values. We don't want to cause these services any problems, so be sensible!
I often use the `-f` option because I want `waymore.txt` to contain ALL possible links. Even though I don't care about images, fonts, etc. it could still be useful to see all possible paths and maybe parameters. Any filters will always be applied to downloading archived responses though. You don't want to waste time downloading thousands of images!
Using the `-v` option can help see what is happening behind the scenes and could help you if you aren't getting the output you are expecting.
All the MIME Content Types of URLs found (by all sources except Alien Vault) will be displayed when `-v` is used. This may help you add further exclusions if you find you still get back things you don't want to see. If you spot a MIME type that is being included but you don't want it going forward, add it to `FILTER_MIME` in `config.yml`.
It should be noted that sometimes the MIME type on archive.org is stored as `unk` and `unknown` instead of the real MIME so the filter won't necessarily remove it from their results. The `FILTER_URL` config settings can be used to remove these afterwards. For example, if a GIF has MIME type `unk` instead of `image/gif` (and that's in `FILTER_MIME`) then it won't get filtered, but if the url is `https://target.com/assets/logo.gif` and `.gif` is in `FILTER_URL` it won't get requested.
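The two-stage filtering described above (MIME filter first, URL filter as a backstop for `unk`/`unknown` types) can be sketched like this — the function name and the example filter values are illustrative only:

```python
# Illustrative subsets of the FILTER_MIME / FILTER_URL config values
FILTER_MIME = {'text/css', 'image/gif', 'image/jpeg'}
FILTER_URL = ('.css', '.gif', '.jpg')

def keep_link(url, mime):
    # First the MIME filter - but archive.org sometimes stores 'unk' or
    # 'unknown' instead of the real type, so this can let things through
    if mime in FILTER_MIME:
        return False
    # The URL filter then catches what the MIME filter missed
    return not any(part in url.lower() for part in FILTER_URL)
```

So a GIF stored with MIME `unk` slips past the MIME filter, but `.gif` in `FILTER_URL` still stops it being requested.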
If `config.yml` doesn't exist, or the filter entries aren't in the file, then default filters are used. It's better to have the file and review the filters to ensure you are getting what you need.
There can potentially be millions of responses so make sure you set filters, but also the Limit (`-l`), From Date (`-from`), To Date (`-to`) and/or Capture Interval (`-ci`) if you need to. The limit defaults to 5000, but say you wanted to get the latest 20,000 responses from 2015 up until January 2018... you would pass `-l -20000 -from 2015 -to 201801`. The Capture Interval determines how many responses will get downloaded for a particular URL within a specified period, e.g. if you set it to `m` you will only get one response per month for a URL. The default `d` will likely greatly reduce the number of responses and is unlikely to miss many unique responses unless a target changed something more than once in a given day.
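For reference, here is a sketch of how those options might map onto Wayback CDX API query parameters (`from`, `to`, `limit`, and `collapse=timestamp:N`) — this mapping is an assumption for illustration, not waymore's actual request-building code:

```python
def cdx_params(limit=5000, from_date=None, to_date=None, interval='d'):
    # collapse=timestamp:N keeps one capture per unique timestamp prefix:
    # 6 digits = month, 8 = day, 10 = hour
    params = {'limit': str(limit)}   # a negative limit returns the LAST N results
    if from_date:
        params['from'] = from_date   # partial dates allowed, e.g. '2015'
    if to_date:
        params['to'] = to_date       # e.g. '201801'
    prefix = {'m': 6, 'd': 8, 'h': 10}.get(interval)
    if prefix:
        params['collapse'] = 'timestamp:%d' % prefix
    return params
```

The `-l -20000 -from 2015 -to 201801` example above would correspond to `cdx_params(limit=-20000, from_date='2015', to_date='201801')`.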
Another useful argument is `-mc` that will only get results where the HTTP status code matches the comma separated list passed, e.g. `-mc 200` or `-mc 200,403`.
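The precedence between matching and filtering status codes can be summed up in a few lines — `status_allowed` is a hypothetical helper, not waymore's own function:

```python
def status_allowed(code, match_codes=None, filter_codes=None):
    # -mc (match) overrides -fc / FILTER_CODE (filter) entirely
    if match_codes:
        return code in match_codes
    if filter_codes:
        return code not in filter_codes
    return True
```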
You can also greatly reduce the number (and therefore reduce the execution time) of links and responses by only returning ones that contain keywords you are interested in. You can list these keywords in `config.yml` with the `FILTER_KEYWORDS` key and then pass argument `-ko`/`--keywords-only` to use these, or you can pass `-ko`/`--keywords-only` with a specific Regex, e.g. `-ko "\.js(\?|$)"` to only get JS files.
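A minimal sketch of that keyword/regex check (case-insensitive, as noted above) — the helper name is hypothetical:

```python
import re

def keyword_match(link, pattern):
    # The -ko regex check is NOT case sensitive
    return re.search(pattern, link, re.IGNORECASE) is not None
```

For example, `keyword_match(url, r'\.js(\?|$)')` keeps `app.js?v=2` but drops `data.json`.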
As mentioned above, sign up to [urlscan.io](https://urlscan.io/user/signup) to get a **FREE** API key (there are also paid subscriptions available). It is recommended you get a key and put it into the `config.yml` file so that you can get more back (and quicker) from their API. NOTE: You will get rate limited unless you have a full paid subscription.
The archive.org Wayback Machine CDX API can sometimes require a huge number of requests to get all the links. For example, if you run **waymore** for `-i twitter.com` it says there are **28,903,799** requests to archive.org that need to be made (that could take almost 1000 days for some people!!!). The argument `-lr` can be used to limit the number of requests made per source (although it's usually archive.org that is the problem). The default value for the argument is 0 (Zero) which will apply no limit.
There is also a problem with the Wayback Machine CDX API where the number of pages returned is not correct when filters are applied and can cause issues (see https://github.com/internetarchive/wayback/issues/243). Until that issue is resolved, setting the `-lr` argument to a sensible value can help with that problem in the short term.
**The provider API servers aren't designed to cope with huge volumes, so be sensible and considerate about what you hit them with!**
When downloading archived responses, this can take a long time and can sometimes be killed by the machine for some reason, or manually killed by the user.
In the target's `results` directory, a file called `responses.tmp` is created at the start of the process and contains all the response URLs that will be retrieved. There will also be a file called `continueResp.tmp` that stores the index of the latest response retrieved. If `waymore` is run to get responses (`-mode R` or `-mode B`), and these files exist, it means there was a previous incomplete run, and you will be asked if you want to continue with that one instead. It will then continue from where it stopped before.
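The continue mechanism can be sketched as follows — `load_resume_state` is a hypothetical helper showing the idea, not waymore's exact file handling:

```python
import os

def load_resume_state(results_dir):
    # responses.tmp holds every response URL to fetch; continueResp.tmp
    # holds the index of the latest one retrieved
    resp_file = os.path.join(results_dir, 'responses.tmp')
    idx_file = os.path.join(results_dir, 'continueResp.tmp')
    if os.path.isfile(resp_file) and os.path.isfile(idx_file):
        with open(resp_file) as f:
            urls = f.read().splitlines()
        with open(idx_file) as f:
            start = int(f.read().strip())
        return urls, start   # continue from where the last run stopped
    return None, 0           # no previous incomplete run - fresh start
```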
## Some Basic Examples
### Example 1
Just get the URLs from all sources for `redbull.com` (`-mode U` is just for URLs, so no responses are downloaded):
The URLs are saved in the same path as `config.yml` (typically `~/.config/waymore`) under `results/redbull.com/waymore.txt`
### Example 2
Get ALL the URLs from Wayback for `redbull.com` (no filters are applied in `mode U` with `-f`, and no URLs are retrieved from Common Crawl, Alien Vault, URLScan and Virus Total, because `-xcc`, `-xav`, `-xus`, `-xvt` are passed respectively).
Save the FIRST 200 responses that are found starting from 2022 (`-l 200 -from 2022`):
The `-mode` wasn't explicitly set so defaults to `B` (Both - retrieve URLs AND download responses).
A file will be created for each unique response and also saved in `results/redbull.com/`:
There will also be a file `results/redbull.com/index.txt` that will contain a reference to what URLs gave the response for what file, e.g.
```
4847147712618,https://web.archive.org/web/20220426044405/https://www.redbull.com/additional-services/geo ,2022-06-24 20:07:50.603486
```
where `4847147712618` is the hash value of the response in `4847147712618.xnl`, the 2nd value is the Wayback Machine URL where you can view the actual page that was archived, and the 3rd is a time stamp of when the response was downloaded.
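If you want to process `index.txt` yourself, a minimal parser could look like this (a sketch; it splits only on the first two commas so the timestamp field stays intact):

```python
def parse_index_line(line):
    # index.txt lines are: hash,Wayback Machine URL,download timestamp
    file_hash, url, timestamp = line.split(',', 2)
    return file_hash, url.strip(), timestamp
```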
### Example 3
You can pipe waymore to other tools. Any errors are sent to `stderr` and any links found are sent to `stdout`. The output file is still created in addition to the links being piped to the next program. However, archived responses are not piped to the next program, but they are still written to files. For example:
```
waymore -i redbull.com -mode U | unfurl keys | sort -u
```
You can also pass the input through `stdin` instead of `-i`.
```
cat redbull_subs.txt | waymore
```
### Example 4
Sometimes you may just want to check how many requests will be made, and how long `waymore` is likely to take, if you ran it for a particular domain. You can do a quick check by using the `-co`/`--check-only` argument. For example:
```
waymore -i redbull.com --check-only
```
## Finding Way More URLs!
So now you have lots of archived responses and you want to find extra links? Easy! Why not use [xnLinkFinder](https://github.com/xnl-h4ck3r/xnLinkFinder)?
For example:
```
xnLinkFinder -i ~/Tools/waymore/results/redbull.com -sp https://www.redbull.com -sf redbull.com -o redbull.txt
```
Or run other tools such as [trufflehog](https://github.com/trufflesecurity/trufflehog) or [gf](https://github.com/tomnomnom/gf) over the directory of responses to find even more from the archived responses!
## Issues
If you come across any problems at all, or have ideas for improvements, please feel free to raise an issue on Github. If there is a problem, it will be useful if you can provide the exact command you ran and a detailed description of the problem. If possible, run with `-v` to reproduce the problem and let me know about any error messages that are given.
## TODO
- For `mode R`, if `404` responses are requested to be filtered, also check the responses of `200`s to see if the page was a custom 404 page.
- Add an `-oss` argument that accepts a file of Out Of Scope subdomains/URLs that will not be returned in the output, or have any responses downloaded
## References
- [Wayback CDX Server API - BETA](https://github.com/internetarchive/wayback/tree/master/wayback-cdx-server)
- [Common Crawl Index Server](https://index.commoncrawl.org/)
- [Alien Vault OTX API](https://otx.alienvault.com/assets/static/external_api.html)
- [URLScan API](https://urlscan.io/docs/api/)
- [VirusTotal API (v2)](https://docs.virustotal.com/v2.0/reference/getting-started)
Good luck and good hunting!
If you really love the tool (or any others), or they helped you find an awesome bounty, consider [BUYING ME A COFFEE!](https://ko-fi.com/xnlh4ck3r) ☕ (I could use the caffeine!)
🤘 /XNL-h4ck3r
waymore-3.7/config.yml
FILTER_CODE: 404,301,302
FILTER_MIME: text/css,image/jpeg,image/jpg,image/png,image/svg+xml,image/gif,image/tiff,image/webp,image/bmp,image/vnd,image/x-icon,image/vnd.microsoft.icon,font/ttf,font/woff,font/woff2,font/x-woff2,font/x-woff,font/otf,audio/mpeg,audio/wav,audio/webm,audio/aac,audio/ogg,audio/wav,audio/webm,video/mp4,video/mpeg,video/webm,video/ogg,video/mp2t,video/webm,video/x-msvideo,video/x-flv,application/font-woff,application/font-woff2,application/x-font-woff,application/x-font-woff2,application/vnd.ms-fontobject,application/font-sfnt,application/vnd.android.package-archive,binary/octet-stream,application/octet-stream,application/pdf,application/x-font-ttf,application/x-font-otf,video/webm,video/3gpp,application/font-ttf,audio/mp3,audio/x-wav,image/pjpeg,audio/basic,application/font-otf,application/x-ms-application,application/x-msdownload,video/x-ms-wmv,image/x-png,video/quicktime,image/x-ms-bmp,font/opentype,application/x-font-opentype,application/x-woff,audio/aiff
FILTER_URL: .css,.jpg,.jpeg,.png,.svg,.img,.gif,.mp4,.flv,.ogv,.webm,.webp,.mov,.mp3,.m4a,.m4p,.scss,.tif,.tiff,.ttf,.otf,.woff,.woff2,.bmp,.ico,.eot,.htc,.rtf,.swf,.image,/image,/img,/css,/wp-json,/wp-content,/wp-includes,/theme,/audio,/captcha,/font,node_modules,/jquery,/bootstrap
FILTER_KEYWORDS: admin,login,logon,signin,signup,register,registration,dash,portal,ftp,panel,.js,api,robots.txt,graph,gql,config,backup,debug,db,database,git,cgi-bin,swagger,zip,.rar,tar.gz,internal,jira,jenkins,confluence,atlassian,okta,corp,upload,delete,email,sql,create,edit,test,temp,cache,wsdl,log,payment,setting,mail,file,redirect,chat,billing,doc,trace,ftp,gateway,import,proxy,dev,stage,stg,uat,sonar.ci.,.cp.
URLSCAN_API_KEY:
VIRUSTOTAL_API_KEY:
CONTINUE_RESPONSES_IF_PIPED: True
WEBHOOK_DISCORD: YOUR_WEBHOOK
DEFAULT_OUTPUT_DIR:
waymore-3.7/requirements.txt
argparse
PyYAML
requests
setuptools
termcolor
psutil
urlparse3
tldextract
waymore-3.7/setup.py
#!/usr/bin/env python
import os
import shutil
from setuptools import setup, find_packages
target_directory = (
os.path.join(os.getenv('APPDATA', ''), 'waymore') if os.name == 'nt'
else os.path.join(os.path.expanduser("~"), ".config", "waymore") if os.name == 'posix'
else os.path.join(os.path.expanduser("~"), "Library", "Application Support", "waymore") if os.name == 'darwin'
else None
)
# Copy the config.yml file to the target directory if it exists
if target_directory and os.path.isfile("config.yml"):
os.makedirs(target_directory, exist_ok=True)
# If file already exists, create a new one
if os.path.isfile(target_directory+'/config.yml'):
configNew = True
os.rename(target_directory+'/config.yml',target_directory+'/config.yml.OLD')
shutil.copy("config.yml", target_directory)
os.rename(target_directory+'/config.yml',target_directory+'/config.yml.NEW')
os.rename(target_directory+'/config.yml.OLD',target_directory+'/config.yml')
else:
configNew = False
shutil.copy("config.yml", target_directory)
setup(
name="waymore",
packages=find_packages(),
version=__import__('waymore').__version__,
description="Find way more from the Wayback Machine, Common Crawl, Alien Vault OTX, URLScan & VirusTotal!",
long_description=open("README.md").read(),
author="@xnl-h4ck3r",
url="https://github.com/xnl-h4ck3r/waymore",
py_modules=["waymore"],
install_requires=["argparse","requests","pyyaml","termcolor","psutil","urlparse3","tldextract"],
entry_points={
'console_scripts': [
'waymore = waymore.waymore:main',
],
},
)
if configNew:
print('\n\033[33mIMPORTANT: The file '+target_directory+'/config.yml already exists.\nCreating config.yml.NEW but leaving existing config.\nIf you need the new file, then remove the current one and rename config.yml.NEW to config.yml\n\033[0m')
else:
print('\n\033[92mThe file '+target_directory+'/config.yml has been created.\n\033[0m')
waymore-3.7/waymore/__init__.py
__version__="3.7"
waymore-3.7/waymore/images/example1.png (binary PNG data omitted)