pax_global_header00006660000000000000000000000064143116707270014522gustar00rootroot0000000000000052 comment=10c34e6f07fa29c72da00b3635293ab38db0d20f python-email-validator-1.3.0/000077500000000000000000000000001431167072700161145ustar00rootroot00000000000000python-email-validator-1.3.0/.gitignore000066400000000000000000000002721431167072700201050ustar00rootroot00000000000000__pycache__/ *.py[cod] *$py.class *.so .Python build/ dist/ downloads/ eggs/ .eggs/ *.egg-info/ *.egg *.log docs/_build/ .python-version .env .venv env/ env27/ .idea/ .coverage htmlcov/ python-email-validator-1.3.0/.travis.yml000066400000000000000000000003331431167072700202240ustar00rootroot00000000000000os: linux dist: bionic language: python cache: pip python: - '3.6' #- '3.7' #- '3.8' - '3.9' - '3.10' install: - make install script: - make lint - make test after_success: - bash <(curl -s https://codecov.io/bash) python-email-validator-1.3.0/CHANGELOG.md000066400000000000000000000070021431167072700177240ustar00rootroot00000000000000Version 1.3.0 (September 18, 2022) ---------------------------------- * Deliverability checks now check for 'v=spf1 -all' SPF records as a way to reject more bad domains. * Special use domain names now raise EmailSyntaxError instead of EmailUndeliverableError since they are performed even if check_deliverability is off. * New module-level attributes are added to override the default values of the keyword arguments and the special-use domains list. * The keyword arguments of the public methods are now marked as keyword-only. * [pyIsEmail](https://github.com/michaelherold/pyIsEmail)'s test cases are added to the tests. * Recommend that check_deliverability be set to False for validation on login pages. * Added an undocumented globally_deliverable option. Version 1.2.1 (May 1, 2022) --------------------------- * example.com/net/org are removed from the special-use reserved domain names list so that they do not raise exceptions if check_deliverability is off. 
* Improved README. Version 1.2.0 (April 24, 2022) ------------------------------ * Reject domains with NULL MX records (when deliverability checks are turned on). * Reject unsafe unicode characters. (Some of these checks you should be doing on all of your user inputs already!) * Reject most special-use reserved domain names with EmailUndeliverableError. A new `test_environment` option is added for using `@*.test` domains. * Improved safety of exception text by not repeating an unsafe input character in the message. * Minor fixes in tests. * Invoking the module as a standalone program now caches DNS queries. * Improved README. Version 1.1.3 (June 12, 2021) ----------------------------- * Allow passing a custom dns_resolver so that a DNS cache and a custom timeout can be set. Version 1.1.2 (Nov 5, 2020) --------------------------- * Fix invoking the module as a standalone program. * Fix deprecation warning in Python 3.8. * Code improvements. * Improved README. Version 1.1.1 (May 19, 2020) ---------------------------- * Fix exception when DNS queries time out. * Improved README. Version 1.1.0 (April 30, 2020) ------------------------------ * The main function now returns an object with attributes rather than a dict with keys, but accessing the object in the old way is still supported. * Added overall email address length checks. * Minor tweak to regular expressions. * Improved error messages. * Added tests. * Linted source code files; changed README to Markdown. Version 1.0.5 (Oct 18, 2019) ---------------------------- * Prevent resolving domain names as if they were not fully qualified using a local search domain setting. Version 1.0.4 (May 2, 2019) --------------------------- * Added a timeout argument for DNS queries. * The wheel distribution is now a universal wheel. * Improved README. Version 1.0.3 (Sept 12, 2017) ----------------------------- * Added a wheel distribution for easier installation. 
Version 1.0.2 (Dec 30, 2016) ---------------------------- * Fix dnspython package name in Python 3. * Improved README. Version 1.0.1 (March 6, 2016) ----------------------------- * Fixed minor errors. Version 1.0.0 (Sept 5, 2015) ---------------------------- * Fail domains with a leading period. * Improved error messages. * Added tests. Version 0.5.0 (June 15, 2015) ----------------------------- * Use IDNA 2008 instead of IDNA 2003 and use the idna package's UTS46 normalization instead of our own. * Fixes for Python 2. * Improved error messages. * Improved README. Version 0.1.0 (April 21, 2015) ------------------------------ Initial release! python-email-validator-1.3.0/CONTRIBUTING.md000066400000000000000000000007201431167072700203440ustar00rootroot00000000000000## Public domain This project is in the public domain. Copyright and related rights in the work worldwide are waived through the [CC0 1.0 Universal public domain dedication][CC0]. See the LICENSE file in this directory. All contributions to this project must be released under the same CC0 waiver. By submitting a pull request or patch, you are agreeing to comply with this waiver of copyright interest. [CC0]: http://creativecommons.org/publicdomain/zero/1.0/ python-email-validator-1.3.0/LICENSE000066400000000000000000000156101431167072700171240ustar00rootroot00000000000000Creative Commons Legal Code CC0 1.0 Universal CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER. 
Statement of Purpose The laws of most jurisdictions throughout the world automatically confer exclusive Copyright and Related Rights (defined below) upon the creator and subsequent owner(s) (each and all, an "owner") of an original work of authorship and/or a database (each, a "Work"). Certain owners wish to permanently relinquish those rights to a Work for the purpose of contributing to a commons of creative, cultural and scientific works ("Commons") that the public can reliably and without fear of later claims of infringement build upon, modify, incorporate in other works, reuse and redistribute as freely as possible in any form whatsoever and for any purposes, including without limitation commercial purposes. These owners may contribute to the Commons to promote the ideal of a free culture and the further production of creative, cultural and scientific works, or to gain reputation or greater distribution for their Work in part through the use and efforts of others. For these and/or other purposes and motivations, and without any expectation of additional consideration or compensation, the person associating CC0 with a Work (the "Affirmer"), to the extent that he or she is an owner of Copyright and Related Rights in the Work, voluntarily elects to apply CC0 to the Work and publicly distribute the Work under its terms, with knowledge of his or her Copyright and Related Rights in the Work and the meaning and intended legal effect of CC0 on those rights. 1. Copyright and Related Rights. A Work made available under CC0 may be protected by copyright and related or neighboring rights ("Copyright and Related Rights"). Copyright and Related Rights include, but are not limited to, the following: i. the right to reproduce, adapt, distribute, perform, display, communicate, and translate a Work; ii. moral rights retained by the original author(s) and/or performer(s); iii. publicity and privacy rights pertaining to a person's image or likeness depicted in a Work; iv. 
rights protecting against unfair competition in regards to a Work, subject to the limitations in paragraph 4(a), below; v. rights protecting the extraction, dissemination, use and reuse of data in a Work; vi. database rights (such as those arising under Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, and under any national implementation thereof, including any amended or successor version of such directive); and vii. other similar, equivalent or corresponding rights throughout the world based on applicable law or treaty, and any national implementations thereof. 2. Waiver. To the greatest extent permitted by, but not in contravention of, applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and unconditionally waives, abandons, and surrenders all of Affirmer's Copyright and Related Rights and associated claims and causes of action, whether now known or unknown (including existing as well as future claims and causes of action), in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each member of the public at large and to the detriment of Affirmer's heirs and successors, fully intending that such Waiver shall not be subject to revocation, rescission, cancellation, termination, or any other legal or equitable action to disrupt the quiet enjoyment of the Work by the public as contemplated by Affirmer's express Statement of Purpose. 3. Public License Fallback. 
Should any part of the Waiver for any reason be judged legally invalid or ineffective under applicable law, then the Waiver shall be preserved to the maximum extent permitted taking into account Affirmer's express Statement of Purpose. In addition, to the extent the Waiver is so judged Affirmer hereby grants to each affected person a royalty-free, non transferable, non sublicensable, non exclusive, irrevocable and unconditional license to exercise Affirmer's Copyright and Related Rights in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "License"). The License shall be deemed effective as of the date CC0 was applied by Affirmer to the Work. Should any part of the License for any reason be judged legally invalid or ineffective under applicable law, such partial invalidity or ineffectiveness shall not invalidate the remainder of the License, and in such case Affirmer hereby affirms that he or she will not (i) exercise any of his or her remaining Copyright and Related Rights in the Work or (ii) assert any associated claims and causes of action with respect to the Work, in either case contrary to Affirmer's express Statement of Purpose. 4. Limitations and Disclaimers. a. No trademark or patent rights held by Affirmer are waived, abandoned, surrendered, licensed or otherwise affected by this document. b. 
Affirmer offers the Work as-is and makes no representations or warranties of any kind concerning the Work, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non infringement, or the absence of latent or other defects, accuracy, or the present or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law. c. Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof, including without limitation any person's Copyright and Related Rights in the Work. Further, Affirmer disclaims responsibility for obtaining any necessary consents, permissions or other rights required for any use of the Work. d. Affirmer understands and acknowledges that Creative Commons is not a party to this document and has no duty or obligation with respect to this CC0 or use of the Work. python-email-validator-1.3.0/MANIFEST.in000066400000000000000000000000651431167072700176530ustar00rootroot00000000000000include email_validator.py include LICENSE README.md python-email-validator-1.3.0/Makefile000066400000000000000000000013321431167072700175530ustar00rootroot00000000000000.DEFAULT_GOAL := all .PHONY: install install: pip install -U setuptools pip pip install -U -r test_requirements.txt pip install -e . .PHONY: lint lint: #python setup.py check -rms flake8 --ignore=E501,E126,W503 email_validator tests .PHONY: test test: pytest --cov=email_validator .PHONY: testcov testcov: test @echo "building coverage html" @coverage html .PHONY: all all: testcov lint .PHONY: clean clean: rm -rf `find . -name __pycache__` rm -f `find . -type f -name '*.py[co]' ` rm -f `find . -type f -name '*~' ` rm -f `find . 
-type f -name '.*~' ` rm -rf .cache rm -rf .pytest_cache rm -rf htmlcov rm -rf *.egg-info rm -f .coverage rm -f .coverage.* rm -rf build rm -rf dist python setup.py clean python-email-validator-1.3.0/README.md000066400000000000000000000533731431167072700174060ustar00rootroot00000000000000email-validator: Validate Email Addresses ========================================= A robust email address syntax and deliverability validation library for Python by [Joshua Tauberer](https://joshdata.me). This library validates that a string is of the form `name@example.com`. This is the sort of validation you would want for an email-based login form on a website. Key features: * Checks that an email address has the correct syntax --- good for login forms or other uses related to identifying users. * Gives friendly error messages when validation fails (appropriate to show to end users). * (optionally) Checks deliverability: Does the domain name resolve? And you can override the default DNS resolver. * Supports internationalized domain names and (optionally) internationalized local parts, but blocks unsafe characters. * Normalizes email addresses (super important for internationalized addresses! see below). The library is NOT for validation of the To: line in an email message (e.g. `My Name `), which [flanker](https://github.com/mailgun/flanker) is more appropriate for. And this library does NOT permit obsolete forms of email addresses, so if you need strict validation against the email specs exactly, use [pyIsEmail](https://github.com/michaelherold/pyIsEmail). This library is tested with Python 3.6+ but should work in earlier versions: [![Build Status](https://app.travis-ci.com/JoshData/python-email-validator.svg?branch=main)](https://app.travis-ci.com/JoshData/python-email-validator) View the [CHANGELOG / Release Notes](CHANGELOG.md) for the version history of changes in the library. 
Occasionally this README is ahead of the latest published package --- see the CHANGELOG for details. --- Installation ------------ This package [is on PyPI](https://pypi.org/project/email-validator/), so: ```sh pip install email-validator ``` `pip3` also works. Quick Start ----------- If you're validating a user's email address before creating a user account in your application, you might do this: ```python from email_validator import validate_email, EmailNotValidError email = "my+address@mydomain.tld" is_new_account = True # False for login pages try: # Check that the email address is valid. validation = validate_email(email, check_deliverability=is_new_account) # Take the normalized form of the email address # for all logic beyond this point (especially # before going to a database query where equality # may not take into account Unicode normalization). email = validation.email except EmailNotValidError as e: # Email is not valid. # The exception message is human-readable. print(str(e)) ``` This validates the address and gives you its normalized form. You should **put the normalized form in your database** and always normalize before checking if an address is in your database. When using this in a login form, set `check_deliverability` to `False` to avoid unnecessary DNS queries. Usage ----- ### Overview The module provides a function `validate_email(email_address)` which takes an email address (either a `str` or `bytes`, but only non-internationalized addresses are allowed when passing a `bytes`) and: - Raises an `EmailNotValidError` with a helpful, human-readable error message explaining why the email address is not valid, or - Returns an object with a normalized form of the email address (which you should use!) and other information about it. When an email address is not valid, `validate_email` raises either an `EmailSyntaxError` if the form of the address is invalid or an `EmailUndeliverableError` if the domain name fails DNS checks. 
Both exception classes are subclasses of `EmailNotValidError`, which in turn is a subclass of `ValueError`. But when an email address is valid, an object is returned containing a normalized form of the email address (which you should use!) and other information. The validator doesn't permit obsoleted forms of email addresses that no one uses anymore even though they are still valid and deliverable, since they will probably give you grief if you're using email for login. (See later in the document about that.) The validator checks that the domain name in the email address has a DNS MX record (except a NULL MX record) indicating that it can receive email and that it does not have a reject-all SPF record (`v=spf1 -all`) which would indicate that it cannot send email. There is nothing to be gained by trying to actually contact an SMTP server, so that's not done here. For privacy, security, and practicality reasons servers are good at not giving away whether an address is deliverable or not: email addresses that appear to accept mail at first can bounce mail after a delay, and bounced mail may indicate a temporary failure of a good email address (sometimes an intentional failure, like greylisting). ### Options The `validate_email` function also accepts the following keyword arguments (defaults are as shown below): `allow_smtputf8=True`: Set to `False` to prohibit internationalized addresses that would require the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) extension. You can also set `email_validator.ALLOW_SMTPUTF8` to `False` to turn it off for all calls by default. `check_deliverability=True`: If true, DNS queries check that a non-null MX (or A/AAAA record as an MX fallback) is present for the domain-part of the email address and that a reject-all SPF record is not present. Set to `False` to skip these DNS checks. DNS is slow and sometimes unavailable, so consider whether these checks are useful for your use case. 
It is recommended to pass `False` when performing validation for login pages (but not account creation pages) since re-validation of the domain by querying DNS at every login is probably undesirable. You can also set `email_validator.CHECK_DELIVERABILITY` to `False` to turn this off for all calls by default. `allow_empty_local=False`: Set to `True` to allow an empty local part (i.e. `@example.com`), e.g. for validating Postfix aliases. `dns_resolver=None`: Pass an instance of [dns.resolver.Resolver](https://dnspython.readthedocs.io/en/latest/resolver-class.html) to control the DNS resolver including setting a timeout and [a cache](https://dnspython.readthedocs.io/en/latest/resolver-caching.html). The `caching_resolver` function shown below is a helper function to construct a dns.resolver.Resolver with an [LRUCache](https://dnspython.readthedocs.io/en/latest/resolver-caching.html#dns.resolver.LRUCache). Reuse the same resolver instance across calls to `validate_email` to make use of the cache. `test_environment=False`: If `True`, DNS-based deliverability checks are disabled and `test` and `subdomain.test` domain names are permitted (see below). You can also set `email_validator.TEST_ENVIRONMENT` to `True` to turn it on for all calls by default. ### DNS timeout and cache When validating many email addresses or to control the timeout (the default is 15 seconds), create a caching [dns.resolver.Resolver](https://dnspython.readthedocs.io/en/latest/resolver-class.html) to reuse in each call. 
The `caching_resolver` function returns one easily for you: ```python from email_validator import validate_email, caching_resolver resolver = caching_resolver(timeout=10) while True: email = validate_email(email, dns_resolver=resolver).email ``` ### Test addresses This library rejects email addresses that use the [Special Use Domain Names](https://www.iana.org/assignments/special-use-domain-names/special-use-domain-names.xhtml) `invalid`, `localhost`, `test`, and some others by raising `EmailSyntaxError`. This is to protect your system from abuse: You probably don't want a user to be able to cause an email to be sent to `localhost`. However, in your non-production test environments you may want to use `@test` or `@myname.test` email addresses. There are three ways you can allow this: 1. Add `test_environment=True` to the call to `validate_email` (see above). 2. Set `email_validator.TEST_ENVIRONMENT` to `True`. 3. Remove the special-use domain name that you want to use from `email_validator.SPECIAL_USE_DOMAIN_NAMES`: ```python import email_validator email_validator.SPECIAL_USE_DOMAIN_NAMES.remove("test") ``` It is tempting to use `@example.com/net/org` in tests. These domains are reserved to IANA for use in documentation so there is no risk of accidentally emailing someone at those domains. But beware that this library will reject these domain names if DNS-based deliverability checks are not disabled because these domains do not resolve to domains that accept email. In tests, consider using your own domain name or `@test` or `@myname.test` instead. Internationalized email addresses --------------------------------- The email protocol SMTP and the domain name system DNS have historically only allowed English (ASCII) characters in email addresses and domain names, respectively. Each has adapted to internationalization in a separate way, creating two separate aspects to email address internationalization. 
### Internationalized domain names (IDN) The first is [internationalized domain names (RFC 5891)](https://tools.ietf.org/html/rfc5891), a.k.a. IDNA 2008. The DNS system has not been updated with Unicode support. Instead, internationalized domain names are converted into a special IDNA ASCII "[Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt)" form starting with `xn--`. When an email address has non-ASCII characters in its domain part, the domain part is replaced with its IDNA ASCII equivalent form in the process of mail transmission. Your mail submission library probably does this for you transparently. Note that most web browsers are currently in transition between IDNA 2003 (RFC 3490) and IDNA 2008 (RFC 5891) and [compliance around the web is not very good](http://archives.miloush.net/michkap/archive/2012/02/27/10273315.html) in any case, so be aware that edge cases are handled differently by different applications and libraries. This library conforms to IDNA 2008 using the [idna](https://github.com/kjd/idna) module by Kim Davies. ### Internationalized local parts The second sort of internationalization is internationalization in the *local* part of the address (before the @-sign). In non-internationalized email addresses, only English letters, numbers, and some punctuation (`._!#$%&'^``*+-=~/?{|}`) are allowed. In internationalized email address local parts, a wider range of Unicode characters is allowed. A surprisingly large number of Unicode characters are not safe to display, especially when the email address is concatenated with other text, so this library tries to protect you by not permitting reserved, non-, private use, formatting (which can be used to alter the display order of characters), whitespace, and control characters, and combining characters as the first character (so that they cannot combine with something outside of the email address string). See https://qntm.org/safe and https://trojansource.codes/ for relevant prior work. 
(Other than whitespace, these are checks that you should be applying to nearly all user inputs in a security-sensitive context.) These character checks are performed after Unicode normalization (see below), so you are only fully protected if you replace all user-provided email addresses with the normalized email address string returned by this library. This does not guard against the well known problem that many Unicode characters look alike (or are identical), which can be used to fool humans reading displayed text. Email addresses with these non-ASCII characters require that your mail submission library and the mail servers along the route to the destination, including your own outbound mail server, all support the [SMTPUTF8 (RFC 6531)](https://tools.ietf.org/html/rfc6531) extension. Support for SMTPUTF8 varies. See the `allow_smtputf8` parameter. ### If you know ahead of time that SMTPUTF8 is not supported by your mail submission stack By default all internationalized forms are accepted by the validator. But if you know ahead of time that SMTPUTF8 is not supported by your mail submission stack, then you must filter out addresses that require SMTPUTF8 using the `allow_smtputf8=False` keyword argument (see above). This will cause the validation function to raise an `EmailSyntaxError` if delivery would require SMTPUTF8. That's just in those cases where non-ASCII characters appear before the @-sign. If you do not set `allow_smtputf8=False`, you can also check the value of the `smtputf8` field in the returned object. If your mail submission library doesn't support Unicode at all --- even in the domain part of the address --- then immediately prior to mail submission you must replace the email address with its ASCII-ized form. 
This library gives you back the ASCII-ized form in the `ascii_email` field in the returned object, which you can get like this: ```python valid = validate_email(email, allow_smtputf8=False) email = valid.ascii_email ``` The local part is left alone (if it has internationalized characters `allow_smtputf8=False` will force validation to fail) and the domain part is converted to [IDNA ASCII](https://tools.ietf.org/html/rfc5891). (You probably should not do this at account creation time so you don't change the user's login information without telling them.) ### UCS-4 support required for Python 2.7 This library hopefully still works with Python 2.7. Note that when using Python 2.7, it is required that it was built with UCS-4 support (see [here](https://stackoverflow.com/questions/29109944/python-returns-length-of-2-for-single-unicode-character-string)); otherwise emails with unicode characters outside of the BMP (Basic Multilingual Plane) will not validate correctly. Normalization ------------- The use of Unicode in email addresses introduced a normalization problem. Different Unicode strings can look identical and have the same semantic meaning to the user. The `email` field returned on successful validation provides the correctly normalized form of the given email address: ```python valid = validate_email("me@Domain.com") email = valid.email print(email) # prints: me@domain.com ``` Because an end-user might type their email address in different (but equivalent) un-normalized forms at different times, you ought to replace what they enter with the normalized form immediately prior to going into your database (during account creation), querying your database (during login), or sending outbound mail. Normalization may also change the length of an email address, and this may affect whether it is valid and acceptable by your SMTP provider. 
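To make the normalization problem concrete, here is a minimal sketch that uses only Python's standard `unicodedata` module, not this library's API. It shows two strings that render identically but compare unequal until NFC normalization is applied, and that normalization can change a string's length. The variable names are illustrative only, and the library applies further steps beyond NFC.

```python
import unicodedata

# Two ways to write the same visible character.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
combining = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT

# The raw strings differ code point by code point...
same_before = precomposed == combining

# ...but NFC composes the pair into the single precomposed character.
normalized = unicodedata.normalize("NFC", combining)
same_after = precomposed == normalized

# NFC also replaces some symbols with equivalent letters:
# ANGSTROM SIGN (U+212B) becomes LATIN CAPITAL LETTER A WITH RING ABOVE (U+00C5).
angstrom = unicodedata.normalize("NFC", "\u212b")

print(same_before, same_after)          # False True
print(len(combining), len(normalized))  # 2 1
print(hex(ord(angstrom)))               # 0xc5
```

Comparing or storing the un-normalized strings would treat one visible address as two different ones, which is why the normalized form is the one to keep.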
The normalizations include lowercasing the domain part of the email address (domain names are case-insensitive), [Unicode "NFC" normalization](https://en.wikipedia.org/wiki/Unicode_equivalence) of the whole address (which turns characters plus [combining characters](https://en.wikipedia.org/wiki/Combining_character) into precomposed characters where possible), replacement of [fullwidth and halfwidth characters](https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms) in the domain part, possibly other [UTS46](http://unicode.org/reports/tr46) mappings on the domain part, and conversion from Punycode to Unicode characters. (See [RFC 6532 (internationalized email) section 3.1](https://tools.ietf.org/html/rfc6532#section-3.1) and [RFC 5895 (IDNA 2008) section 2](http://www.ietf.org/rfc/rfc5895.txt).) Examples -------- For the email address `test@joshdata.me`, the returned object is: ```python ValidatedEmail( email='test@joshdata.me', local_part='test', domain='joshdata.me', ascii_email='test@joshdata.me', ascii_local_part='test', ascii_domain='joshdata.me', smtputf8=False) ``` For the fictitious address `example@ツ.life`, which has an internationalized domain but ASCII local part, the returned object is: ```python ValidatedEmail( email='example@ツ.life', local_part='example', domain='ツ.life', ascii_email='example@xn--bdk.life', ascii_local_part='example', ascii_domain='xn--bdk.life', smtputf8=False) ``` Note that `smtputf8` is `False` even though the domain part is internationalized because [SMTPUTF8](https://tools.ietf.org/html/rfc6531) is only needed if the local part of the address is internationalized (the domain part can be converted to IDNA ASCII Punycode). Also note that the `email` and `domain` fields provide a normalized form of the email address and domain name (casefolding and Unicode normalization as required by IDNA 2008). 
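The relationship between the Unicode and Punycode forms of the domain above can be sketched with the standard library alone. This is an illustration under a stated assumption, not how this library does it: Python's built-in `idna` codec implements IDNA 2003, while this library uses the third-party `idna` package for IDNA 2008, and the two can disagree on edge cases. For this simple domain the result matches the `ascii_domain` shown above.

```python
# Standard-library sketch only: the built-in "idna" codec is IDNA 2003,
# not the IDNA 2008 implementation this library relies on.
domain = "ツ.life"

# Unicode -> ASCII: each non-ASCII label becomes an "xn--" Punycode label.
ascii_domain = domain.encode("idna")

# ASCII -> Unicode: decoding recovers the original internationalized form.
round_trip = ascii_domain.decode("idna")

print(ascii_domain)          # b'xn--bdk.life'
print(round_trip == domain)  # True
```

Both forms name the same domain, which is why a validated address can carry a Unicode `domain` and an ASCII `ascii_domain` side by side.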
Calling `validate_email` with the ASCII form of the above email address, `example@xn--bdk.life`, returns the exact same information (i.e., the `email` field always will contain Unicode characters, not Punycode).

For the fictitious address `ツ-test@joshdata.me`, which has an internationalized local part, the returned object is:

```python
ValidatedEmail(
  email='ツ-test@joshdata.me',
  local_part='ツ-test',
  domain='joshdata.me',
  ascii_email=None,
  ascii_local_part=None,
  ascii_domain='joshdata.me',
  smtputf8=True)
```

Now `smtputf8` is `True` and `ascii_email` is `None` because the local part of the address is internationalized. The `local_part` and `email` fields return the normalized form of the address: certain Unicode characters (such as angstrom and ohm) may be replaced by other equivalent code points (a-with-ring and omega).

Return value
------------

When an email address passes validation, the fields in the returned object are:

| Field | Value |
| -----:|-------|
| `email` | The normalized form of the email address that you should put in your database. This merely combines the `local_part` and `domain` fields (see below). |
| `ascii_email` | If set, an ASCII-only form of the email address by replacing the domain part with [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt). This field will be present when an ASCII-only form of the email address exists (including if the email address is already ASCII). If the local part of the email address contains internationalized characters, `ascii_email` will be `None`. If set, it merely combines `ascii_local_part` and `ascii_domain`. |
| `local_part` | The local part of the given email address (before the @-sign) with Unicode NFC normalization applied. |
| `ascii_local_part` | If set, the local part, which is composed of ASCII characters only. |
| `domain` | The canonical internationalized Unicode form of the domain part of the email address. If the returned string contains non-ASCII characters, either the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your mail relay will be required to transmit the message or else the email address's domain part must be converted to IDNA ASCII first: use the `ascii_domain` field instead. |
| `ascii_domain` | The [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt)-encoded form of the domain part of the given email address, as it would be transmitted on the wire. |
| `smtputf8` | A boolean indicating that the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your mail relay will be required to transmit messages to this address because the local part of the address has non-ASCII characters (the local part cannot be IDNA-encoded). If `allow_smtputf8=False` is passed as an argument, this flag will always be false because an exception is raised if it would have been true. |
| `mx` | A list of (priority, domain) tuples of MX records specified in the DNS for the domain (see [RFC 5321 section 5](https://tools.ietf.org/html/rfc5321#section-5)). May be `None` if the deliverability check could not be completed because of a temporary issue like a timeout. |
| `mx_fallback_type` | `None` if an `MX` record is found. If no MX records are actually specified in DNS and instead are inferred, through an obsolete mechanism, from A or AAAA records, the value is the type of DNS record used instead (`A` or `AAAA`). May be `None` if the deliverability check could not be completed because of a temporary issue like a timeout. |
| `spf` | Any SPF record found while checking deliverability. |

Assumptions
-----------

By design, this validator does not pass all email addresses that strictly conform to the standards. Many email address forms are obsolete or likely to cause trouble:

* The validator assumes the email address is intended to be usable on the public Internet. The domain part of the email address must be a resolvable domain name (without NULL MX or SPF -all DNS records) if deliverability checks are turned on. Most [Special Use Domain Names](https://www.iana.org/assignments/special-use-domain-names/special-use-domain-names.xhtml) and their subdomains and domain names without a `.` are rejected as a syntax error (except see the `test_environment` parameter above).
* Obsolete email syntaxes are rejected: The "quoted string" form of the local part of the email address (RFC 5321 4.1.2) is not permitted. Quoted forms allow multiple @-signs, space characters, and other troublesome conditions. The unusual [(comment) syntax](https://github.com/JoshData/python-email-validator/issues/77) is also rejected. The "literal" form for the domain part of an email address (an IP address in brackets) is rejected. Other obsolete and deprecated syntaxes are rejected. No one uses these forms anymore.

Testing
-------

Tests can be run using

```sh
pip install -r test_requirements.txt
make test
```

For Project Maintainers
-----------------------

The package is distributed as a universal wheel and as a source package.

To release:

* Update CHANGELOG.md.
* Update the version number in setup.cfg.
* Make a commit with the new version number.
* Follow the steps below to publish source and a universal wheel to pypi and tag the release.
* Make a release at https://github.com/JoshData/python-email-validator/releases/new.

```sh
./release_to_pypi.sh
git tag v$(grep version setup.cfg | sed "s/.*= //")
git push --tags
```

Notes: The wheel is specified as universal in the file `setup.cfg` by the `universal = 1` key in the `[bdist_wheel]` section.
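
Appendix: a note on normalization. Earlier this README says that characters such as the angstrom and ohm signs may be replaced by equivalent code points (a-with-ring and omega) in the `local_part` and `email` fields. That behaviour comes from Unicode NFC normalization. The sketch below reproduces the effect using only Python's standard `unicodedata` module; `normalize_local_part` is a hypothetical helper written for this illustration and is not part of the library's API.

```python
import unicodedata


def normalize_local_part(local):
    """Hypothetical helper: apply Unicode NFC normalization to a local part."""
    return unicodedata.normalize("NFC", local)


# U+212B ANGSTROM SIGN and U+2126 OHM SIGN are compatibility duplicates of
# ordinary letters; NFC replaces each with its canonical equivalent.
for original in ("\u212b", "\u2126"):
    normalized = normalize_local_part(original)
    print(unicodedata.name(original), "->", unicodedata.name(normalized))
```

This is one reason to store the `email` field from the returned object rather than the raw user input: two visually identical inputs can otherwise compare as different strings.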
python-email-validator-1.3.0/email_validator/000077500000000000000000000000001431167072700212505ustar00rootroot00000000000000python-email-validator-1.3.0/email_validator/__init__.py000066400000000000000000001015261431167072700233660ustar00rootroot00000000000000
# -*- coding: utf-8 -*-

import sys
import re
import unicodedata
import dns.resolver
import dns.exception
import idna  # implements IDNA 2008; Python's codec is only IDNA 2003

# Default values for keyword arguments.

ALLOW_SMTPUTF8 = True
CHECK_DELIVERABILITY = True
TEST_ENVIRONMENT = False
GLOBALLY_DELIVERABLE = True
DEFAULT_TIMEOUT = 15  # secs

# Based on RFC 2822 section 3.2.4 / RFC 5322 section 3.2.3, these
# characters are permitted in email addresses (not taking into
# account internationalization):
ATEXT = r'a-zA-Z0-9_!#\$%&\'\*\+\-/=\?\^`\{\|\}~'

# A "dot atom text", per RFC 2822 3.2.4:
DOT_ATOM_TEXT = '[' + ATEXT + ']+(?:\\.[' + ATEXT + ']+)*'

# RFC 6531 section 3.3 extends the allowed characters in internationalized
# addresses to also include three specific ranges of UTF8 defined in
# RFC3629 section 4, which appear to be the Unicode code points from
# U+0080 to U+10FFFF.
ATEXT_INTL = ATEXT + u"\u0080-\U0010FFFF"
DOT_ATOM_TEXT_INTL = '[' + ATEXT_INTL + ']+(?:\\.[' + ATEXT_INTL + ']+)*'

# The domain part of the email address, after IDNA (ASCII) encoding,
# must also satisfy the requirements of RFC 952/RFC 1123 which restrict
# the allowed characters of hostnames further. The hyphen cannot be at
# the beginning or end of a *dot-atom component* of a hostname either.
ATEXT_HOSTNAME = r'(?:(?:[a-zA-Z0-9][a-zA-Z0-9\-]*)?[a-zA-Z0-9])'

# Length constants
# RFC 3696 + errata 1003 + errata 1690 (https://www.rfc-editor.org/errata_search.php?rfc=3696&eid=1690)
# explains the maximum length of an email address is 254 octets.
EMAIL_MAX_LENGTH = 254
LOCAL_PART_MAX_LENGTH = 64
DOMAIN_MAX_LENGTH = 255

# IANA Special Use Domain Names
# Last Updated 2021-09-21
# https://www.iana.org/assignments/special-use-domain-names/special-use-domain-names.txt
#
# The domain names without dots would be caught by the check that the domain
# name in an email address must have a period, but this list will also catch
# subdomains of these domains, which are also reserved.
SPECIAL_USE_DOMAIN_NAMES = [
    # The "arpa" entry here is consolidated from a lot of arpa subdomains
    # for private address (i.e. non-routable IP addresses like 172.16.x.x)
    # reverse mapping, plus some other subdomains. Although RFC 6761 says
    # that application software should not treat these domains as special,
    # they are private-use domains and so cannot have globally deliverable
    # email addresses, which is an assumption of this library, and probably
    # all of arpa is similarly special-use, so we reject it all.
    "arpa",

    # RFC 6761 says applications "SHOULD NOT" treat the "example" domains
    # as special, i.e. applications should accept these domains.
    #
    # The domain "example" alone fails our syntax validation because it
    # lacks a dot (we assume no one has an email address on a TLD directly).
    # "@example.com/net/org" will currently fail DNS-based deliverability
    # checks because IANA publishes a NULL MX for these domains, and
    # "@mail.example[.com/net/org]" and other subdomains will fail DNS-
    # based deliverability checks because IANA does not publish MX or A
    # DNS records for these subdomains.
    # "example", # i.e. "www.example"
    # "example.com",
    # "example.net",
    # "example.org",

    # RFC 6761 says that applications are permitted to treat this domain
    # as special and that DNS should return an immediate negative response,
    # so we also immediately reject this domain, which also follows the
    # purpose of the domain.
    "invalid",

    # RFC 6762 says that applications "may" treat ".local" as special and
    # that "name resolution APIs and libraries SHOULD recognize these names
    # as special," and since ".local" has no global definition, we reject
    # it, as we expect email addresses to be globally routable.
    "local",

    # RFC 6761 says that applications (like this library) are permitted
    # to treat "localhost" as special, and since it cannot have a globally
    # deliverable email address, we reject it.
    "localhost",

    # RFC 7686 says "applications that do not implement the Tor protocol
    # SHOULD generate an error upon the use of .onion and SHOULD NOT
    # perform a DNS lookup."
    "onion",

    # Although RFC 6761 says that application software should not treat
    # these domains as special, it also warns users that the address may
    # resolve differently in different systems, and therefore it cannot
    # have a globally routable email address, which is an assumption of
    # this library, so we reject "@test" and "@*.test" addresses, unless
    # the test_environment keyword argument is given, to allow their use
    # in application-level test environments. These domains will generally
    # fail deliverability checks because "test" is not an actual TLD.
    "test",
]

# ease compatibility in type checking
if sys.version_info >= (3,):
    unicode_class = str
else:
    unicode_class = unicode  # noqa: F821

    # turn regexes to unicode (because 'ur' literals are not allowed in Py3)
    ATEXT = ATEXT.decode("ascii")
    DOT_ATOM_TEXT = DOT_ATOM_TEXT.decode("ascii")
    ATEXT_HOSTNAME = ATEXT_HOSTNAME.decode("ascii")


class EmailNotValidError(ValueError):
    """Parent class of all exceptions raised by this module."""
    pass


class EmailSyntaxError(EmailNotValidError):
    """Exception raised when an email address fails validation because of its form."""
    pass


class EmailUndeliverableError(EmailNotValidError):
    """Exception raised when an email address fails validation because its domain name does not appear deliverable."""
    pass


class ValidatedEmail(object):
    """The validate_email function returns objects of this type holding the normalized form of the email address
    and other information."""

    """The email address that was passed to validate_email. (If passed as bytes, this will be a string.)"""
    original_email = None

    """The normalized email address, which should always be used in preference to the original address.
    The normalized address converts an IDNA ASCII domain name to Unicode, if possible, and performs
    Unicode normalization on the local part and on the domain (if originally Unicode).
    It is the concatenation of the local_part and domain attributes, separated by an @-sign."""
    email = None

    """The local part of the email address after Unicode normalization."""
    local_part = None

    """The domain part of the email address after Unicode normalization or conversion to
    Unicode from IDNA ascii."""
    domain = None

    """If not None, a form of the email address that uses 7-bit ASCII characters only."""
    ascii_email = None

    """If not None, the local part of the email address using 7-bit ASCII characters only."""
    ascii_local_part = None

    """If not None, a form of the domain name that uses 7-bit ASCII characters only."""
    ascii_domain = None

    """If True, the SMTPUTF8 feature of your mail relay will be required to transmit messages
    to this address. This flag is True just when ascii_local_part is missing. Otherwise it
    is False."""
    smtputf8 = None

    """If a deliverability check is performed and if it succeeds, a list of (priority, domain)
    tuples of MX records specified in the DNS for the domain."""
    mx = None

    """If no MX records are actually specified in DNS and instead are inferred, through an obsolete
    mechanism, from A or AAAA records, the value is the type of DNS record used instead (`A` or `AAAA`)."""
    mx_fallback_type = None

    """Tests use this constructor."""
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)

    """As a convenience, str(...) on instances of this class return the normalized address."""
    def __str__(self):
        # The normalized address is stored in the email attribute.
        return self.email

    def __repr__(self):
        return "<ValidatedEmail {}>".format(self.email)

    """For backwards compatibility, some fields are also exposed through a dict-like interface.
    Note that some of the names changed when they became attributes."""
    def __getitem__(self, key):
        if key == "email":
            return self.email
        if key == "email_ascii":
            return self.ascii_email
        if key == "local":
            return self.local_part
        if key == "domain":
            return self.ascii_domain
        if key == "domain_i18n":
            return self.domain
        if key == "smtputf8":
            return self.smtputf8
        if key == "mx":
            return self.mx
        if key == "mx-fallback":
            return self.mx_fallback_type
        raise KeyError()

    """Tests use this."""
    def __eq__(self, other):
        if not isinstance(other, ValidatedEmail):
            return False
        return (
            self.email == other.email
            and self.local_part == other.local_part
            and self.domain == other.domain
            and self.ascii_email == other.ascii_email
            and self.ascii_local_part == other.ascii_local_part
            and self.ascii_domain == other.ascii_domain
            and self.smtputf8 == other.smtputf8
            and repr(sorted(self.mx) if self.mx else self.mx)
            == repr(sorted(other.mx) if other.mx else other.mx)
            and self.mx_fallback_type == other.mx_fallback_type
        )

    """This helps producing the README."""
    def as_constructor(self):
        return "ValidatedEmail(" \
            + ",".join("\n {}={}".format(
                       key,
                       repr(getattr(self, key)))
                       for key in ('email', 'local_part', 'domain',
                                   'ascii_email', 'ascii_local_part', 'ascii_domain',
                                   'smtputf8', 'mx', 'mx_fallback_type')
                       ) \
            + ")"

    """Convenience method for accessing ValidatedEmail as a dict"""
    def as_dict(self):
        return self.__dict__


def __get_length_reason(addr, utf8=False, limit=EMAIL_MAX_LENGTH):
    diff = len(addr) - limit
    reason = "({}{} character{} too many)"
    prefix = "at least " if utf8 else ""
    suffix = "s" if diff > 1 else ""
    return reason.format(prefix, diff, suffix)


def caching_resolver(*, timeout=None, cache=None):
    if timeout is None:
        timeout = DEFAULT_TIMEOUT
    resolver = dns.resolver.Resolver()
    resolver.cache = cache or dns.resolver.LRUCache()
    resolver.lifetime = timeout  # timeout, in seconds
    return resolver


def validate_email(
    email,
    # /, # not supported in Python 3.6, 3.7
    *,
    allow_smtputf8=None,
    allow_empty_local=False,
    check_deliverability=None,
    test_environment=None,
    globally_deliverable=GLOBALLY_DELIVERABLE,
    timeout=None,
    dns_resolver=None
):
    """
    Validates an email address, raising an EmailNotValidError if the address is not valid or returning a
    ValidatedEmail object holding information about the address when it is valid. The email argument can
    be a str or a bytes instance, but if bytes it must be ASCII-only.
    """

    # Fill in default values of arguments.
    if allow_smtputf8 is None:
        allow_smtputf8 = ALLOW_SMTPUTF8
    if check_deliverability is None:
        check_deliverability = CHECK_DELIVERABILITY
    if test_environment is None:
        test_environment = TEST_ENVIRONMENT
    if timeout is None:
        timeout = DEFAULT_TIMEOUT

    # Allow email to be a str or bytes instance. If bytes,
    # it must be ASCII because that's how the bytes work
    # on the wire with SMTP.
    if not isinstance(email, (str, unicode_class)):
        try:
            email = email.decode("ascii")
        except ValueError:
            raise EmailSyntaxError("The email address is not valid ASCII.")

    # At-sign.
    parts = email.split('@')
    if len(parts) != 2:
        raise EmailSyntaxError("The email address is not valid. It must have exactly one @-sign.")

    # Collect return values in this instance.
    ret = ValidatedEmail()
    ret.original_email = email

    # Validate the email address's local part syntax and get a normalized form.
    local_part_info = validate_email_local_part(parts[0],
                                                allow_smtputf8=allow_smtputf8,
                                                allow_empty_local=allow_empty_local)
    ret.local_part = local_part_info["local_part"]
    ret.ascii_local_part = local_part_info["ascii_local_part"]
    ret.smtputf8 = local_part_info["smtputf8"]

    # Validate the email address's domain part syntax and get a normalized form.
    domain_part_info = validate_email_domain_part(parts[1],
                                                  test_environment=test_environment,
                                                  globally_deliverable=globally_deliverable)
    ret.domain = domain_part_info["domain"]
    ret.ascii_domain = domain_part_info["ascii_domain"]

    # Construct the complete normalized form.
    ret.email = ret.local_part + "@" + ret.domain

    # If the email address has an ASCII form, add it.
    if not ret.smtputf8:
        ret.ascii_email = ret.ascii_local_part + "@" + ret.ascii_domain

    # If the email address has an ASCII representation, then we assume it may be
    # transmitted in ASCII (we can't assume SMTPUTF8 will be used on all hops to
    # the destination) and the length limit applies to ASCII characters (which is
    # the same as octets). The number of characters in the internationalized form
    # may be many fewer (because IDNA ASCII is verbose) and could be less than 254
    # Unicode characters, and of course the number of octets over the limit may
    # not be the number of characters over the limit, so if the email address is
    # internationalized, we can't give any simple information about why the address
    # is too long.
    #
    # In addition, check that the UTF-8 encoding (i.e. not IDNA ASCII and not
    # Unicode characters) is at most 254 octets. If the address is transmitted using
    # SMTPUTF8, then the length limit probably applies to the UTF-8 encoded octets.
    # If the email address has an ASCII form that differs from its internationalized
    # form, I don't think the internationalized form can be longer, and so the ASCII
    # form length check would be sufficient. If there is no ASCII form, then we have
    # to check the UTF-8 encoding. The UTF-8 encoding could be up to about four times
    # longer than the number of characters.
    #
    # See the length checks on the local part and the domain.
    if ret.ascii_email and len(ret.ascii_email) > EMAIL_MAX_LENGTH:
        if ret.ascii_email == ret.email:
            reason = __get_length_reason(ret.ascii_email)
        elif len(ret.email) > EMAIL_MAX_LENGTH:
            # If there are more than 254 characters, then the ASCII
            # form is definitely going to be too long.
            reason = __get_length_reason(ret.email, utf8=True)
        else:
            reason = "(when converted to IDNA ASCII)"
        raise EmailSyntaxError("The email address is too long {}.".format(reason))
    if len(ret.email.encode("utf8")) > EMAIL_MAX_LENGTH:
        if len(ret.email) > EMAIL_MAX_LENGTH:
            # If there are more than 254 characters, then the UTF-8
            # encoding is definitely going to be too long.
            reason = __get_length_reason(ret.email, utf8=True)
        else:
            reason = "(when encoded in bytes)"
        raise EmailSyntaxError("The email address is too long {}.".format(reason))

    if check_deliverability and not test_environment:
        # Validate the email address's deliverability using DNS
        # and update the return dict with metadata.
        deliverability_info = validate_email_deliverability(
            ret["domain"], ret["domain_i18n"], timeout, dns_resolver
        )
        for key, value in deliverability_info.items():
            setattr(ret, key, value)

    return ret


def validate_email_local_part(local, allow_smtputf8=True, allow_empty_local=False):
    # Validates the local part of an email address.

    if len(local) == 0:
        if not allow_empty_local:
            raise EmailSyntaxError("There must be something before the @-sign.")
        else:
            # The caller allows an empty local part. Useful for validating certain
            # Postfix aliases.
            return {
                "local_part": local,
                "ascii_local_part": local,
                "smtputf8": False,
            }

    # RFC 5321 4.5.3.1.1
    # We're checking the number of characters here. If the local part
    # is ASCII-only, then that's the same as bytes (octets). If it's
    # internationalized, then the UTF-8 encoding may be longer, but
    # that may not be relevant. We will check the total address length
    # instead.
    if len(local) > LOCAL_PART_MAX_LENGTH:
        reason = __get_length_reason(local, limit=LOCAL_PART_MAX_LENGTH)
        raise EmailSyntaxError("The email address is too long before the @-sign {}.".format(reason))

    # Check the local part against the regular expression for the older ASCII requirements.
    m = re.match(DOT_ATOM_TEXT + "\\Z", local)
    if m:
        # Return the local part unchanged and flag that SMTPUTF8 is not needed.
        return {
            "local_part": local,
            "ascii_local_part": local,
            "smtputf8": False,
        }

    else:
        # The local part failed the ASCII check. Now try the extended internationalized requirements.
        m = re.match(DOT_ATOM_TEXT_INTL + "\\Z", local)
        if not m:
            # It's not a valid internationalized address either. Report which characters were not valid.
            bad_chars = ', '.join(sorted(set(
                unicodedata.name(c, repr(c))
                for c in local
                if not re.match(u"[" + (ATEXT if not allow_smtputf8 else ATEXT_INTL) + u"]", c)
            )))
            raise EmailSyntaxError("The email address contains invalid characters before the @-sign: %s." % bad_chars)

        # It would be valid if internationalized characters were allowed by the caller.
        if not allow_smtputf8:
            raise EmailSyntaxError("Internationalized characters before the @-sign are not supported.")

        # It's valid.

        # RFC 6532 section 3.1 also says that Unicode NFC normalization should be applied,
        # so we'll return the normalized local part in the return value.
        local = unicodedata.normalize("NFC", local)

        # Check for unsafe characters.
        # Some of this may be redundant with the range U+0080 to U+10FFFF that is checked
        # by DOT_ATOM_TEXT_INTL.
        for i, c in enumerate(local):
            category = unicodedata.category(c)
            if category[0] in ("L", "N", "P", "S"):
                # letters, numbers, punctuation, and symbols are permitted
                pass
            elif category[0] == "M":
                # combining character in first position would combine with something
                # outside of the email address if concatenated to the right, but are
                # otherwise permitted
                if i == 0:
                    raise EmailSyntaxError("The email address contains an initial invalid character (%s)."
                                           % unicodedata.name(c, repr(c)))
            elif category[0] in ("Z", "C"):
                # spaces and line/paragraph characters (Z) and
                # control, format, surrogate, private use, and unassigned code points (C)
                raise EmailSyntaxError("The email address contains an invalid character (%s)."
                                       % unicodedata.name(c, repr(c)))
            else:
                # All categories should be handled above, but in case there is something new
                # in the future.
                raise EmailSyntaxError("The email address contains a character (%s; category %s) that may not be safe."
                                       % (unicodedata.name(c, repr(c)), category))

        # Try encoding to UTF-8. Failure is possible with some characters like
        # surrogate code points, but those are checked above. Still, we don't
        # want to have an unhandled exception later.
        try:
            local.encode("utf8")
        except ValueError:
            raise EmailSyntaxError("The email address contains an invalid character.")

        # Flag that SMTPUTF8 will be required for deliverability.
        return {
            "local_part": local,
            "ascii_local_part": None,  # no ASCII form is possible
            "smtputf8": True,
        }


def validate_email_domain_part(domain, test_environment=False, globally_deliverable=True):
    # Empty?
    if len(domain) == 0:
        raise EmailSyntaxError("There must be something after the @-sign.")

    # Perform UTS-46 normalization, which includes casefolding, NFC normalization,
    # and converting all label separators (the period/full stop, fullwidth full stop,
    # ideographic full stop, and halfwidth ideographic full stop) to basic periods.
    # It will also raise an exception if there is an invalid character in the input,
    # such as "⒈" which is invalid because it would expand to include a period.
    try:
        domain = idna.uts46_remap(domain, std3_rules=False, transitional=False)
    except idna.IDNAError as e:
        raise EmailSyntaxError("The domain name %s contains invalid characters (%s)." % (domain, str(e)))

    # Now we can perform basic checks on the use of periods (since equivalent
    # symbols have been mapped to periods). These checks are needed because the
    # IDNA library doesn't handle well domains that have empty labels (i.e. initial
    # dot, trailing dot, or two dots in a row).
    if domain.endswith("."):
        raise EmailSyntaxError("An email address cannot end with a period.")
    if domain.startswith("."):
        raise EmailSyntaxError("An email address cannot have a period immediately after the @-sign.")
    if ".." in domain:
        raise EmailSyntaxError("An email address cannot have two periods in a row.")

    # Regardless of whether international characters are actually used,
    # first convert to IDNA ASCII. For ASCII-only domains, the transformation
    # does nothing. If internationalized characters are present, the MTA
    # must either support SMTPUTF8 or the mail client must convert the
    # domain name to IDNA before submission.
    #
    # Unfortunately this step incorrectly 'fixes' domain names with leading
    # periods by removing them, so we have to check for this above. It also gives
    # a funky error message ("No input") when there are two periods in a
    # row, also checked separately above.
    try:
        ascii_domain = idna.encode(domain, uts46=False).decode("ascii")
    except idna.IDNAError as e:
        if "Domain too long" in str(e):
            # We can't really be more specific because UTS-46 normalization means
            # the length check is applied to a string that is different from the
            # one the user supplied. Also I'm not sure if the length check applies
            # to the internationalized form, the IDNA ASCII form, or even both!
            raise EmailSyntaxError("The email address is too long after the @-sign.")
        raise EmailSyntaxError("The domain name %s contains invalid characters (%s)." % (domain, str(e)))

    # We may have been given an IDNA ASCII domain to begin with. Check
    # that the domain actually conforms to IDNA. It could look like IDNA
    # but not be actual IDNA. For ASCII-only domains, the conversion out
    # of IDNA just gives the same thing back.
    #
    # This gives us the canonical internationalized form of the domain,
    # which we should use in all error messages.
    try:
        domain_i18n = idna.decode(ascii_domain.encode('ascii'))
    except idna.IDNAError as e:
        raise EmailSyntaxError("The domain name %s is not valid IDNA (%s)."
                               % (ascii_domain, str(e)))

    # RFC 5321 4.5.3.1.2
    # We're checking the number of bytes (octets) here, which can be much
    # higher than the number of characters in internationalized domains,
    # on the assumption that the domain may be transmitted without SMTPUTF8
    # as IDNA ASCII. This is also checked by idna.encode, so this exception
    # is never reached.
    if len(ascii_domain) > DOMAIN_MAX_LENGTH:
        raise EmailSyntaxError("The email address is too long after the @-sign.")

    # A "dot atom text", per RFC 2822 3.2.4, but using the restricted
    # characters allowed in a hostname (see ATEXT_HOSTNAME above).
    DOT_ATOM_TEXT = ATEXT_HOSTNAME + r'(?:\.' + ATEXT_HOSTNAME + r')*'

    # Check the regular expression. This is probably entirely redundant
    # with idna.decode, which also checks this format.
    m = re.match(DOT_ATOM_TEXT + "\\Z", ascii_domain)
    if not m:
        raise EmailSyntaxError("The email address contains invalid characters after the @-sign.")

    if globally_deliverable:
        # All publicly deliverable addresses have domain named with at least
        # one period, and we'll consider the lack of a period a syntax error
        # since that will match people's sense of what an email address looks
        # like. We'll skip this in test environments to allow '@test' email
        # addresses.
        if "." not in ascii_domain and not (ascii_domain == "test" and test_environment):
            raise EmailSyntaxError("The domain name %s is not valid. It should have a period." % domain_i18n)

        # We also know that all TLDs currently end with a letter.
        if not re.search(r"[A-Za-z]\Z", ascii_domain):
            raise EmailSyntaxError(
                "The domain name %s is not valid. It is not within a valid top-level domain." % domain_i18n
            )

    # Check special-use and reserved domain names.
    # Some might fail DNS-based deliverability checks, but that
    # can be turned off, so we should fail them all sooner.
    for d in SPECIAL_USE_DOMAIN_NAMES:
        # See the note near the definition of SPECIAL_USE_DOMAIN_NAMES.
        if d == "test" and test_environment:
            continue

        if ascii_domain == d or ascii_domain.endswith("." + d):
            raise EmailSyntaxError("The domain name %s is a special-use or reserved name that cannot be used with email."
                                   % domain_i18n)

    # Return the IDNA ASCII-encoded form of the domain, which is how it
    # would be transmitted on the wire (except when used with SMTPUTF8
    # possibly), as well as the canonical Unicode form of the domain,
    # which is better for display purposes. This should also take care
    # of RFC 6532 section 3.1's suggestion to apply Unicode NFC
    # normalization to addresses.
    return {
        "ascii_domain": ascii_domain,
        "domain": domain_i18n,
    }


def validate_email_deliverability(domain, domain_i18n, timeout=DEFAULT_TIMEOUT, dns_resolver=None):
    # Check that the domain resolves to an MX record. If there is no MX record,
    # try an A or AAAA record which is a deprecated fallback for deliverability.
    # (Note that changing the DEFAULT_TIMEOUT module-level attribute
    # will not change the default value of this method's timeout argument.)

    # If no dns.resolver.Resolver was given, get dnspython's default resolver.
    # Override the default resolver's timeout. This may affect other uses of
    # dnspython in this process.
    if dns_resolver is None:
        dns_resolver = dns.resolver.get_default_resolver()
        dns_resolver.lifetime = timeout

    deliverability_info = {}

    def dns_resolver_resolve_shim(domain, record):
        try:
            # dns.resolver.Resolver.resolve is new to dnspython 2.x.
            # https://dnspython.readthedocs.io/en/latest/resolver-class.html#dns.resolver.Resolver.resolve
            return dns_resolver.resolve(domain, record)
        except AttributeError:
            # dnspython 2.x is only available in Python 3.6 and later. For earlier versions
            # of Python, we maintain compatibility with dnspython 1.x which has a
            # dnspython.resolver.Resolver.query method instead. The only difference is that
            # query may treat the domain as relative and use the system's search domains,
            # which we prevent by adding a "." to the domain name to make it absolute.
            # dns.resolver.Resolver.query is deprecated in dnspython version 2.x.
            # https://dnspython.readthedocs.io/en/latest/resolver-class.html#dns.resolver.Resolver.query
            return dns_resolver.query(domain + ".", record)

    try:
        # We need a way to check how timeouts are handled in the tests. So we
        # have a secret variable that if set makes this method always test the
        # handling of a timeout.
        if getattr(validate_email_deliverability, 'TEST_CHECK_TIMEOUT', False):
            raise dns.exception.Timeout()

        try:
            # Try resolving for MX records.
            response = dns_resolver_resolve_shim(domain, "MX")

            # For reporting, put them in priority order and remove the trailing dot in the qnames.
            mtas = sorted([(r.preference, str(r.exchange).rstrip('.')) for r in response])

            # Remove "null MX" records from the list (their value is (0, ".") but we've stripped
            # trailing dots, so the 'exchange' is just ""). If there was only a null MX record,
            # email is not deliverable.
            mtas = [(preference, exchange) for preference, exchange in mtas
                    if exchange != ""]
            if len(mtas) == 0:
                raise EmailUndeliverableError("The domain name %s does not accept email." % domain_i18n)

            deliverability_info["mx"] = mtas
            deliverability_info["mx_fallback_type"] = None

        except (dns.resolver.NoNameservers, dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):

            # If there was no MX record, fall back to an A record.
            try:
                response = dns_resolver_resolve_shim(domain, "A")
                deliverability_info["mx"] = [(0, str(r)) for r in response]
                deliverability_info["mx_fallback_type"] = "A"
            except (dns.resolver.NoNameservers, dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):

                # If there was no A record, fall back to an AAAA record.
                try:
                    response = dns_resolver_resolve_shim(domain, "AAAA")
                    deliverability_info["mx"] = [(0, str(r)) for r in response]
                    deliverability_info["mx_fallback_type"] = "AAAA"
                except (dns.resolver.NoNameservers, dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):

                    # If there was no MX, A, or AAAA record, then mail to
                    # this domain is not deliverable.
                    raise EmailUndeliverableError("The domain name %s does not exist." % domain_i18n)

        try:
            # Check for a SPF reject all ("v=spf1 -all") record which indicates
            # no emails are sent from this domain, which like a NULL MX record
            # would indicate that the domain is not used for email.
            response = dns_resolver_resolve_shim(domain, "TXT")
            for rec in response:
                value = b"".join(rec.strings)
                if value.startswith(b"v=spf1 "):
                    deliverability_info["spf"] = value.decode("ascii", errors='replace')
                    if value == b"v=spf1 -all":
                        raise EmailUndeliverableError("The domain name %s does not send email." % domain_i18n)
        except dns.resolver.NoAnswer:
            # No TXT records means there is no SPF policy, so we cannot take any action.
            pass
        except (dns.resolver.NoNameservers, dns.resolver.NXDOMAIN):
            # Failure to resolve at this step will be ignored.
            pass

    except dns.exception.Timeout:
        # A timeout could occur for various reasons, so don't treat it as a failure.
        return {
            "unknown-deliverability": "timeout",
        }

    except EmailUndeliverableError:
        # Don't let these get clobbered by the wider except block below.
        raise

    except Exception as e:
        # Unhandled conditions should not propagate.
        raise EmailUndeliverableError(
            "There was an error while checking if the domain name in the email address is deliverable: " + str(e)
        )

    return deliverability_info


def main():
    import json

    def __utf8_input_shim(input_str):
        if sys.version_info < (3,):
            return input_str.decode("utf-8")
        return input_str

    def __utf8_output_shim(output_str):
        if sys.version_info < (3,):
            return unicode_class(output_str).encode("utf-8")
        return output_str

    if len(sys.argv) == 1:
        # Validate the email addresses passed line-by-line on STDIN.
        dns_resolver = caching_resolver()
        for line in sys.stdin:
            email = __utf8_input_shim(line.strip())
            try:
                validate_email(email, dns_resolver=dns_resolver)
            except EmailNotValidError as e:
                print(__utf8_output_shim("{} {}".format(email, e)))
    else:
        # Validate the email address passed on the command line.
        email = __utf8_input_shim(sys.argv[1])
        try:
            result = validate_email(email)
            print(json.dumps(result.as_dict(), indent=2, sort_keys=True, ensure_ascii=False))
        except EmailNotValidError as e:
            print(__utf8_output_shim(e))


if __name__ == "__main__":
    main()
python-email-validator-1.3.0/release_to_pypi.sh000077500000000000000000000002671431167072700216430ustar00rootroot00000000000000
#!/bin/sh
pip3 install --upgrade twine
rm -rf dist
python3 setup.py sdist
python3 setup.py bdist_wheel
twine upload -u __token__ dist/* # username: __token__ password: pypi API token
python-email-validator-1.3.0/setup.cfg000066400000000000000000000024211431167072700177340ustar00rootroot00000000000000
[metadata]
name = email_validator
version = 1.3.0
description = A robust email address syntax and deliverability validation library.
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/JoshData/python-email-validator
author = Joshua Tauberer
author_email = jt@occams.info
license = CC0 (copyright waived)
license_file = LICENSE
classifiers =
    Development Status :: 5 - Production/Stable
    Intended Audience :: Developers
    License :: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
    Programming Language :: Python :: 2
    Programming Language :: Python :: 2.7
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.7
    Programming Language :: Python :: 3.8
    Programming Language :: Python :: 3.9
    Programming Language :: Python :: 3.10
    Topic :: Software Development :: Libraries :: Python Modules
keywords = email address validator

[options]
packages = find:
install_requires =
    dnspython>=1.15.0
    idna>=2.0.0
python_requires = >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*

[options.entry_points]
console_scripts =
    email_validator=email_validator:main

[bdist_wheel]
universal = 1

[flake8]
max-line-length = 120

[tool:pytest]
testpaths = tests
filterwarnings =
    error
python-email-validator-1.3.0/setup.py000066400000000000000000000000451431167072700176250ustar00rootroot00000000000000
from setuptools import setup
setup()
python-email-validator-1.3.0/test_requirements.txt000066400000000000000000000011751431167072700224430ustar00rootroot00000000000000
# This file was generated by running
#   pip install dnspython idna # from setup.cfg
#   pip install pytest pytest-cov coverage flake8
#   pip freeze
# in a virtualenv with Python 3.6. (Some packages' latest versions
# are not compatible with Python 3.6, so we must pin versions for
# repeatable testing in earlier versions of Python.)
attrs==21.4.0 coverage==6.2 dnspython==2.2.1 flake8==4.0.1 idna==3.3 importlib-metadata==4.2.0 iniconfig==1.1.1 mccabe==0.6.1 packaging==21.3 pluggy==1.0.0 py==1.11.0 pycodestyle==2.8.0 pyflakes==2.4.0 pyparsing==3.0.7 pytest==7.0.1 pytest-cov==3.0.0 tomli==1.2.3 typing_extensions==4.1.1 zipp==3.6.0 python-email-validator-1.3.0/tests/000077500000000000000000000000001431167072700172565ustar00rootroot00000000000000python-email-validator-1.3.0/tests/test_main.py000066400000000000000000000775131431167072700216300ustar00rootroot00000000000000import dns.resolver import re import pytest from email_validator import EmailSyntaxError, EmailUndeliverableError, \ validate_email, validate_email_deliverability, \ caching_resolver, ValidatedEmail # Let's test main but rename it to be clear from email_validator import main as validator_main @pytest.mark.parametrize( 'email_input,output', [ ( 'Abc@example.tld', ValidatedEmail( local_part='Abc', ascii_local_part='Abc', smtputf8=False, ascii_domain='example.tld', domain='example.tld', email='Abc@example.tld', ascii_email='Abc@example.tld', ), ), ( 'Abc.123@test-example.com', ValidatedEmail( local_part='Abc.123', ascii_local_part='Abc.123', smtputf8=False, ascii_domain='test-example.com', domain='test-example.com', email='Abc.123@test-example.com', ascii_email='Abc.123@test-example.com', ), ), ( 'user+mailbox/department=shipping@example.tld', ValidatedEmail( local_part='user+mailbox/department=shipping', ascii_local_part='user+mailbox/department=shipping', smtputf8=False, ascii_domain='example.tld', domain='example.tld', email='user+mailbox/department=shipping@example.tld', ascii_email='user+mailbox/department=shipping@example.tld', ), ), ( "!#$%&'*+-/=?^_`.{|}~@example.tld", ValidatedEmail( local_part="!#$%&'*+-/=?^_`.{|}~", ascii_local_part="!#$%&'*+-/=?^_`.{|}~", smtputf8=False, ascii_domain='example.tld', domain='example.tld', email="!#$%&'*+-/=?^_`.{|}~@example.tld", ascii_email="!#$%&'*+-/=?^_`.{|}~@example.tld", ), ), ( 
'伊昭傑@郵件.商務', ValidatedEmail( local_part='伊昭傑', smtputf8=True, ascii_domain='xn--5nqv22n.xn--lhr59c', domain='郵件.商務', email='伊昭傑@郵件.商務', ), ), ( 'राम@मोहन.ईन्फो', ValidatedEmail( local_part='राम', smtputf8=True, ascii_domain='xn--l2bl7a9d.xn--o1b8dj2ki', domain='मोहन.ईन्फो', email='राम@मोहन.ईन्फो', ), ), ( 'юзер@екзампл.ком', ValidatedEmail( local_part='юзер', smtputf8=True, ascii_domain='xn--80ajglhfv.xn--j1aef', domain='екзампл.ком', email='юзер@екзампл.ком', ), ), ( 'θσερ@εχαμπλε.ψομ', ValidatedEmail( local_part='θσερ', smtputf8=True, ascii_domain='xn--mxahbxey0c.xn--xxaf0a', domain='εχαμπλε.ψομ', email='θσερ@εχαμπλε.ψομ', ), ), ( '葉士豪@臺網中心.tw', ValidatedEmail( local_part='葉士豪', smtputf8=True, ascii_domain='xn--fiqq24b10vi0d.tw', domain='臺網中心.tw', email='葉士豪@臺網中心.tw', ), ), ( 'jeff@臺網中心.tw', ValidatedEmail( local_part='jeff', ascii_local_part='jeff', smtputf8=False, ascii_domain='xn--fiqq24b10vi0d.tw', domain='臺網中心.tw', email='jeff@臺網中心.tw', ascii_email='jeff@xn--fiqq24b10vi0d.tw', ), ), ( '葉士豪@臺網中心.台灣', ValidatedEmail( local_part='葉士豪', smtputf8=True, ascii_domain='xn--fiqq24b10vi0d.xn--kpry57d', domain='臺網中心.台灣', email='葉士豪@臺網中心.台灣', ), ), ( 'jeff葉@臺網中心.tw', ValidatedEmail( local_part='jeff葉', smtputf8=True, ascii_domain='xn--fiqq24b10vi0d.tw', domain='臺網中心.tw', email='jeff葉@臺網中心.tw', ), ), ( 'ñoñó@example.tld', ValidatedEmail( local_part='ñoñó', smtputf8=True, ascii_domain='example.tld', domain='example.tld', email='ñoñó@example.tld', ), ), ( '我買@example.tld', ValidatedEmail( local_part='我買', smtputf8=True, ascii_domain='example.tld', domain='example.tld', email='我買@example.tld', ), ), ( '甲斐黒川日本@example.tld', ValidatedEmail( local_part='甲斐黒川日本', smtputf8=True, ascii_domain='example.tld', domain='example.tld', email='甲斐黒川日本@example.tld', ), ), ( 'чебурашкаящик-с-апельсинами.рф@example.tld', ValidatedEmail( local_part='чебурашкаящик-с-апельсинами.рф', smtputf8=True, ascii_domain='example.tld', domain='example.tld', 
email='чебурашкаящик-с-апельсинами.рф@example.tld', ), ), ( 'उदाहरण.परीक्ष@domain.with.idn.tld', ValidatedEmail( local_part='उदाहरण.परीक्ष', smtputf8=True, ascii_domain='domain.with.idn.tld', domain='domain.with.idn.tld', email='उदाहरण.परीक्ष@domain.with.idn.tld', ), ), ( 'ιωάννης@εεττ.gr', ValidatedEmail( local_part='ιωάννης', smtputf8=True, ascii_domain='xn--qxaa9ba.gr', domain='εεττ.gr', email='ιωάννης@εεττ.gr', ), ), ], ) def test_email_valid(email_input, output): # print(f'({email_input!r}, {validate_email(email_input, check_deliverability=False)!r}),') assert validate_email(email_input, check_deliverability=False) == output @pytest.mark.parametrize( 'email_input,error_msg', [ ('my@localhost', 'The domain name localhost is not valid. It should have a period.'), ('my@.leadingdot.com', 'An email address cannot have a period immediately after the @-sign.'), ('my@..leadingfwdot.com', 'An email address cannot have a period immediately after the @-sign.'), ('my@..twodots.com', 'An email address cannot have a period immediately after the @-sign.'), ('my@twodots..com', 'An email address cannot have two periods in a row.'), ('my@baddash.-.com', 'The domain name baddash.-.com contains invalid characters (Label must not start or end with a hyphen).'), ('my@baddash.-a.com', 'The domain name baddash.-a.com contains invalid characters (Label must not start or end with a hyphen).'), ('my@baddash.b-.com', 'The domain name baddash.b-.com contains invalid characters (Label must not start or end with a hyphen).'), ('my@example.com\n', 'The domain name example.com\n contains invalid characters (Codepoint U+000A at position 4 of ' '\'com\\n\' not allowed).'), ('my@example\n.com', 'The domain name example\n.com contains invalid characters (Codepoint U+000A at position 8 of ' '\'example\\n\' not allowed).'), ('.leadingdot@domain.com', 'The email address contains invalid characters before the @-sign: FULL STOP.'), ('..twodots@domain.com', 'The email address contains invalid 
characters before the @-sign: FULL STOP.'), ('twodots..here@domain.com', 'The email address contains invalid characters before the @-sign: FULL STOP.'), ('me@⒈wouldbeinvalid.com', "The domain name ⒈wouldbeinvalid.com contains invalid characters (Codepoint U+2488 not allowed " "at position 1 in '⒈wouldbeinvalid.com')."), ('@example.com', 'There must be something before the @-sign.'), ('\nmy@example.com', 'The email address contains invalid characters before the @-sign: \'\\n\'.'), ('m\ny@example.com', 'The email address contains invalid characters before the @-sign: \'\\n\'.'), ('my\n@example.com', 'The email address contains invalid characters before the @-sign: \'\\n\'.'), ('11111111112222222222333333333344444444445555555555666666666677777@example.com', 'The email address is too long before the @-sign (1 character too many).'), ('111111111122222222223333333333444444444455555555556666666666777777@example.com', 'The email address is too long before the @-sign (2 characters too many).'), ('me@1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.111111111122222222223333333333444444444455555555556.com', 'The email address is too long after the @-sign.'), ('my.long.address@1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.11111111112222222222333333333344444.info', 'The email address is too long (2 characters too many).'), ('my.long.address@λ111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.11111111112222222222333333.info', 'The email address is too long (when converted to IDNA ASCII).'), 
('my.long.address@λ111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444.info', 'The email address is too long (at least 1 character too many).'), ('my.λong.address@1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.111111111122222222223333333333444.info', 'The email address is too long (when encoded in bytes).'), ('my.λong.address@1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444.info', 'The email address is too long (at least 1 character too many).'), ], ) def test_email_invalid_syntax(email_input, error_msg): # Since these all have syntax errors, deliverability # checks do not arise. with pytest.raises(EmailSyntaxError) as exc_info: validate_email(email_input) # print(f'({email_input!r}, {str(exc_info.value)!r}),') assert str(exc_info.value) == error_msg @pytest.mark.parametrize( 'email_input', [ ('me@anything.arpa'), ('me@valid.invalid'), ('me@link.local'), ('me@host.localhost'), ('me@onion.onion.onion'), ('me@test.test.test'), ], ) def test_email_invalid_reserved_domain(email_input): # Since these all fail deliverabiltiy from a static list, # DNS deliverability checks do not arise. 
    with pytest.raises(EmailSyntaxError) as exc_info:
        validate_email(email_input)
    # print(f'({email_input!r}, {str(exc_info.value)!r}),')
    assert "is a special-use or reserved name" in str(exc_info.value)


@pytest.mark.parametrize(
    'email_input',
    [
        ('me@mail.example'),
        ('me@example.com'),
        ('me@mail.example.com'),
    ],
)
def test_email_example_reserved_domain(email_input):
    # These domains are not rejected by the static special-use list,
    # so they fail at the DNS deliverability check instead.
    with pytest.raises(EmailUndeliverableError) as exc_info:
        validate_email(email_input)
    # print(f'({email_input!r}, {str(exc_info.value)!r}),')
    assert re.match(r"The domain name [a-z\.]+ does not (accept email|exist)\.", str(exc_info.value)) is not None


@pytest.mark.parametrize(
    'email_input',
    [
        ('white space@test'),
        ('\n@test'),
        ('\u2005@test'),  # four-per-em space (Zs)
        ('\u009C@test'),  # string terminator (Cc)
        ('\u200B@test'),  # zero-width space (Cf)
        ('\u202Dforward-\u202Ereversed@test'),  # BIDI (Cf)
        ('\uD800@test'),  # surrogate (Cs)
        ('\uE000@test'),  # private use (Co)
        ('\uFDEF@test'),  # unassigned (Cn)
    ],
)
def test_email_unsafe_character(email_input):
    # Check for various unsafe characters:
    with pytest.raises(EmailSyntaxError) as exc_info:
        validate_email(email_input, test_environment=True)
    assert "invalid character" in str(exc_info.value)


def test_email_test_domain_name_in_test_environment():
    validate_email("anything@test", test_environment=True)
    validate_email("anything@mycompany.test", test_environment=True)


# This is the pyIsEmail (https://github.com/michaelherold/pyIsEmail) test suite.
# # The test data was extracted by: # # $ wget https://raw.githubusercontent.com/michaelherold/pyIsEmail/master/tests/data/tests.xml # $ xmllint --xpath '/tests/test/address/text()' tests.xml > t1 # $ xmllint --xpath "/tests/test[not(address='')]/diagnosis/text()" tests.xml > t2 # # tests = [] # def fixup_char(c): # if ord(c) >= 0x2400 and ord(c) <= 0x2432: # c = chr(ord(c)-0x2400) # return c # for email, diagnosis in zip(open("t1"), open("t2")): # email = email[:-1] # strip trailing \n but not more because trailing whitespace is significant # email = "".join(fixup_char(c) for c in email).replace("&", "&") # tests.append([email, diagnosis.strip()]) # print(repr(tests).replace("'], ['", "'],\n['")) @pytest.mark.parametrize( ('email_input', 'status'), [ ['test', 'ISEMAIL_ERR_NODOMAIN'], ['@', 'ISEMAIL_ERR_NOLOCALPART'], ['test@', 'ISEMAIL_ERR_NODOMAIN'], # ['test@io', 'ISEMAIL_VALID'], # we reject domains without a dot, knowing they are not deliverable ['@io', 'ISEMAIL_ERR_NOLOCALPART'], ['@iana.org', 'ISEMAIL_ERR_NOLOCALPART'], ['test@iana.org', 'ISEMAIL_VALID'], ['test@nominet.org.uk', 'ISEMAIL_VALID'], ['test@about.museum', 'ISEMAIL_VALID'], ['a@iana.org', 'ISEMAIL_VALID'], ['test.test@iana.org', 'ISEMAIL_VALID'], ['.test@iana.org', 'ISEMAIL_ERR_DOT_START'], ['test.@iana.org', 'ISEMAIL_ERR_DOT_END'], ['test..iana.org', 'ISEMAIL_ERR_CONSECUTIVEDOTS'], ['test_exa-mple.com', 'ISEMAIL_ERR_NODOMAIN'], ['!#$%&`*+/=?^`{|}~@iana.org', 'ISEMAIL_VALID'], ['test\\@test@iana.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['123@iana.org', 'ISEMAIL_VALID'], ['test@123.com', 'ISEMAIL_VALID'], ['abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghiklm@iana.org', 'ISEMAIL_VALID'], ['abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghiklmn@iana.org', 'ISEMAIL_RFC5322_LOCAL_TOOLONG'], ['test@abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghiklm.com', 'ISEMAIL_RFC5322_LABEL_TOOLONG'], ['test@mason-dixon.com', 'ISEMAIL_VALID'], ['test@-iana.org', 
'ISEMAIL_ERR_DOMAINHYPHENSTART'], ['test@iana-.com', 'ISEMAIL_ERR_DOMAINHYPHENEND'], ['test@g--a.com', 'ISEMAIL_VALID'], ['test@.iana.org', 'ISEMAIL_ERR_DOT_START'], ['test@iana.org.', 'ISEMAIL_ERR_DOT_END'], ['test@iana..com', 'ISEMAIL_ERR_CONSECUTIVEDOTS'], ['abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghiklm@abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghikl.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghikl.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghij', 'ISEMAIL_RFC5322_TOOLONG'], ['a@abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghikl.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghikl.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghikl.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefg.hij', 'ISEMAIL_RFC5322_TOOLONG'], ['a@abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghikl.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghikl.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghikl.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefg.hijk', 'ISEMAIL_RFC5322_DOMAIN_TOOLONG'], ['"test"@iana.org', 'ISEMAIL_RFC5321_QUOTEDSTRING'], ['""@iana.org', 'ISEMAIL_RFC5321_QUOTEDSTRING'], ['"""@iana.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['"\\a"@iana.org', 'ISEMAIL_RFC5321_QUOTEDSTRING'], ['"\\""@iana.org', 'ISEMAIL_RFC5321_QUOTEDSTRING'], ['"\\"@iana.org', 'ISEMAIL_ERR_UNCLOSEDQUOTEDSTR'], ['"\\\\"@iana.org', 'ISEMAIL_RFC5321_QUOTEDSTRING'], ['test"@iana.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['"test@iana.org', 'ISEMAIL_ERR_UNCLOSEDQUOTEDSTR'], ['"test"test@iana.org', 'ISEMAIL_ERR_ATEXT_AFTER_QS'], ['test"text"@iana.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['"test""test"@iana.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['"test"."test"@iana.org', 'ISEMAIL_DEPREC_LOCALPART'], ['"test\\ test"@iana.org', 'ISEMAIL_RFC5321_QUOTEDSTRING'], ['"test".test@iana.org', 'ISEMAIL_DEPREC_LOCALPART'], ['"test\x00"@iana.org', 'ISEMAIL_ERR_EXPECTING_QTEXT'], 
['"test\\\x00"@iana.org', 'ISEMAIL_DEPREC_QP'], ['"abcdefghijklmnopqrstuvwxyz abcdefghijklmnopqrstuvwxyz abcdefghj"@iana.org', 'ISEMAIL_RFC5322_LOCAL_TOOLONG'], ['"abcdefghijklmnopqrstuvwxyz abcdefghijklmnopqrstuvwxyz abcdefg\\h"@iana.org', 'ISEMAIL_RFC5322_LOCAL_TOOLONG'], ['test@[255.255.255.255]', 'ISEMAIL_RFC5321_ADDRESSLITERAL'], ['test@a[255.255.255.255]', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['test@[255.255.255]', 'ISEMAIL_RFC5322_DOMAINLITERAL'], ['test@[255.255.255.255.255]', 'ISEMAIL_RFC5322_DOMAINLITERAL'], ['test@[255.255.255.256]', 'ISEMAIL_RFC5322_DOMAINLITERAL'], ['test@[1111:2222:3333:4444:5555:6666:7777:8888]', 'ISEMAIL_RFC5322_DOMAINLITERAL'], ['test@[IPv6:1111:2222:3333:4444:5555:6666:7777]', 'ISEMAIL_RFC5322_IPV6_GRPCOUNT'], ['test@[IPv6:1111:2222:3333:4444:5555:6666:7777:8888]', 'ISEMAIL_RFC5321_ADDRESSLITERAL'], ['test@[IPv6:1111:2222:3333:4444:5555:6666:7777:8888:9999]', 'ISEMAIL_RFC5322_IPV6_GRPCOUNT'], ['test@[IPv6:1111:2222:3333:4444:5555:6666:7777:888G]', 'ISEMAIL_RFC5322_IPV6_BADCHAR'], ['test@[IPv6:1111:2222:3333:4444:5555:6666::8888]', 'ISEMAIL_RFC5321_IPV6DEPRECATED'], ['test@[IPv6:1111:2222:3333:4444:5555::8888]', 'ISEMAIL_RFC5321_ADDRESSLITERAL'], ['test@[IPv6:1111:2222:3333:4444:5555:6666::7777:8888]', 'ISEMAIL_RFC5322_IPV6_MAXGRPS'], ['test@[IPv6::3333:4444:5555:6666:7777:8888]', 'ISEMAIL_RFC5322_IPV6_COLONSTRT'], ['test@[IPv6:::3333:4444:5555:6666:7777:8888]', 'ISEMAIL_RFC5321_ADDRESSLITERAL'], ['test@[IPv6:1111::4444:5555::8888]', 'ISEMAIL_RFC5322_IPV6_2X2XCOLON'], ['test@[IPv6:::]', 'ISEMAIL_RFC5321_ADDRESSLITERAL'], ['test@[IPv6:1111:2222:3333:4444:5555:255.255.255.255]', 'ISEMAIL_RFC5322_IPV6_GRPCOUNT'], ['test@[IPv6:1111:2222:3333:4444:5555:6666:255.255.255.255]', 'ISEMAIL_RFC5321_ADDRESSLITERAL'], ['test@[IPv6:1111:2222:3333:4444:5555:6666:7777:255.255.255.255]', 'ISEMAIL_RFC5322_IPV6_GRPCOUNT'], ['test@[IPv6:1111:2222:3333:4444::255.255.255.255]', 'ISEMAIL_RFC5321_ADDRESSLITERAL'], 
['test@[IPv6:1111:2222:3333:4444:5555:6666::255.255.255.255]', 'ISEMAIL_RFC5322_IPV6_MAXGRPS'], ['test@[IPv6:1111:2222:3333:4444:::255.255.255.255]', 'ISEMAIL_RFC5322_IPV6_2X2XCOLON'], ['test@[IPv6::255.255.255.255]', 'ISEMAIL_RFC5322_IPV6_COLONSTRT'], [' test @iana.org', 'ISEMAIL_DEPREC_CFWS_NEAR_AT'], ['test@ iana .com', 'ISEMAIL_DEPREC_CFWS_NEAR_AT'], ['test . test@iana.org', 'ISEMAIL_DEPREC_FWS'], ['\r\n test@iana.org', 'ISEMAIL_CFWS_FWS'], ['\r\n \r\n test@iana.org', 'ISEMAIL_DEPREC_FWS'], ['(comment)test@iana.org', 'ISEMAIL_CFWS_COMMENT'], ['((comment)test@iana.org', 'ISEMAIL_ERR_UNCLOSEDCOMMENT'], ['(comment(comment))test@iana.org', 'ISEMAIL_CFWS_COMMENT'], ['test@(comment)iana.org', 'ISEMAIL_DEPREC_CFWS_NEAR_AT'], ['test(comment)test@iana.org', 'ISEMAIL_ERR_ATEXT_AFTER_CFWS'], ['test@(comment)[255.255.255.255]', 'ISEMAIL_DEPREC_CFWS_NEAR_AT'], ['(comment)abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghiklm@iana.org', 'ISEMAIL_CFWS_COMMENT'], ['test@(comment)abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghikl.com', 'ISEMAIL_DEPREC_CFWS_NEAR_AT'], ['(comment)test@abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghik.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghik.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijk.abcdefghijklmnopqrstuvwxyzabcdefghijk.abcdefghijklmnopqrstu', 'ISEMAIL_CFWS_COMMENT'], ['test@iana.org\n', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['test@xn--hxajbheg2az3al.xn--jxalpdlp', 'ISEMAIL_VALID'], ['xn--test@iana.org', 'ISEMAIL_VALID'], ['test@iana.org-', 'ISEMAIL_ERR_DOMAINHYPHENEND'], ['"test@iana.org', 'ISEMAIL_ERR_UNCLOSEDQUOTEDSTR'], ['(test@iana.org', 'ISEMAIL_ERR_UNCLOSEDCOMMENT'], ['test@(iana.org', 'ISEMAIL_ERR_UNCLOSEDCOMMENT'], ['test@[1.2.3.4', 'ISEMAIL_ERR_UNCLOSEDDOMLIT'], ['"test\\"@iana.org', 'ISEMAIL_ERR_UNCLOSEDQUOTEDSTR'], ['(comment\\)test@iana.org', 'ISEMAIL_ERR_UNCLOSEDCOMMENT'], ['test@iana.org(comment\\)', 'ISEMAIL_ERR_UNCLOSEDCOMMENT'], ['test@iana.org(comment\\', 
'ISEMAIL_ERR_BACKSLASHEND'], ['test@[RFC-5322-domain-literal]', 'ISEMAIL_RFC5322_DOMAINLITERAL'], ['test@[RFC-5322]-domain-literal]', 'ISEMAIL_ERR_ATEXT_AFTER_DOMLIT'], ['test@[RFC-5322-[domain-literal]', 'ISEMAIL_ERR_EXPECTING_DTEXT'], ['test@[RFC-5322-\\\x07-domain-literal]', 'ISEMAIL_RFC5322_DOMLIT_OBSDTEXT'], ['test@[RFC-5322-\\\t-domain-literal]', 'ISEMAIL_RFC5322_DOMLIT_OBSDTEXT'], ['test@[RFC-5322-\\]-domain-literal]', 'ISEMAIL_RFC5322_DOMLIT_OBSDTEXT'], ['test@[RFC-5322-domain-literal\\]', 'ISEMAIL_ERR_UNCLOSEDDOMLIT'], ['test@[RFC-5322-domain-literal\\', 'ISEMAIL_ERR_BACKSLASHEND'], ['test@[RFC 5322 domain literal]', 'ISEMAIL_RFC5322_DOMAINLITERAL'], ['test@[RFC-5322-domain-literal] (comment)', 'ISEMAIL_RFC5322_DOMAINLITERAL'], ['\x7f@iana.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['test@\x7f.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['"\x7f"@iana.org', 'ISEMAIL_DEPREC_QTEXT'], ['"\\\x7f"@iana.org', 'ISEMAIL_DEPREC_QP'], ['(\x7f)test@iana.org', 'ISEMAIL_DEPREC_CTEXT'], ['test@iana.org\r', 'ISEMAIL_ERR_CR_NO_LF'], ['\rtest@iana.org', 'ISEMAIL_ERR_CR_NO_LF'], ['"\rtest"@iana.org', 'ISEMAIL_ERR_CR_NO_LF'], ['(\r)test@iana.org', 'ISEMAIL_ERR_CR_NO_LF'], ['test@iana.org(\r)', 'ISEMAIL_ERR_CR_NO_LF'], ['\ntest@iana.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['"\n"@iana.org', 'ISEMAIL_ERR_EXPECTING_QTEXT'], ['"\\\n"@iana.org', 'ISEMAIL_DEPREC_QP'], ['(\n)test@iana.org', 'ISEMAIL_ERR_EXPECTING_CTEXT'], ['\x07@iana.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['test@\x07.org', 'ISEMAIL_ERR_EXPECTING_ATEXT'], ['"\x07"@iana.org', 'ISEMAIL_DEPREC_QTEXT'], ['"\\\x07"@iana.org', 'ISEMAIL_DEPREC_QP'], ['(\x07)test@iana.org', 'ISEMAIL_DEPREC_CTEXT'], ['\r\ntest@iana.org', 'ISEMAIL_ERR_FWS_CRLF_END'], ['\r\n \r\ntest@iana.org', 'ISEMAIL_ERR_FWS_CRLF_END'], [' \r\ntest@iana.org', 'ISEMAIL_ERR_FWS_CRLF_END'], [' \r\n test@iana.org', 'ISEMAIL_CFWS_FWS'], [' \r\n \r\ntest@iana.org', 'ISEMAIL_ERR_FWS_CRLF_END'], [' \r\n\r\ntest@iana.org', 'ISEMAIL_ERR_FWS_CRLF_X2'], [' \r\n\r\n test@iana.org', 
'ISEMAIL_ERR_FWS_CRLF_X2'],
        ['test@iana.org\r\n ', 'ISEMAIL_CFWS_FWS'],
        ['test@iana.org\r\n \r\n ', 'ISEMAIL_DEPREC_FWS'],
        ['test@iana.org\r\n', 'ISEMAIL_ERR_FWS_CRLF_END'],
        ['test@iana.org\r\n \r\n', 'ISEMAIL_ERR_FWS_CRLF_END'],
        ['test@iana.org \r\n', 'ISEMAIL_ERR_FWS_CRLF_END'],
        ['test@iana.org \r\n ', 'ISEMAIL_CFWS_FWS'],
        ['test@iana.org \r\n \r\n', 'ISEMAIL_ERR_FWS_CRLF_END'],
        ['test@iana.org \r\n\r\n', 'ISEMAIL_ERR_FWS_CRLF_X2'],
        ['test@iana.org \r\n\r\n ', 'ISEMAIL_ERR_FWS_CRLF_X2'],
        [' test@iana.org', 'ISEMAIL_CFWS_FWS'],
        ['test@iana.org ', 'ISEMAIL_CFWS_FWS'],
        ['test@[IPv6:1::2:]', 'ISEMAIL_RFC5322_IPV6_COLONEND'],
        ['"test\\©"@iana.org', 'ISEMAIL_ERR_EXPECTING_QPAIR'],
        ['test@iana/icann.org', 'ISEMAIL_RFC5322_DOMAIN'],
        ['test.(comment)test@iana.org', 'ISEMAIL_DEPREC_COMMENT']
    ]
)
def test_pyisemail_tests(email_input, status):
    if status == "ISEMAIL_VALID":
        # All standard email address forms should not raise an exception.
        validate_email(email_input, test_environment=True)
    elif "_ERR_" in status or "_TOOLONG" in status \
         or "_CFWS_FWS" in status or "_CFWS_COMMENT" in status \
         or "_IPV6" in status or status == "ISEMAIL_RFC5322_DOMAIN":
        # Invalid syntax, extraneous whitespace, and "(comments)" should be rejected.
        # The _IPV6_ diagnoses appear to represent syntactically invalid domain literals.
        # The ISEMAIL_RFC5322_DOMAIN diagnosis appears to be a syntactically invalid domain.
        with pytest.raises(EmailSyntaxError):
            validate_email(email_input, test_environment=True)
    elif "_DEPREC_" in status \
         or "RFC5321_QUOTEDSTRING" in status \
         or "DOMAINLITERAL" in status or "_DOMLIT_" in status or "_ADDRESSLITERAL" in status:
        # Quoted strings in the local part, domain literals (IP addresses in brackets),
        # and other deprecated syntax are valid email addresses and are accepted by pyIsEmail,
        # but we reject them.
with pytest.raises(EmailSyntaxError): validate_email(email_input, test_environment=True) else: raise ValueError("status {} is not recognized".format(status)) def test_dict_accessor(): input_email = "testaddr@example.tld" valid_email = validate_email(input_email, check_deliverability=False) assert isinstance(valid_email.as_dict(), dict) assert valid_email.as_dict()["original_email"] == input_email def test_deliverability_found(): response = validate_email_deliverability('gmail.com', 'gmail.com') assert response.keys() == {'mx', 'mx_fallback_type', 'spf'} assert response['mx_fallback_type'] is None assert len(response['mx']) > 1 assert len(response['mx'][0]) == 2 assert isinstance(response['mx'][0][0], int) assert response['mx'][0][1].endswith('.com') def test_deliverability_fails(): # No MX record. domain = 'xkxufoekjvjfjeodlfmdfjcu.com' with pytest.raises(EmailUndeliverableError, match='The domain name {} does not exist'.format(domain)): validate_email_deliverability(domain, domain) # Null MX record. 
domain = 'example.com' with pytest.raises(EmailUndeliverableError, match='The domain name {} does not accept email'.format(domain)): validate_email_deliverability(domain, domain) def test_deliverability_dns_timeout(): validate_email_deliverability.TEST_CHECK_TIMEOUT = True response = validate_email_deliverability('gmail.com', 'gmail.com') assert "mx" not in response assert response.get("unknown-deliverability") == "timeout" validate_email('test@gmail.com') del validate_email_deliverability.TEST_CHECK_TIMEOUT def test_main_single_good_input(monkeypatch, capsys): import json test_email = "google@google.com" monkeypatch.setattr('sys.argv', ['email_validator', test_email]) validator_main() stdout, _ = capsys.readouterr() output = json.loads(str(stdout)) assert isinstance(output, dict) assert validate_email(test_email).original_email == output["original_email"] def test_main_single_bad_input(monkeypatch, capsys): bad_email = 'test@..com' monkeypatch.setattr('sys.argv', ['email_validator', bad_email]) validator_main() stdout, _ = capsys.readouterr() assert stdout == 'An email address cannot have a period immediately after the @-sign.\n' def test_main_multi_input(monkeypatch, capsys): import io test_cases = ["google1@google.com", "google2@google.com", "test@.com", "test3@.com"] test_input = io.StringIO("\n".join(test_cases)) monkeypatch.setattr('sys.stdin', test_input) monkeypatch.setattr('sys.argv', ['email_validator']) validator_main() stdout, _ = capsys.readouterr() assert test_cases[0] not in stdout assert test_cases[1] not in stdout assert test_cases[2] in stdout assert test_cases[3] in stdout def test_main_input_shim(monkeypatch, capsys): import json monkeypatch.setattr('sys.version_info', (2, 7)) test_email = b"google@google.com" monkeypatch.setattr('sys.argv', ['email_validator', test_email]) validator_main() stdout, _ = capsys.readouterr() output = json.loads(str(stdout)) assert isinstance(output, dict) assert validate_email(test_email).original_email == 
output["original_email"]


def test_main_output_shim(monkeypatch, capsys):
    monkeypatch.setattr('sys.version_info', (2, 7))
    test_email = b"test@.com"
    monkeypatch.setattr('sys.argv', ['email_validator', test_email])
    validator_main()
    stdout, _ = capsys.readouterr()

    # This looks bad but it has to do with the way python 2.7 prints vs py3
    # The \n is part of the print statement, not part of the string, which is what the b'...' is
    # Since we're mocking py 2.7 here instead of actually using 2.7, this was the closest I could get
    assert stdout == "b'An email address cannot have a period immediately after the @-sign.'\n"


def test_validate_email__with_caching_resolver():
    # unittest.mock.patch("dns.resolver.LRUCache.get") doesn't
    # work --- it causes get to always return an empty list.
    # So we'll mock our own way.
    class MockedCache:
        get_called = False
        put_called = False

        def get(self, key):
            self.get_called = True
            return None

        def put(self, key, value):
            self.put_called = True

    # Test with caching_resolver helper method.
    mocked_cache = MockedCache()
    dns_resolver = caching_resolver(cache=mocked_cache)
    validate_email("test@gmail.com", dns_resolver=dns_resolver)
    assert mocked_cache.put_called
    validate_email("test@gmail.com", dns_resolver=dns_resolver)
    assert mocked_cache.get_called

    # Test with dns.resolver.Resolver instance.
    # Rebind mocked_cache to a fresh instance so the assertions below inspect
    # this resolver's cache rather than the one used in the first half.
    mocked_cache = MockedCache()
    dns_resolver = dns.resolver.Resolver()
    dns_resolver.lifetime = 10
    dns_resolver.cache = mocked_cache
    validate_email("test@gmail.com", dns_resolver=dns_resolver)
    assert mocked_cache.put_called
    validate_email("test@gmail.com", dns_resolver=dns_resolver)
    assert mocked_cache.get_called