pax_global_header00006660000000000000000000000064137510710400014510gustar00rootroot0000000000000052 comment=df5dbf9f83de4dea72404fda5f68049ca20fce98 python-email-validator-1.1.2/000077500000000000000000000000001375107104000161025ustar00rootroot00000000000000python-email-validator-1.1.2/.gitignore000066400000000000000000000002721375107104000200730ustar00rootroot00000000000000__pycache__/ *.py[cod] *$py.class *.so .Python build/ dist/ downloads/ eggs/ .eggs/ *.egg-info/ *.egg *.log docs/_build/ .python-version .env .venv env/ env27/ .idea/ .coverage htmlcov/ python-email-validator-1.1.2/.travis.yml000066400000000000000000000003421375107104000202120ustar00rootroot00000000000000os: linux dist: xenial language: python cache: pip python: #- '2.7' #- '3.4' - '3.5' - '3.6' - '3.7' - '3.8' install: - make install script: - make lint - make test after_success: - bash <(curl -s https://codecov.io/bash) python-email-validator-1.1.2/CONTRIBUTING.md000066400000000000000000000007201375107104000203320ustar00rootroot00000000000000## Public domain This project is in the public domain. Copyright and related rights in the work worldwide are waived through the [CC0 1.0 Universal public domain dedication][CC0]. See the LICENSE file in this directory. All contributions to this project must be released under the same CC0 waiver. By submitting a pull request or patch, you are agreeing to comply with this waiver of copyright interest. [CC0]: http://creativecommons.org/publicdomain/zero/1.0/ python-email-validator-1.1.2/LICENSE000066400000000000000000000156101375107104000171120ustar00rootroot00000000000000Creative Commons Legal Code CC0 1.0 Universal CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS INFORMATION ON AN "AS-IS" BASIS. 
CREATIVE COMMONS MAKES NO WARRANTIES REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER. Statement of Purpose The laws of most jurisdictions throughout the world automatically confer exclusive Copyright and Related Rights (defined below) upon the creator and subsequent owner(s) (each and all, an "owner") of an original work of authorship and/or a database (each, a "Work"). Certain owners wish to permanently relinquish those rights to a Work for the purpose of contributing to a commons of creative, cultural and scientific works ("Commons") that the public can reliably and without fear of later claims of infringement build upon, modify, incorporate in other works, reuse and redistribute as freely as possible in any form whatsoever and for any purposes, including without limitation commercial purposes. These owners may contribute to the Commons to promote the ideal of a free culture and the further production of creative, cultural and scientific works, or to gain reputation or greater distribution for their Work in part through the use and efforts of others. For these and/or other purposes and motivations, and without any expectation of additional consideration or compensation, the person associating CC0 with a Work (the "Affirmer"), to the extent that he or she is an owner of Copyright and Related Rights in the Work, voluntarily elects to apply CC0 to the Work and publicly distribute the Work under its terms, with knowledge of his or her Copyright and Related Rights in the Work and the meaning and intended legal effect of CC0 on those rights. 1. Copyright and Related Rights. A Work made available under CC0 may be protected by copyright and related or neighboring rights ("Copyright and Related Rights"). Copyright and Related Rights include, but are not limited to, the following: i. 
the right to reproduce, adapt, distribute, perform, display, communicate, and translate a Work; ii. moral rights retained by the original author(s) and/or performer(s); iii. publicity and privacy rights pertaining to a person's image or likeness depicted in a Work; iv. rights protecting against unfair competition in regards to a Work, subject to the limitations in paragraph 4(a), below; v. rights protecting the extraction, dissemination, use and reuse of data in a Work; vi. database rights (such as those arising under Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, and under any national implementation thereof, including any amended or successor version of such directive); and vii. other similar, equivalent or corresponding rights throughout the world based on applicable law or treaty, and any national implementations thereof. 2. Waiver. To the greatest extent permitted by, but not in contravention of, applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and unconditionally waives, abandons, and surrenders all of Affirmer's Copyright and Related Rights and associated claims and causes of action, whether now known or unknown (including existing as well as future claims and causes of action), in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "Waiver"). 
Affirmer makes the Waiver for the benefit of each member of the public at large and to the detriment of Affirmer's heirs and successors, fully intending that such Waiver shall not be subject to revocation, rescission, cancellation, termination, or any other legal or equitable action to disrupt the quiet enjoyment of the Work by the public as contemplated by Affirmer's express Statement of Purpose. 3. Public License Fallback. Should any part of the Waiver for any reason be judged legally invalid or ineffective under applicable law, then the Waiver shall be preserved to the maximum extent permitted taking into account Affirmer's express Statement of Purpose. In addition, to the extent the Waiver is so judged Affirmer hereby grants to each affected person a royalty-free, non transferable, non sublicensable, non exclusive, irrevocable and unconditional license to exercise Affirmer's Copyright and Related Rights in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "License"). The License shall be deemed effective as of the date CC0 was applied by Affirmer to the Work. Should any part of the License for any reason be judged legally invalid or ineffective under applicable law, such partial invalidity or ineffectiveness shall not invalidate the remainder of the License, and in such case Affirmer hereby affirms that he or she will not (i) exercise any of his or her remaining Copyright and Related Rights in the Work or (ii) assert any associated claims and causes of action with respect to the Work, in either case contrary to Affirmer's express Statement of Purpose. 4. Limitations and Disclaimers. a. 
No trademark or patent rights held by Affirmer are waived, abandoned, surrendered, licensed or otherwise affected by this document. b. Affirmer offers the Work as-is and makes no representations or warranties of any kind concerning the Work, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non infringement, or the absence of latent or other defects, accuracy, or the present or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law. c. Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof, including without limitation any person's Copyright and Related Rights in the Work. Further, Affirmer disclaims responsibility for obtaining any necessary consents, permissions or other rights required for any use of the Work. d. Affirmer understands and acknowledges that Creative Commons is not a party to this document and has no duty or obligation with respect to this CC0 or use of the Work. python-email-validator-1.1.2/MANIFEST.in000066400000000000000000000000651375107104000176410ustar00rootroot00000000000000include email_validator.py include LICENSE README.md python-email-validator-1.1.2/Makefile000066400000000000000000000013321375107104000175410ustar00rootroot00000000000000.DEFAULT_GOAL := all .PHONY: install install: pip install -U setuptools pip pip install -U -r test_requirements.txt pip install -e . .PHONY: lint lint: #python setup.py check -rms flake8 --ignore=E501,E126,W503 email_validator tests .PHONY: test test: pytest --cov=email_validator .PHONY: testcov testcov: test @echo "building coverage html" @coverage html .PHONY: all all: testcov lint .PHONY: clean clean: rm -rf `find . -name __pycache__` rm -f `find . -type f -name '*.py[co]' ` rm -f `find . -type f -name '*~' ` rm -f `find . 
-type f -name '.*~' ` rm -rf .cache rm -rf .pytest_cache rm -rf htmlcov rm -rf *.egg-info rm -f .coverage rm -f .coverage.* rm -rf build rm -rf dist python setup.py clean python-email-validator-1.1.2/README.md000066400000000000000000000402631375107104000173660ustar00rootroot00000000000000email-validator: Validate Email Addresses ========================================= A robust email address syntax and deliverability validation library for Python 2.7/3.4+ by [Joshua Tauberer](https://joshdata.me). This library validates that a string is of the form `name@example.com`. This is the sort of validation you would want for an email-based login form on a website. Key features: * Checks that an email address has the correct syntax --- good for login forms or other uses related to identifying users. * Gives friendly error messages when validation fails (appropriate to show to end users). * (optionally) Checks deliverability: Does the domain name resolve? * Supports internationalized domain names and (optionally) internationalized local parts. * Normalizes email addresses (super important for internationalized addresses! see below). The library is NOT for validation of the To: line in an email message (e.g. `My Name `), which [flanker](https://github.com/mailgun/flanker) is more appropriate for. And this library does NOT permit obsolete forms of email addresses, so if you need strict validation against the email specs exactly, use [pyIsEmail](https://github.com/michaelherold/pyIsEmail). This library was first published in 2015. The current version is 1.1.1 (posted May 19, 2020). **Starting in version 1.1.0, the type of the value returned from `validate_email` has changed, but dict-style access to the validated address information still works, so it is backwards compatible.** Installation ------------ This package [is on PyPI](https://pypi.org/project/email-validator/), so: ```sh pip install email-validator ``` `pip3` also works. 
Usage ----- If you're validating a user's email address before creating a user account, you might do this: ```python from email_validator import validate_email, EmailNotValidError email = "my+address@mydomain.tld" try: # Validate. valid = validate_email(email) # Update with the normalized form. email = valid.email except EmailNotValidError as e: # email is not valid, exception message is human-readable print(str(e)) ``` This validates the address and gives you its normalized form. You should put the normalized form in your database and always normalize before checking if an address is in your database. The validator will accept internationalized email addresses, but email addresses with non-ASCII characters in the *local* part of the address (before the @-sign) require the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) extension which may not be supported by your mail submission library or your outbound mail server. If you know ahead of time that SMTPUTF8 is not supported then **add the keyword argument allow\_smtputf8=False to fail validation for addresses that would require SMTPUTF8**: ```python valid = validate_email(email, allow_smtputf8=False) ``` Overview -------- The module provides a single function `validate_email(email_address)` which takes an email address (either a `str` or ASCII `bytes`) and: - Raises a `EmailNotValidError` with a helpful, human-readable error message explaining why the email address is not valid, or - Returns an object with a normalized form of the email address and other information about it. When an email address is not valid, `validate_email` raises either an `EmailSyntaxError` if the form of the address is invalid or an `EmailUndeliverableError` if the domain name does not resolve. Both exception classes are subclasses of `EmailNotValidError`, which in turn is a subclass of `ValueError`. But when an email address is valid, an object is returned containing a normalized form of the email address (which you should use!) 
and other information. The validator doesn't permit obsoleted forms of email addresses that no one uses anymore even though they are still valid and deliverable, since they will probably give you grief if you're using email for login. (See later in the document about that.) The validator checks that the domain name in the email address resolves. There is nothing to be gained by trying to actually contact an SMTP server, so that's not done here. For privacy, security, and practicality reasons servers are good at not giving away whether an address is deliverable or not: email addresses that appear to accept mail at first can bounce mail after a delay, and bounced mail may indicate a temporary failure of a good email address (sometimes an intentional failure, like greylisting). The function also accepts the following keyword arguments (default as shown): `allow_smtputf8=True`: Set to `False` to prohibit internationalized addresses that would require the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) extension. `check_deliverability=True`: Set to `False` to skip the domain name resolution check. `allow_empty_local=False`: Set to `True` to allow an empty local part (i.e. `@example.com`), e.g. for validating Postfix aliases. Internationalized email addresses --------------------------------- The email protocol SMTP and the domain name system DNS have historically only allowed ASCII characters in email addresses and domain names, respectively. Each has adapted to internationalization in a separate way, creating two separate aspects to email address internationalization. ### Internationalized domain names (IDN) The first is [internationalized domain names (RFC 5891)](https://tools.ietf.org/html/rfc5891), a.k.a IDNA 2008. The DNS system has not been updated with Unicode support. Instead, internationalized domain names are converted into a special IDNA ASCII "[Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt)" form starting with `xn--`. 
When an email address has non-ASCII characters in its domain part, the domain part is replaced with its IDNA ASCII equivalent form in the process of mail transmission. Your mail submission library probably does this for you transparently. Note that most web browsers are currently in transition between IDNA 2003 (RFC 3490) and IDNA 2008 (RFC 5891) and [compliance around the web is not very good](http://archives.miloush.net/michkap/archive/2012/02/27/10273315.html) in any case, so be aware that edge cases are handled differently by different applications and libraries. This library conforms to IDNA 2008 using the [idna](https://github.com/kjd/idna) module by Kim Davies. ### Internationalized local parts The second sort of internationalization is internationalization in the *local* part of the address (before the @-sign). These email addresses require that your mail submission library and the mail servers along the route to the destination, including your own outbound mail server, all support the [SMTPUTF8 (RFC 6531)](https://tools.ietf.org/html/rfc6531) extension. Support for SMTPUTF8 varies. ### If you know ahead of time that SMTPUTF8 is not supported by your mail submission stack By default all internationalized forms are accepted by the validator. But if you know ahead of time that SMTPUTF8 is not supported by your mail submission stack, then you must filter out addresses that require SMTPUTF8 using the `allow_smtputf8=False` keyword argument (see above). This will cause the validation function to raise a `EmailSyntaxError` if delivery would require SMTPUTF8. That's just in those cases where non-ASCII characters appear before the @-sign. If you do not set `allow_smtputf8=False`, you can also check the value of the `smtputf8` field in the returned object. If your mail submission library doesn't support Unicode at all --- even in the domain part of the address --- then immediately prior to mail submission you must replace the email address with its ASCII-ized form. 
This library gives you back the ASCII-ized form in the `ascii_email` field in the returned object, which you can get like this: ```python valid = validate_email(email, allow_smtputf8=False) email = valid.ascii_email ``` The local part is left alone (if it has internationalized characters `allow_smtputf8=False` will force validation to fail) and the domain part is converted to [IDNA ASCII](https://tools.ietf.org/html/rfc5891). (You probably should not do this at account creation time so you don't change the user's login information without telling them.) ### UCS-4 support required for Python 2.7 Note that when using Python 2.7, it is required that it was built with UCS-4 support (see [here](https://stackoverflow.com/questions/29109944/python-returns-length-of-2-for-single-unicode-character-string)); otherwise emails with unicode characters outside of the BMP (Basic Multilingual Plane) will not validate correctly. Normalization ------------- The use of Unicode in email addresses introduced a normalization problem. Different Unicode strings can look identical and have the same semantic meaning to the user. The `email` field returned on successful validation provides the correctly normalized form of the given email address: ```python valid = validate_email("me@Domain.com") email = valid.ascii_email print(email) # prints: me@domain.com ``` Because an end-user might type their email address in different (but equivalent) un-normalized forms at different times, you ought to replace what they enter with the normalized form immediately prior to going into your database (during account creation), querying your database (during login), or sending outbound mail. Normalization may also change the length of an email address, and this may affect whether it is valid and acceptable by your SMTP provider. 
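To illustrate why comparing raw user input is unreliable, here is a minimal sketch that uses only the standard library's `unicodedata` module. It does not call this library, and `normalize_for_lookup` is a hypothetical helper, not part of email-validator. It approximates just two of the normalizations (NFC and lowercasing the domain) and performs no validation or IDNA processing, so treat it as a demonstration rather than a substitute for `validate_email`:

```python
import unicodedata


def normalize_for_lookup(address):
    """Hypothetical helper: NFC-normalize the address and lowercase the domain.

    This only approximates part of what validate_email does. It performs
    no syntax validation and no IDNA/UTS46 processing.
    """
    local, _, domain = address.rpartition("@")
    local = unicodedata.normalize("NFC", local)
    domain = unicodedata.normalize("NFC", domain).lower()
    return local + "@" + domain


# "e" followed by a combining acute accent vs. the precomposed character.
decomposed = "caf" + "e\u0301" + "@Example.COM"
precomposed = "caf\u00e9@example.com"

print(decomposed == precomposed)                        # False
print(normalize_for_lookup(decomposed) == precomposed)  # True

# Normalization changed the length of the address (17 -> 16 characters).
print(len(decomposed), len(normalize_for_lookup(decomposed)))
```

The two strings render identically but compare unequal until both are normalized, which is why the normalized form should be used both when storing an address and when looking one up.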
The normalizations include lowercasing the domain part of the email address (domain names are case-insensitive), [Unicode "NFC" normalization](https://en.wikipedia.org/wiki/Unicode_equivalence) of the whole address (which turns characters plus [combining characters](https://en.wikipedia.org/wiki/Combining_character) into precomposed characters where possible), replacement of [fullwidth and halfwidth characters](https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms) in the domain part, possibly other [UTS46](http://unicode.org/reports/tr46) mappings on the domain part, and conversion from Punycode to Unicode characters. (See [RFC 6532 (internationalized email) section 3.1](https://tools.ietf.org/html/rfc6532#section-3.1) and [RFC 5895 (IDNA 2008) section 2](http://www.ietf.org/rfc/rfc5895.txt).) Examples -------- For the email address `test@joshdata.me`, the returned object is: ```python ValidatedEmail( email='test@joshdata.me', local_part='test', domain='joshdata.me', ascii_email='test@joshdata.me', ascii_local_part='test', ascii_domain='joshdata.me', smtputf8=False, mx=[(10, 'box.occams.info')], mx_fallback_type=None) ``` For the fictitious address `example@ツ.life`, which has an internationalized domain but ASCII local part, the returned object is: ```python ValidatedEmail( email='example@ツ.life', local_part='example', domain='ツ.life', ascii_email='example@xn--bdk.life', ascii_local_part='example', ascii_domain='xn--bdk.life', smtputf8=False) ``` Note that `smtputf8` is `False` even though the domain part is internationalized because [SMTPUTF8](https://tools.ietf.org/html/rfc6531) is only needed if the local part of the address is internationalized (the domain part can be converted to IDNA ASCII Punycode). Also note that the `email` and `domain` fields provide a normalized form of the email address and domain name (casefolding and Unicode normalization as required by IDNA 2008). 
Calling `validate_email` with the ASCII form of the above email address, `example@xn--bdk.life`, returns the exact same information (i.e., the `email` field always will contain Unicode characters, not Punycode). For the fictitious address `ツ-test@joshdata.me`, which has an internationalized local part, the returned object is: ```python ValidatedEmail( email='ツ-test@joshdata.me', local_part='ツ-test', domain='joshdata.me', ascii_email=None, ascii_local_part=None, ascii_domain='joshdata.me', smtputf8=True) ``` Now `smtputf8` is `True` and `ascii_email` is `None` because the local part of the address is internationalized. The `local_part` and `email` fields return the normalized form of the address: certain Unicode characters (such as angstrom and ohm) may be replaced by other equivalent code points (a-with-ring and omega). Return value ------------ When an email address passes validation, the fields in the returned object are: | Field | Value | | -----:|-------| | `email` | The normalized form of the email address that you should put in your database. This merely combines the `local_part` and `domain` fields (see below). | | `ascii_email` | If set, an ASCII-only form of the email address by replacing the domain part with [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt). This field will be present when an ASCII-only form of the email address exists (including if the email address is already ASCII). If the local part of the email address contains internationalized characters, `ascii_email` will be `None`. If set, it merely combines `ascii_local_part` and `ascii_domain`. | | `local_part` | The local part of the given email address (before the @-sign) with Unicode NFC normalization applied. | | `ascii_local_part` | If set, the local part, which is composed of ASCII characters only. | | `domain` | The canonical internationalized Unicode form of the domain part of the email address. 
If the returned string contains non-ASCII characters, either the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your mail relay will be required to transmit the message or else the email address's domain part must be converted to IDNA ASCII first: Use `ascii_domain` field instead. | | `ascii_domain` | The [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt)-encoded form of the domain part of the given email address, as it would be transmitted on the wire. | | `smtputf8` | A boolean indicating that the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your mail relay will be required to transmit messages to this address because the local part of the address has non-ASCII characters (the local part cannot be IDNA-encoded). If `allow_smtputf8=False` is passed as an argument, this flag will always be false because an exception is raised if it would have been true. | | `mx` | A list of (priority, domain) tuples of MX records specified in the DNS for the domain (see [RFC 5321 section 5](https://tools.ietf.org/html/rfc5321#section-5)). May be `None` if the deliverability check could not be completed because of a temporary issue like a timeout. | | `mx_fallback_type` | `None` if an `MX` record is found. If no MX records are actually specified in DNS and instead are inferred, through an obsolete mechanism, from A or AAAA records, the value is the type of DNS record used instead (`A` or `AAAA`). May be `None` if the deliverability check could not be completed because of a temporary issue like a timeout. | Assumptions ----------- By design, this validator does not pass all email addresses that strictly conform to the standards. Many email address forms are obsolete or likely to cause trouble: * The validator assumes the email address is intended to be deliverable on the public Internet using DNS, and so the domain part of the email address must be a resolvable domain name. 
* The "quoted string" form of the local part of the email address (RFC 5321 4.1.2) is not permitted --- no one uses this anymore anyway. Quoted forms allow multiple @-signs, space characters, and other troublesome conditions. * The "literal" form for the domain part of an email address (an IP address) is not accepted --- no one uses this anymore anyway. Testing ------- Tests can be run using ```sh pip install -r test_requirements.txt make test ``` For Project Maintainers ----------------------- The package is distributed as a universal wheel and as a source package. To release: * Update the version number. * Follow the steps below to publish source and a universal wheel to pypi. * Make a release at https://github.com/JoshData/python-email-validator/releases/new. ```sh pip3 install twine rm -rf dist python3 setup.py sdist python3 setup.py bdist_wheel twine upload dist/* git tag v1.0.XXX # replace with version in setup.py git push --tags ``` Notes: The wheel is specified as universal in the file `setup.cfg` by the `universal = 1` key in the `[bdist_wheel]` section. 
python-email-validator-1.1.2/email_validator/000077500000000000000000000000001375107104000212365ustar00rootroot00000000000000python-email-validator-1.1.2/email_validator/__init__.py000066400000000000000000000573601375107104000233620ustar00rootroot00000000000000# -*- coding: utf-8 -*- import sys import re import unicodedata import dns.resolver import dns.exception import idna # implements IDNA 2008; Python's codec is only IDNA 2003 # Based on RFC 2822 section 3.2.4 / RFC 5322 section 3.2.3, these # characters are permitted in email addresses (not taking into # account internationalization): ATEXT = r'a-zA-Z0-9_!#\$%&\'\*\+\-/=\?\^`\{\|\}~' # A "dot atom text", per RFC 2822 3.2.4: DOT_ATOM_TEXT = '[' + ATEXT + ']+(?:\\.[' + ATEXT + ']+)*' # RFC 6531 section 3.3 extends the allowed characters in internationalized # addresses to also include three specific ranges of UTF8 defined in # RFC3629 section 4, which appear to be the Unicode code points from # U+0080 to U+10FFFF. ATEXT_UTF8 = ATEXT + u"\u0080-\U0010FFFF" DOT_ATOM_TEXT_UTF8 = '[' + ATEXT_UTF8 + ']+(?:\\.[' + ATEXT_UTF8 + ']+)*' # The domain part of the email address, after IDNA (ASCII) encoding, # must also satisfy the requirements of RFC 952/RFC 1123 which restrict # the allowed characters of hostnames further. The hyphen cannot be at # the beginning or end of a *dot-atom component* of a hostname either. ATEXT_HOSTNAME = r'(?:(?:[a-zA-Z0-9][a-zA-Z0-9\-]*)?[a-zA-Z0-9])' # Length constants # RFC 3696 + errata 1003 + errata 1690 (https://www.rfc-editor.org/errata_search.php?rfc=3696&eid=1690) # explains the maximum length of an email address is 254 octets. 
EMAIL_MAX_LENGTH = 254 LOCAL_PART_MAX_LENGTH = 64 DOMAIN_MAX_LENGTH = 255 # ease compatibility in type checking if sys.version_info >= (3,): unicode_class = str else: unicode_class = unicode # noqa: F821 # turn regexes to unicode (because 'ur' literals are not allowed in Py3) ATEXT = ATEXT.decode("ascii") DOT_ATOM_TEXT = DOT_ATOM_TEXT.decode("ascii") ATEXT_HOSTNAME = ATEXT_HOSTNAME.decode("ascii") DEFAULT_TIMEOUT = 15 # secs class EmailNotValidError(ValueError): """Parent class of all exceptions raised by this module.""" pass class EmailSyntaxError(EmailNotValidError): """Exception raised when an email address fails validation because of its form.""" pass class EmailUndeliverableError(EmailNotValidError): """Exception raised when an email address fails validation because its domain name does not appear deliverable.""" pass class ValidatedEmail(object): """The validate_email function returns objects of this type holding the normalized form of the email address and other information.""" """The email address that was passed to validate_email. (If passed as bytes, this will be a string.)""" original_email = None """The normalized email address, which should always be used in preference to the original address. The normalized address converts an IDNA ASCII domain name to Unicode, if possible, and performs Unicode normalization on the local part and on the domain (if originally Unicode). 
It is the concatenation of the local_part and domain attributes, separated by an @-sign.""" email = None """The local part of the email address after Unicode normalization.""" local_part = None """The domain part of the email address after Unicode normalization or conversion to Unicode from IDNA ascii.""" domain = None """If not None, a form of the email address that uses 7-bit ASCII characters only.""" ascii_email = None """If not None, the local part of the email address using 7-bit ASCII characters only.""" ascii_local_part = None """If not None, a form of the domain name that uses 7-bit ASCII characters only.""" ascii_domain = None """If True, the SMTPUTF8 feature of your mail relay will be required to transmit messages to this address. This flag is True just when ascii_local_part is missing. Otherwise it is False.""" smtputf8 = None """If a deliverability check is performed and if it succeeds, a list of (priority, domain) tuples of MX records specified in the DNS for the domain.""" mx = None """If no MX records are actually specified in DNS and instead are inferred, through an obsolete mechanism, from A or AAAA records, the value is the type of DNS record used instead (`A` or `AAAA`).""" mx_fallback_type = None """Tests use this constructor.""" def __init__(self, **kwargs): for k, v in kwargs.items(): setattr(self, k, v) """As a convenience, str(...) on instances of this class return the normalized address.""" def __str__(self): return self.email def __repr__(self): return "<ValidatedEmail {}>".format(self.email) """For backwards compatibility, some fields are also exposed through a dict-like interface. 
    Note that some of the names changed when they became attributes."""
    def __getitem__(self, key):
        if key == "email":
            return self.email
        if key == "email_ascii":
            return self.ascii_email
        if key == "local":
            return self.local_part
        if key == "domain":
            return self.ascii_domain
        if key == "domain_i18n":
            return self.domain
        if key == "smtputf8":
            return self.smtputf8
        if key == "mx":
            return self.mx
        if key == "mx-fallback":
            return self.mx_fallback_type
        raise KeyError()

    """Tests use this."""
    def __eq__(self, other):
        return (
            self.email == other.email
            and self.local_part == other.local_part
            and self.domain == other.domain
            and self.ascii_email == other.ascii_email
            and self.ascii_local_part == other.ascii_local_part
            and self.ascii_domain == other.ascii_domain
            and self.smtputf8 == other.smtputf8
            and repr(sorted(self.mx) if self.mx else self.mx)
            == repr(sorted(other.mx) if other.mx else other.mx)
            and self.mx_fallback_type == other.mx_fallback_type
        )

    """This helps produce the README."""
    def as_constructor(self):
        return "ValidatedEmail(" \
            + ",".join("\n {}={}".format(
                       key,
                       repr(getattr(self, key)))
                       for key in ('email', 'local_part', 'domain',
                                   'ascii_email', 'ascii_local_part', 'ascii_domain',
                                   'smtputf8', 'mx', 'mx_fallback_type')
                       ) \
            + ")"

    """Convenience method for accessing ValidatedEmail as a dict"""
    def as_dict(self):
        return self.__dict__


def __get_length_reason(addr, utf8=False, limit=EMAIL_MAX_LENGTH):
    diff = len(addr) - limit
    reason = "({}{} character{} too many)"
    prefix = "at least " if utf8 else ""
    suffix = "s" if diff > 1 else ""
    return reason.format(prefix, diff, suffix)


def validate_email(
    email,
    allow_smtputf8=True,
    allow_empty_local=False,
    check_deliverability=True,
    timeout=DEFAULT_TIMEOUT,
):
    """
    Validates an email address, raising an EmailNotValidError if the address is not valid or returning
    a ValidatedEmail instance with information about the address when it is valid. The email argument
    can be a str or a bytes instance, but if bytes it must be ASCII-only.
    """

    # Allow email to be a str or bytes instance. If bytes,
    # it must be ASCII because that's how the bytes work
    # on the wire with SMTP.
    if not isinstance(email, (str, unicode_class)):
        try:
            email = email.decode("ascii")
        except ValueError:
            raise EmailSyntaxError("The email address is not valid ASCII.")

    # At-sign.
    parts = email.split('@')
    if len(parts) != 2:
        raise EmailSyntaxError("The email address is not valid. It must have exactly one @-sign.")

    # Collect return values in this instance.
    ret = ValidatedEmail()
    ret.original_email = email

    # Validate the email address's local part syntax and get a normalized form.
    local_part_info = validate_email_local_part(parts[0],
                                                allow_smtputf8=allow_smtputf8,
                                                allow_empty_local=allow_empty_local)
    ret.local_part = local_part_info["local_part"]
    ret.ascii_local_part = local_part_info["ascii_local_part"]
    ret.smtputf8 = local_part_info["smtputf8"]

    # Validate the email address's domain part syntax and get a normalized form.
    domain_part_info = validate_email_domain_part(parts[1])
    ret.domain = domain_part_info["domain"]
    ret.ascii_domain = domain_part_info["ascii_domain"]

    # Construct the complete normalized form.
    ret.email = ret.local_part + "@" + ret.domain

    # If the email address has an ASCII form, add it.
    if not ret.smtputf8:
        ret.ascii_email = ret.ascii_local_part + "@" + ret.ascii_domain

    # If the email address has an ASCII representation, then we assume it may be
    # transmitted in ASCII (we can't assume SMTPUTF8 will be used on all hops to
    # the destination) and the length limit applies to ASCII characters (which is
    # the same as octets). The number of characters in the internationalized form
    # may be many fewer (because IDNA ASCII is verbose) and could be less than 254
    # Unicode characters, and of course the number of octets over the limit may
    # not be the number of characters over the limit, so if the email address is
    # internationalized, we can't give any simple information about why the address
    # is too long.
    #
    # In addition, check that the UTF-8 encoding (i.e. not IDNA ASCII and not
    # Unicode characters) is at most 254 octets. If the address is transmitted using
    # SMTPUTF8, then the length limit probably applies to the UTF-8 encoded octets.
    # If the email address has an ASCII form that differs from its internationalized
    # form, I don't think the internationalized form can be longer, and so the ASCII
    # form length check would be sufficient. If there is no ASCII form, then we have
    # to check the UTF-8 encoding. The UTF-8 encoding could be up to about four times
    # longer than the number of characters.
    #
    # See the length checks on the local part and the domain.
    if ret.ascii_email and len(ret.ascii_email) > EMAIL_MAX_LENGTH:
        if ret.ascii_email == ret.email:
            reason = __get_length_reason(ret.ascii_email)
        elif len(ret.email) > EMAIL_MAX_LENGTH:
            # If there are more than 254 characters, then the ASCII
            # form is definitely going to be too long.
            reason = __get_length_reason(ret.email, utf8=True)
        else:
            reason = "(when converted to IDNA ASCII)"
        raise EmailSyntaxError("The email address is too long {}.".format(reason))
    if len(ret.email.encode("utf8")) > EMAIL_MAX_LENGTH:
        if len(ret.email) > EMAIL_MAX_LENGTH:
            # If there are more than 254 characters, then the UTF-8
            # encoding is definitely going to be too long.
            reason = __get_length_reason(ret.email, utf8=True)
        else:
            reason = "(when encoded in bytes)"
        raise EmailSyntaxError("The email address is too long {}.".format(reason))

    if check_deliverability:
        # Validate the email address's deliverability and update the
        # return value with metadata.
        deliverability_info = validate_email_deliverability(ret["domain"], ret["domain_i18n"], timeout)
        if "mx" in deliverability_info:
            ret.mx = deliverability_info["mx"]
            ret.mx_fallback_type = deliverability_info["mx-fallback"]

    return ret


def validate_email_local_part(local, allow_smtputf8=True, allow_empty_local=False):
    # Validates the local part of an email address.

    if len(local) == 0:
        if not allow_empty_local:
            raise EmailSyntaxError("There must be something before the @-sign.")
        else:
            # The caller allows an empty local part. Useful for validating certain
            # Postfix aliases.
            return {
                "local_part": local,
                "ascii_local_part": local,
                "smtputf8": False,
            }

    # RFC 5321 4.5.3.1.1
    # We're checking the number of characters here. If the local part
    # is ASCII-only, then that's the same as bytes (octets). If it's
    # internationalized, then the UTF-8 encoding may be longer, but
    # that may not be relevant. We will check the total address length
    # instead.
    if len(local) > LOCAL_PART_MAX_LENGTH:
        reason = __get_length_reason(local, limit=LOCAL_PART_MAX_LENGTH)
        raise EmailSyntaxError("The email address is too long before the @-sign {}.".format(reason))

    # Check the local part against the regular expression for the older ASCII requirements.
    m = re.match(DOT_ATOM_TEXT + "\\Z", local)
    if m:
        # Return the local part unchanged and flag that SMTPUTF8 is not needed.
        return {
            "local_part": local,
            "ascii_local_part": local,
            "smtputf8": False,
        }

    else:
        # The local part failed the ASCII check. Now try the extended internationalized requirements.
        m = re.match(DOT_ATOM_TEXT_UTF8 + "\\Z", local)
        if not m:
            # It's not a valid internationalized address either. Report which characters were not valid.
            bad_chars = ', '.join(sorted(set(
                c for c in local
                if not re.match(u"[" + (ATEXT if not allow_smtputf8 else ATEXT_UTF8) + u"]", c)
            )))
            raise EmailSyntaxError("The email address contains invalid characters before the @-sign: %s." % bad_chars)

        # It would be valid if internationalized characters were allowed by the caller.
        if not allow_smtputf8:
            raise EmailSyntaxError("Internationalized characters before the @-sign are not supported.")

        # It's valid.

        # RFC 6532 section 3.1 also says that Unicode NFC normalization should be applied,
        # so we'll return the normalized local part in the return value.
        local = unicodedata.normalize("NFC", local)

        # Flag that SMTPUTF8 will be required for deliverability.
        return {
            "local_part": local,
            "ascii_local_part": None,  # no ASCII form is possible
            "smtputf8": True,
        }


def validate_email_domain_part(domain):
    # Empty?
    if len(domain) == 0:
        raise EmailSyntaxError("There must be something after the @-sign.")

    # Perform UTS-46 normalization, which includes casefolding, NFC normalization,
    # and converting all label separators (the period/full stop, fullwidth full stop,
    # ideographic full stop, and halfwidth ideographic full stop) to basic periods.
    # It will also raise an exception if there is an invalid character in the input,
    # such as "⒈" which is invalid because it would expand to include a period.
    try:
        domain = idna.uts46_remap(domain, std3_rules=False, transitional=False)
    except idna.IDNAError as e:
        raise EmailSyntaxError("The domain name %s contains invalid characters (%s)." % (domain, str(e)))

    # Now we can perform basic checks on the use of periods (since equivalent
    # symbols have been mapped to periods). These checks are needed because the
    # IDNA library does not handle domains with empty labels well (i.e. initial
    # dot, trailing dot, or two dots in a row).
    if domain.endswith("."):
        raise EmailSyntaxError("An email address cannot end with a period.")
    if domain.startswith("."):
        raise EmailSyntaxError("An email address cannot have a period immediately after the @-sign.")
    if ".." in domain:
        raise EmailSyntaxError("An email address cannot have two periods in a row.")

    # Regardless of whether international characters are actually used,
    # first convert to IDNA ASCII. For ASCII-only domains, the transformation
    # does nothing. If internationalized characters are present, the MTA
    # must either support SMTPUTF8 or the mail client must convert the
    # domain name to IDNA before submission.
    #
    # Unfortunately this step incorrectly 'fixes' domain names with leading
    # periods by removing them, so we have to check for this above.
    # It also gives
    # a funky error message ("No input") when there are two periods in a
    # row, also checked separately above.
    try:
        ascii_domain = idna.encode(domain, uts46=False).decode("ascii")
    except idna.IDNAError as e:
        if "Domain too long" in str(e):
            # We can't really be more specific because UTS-46 normalization means
            # the length check is applied to a string that is different from the
            # one the user supplied. Also I'm not sure if the length check applies
            # to the internationalized form, the IDNA ASCII form, or even both!
            raise EmailSyntaxError("The email address is too long after the @-sign.")
        raise EmailSyntaxError("The domain name %s contains invalid characters (%s)." % (domain, str(e)))

    # We may have been given an IDNA ASCII domain to begin with. Check
    # that the domain actually conforms to IDNA. It could look like IDNA
    # but not be actual IDNA. For ASCII-only domains, the conversion out
    # of IDNA just gives the same thing back.
    #
    # This gives us the canonical internationalized form of the domain,
    # which we should use in all error messages.
    try:
        domain_i18n = idna.decode(ascii_domain.encode('ascii'))
    except idna.IDNAError as e:
        raise EmailSyntaxError("The domain name %s is not valid IDNA (%s)." % (ascii_domain, str(e)))

    # RFC 5321 4.5.3.1.2
    # We're checking the number of bytes (octets) here, which can be much
    # higher than the number of characters in internationalized domains,
    # on the assumption that the domain may be transmitted without SMTPUTF8
    # as IDNA ASCII. This is also checked by idna.encode, so this exception
    # is never reached.
    if len(ascii_domain) > DOMAIN_MAX_LENGTH:
        raise EmailSyntaxError("The email address is too long after the @-sign.")

    # A "dot atom text", per RFC 2822 3.2.4, but using the restricted
    # characters allowed in a hostname (see ATEXT_HOSTNAME above).
    DOT_ATOM_TEXT = ATEXT_HOSTNAME + r'(?:\.' + ATEXT_HOSTNAME + r')*'

    # Check the regular expression. This is probably entirely redundant
    # with idna.decode, which also checks this format.
    m = re.match(DOT_ATOM_TEXT + "\\Z", ascii_domain)
    if not m:
        raise EmailSyntaxError("The email address contains invalid characters after the @-sign.")

    # All publicly deliverable addresses have domain names with at least
    # one period. We also know that all TLDs end with a letter.
    if "." not in ascii_domain:
        raise EmailSyntaxError("The domain name %s is not valid. It should have a period." % domain_i18n)
    if not re.search(r"[A-Za-z]\Z", ascii_domain):
        raise EmailSyntaxError(
            "The domain name %s is not valid. It is not within a valid top-level domain." % domain_i18n
        )

    # Return the IDNA ASCII-encoded form of the domain, which is how it
    # would be transmitted on the wire (except when used with SMTPUTF8
    # possibly), as well as the canonical Unicode form of the domain,
    # which is better for display purposes. This should also take care
    # of RFC 6532 section 3.1's suggestion to apply Unicode NFC
    # normalization to addresses.
    return {
        "ascii_domain": ascii_domain,
        "domain": domain_i18n,
    }


def validate_email_deliverability(domain, domain_i18n, timeout=DEFAULT_TIMEOUT):
    # Check that the domain resolves to an MX record. If there is no MX record,
    # try an A or AAAA record which is a deprecated fallback for deliverability.

    def dns_resolver_resolve_shim(resolver, domain, record):
        try:
            # dns.resolver.Resolver.resolve is new to dnspython 2.x.
            # https://dnspython.readthedocs.io/en/latest/resolver-class.html#dns.resolver.Resolver.resolve
            return resolver.resolve(domain, record)
        except AttributeError:
            # dnspython 2.x is only available in Python 3.6 and later. For earlier versions
            # of Python, we maintain compatibility with dnspython 1.x which has a
            # dnspython.resolver.Resolver.query method instead. The only difference is that
            # query may treat the domain as relative and use the system's search domains,
            # which we prevent by adding a "." to the domain name to make it absolute.
            # dns.resolver.Resolver.query is deprecated in dnspython version 2.x.
            # https://dnspython.readthedocs.io/en/latest/resolver-class.html#dns.resolver.Resolver.query
            return resolver.query(domain + ".", record)

    try:
        # We need a way to check how timeouts are handled in the tests. So we
        # have a secret variable that if set makes this method always test the
        # handling of a timeout.
        if getattr(validate_email_deliverability, 'TEST_CHECK_TIMEOUT', False):
            raise dns.exception.Timeout()

        resolver = dns.resolver.get_default_resolver()

        if timeout:
            resolver.lifetime = timeout

        try:
            # Try resolving for MX records and get them in sorted priority order.
            response = dns_resolver_resolve_shim(resolver, domain, "MX")
            mtas = sorted([(r.preference, str(r.exchange).rstrip('.')) for r in response])
            mx_fallback = None

        except (dns.resolver.NoNameservers, dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):

            # If there was no MX record, fall back to an A record.
            try:
                response = dns_resolver_resolve_shim(resolver, domain, "A")
                mtas = [(0, str(r)) for r in response]
                mx_fallback = "A"

            except (dns.resolver.NoNameservers, dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):

                # If there was no A record, fall back to an AAAA record.
                try:
                    response = dns_resolver_resolve_shim(resolver, domain, "AAAA")
                    mtas = [(0, str(r)) for r in response]
                    mx_fallback = "AAAA"

                except (dns.resolver.NoNameservers, dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):

                    # If there was no MX, A, or AAAA record, then mail to
                    # this domain is not deliverable.
                    raise EmailUndeliverableError("The domain name %s does not exist." % domain_i18n)

    except dns.exception.Timeout:
        # A timeout could occur for various reasons, so don't treat it as a failure.
        return {
            "unknown-deliverability": "timeout",
        }

    except EmailUndeliverableError:
        # Don't let these get clobbered by the wider except block below.
        raise

    except Exception as e:
        # Unhandled conditions should not propagate.
        raise EmailUndeliverableError(
            "There was an error while checking if the domain name in the email address is deliverable: " + str(e)
        )

    return {
        "mx": mtas,
        "mx-fallback": mx_fallback,
    }


def main():
    import sys
    import json

    def __utf8_input_shim(input_str):
        if sys.version_info < (3,):
            return input_str.decode("utf-8")
        return input_str

    def __utf8_output_shim(output_str):
        if sys.version_info < (3,):
            return unicode_class(output_str).encode("utf-8")
        return output_str

    if len(sys.argv) == 1:
        for line in sys.stdin:
            email = __utf8_input_shim(line.strip())
            try:
                validate_email(email)
            except EmailNotValidError as e:
                print(__utf8_output_shim("{} {}".format(email, e)))
    else:
        # Validate the email address passed on the command line.
        email = __utf8_input_shim(sys.argv[1])
        try:
            result = validate_email(email)
            print(json.dumps(result.as_dict(), indent=2, sort_keys=True, ensure_ascii=False))
        except EmailNotValidError as e:
            print(__utf8_output_shim(e))


if __name__ == "__main__":
    main()

python-email-validator-1.1.2/setup.cfg000066400000000000000000000002331375107104000177210ustar00rootroot00000000000000
[bdist_wheel]
universal = 1

[metadata]
license_file = LICENSE

[flake8]
max-line-length = 120

[tool:pytest]
testpaths = tests
filterwarnings =
    error

python-email-validator-1.1.2/setup.py000066400000000000000000000027631375107104000176240ustar00rootroot00000000000000
# -*- coding: utf-8 -*-

from setuptools import setup, find_packages
from codecs import open

setup(
    name='email-validator',
    version='1.1.2',

    description='A robust email syntax and deliverability validation library for Python 2.x/3.x.',
    long_description=open("README.md", encoding='utf-8').read(),
    long_description_content_type="text/markdown",
    url='https://github.com/JoshData/python-email-validator',

    author=u'Joshua Tauberer',
    author_email=u'jt@occams.info',
    license='CC0 (copyright waived)',

    # See https://pypi.org/pypi?%3Aaction=list_classifiers
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'License :: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication',

        'Intended Audience :: Developers',
        'Topic :: Software Development :: Libraries :: Python Modules',

        'Programming Language :: Python :: 2',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.4',
        'Programming Language :: Python :: 3.5',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
    ],

    keywords="email address validator",

    packages=find_packages(),
    install_requires=[
        "idna>=2.0.0",
        "dnspython>=1.15.0"],

    entry_points={
        'console_scripts': [
            'email_validator=email_validator:main',
        ],
    },
)

python-email-validator-1.1.2/test_requirements.txt000066400000000000000000000001171375107104000224240ustar00rootroot00000000000000
coverage==4.5.4
docutils==0.15.2
flake8==3.7.9
pytest==5.2.2
pytest-cov==2.8.1

python-email-validator-1.1.2/tests/000077500000000000000000000000001375107104000172445ustar00rootroot00000000000000
python-email-validator-1.1.2/tests/test_main.py000066400000000000000000000362561375107104000216150ustar00rootroot00000000000000
import pytest
from email_validator import EmailSyntaxError, EmailUndeliverableError, \
    validate_email, validate_email_deliverability, \
    ValidatedEmail
# Let's test main but rename it to be clear
from email_validator import main as validator_main


@pytest.mark.parametrize(
    'email_input,output',
    [
        (
            'Abc@example.com',
            ValidatedEmail(
                local_part='Abc',
                ascii_local_part='Abc',
                smtputf8=False,
                ascii_domain='example.com',
                domain='example.com',
                email='Abc@example.com',
                ascii_email='Abc@example.com',
            ),
        ),
        (
            'Abc.123@example.com',
            ValidatedEmail(
                local_part='Abc.123',
                ascii_local_part='Abc.123',
                smtputf8=False,
                ascii_domain='example.com',
                domain='example.com',
                email='Abc.123@example.com',
                ascii_email='Abc.123@example.com',
            ),
        ),
        (
            'user+mailbox/department=shipping@example.com',
            ValidatedEmail(
                local_part='user+mailbox/department=shipping',
                ascii_local_part='user+mailbox/department=shipping',
                smtputf8=False,
                ascii_domain='example.com',
                domain='example.com',
                email='user+mailbox/department=shipping@example.com',
                ascii_email='user+mailbox/department=shipping@example.com',
            ),
        ),
        (
            "!#$%&'*+-/=?^_`.{|}~@example.com",
            ValidatedEmail(
                local_part="!#$%&'*+-/=?^_`.{|}~",
                ascii_local_part="!#$%&'*+-/=?^_`.{|}~",
                smtputf8=False,
                ascii_domain='example.com',
                domain='example.com',
                email="!#$%&'*+-/=?^_`.{|}~@example.com",
                ascii_email="!#$%&'*+-/=?^_`.{|}~@example.com",
            ),
        ),
        (
            '伊昭傑@郵件.商務',
            ValidatedEmail(
                local_part='伊昭傑',
                smtputf8=True,
                ascii_domain='xn--5nqv22n.xn--lhr59c',
                domain='郵件.商務',
                email='伊昭傑@郵件.商務',
            ),
        ),
        (
            'राम@मोहन.ईन्फो',
            ValidatedEmail(
                local_part='राम',
                smtputf8=True,
                ascii_domain='xn--l2bl7a9d.xn--o1b8dj2ki',
                domain='मोहन.ईन्फो',
                email='राम@मोहन.ईन्फो',
            ),
        ),
        (
            'юзер@екзампл.ком',
            ValidatedEmail(
                local_part='юзер',
                smtputf8=True,
                ascii_domain='xn--80ajglhfv.xn--j1aef',
                domain='екзампл.ком',
                email='юзер@екзампл.ком',
            ),
        ),
        (
            'θσερ@εχαμπλε.ψομ',
            ValidatedEmail(
                local_part='θσερ',
                smtputf8=True,
                ascii_domain='xn--mxahbxey0c.xn--xxaf0a',
                domain='εχαμπλε.ψομ',
                email='θσερ@εχαμπλε.ψομ',
            ),
        ),
        (
            '葉士豪@臺網中心.tw',
            ValidatedEmail(
                local_part='葉士豪',
                smtputf8=True,
                ascii_domain='xn--fiqq24b10vi0d.tw',
                domain='臺網中心.tw',
                email='葉士豪@臺網中心.tw',
            ),
        ),
        (
            'jeff@臺網中心.tw',
            ValidatedEmail(
                local_part='jeff',
                ascii_local_part='jeff',
                smtputf8=False,
                ascii_domain='xn--fiqq24b10vi0d.tw',
                domain='臺網中心.tw',
                email='jeff@臺網中心.tw',
                ascii_email='jeff@xn--fiqq24b10vi0d.tw',
            ),
        ),
        (
            '葉士豪@臺網中心.台灣',
            ValidatedEmail(
                local_part='葉士豪',
                smtputf8=True,
                ascii_domain='xn--fiqq24b10vi0d.xn--kpry57d',
                domain='臺網中心.台灣',
                email='葉士豪@臺網中心.台灣',
            ),
        ),
        (
            'jeff葉@臺網中心.tw',
            ValidatedEmail(
                local_part='jeff葉',
                smtputf8=True,
                ascii_domain='xn--fiqq24b10vi0d.tw',
                domain='臺網中心.tw',
                email='jeff葉@臺網中心.tw',
            ),
        ),
        (
            'ñoñó@example.com',
            ValidatedEmail(
                local_part='ñoñó',
                smtputf8=True,
                ascii_domain='example.com',
                domain='example.com',
                email='ñoñó@example.com',
            ),
        ),
        (
            '我買@example.com',
            ValidatedEmail(
                local_part='我買',
                smtputf8=True,
                ascii_domain='example.com',
                domain='example.com',
                email='我買@example.com',
            ),
        ),
        (
            '甲斐黒川日本@example.com',
            ValidatedEmail(
                local_part='甲斐黒川日本',
                smtputf8=True,
                ascii_domain='example.com',
                domain='example.com',
                email='甲斐黒川日本@example.com',
            ),
        ),
        (
            'чебурашкаящик-с-апельсинами.рф@example.com',
            ValidatedEmail(
                local_part='чебурашкаящик-с-апельсинами.рф',
                smtputf8=True,
                ascii_domain='example.com',
                domain='example.com',
                email='чебурашкаящик-с-апельсинами.рф@example.com',
            ),
        ),
        (
            'उदाहरण.परीक्ष@domain.with.idn.tld',
            ValidatedEmail(
                local_part='उदाहरण.परीक्ष',
                smtputf8=True,
                ascii_domain='domain.with.idn.tld',
                domain='domain.with.idn.tld',
                email='उदाहरण.परीक्ष@domain.with.idn.tld',
            ),
        ),
        (
            'ιωάννης@εεττ.gr',
            ValidatedEmail(
                local_part='ιωάννης',
                smtputf8=True,
                ascii_domain='xn--qxaa9ba.gr',
                domain='εεττ.gr',
                email='ιωάννης@εεττ.gr',
            ),
        ),
    ],
)
def test_email_valid(email_input, output):
    # print(f'({email_input!r}, {validate_email(email_input, check_deliverability=False)!r}),')
    assert validate_email(email_input, check_deliverability=False) == output


@pytest.mark.parametrize(
    'email_input,error_msg',
    [
        ('my@.leadingdot.com', 'An email address cannot have a period immediately after the @-sign.'),
        ('my@..leadingfwdot.com', 'An email address cannot have a period immediately after the @-sign.'),
        ('my@..twodots.com', 'An email address cannot have a period immediately after the @-sign.'),
        ('my@twodots..com', 'An email address cannot have two periods in a row.'),
        ('my@baddash.-.com',
         'The domain name baddash.-.com contains invalid characters (Label must not start or end with a hyphen).'),
        ('my@baddash.-a.com',
         'The domain name baddash.-a.com contains invalid characters (Label must not start or end with a hyphen).'),
        ('my@baddash.b-.com',
         'The domain name baddash.b-.com contains invalid characters (Label must not start or end with a hyphen).'),
        ('my@example.com\n',
         'The domain name example.com\n contains invalid characters (Codepoint U+000A at position 4 of '
         '\'com\\n\' not allowed).'),
        ('my@example\n.com',
         'The domain name example\n.com contains invalid characters (Codepoint U+000A at position 8 of '
         '\'example\\n\' not allowed).'),
        ('.leadingdot@domain.com', 'The email address contains invalid characters before the @-sign: ..'),
        ('..twodots@domain.com', 'The email address contains invalid characters before the @-sign: ..'),
        ('twodots..here@domain.com', 'The email address contains invalid characters before the @-sign: ..'),
        ('me@⒈wouldbeinvalid.com',
         "The domain name ⒈wouldbeinvalid.com contains invalid characters (Codepoint U+2488 not allowed "
         "at position 1 in '⒈wouldbeinvalid.com')."),
        ('@example.com', 'There must be something before the @-sign.'),
        ('\nmy@example.com', 'The email address contains invalid characters before the @-sign: \n.'),
        ('m\ny@example.com', 'The email address contains invalid characters before the @-sign: \n.'),
        ('my\n@example.com', 'The email address contains invalid characters before the @-sign: \n.'),
        ('11111111112222222222333333333344444444445555555555666666666677777@example.com', 'The email address is too long before the @-sign (1 character too many).'),
        ('111111111122222222223333333333444444444455555555556666666666777777@example.com', 'The email address is too long before the @-sign (2 characters too many).'),
        ('me@1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.111111111122222222223333333333444444444455555555556.com', 'The email address is too long after the @-sign.'),
        ('my.long.address@1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.11111111112222222222333333333344444.info', 'The email address is too long (2 characters too many).'),
        ('my.long.address@λ111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.11111111112222222222333333.info', 'The email address is too long (when converted to IDNA ASCII).'),
        ('my.long.address@λ111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444.info', 'The email address is too long (at least 1 character too many).'),
        ('my.λong.address@1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.111111111122222222223333333333444.info', 'The email address is too long (when encoded in bytes).'),
        ('my.λong.address@1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444444444555555555.6666666666777777777788888888889999999999000000000.1111111111222222222233333333334444.info', 'The email address is too long (at least 1 character too many).'),
    ],
)
def test_email_invalid(email_input, error_msg):
    with pytest.raises(EmailSyntaxError) as exc_info:
        validate_email(email_input)
    # print(f'({email_input!r}, {str(exc_info.value)!r}),')
    assert str(exc_info.value) == error_msg


def test_dict_accessor():
    input_email = "testaddr@example.com"
    valid_email = validate_email(input_email, check_deliverability=False)
    assert isinstance(valid_email.as_dict(), dict)
    assert valid_email.as_dict()["original_email"] == input_email


def test_deliverability_no_records():
    assert validate_email_deliverability('example.com', 'example.com') == {'mx': [(0, '')], 'mx-fallback': None}


def test_deliverability_found():
    response = validate_email_deliverability('gmail.com', 'gmail.com')
    assert response.keys() == {'mx', 'mx-fallback'}
    assert response['mx-fallback'] is None
    assert len(response['mx']) > 1
    assert len(response['mx'][0]) == 2
    assert isinstance(response['mx'][0][0], int)
    assert response['mx'][0][1].endswith('.com')


def test_deliverability_fails():
    domain = 'xkxufoekjvjfjeodlfmdfjcu.com'
    with pytest.raises(EmailUndeliverableError, match='The domain name {} does not exist'.format(domain)):
        validate_email_deliverability(domain, domain)


def test_deliverability_dns_timeout():
    validate_email_deliverability.TEST_CHECK_TIMEOUT = True
    response = validate_email_deliverability('gmail.com', 'gmail.com')
    assert "mx" not in response
    assert response.get("unknown-deliverability") == "timeout"
    validate_email('test@gmail.com')
    del validate_email_deliverability.TEST_CHECK_TIMEOUT


def test_main_single_good_input(monkeypatch, capsys):
    import json
    test_email = "test@example.com"
    monkeypatch.setattr('sys.argv', ['email_validator', test_email])
    validator_main()
    stdout, _ = capsys.readouterr()
    output = json.loads(str(stdout))
    assert isinstance(output, dict)
    assert validate_email(test_email).original_email == output["original_email"]


def test_main_single_bad_input(monkeypatch, capsys):
    bad_email = 'test@..com'
    monkeypatch.setattr('sys.argv', ['email_validator', bad_email])
    validator_main()
    stdout, _ = capsys.readouterr()
    assert stdout == 'An email address cannot have a period immediately after the @-sign.\n'


def test_main_multi_input(monkeypatch, capsys):
    import io
    test_cases = ["test@example.com", "test2@example.com", "test@.com", "test3@.com"]
    test_input = io.StringIO("\n".join(test_cases))
    monkeypatch.setattr('sys.stdin', test_input)
    monkeypatch.setattr('sys.argv', ['email_validator'])
    validator_main()
    stdout, _ = capsys.readouterr()
    assert test_cases[0] not in stdout
    assert test_cases[1] not in stdout
    assert test_cases[2] in stdout
    assert test_cases[3] in stdout


def test_main_input_shim(monkeypatch, capsys):
    import json
    monkeypatch.setattr('sys.version_info', (2, 7))
    test_email = b"test@example.com"
    monkeypatch.setattr('sys.argv', ['email_validator', test_email])
    validator_main()
    stdout, _ = capsys.readouterr()
    output = json.loads(str(stdout))
    assert isinstance(output, dict)
    assert validate_email(test_email).original_email == output["original_email"]


def test_main_output_shim(monkeypatch, capsys):
    monkeypatch.setattr('sys.version_info', (2, 7))
    test_email = b"test@.com"
    monkeypatch.setattr('sys.argv', ['email_validator', test_email])
    validator_main()
    stdout, _ = capsys.readouterr()

    # This looks bad but it has to do with the way python 2.7 prints vs py3
    # The \n is part of the print statement, not part of the string, which is what the b'...' is
    # Since we're mocking py 2.7 here instead of actually using 2.7, this was the closest I could get
    assert stdout == "b'An email address cannot have a period immediately after the @-sign.'\n"