gTTS-2.0.3/0000775000372000037200000000000013405123321013171 5ustar travistravis00000000000000gTTS-2.0.3/CHANGELOG.rst0000664000372000037200000003013113405123222015210 0ustar travistravis00000000000000.. NOTE: You should *NOT* be adding new change log entries to this file, this file is managed by towncrier. You *may* edit previous change logs to fix problems like typo corrections or such. To add a new change log entry, please see CONTRIBUTING.rst Changelog ========= .. towncrier release notes start 2.0.3 (2018-12-15) ------------------ Features ~~~~~~~~ - Added new tokenizer case for ':' preventing cut in the middle of a time notation (`#135 `_) Misc ~~~~ - `#159 `_ 2.0.2 (2018-12-09) ------------------ Features ~~~~~~~~ - Added Python 3.7 support, modernization of packaging, testing and CI (`#126 `_) Bugfixes ~~~~~~~~ - Fixed language retrieval/validation broken from new Google Translate page (`#156 `_) 2.0.1 (2018-06-20) ------------------ Bugfixes ~~~~~~~~ - Fixed an UnicodeDecodeError when installing gTTS if system locale was not utf-8 (`#120 `_) Improved Documentation ~~~~~~~~~~~~~~~~~~~~~~ - Added *Pre-processing and tokenizing > Minimizing* section about the API's 100 characters limit and how larger tokens are handled (`#121 `_) Misc ~~~~ - `#122 `_ 2.0.0 (2018-04-30) ------------------ (`#108 `_) Features ~~~~~~~~ - The ``gtts`` module - New logger ("gtts") replaces all occurrences of ``print()`` - Languages list is now obtained automatically (``gtts.lang``) (`#91 `_, `#94 `_, `#106 `_) - Added a curated list of language sub-tags that have been observed to provide different dialects or accents (e.g. "en-gb", "fr-ca") - New ``gTTS()`` parameter ``lang_check`` to disable language checking. - ``gTTS()`` now delegates the ``text`` tokenizing to the API request methods (i.e. 
``write_to_fp()``, ``save()``), allowing ``gTTS`` instances to be modified/reused - Rewrote tokenizing and added pre-processing (see below) - New ``gTTS()`` parameters ``pre_processor_funcs`` and ``tokenizer_func`` to configure pre-processing and tokenizing (or use a 3rd party tokenizer) - Error handling: - Added new exception ``gTTSError`` raised on API request errors. It attempts to guess what went wrong based on known information and observed behaviour (`#60 `_, `#106 `_) - ``gTTS.write_to_fp()`` and ``gTTS.save()`` also raise ``gTTSError`` on `gtts_token` error - ``gTTS.write_to_fp()`` raises ``TypeError`` when ``fp`` is not a file-like object or one that doesn't take bytes - ``gTTS()`` raises ``ValueError`` on unsupported languages (and ``lang_check`` is ``True``) - More fine-grained error handling throughout (e.g. `request failed` vs. `request successful with a bad response`) - Tokenizer (and new pre-processors): - Rewrote and greatly expanded tokenizer (``gtts.tokenizer``) - Smarter token 'cleaning' that will remove tokens that only contain characters that can't be spoken (i.e. punctuation and whitespace) - Decoupled token minimizing from tokenizing, making the latter usable in other contexts - New flexible speech-centric text pre-processing - New flexible full-featured regex-based tokenizer (``gtts.tokenizer.core.Tokenizer``) - New ``RegexBuilder``, ``PreProcessorRegex`` and ``PreProcessorSub`` classes to make writing regex-powered text `pre-processors` and `tokenizer cases` easier - Pre-processors: - Re-form words cut by end-of-line hyphens - Remove periods after a (customizable) list of known abbreviations (e.g. "jr", "sr", "dr") that can be spoken the same without a period - Perform speech corrections by doing word-for-word replacements from a (customizable) list of tuples - Tokenizing: - Keep punctuation that modify the inflection of speech (e.g. "?", "!") - Don't split in the middle of numbers (e.g. 
"10.5", "20,000,000") (`#101 `_) - Don't split on "dotted" abbreviations and acronyms (e.g. "U.S.A") - Added Chinese comma (","), ellipsis ("…") to punctuation list to tokenize on (`#86 `_) - The ``gtts-cli`` command-line tool - Rewrote cli as first-class citizen module (``gtts.cli``), powered by `Click `_ - Windows support using `setuptools`' `entry_points` - Better support for Unicode I/O in Python 2 - All arguments are now pre-validated - New ``--nocheck`` flag to skip language pre-checking - New ``--all`` flag to list all available languages - Either the ``--file`` option or the ```` argument can be set to "-" to read from ``stdin`` - The ``--debug`` flag uses logging and doesn't pollute ``stdout`` anymore Bugfixes ~~~~~~~~ - ``_minimize()``: Fixed an infinite recursion loop that would occur when a token started with the minimizing delimiter (i.e. a space) (`#86 `_) - ``_minimize()``: Handle the case where a token of more than 100 characters did not contain a space (e.g. in Chinese). - Fixed an issue that fused multiline text together if the total number of characters was less than 100 - Fixed ``gtts-cli`` Unicode errors in Python 2.7 (famous last words) (`#78 `_, `#93 `_, `#96 `_) Deprecations and Removals ~~~~~~~~~~~~~~~~~~~~~~~~~ - Dropped Python 3.3 support - Removed ``debug`` parameter of ``gTTS`` (in favour of logger) - ``gtts-cli``: Changed long option name of ``-o`` to ``--output`` instead of ``--destination`` - ``gTTS()`` will raise a ``ValueError`` rather than an ``AssertionError`` on unsupported language Improved Documentation ~~~~~~~~~~~~~~~~~~~~~~ - Rewrote all documentation files as reStructuredText - Comprehensive documentation written for `Sphinx `_, published to http://gtts.readthedocs.io - Changelog built with `towncrier `_ Misc ~~~~ - Major test re-work - Language tests can read a ``TEST_LANGS`` environment variable so not all language tests are run every time.
- Added `AppVeyor `_ CI for Windows - `PEP 8 `_ compliance 1.2.2 (2017-08-15) ------------------ Misc ~~~~ - Update LICENCE, add to manifest (`#77 `_) 1.2.1 (2017-08-02) ------------------ Features ~~~~~~~~ - Add Unicode punctuation to the tokenizer (such as for Chinese and Japanese) (`#75 `_) Bugfixes ~~~~~~~~ - Fix > 100 characters non-ASCII split, ``unicode()`` for Python 2 (`#71 `_, `#73 `_, `#75 `_) 1.2.0 (2017-04-15) ------------------ Features ~~~~~~~~ - Option for slower read speed (``slow=True`` for ``gTTS()``, ``--slow`` for ``gtts-cli``) (`#40 `_, `#41 `_, `#64 `_, `#67 `_) - System proxy settings are passed transparently to all http requests (`#45 `_, `#68 `_) - Silence SSL warnings from urllib3 (`#69 `_) Bugfixes ~~~~~~~~ - The text to read is now cut in proper chunks in Python 2 unicode. This broke reading for many languages such as Russian. - Disabled SSL verify on http requests to accommodate certain firewalls and proxies. - Better Python 2/3 support in general (`#9 `_, `#48 `_, `#68 `_) Deprecations and Removals ~~~~~~~~~~~~~~~~~~~~~~~~~ - 'pt-br' : 'Portuguese (Brazil)' (it was the same as 'pt' and not Brazilian) (`#69 `_) 1.1.8 (2017-01-15) ------------------ Features ~~~~~~~~ - Added ``stdin`` support via the '-' ``text`` argument to ``gtts-cli`` (`#56 `_) 1.1.7 (2016-12-14) ------------------ Features ~~~~~~~~ - Added utf-8 support to ``gtts-cli`` (`#52 `_) 1.1.6 (2016-07-20) ------------------ Features ~~~~~~~~ - Added 'bn' : 'Bengali' (`#39 `_, `#44 `_) Deprecations and Removals ~~~~~~~~~~~~~~~~~~~~~~~~~ - 'ht' : 'Haitian Creole' (removed by Google) (`#43 `_) 1.1.5 (2016-05-13) ------------------ Bugfixes ~~~~~~~~ - Fixed HTTP 403s by updating the client argument to reflect new API usage (`#32 `_, `#33 `_) 1.1.4 (2016-02-22) ------------------ Features ~~~~~~~~ - Spun-off token calculation to `gTTS-Token `_ (`#23 `_, `#29 `_) 1.1.3 (2016-01-24) ------------------ Bugfixes ~~~~~~~~ - ``gtts-cli`` works with Python 3 (`#20 `_) - Better support 
for non-ASCII characters (`#21 `_, `#22 `_) Misc ~~~~ - Moved out gTTS token to its own module (`#19 `_) 1.1.2 (2016-01-13) ------------------ Features ~~~~~~~~ - Added gTTS token (tk url parameter) calculation (`#14 `_, `#15 `_, `#17 `_) 1.0.7 (2015-10-07) ------------------ Features ~~~~~~~~ - Added ``stdout`` support to ``gtts-cli``, text now an argument rather than an option (`#10 `_) 1.0.6 (2015-07-30) ------------------ Features ~~~~~~~~ - Raise an exception on bad HTTP response (4xx or 5xx) (`#8 `_) Bugfixes ~~~~~~~~ - Added ``client=t`` parameter for the api HTTP request (`#8 `_) 1.0.5 (2015-07-15) ------------------ Features ~~~~~~~~ - ``write_to_fp()`` to write to a file-like object (`#6 `_) 1.0.4 (2015-05-11) ------------------ Features ~~~~~~~~ - Added Languages: `zh-yue` : 'Chinese (Cantonese)', `en-uk` : 'English (United Kingdom)', `pt-br` : 'Portuguese (Brazil)', `es-es` : 'Spanish (Spain)', `es-us` : 'Spanish (United States)', `zh-cn` : 'Chinese (Mandarin/China)', `zh-tw` : 'Chinese (Mandarin/Taiwan)' (`#4 `_) Bugfixes ~~~~~~~~ - ``gtts-cli`` prints version and pretty-prints available languages; language codes are now case insensitive (`#4 `_) 1.0.3 (2014-11-21) ------------------ Features ~~~~~~~~ - Added Languages: 'en-us' : 'English (United States)', 'en-au' : 'English (Australia)' (`#3 `_) 1.0.2 (2014-05-15) ------------------ Features ~~~~~~~~ - Python 3 support 1.0.1 (2014-05-15) ------------------ Misc ~~~~ - SemVer versioning, CI changes 1.0 (2014-05-08) ---------------- Features ~~~~~~~~ - Initial release gTTS-2.0.3/MANIFEST.in0000664000372000037200000000012013405123222014720 0ustar travistravis00000000000000include README.md include CHANGELOG.rst include CONTRIBUTING.rst include LICENSEgTTS-2.0.3/gTTS.egg-info/0000775000372000037200000000000013405123321015504 5ustar travistravis00000000000000gTTS-2.0.3/gTTS.egg-info/PKG-INFO0000664000372000037200000000656413405123320016613 0ustar travistravis00000000000000Metadata-Version: 2.1 Name: gTTS
Version: 2.0.3 Summary: gTTS (Google Text-to-Speech), a Python library and CLI tool to interface with Google Translate text-to-speech API Home-page: https://github.com/pndurette/gTTS Author: Pierre Nicolas Durette Author-email: pndurette@gmail.com License: MIT Description: # gTTS **gTTS** (*Google Text-to-Speech*), a Python library and CLI tool to interface with Google Translate's text-to-speech API. Writes spoken `mp3` data to a file, a file-like object (bytestring) for further audio manipulation, or `stdout`. [![PyPI version](https://img.shields.io/pypi/v/gTTS.svg)](https://pypi.org/project/gTTS/) [![Python versions](https://img.shields.io/pypi/pyversions/gTTS.svg)](https://pypi.org/project/gTTS/) [![Build Status](https://travis-ci.org/pndurette/gTTS.svg?branch=master)](https://travis-ci.org/pndurette/gTTS) [![AppVeyor](https://ci.appveyor.com/api/projects/status/eiuxodugo78kemff/branch/master?svg=true)](https://ci.appveyor.com/project/pndurette/gtts) [![Coveralls](https://coveralls.io/repos/github/pndurette/gTTS/badge.svg?branch=master)](https://coveralls.io/github/pndurette/gTTS?branch=master) [![Commits Since](https://img.shields.io/github/commits-since/pndurette/gTTS/latest.svg)](https://github.com/pndurette/gTTS/commits/) [![PyPi Downloads](http://pepy.tech/badge/gtts)](http://pepy.tech/project/gtts) ## Features - Customizable speech-specific sentence tokenizer that allows for unlimited lengths of text to be read, all while keeping proper intonation, abbreviations, decimals and more; - Customizable text pre-processors which can, for example, provide pronunciation corrections; - Automatic retrieval of supported languages. ### Installation $ pip install gTTS ### Quickstart Command Line: $ gtts-cli 'hello' --output hello.mp3 Module: >>> from gtts import gTTS >>> tts = gTTS('hello') >>> tts.save('hello.mp3') See for documentation and examples. 
### Project - [Changelog](CHANGELOG.rst) - [Contributing](CONTRIBUTING.rst) ### Licence [The MIT License (MIT)](LICENSE) Copyright © 2014-2018 Pierre Nicolas Durette Keywords: text to speech,Google Translate,TTS Platform: UNKNOWN Classifier: Environment :: Console Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: MIT License Classifier: Operating System :: MacOS :: MacOS X Classifier: Operating System :: Unix Classifier: Operating System :: POSIX Classifier: Operating System :: POSIX :: Linux Classifier: Operating System :: Microsoft :: Windows Classifier: Programming Language :: Python :: 2.7 Classifier: Programming Language :: Python :: 3.4 Classifier: Programming Language :: Python :: 3.5 Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 Classifier: Topic :: Software Development :: Libraries Classifier: Topic :: Multimedia :: Sound/Audio :: Speech Requires-Python: >= 2.7 Description-Content-Type: text/markdown Provides-Extra: docs Provides-Extra: tests gTTS-2.0.3/gTTS.egg-info/dependency_links.txt0000664000372000037200000000000113405123320021551 0ustar travistravis00000000000000 gTTS-2.0.3/gTTS.egg-info/top_level.txt0000664000372000037200000000000513405123320020230 0ustar travistravis00000000000000gtts gTTS-2.0.3/gTTS.egg-info/SOURCES.txt0000664000372000037200000000112213405123320017363 0ustar travistravis00000000000000CHANGELOG.rst CONTRIBUTING.rst LICENSE MANIFEST.in README.md setup.cfg setup.py gTTS.egg-info/PKG-INFO gTTS.egg-info/SOURCES.txt gTTS.egg-info/dependency_links.txt gTTS.egg-info/entry_points.txt gTTS.egg-info/requires.txt gTTS.egg-info/top_level.txt gtts/__init__.py gtts/cli.py gtts/lang.py gtts/tts.py gtts/utils.py gtts/version.py gtts/tests/__init__.py gtts/tests/test_cli.py gtts/tests/test_lang.py gtts/tests/test_tts.py gtts/tests/test_utils.py gtts/tokenizer/__init__.py gtts/tokenizer/core.py gtts/tokenizer/pre_processors.py gtts/tokenizer/symbols.py 
gtts/tokenizer/tokenizer_cases.pygTTS-2.0.3/gTTS.egg-info/requires.txt0000664000372000037200000000025013405123320020100 0ustar travistravis00000000000000six bs4 click requests gtts_token [docs] sphinx sphinx-autobuild sphinx_rtd_theme sphinx-click towncrier [tests] pytest pytest-cov coveralls flake8 testfixtures mock gTTS-2.0.3/gTTS.egg-info/entry_points.txt0000664000372000037200000000005713405123320021003 0ustar travistravis00000000000000[console_scripts] gtts-cli = gtts.cli:tts_cli gTTS-2.0.3/CONTRIBUTING.rst0000664000372000037200000000337413405123222015641 0ustar travistravis00000000000000Contributing ============ Reporting Issues ---------------- On the GitHub issues_ page. Thanks! Submitting Patches ------------------ 1. **Fork**. Follow `PEP 8 `_! 2. **Write/Update tests** (see below). 3. **Document**. Docstrings follow the `Google Python Style Guide`_ (docs by Sphinx_). You can 'test' documentation:: $ pip install .[docs] $ cd docs && make html # generated in docs/_build/html/ 4. **Open Pull Request**. To the ``master`` branch. 5. **Changelog**. This project uses towncrier_ for managing the changelog. Please consider creating one or more 'news fragments' in the ``/news/`` directory and adding them to your PR, in the style of ``.`` where 'type' is one of: 'feature', 'bugfix', 'doc', 'removal' or 'misc'. See towncrier_ (New Fragments) for more details. Example:: $ echo 'Fixed a thing!' > gtts/news/1234.bugfix .. note:: | Please don't hesitate to contribute! While good tests, docs and structure are | encouraged, I do welcome great ideas over absolute conformity to the above! | Thanks! ❤️ Testing ------- | Testing is done with the ``unittest`` framework. | As a rule, the ``./tests/test_.py`` file tests the ```` module. To run all tests (testing only language 'en' and generating an html coverage report in ``gtts/htmlcov/``):: $ pip install .[tests] $ TEST_LANGS=en pytest -v -s gtts/ --cov=gtts --cov-report=html .. _repo: https://github.com/pndurette/gTTS/ ..
_issues: https://github.com/pndurette/gTTS/issues .. _Google Python Style Guide: http://google.github.io/styleguide/pyguide.html#Comments .. _Sphinx: http://www.sphinx-doc.org/ .. _towncrier: https://github.com/hawkowl/towncrier gTTS-2.0.3/PKG-INFO0000664000372000037200000000656413405123321014301 0ustar travistravis00000000000000Metadata-Version: 2.1 Name: gTTS Version: 2.0.3 Summary: gTTS (Google Text-to-Speech), a Python library and CLI tool to interface with Google Translate text-to-speech API Home-page: https://github.com/pndurette/gTTS Author: Pierre Nicolas Durette Author-email: pndurette@gmail.com License: MIT Description: # gTTS **gTTS** (*Google Text-to-Speech*), a Python library and CLI tool to interface with Google Translate's text-to-speech API. Writes spoken `mp3` data to a file, a file-like object (bytestring) for further audio manipulation, or `stdout`. [![PyPI version](https://img.shields.io/pypi/v/gTTS.svg)](https://pypi.org/project/gTTS/) [![Python versions](https://img.shields.io/pypi/pyversions/gTTS.svg)](https://pypi.org/project/gTTS/) [![Build Status](https://travis-ci.org/pndurette/gTTS.svg?branch=master)](https://travis-ci.org/pndurette/gTTS) [![AppVeyor](https://ci.appveyor.com/api/projects/status/eiuxodugo78kemff/branch/master?svg=true)](https://ci.appveyor.com/project/pndurette/gtts) [![Coveralls](https://coveralls.io/repos/github/pndurette/gTTS/badge.svg?branch=master)](https://coveralls.io/github/pndurette/gTTS?branch=master) [![Commits Since](https://img.shields.io/github/commits-since/pndurette/gTTS/latest.svg)](https://github.com/pndurette/gTTS/commits/) [![PyPi Downloads](http://pepy.tech/badge/gtts)](http://pepy.tech/project/gtts) ## Features - Customizable speech-specific sentence tokenizer that allows for unlimited lengths of text to be read, all while keeping proper intonation, abbreviations, decimals and more; - Customizable text pre-processors which can, for example, provide pronunciation corrections; - Automatic retrieval of 
supported languages. ### Installation $ pip install gTTS ### Quickstart Command Line: $ gtts-cli 'hello' --output hello.mp3 Module: >>> from gtts import gTTS >>> tts = gTTS('hello') >>> tts.save('hello.mp3') See for documentation and examples. ### Project - [Changelog](CHANGELOG.rst) - [Contributing](CONTRIBUTING.rst) ### Licence [The MIT License (MIT)](LICENSE) Copyright © 2014-2018 Pierre Nicolas Durette Keywords: text to speech,Google Translate,TTS Platform: UNKNOWN Classifier: Environment :: Console Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: MIT License Classifier: Operating System :: MacOS :: MacOS X Classifier: Operating System :: Unix Classifier: Operating System :: POSIX Classifier: Operating System :: POSIX :: Linux Classifier: Operating System :: Microsoft :: Windows Classifier: Programming Language :: Python :: 2.7 Classifier: Programming Language :: Python :: 3.4 Classifier: Programming Language :: Python :: 3.5 Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 Classifier: Topic :: Software Development :: Libraries Classifier: Topic :: Multimedia :: Sound/Audio :: Speech Requires-Python: >= 2.7 Description-Content-Type: text/markdown Provides-Extra: docs Provides-Extra: tests gTTS-2.0.3/README.md0000664000372000037200000000353113405123222014452 0ustar travistravis00000000000000# gTTS **gTTS** (*Google Text-to-Speech*), a Python library and CLI tool to interface with Google Translate's text-to-speech API. Writes spoken `mp3` data to a file, a file-like object (bytestring) for further audio manipulation, or `stdout`. 
[![PyPI version](https://img.shields.io/pypi/v/gTTS.svg)](https://pypi.org/project/gTTS/) [![Python versions](https://img.shields.io/pypi/pyversions/gTTS.svg)](https://pypi.org/project/gTTS/) [![Build Status](https://travis-ci.org/pndurette/gTTS.svg?branch=master)](https://travis-ci.org/pndurette/gTTS) [![AppVeyor](https://ci.appveyor.com/api/projects/status/eiuxodugo78kemff/branch/master?svg=true)](https://ci.appveyor.com/project/pndurette/gtts) [![Coveralls](https://coveralls.io/repos/github/pndurette/gTTS/badge.svg?branch=master)](https://coveralls.io/github/pndurette/gTTS?branch=master) [![Commits Since](https://img.shields.io/github/commits-since/pndurette/gTTS/latest.svg)](https://github.com/pndurette/gTTS/commits/) [![PyPi Downloads](http://pepy.tech/badge/gtts)](http://pepy.tech/project/gtts) ## Features - Customizable speech-specific sentence tokenizer that allows for unlimited lengths of text to be read, all while keeping proper intonation, abbreviations, decimals and more; - Customizable text pre-processors which can, for example, provide pronunciation corrections; - Automatic retrieval of supported languages. ### Installation $ pip install gTTS ### Quickstart Command Line: $ gtts-cli 'hello' --output hello.mp3 Module: >>> from gtts import gTTS >>> tts = gTTS('hello') >>> tts.save('hello.mp3') See for documentation and examples. 
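The Features list mentions reading text of unlimited length. Because the API accepts a limited number of characters per request (`GOOGLE_TTS_MAX_CHARS = 100` in `gtts/tts.py`), long text has to be cut into chunks before it is sent. The following is a minimal, standalone sketch of that idea. `split_for_tts` is a hypothetical helper written for illustration only; it is not the library's own `_minimize()`, and the real tokenizer additionally takes punctuation, abbreviations and numbers into account.

```python
def split_for_tts(text, delim=" ", max_size=100):
    """Split text into chunks of at most max_size characters.

    Hypothetical illustration, not part of gTTS. Cuts at the last
    delimiter that still fits, and falls back to a hard cut when a
    chunk has no delimiter at all (e.g. unspaced Chinese text).
    """
    if not delim:
        raise ValueError("delim must be a non-empty string")
    chunks = []
    while text:
        # Drop a leading delimiter so a cut can never land at index 0,
        # which would make no progress.
        if text.startswith(delim):
            text = text[len(delim):]
            continue
        if len(text) <= max_size:
            chunks.append(text)
            break
        # Highest delimiter position that keeps the chunk within max_size
        cut = text.rfind(delim, 0, max_size + 1)
        if cut <= 0:
            cut = max_size  # no usable delimiter: hard cut
        chunks.append(text[:cut])
        text = text[cut:]
    return chunks


print(split_for_tts("aaa bbb ccc", max_size=7))  # ['aaa bbb', 'ccc']
```

Each chunk would then be sent as its own request; an iterative loop is used here rather than recursion so that very long input cannot exhaust the call stack.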
### Project - [Changelog](CHANGELOG.rst) - [Contributing](CONTRIBUTING.rst) ### Licence [The MIT License (MIT)](LICENSE) Copyright © 2014-2018 Pierre Nicolas Durette gTTS-2.0.3/setup.cfg0000664000372000037200000000344013405123321015013 0ustar travistravis00000000000000[metadata] name = gTTS description = gTTS (Google Text-to-Speech), a Python library and CLI tool to interface with Google Translate text-to-speech API author = Pierre Nicolas Durette author_email = pndurette@gmail.com url = https://github.com/pndurette/gTTS license = MIT keywords = text to speech Google Translate TTS classifiers = Environment :: Console Intended Audience :: Developers License :: OSI Approved :: MIT License Operating System :: MacOS :: MacOS X Operating System :: Unix Operating System :: POSIX Operating System :: POSIX :: Linux Operating System :: Microsoft :: Windows Programming Language :: Python :: 2.7 Programming Language :: Python :: 3.4 Programming Language :: Python :: 3.5 Programming Language :: Python :: 3.6 Programming Language :: Python :: 3.7 Topic :: Software Development :: Libraries Topic :: Multimedia :: Sound/Audio :: Speech license_file = LICENSE long_description = file: README.md long_description_content_type = text/markdown [options] python_requires = >= 2.7 setup_requires = setuptools >= 38.6 pip >= 10 twine >= 1.11 include_package_data = True packages = find: install_requires = six bs4 click requests gtts_token [options.extras_require] tests = pytest pytest-cov coveralls flake8 testfixtures mock docs = sphinx sphinx-autobuild sphinx_rtd_theme sphinx-click towncrier [options.entry_points] console_scripts = gtts-cli = gtts.cli:tts_cli [flake8] max-line-length = 132 exclude = .git,__pycache__,.eggs/,doc/,docs/,build/,dist/,archive/ ignore = W605, W503, W504 [coverage:run] cover_pylib = false omit = /home/travis/virtualenv/* */site-packages/* gtts/tests/* gtts/tokenizer/tests/* [coverage:report] exclude_lines = pragma: no cover def __repr__ log.debug log.warning 
[egg_info] tag_build = tag_date = 0 gTTS-2.0.3/LICENSE0000664000372000037200000000210513405123222014174 0ustar travistravis00000000000000The MIT License (MIT) Copyright © 2014-2018 Pierre Nicolas Durette Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
gTTS-2.0.3/gtts/0000775000372000037200000000000013405123321014152 5ustar travistravis00000000000000gTTS-2.0.3/gtts/cli.py0000664000372000037200000001262613405123222015302 0ustar travistravis00000000000000# -*- coding: utf-8 -*- from gtts import gTTS, gTTSError, __version__ from gtts.lang import tts_langs import click import logging import logging.config # Click settings CONTEXT_SETTINGS = { 'help_option_names': ['-h', '--help'] } # Logger settings LOGGER_SETTINGS = { 'version': 1, 'formatters': { 'default': { 'format': '%(name)s - %(levelname)s - %(message)s' } }, 'handlers': { 'console': { 'class': 'logging.StreamHandler', 'formatter': 'default' } }, 'loggers': { 'gtts': { 'handlers': ['console'], 'level': 'WARNING' } } } # Logger logging.config.dictConfig(LOGGER_SETTINGS) log = logging.getLogger('gtts') def sys_encoding(): """Charset to use for --file |- (stdin)""" return 'utf8' def validate_text(ctx, param, text): """Validation callback for the argument. Ensures (arg) and (opt) are mutually exclusive """ if not text and 'file' not in ctx.params: # No and no raise click.BadParameter( " or -f/--file required") if text and 'file' in ctx.params: # Both and raise click.BadParameter( " and -f/--file can't be used together") return text def validate_lang(ctx, param, lang): """Validation callback for the option. Ensures is a supported language unless the flag is set """ if ctx.params['nocheck']: return lang try: if lang not in tts_langs(): raise click.UsageError( "'%s' not in list of supported languages.\n" "Use --all to list languages or " "add --nocheck to disable language check." % lang) else: # The language is valid. # No need to let gTTS re-validate. ctx.params['nocheck'] = True except RuntimeError as e: # Only case where the flag can be False # Non-fatal. gTTS will try to re-validate. log.debug(str(e), exc_info=True) return lang def print_languages(ctx, param, value): """Callback for flag. 
Prints formatted sorted list of supported languages and exits """ if not value or ctx.resilient_parsing: return try: langs = tts_langs() langs_str_list = sorted("{}: {}".format(k, langs[k]) for k in langs) click.echo(' ' + '\n '.join(langs_str_list)) except RuntimeError as e: # pragma: no cover log.debug(str(e), exc_info=True) raise click.ClickException("Couldn't fetch language list.") ctx.exit() def set_debug(ctx, param, debug): """Callback for flag. Sets logger level to DEBUG """ if debug: log.setLevel(logging.DEBUG) return @click.command(context_settings=CONTEXT_SETTINGS) @click.argument('text', metavar='', required=False, callback=validate_text) @click.option( '-f', '--file', metavar='', # For py2.7/unicode. If encoding not None Click uses io.open type=click.File(encoding=sys_encoding()), help="Read from instead of .") @click.option( '-o', '--output', metavar='', type=click.File(mode='wb'), help="Write to instead of stdout.") @click.option( '-s', '--slow', default=False, is_flag=True, help="Read more slowly.") @click.option( '-l', '--lang', metavar='', default='en', show_default=True, callback=validate_lang, help="IETF language tag. Language to speak in. List documented tags with --all.") @click.option( '--nocheck', default=False, is_flag=True, is_eager=True, # Prioritize to ensure it gets set before help="Disable strict IETF language tag checking. 
Allow undocumented tags.") @click.option( '--all', default=False, is_flag=True, is_eager=True, expose_value=False, callback=print_languages, help="Print all documented available IETF language tags and exit.") @click.option( '--debug', default=False, is_flag=True, is_eager=True, # Prioritize to see debug logs of callbacks expose_value=False, callback=set_debug, help="Show debug information.") @click.version_option(version=__version__) def tts_cli(text, file, output, slow, lang, nocheck): """ Read to mp3 format using Google Translate's Text-to-Speech API (set or --file to - for standard input) """ # stdin for if text == '-': text = click.get_text_stream('stdin').read() # stdout (when no ) if not output: output = click.get_binary_stream('stdout') # input (stdin on '-' is handled by click.File) if file: try: text = file.read() except UnicodeDecodeError as e: # pragma: no cover log.debug(str(e), exc_info=True) raise click.FileError( file.name, " must be encoded using '%s'." % sys_encoding()) # TTS try: tts = gTTS( text=text, lang=lang, slow=slow, lang_check=not nocheck) tts.write_to_fp(output) except (ValueError, AssertionError) as e: raise click.UsageError(str(e)) except gTTSError as e: raise click.ClickException(str(e)) gTTS-2.0.3/gtts/tts.py0000664000372000037200000002264513405123222015347 0ustar travistravis00000000000000# -*- coding: utf-8 -*- from gtts.tokenizer import pre_processors, Tokenizer, tokenizer_cases from gtts.utils import _minimize, _len, _clean_tokens from gtts.lang import tts_langs from gtts_token import gtts_token from six.moves import urllib import urllib3 import requests import logging __all__ = ['gTTS', 'gTTSError'] # Logger log = logging.getLogger(__name__) log.addHandler(logging.NullHandler()) class Speed: """Read Speed The Google TTS Translate API supports two speeds: 'slow' <= 0.3 < 'normal' """ SLOW = 0.3 NORMAL = 1 class gTTS: """gTTS -- Google Text-to-Speech. An interface to Google Translate's Text-to-Speech API. 
Args: text (string): The text to be read. lang (string, optional): The language (IETF language tag) to read the text in. Defaults to 'en'. slow (bool, optional): Reads text more slowly. Defaults to ``False``. lang_check (bool, optional): Strictly enforce an existing ``lang``, to catch a language error early. If set to ``True``, a ``ValueError`` is raised if ``lang`` doesn't exist. Default is ``True``. pre_processor_funcs (list): A list of zero or more functions that are called to transform (pre-process) text before tokenizing. Those functions must take a string and return a string. Defaults to:: [ pre_processors.tone_marks, pre_processors.end_of_line, pre_processors.abbreviations, pre_processors.word_sub ] tokenizer_func (callable): A function that takes in a string and returns a list of strings (tokens). Defaults to:: Tokenizer([ tokenizer_cases.tone_marks, tokenizer_cases.period_comma, tokenizer_cases.colon, tokenizer_cases.other_punctuation ]).run See Also: :doc:`Pre-processing and tokenizing ` Raises: AssertionError: When ``text`` is ``None`` or empty; when there's nothing left to speak after pre-processing, tokenizing and cleaning. ValueError: When ``lang_check`` is ``True`` and ``lang`` is not supported. RuntimeError: When ``lang_check`` is ``True`` but there's an error loading the languages dictionary.
""" GOOGLE_TTS_MAX_CHARS = 100 # Max characters the Google TTS API takes at a time GOOGLE_TTS_URL = "https://translate.google.com/translate_tts" GOOGLE_TTS_HEADERS = { "Referer": "http://translate.google.com/", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) " "AppleWebKit/537.36 (KHTML, like Gecko) " "Chrome/47.0.2526.106 Safari/537.36" } def __init__( self, text, lang='en', slow=False, lang_check=True, pre_processor_funcs=[ pre_processors.tone_marks, pre_processors.end_of_line, pre_processors.abbreviations, pre_processors.word_sub ], tokenizer_func=Tokenizer([ tokenizer_cases.tone_marks, tokenizer_cases.period_comma, tokenizer_cases.colon, tokenizer_cases.other_punctuation ]).run ): # Debug for k, v in locals().items(): if k == 'self': continue log.debug("%s: %s", k, v) # Text assert text, 'No text to speak' self.text = text # Language if lang_check: try: langs = tts_langs() if lang.lower() not in langs: raise ValueError("Language not supported: %s" % lang) except RuntimeError as e: log.debug(str(e), exc_info=True) log.warning(str(e)) self.lang_check = lang_check self.lang = lang.lower() # Read speed if slow: self.speed = Speed.SLOW else: self.speed = Speed.NORMAL # Pre-processors and tokenizer self.pre_processor_funcs = pre_processor_funcs self.tokenizer_func = tokenizer_func # Google Translate token self.token = gtts_token.Token() def _tokenize(self, text): # Pre-clean text = text.strip() # Apply pre-processors for pp in self.pre_processor_funcs: log.debug("pre-processing: %s", pp) text = pp(text) if _len(text) <= self.GOOGLE_TTS_MAX_CHARS: return _clean_tokens([text]) # Tokenize log.debug("tokenizing: %s", self.tokenizer_func) tokens = self.tokenizer_func(text) # Clean tokens = _clean_tokens(tokens) # Minimize min_tokens = [] for t in tokens: min_tokens += _minimize(t, ' ', self.GOOGLE_TTS_MAX_CHARS) return min_tokens def write_to_fp(self, fp): """Do the TTS API request and write bytes to a file-like object. 
        Args:
            fp (file object): Any file-like object to write the ``mp3`` to.

        Raises:
            :class:`gTTSError`: When there's an error with the API request.
            TypeError: When ``fp`` is not a file-like object that takes bytes.

        """
        # When disabling ssl verify in requests (for proxies and firewalls),
        # urllib3 prints an insecure warning on stdout. We disable that.
        urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

        text_parts = self._tokenize(self.text)
        log.debug("text_parts: %i", len(text_parts))
        assert text_parts, 'No text to send to TTS API'

        for idx, part in enumerate(text_parts):
            try:
                # Calculate token
                part_tk = self.token.calculate_token(part)
            except requests.exceptions.RequestException as e:  # pragma: no cover
                log.debug(str(e), exc_info=True)
                raise gTTSError(
                    "Connection error during token calculation: %s" %
                    str(e))

            payload = {'ie': 'UTF-8',
                       'q': part,
                       'tl': self.lang,
                       'ttsspeed': self.speed,
                       'total': len(text_parts),
                       'idx': idx,
                       'client': 'tw-ob',
                       'textlen': _len(part),
                       'tk': part_tk}

            log.debug("payload-%i: %s", idx, payload)

            try:
                # Request
                r = requests.get(self.GOOGLE_TTS_URL,
                                 params=payload,
                                 headers=self.GOOGLE_TTS_HEADERS,
                                 proxies=urllib.request.getproxies(),
                                 verify=False)

                log.debug("headers-%i: %s", idx, r.request.headers)
                log.debug("url-%i: %s", idx, r.request.url)
                log.debug("status-%i: %s", idx, r.status_code)

                r.raise_for_status()
            except requests.exceptions.HTTPError:
                # Request successful, bad response
                raise gTTSError(tts=self, response=r)
            except requests.exceptions.RequestException as e:  # pragma: no cover
                # Request failed
                raise gTTSError(str(e))

            try:
                # Write
                for chunk in r.iter_content(chunk_size=1024):
                    fp.write(chunk)
                log.debug("part-%i written to %s", idx, fp)
            except (AttributeError, TypeError) as e:
                raise TypeError(
                    "'fp' is not a file-like object or it does not take bytes: %s" %
                    str(e))

    def save(self, savefile):
        """Do the TTS API request and write result to file.

        Args:
            savefile (string): The path and file name to save the ``mp3`` to.
        Raises:
            :class:`gTTSError`: When there's an error with the API request.

        """
        with open(savefile, 'wb') as f:
            self.write_to_fp(f)
            log.debug("Saved to %s", savefile)


class gTTSError(Exception):
    """Exception that uses context to present a meaningful error message"""

    def __init__(self, msg=None, **kwargs):
        self.tts = kwargs.pop('tts', None)
        self.rsp = kwargs.pop('response', None)
        if msg:
            self.msg = msg
        elif self.tts is not None and self.rsp is not None:
            self.msg = self.infer_msg(self.tts, self.rsp)
        else:
            self.msg = None
        super(gTTSError, self).__init__(self.msg)

    def infer_msg(self, tts, rsp):
        """Attempt to guess what went wrong by using known
        information (e.g. http response) and observed behaviour

        """
        # rsp should be
        # http://docs.python-requests.org/en/master/api/
        status = rsp.status_code
        reason = rsp.reason

        cause = "Unknown"
        if status == 403:
            cause = "Bad token or upstream API changes"
        elif status == 404 and not tts.lang_check:
            cause = "Unsupported language '%s'" % self.tts.lang
        elif status >= 500:
            cause = "Upstream API error. Try again later."

        return "%i (%s) from TTS API. Probable cause: %s" % (
            status, reason, cause)
gTTS-2.0.3/gtts/utils.py
# -*- coding: utf-8 -*-
from gtts.tokenizer.symbols import ALL_PUNC as punc
from string import whitespace as ws
import re

_ALL_PUNC_OR_SPACE = re.compile(u"^[{}]*$".format(re.escape(punc + ws)))
"""Regex that matches if an entire line is only comprised
of whitespace and punctuation

"""


def _minimize(the_string, delim, max_size):
    """Recursively split a string in the largest chunks
    possible from the highest position of a delimiter all the way
    to a maximum size

    Args:
        the_string (string): The string to split.
        delim (string): The delimiter to split on.
        max_size (int): The maximum size of a chunk.
    Returns:
        list: the minimized string in tokens

    Every chunk size will be at minimum `the_string[0:idx]` where `idx`
    is the highest index of `delim` found in `the_string`; and at maximum
    `the_string[0:max_size]` if no `delim` was found in `the_string`.
    In the latter case, the split will occur at `the_string[max_size]`
    which can be any character. The function runs itself again on the rest of
    `the_string` (`the_string[idx:]`) until no chunk is larger than `max_size`.

    """
    # Remove `delim` from start of `the_string`
    # i.e. prevent a recursive infinite loop on `the_string[0:0]`
    # if `the_string` starts with `delim` and is larger than `max_size`
    if the_string.startswith(delim):
        the_string = the_string[_len(delim):]

    if _len(the_string) > max_size:
        try:
            # Find the highest index of `delim` in `the_string[0:max_size]`
            # i.e. `the_string` will be cut in half on `delim` index
            idx = the_string.rindex(delim, 0, max_size)
        except ValueError:
            # `delim` not found in `the_string`, index becomes `max_size`
            # i.e. `the_string` will be cut in half arbitrarily on `max_size`
            idx = max_size
        # Call itself again for `the_string[idx:]`
        return [the_string[:idx]] + \
            _minimize(the_string[idx:], delim, max_size)
    else:
        return [the_string]


def _len(text):
    """Same as `len(text)` for a string but that decodes
    `text` first in Python 2.x

    Args:
        text (string): string to get the size of.

    Returns:
        int: the size of the string.
    """
    try:
        # Python 2
        return len(unicode(text))
    except NameError:  # pragma: no cover
        # Python 3
        return len(text)


def _clean_tokens(tokens):
    """Clean a list of strings

    Args:
        tokens (list): a list of strings (tokens) to clean.

    Returns:
        list: stripped strings `tokens` without the original elements
            that only consisted of whitespace and/or punctuation characters.
""" return [t.strip() for t in tokens if not _ALL_PUNC_OR_SPACE.match(t)] gTTS-2.0.3/gtts/version.py0000664000372000037200000000002613405123222016207 0ustar travistravis00000000000000__version__ = '2.0.3' gTTS-2.0.3/gtts/lang.py0000664000372000037200000001006213405123222015444 0ustar travistravis00000000000000# -*- coding: utf-8 -*- from bs4 import BeautifulSoup import requests import logging import re __all__ = ['tts_langs'] URL_BASE = 'http://translate.google.com' JS_FILE = 'translate_m.js' # Logger log = logging.getLogger(__name__) log.addHandler(logging.NullHandler()) def tts_langs(): """Languages Google Text-to-Speech supports. Returns: dict: A dictionnary of the type `{ '': ''}` Where `` is an IETF language tag such as `en` or `pt-br`, and `` is the full English name of the language, such as `English` or `Portuguese (Brazil)`. The dictionnary returned combines languages from two origins: - Languages fetched automatically from Google Translate - Languages that are undocumented variations that were observed to work and present different dialects or accents. """ try: langs = dict() langs.update(_fetch_langs()) langs.update(_extra_langs()) log.debug("langs: %s", langs) return langs except Exception as e: raise RuntimeError("Unable to get language list: %s" % str(e)) def _fetch_langs(): """Fetch (scrape) languages from Google Translate. Google Translate loads a JavaScript Array of 'languages codes' that can be spoken. We intersect this list with all the languages Google Translate provides to get the ones that support text-to-speech. Returns: dict: A dictionnary of languages from Google Translate """ # Load HTML page = requests.get(URL_BASE) soup = BeautifulSoup(page.content, 'html.parser') # JavaScript URL # The