pax_global_header00006660000000000000000000000064135145412700014514gustar00rootroot0000000000000052 comment=d64e2b16137ea4af811365b8e64b8b6a40afe422 imdbpy-6.8/000077500000000000000000000000001351454127000126555ustar00rootroot00000000000000imdbpy-6.8/.github/000077500000000000000000000000001351454127000142155ustar00rootroot00000000000000imdbpy-6.8/.github/ISSUE_TEMPLATE.md000066400000000000000000000011461351454127000167240ustar00rootroot00000000000000#### Issue description *write the description here* #### Version of IMDbPY, Python and OS - **Python:** `python3 -V` or, if you are using Python 2, `python -V` - **IMDbPY:** `python3 -c 'import imdb ; print(imdb.VERSION)'` or, if you are using Python 2, `python -c 'import imdb ; print(imdb.VERSION)'` - **OS:** `python -c 'import platform ; print(platform.uname())'` #### Steps to reproduce the issue *if possible, provide a minimal code to reproduce the problem* ``` #!python # your code here ``` #### What's the expected result? - #### What's the actual result? 
-

#### Additional details

-
imdbpy-6.8/.gitignore000066400000000000000000000002001351454127000146350ustar00rootroot00000000000000
*.pyc
*.pyo
*.egg-info
*.mo
*.so
.pytest_cache/
build/
_build/
__pycache__
dist/
.idea
.vscode
.cache
.tox
.coverage
.venv
prof
imdbpy-6.8/.hgignore000066400000000000000000000002111351454127000144520ustar00rootroot00000000000000
syntax: glob
.cache
.tox
__pycache__
build
dist
.cache
.tox
__pycache__
*.egg-info
*.mo
*.pyc
*.pyo
*.so
*.pyd
*~
*.swp
setuptools-*.egg
imdbpy-6.8/.hgtags000066400000000000000000000005541351454127000141370ustar00rootroot00000000000000
c3dba80881f0a810b3bf93051a56190b297e7a50 4.6
c8b07121469a2173a587b1a34beb4f1fecd640b6 4.7
ba221c9050599463b4b78c89a8bdada7d7aef173 4.8
e807ba790392d406018af0f98d5dad5117721a4d 4.8.1
b02c61369b27e0d5af0a755a8a2fc3355c08bb67 4.8.2
7f39f8ac4838b45fbf59f4167796dd17cd15c437 4.9
398c01b961076362958c27584d85fbdfa921ac63 5.0
cb1e19b508d03499e8f34bd066d8b930aca6aa2d 5.1
imdbpy-6.8/.python-version000066400000000000000000000001121351454127000156540ustar00rootroot00000000000000
3.7.4
3.6.9
3.5.7
3.4.10
2.7.16
pypy3.6-7.1.1
pypy3.5-7.0.0
pypy2.7-7.1.1
imdbpy-6.8/.travis.yml000066400000000000000000000003521351454127000147660ustar00rootroot00000000000000
language: python
dist: xenial  # required for Python >= 3.7
python:
  - "2.7"
  - "3.5"
  - "3.6"
  - "3.7"
install:
  - python setup.py install
script:
  - py.test
notifications:
  email:
    on_success: never
    on_failure: always
imdbpy-6.8/LICENSE.txt000066400000000000000000000432541351454127000145100ustar00rootroot00000000000000
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991

Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The licenses for most software are designed to take away your freedom to share and change it.
By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. 
Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. 
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. 
But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. 
For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. 
Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. 
Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. 
Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. 
Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. 
If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License.
imdbpy-6.8/MANIFEST.in000066400000000000000000000005411351454127000144130ustar00rootroot00000000000000
#
# MANIFEST.in
#
# Manifest template for creating the Distutils source distribution.
#

include tox.ini
recursive-include imdb/locale *
recursive-include tests *
prune tests/__pycache__
prune tests/.cache
prune imdb/locale/__pycache__
global-exclude __pycache__
global-exclude .pytest_cache
global-exclude .tox
global-exclude .cache
global-exclude *~
imdbpy-6.8/Makefile000066400000000000000000000021731351454127000143200ustar00rootroot00000000000000
.PHONY: help clean clean-build clean-pyc clean-docs lint test test-all coverage docs dist

help:
	@echo "clean - clean everything"
	@echo "clean-build - remove build artifacts"
	@echo "clean-pyc - remove Python file artifacts"
	@echo "clean-docs - remove Sphinx documentation artifacts"
	@echo "lint - check style with flake8"
	@echo "test - run tests quickly with the default Python"
	@echo "test-all - run tests on every Python version with tox"
	@echo "coverage - check code coverage quickly with the default Python"
	@echo "docs - generate Sphinx HTML documentation, including API docs"
	@echo "dist - package"

clean: clean-build clean-pyc clean-docs

clean-build:
	rm -fr build/
	rm -fr dist/
	rm -fr *.egg-info

clean-pyc:
	find . -name '*.pyc' -exec rm -f {} +
	find . -name '*.pyo' -exec rm -f {} +
	find . -name '*~' -exec rm -f {} +

clean-docs:
	make -C docs clean

lint:
	python setup.py flake8

test:
	pytest

test-all:
	tox

coverage:
	pytest --cov-report term-missing --cov=imdb tests

docs:
	$(MAKE) -C docs clean
	$(MAKE) -C docs html

dist: clean
	python setup.py check -r -s
	python setup.py sdist
	python setup.py bdist_wheel
imdbpy-6.8/README.rst000066400000000000000000000043521351454127000143500ustar00rootroot00000000000000
.. 
image:: https://travis-ci.org/alberanid/imdbpy.svg?branch=master
    :target: https://travis-ci.org/alberanid/imdbpy

**IMDbPY** is a Python package for retrieving and managing the data
of the `IMDb`_ movie database about movies, people and companies.

:Homepage: https://imdbpy.sourceforge.io/
:PyPI: https://pypi.org/project/IMDbPY/
:Repository: https://github.com/alberanid/imdbpy
:Documentation: https://imdbpy.readthedocs.io/
:Support: https://imdbpy.sourceforge.io/support.html

.. admonition:: Revamp notice
   :class: note

   Starting in November 2017, many things were improved and simplified:

   - moved the package to Python 3 (compatible with Python 2.7)
   - removed dependencies: SQLObject, C compiler, BeautifulSoup
   - removed the "mobile" and "httpThin" parsers
   - introduced a test suite (`please help with it!`_)


Main features
-------------

- written in Python 3 (compatible with Python 2.7)

- platform-independent

- can retrieve data from either IMDb's web server or a local copy
  of the database

- simple and complete API

- released under the terms of the GPL 2 license

IMDbPY powers many other software projects and has been used in various
research papers. `Curious about that`_?


Installation
------------

Whenever possible, please use the latest version from the repository::

   pip install git+https://github.com/alberanid/imdbpy

You can also install the latest release from PyPI::

   pip install imdbpy


Example
-------

Here's an example that demonstrates how to use IMDbPY:

.. code-block:: python

   from imdb import IMDb

   # create an instance of the IMDb class
   ia = IMDb()

   # get a movie
   movie = ia.get_movie('0133093')

   # print the names of the directors of the movie
   print('Directors:')
   for director in movie['directors']:
       print(director['name'])

   # print the genres of the movie
   print('Genres:')
   for genre in movie['genres']:
       print(genre)

   # search for a person name
   people = ia.search_person('Mel Gibson')
   for person in people:
       print(person.personID, person['name'])


.. 
_IMDb: https://www.imdb.com/
.. _please help with it!: http://imdbpy.readthedocs.io/en/latest/devel/test.html
.. _Curious about that: https://imdbpy.sourceforge.io/ecosystem.html
imdbpy-6.8/bin/000077500000000000000000000000001351454127000134255ustar00rootroot00000000000000
imdbpy-6.8/bin/get_character.py000077500000000000000000000027721351454127000166050ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
get_character.py

Usage: get_character "character_id"

Show some info about the character with the given character_id (e.g. '0000001'
for "Jesse James", using 'http').
Note that character IDs obtained using 'sql' are not the same IDs used
on the web.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "character_id"' % sys.argv[0])
    sys.exit(2)

character_id = sys.argv[1]

i = imdb.IMDb()

try:
    # Get a character object with the data about the character identified by
    # the given character_id.
    character = i.get_character(character_id)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)


if not character:
    print(("It seems that there's no character"
           ' with character_id "%s"' % character_id))
    sys.exit(4)

# XXX: this is the easiest way to print the main info about a character;
# calling the summary() method of a character object returns a string
# with the main information about the character.
# It is not very instructive if you want to know how to access the data
# stored in a character object; the commented lines in get_movie.py show
# some ways to retrieve information from an object.
print(character.summary())
imdbpy-6.8/bin/get_company.py000077500000000000000000000027061351454127000163140ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
get_company.py

Usage: get_company "company_id"

Show some info about the company with the given company_id (e.g. '0071509'
for "Columbia Pictures [us]", using 'http').
Note that company IDs obtained using 'sql' are not the same IDs used
on the web.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "company_id"' % sys.argv[0])
    sys.exit(2)

company_id = sys.argv[1]

i = imdb.IMDb()

try:
    # Get a company object with the data about the company identified by
    # the given company_id.
    company = i.get_company(company_id)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)


if not company:
    print('It seems that there\'s no company with company_id "%s"' % company_id)
    sys.exit(4)

# XXX: this is the easiest way to print the main info about a company;
# calling the summary() method of a company object returns a string
# with the main information about the company.
# It is not very instructive if you want to know how to access the data
# stored in a company object; the commented lines in get_movie.py show
# some ways to retrieve information from an object.
print(company.summary())
imdbpy-6.8/bin/get_first_character.py000077500000000000000000000021661351454127000200110ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
get_first_character.py

Usage: get_first_character "character name"

Search for the given name and print the best matching result.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! 
You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "character name"' % sys.argv[0])
    sys.exit(2)

name = sys.argv[1]

i = imdb.IMDb()

try:
    # Do the search, and get the results (a list of character objects).
    results = i.search_character(name)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

if not results:
    print('No matches for "%s", sorry.' % name)
    sys.exit(0)

# Print only the first result.
print(' Best match for "%s"' % name)

# This is a character instance.
character = results[0]

# So far the character object only contains basic information like the
# name; retrieve main information:
i.update(character)

print(character.summary())
imdbpy-6.8/bin/get_first_company.py000077500000000000000000000021401351454127000175130ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
get_first_company.py

Usage: get_first_company "company name"

Search for the given name and print the best matching result.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "company name"' % sys.argv[0])
    sys.exit(2)

name = sys.argv[1]

i = imdb.IMDb()

try:
    # Do the search, and get the results (a list of company objects).
    results = i.search_company(name)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

if not results:
    print('No matches for "%s", sorry.' % name)
    sys.exit(0)

# Print only the first result.
print(' Best match for "%s"' % name)

# This is a company instance.
company = results[0]

# So far the company object only contains basic information like the
# name; retrieve main information:
i.update(company)

print(company.summary())
imdbpy-6.8/bin/get_first_movie.py000077500000000000000000000021371351454127000171720ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
get_first_movie.py

Usage: get_first_movie "movie title"

Search for the given title and print the best matching result.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "movie title"' % sys.argv[0])
    sys.exit(2)

title = sys.argv[1]

i = imdb.IMDb()

try:
    # Do the search, and get the results (a list of Movie objects).
    results = i.search_movie(title)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

if not results:
    print('No matches for "%s", sorry.' % title)
    sys.exit(0)

# Print only the first result.
print(' Best match for "%s"' % title)

# This is a Movie instance.
movie = results[0]

# So far the Movie object only contains basic information like the
# title and the year; retrieve main information:
i.update(movie)

print(movie.summary())
imdbpy-6.8/bin/get_first_person.py000077500000000000000000000021251351454127000173560ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
get_first_person.py

Usage: get_first_person "person name"

Search for the given name and print the best matching result.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "person name"' % sys.argv[0])
    sys.exit(2)

name = sys.argv[1]

i = imdb.IMDb()

try:
    # Do the search, and get the results (a list of Person objects).
    results = i.search_person(name)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

if not results:
    print('No matches for "%s", sorry.' % name)
    sys.exit(0)

# Print only the first result.
print(' Best match for "%s"' % name)

# This is a Person instance.
person = results[0]

# So far the Person object only contains basic information like the
# name; retrieve main information:
i.update(person)

print(person.summary())
imdbpy-6.8/bin/get_keyword.py000077500000000000000000000021351351454127000163260ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
get_keyword.py

Usage: get_keyword "keyword"

Search for movies tagged with the given keyword and print the results.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "keyword"' % sys.argv[0])
    sys.exit(2)

name = sys.argv[1]

i = imdb.IMDb()

try:
    # Do the search, and get the results (a list of movies).
    results = i.get_keyword(name, results=20)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

# Print the results.
print(' %s result%s for "%s":' % (len(results),
                                  ('', 's')[len(results) != 1],
                                  name))
print(' : movie title')

# Print the long imdb title for every movie.
for idx, movie in enumerate(results):
    outp = '%d: %s' % (idx+1, movie['long imdb title'])
    print(outp)
imdbpy-6.8/bin/get_movie.py000077500000000000000000000064161351454127000157670ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
get_movie.py

Usage: get_movie "movie_id"

Show some info about the movie with the given movie_id (e.g. '0133093'
for "The Matrix", using 'http').
Note that movie IDs obtained using 'sql' are not the same IDs used
on the web.
""" import sys # Import the IMDbPY package. try: import imdb except ImportError: print('You bad boy! You need to install the IMDbPY package!') sys.exit(1) if len(sys.argv) != 2: print('Only one argument is required:') print(' %s "movie_id"' % sys.argv[0]) sys.exit(2) movie_id = sys.argv[1] i = imdb.IMDb() try: # Get a Movie object with the data about the movie identified by # the given movie_id. movie = i.get_movie(movie_id) except imdb.IMDbError as e: print("Probably you're not connected to Internet. Complete error report:") print(e) sys.exit(3) if not movie: print('It seems that there\'s no movie with movie_id "%s"' % movie_id) sys.exit(4) # XXX: this is the easier way to print the main info about a movie; # calling the summary() method of a Movie object will returns a string # with the main information about the movie. # Obviously it's not really meaningful if you want to know how # to access the data stored in a Movie object, so look below; the # commented lines show some ways to retrieve information from a # Movie object. print(movie.summary()) # Show some info about the movie. # This is only a short example; you can get a longer summary using # 'print movie.summary()' and the complete set of information looking for # the output of the movie.keys() method. # # print '==== "%s" / movie_id: %s ====' % (movie['title'], movie_id) # XXX: use the IMDb instance to get the IMDb web URL for the movie. # imdbURL = i.get_imdbURL(movie) # if imdbURL: # print 'IMDb URL: %s' % imdbURL # # XXX: many keys return a list of values, like "genres". # genres = movie.get('genres') # if genres: # print 'Genres: %s' % ' '.join(genres) # # XXX: even when only one value is present (e.g.: movie with only one # director), fields that can be multiple are ALWAYS a list. 
# Note that the 'name' variable is a Person object, but since its # __str__() method returns a string with the name, we can use it # directly, instead of name['name'] # director = movie.get('director') # if director: # print 'Director(s): ', # for name in director: # sys.stdout.write('%s ' % name) # print '' # # XXX: notice that every name in the cast is a Person object, with a # currentRole instance variable, which is a string for the played role. # cast = movie.get('cast') # if cast: # print 'Cast: ' # cast = cast[:5] # for name in cast: # print ' %s (%s)' % (name['name'], name.currentRole) # XXX: some information are not lists of strings or Person objects, but simple # strings, like 'rating'. # rating = movie.get('rating') # if rating: # print 'Rating: %s' % rating # XXX: an example of how to use information sets; retrieve the "trivia" # info set; check if it contains some data, select and print a # random entry. # import random # i.update(movie, info=['trivia']) # trivia = movie.get('trivia') # if trivia: # rand_trivia = trivia[random.randrange(len(trivia))] # print 'Random trivia: %s' % rand_trivia imdbpy-6.8/bin/get_person.py000077500000000000000000000053441351454127000161550ustar00rootroot00000000000000#!/usr/bin/env python3 # -*- coding: utf-8 -*- """ get_person.py Usage: get_person "person_id" Show some info about the person with the given person_id (e.g. '0000210' for "Julia Roberts". Notice that person_id, using 'sql', are not the same IDs used on the web. """ import sys # Import the IMDbPY package. try: import imdb except ImportError: print('You bad boy! You need to install the IMDbPY package!') sys.exit(1) if len(sys.argv) != 2: print('Only one argument is required:') print(' %s "person_id"' % sys.argv[0]) sys.exit(2) person_id = sys.argv[1] i = imdb.IMDb() try: # Get a Person object with the data about the person identified by # the given person_id. 
person = i.get_person(person_id) except imdb.IMDbError as e: print("Probably you're not connected to Internet. Complete error report:") print(e) sys.exit(3) if not person: print('It seems that there\'s no person with person_id "%s"' % person_id) sys.exit(4) # XXX: this is the easiest way to print the main info about a person; # calling the summary() method of a Person object returns a string # with the main information about the person. # Obviously it's not really meaningful if you want to know how # to access the data stored in a Person object, so look below; the # commented lines show some ways to retrieve information from a # Person object. print(person.summary()) # Show some info about the person. # This is only a short example; you can get a longer summary using # 'print person.summary()' and the complete set of information by looking at # the output of the person.keys() method. # print '==== "%s" / person_id: %s ====' % (person['name'], person_id) # XXX: use the IMDb instance to get the IMDb web URL for the person. # imdbURL = i.get_imdbURL(person) # if imdbURL: # print 'IMDb URL: %s' % imdbURL # XXX: print the birth date and birth notes. # d_date = person.get('birth date') # if d_date: # print 'Birth date: %s' % d_date # b_notes = person.get('birth notes') # if b_notes: # print 'Birth notes: %s' % b_notes # XXX: print the last five movies he/she acted in, and the played role. # movies_acted = person.get('actor') or person.get('actress') # if movies_acted: # print 'Last roles played: ' # for movie in movies_acted[:5]: # print ' %s (in "%s")' % (movie.currentRole, movie['title']) # XXX: example of the use of information sets.
# import random # i.update(person, info=['awards']) # awards = person.get('awards') # if awards: # rand_award = awards[random.randrange(len(awards))] # s = 'Random award: in year ' # s += rand_award.get('year', '') # s += ' %s "%s"' % (rand_award.get('result', '').lower(), # rand_award.get('award', '')) # print s imdbpy-6.8/bin/get_top_bottom_movies.py000077500000000000000000000014571351454127000204200ustar00rootroot00000000000000#!/usr/bin/env python3 # -*- coding: utf-8 -*- """ get_top_bottom_movies.py Usage: get_top_bottom_movies Return top and bottom 10 movies, by ratings. """ import sys # Import the IMDbPY package. try: import imdb except ImportError: print('You bad boy! You need to install the IMDbPY package!') sys.exit(1) if len(sys.argv) != 1: print('No arguments are required.') sys.exit(2) i = imdb.IMDb() top250 = i.get_top250_movies() bottom100 = i.get_bottom100_movies() for label, ml in [('top 10', top250[:10]), ('bottom 10', bottom100[:10])]: print('') print('%s movies' % label) print('rating\tvotes\ttitle') for movie in ml: outl = '%s\t%s\t%s' % (movie.get('rating'), movie.get('votes'), movie['long imdb title']) print(outl) imdbpy-6.8/bin/imdbpy2sql.py000077500000000000000000003405141351454127000160770ustar00rootroot00000000000000#!/usr/bin/env python3 # -*- coding: utf-8 -*- """ imdbpy2sql.py script. This script puts the data of the plain text data files into a SQL database. Copyright 2005-2017 Davide Alberani 2006 Giuseppe "Cowo" Corbelli lugbs.linux.it> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ import os import sys import getopt import time import re import warnings import operator import dbm from itertools import islice, chain try: from hashlib import md5 except ImportError: from md5 import md5 from gzip import GzipFile from imdb.parser.sql.dbschema import DB_SCHEMA, dropTables, createTables, createIndexes from imdb.parser.sql import soundex from imdb.utils import analyze_title, analyze_name, date_and_notes, \ build_name, build_title, normalizeName, normalizeTitle, _articles, \ build_company_name, analyze_company_name, canonicalTitle from imdb._exceptions import IMDbParserError, IMDbError from imdb.parser.sql.alchemyadapter import getDBTables, setConnection HELP = """imdbpy2sql.py usage: %s -d /directory/with/PlainTextDataFiles/ -u URI [-c /directory/for/CSV_files] [-i table,dbm] [--CSV-OPTIONS] [--COMPATIBILITY-OPTIONS] # NOTE: URI is something along the line: scheme://[user[:password]@]host[:port]/database[?parameters] Examples: mysql://user:password@host/database postgres://user:password@host/database sqlite:/tmp/imdb.db sqlite:/C|/full/path/to/database # NOTE: CSV mode (-c path): A directory is used to store CSV files; on supported database servers it should be really fast. # NOTE: imdbIDs store/restore (-i method): Valid options are 'table' (imdbIDs stored in a temporary table of the database) or 'dbm' (imdbIDs stored on a dbm file - this is the default if CSV is used). # NOTE: --CSV-OPTIONS can be: --csv-ext STRING files extension (.csv) --csv-only-write exit after the CSV files are written. --csv-only-load load an existing set of CSV files. # NOTE: --COMPATIBILITY-OPTIONS can be one of: --mysql-innodb insert data into a MySQL MyISAM db, and then convert it to InnoDB. --mysql-force-myisam force the creation of MyISAM tables. 
--ms-sqlserver compatibility mode for Microsoft SQL Server and SQL Express. --sqlite-transactions uses transactions, to speed-up SQLite. See README.sqldb for more information. """ % sys.argv[0] # Directory containing the IMDb's Plain Text Data Files. IMDB_PTDF_DIR = None # URI used to connect to the database. URI = None # List of tables of the database. DB_TABLES = [] # Max allowed recursion, inserting data. MAX_RECURSION = 10 # Method used to (re)store imdbIDs. IMDBIDS_METHOD = None # If set, this directory is used to output CSV files. CSV_DIR = None CSV_CURS = None CSV_ONLY_WRITE = False CSV_ONLY_LOAD = False CSV_EXT = '.csv' CSV_EOL = '\n' CSV_DELIMITER = ',' CSV_QUOTE = '"' CSV_ESCAPE = '"' CSV_NULL = 'NULL' CSV_QUOTEINT = False CSV_LOAD_SQL = None CSV_MYSQL = "LOAD DATA LOCAL INFILE '%(file)s' INTO TABLE `%(table)s` FIELDS TERMINATED BY '%(delimiter)s' ENCLOSED BY '%(quote)s' ESCAPED BY '%(escape)s' LINES TERMINATED BY '%(eol)s'" CSV_PGSQL = "COPY %(table)s FROM '%(file)s' WITH DELIMITER AS '%(delimiter)s' NULL AS '%(null)s' QUOTE AS '%(quote)s' ESCAPE AS '%(escape)s' CSV" CSV_DB2 = "CALL SYSPROC.ADMIN_CMD('LOAD FROM %(file)s OF del MODIFIED BY lobsinfile INSERT INTO %(table)s')" # Temporary fix for old style titles. # FIX_OLD_STYLE_TITLES = True # Store custom queries specified on the command line. CUSTOM_QUERIES = {} # Allowed time specification, for custom queries. ALLOWED_TIMES = ('BEGIN', 'BEFORE_DROP', 'BEFORE_CREATE', 'AFTER_CREATE', 'BEFORE_MOVIES', 'BEFORE_COMPANIES', 'BEFORE_CAST', 'BEFORE_RESTORE', 'BEFORE_INDEXES', 'END', 'BEFORE_MOVIES_TODB', 'AFTER_MOVIES_TODB', 'BEFORE_PERSONS_TODB', 'AFTER_PERSONS_TODB', 'BEFORE_SQLDATA_TODB', 'AFTER_SQLDATA_TODB', 'BEFORE_AKAMOVIES_TODB', 'AFTER_AKAMOVIES_TODB', 'BEFORE_CHARACTERS_TODB', 'AFTER_CHARACTERS_TODB', 'BEFORE_COMPANIES_TODB', 'AFTER_COMPANIES_TODB', 'BEFORE_EVERY_TODB', 'AFTER_EVERY_TODB', 'BEFORE_CSV_LOAD', 'BEFORE_CSV_TODB', 'AFTER_CSV_TODB') # Shortcuts for some compatibility options. 
MYSQLFORCEMYISAM_OPTS = ['-e', 'AFTER_CREATE:FOR_EVERY_TABLE:ALTER TABLE %(table)s ENGINE=MyISAM;'] MYSQLINNODB_OPTS = ['-e', 'AFTER_CREATE:FOR_EVERY_TABLE:ALTER TABLE %(table)s ENGINE=MyISAM;', '-e', 'BEFORE_INDEXES:FOR_EVERY_TABLE:ALTER TABLE %(table)s ENGINE=InnoDB;'] SQLSERVER_OPTS = ['-e', 'BEFORE_MOVIES_TODB:SET IDENTITY_INSERT %(table)s ON;', '-e', 'AFTER_MOVIES_TODB:SET IDENTITY_INSERT %(table)s OFF;', '-e', 'BEFORE_PERSONS_TODB:SET IDENTITY_INSERT %(table)s ON;', '-e', 'AFTER_PERSONS_TODB:SET IDENTITY_INSERT %(table)s OFF;', '-e', 'BEFORE_COMPANIES_TODB:SET IDENTITY_INSERT %(table)s ON;', '-e', 'AFTER_COMPANIES_TODB:SET IDENTITY_INSERT %(table)s OFF;', '-e', 'BEFORE_CHARACTERS_TODB:SET IDENTITY_INSERT %(table)s ON;', '-e', 'AFTER_CHARACTERS_TODB:SET IDENTITY_INSERT %(table)s OFF;', '-e', 'BEFORE_AKAMOVIES_TODB:SET IDENTITY_INSERT %(table)s ON;', '-e', 'AFTER_AKAMOVIES_TODB:SET IDENTITY_INSERT %(table)s OFF;'] SQLITE_OPTS = ['-e', 'BEGIN:PRAGMA synchronous = OFF;', '-e', 'BEFORE_EVERY_TODB:BEGIN TRANSACTION;', '-e', 'AFTER_EVERY_TODB:COMMIT;', '-e', 'BEFORE_INDEXES:BEGIN TRANSACTION;', '-e', 'END:COMMIT;'] if '--mysql-innodb' in sys.argv[1:]: sys.argv += MYSQLINNODB_OPTS if '--mysql-force-myisam' in sys.argv[1:]: sys.argv += MYSQLFORCEMYISAM_OPTS if '--ms-sqlserver' in sys.argv[1:]: sys.argv += SQLSERVER_OPTS if '--sqlite-transactions' in sys.argv[1:]: sys.argv += SQLITE_OPTS # Manage arguments list.
try: optlist, args = getopt.getopt(sys.argv[1:], 'u:d:e:c:i:h', ['uri=', 'data=', 'execute=', 'mysql-innodb', 'ms-sqlserver', 'sqlite-transactions', 'fix-old-style-titles', 'mysql-force-myisam', 'csv-only-write', 'csv-only-load', 'csv=', 'csv-ext=', 'imdbids=', 'help']) except getopt.error as e: print('Troubles with arguments.') print(HELP) sys.exit(2) for opt in optlist: if opt[0] in ('-d', '--data'): IMDB_PTDF_DIR = opt[1] elif opt[0] in ('-u', '--uri'): URI = opt[1] elif opt[0] in ('-c', '--csv'): CSV_DIR = opt[1] elif opt[0] == '--csv-ext': CSV_EXT = opt[1] elif opt[0] in ('-i', '--imdbids'): IMDBIDS_METHOD = opt[1] elif opt[0] in ('-e', '--execute'): if opt[1].find(':') == -1: print('WARNING: wrong command syntax: "%s"' % opt[1]) continue when, cmd = opt[1].split(':', 1) if when not in ALLOWED_TIMES: print('WARNING: unknown time: "%s"' % when) continue if when == 'BEFORE_EVERY_TODB': for nw in ('BEFORE_MOVIES_TODB', 'BEFORE_PERSONS_TODB', 'BEFORE_SQLDATA_TODB', 'BEFORE_AKAMOVIES_TODB', 'BEFORE_CHARACTERS_TODB', 'BEFORE_COMPANIES_TODB'): CUSTOM_QUERIES.setdefault(nw, []).append(cmd) elif when == 'AFTER_EVERY_TODB': for nw in ('AFTER_MOVIES_TODB', 'AFTER_PERSONS_TODB', 'AFTER_SQLDATA_TODB', 'AFTER_AKAMOVIES_TODB', 'AFTER_CHARACTERS_TODB', 'AFTER_COMPANIES_TODB'): CUSTOM_QUERIES.setdefault(nw, []).append(cmd) else: CUSTOM_QUERIES.setdefault(when, []).append(cmd) elif opt[0] == '--fix-old-style-titles': warnings.warn('The --fix-old-style-titles argument is obsolete.') elif opt[0] == '--csv-only-write': CSV_ONLY_WRITE = True elif opt[0] == '--csv-only-load': CSV_ONLY_LOAD = True elif opt[0] in ('-h', '--help'): print(HELP) sys.exit(0) if IMDB_PTDF_DIR is None: print('You must supply the directory with the plain text data files') print(HELP) sys.exit(2) if URI is None: print('You must supply the URI for the database connection') print(HELP) sys.exit(2) if IMDBIDS_METHOD not in (None, 'dbm', 'table'): print('the method to (re)store imdbIDs must be one of "dbm" or 
"table"') print(HELP) sys.exit(2) if (CSV_ONLY_WRITE or CSV_ONLY_LOAD) and not CSV_DIR: print('You must specify the CSV directory with the -c argument') print(HELP) sys.exit(3) # Some warnings and notices. URIlower = URI.lower() if URIlower.startswith('mysql'): if '--mysql-force-myisam' in sys.argv[1:] and \ '--mysql-innodb' in sys.argv[1:]: print('\nWARNING: there is no sense in mixing the --mysql-innodb and\n' '--mysql-force-myisam command line options!\n') elif '--mysql-innodb' in sys.argv[1:]: print("\nNOTICE: you've specified the --mysql-innodb command line\n" "option; you should do this ONLY IF your system uses InnoDB\n" "tables or you really want to use InnoDB; if you're running\n" "a MyISAM-based database, please omit any option; if you\n" "want to force MyISAM usage on a InnoDB-based database,\n" "try the --mysql-force-myisam command line option, instead.\n") elif '--mysql-force-myisam' in sys.argv[1:]: print("\nNOTICE: you've specified the --mysql-force-myisam command\n" "line option; you should do this ONLY IF your system uses\n" "InnoDB tables and you want to use MyISAM tables, instead.\n") else: print("\nNOTICE: IF you're using InnoDB tables, data insertion can\n" "be very slow; you can switch to MyISAM tables - forcing it\n" "with the --mysql-force-myisam option - OR use the\n" "--mysql-innodb command line option, but DON'T USE these if\n" "you're already working on MyISAM tables, because it will\n" "force MySQL to use InnoDB, and performances will be poor.\n") elif URIlower.startswith('mssql') and \ '--ms-sqlserver' not in sys.argv[1:]: print("\nWARNING: you're using MS SQLServer without the --ms-sqlserver\n" "command line option: if something goes wrong, try using it.\n") elif URIlower.startswith('sqlite') and \ '--sqlite-transactions' not in sys.argv[1:]: print("\nWARNING: you're using SQLite without the --sqlite-transactions\n" "command line option: you'll have very poor performances! 
Try\n" "using it.\n") if ('--mysql-force-myisam' in sys.argv[1:] and not URIlower.startswith('mysql')) or ('--mysql-innodb' in sys.argv[1:] and not URIlower.startswith('mysql')) or ('--ms-sqlserver' in sys.argv[1:] and not URIlower.startswith('mssql')) or \ ('--sqlite-transactions' in sys.argv[1:] and not URIlower.startswith('sqlite')): print("\nWARNING: you've specified command line options that don't\n" "belong to the database server you're using: proceed at your\n" "own risk!\n") if CSV_DIR: if URIlower.startswith('mysql'): CSV_LOAD_SQL = CSV_MYSQL elif URIlower.startswith('postgres'): CSV_LOAD_SQL = CSV_PGSQL elif URIlower.startswith('ibm'): CSV_LOAD_SQL = CSV_DB2 CSV_NULL = '' else: print("\nERROR: importing CSV files is not supported for this database") if not CSV_ONLY_WRITE: sys.exit(3) DB_TABLES = getDBTables(URI) for t in DB_TABLES: globals()[t._imdbpyName] = t #----------------------- # CSV Handling. class CSVCursor(object): """Emulate a cursor object, but instead it writes data to a set of CSV files.""" def __init__(self, csvDir, csvExt=CSV_EXT, csvEOL=CSV_EOL, delimeter=CSV_DELIMITER, quote=CSV_QUOTE, escape=CSV_ESCAPE, null=CSV_NULL, quoteInteger=CSV_QUOTEINT): """Initialize a CSVCursor object; csvDir is the directory where the CSV files will be stored.""" self.csvDir = csvDir self.csvExt = csvExt self.csvEOL = csvEOL self.delimeter = delimeter self.quote = quote self.escape = escape self.escaped = '%s%s' % (escape, quote) self.null = null self.quoteInteger = quoteInteger self._fdPool = {} self._lobFDPool = {} self._counters = {} def buildLine(self, items, tableToAddID=False, rawValues=(), lobFD=None, lobFN=None): """Build a single text line for a set of information.""" # FIXME: there are too many special cases to handle, and that # affects performances: management of LOB files, at least, # must be moved away from here. 
quote = self.quote null = self.null escaped = self.escaped quoteInteger = self.quoteInteger if not tableToAddID: r = [] else: _counters = self._counters r = [_counters[tableToAddID]] _counters[tableToAddID] += 1 r += list(items) for idx, val in enumerate(r): if val is None: r[idx] = null continue if (not quoteInteger) and isinstance(val, int): r[idx] = str(val) continue if lobFD and idx == 3: continue val = str(val) if quote: val = '%s%s%s' % (quote, val.replace(quote, escaped), quote) r[idx] = val # Add RawValue(s), if present. rinsert = r.insert if tableToAddID: shift = 1 else: shift = 0 for idx, item in rawValues: rinsert(idx + shift, item) if lobFD: # XXX: totally tailored to suit person_info.info column! val3 = r[3] val3len = len(val3 or '') or -1 if val3len == -1: val3off = 0 else: val3off = lobFD.tell() r[3] = '%s.%d.%d/' % (lobFN, val3off, val3len) lobFD.write(val3) # Build the line and add the end-of-line. ret = '%s%s' % (self.delimeter.join(r), self.csvEOL) ret = ret.encode('latin1', 'ignore') return ret def executemany(self, sqlstr, items): """Emulate the executemany method of a cursor, but writes the data in a set of CSV files.""" # XXX: find a safer way to get the table/file name! tName = sqlstr.split()[2] lobFD = None lobFN = None doLOB = False # XXX: ugly special case, to create the LOB file. if URIlower.startswith('ibm') and tName == 'person_info': doLOB = True # Open the file descriptor or get it from the pool. 
if tName in self._fdPool: tFD = self._fdPool[tName] lobFD = self._lobFDPool.get(tName) lobFN = getattr(lobFD, 'name', None) if lobFN: lobFN = os.path.basename(lobFN) else: tFD = open(os.path.join(CSV_DIR, tName + self.csvExt), 'wb') self._fdPool[tName] = tFD if doLOB: lobFN = '%s.lob' % tName lobFD = open(os.path.join(CSV_DIR, lobFN), 'wb') self._lobFDPool[tName] = lobFD buildLine = self.buildLine tableToAddID = False if tName in ('cast_info', 'movie_info', 'person_info', 'movie_companies', 'movie_link', 'aka_name', 'complete_cast', 'movie_info_idx', 'movie_keyword'): tableToAddID = tName if tName not in self._counters: self._counters[tName] = 1 # Identify if there are RawValue in the VALUES (...) portion of # the query; str.rfind() returns -1 when no '(' is found. parIdx = sqlstr.rfind('(') rawValues = [] if parIdx != -1: vals = sqlstr[parIdx + 1:-1] for idx, item in enumerate(vals.split(', ')): if item[0] in ('%', '?', ':'): continue rawValues.append((idx, item)) # Write these lines. tFD.writelines(buildLine(i, tableToAddID=tableToAddID, rawValues=rawValues, lobFD=lobFD, lobFN=lobFN) for i in items) # Flush to disk, so that no truncated entries are ever left. # XXX: is this a good idea?
tFD.flush() def fileNames(self): """Return the list of file names.""" return [fd.name for fd in list(self._fdPool.values())] def buildFakeFileNames(self): """Populate the self._fdPool dictionary with fake objects taking file names from the content of the self.csvDir directory.""" class _FakeFD(object): pass for fname in os.listdir(self.csvDir): if not fname.endswith(CSV_EXT): continue fpath = os.path.join(self.csvDir, fname) if not os.path.isfile(fpath): continue fd = _FakeFD() fd.name = fname self._fdPool[fname[:-len(CSV_EXT)]] = fd def close(self, tName): """Close a given table/file.""" if tName in self._fdPool: self._fdPool[tName].close() def closeAll(self): """Close all open file descriptors.""" for fd in list(self._fdPool.values()): fd.close() for fd in list(self._lobFDPool.values()): fd.close() def loadCSVFiles(): """Load every CSV file into the database.""" CSV_REPL = {'quote': CSV_QUOTE, 'delimiter': CSV_DELIMITER, 'escape': CSV_ESCAPE, 'null': CSV_NULL, 'eol': CSV_EOL} for fName in CSV_CURS.fileNames(): connectObject.commit() tName = os.path.basename(fName[:-len(CSV_EXT)]) cfName = os.path.join(CSV_DIR, fName) CSV_REPL['file'] = cfName CSV_REPL['table'] = tName sqlStr = CSV_LOAD_SQL % CSV_REPL print(' * LOADING CSV FILE %s...' % cfName) sys.stdout.flush() executeCustomQueries('BEFORE_CSV_TODB') try: CURS.execute(sqlStr) try: res = CURS.fetchall() if res: print('LOADING OUTPUT:', res) except: pass except Exception as e: print('ERROR: unable to import CSV file %s: %s' % (cfName, str(e))) continue connectObject.commit() executeCustomQueries('AFTER_CSV_TODB') #----------------------- conn = setConnection(URI, DB_TABLES) if CSV_DIR: # Go for a CSV ride... CSV_CURS = CSVCursor(CSV_DIR) # Extract exceptions to trap. 
try: OperationalError = conn.module.OperationalError except AttributeError as e: warnings.warn('Unable to import OperationalError; report this as a bug, ' 'since it will mask important exceptions: %s' % e) OperationalError = Exception try: IntegrityError = conn.module.IntegrityError except AttributeError as e: warnings.warn('Unable to import IntegrityError') IntegrityError = Exception connectObject = conn.getConnection() # Cursor object. CURS = connectObject.cursor() # Name of the database and style of the parameters. DB_NAME = conn.dbName PARAM_STYLE = conn.paramstyle def _get_imdbids_method(): """Return the method to be used to (re)store imdbIDs (one of 'dbm' or 'table').""" if IMDBIDS_METHOD: return IMDBIDS_METHOD if CSV_DIR: return 'dbm' return 'table' def tableName(table): """Return a string with the name of the table in the current db.""" return table.sqlmeta.table def colName(table, column): """Return a string with the name of the column in the current db.""" if column == 'id': return table.sqlmeta.idName return table.sqlmeta.columns[column].dbName class RawValue(object): """String-like objects to store raw SQL parameters, that are not intended to be replaced with positional parameters, in the query.""" def __init__(self, s, v): self.string = s self.value = v def __str__(self): return self.string def _makeConvNamed(cols): """Return a function to be used to convert a list of parameters from positional style to named style (convert from a list of tuples to a list of dictionaries.""" nrCols = len(cols) def _converter(params): for paramIndex, paramSet in enumerate(params): d = {} for i in range(nrCols): d[cols[i]] = paramSet[i] params[paramIndex] = d return params return _converter def createSQLstr(table, cols, command='INSERT'): """Given a table and a list of columns returns a sql statement useful to insert a set of data in the database. 
Along with the string, also a function useful to convert parameters from positional to named style is returned.""" sqlstr = '%s INTO %s ' % (command, tableName(table)) colNames = [] values = [] convCols = [] count = 1 def _valStr(s, index): if DB_NAME in ('mysql', 'postgres'): return '%s' elif PARAM_STYLE == 'format': return '%s' elif PARAM_STYLE == 'qmark': return '?' elif PARAM_STYLE == 'numeric': return ':%s' % index elif PARAM_STYLE == 'named': return ':%s' % s elif PARAM_STYLE == 'pyformat': return '%(' + s + ')s' return '%s' for col in cols: if isinstance(col, RawValue): colNames.append(colName(table, col.string)) values.append(str(col.value)) elif col == 'id': colNames.append(table.sqlmeta.idName) values.append(_valStr('id', count)) convCols.append(col) count += 1 else: colNames.append(colName(table, col)) values.append(_valStr(col, count)) convCols.append(col) count += 1 sqlstr += '(%s) ' % ', '.join(colNames) sqlstr += 'VALUES (%s)' % ', '.join(values) if DB_NAME not in ('mysql', 'postgres') and \ PARAM_STYLE in ('named', 'pyformat'): converter = _makeConvNamed(convCols) else: # Return the list itself. converter = lambda x: x return sqlstr, converter def _(s, truncateAt=None): """Nicely print a string to sys.stdout, optionally truncating it a the given char.""" if truncateAt is not None: s = s[:truncateAt] return s if not hasattr(os, 'times'): def times(): """Fake times() function.""" return 0.0, 0.0, 0.0, 0.0, 0.0 os.times = times # Show time consumed by the single function call. CTIME = int(time.time()) BEGIN_TIME = CTIME CTIMES = os.times() BEGIN_TIMES = CTIMES def _minSec(*t): """Return a tuple of (mins, secs, ...) 
- two for every item passed.""" l = [] for i in t: l.extend(divmod(int(i), 60)) return tuple(l) def t(s, sinceBegin=False): """Pretty-print timing information.""" global CTIME, CTIMES nt = int(time.time()) ntimes = os.times() if not sinceBegin: ct = CTIME cts = CTIMES else: ct = BEGIN_TIME cts = BEGIN_TIMES print('# TIME', s, ': %dmin, %dsec (wall) %dmin, %dsec (user) %dmin, %dsec (system)' % _minSec(nt - ct, ntimes[0] - cts[0], ntimes[1] - cts[1])) if not sinceBegin: CTIME = nt CTIMES = ntimes def title_soundex(title): """Return the soundex code for the given title; the (optional) starting article is pruned. It assumes to receive a title without year/imdbIndex or kind indications, but just the title string, as the one in the analyze_title(title)['title'] value.""" if not title: return None # Convert to canonical format. title = canonicalTitle(title) ts = title.split(', ') # Strip the ending article, if any. if ts[-1].lower() in _articles: title = ', '.join(ts[:-1]) return soundex(title) def name_soundexes(name, character=False): """Return three soundex codes for the given name; the name is assumed to be in the 'surname, name' format, without the imdbIndex indication, as the one in the analyze_name(name)['name'] value. The first one is the soundex of the name in the canonical format. The second is the soundex of the name in the normal format, if different from the first one. The third is the soundex of the surname, if different from the other two values.""" if not name: return None, None, None s1 = soundex(name) name_normal = normalizeName(name) s2 = soundex(name_normal) if s1 == s2: s2 = None if not character: namesplit = name.split(', ') s3 = soundex(namesplit[0]) else: s3 = soundex(name.split(' ')[-1]) if s3 and s3 in (s1, s2): s3 = None return s1, s2, s3 # Tags to identify where the meaningful data begin/end in files. 
MOVIES = 'movies.list.gz' MOVIES_START = ('MOVIES LIST', '===========', '') MOVIES_STOP = '--------------------------------------------------' CAST_START = ('Name', '----') CAST_STOP = '-----------------------------' RAT_START = ('MOVIE RATINGS REPORT', '', 'New Distribution Votes Rank Title') RAT_STOP = '\n' RAT_TOP250_START = ('note: for this top 250', '', 'New Distribution') RAT_BOT10_START = ('BOTTOM 10 MOVIES', '', 'New Distribution') TOPBOT_STOP = '\n' AKAT_START = ('AKA TITLES LIST', '=============', '', '', '') AKAT_IT_START = ('AKA TITLES LIST ITALIAN', '=======================', '', '') AKAT_DE_START = ('AKA TITLES LIST GERMAN', '======================', '') AKAT_ISO_START = ('AKA TITLES LIST ISO', '===================', '') AKAT_HU_START = ('AKA TITLES LIST HUNGARIAN', '=========================', '') AKAT_NO_START = ('AKA TITLES LIST NORWEGIAN', '=========================', '') AKAN_START = ('AKA NAMES LIST', '=============', '') AV_START = ('ALTERNATE VERSIONS LIST', '=======================', '', '') MINHASH_STOP = '-------------------------' GOOFS_START = ('GOOFS LIST', '==========', '') QUOTES_START = ('QUOTES LIST', '=============') CC_START = ('CRAZY CREDITS', '=============') BIO_START = ('BIOGRAPHY LIST', '==============') BUS_START = ('BUSINESS LIST', '=============', '') BUS_STOP = ' =====' CER_START = ('CERTIFICATES LIST', '=================') COL_START = ('COLOR INFO LIST', '===============') COU_START = ('COUNTRIES LIST', '==============') DIS_START = ('DISTRIBUTORS LIST', '=================', '') GEN_START = ('8: THE GENRES LIST', '==================', '') KEY_START = ('8: THE KEYWORDS LIST', '====================', '') LAN_START = ('LANGUAGE LIST', '=============') LOC_START = ('LOCATIONS LIST', '==============', '') MIS_START = ('MISCELLANEOUS COMPANY LIST', '============================') MIS_STOP = '--------------------------------------------------------------------------------' PRO_START = ('PRODUCTION COMPANIES LIST', 
'=========================', '') RUN_START = ('RUNNING TIMES LIST', '==================') SOU_START = ('SOUND-MIX LIST', '==============') SFX_START = ('SFXCO COMPANIES LIST', '====================', '') TCN_START = ('TECHNICAL LIST', '==============', '', '') LSD_START = ('LASERDISC LIST', '==============', '------------------------') LIT_START = ('LITERATURE LIST', '===============', '') LIT_STOP = 'COPYING POLICY' LINK_START = ('MOVIE LINKS LIST', '================', '') MPAA_START = ('MPAA RATINGS REASONS LIST', '=========================') PLOT_START = ('PLOT SUMMARIES LIST', '===================', '') RELDATE_START = ('RELEASE DATES LIST', '==================') SNDT_START = ('SOUNDTRACKS', '=============', '', '', '') TAGL_START = ('TAG LINES LIST', '==============', '', '') TAGL_STOP = '-----------------------------------------' TRIV_START = ('FILM TRIVIA', '===========', '') COMPCAST_START = ('CAST COVERAGE TRACKING LIST', '===========================') COMPCREW_START = ('CREW COVERAGE TRACKING LIST', '===========================') COMP_STOP = '---------------' GzipFileRL = GzipFile.readline class SourceFile(GzipFile): """Instances of this class are used to read gzipped files, starting from a defined line to a (optionally) given end.""" def __init__(self, filename=None, mode=None, start=(), stop=None, pwarning=1, *args, **kwds): filename = os.path.join(IMDB_PTDF_DIR, filename) try: GzipFile.__init__(self, filename, mode, *args, **kwds) except IOError as e: if not pwarning: raise print('WARNING WARNING WARNING') print('WARNING unable to read the "%s" file.' % filename) print('WARNING The file will be skipped, and the contained') print('WARNING information will NOT be stored in the database.') print('WARNING Complete error: ', e) # re-raise the exception. 
raise self.start = start for item in start: itemlen = len(item) for line in self: line = line.decode('latin1') if line[:itemlen] == item: break self.set_stop(stop) def set_stop(self, stop): if stop is not None: self.stop = stop self.stoplen = len(self.stop) self.readline = self.readline_checkEnd else: self.readline = self.readline_NOcheckEnd def readline_NOcheckEnd(self, size=-1): line = GzipFile.readline(self, size) return str(line, 'latin_1', 'ignore') def readline_checkEnd(self, size=-1): line = str(GzipFile.readline(self, size), 'latin_1', 'ignore') if self.stop is not None and line[:self.stoplen] == self.stop: return '' return line def getByHashSections(self): return getSectionHash(self) def getByNMMVSections(self): return getSectionNMMV(self) def getSectionHash(fp): """Return sections separated by lines starting with #.""" curSectList = [] curSectListApp = curSectList.append curTitle = '' joiner = ''.join for line in fp: if line and line[0] == '#': if curSectList and curTitle: yield curTitle, joiner(curSectList) curSectList[:] = [] curTitle = '' curTitle = line[2:] else: curSectListApp(line) if curSectList and curTitle: yield curTitle, joiner(curSectList) curSectList[:] = [] curTitle = '' NMMVSections = dict([(x, None) for x in ('MV: ', 'NM: ', 'OT: ', 'MOVI')]) def getSectionNMMV(fp): """Return sections separated by lines starting with 'NM: ', 'MV: ', 'OT: ' or 'MOVI'.""" curSectList = [] curSectListApp = curSectList.append curNMMV = '' joiner = ''.join for line in fp: if line[:4] in NMMVSections: if curSectList and curNMMV: yield curNMMV, joiner(curSectList) curSectList[:] = [] curNMMV = '' if line[:4] == 'MOVI': curNMMV = line[6:] else: curNMMV = line[4:] elif not (line and line[0] == '-'): curSectListApp(line) if curSectList and curNMMV: yield curNMMV, joiner(curSectList) curSectList[:] = [] curNMMV = '' def counter(initValue=1): """A counter implemented using a generator.""" i = initValue while 1: yield i i += 1 class _BaseCache(dict): """Base class for 
Movie and Person basic information.""" def __init__(self, d=None, flushEvery=100000): dict.__init__(self) # Flush data into the SQL database every flushEvery entries. self.flushEvery = flushEvery self._tmpDict = {} self._flushing = 0 self._deferredData = {} self._recursionLevel = 0 self._table_name = '' self._id_for_custom_q = '' if d is not None: for k, v in d.items(): self[k] = v def __setitem__(self, key, counter): """Every time a key is set, its value is the counter; every flushEvery, the temporary dictionary is flushed to the database, and then zeroed.""" if counter % self.flushEvery == 0: self.flush() dict.__setitem__(self, key, counter) if not self._flushing: self._tmpDict[key] = counter else: self._deferredData[key] = counter def flush(self, quiet=0, _recursionLevel=0): """Flush to the database.""" if self._flushing: return self._flushing = 1 if _recursionLevel >= MAX_RECURSION: print('WARNING recursion level exceeded trying to flush data') print('WARNING this batch of data is lost (%s).' % self.className) self._tmpDict.clear() self._flushing = 0 return if self._tmpDict: # Horrible hack to know if AFTER_%s_TODB has run. _after_has_run = False keys = {'table': self._table_name} try: executeCustomQueries('BEFORE_%s_TODB' % self._id_for_custom_q, _keys=keys, _timeit=False) self._toDB(quiet) executeCustomQueries('AFTER_%s_TODB' % self._id_for_custom_q, _keys=keys, _timeit=False) _after_has_run = True self._tmpDict.clear() except OperationalError as e: # XXX: I'm not sure this is the right thing (and way) # to proceed. if not _after_has_run: executeCustomQueries('AFTER_%s_TODB' % self._id_for_custom_q, _keys=keys, _timeit=False) # Dataset too large; split it in two and retry. # XXX: new code! # the same class instance (self) is used, instead of # creating two separated objects. 
_recursionLevel += 1 self._flushing = 0 firstHalf = {} poptmpd = self._tmpDict.popitem originalLength = len(self._tmpDict) for x in range(1, 1 + originalLength // 2): k, v = poptmpd() firstHalf[k] = v self._secondHalf = self._tmpDict self._tmpDict = firstHalf print(' * TOO MANY DATA (%s items in %s), recursion: %s' % (originalLength, self.className, _recursionLevel)) print(' * SPLITTING (run 1 of 2), recursion: %s' % _recursionLevel) self.flush(quiet=quiet, _recursionLevel=_recursionLevel) self._tmpDict = self._secondHalf print(' * SPLITTING (run 2 of 2), recursion: %s' % _recursionLevel) self.flush(quiet=quiet, _recursionLevel=_recursionLevel) self._tmpDict.clear() except Exception as e: if isinstance(e, KeyboardInterrupt): raise print('WARNING: %s; unknown exception caught committing the data' % self.className) print('WARNING: to the database; report this as a bug, since') print('WARNING: many data (%d items) were lost: %s' % (len(self._tmpDict), e)) self._flushing = 0 try: connectObject.commit() except: pass # Flush also deferred data. if self._deferredData: self._tmpDict = self._deferredData self.flush(quiet=1) self._deferredData = {} connectObject.commit() def populate(self): """Populate the dictionary from the database.""" raise NotImplementedError def _toDB(self, quiet=0): """Write the dictionary to the database.""" raise NotImplementedError def add(self, key, miscData=None): """Insert a new key and return its value.""" c = next(self.counter) if miscData is not None: for d_name, data in miscData: getattr(self, d_name)[c] = data self[key] = c return c def addUnique(self, key, miscData=None): """Insert a new key and return its value; if the key is already in the dictionary, its previous value is returned.""" if key in self: return self[key] else: return self.add(key, miscData) def fetchsome(curs, size=20000): """Yes, I've read the Python Cookbook! 
:-)""" while 1: res = curs.fetchmany(size) if not res: break for r in res: yield r class MoviesCache(_BaseCache): """Manage the movies list.""" className = 'MoviesCache' counter = counter() def __init__(self, *args, **kwds): _BaseCache.__init__(self, *args, **kwds) self.movieYear = {} self._table_name = tableName(Title) self._id_for_custom_q = 'MOVIES' self.sqlstr, self.converter = createSQLstr(Title, ('id', 'title', 'imdbIndex', 'kindID', 'productionYear', 'imdbID', 'phoneticCode', 'episodeOfID', 'seasonNr', 'episodeNr', 'seriesYears', 'md5sum')) def populate(self): print(' * POPULATING %s...' % self.className) titleTbl = tableName(Title) movieidCol = colName(Title, 'id') titleCol = colName(Title, 'title') kindidCol = colName(Title, 'kindID') yearCol = colName(Title, 'productionYear') imdbindexCol = colName(Title, 'imdbIndex') episodeofidCol = colName(Title, 'episodeOfID') seasonNrCol = colName(Title, 'seasonNr') episodeNrCol = colName(Title, 'episodeNr') sqlPop = 'SELECT %s, %s, %s, %s, %s, %s, %s, %s FROM %s;' % \ (movieidCol, titleCol, kindidCol, yearCol, imdbindexCol, episodeofidCol, seasonNrCol, episodeNrCol, titleTbl) CURS.execute(sqlPop) _oldcacheValues = Title.sqlmeta.cacheValues Title.sqlmeta.cacheValues = False for x in fetchsome(CURS, self.flushEvery): mdict = {'title': x[1], 'kind': KIND_STRS[x[2]], 'year': x[3], 'imdbIndex': x[4]} if mdict['imdbIndex'] is None: del mdict['imdbIndex'] if mdict['year'] is None: del mdict['year'] else: mdict['year'] = str(mdict['year']) episodeOfID = x[5] if episodeOfID is not None: s = Title.get(episodeOfID) series_d = {'title': s.title, 'kind': str(KIND_STRS[s.kindID]), 'year': s.productionYear, 'imdbIndex': s.imdbIndex} if series_d['imdbIndex'] is None: del series_d['imdbIndex'] if series_d['year'] is None: del series_d['year'] else: series_d['year'] = str(series_d['year']) mdict['episode of'] = series_d title = build_title(mdict, ptdf=True) dict.__setitem__(self, title, x[0]) self.counter = 
counter(Title.select().count() + 1) Title.sqlmeta.cacheValues = _oldcacheValues def _toDB(self, quiet=0): if not quiet: print(' * FLUSHING %s...' % self.className) sys.stdout.flush() l = [] lapp = l.append for k, v in self._tmpDict.items(): try: t = analyze_title(k) except IMDbParserError: if k and k.strip(): print('WARNING %s._toDB() invalid title:' % self.className, end=' ') print(_(k)) continue tget = t.get episodeOf = None kind = tget('kind') if kind == 'episode': # Series title. stitle = build_title(tget('episode of'), ptdf=True) episodeOf = self.addUnique(stitle) del t['episode of'] year = self.movieYear.get(v) if year is not None and year != '????': try: t['year'] = int(year) except ValueError: pass elif kind in ('tv series', 'tv mini series'): t['series years'] = self.movieYear.get(v) title = tget('title') soundex = title_soundex(title) lapp((v, title, tget('imdbIndex'), KIND_IDS[kind], tget('year'), None, soundex, episodeOf, tget('season'), tget('episode'), tget('series years'), md5(k.encode('latin1')).hexdigest())) self._runCommand(l) def _runCommand(self, dataList): if not CSV_DIR: CURS.executemany(self.sqlstr, self.converter(dataList)) else: CSV_CURS.executemany(self.sqlstr, dataList) def addUnique(self, key, miscData=None): """Insert a new key and return its value; if the key is already in the dictionary, its previous value is returned.""" if key.endswith('{{SUSPENDED}}'): return None if key in self: return self[key] else: return self.add(key, miscData) class PersonsCache(_BaseCache): """Manage the persons list.""" className = 'PersonsCache' counter = counter() def __init__(self, *args, **kwds): _BaseCache.__init__(self, *args, **kwds) self.personGender = {} self._table_name = tableName(Name) self._id_for_custom_q = 'PERSONS' self.sqlstr, self.converter = createSQLstr(Name, ['id', 'name', 'imdbIndex', 'imdbID', 'gender', 'namePcodeCf', 'namePcodeNf', 'surnamePcode', 'md5sum']) def populate(self): print(' * POPULATING PersonsCache...') nameTbl = 
tableName(Name) personidCol = colName(Name, 'id') nameCol = colName(Name, 'name') imdbindexCol = colName(Name, 'imdbIndex') CURS.execute('SELECT %s, %s, %s FROM %s;' % (personidCol, nameCol, imdbindexCol, nameTbl)) _oldcacheValues = Name.sqlmeta.cacheValues Name.sqlmeta.cacheValues = False for x in fetchsome(CURS, self.flushEvery): nd = {'name': x[1]} if x[2]: nd['imdbIndex'] = x[2] name = build_name(nd) dict.__setitem__(self, name, x[0]) self.counter = counter(Name.select().count() + 1) Name.sqlmeta.cacheValues = _oldcacheValues def _toDB(self, quiet=0): if not quiet: print(' * FLUSHING PersonsCache...') sys.stdout.flush() l = [] lapp = l.append for k, v in self._tmpDict.items(): try: t = analyze_name(k) except IMDbParserError: if k and k.strip(): print('WARNING PersonsCache._toDB() invalid name:', _(k)) continue tget = t.get name = tget('name') namePcodeCf, namePcodeNf, surnamePcode = name_soundexes(name) gender = self.personGender.get(v) lapp((v, name, tget('imdbIndex'), None, gender, namePcodeCf, namePcodeNf, surnamePcode, md5(k.encode('latin1')).hexdigest())) if not CSV_DIR: CURS.executemany(self.sqlstr, self.converter(l)) else: CSV_CURS.executemany(self.sqlstr, l) class CharactersCache(_BaseCache): """Manage the characters list.""" counter = counter() className = 'CharactersCache' def __init__(self, *args, **kwds): _BaseCache.__init__(self, *args, **kwds) self._table_name = tableName(CharName) self._id_for_custom_q = 'CHARACTERS' self.sqlstr, self.converter = createSQLstr(CharName, ['id', 'name', 'imdbIndex', 'imdbID', 'namePcodeNf', 'surnamePcode', 'md5sum']) def populate(self): print(' * POPULATING CharactersCache...') nameTbl = tableName(CharName) personidCol = colName(CharName, 'id') nameCol = colName(CharName, 'name') imdbindexCol = colName(CharName, 'imdbIndex') CURS.execute('SELECT %s, %s, %s FROM %s;' % (personidCol, nameCol, imdbindexCol, nameTbl)) _oldcacheValues = CharName.sqlmeta.cacheValues CharName.sqlmeta.cacheValues = False for x in 
fetchsome(CURS, self.flushEvery): nd = {'name': x[1]} if x[2]: nd['imdbIndex'] = x[2] name = build_name(nd) dict.__setitem__(self, name, x[0]) self.counter = counter(CharName.select().count() + 1) CharName.sqlmeta.cacheValues = _oldcacheValues def _toDB(self, quiet=0): if not quiet: print(' * FLUSHING CharactersCache...') sys.stdout.flush() l = [] lapp = l.append for k, v in self._tmpDict.items(): try: t = analyze_name(k) except IMDbParserError: if k and k.strip(): print('WARNING CharactersCache._toDB() invalid name:', _(k)) continue tget = t.get name = tget('name') namePcodeCf, namePcodeNf, surnamePcode = name_soundexes(name, character=True) lapp((v, name, tget('imdbIndex'), None, namePcodeCf, surnamePcode, md5(k.encode('latin1')).hexdigest())) if not CSV_DIR: CURS.executemany(self.sqlstr, self.converter(l)) else: CSV_CURS.executemany(self.sqlstr, l) class CompaniesCache(_BaseCache): """Manage the companies list.""" counter = counter() className = 'CompaniesCache' def __init__(self, *args, **kwds): _BaseCache.__init__(self, *args, **kwds) self._table_name = tableName(CompanyName) self._id_for_custom_q = 'COMPANIES' self.sqlstr, self.converter = createSQLstr(CompanyName, ['id', 'name', 'countryCode', 'imdbID', 'namePcodeNf', 'namePcodeSf', 'md5sum']) def populate(self): print(' * POPULATING CompaniesCache...') nameTbl = tableName(CompanyName) companyidCol = colName(CompanyName, 'id') nameCol = colName(CompanyName, 'name') countryCodeCol = colName(CompanyName, 'countryCode') CURS.execute('SELECT %s, %s, %s FROM %s;' % (companyidCol, nameCol, countryCodeCol, nameTbl)) _oldcacheValues = CompanyName.sqlmeta.cacheValues CompanyName.sqlmeta.cacheValues = False for x in fetchsome(CURS, self.flushEvery): nd = {'name': x[1]} if x[2]: nd['country'] = x[2] name = build_company_name(nd) dict.__setitem__(self, name, x[0]) self.counter = counter(CompanyName.select().count() + 1) CompanyName.sqlmeta.cacheValues = _oldcacheValues def _toDB(self, quiet=0): if not quiet: print(' * 
FLUSHING CompaniesCache...') sys.stdout.flush() l = [] lapp = l.append for k, v in self._tmpDict.items(): try: t = analyze_company_name(k) except IMDbParserError: if k and k.strip(): print('WARNING CompaniesCache._toDB() invalid name:', _(k)) continue tget = t.get name = tget('name') namePcodeNf = soundex(name) namePcodeSf = None country = tget('country') if k != name: namePcodeSf = soundex(k) lapp((v, name, country, None, namePcodeNf, namePcodeSf, md5(k.encode('latin1')).hexdigest())) if not CSV_DIR: CURS.executemany(self.sqlstr, self.converter(l)) else: CSV_CURS.executemany(self.sqlstr, l) class KeywordsCache(_BaseCache): """Manage the list of keywords.""" counter = counter() className = 'KeywordsCache' def __init__(self, *args, **kwds): _BaseCache.__init__(self, *args, **kwds) self._table_name = tableName(Keyword) self._id_for_custom_q = 'KEYWORDS' self.flushEvery = 10000 self.sqlstr, self.converter = createSQLstr(Keyword, ['id', 'keyword', 'phoneticCode']) def populate(self): print(' * POPULATING KeywordsCache...') nameTbl = tableName(Keyword) keywordidCol = colName(Keyword, 'id') keyCol = colName(Keyword, 'keyword') CURS.execute('SELECT %s, %s FROM %s;' % (keywordidCol, keyCol, nameTbl)) _oldcacheValues = Keyword.sqlmeta.cacheValues Keyword.sqlmeta.cacheValues = False for x in fetchsome(CURS, self.flushEvery): dict.__setitem__(self, x[1], x[0]) self.counter = counter(Keyword.select().count() + 1) Keyword.sqlmeta.cacheValues = _oldcacheValues def _toDB(self, quiet=0): if not quiet: print(' * FLUSHING KeywordsCache...') sys.stdout.flush() l = [] lapp = l.append for k, v in self._tmpDict.items(): keySoundex = soundex(k) lapp((v, k, keySoundex)) if not CSV_DIR: CURS.executemany(self.sqlstr, self.converter(l)) else: CSV_CURS.executemany(self.sqlstr, l) class SQLData(dict): """Variable set of information, to be stored from time to time to the SQL database.""" def __init__(self, table=None, cols=None, sqlString='', converter=None, d={}, flushEvery=20000, 
counterInit=1): if not sqlString: if not (table and cols): raise TypeError('"table" or "cols" unspecified') sqlString, converter = createSQLstr(table, cols) elif converter is None: raise TypeError('"sqlString" or "converter" unspecified') dict.__init__(self) self.counterInit = counterInit self.counter = counterInit self.flushEvery = flushEvery self.sqlString = sqlString self.converter = converter self._recursionLevel = 1 self._table = table self._table_name = tableName(table) for k, v in list(d.items()): self[k] = v def __setitem__(self, key, value): """The value is discarded, the counter is used as the 'real' key and the user's 'key' is used as its values.""" counter = self.counter if counter % self.flushEvery == 0: self.flush() dict.__setitem__(self, counter, key) self.counter += 1 def add(self, key): self[key] = None def flush(self, _resetRecursion=1): if not self: return # XXX: it's safer to flush MoviesCache and PersonsCache, to preserve consistency CACHE_MID.flush(quiet=1) CACHE_PID.flush(quiet=1) if _resetRecursion: self._recursionLevel = 1 if self._recursionLevel >= MAX_RECURSION: print('WARNING recursion level exceeded trying to flush data') print('WARNING this batch of data is lost.') self.clear() self.counter = self.counterInit return keys = {'table': self._table_name} _after_has_run = False try: executeCustomQueries('BEFORE_SQLDATA_TODB', _keys=keys, _timeit=False) self._toDB() executeCustomQueries('AFTER_SQLDATA_TODB', _keys=keys, _timeit=False) _after_has_run = True self.clear() self.counter = self.counterInit except OperationalError as e: if not _after_has_run: executeCustomQueries('AFTER_SQLDATA_TODB', _keys=keys, _timeit=False) print(' * TOO MANY DATA (%s items), SPLITTING (run #%d)...' 
% (len(self), self._recursionLevel)) self._recursionLevel += 1 newdata = self.__class__(table=self._table, sqlString=self.sqlString, converter=self.converter) newdata._recursionLevel = self._recursionLevel newflushEvery = self.flushEvery // 2 if newflushEvery < 1: print('WARNING recursion level exceeded trying to flush data') print('WARNING this batch of data is lost.') self.clear() self.counter = self.counterInit return self.flushEvery = newflushEvery newdata.flushEvery = newflushEvery popitem = self.popitem dsi = dict.__setitem__ for x in range(len(self) // 2): k, v = popitem() dsi(newdata, k, v) newdata.flush(_resetRecursion=0) del newdata self.flush(_resetRecursion=0) self.clear() self.counter = self.counterInit except Exception as e: if isinstance(e, KeyboardInterrupt): raise print('WARNING: SQLData; unknown exception caught committing the data') print('WARNING: to the database; report this as a bug, since') print('WARNING: many data (%d items) were lost: %s' % (len(self), e)) connectObject.commit() def _toDB(self): print(' * FLUSHING SQLData...') if not CSV_DIR: CURS.executemany(self.sqlString, self.converter(list(self.values()))) else: CSV_CURS.executemany(self.sqlString, list(self.values())) # Miscellaneous functions. def unpack(line, headers, sep='\t'): """Given a line, split it at sep and return a dictionary with keys taken from the headers list. 
E.g.: line = '  0000000124  8805  8.4  Incredibles, The (2004)' headers = ('votes distribution', 'votes', 'rating', 'title') sep = '  ' returns: {'votes distribution': '0000000124', 'votes': '8805', 'rating': '8.4', 'title': 'Incredibles, The (2004)'} """ r = {} ls1 = [_f for _f in line.split(sep) if _f] for index, item in enumerate(ls1): try: name = headers[index] except IndexError: name = 'item%s' % index r[name] = item.strip() return r def _parseMinusList(fdata): """Parse a list of lines starting with '- '.""" rlist = [] tmplist = [] for line in fdata: if line and line[:2] == '- ': if tmplist: rlist.append(' '.join(tmplist)) l = line[2:].strip() if l: tmplist[:] = [l] else: tmplist[:] = [] else: l = line.strip() if l: tmplist.append(l) if tmplist: rlist.append(' '.join(tmplist)) return rlist def _parseColonList(lines, replaceKeys): """Parser for lists with "TAG: value" strings.""" out = {} for line in lines: line = line.strip() if not line: continue cols = line.split(':', 1) if len(cols) < 2: continue k = cols[0] k = replaceKeys.get(k, k) v = ' '.join(cols[1:]).strip() if k not in out: out[k] = [] out[k].append(v) return out # Functions used to manage data files. def readMovieList(): """Read the movies.list.gz file.""" try: mdbf = SourceFile(MOVIES, start=MOVIES_START, stop=MOVIES_STOP) except IOError: return count = 0 for line in mdbf: line_d = unpack(line, ('title', 'year')) title = line_d['title'] yearData = None # Collect 'year' column for tv "series years" and episodes' year. 
if title[0] == '"': yearData = [('movieYear', line_d['year'])] mid = CACHE_MID.addUnique(title, yearData) if mid is None: continue if count % 10000 == 0: print('SCANNING movies:', _(title), end=' ') print('(movieID: %s)' % mid) count += 1 CACHE_MID.flush() CACHE_MID.movieYear.clear() mdbf.close() def doCast(fp, roleid, rolename): """Populate the cast table.""" pid = None count = 0 name = '' roleidVal = RawValue('roleID', roleid) sqldata = SQLData(table=CastInfo, cols=['personID', 'movieID', 'personRoleID', 'note', 'nrOrder', roleidVal]) if rolename == 'miscellaneous crew': sqldata.flushEvery = 10000 for line in fp: if line and line[0] != '\t': if line[0] == '\n': continue sl = [_f for _f in line.split('\t') if _f] if len(sl) != 2: continue name, line = sl miscData = None if rolename == 'actor': miscData = [('personGender', 'm')] elif rolename == 'actress': miscData = [('personGender', 'f')] pid = CACHE_PID.addUnique(name.strip(), miscData) line = line.strip() ll = line.split('  ') title = ll[0] note = None role = None order = None for item in ll[1:]: if not item: continue if item[0] == '[': # Quite inefficient, but there are some very strange # cases of garbage in the plain text data files to handle... 
role = item[1:] if role[-1:] == ']': role = role[:-1] if role[-1:] == ')': nidx = role.find('(') if nidx != -1: note = role[nidx:] role = role[:nidx].rstrip() if not role: role = None elif item[0] == '(': if note is None: note = item else: note = '%s %s' % (note, item) elif item[0] == '<': textor = item[1:-1] try: order = int(textor) except ValueError: os = textor.split(',') if len(os) == 3: try: order = ((int(os[2]) - 1) * 1000) + \ ((int(os[1]) - 1) * 100) + (int(os[0]) - 1) except ValueError: pass movieid = CACHE_MID.addUnique(title) if movieid is None: continue if role is not None: roles = [_f for _f in [x.strip() for x in role.split('/')] if _f] for role in roles: cid = CACHE_CID.addUnique(role) sqldata.add((pid, movieid, cid, note, order)) else: sqldata.add((pid, movieid, None, note, order)) if count % 10000 == 0: print('SCANNING %s:' % rolename, end=' ') print(_(name)) count += 1 sqldata.flush() CACHE_PID.flush() CACHE_PID.personGender.clear() CACHE_CID.flush() print('CLOSING %s...' 
% rolename) def castLists(): """Read files listed in the 'role' column of the 'roletypes' table.""" rt = [(x.id, x.role) for x in RoleType.select()] for roleid, rolename in rt: if rolename == 'guest': continue fname = rolename fname = fname.replace(' ', '-') if fname == 'actress': fname = 'actresses.list.gz' elif fname == 'miscellaneous-crew': fname = 'miscellaneous.list.gz' else: fname = fname + 's.list.gz' print('DOING', fname) try: f = SourceFile(fname, start=CAST_START, stop=CAST_STOP) except IOError: if rolename == 'actress': CACHE_CID.flush() if not CSV_DIR: CACHE_CID.clear() continue doCast(f, roleid, rolename) f.close() if rolename == 'actress': CACHE_CID.flush() if not CSV_DIR: CACHE_CID.clear() t('castLists(%s)' % rolename) def doAkaNames(): """People's akas.""" pid = None count = 0 try: fp = SourceFile('aka-names.list.gz', start=AKAN_START) except IOError: return sqldata = SQLData(table=AkaName, cols=['personID', 'name', 'imdbIndex', 'namePcodeCf', 'namePcodeNf', 'surnamePcode', 'md5sum']) for line in fp: if line and line[0] != ' ': if line[0] == '\n': continue pid = CACHE_PID.addUnique(line.strip()) else: line = line.strip() if line[:5] == '(aka ': line = line[5:] if line[-1:] == ')': line = line[:-1] try: name_dict = analyze_name(line) except IMDbParserError: if line: print('WARNING doAkaNames wrong name:', _(line)) continue name = name_dict.get('name') namePcodeCf, namePcodeNf, surnamePcode = name_soundexes(name) sqldata.add((pid, name, name_dict.get('imdbIndex'), namePcodeCf, namePcodeNf, surnamePcode, md5(line.encode('latin1')).hexdigest())) if count % 10000 == 0: print('SCANNING akanames:', _(line)) count += 1 sqldata.flush() fp.close() class AkasMoviesCache(MoviesCache): """A MoviesCache-like class used to populate the AkaTitle table.""" className = 'AkasMoviesCache' counter = counter() def __init__(self, *args, **kdws): MoviesCache.__init__(self, *args, **kdws) self.flushEvery = 50000 self._mapsIDsToTitles = True self.notes = {} self.ids = {} 
self._table_name = tableName(AkaTitle) self._id_for_custom_q = 'AKAMOVIES' self.sqlstr, self.converter = createSQLstr(AkaTitle, ('id', 'movieID', 'title', 'imdbIndex', 'kindID', 'productionYear', 'phoneticCode', 'episodeOfID', 'seasonNr', 'episodeNr', 'note', 'md5sum')) def flush(self, *args, **kwds): CACHE_MID.flush(quiet=1) super(AkasMoviesCache, self).flush(*args, **kwds) def _runCommand(self, dataList): new_dataList = [] new_dataListapp = new_dataList.append while dataList: item = list(dataList.pop()) # Remove the imdbID. del item[5] # id used to store this entry. the_id = item[0] # id of the referred title. original_title_id = self.ids.get(the_id) or 0 new_item = [the_id, original_title_id] md5sum = item[-1] new_item += item[1:-2] new_item.append(self.notes.get(the_id)) new_item.append(md5sum) new_dataListapp(tuple(new_item)) new_dataList.reverse() if not CSV_DIR: CURS.executemany(self.sqlstr, self.converter(new_dataList)) else: CSV_CURS.executemany(self.sqlstr, new_dataList) CACHE_MID_AKAS = AkasMoviesCache() def doAkaTitles(): """Movies' akas.""" mid = None count = 0 for fname, start in (('aka-titles.list.gz', AKAT_START), ('italian-aka-titles.list.gz', AKAT_IT_START), ('german-aka-titles.list.gz', AKAT_DE_START), ('iso-aka-titles.list.gz', AKAT_ISO_START), (os.path.join('contrib', 'hungarian-aka-titles.list.gz'), AKAT_HU_START), (os.path.join('contrib', 'norwegian-aka-titles.list.gz'), AKAT_NO_START)): incontrib = 0 pwarning = 1 # Looks like that the only up-to-date AKA file is aka-titles. obsolete = False if fname != 'aka-titles.list.gz': obsolete = True if start in (AKAT_HU_START, AKAT_NO_START): pwarning = 0 incontrib = 1 try: fp = SourceFile(fname, start=start, stop='---------------------------', pwarning=pwarning) except IOError: continue isEpisode = False seriesID = None doNotAdd = False for line in fp: if line and line[0] != ' ': # Reading the official title. 
doNotAdd = False if line[0] == '\n': continue line = line.strip() if obsolete: try: tonD = analyze_title(line) except IMDbParserError: if line: print('WARNING doAkaTitles(obsol O) invalid title:', end=' ') print(_(line)) continue tonD['title'] = normalizeTitle(tonD['title']) line = build_title(tonD, ptdf=True) # Aka information for titles in obsolete files are # added only if the movie already exists in the cache. if line not in CACHE_MID: doNotAdd = True continue mid = CACHE_MID.addUnique(line) if mid is None: continue if line[0] == '"': try: titleDict = analyze_title(line) except IMDbParserError: if line: print('WARNING doAkaTitles (O) invalid title:', end=' ') print(_(line)) continue if 'episode of' in titleDict: if obsolete: titleDict['episode of']['title'] = \ normalizeTitle(titleDict['episode of']['title']) series = build_title(titleDict['episode of'], ptdf=True) seriesID = CACHE_MID.addUnique(series) if seriesID is None: continue isEpisode = True else: seriesID = None isEpisode = False else: seriesID = None isEpisode = False else: # Reading an aka title. if obsolete and doNotAdd: continue res = unpack(line.strip(), ('title', 'note')) note = res.get('note') if incontrib: if res.get('note'): note += ' ' else: note = '' if start == AKAT_HU_START: note += '(Hungary)' elif start == AKAT_NO_START: note += '(Norway)' akat = res.get('title', '') if akat[:5] == '(aka ': akat = akat[5:] if akat[-2:] in ('))', '})'): akat = akat[:-1] akat = akat.strip() if not akat: continue if obsolete: try: akatD = analyze_title(akat) except IMDbParserError: if line: print('WARNING doAkaTitles(obsol) invalid title:', end=' ') print(_(akat)) continue akatD['title'] = normalizeTitle(akatD['title']) akat = build_title(akatD, ptdf=True) if count % 10000 == 0: print('SCANNING %s:' % fname[:-8].replace('-', ' '), end=' ') print(_(akat)) if isEpisode and seriesID is not None: # Handle series for which only single episodes have # aliases. 
try: akaDict = analyze_title(akat) except IMDbParserError: if line: print('WARNING doAkaTitles (epis) invalid title:', end=' ') print(_(akat)) continue if 'episode of' in akaDict: if obsolete: akaDict['episode of']['title'] = normalizeTitle( akaDict['episode of']['title']) akaSeries = build_title(akaDict['episode of'], ptdf=True) CACHE_MID_AKAS.add(akaSeries, [('ids', seriesID)]) append_data = [('ids', mid)] if note is not None: append_data.append(('notes', note)) CACHE_MID_AKAS.add(akat, append_data) count += 1 fp.close() CACHE_MID_AKAS.flush() CACHE_MID_AKAS.clear() CACHE_MID_AKAS.notes.clear() CACHE_MID_AKAS.ids.clear() def doMovieLinks(): """Connections between movies.""" mid = None count = 0 sqldata = SQLData(table=MovieLink, cols=['movieID', 'linkedMovieID', 'linkTypeID'], flushEvery=10000) try: fp = SourceFile('movie-links.list.gz', start=LINK_START) except IOError: return for line in fp: if line and line[0] != ' ': if line[0] == '\n': continue title = line.strip() mid = CACHE_MID.addUnique(title) if mid is None: continue if count % 10000 == 0: print('SCANNING movielinks:', _(title)) else: if mid is None: continue link_txt = line = line.strip() theid = None for k, lenkp1, v in MOVIELINK_IDS: if link_txt and link_txt[0] == '(' \ and link_txt[1:lenkp1 + 1] == k: theid = v break if theid is None: continue totitle = line[lenkp1 + 2:-1].strip() totitleid = CACHE_MID.addUnique(totitle) if totitleid is None: continue sqldata.add((mid, totitleid, theid)) count += 1 sqldata.flush() fp.close() def minusHashFiles(fp, funct, defaultid, descr): """A file with lines starting with '# ' and '- '.""" sqldata = SQLData(table=MovieInfo, cols=['movieID', 'infoTypeID', 'info', 'note']) sqldata.flushEvery = 2500 if descr == 'quotes': sqldata.flushEvery = 4000 elif descr == 'soundtracks': sqldata.flushEvery = 3000 elif descr == 'trivia': sqldata.flushEvery = 3000 count = 0 for title, text in fp.getByHashSections(): title = title.strip() d = funct(text.split('\n')) if not d: 
print('WARNING skipping empty information about title:', end=' ') print(_(title)) continue if not title: print('WARNING skipping information associated to empty title:', end=' ') print(_(d[0], truncateAt=40)) continue mid = CACHE_MID.addUnique(title) if mid is None: continue if count % 5000 == 0: print('SCANNING %s:' % descr, end=' ') print(_(title)) for data in d: sqldata.add((mid, defaultid, data, None)) count += 1 sqldata.flush() def doMinusHashFiles(): """Files with lines starting with '# ' and '- '.""" for fname, start in [('alternate versions', AV_START), ('goofs', GOOFS_START), ('crazy credits', CC_START), ('quotes', QUOTES_START), ('soundtracks', SNDT_START), ('trivia', TRIV_START)]: try: fp = SourceFile(fname.replace(' ', '-') + '.list.gz', start=start, stop=MINHASH_STOP) except IOError: continue funct = _parseMinusList if fname == 'quotes': funct = getQuotes index = fname if index == 'soundtracks': index = 'soundtrack' minusHashFiles(fp, funct, INFO_TYPES[index], fname) fp.close() def getTaglines(): """Movie's taglines.""" try: fp = SourceFile('taglines.list.gz', start=TAGL_START, stop=TAGL_STOP) except IOError: return sqldata = SQLData(table=MovieInfo, cols=['movieID', 'infoTypeID', 'info', 'note'], flushEvery=10000) count = 0 for title, text in fp.getByHashSections(): title = title.strip() mid = CACHE_MID.addUnique(title) if mid is None: continue for tag in text.split('\n'): tag = tag.strip() if not tag: continue if count % 10000 == 0: print('SCANNING taglines:', _(title)) sqldata.add((mid, INFO_TYPES['taglines'], tag, None)) count += 1 sqldata.flush() fp.close() def getQuotes(lines): """Movie's quotes.""" quotes = [] qttl = [] for line in lines: if line and line[:2] == '  ' and qttl and qttl[-1] and \ not qttl[-1].endswith('::'): line = line.lstrip() if line: qttl[-1] += ' %s' % line elif not line.strip(): if qttl: quotes.append('::'.join(qttl)) qttl[:] = [] else: line = line.lstrip() if line: qttl.append(line) if qttl: quotes.append('::'.join(qttl)) 
return quotes _bus = {'BT': 'budget', 'WG': 'weekend gross', 'GR': 'gross', 'OW': 'opening weekend', 'RT': 'rentals', 'AD': 'admissions', 'SD': 'filming dates', 'PD': 'production dates', 'ST': 'studios', 'CP': 'copyright holder' } _usd = '$' _gbp = chr(0x00a3) _eur = chr(0x20ac) def getBusiness(lines): """Movie's business information.""" bd = _parseColonList(lines, _bus) for k in list(bd.keys()): nv = [] for v in bd[k]: v = v.replace('USD ', _usd).replace('GBP ', _gbp).replace('EUR', _eur) nv.append(v) bd[k] = nv return bd _ldk = {'OT': 'original title', 'PC': 'production country', 'YR': 'year', 'CF': 'certification', 'CA': 'category', 'GR': 'group genre', 'LA': 'language', 'SU': 'subtitles', 'LE': 'length', 'RD': 'release date', 'ST': 'status of availablility', 'PR': 'official retail price', 'RC': 'release country', 'VS': 'video standard', 'CO': 'color information', 'SE': 'sound encoding', 'DS': 'digital sound', 'AL': 'analog left', 'AR': 'analog right', 'MF': 'master format', 'PP': 'pressing plant', 'SZ': 'disc size', 'SI': 'number of sides', 'DF': 'disc format', 'PF': 'picture format', 'AS': 'aspect ratio', 'CC': 'close captions-teletext-ld-g', 'CS': 'number of chapter stops', 'QP': 'quality program', 'IN': 'additional information', 'SL': 'supplement', 'RV': 'review', 'V1': 'quality of source', 'V2': 'contrast', 'V3': 'color rendition', 'V4': 'sharpness', 'V5': 'video noise', 'V6': 'video artifacts', 'VQ': 'video quality', 'A1': 'frequency response', 'A2': 'dynamic range', 'A3': 'spaciality', 'A4': 'audio noise', 'A5': 'dialogue intellegibility', 'AQ': 'audio quality', 'LN': 'number', 'LB': 'label', 'CN': 'catalog number', 'LT': 'laserdisc title' } # Handle laserdisc keys. 
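# Illustrative sketch, not part of the original script: it shows the effect
# of the "Handle laserdisc keys" loop on a hypothetical two-entry subset of
# _ldk. Every label gains an 'LD ' prefix, so the parsed laserdisc data is
# presumably keyed by labels such as 'LD original title'. The names
# _ldk_example and _ldk_prefixed are assumptions made only for this example.
_ldk_example = {'OT': 'original title', 'YR': 'year'}
_ldk_prefixed = dict((key, 'LD %s' % value)
                     for key, value in _ldk_example.items())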
for key, value in list(_ldk.items()): _ldk[key] = 'LD %s' % value def getLaserDisc(lines): """Laserdisc information.""" d = _parseColonList(lines, _ldk) for k, v in d.items(): d[k] = ' '.join(v) return d _lit = {'SCRP': 'screenplay-teleplay', 'NOVL': 'novel', 'ADPT': 'adaption', 'BOOK': 'book', 'PROT': 'production process protocol', 'IVIW': 'interviews', 'CRIT': 'printed media reviews', 'ESSY': 'essays', 'OTHR': 'other literature' } def getLiterature(lines): """Movie's literature information.""" return _parseColonList(lines, _lit) _mpaa = {'RE': 'mpaa'} def getMPAA(lines): """Movie's mpaa information.""" d = _parseColonList(lines, _mpaa) for k, v in d.items(): d[k] = ' '.join(v) return d re_nameImdbIndex = re.compile(r'\(([IVXLCDM]+)\)') def nmmvFiles(fp, funct, fname): """Files with sections separated by 'MV: ' or 'NM: '.""" count = 0 sqlsP = (PersonInfo, ['personID', 'infoTypeID', 'info', 'note']) sqlsM = (MovieInfo, ['movieID', 'infoTypeID', 'info', 'note']) if fname == 'biographies.list.gz': datakind = 'person' sqls = sqlsP guestid = RoleType.select(RoleType.q.role == 'guest')[0].id roleid = str(guestid) guestdata = SQLData(table=CastInfo, cols=['personID', 'movieID', 'personRoleID', 'note', RawValue('roleID', roleid)], flushEvery=10000) akanamesdata = SQLData(table=AkaName, cols=['personID', 'name', 'imdbIndex', 'namePcodeCf', 'namePcodeNf', 'surnamePcode', 'md5sum']) else: datakind = 'movie' sqls = sqlsM guestdata = None akanamesdata = None sqldata = SQLData(table=sqls[0], cols=sqls[1]) if fname == 'plot.list.gz': sqldata.flushEvery = 1100 elif fname == 'literature.list.gz': sqldata.flushEvery = 5000 elif fname == 'business.list.gz': sqldata.flushEvery = 10000 elif fname == 'biographies.list.gz': sqldata.flushEvery = 5000 islaserdisc = False if fname == 'laserdisc.list.gz': islaserdisc = True _ltype = type([]) for ton, text in fp.getByNMMVSections(): ton = ton.strip() if not ton: continue note = None if datakind == 'movie': if islaserdisc: tonD = 
analyze_title(ton) tonD['title'] = normalizeTitle(tonD['title']) ton = build_title(tonD, ptdf=True) # Skips movies that are not already in the cache, since # laserdisc.list.gz is an obsolete file. if ton not in CACHE_MID: continue mopid = CACHE_MID.addUnique(ton) if mopid is None: continue else: mopid = CACHE_PID.addUnique(ton) if count % 6000 == 0: print('SCANNING %s:' % fname[:-8].replace('-', ' '), end=' ') print(_(ton)) d = funct(text.split('\n')) for k, v in d.items(): if k != 'notable tv guest appearances': theid = INFO_TYPES.get(k) if theid is None: print('WARNING key "%s" of ToN' % k, end=' ') print(_(ton), end=' ') print('not in INFO_TYPES') continue if type(v) is _ltype: for i in v: if k == 'notable tv guest appearances': # Put "guest" information in the cast table; these # are a list of Movie object (yes, imdb.Movie.Movie) # FIXME: no more used? title = i.get('long imdb canonical title') if not title: continue movieid = CACHE_MID.addUnique(title) if movieid is None: continue crole = i.currentRole if isinstance(crole, list): crole = ' / '.join([x.get('long imdb name', '') for x in crole]) if not crole: crole = None guestdata.add((mopid, movieid, crole, i.notes or None)) continue if k in ('plot', 'mini biography'): s = i.split('::') if len(s) == 2: note = s[1] i = s[0] if i: sqldata.add((mopid, theid, i, note)) note = None else: if v: sqldata.add((mopid, theid, v, note)) if k in ('nick names', 'birth name') and v: # Put also the birth name/nick names in the list of aliases. if k == 'birth name': realnames = [v] else: realnames = v for realname in realnames: imdbIndex = re_nameImdbIndex.findall(realname) or None if imdbIndex: imdbIndex = imdbIndex[0] realname = re_nameImdbIndex.sub('', realname) if realname: # XXX: check for duplicates? 
# if k == 'birth name': # realname = canonicalName(realname) # else: # realname = normalizeName(realname) namePcodeCf, namePcodeNf, surnamePcode = \ name_soundexes(realname) akanamesdata.add((mopid, realname, imdbIndex, namePcodeCf, namePcodeNf, surnamePcode, md5(realname.encode('latin1')).hexdigest())) count += 1 if guestdata is not None: guestdata.flush() if akanamesdata is not None: akanamesdata.flush() sqldata.flush() # ============ # Code from the old 'local' data access system. def _parseList(l, prefix, mline=1): """Given a list of lines l, strips prefix and join consecutive lines with the same prefix; if mline is True, there can be multiple info with the same prefix, and the first line starts with 'prefix: * '.""" resl = [] reslapp = resl.append ltmp = [] ltmpapp = ltmp.append fistl = '%s: * ' % prefix otherl = '%s: ' % prefix if not mline: fistl = fistl[:-2] otherl = otherl[:-2] firstlen = len(fistl) otherlen = len(otherl) parsing = 0 joiner = ' '.join for line in l: if line[:firstlen] == fistl: parsing = 1 if ltmp: reslapp(joiner(ltmp)) ltmp[:] = [] data = line[firstlen:].strip() if data: ltmpapp(data) elif mline and line[:otherlen] == otherl: data = line[otherlen:].strip() if data: ltmpapp(data) else: if ltmp: reslapp(joiner(ltmp)) ltmp[:] = [] if parsing: if ltmp: reslapp(joiner(ltmp)) break return resl def _parseBioBy(l): """Return a list of biographies.""" bios = [] biosappend = bios.append tmpbio = [] tmpbioappend = tmpbio.append joiner = ' '.join for line in l: if line[:4] == 'BG: ': tmpbioappend(line[4:].strip()) elif line[:4] == 'BY: ': if tmpbio: biosappend(joiner(tmpbio) + '::' + line[4:].strip()) tmpbio[:] = [] # Cut mini biographies up to 2**16-1 chars, to prevent errors with # some MySQL versions - when used by the imdbpy2sql.py script. 
bios[:] = [bio[:65535] for bio in bios] return bios def _parseBiography(biol): """Parse the biographies.data file.""" res = {} bio = _parseBioBy(biol) if bio: res['mini biography'] = bio for x in biol: x4 = x[:4] x6 = x[:6] if x4 == 'DB: ': date, notes = date_and_notes(x[4:]) if date: res['birth date'] = date if notes: res['birth notes'] = notes elif x4 == 'DD: ': date, notes = date_and_notes(x[4:]) if date: res['death date'] = date if notes: res['death notes'] = notes elif x6 == 'SP: * ': res.setdefault('spouse', []).append(x[6:].strip()) elif x4 == 'RN: ': n = x[4:].strip() if not n: continue try: rn = build_name(analyze_name(n, canonical=1), canonical=1) res['birth name'] = rn except IMDbParserError: print('WARNING _parseBiography wrong name:', _(n)) continue elif x6 == 'AT: * ': res.setdefault('article', []).append(x[6:].strip()) elif x4 == 'HT: ': res['height'] = x[4:].strip() elif x6 == 'PT: * ': res.setdefault('pictorial', []).append(x[6:].strip()) elif x6 == 'CV: * ': res.setdefault('magazine cover photo', []).append(x[6:].strip()) elif x4 == 'NK: ': res.setdefault('nick names', []).append(normalizeName(x[4:])) elif x6 == 'PI: * ': res.setdefault('portrayed in', []).append(x[6:].strip()) elif x6 == 'SA: * ': sal = x[6:].strip().replace(' -> ', '::') res.setdefault('salary history', []).append(sal) trl = _parseList(biol, 'TR') if trl: res['trivia'] = trl quotes = _parseList(biol, 'QU') if quotes: res['quotes'] = quotes otherworks = _parseList(biol, 'OW') if otherworks: res['other works'] = otherworks books = _parseList(biol, 'BO') if books: res['books'] = books agent = _parseList(biol, 'AG') if agent: res['agent address'] = agent wherenow = _parseList(biol, 'WN') if wherenow: res['where now'] = wherenow[0] biomovies = _parseList(biol, 'BT') if biomovies: res['biographical movies'] = biomovies tm = _parseList(biol, 'TM') if tm: res['trade mark'] = tm interv = _parseList(biol, 'IT') if interv: 
res['interviews'] = interv return res # ============ def doNMMVFiles(): """Files with large sections, about movies and persons.""" for fname, start, funct in [ ('biographies.list.gz', BIO_START, _parseBiography), ('business.list.gz', BUS_START, getBusiness), ('laserdisc.list.gz', LSD_START, getLaserDisc), ('literature.list.gz', LIT_START, getLiterature), ('mpaa-ratings-reasons.list.gz', MPAA_START, getMPAA), ('plot.list.gz', PLOT_START, getPlot)]: try: fp = SourceFile(fname, start=start) except IOError: continue if fname == 'literature.list.gz': fp.set_stop(LIT_STOP) elif fname == 'business.list.gz': fp.set_stop(BUS_STOP) nmmvFiles(fp, funct, fname) fp.close() t('doNMMVFiles(%s)' % fname[:-8].replace('-', ' ')) def doMovieCompaniesInfo(): """Files with information on a single line about movies, concerning companies.""" sqldata = SQLData(table=MovieCompanies, cols=['movieID', 'companyID', 'companyTypeID', 'note']) for dataf in (('distributors.list.gz', DIS_START), ('miscellaneous-companies.list.gz', MIS_START), ('production-companies.list.gz', PRO_START), ('special-effects-companies.list.gz', SFX_START)): try: fp = SourceFile(dataf[0], start=dataf[1]) except IOError: continue typeindex = dataf[0][:-8].replace('-', ' ') infoid = COMP_TYPES[typeindex] count = 0 for line in fp: data = unpack(line.strip(), ('title', 'company', 'note')) if 'title' not in data: continue if 'company' not in data: continue title = data['title'] company = data['company'] mid = CACHE_MID.addUnique(title) if mid is None: continue cid = CACHE_COMPID.addUnique(company) note = None if 'note' in data: note = data['note'] if count % 10000 == 0: print('SCANNING %s:' % dataf[0][:-8].replace('-', ' '), end=' ') print(_(data['title'])) sqldata.add((mid, cid, infoid, note)) count += 1 sqldata.flush() CACHE_COMPID.flush() fp.close() t('doMovieCompaniesInfo(%s)' % dataf[0][:-8].replace('-', ' ')) def doMiscMovieInfo(): """Files with information on a single line about movies.""" for dataf in 
(('certificates.list.gz', CER_START), ('color-info.list.gz', COL_START), ('countries.list.gz', COU_START), ('genres.list.gz', GEN_START), ('keywords.list.gz', KEY_START), ('language.list.gz', LAN_START), ('locations.list.gz', LOC_START), ('running-times.list.gz', RUN_START), ('sound-mix.list.gz', SOU_START), ('technical.list.gz', TCN_START), ('release-dates.list.gz', RELDATE_START)): try: fp = SourceFile(dataf[0], start=dataf[1]) except IOError: continue typeindex = dataf[0][:-8].replace('-', ' ') if typeindex == 'running times': typeindex = 'runtimes' elif typeindex == 'technical': typeindex = 'tech info' elif typeindex == 'language': typeindex = 'languages' if typeindex != 'keywords': sqldata = SQLData(table=MovieInfo, cols=['movieID', 'infoTypeID', 'info', 'note']) else: sqldata = SQLData(table=MovieKeyword, cols=['movieID', 'keywordID']) infoid = INFO_TYPES[typeindex] count = 0 if dataf[0] == 'locations.list.gz': sqldata.flushEvery = 10000 else: sqldata.flushEvery = 20000 for line in fp: data = unpack(line.strip(), ('title', 'info', 'note')) if 'title' not in data: continue if 'info' not in data: continue title = data['title'] mid = CACHE_MID.addUnique(title) if mid is None: continue note = None if 'note' in data: note = data['note'] if count % 10000 == 0: print('SCANNING %s:' % dataf[0][:-8].replace('-', ' '), end=' ') print(_(data['title'])) info = data['info'] if typeindex == 'keywords': keywordID = CACHE_KWRDID.addUnique(info) sqldata.add((mid, keywordID)) else: sqldata.add((mid, infoid, info, note)) count += 1 sqldata.flush() if typeindex == 'keywords': CACHE_KWRDID.flush() CACHE_KWRDID.clear() fp.close() t('doMiscMovieInfo(%s)' % dataf[0][:-8].replace('-', ' ')) def getRating(): """Movie's rating.""" try: fp = SourceFile('ratings.list.gz', start=RAT_START, stop=RAT_STOP) except IOError: return sqldata = SQLData(table=MovieInfo, cols=['movieID', 'infoTypeID', 'info', 'note']) count = 0 for line in fp: data = unpack(line, ('votes distribution', 'votes', 
'rating', 'title'), sep=' ') if 'title' not in data: continue title = data['title'].strip() mid = CACHE_MID.addUnique(title) if mid is None: continue if count % 10000 == 0: print('SCANNING rating:', _(title)) sqldata.add((mid, INFO_TYPES['votes distribution'], data.get('votes distribution'), None)) sqldata.add((mid, INFO_TYPES['votes'], data.get('votes'), None)) sqldata.add((mid, INFO_TYPES['rating'], data.get('rating'), None)) count += 1 sqldata.flush() fp.close() def getTopBottomRating(): """Movie's rating, scanning for top 250 and bottom 10.""" for what in ('top 250 rank', 'bottom 10 rank'): if what == 'top 250 rank': st = RAT_TOP250_START else: st = RAT_BOT10_START try: fp = SourceFile('ratings.list.gz', start=st, stop=TOPBOT_STOP) except IOError: break sqldata = SQLData(table=MovieInfo, cols=['movieID', RawValue('infoTypeID', INFO_TYPES[what]), 'info', 'note']) count = 1 print('SCANNING %s...' % what) for line in fp: data = unpack(line, ('votes distribution', 'votes', 'rank', 'title'), sep=' ') if 'title' not in data: continue title = data['title'].strip() mid = CACHE_MID.addUnique(title) if mid is None: continue if what == 'top 250 rank': rank = count else: rank = 11 - count sqldata.add((mid, str(rank), None)) count += 1 sqldata.flush() fp.close() def getPlot(lines): """Movie's plot.""" plotl = [] plotlappend = plotl.append plotltmp = [] plotltmpappend = plotltmp.append for line in lines: linestart = line[:4] if linestart == 'PL: ': plotltmpappend(line[4:]) elif linestart == 'BY: ': plotlappend('%s::%s' % (' '.join(plotltmp), line[4:].strip())) plotltmp[:] = [] return {'plot': plotl} def completeCast(): """Movie's complete cast/crew information.""" CCKind = {} cckinds = [(x.id, x.kind) for x in CompCastType.select()] for k, v in cckinds: CCKind[v] = k for fname, start in [('complete-cast.list.gz', COMPCAST_START), ('complete-crew.list.gz', COMPCREW_START)]: try: fp = SourceFile(fname, start=start, stop=COMP_STOP) except IOError: continue if fname == 
'complete-cast.list.gz': obj = 'cast' else: obj = 'crew' subID = str(CCKind[obj]) sqldata = SQLData(table=CompleteCast, cols=['movieID', RawValue('subjectID', subID), 'statusID']) count = 0 for line in fp: ll = [x for x in line.split('\t') if x] if len(ll) != 2: continue title = ll[0] mid = CACHE_MID.addUnique(title) if mid is None: continue if count % 10000 == 0: print('SCANNING %s:' % fname[:-8].replace('-', ' '), end=' ') print(_(title)) sqldata.add((mid, CCKind[ll[1].lower().strip()])) count += 1 fp.close() sqldata.flush() # global instances CACHE_MID = MoviesCache() CACHE_PID = PersonsCache() CACHE_CID = CharactersCache() CACHE_CID.className = 'CharactersCache' CACHE_COMPID = CompaniesCache() CACHE_KWRDID = KeywordsCache() INFO_TYPES = {} MOVIELINK_IDS = [] KIND_IDS = {} KIND_STRS = {} CCAST_TYPES = {} COMP_TYPES = {} def readConstants(): """Read constants from the database.""" global INFO_TYPES, MOVIELINK_IDS, KIND_IDS, KIND_STRS, \ CCAST_TYPES, COMP_TYPES for x in InfoType.select(): INFO_TYPES[x.info] = x.id for x in LinkType.select(): MOVIELINK_IDS.append((x.link, len(x.link), x.id)) MOVIELINK_IDS.sort(key=lambda x: operator.length_hint(x[0]), reverse=True) for x in KindType.select(): KIND_IDS[x.kind] = x.id KIND_STRS[x.id] = x.kind for x in CompCastType.select(): CCAST_TYPES[x.kind] = x.id for x in CompanyType.select(): COMP_TYPES[x.kind] = x.id def _imdbIDsFileName(fname): """Return a file name, adding the optional CSV_DIR directory.""" return os.path.join(*([_f for _f in [CSV_DIR, fname] if _f])) def _countRows(tableName): """Return the number of rows in a table.""" try: CURS.execute('SELECT COUNT(*) FROM %s' % tableName) return (CURS.fetchone() or [0])[0] except Exception as e: print('WARNING: unable to count rows of table %s: %s' % (tableName, e)) return 0 def storeNotNULLimdbIDs(cls): """Store in a temporary table or in a dbm database a mapping between md5sum (of title or name) and imdbID, when the latter is present in the database.""" if cls is 
Title: cname = 'movies' elif cls is Name: cname = 'people' elif cls is CompanyName: cname = 'companies' else: cname = 'characters' table_name = tableName(cls) md5sum_col = colName(cls, 'md5sum') imdbID_col = colName(cls, 'imdbID') print('SAVING imdbID values for %s...' % cname, end=' ') sys.stdout.flush() if _get_imdbids_method() == 'table': try: try: CURS.execute('DROP TABLE %s_extract' % table_name) except: pass try: CURS.execute('SELECT * FROM %s LIMIT 1' % table_name) except Exception as e: print('missing "%s" table (ok if this is the first run)' % table_name) return query = 'CREATE TEMPORARY TABLE %s_extract AS SELECT %s, %s FROM %s WHERE %s IS NOT NULL' % \ (table_name, md5sum_col, imdbID_col, table_name, imdbID_col) CURS.execute(query) CURS.execute('CREATE INDEX %s_md5sum_idx ON %s_extract (%s)' % (table_name, table_name, md5sum_col)) CURS.execute('CREATE INDEX %s_imdbid_idx ON %s_extract (%s)' % (table_name, table_name, imdbID_col)) rows = _countRows('%s_extract' % table_name) print('DONE! (%d entries using a temporary table)' % rows) return except Exception as e: print('WARNING: unable to store imdbIDs in a temporary table (falling back to dbm): %s' % e) try: db = dbm.open(_imdbIDsFileName('%s_imdbIDs.db' % cname), 'c') except Exception as e: print('WARNING: unable to store imdbIDs: %s' % str(e)) return try: CURS.execute('SELECT %s, %s FROM %s WHERE %s IS NOT NULL' % (md5sum_col, imdbID_col, table_name, imdbID_col)) res = CURS.fetchmany(10000) while res: db.update(dict((str(x[0]), str(x[1])) for x in res)) res = CURS.fetchmany(10000) except Exception as e: print('SKIPPING: unable to retrieve data: %s' % e) return print('DONE! 
(%d entries)' % len(db)) db.close() return def iterbatch(iterable, size): """Process an iterable 'size' items at a time.""" sourceiter = iter(iterable) while True: batchiter = islice(sourceiter, size) try: first = next(batchiter) except StopIteration: return yield chain([first], batchiter) def restoreImdbIDs(cls): """Restore imdbIDs for movies, people, companies and characters.""" if cls is Title: cname = 'movies' elif cls is Name: cname = 'people' elif cls is CompanyName: cname = 'companies' else: cname = 'characters' print('RESTORING imdbIDs values for %s...' % cname, end=' ') sys.stdout.flush() table_name = tableName(cls) md5sum_col = colName(cls, 'md5sum') imdbID_col = colName(cls, 'imdbID') if _get_imdbids_method() == 'table': try: try: CURS.execute('SELECT * FROM %s_extract LIMIT 1' % table_name) except Exception as e: raise Exception('missing "%s_extract" table (ok if this is the first run)' % table_name) if DB_NAME == 'mysql': query = 'UPDATE %s INNER JOIN %s_extract USING (%s) SET %s.%s = %s_extract.%s' % \ (table_name, table_name, md5sum_col, table_name, imdbID_col, table_name, imdbID_col) else: query = 'UPDATE %s SET %s = %s_extract.%s FROM %s_extract WHERE %s.%s = %s_extract.%s' % \ (table_name, imdbID_col, table_name, imdbID_col, table_name, table_name, md5sum_col, table_name, md5sum_col) CURS.execute(query) affected_rows = 'an unknown number of' try: CURS.execute('SELECT COUNT(*) FROM %s WHERE %s IS NOT NULL' % (table_name, imdbID_col)) affected_rows = (CURS.fetchone() or [0])[0] except Exception as e: pass rows = _countRows('%s_extract' % table_name) print('DONE! 
(restored %s entries out of %d)' % (affected_rows, rows)) t('restore %s' % cname) try: CURS.execute('DROP TABLE %s_extract' % table_name) except: pass return except Exception as e: print('INFO: unable to restore imdbIDs using the temporary table (falling back to dbm): %s' % e) try: db = dbm.open(_imdbIDsFileName('%s_imdbIDs.db' % cname), 'r') except Exception as e: print('INFO: unable to restore imdbIDs (ok if this is the first run)') return count = 0 sql = "UPDATE " + table_name + " SET " + imdbID_col + \ " = CASE " + md5sum_col + " %s END WHERE " + \ md5sum_col + " IN (%s)" def _restore(query, batch): """Execute a query to restore a batch of imdbIDs.""" items = list(batch) case_clause = ' '.join("WHEN '%s' THEN %s" % (k, v) for k, v in items) where_clause = ', '.join("'%s'" % x[0] for x in items) success = _executeQuery(query % (case_clause, where_clause)) if success: return len(items) return 0 for batch in iterbatch(iter(db.items()), 10000): count += _restore(sql, batch) print('DONE! (restored %d entries out of %d)' % (count, len(db))) t('restore %s' % cname) db.close() return def restoreAll_imdbIDs(): """Restore imdbIDs for movies, persons, companies and characters.""" # Restoring imdbIDs for movies and persons (moved after the # indexes are built, so that it can take advantage of them). 
runSafely(restoreImdbIDs, 'failed to restore imdbIDs for movies', None, Title) runSafely(restoreImdbIDs, 'failed to restore imdbIDs for people', None, Name) runSafely(restoreImdbIDs, 'failed to restore imdbIDs for characters', None, CharName) runSafely(restoreImdbIDs, 'failed to restore imdbIDs for companies', None, CompanyName) def runSafely(funct, fmsg, default, *args, **kwds): """Run the function 'funct' with arguments args and kwds, catching every exception; fmsg is printed out (along with the exception message) in case of trouble; the return value of the function is returned (or 'default').""" try: return funct(*args, **kwds) except Exception as e: print('WARNING: %s: %s' % (fmsg, e)) return default def _executeQuery(query): """Execute a query on the CURS object.""" if len(query) > 60: s_query = query[:60] + '...' else: s_query = query print('EXECUTING "%s"...' % s_query, end=' ') sys.stdout.flush() try: CURS.execute(query) print('DONE!') return True except Exception as e: print('FAILED (%s)!' % e) return False def executeCustomQueries(when, _keys=None, _timeit=True): """Run custom queries as specified on the command line.""" if _keys is None: _keys = {} for query in CUSTOM_QUERIES.get(when, []): print('EXECUTING "%s:%s"...' % (when, query)) sys.stdout.flush() if query.startswith('FOR_EVERY_TABLE:'): query = query[16:] CURS.execute('SHOW TABLES;') tables = [x[0] for x in CURS.fetchall()] for table in tables: try: keys = {'table': table} keys.update(_keys) _executeQuery(query % keys) if _timeit: t('%s command' % when) except Exception as e: print('FAILED (%s)!' % e) continue else: try: _executeQuery(query % _keys) except Exception as e: print('FAILED (%s)!' % e) continue if _timeit: t('%s command' % when) def buildIndexesAndFK(): """Build indexes.""" executeCustomQueries('BEFORE_INDEXES') print('building database indexes (this may take a while)') sys.stdout.flush() # Build database indexes. 
idx_errors = createIndexes(DB_TABLES) for idx_error in idx_errors: print('ERROR caught exception creating an index: %s' % idx_error) t('createIndexes()') sys.stdout.flush() def restoreCSV(): """Only restore data from a set of CSV files.""" CSV_CURS.buildFakeFileNames() print('loading CSV files into the database') executeCustomQueries('BEFORE_CSV_LOAD') loadCSVFiles() t('loadCSVFiles()') executeCustomQueries('BEFORE_RESTORE') t('TOTAL TIME TO LOAD CSV FILES', sinceBegin=True) buildIndexesAndFK() restoreAll_imdbIDs() executeCustomQueries('END') t('FINAL', sinceBegin=True) # begin the iterations... def run(): print('RUNNING imdbpy2sql.py') executeCustomQueries('BEGIN') # Storing imdbIDs for movies and persons. runSafely(storeNotNULLimdbIDs, 'failed to read imdbIDs for movies', None, Title) runSafely(storeNotNULLimdbIDs, 'failed to read imdbIDs for people', None, Name) runSafely(storeNotNULLimdbIDs, 'failed to read imdbIDs for characters', None, CharName) runSafely(storeNotNULLimdbIDs, 'failed to read imdbIDs for companies', None, CompanyName) # Truncate the current database. print('DROPPING current database...', end=' ') sys.stdout.flush() dropTables(DB_TABLES) print('DONE!') executeCustomQueries('BEFORE_CREATE') # Rebuild the database structure. print('CREATING new tables...', end=' ') sys.stdout.flush() createTables(DB_TABLES) print('DONE!') t('dropping and recreating the database') executeCustomQueries('AFTER_CREATE') # Read the constants. readConstants() # Populate the CACHE_MID instance. readMovieList() # Comment readMovieList() and uncomment the following two lines # to keep the current info in the name and title tables. # CACHE_MID.populate() t('readMovieList()') executeCustomQueries('BEFORE_COMPANIES') # distributors, miscellaneous-companies, production-companies, # special-effects-companies. # CACHE_COMPID.populate() doMovieCompaniesInfo() # Do this now, and free some memory. 
CACHE_COMPID.flush() CACHE_COMPID.clear() executeCustomQueries('BEFORE_CAST') # actors, actresses, producers, writers, cinematographers, composers, # costume-designers, directors, editors, miscellaneous, # production-designers. castLists() # CACHE_PID.populate() # CACHE_CID.populate() # Aka names and titles. doAkaNames() t('doAkaNames()') doAkaTitles() t('doAkaTitles()') # alternate-versions, goofs, crazy-credits, quotes, soundtracks, trivia. doMinusHashFiles() t('doMinusHashFiles()') # biographies, business, laserdisc, literature, mpaa-ratings-reasons, plot. doNMMVFiles() # certificates, color-info, countries, genres, keywords, language, # locations, running-times, sound-mix, technical, release-dates. doMiscMovieInfo() # movie-links. doMovieLinks() t('doMovieLinks()') # ratings. getRating() t('getRating()') # taglines. getTaglines() t('getTaglines()') # ratings (top 250 and bottom 10 movies). getTopBottomRating() t('getTopBottomRating()') # complete-cast, complete-crew. completeCast() t('completeCast()') if CSV_DIR: CSV_CURS.closeAll() # Flush caches. CACHE_MID.flush() CACHE_PID.flush() CACHE_CID.flush() CACHE_MID.clear() CACHE_PID.clear() CACHE_CID.clear() t('flushing caches...') if CSV_ONLY_WRITE: t('TOTAL TIME TO WRITE CSV FILES', sinceBegin=True) executeCustomQueries('END') t('FINAL', sinceBegin=True) return if CSV_DIR: print('loading CSV files into the database') executeCustomQueries('BEFORE_CSV_LOAD') loadCSVFiles() t('loadCSVFiles()') executeCustomQueries('BEFORE_RESTORE') t('TOTAL TIME TO INSERT/WRITE DATA', sinceBegin=True) buildIndexesAndFK() restoreAll_imdbIDs() executeCustomQueries('END') t('FINAL', sinceBegin=True) _HEARD = 0 def _kdb_handler(signum, frame): """Die gracefully.""" global _HEARD if _HEARD: print("EHI! DON'T PUSH ME! I'VE HEARD YOU THE FIRST TIME! :-)") return print('INTERRUPT REQUEST RECEIVED FROM USER. FLUSHING CACHES...') _HEARD = 1 # XXX: trap _every_ error? 
try: CACHE_MID.flush() except IntegrityError: pass try: CACHE_PID.flush() except IntegrityError: pass try: CACHE_CID.flush() except IntegrityError: pass try: CACHE_COMPID.flush() except IntegrityError: pass print('DONE! (in %d minutes, %d seconds)' % divmod(int(time.time()) - BEGIN_TIME, 60)) sys.exit() if __name__ == '__main__': import signal signal.signal(signal.SIGINT, _kdb_handler) if CSV_ONLY_LOAD: restoreCSV() else: run() imdbpy-6.8/bin/s32imdbpy.py000077500000000000000000000145001351454127000156160ustar00rootroot00000000000000#!/usr/bin/env python3 # -*- coding: utf-8 -*- """ s32imdbpy.py script. This script imports the s3 dataset distributed by IMDb into a SQL database. Copyright 2017-2018 Davide Alberani This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ import os import glob import gzip import logging import argparse import sqlalchemy from imdb.parser.s3.utils import DB_TRANSFORM, title_soundex, name_soundexes TSV_EXT = '.tsv.gz' # how many entries to write to the database at a time. BLOCK_SIZE = 10000 logger = logging.getLogger() logger.setLevel(logging.INFO) metadata = sqlalchemy.MetaData() def generate_content(fd, headers, table): """Generate blocks of rows to be written to the database. 
:param fd: a file descriptor for the .tsv.gz file :type fd: :class:`_io.TextIOWrapper` :param headers: headers in the file :type headers: list :param table: the table that will be populated :type table: :class:`sqlalchemy.Table` :returns: block of data to insert :rtype: list """ data = [] headers_len = len(headers) data_transf = {} table_name = table.name for column, conf in DB_TRANSFORM.get(table_name, {}).items(): if 'transform' in conf: data_transf[column] = conf['transform'] for line in fd: s_line = line.decode('utf-8').strip().split('\t') if len(s_line) != headers_len: continue info = dict(zip(headers, [x if x != r'\N' else None for x in s_line])) for key, tranf in data_transf.items(): if key not in info: continue info[key] = tranf(info[key]) if table_name == 'title_basics': info['t_soundex'] = title_soundex(info['primaryTitle']) elif table_name == 'title_akas': info['t_soundex'] = title_soundex(info['title']) elif table_name == 'name_basics': info['ns_soundex'], info['sn_soundex'], info['s_soundex'] = name_soundexes(info['primaryName']) data.append(info) if len(data) >= BLOCK_SIZE: yield data data = [] if data: yield data data = [] def build_table(fn, headers): """Build a Table object from a .tsv.gz file. 
:param fn: the .tsv.gz file :type fn: str :param headers: headers in the file :type headers: list """ logging.debug('building table for file %s' % fn) table_name = fn.replace(TSV_EXT, '').replace('.', '_') table_map = DB_TRANSFORM.get(table_name) or {} columns = [] all_headers = set(headers) all_headers.update(table_map.keys()) for header in all_headers: col_info = table_map.get(header) or {} col_type = col_info.get('type') or sqlalchemy.UnicodeText if 'length' in col_info and col_type is sqlalchemy.String: col_type = sqlalchemy.String(length=col_info['length']) col_args = { 'name': header, 'type_': col_type, 'index': col_info.get('index', False) } col_obj = sqlalchemy.Column(**col_args) columns.append(col_obj) return sqlalchemy.Table(table_name, metadata, *columns) def import_file(fn, engine): """Import data from a .tsv.gz file. :param fn: the .tsv.gz file :type fn: str :param engine: SQLAlchemy engine :type engine: :class:`sqlalchemy.engine.base.Engine` """ logging.info('begin processing file %s' % fn) connection = engine.connect() count = 0 percent = 0 nr_of_lines = 0 fn_basename = os.path.basename(fn) with gzip.GzipFile(fn, 'rb') as gz_file: gz_file.readline() for line in gz_file: nr_of_lines += 1 with gzip.GzipFile(fn, 'rb') as gz_file: headers = gz_file.readline().decode('utf-8').strip().split('\t') logging.debug('headers of file %s: %s' % (fn, ','.join(headers))) table = build_table(fn_basename, headers) try: table.drop() logging.debug('table %s dropped' % table.name) except: pass insert = table.insert() metadata.create_all(tables=[table]) try: for block in generate_content(gz_file, headers, table): try: connection.execute(insert, block) except Exception as e: logging.error('error processing data: %d entries lost: %s' % (len(block), e)) continue count += len(block) percent = count * 100 / nr_of_lines logging.debug('processed %.1f%% of file %s' % (percent, fn_basename)) except Exception as e: logging.error('error processing data on table %s: %s' % (table.name, e)) 
    logging.info('processed %d%% of file %s: %d entries' % (percent, fn, count))


def import_dir(dir_name, engine):
    """Import data from a series of .tsv.gz files.

    :param dir_name: directory containing the .tsv.gz files
    :type dir_name: str
    :param engine: SQLAlchemy engine
    :type engine: :class:`sqlalchemy.engine.base.Engine`
    """
    for fn in glob.glob(os.path.join(dir_name, '*%s' % TSV_EXT)):
        if not os.path.isfile(fn):
            logging.debug('skipping file %s' % fn)
            continue
        import_file(fn, engine)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('tsv_files_dir')
    parser.add_argument('db_uri')
    parser.add_argument('--verbose', help='increase verbosity and show progress', action='store_true')
    args = parser.parse_args()
    dir_name = args.tsv_files_dir
    db_uri = args.db_uri
    if args.verbose:
        logger.setLevel(logging.DEBUG)
    engine = sqlalchemy.create_engine(db_uri, encoding='utf-8', echo=False)
    metadata.bind = engine
    import_dir(dir_name, engine)
imdbpy-6.8/bin/search_character.py000077500000000000000000000024371351454127000172710ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
search_character.py

Usage: search_character "character name"

Search for the given name and print the results.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "character name"' % sys.argv[0])
    sys.exit(2)

name = sys.argv[1]

i = imdb.IMDb()

out_encoding = sys.stdout.encoding or sys.getdefaultencoding()

try:
    # Do the search, and get the results (a list of character objects).
    results = i.search_character(name)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

# Print the results.
print(' %s result%s for "%s":' % (len(results), ('', 's')[len(results) != 1], name))
print('characterID\t: imdbID : name')

# Print the long imdb name for every character.
for character in results:
    outp = '%s\t\t: %s : %s' % (character.characterID, i.get_imdbID(character), character['long imdb name'])
    print(outp)
imdbpy-6.8/bin/search_company.py000077500000000000000000000023441351454127000170000ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
search_company.py

Usage: search_company "company name"

Search for the given name and print the results.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "company name"' % sys.argv[0])
    sys.exit(2)

name = sys.argv[1]

i = imdb.IMDb()

out_encoding = sys.stdout.encoding or sys.getdefaultencoding()

try:
    # Do the search, and get the results (a list of company objects).
    results = i.search_company(name)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

# Print the results.
print(' %s result%s for "%s":' % (len(results), ('', 's')[len(results) != 1], name))
print('companyID\t: imdbID : name')

# Print the long imdb name for every company.
for company in results:
    outp = '%s\t\t: %s : %s' % (company.companyID, i.get_imdbID(company), company['long imdb name'])
    print(outp)
imdbpy-6.8/bin/search_keyword.py000077500000000000000000000022071351454127000170140ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
search_keyword.py

Usage: search_keyword "keyword"

Search for keywords similar to the given one and print the results.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy!
You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "keyword name"' % sys.argv[0])
    sys.exit(2)

name = sys.argv[1]

i = imdb.IMDb()

out_encoding = sys.stdout.encoding or sys.getdefaultencoding()

try:
    # Do the search, and get the results (a list of keyword strings).
    results = i.search_keyword(name, results=20)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

# Print the results.
print(' %s result%s for "%s":' % (len(results), ('', 's')[len(results) != 1], name))
print(' : keyword')

# Print every keyword.
for idx, keyword in enumerate(results):
    outp = '%d: %s' % (idx+1, keyword)
    print(outp)
imdbpy-6.8/bin/search_movie.py000077500000000000000000000023171351454127000164510ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
search_movie.py

Usage: search_movie "movie title"

Search for the given title and print the results.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "movie title"' % sys.argv[0])
    sys.exit(2)

title = sys.argv[1]

i = imdb.IMDb()

out_encoding = sys.stdout.encoding or sys.getdefaultencoding()

try:
    # Do the search, and get the results (a list of Movie objects).
    results = i.search_movie(title)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

# Print the results.
print(' %s result%s for "%s":' % (len(results), ('', 's')[len(results) != 1], title))
print('movieID\t: imdbID : title')

# Print the long imdb title for every movie.
for movie in results:
    outp = '%s\t: %s : %s' % (movie.movieID, i.get_imdbID(movie), movie['long imdb title'])
    print(outp)
imdbpy-6.8/bin/search_person.py000077500000000000000000000023231351454127000166350ustar00rootroot00000000000000
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
search_person.py

Usage: search_person "person name"

Search for the given name and print the results.
"""

import sys

# Import the IMDbPY package.
try:
    import imdb
except ImportError:
    print('You bad boy! You need to install the IMDbPY package!')
    sys.exit(1)


if len(sys.argv) != 2:
    print('Only one argument is required:')
    print(' %s "person name"' % sys.argv[0])
    sys.exit(2)

name = sys.argv[1]

i = imdb.IMDb()

out_encoding = sys.stdout.encoding or sys.getdefaultencoding()

try:
    # Do the search, and get the results (a list of Person objects).
    results = i.search_person(name)
except imdb.IMDbError as e:
    print("Probably you're not connected to the Internet. Complete error report:")
    print(e)
    sys.exit(3)

# Print the results.
print(' %s result%s for "%s":' % (len(results), ('', 's')[len(results) != 1], name))
print('personID\t: imdbID : name')

# Print the long imdb name for every person.
for person in results:
    outp = '%s\t: %s : %s' % (person.personID, i.get_imdbID(person), person['long imdb name'])
    print(outp)
imdbpy-6.8/docs/000077500000000000000000000000001351454127000136055ustar00rootroot00000000000000
imdbpy-6.8/docs/Changelog.rst000066400000000000000000001517401351454127000162360ustar00rootroot00000000000000
Changelog
=========

* What's new in release 6.8 "Apollo 11" (20 Jul 2019)

  [http]

  - #224: introduce the search_movie_advanced(title, adult=None, results=None, sort=None, sort_dir=None) method
  - #145: names are stored in normal format (Name Surname)
  - #225: remove obsolete cookie
  - #182: box office information
  - #168: parse series and episode number searching for movies
  - #217: grab poster from search
  - #218: extract MPAA rating
  - #220: extract actor headshot from full credits

* What's new in release 6.7 "Game of Thrones" (19 May 2019)

  [general]

  - #180: include tests in source package
  - #188: avoid missing keys in search results

  [http]

  - #144: fix parser for currentRole and notes
  - #189: use HTTPS instead of HTTP
  - #192: fix list of AKAs and release dates
  - #200: fix keywords parser
  - #201: fix encoding doing searches
  - #210: fix TV series episode rating and votes

  [sql]

  - #176: correctly handle multiple characters

  [s3]

  - #163 and #193: fix import in MySQL
  - #193: handle the new format of title.principals.tsv.gz
  - #195: show progress, importing data (with --verbose)

* What's new in release 6.6 "Stranger Things" (05 Aug 2018)

  [general]

  - #154: exclude docs and etc directories from packaging
  - introduce 'https' as an alias for 'http'
  - #151: the 'in' operator also considers key names
  - #172: fix for ASCII keys in XML output
  - #174: improve XML output
  - #179: introduce Travis CI at https://travis-ci.org/alberanid/imdbpy

  [http]

  - #149: store person birth and death dates in ISO8601 format
  - #166: fix birth and death dates without itemprop attributes
  - #160: fix series seasons list
  - #155 and #165: ignore certificate to prevent validation errors
  -
    #156: fix tech parser
  - #157: full-size headshot for persons
  - #161: fix string/unicode conversion in Python 2.7
  - #173: raw akas and raw release dates fields
  - #178: fix mini biography parser

  [s3]

  - #158: fetch and search AKAs
  - update the goodies/download-from-s3 script to use the datasets.imdbws.com site

* What's new in release 6.5 "Poultrygeist: Night of the Chicken Dead" (15 Apr 2018)

  [general]

  - converted the documentation to Sphinx rst format

  [http]

  - fix title parser for in-production movies
  - parsers are based on piculet
  - improve collection of full-size cover images

* What's new in release 6.4 "Electric Dreams" (14 Mar 2018)

  [http]

  - remove obsolete parsers
  - remove Character objects
  - fix for search parsers

* What's new in release 6.3 "Altered Carbon" (27 Feb 2018)

  [general]

  - documentation updates
  - introduced the 'imdbpy' CLI
  - s3 accessSystem to access the new dataset from IMDb

  [http]

  - fixes for IMDb site redesign
  - Person parser fixes
  - users review parser
  - improve external sites parser
  - switch from akas.imdb.com domain to www.imdb.com
  - fix for synopsis
  - fix for tv series episodes

  [s3]

  - ability to import and access all the information

* What's new in release 6.2 "Justice League" (19 Nov 2017)

  [general]

  - introduce check for Python version
  - SQLAlchemy can be disabled using --without-sqlalchemy
  - fix #88: configuration file parser
  - update documentation

  [http]

  - fixed ratings parser
  - moved cookies from json to Python source

* What's new in release 6.1

  - skipped version 6.1 due to a wrong release on pypi

* What's new in release 6.0 "Life is Strange" (12 Nov 2017)

  [general]

  - now IMDbPY is a Python 3 package
  - simplified the code base: #61
  - remove dependencies: SQLObject, BeautifulSoup, C compiler
  - introduced a tox testsuite
  - fix various parsers

* What's new in release 5.1 "Westworld" (13 Nov 2016)

  [general]

  - fix for company names containing square brackets.
  - fix XML output when imdb long name is missing.
  - fixes #33: unable to use --without-sql

  [http]

  - fix birth/death dates parsing.
  - fix top/bottom lists.
  - Person's resume page parser (courtesy of codynhat)
  - fixes #29: split color info
  - parser for "my rating" (you have to use your own cookies)

  [sql]

  - sound track list correctly identified.
  - fixes #50: process split data in order
  - fixes #53: parser for movie-links

* What's new in release 5.0 "House of Cards" (02 May 2014)

  [general]

  - Spanish, French, Arabic, Bulgarian and German translations.
  - Introduced the list of French articles.
  - fix for GAE.
  - download_applydiffs.py script.
  - fixed wrong handling of encoding in episode titles
  - renamed README.utf8 to README.unicode

  [http]

  - fixed searches (again).
  - search results are always in English.
  - updated the cookies.
  - support for obtaining metacritic score and URL.
  - fixed goofs parser.
  - fixed url for top250.
  - fixes for biography page.
  - fix for quotes.
  - better charset identification.
  - category and spoiler status for goofs.
  - changed query separators from ; to &.
  - fix for episodes of unknown seasons.
  - new cookie.

  [mobile]

  - fixed searches.

  [sql]

  - fix for MSSQL

* What's new in release 4.9 "Iron Sky" (15 Jun 2012)

  [general]

  - urls used to access the IMDb site can be configured.
  - helper functions to handle movie AKAs in various languages (code by Alberto Malagoli).
  - renamed the 'articles' module into 'linguistics'.
  - introduced the 'reraiseExceptions' option, to re-raise every caught exception.

  [http]

  - fix for changed search parameters.
  - introduced a 'timeout' parameter for connections to the web server.
  - fix for business information.
  - parser for the new style of episodes list.
  - unicode searches handled as iso8859-1.
  - fix for garbage in AKA titles.

  [sql]

  - vastly improved the store/restore of imdbIDs; now it should be faster and more accurate.
  - now the 'name' table contains a 'gender' field that can be 'm', 'f' or NULL.
  - fix for nicknames.
  - fix for missing titles in the crazy credits file.
  - handled exceptions creating indexes, foreign keys and executing custom queries.
  - fixed creation of index for keywords.
  - excluded {{SUSPENDED}} titles.

* What's new in release 4.8.2 "The Big Bang Theory" (02 Nov 2011)

  [general]

  - fixed install path of locales.

  [http]

  - removed debug code.

* What's new in release 4.8 "Super" (01 Nov 2011)

  [general]

  - fix for a problem managing exceptions with Python 2.4.
  - converted old-style exceptions to instances.
  - enhancements for the reduce.sh script.
  - added notes about problems connecting to IMDb's web servers.
  - improvements in the parsers of movie titles.
  - improvements in the parser of person names.

  [http]

  - potential fix for GAE environment.
  - handled the new style of "in production" information.
  - fix for 'episodes' list.
  - fix for 'episodes rating'.
  - fix for queries that returned too many results.
  - fix for wrong/missing references.
  - removed the no longer available information sets "amazon reviews" and "dvd".
  - fix for cast of tv series.
  - fix for title of tv series.
  - now the beautiful parses work again.

  [httpThin]

  - removed "httpThin", falling back to "http".

  [mobile]

  - fix for missing headshots.
  - fix for rating and number of votes.
  - fix for missing genres.
  - many other fixes to keep up-to-date with the IMDb site.

  [sql]

  - fix for a nasty bug parsing notes about character names.
  - fixes for SQLite with SQLObject.

* What's new in release 4.7 "Saw VI" (23 Jan 2011)

  [http]

  - first fixes for the new set of parsers.
  - first changes to support the new set of web pages.
  - fix for lists of uncategorized episodes.
  - fix for movies with multiple countries.
  - fix for the currentRole property.
  - more robust handling for vote details.

  [mobile]

  - first fixes for the new set of parsers.

  [sql]

  - the tables containing titles and names (and akas) now include a 'md5sum' column calculated on the "long imdb canonical title/name".

* What's new in release 4.6 "The Road" (19 Jun 2010)

  [general]

  - introduced the 'full-size cover url' and 'full-size headshot' keys for Movie, Person and Character instances.
  - moved the development to a Mercurial repository.
  - introduced the parseXML function in the imdb.helpers module.
  - now the asXML method can exclude dynamically generated keys.
  - rationalized the use of the 'logging' and 'warnings' modules.
  - the 'update' method no longer raises an exception, if asked for an unknown info set.

  [http/mobile]

  - removed new garbage from the imdb pages.
  - support new style of akas.
  - fix for the "trivia" page.
  - fixes for searches with too many results.

  [sql]

  - fixes for garbage in the plain text data files.
  - support for SQLite shipped with Python 2.6.

* What's new in release 4.5.1 "Dollhouse" (01 Mar 2010)

  [general]

  - reintroduced the ez_setup.py file.
  - fixes for AKAs on 'release dates'.
  - added the dtd.

* What's new in release 4.5 "Invictus" (28 Feb 2010)

  [general]

  - moved to setuptools 0.6c11.
  - trying to make the SVN release versions work fine.
  - http/mobile should work in GAE (Google App Engine).
  - added some goodies scripts, useful for programmers (see the docs/goodies directory).

  [http/mobile]

  - removed urllib-based User-Agent header.
  - fixes for some minor changes to IMDb's html.
  - fixes for garbage in movie quotes.
  - improvements in the handling of AKAs.

  [mobile]

  - fixes for AKAs in search results.

  [sql]

  - fixes for bugs restoring imdbIDs.
  - first steps to split CSV creation/insertion.

* What's new in release 4.4 "Gandhi" (06 Jan 2010)

  [general]

  - introduced a logging facility; see README.logging.
  - the 'http' and 'mobile' data access systems should be a lot more robust.

  [http]

  - fixes for the n-th set of changes to IMDb's HTML.
  - improvements to perfect-match searches.
  - slightly simplified the parsers for search results.

  [mobile]

  - fixes for the n-th set of changes to IMDb's HTML.
  - slightly simplified the parsers for search results.

  [sql]

  - movies' keywords are now correctly imported, using CSV files.
  - minor fixes to handle crap in the plain text data files.
  - removed an outdated parameter passed to SQLObject.
  - made imdbpy2sql.py more robust in some corner-cases.
  - fixes for the Windows environment.

* What's new in release 4.3 "Public Enemies" (18 Nov 2009)

  [general]

  - the installer now takes care of .mo files.
  - introduced, in the helpers module, the functions keyToXML and translateKey, useful to translate dictionary keys.
  - support for smart guessing of the language of a movie title.
  - updated the DTD.

  [http]

  - fixed a lot of bugs introduced by the new IMDb.com design.
  - nicer handling of HTTP 404 response code.
  - fixed parsers for top250 and bottom100 lists.
  - fixed a bug parsing AKAs.
  - fixed misc bugs.

  [mobile]

  - removed duplicates in list of genres.

  [sql]

  - fixed a bug in the imdbpy2sql.py script using CSV files; the 'movie_info_idx' and 'movie_keyword' tables were left empty/with wrong data.

* What's new in release 4.2 "Battlestar Galactica" (31 Aug 2009)

  [general]

  - the 'local' data access system is gone. See README.local.
  - the imdb.parser.common package was removed, and its code integrated in imdb.parser.sql and in the imdbpy2sql.py script.
  - fixes for the installer.
  - the helpers module contains the fullSizeCoverURL function, to convert a Movie, Person or Character instance (or a URL in a string) into a URL to the full-size version of its cover/headshot. Courtesy of Basil Shubin.
  - used a newer version of msgfmt.py, to work around a hideous bug generating locales.
  - minor updates to locales.
  - updated the DTD to version 4.2.

  [http]

  - removed garbage at the end of quotes.
  - fixed problems parsing company names and notes.
  - keys in character's quotes dictionary are now Movie instances.
  - fixed a bug converting entities char references (affected BeautifulSoup).
  - fixed a long-standing bug handling & with BeautifulSoup.
  - top250 is now correctly parsed by BeautifulSoup.

  [sql]

  - fixed DB2 call for loading blobs/cblobs.
  - information from obsolete files is now used if and only if it refers to still existing titles.
  - the --fix-old-style-titles argument is now obsolete.

* What's new in release 4.1 "State Of Play" (02 May 2009)

  [general]

  - DTD definition.
  - support for locale.
  - support for the new style for movie titles ("The Title" and no more "Title, The" is internally used).
  - minor fix to XML code to work with the test-suite.

  [http]

  - char references in the &#xHEXCODE; format are handled.
  - fixed a bug with movies containing '....' in titles. And I'm talking about Malcolm McDowell's filmography!
  - 'airing' contains object (so the accessSystem variable is set).
  - 'tv schedule' ('airing') pages of episodes can be parsed.
  - 'tv schedule' is now a valid alias for 'airing'.
  - minor fixes for empty/wrong strings.

  [sql]

  - in the database, soundex values for titles are always calculated after the article is stripped (if any).
  - imdbpy2sql.py has the --fix-old-style-titles option, to handle files in the old format.
  - fixed a bug saving imdbIDs.

  [local]

  - the 'local' data access system should be considered obsolete, and will probably be removed in the next release.

* What's new in release 4.0 "Watchmen" (12 Mar 2009)

  [general]

  - the installer is now based on setuptools.
  - new functions get_keyword and search_keyword to handle movie's keywords (example scripts included).
  - Movie/Person/... keys (and whole instances) can be converted to XML.
  - two new functions, get_top250_movies and get_bottom100_movies, to retrieve lists of best/worst movies (example scripts included).
  - searching for movies and persons, the 'akas' keyword (if present) is filled in the results.
  - 'quotes' for movies is now always a list of lists.
  - the old set of parsers (based on sgmllib.SGMLParser) is gone.
  - fixed limitations handling multiple roles (with notes).
  - fixed a bug converting somethingIDs to real imdbIDs.
  - fixed some summary methods.
  - updates to the documentation.

  [http]

  - adapted BeautifulSoup to lxml (internally, the lxml API is used).
  - currentRole is no longer populated, for non-cast entries (everything ends up into .notes).
  - fixed a bug searching for too common terms.
  - fixed a bug identifying 'kind', searching for titles.
  - fixed a bug parsing airing dates.
  - fixed a bug searching for company names (when there's a direct hit).
  - fixed a bug handling multiple characters.
  - fixed a bug parsing episode ratings.
  - nicer keys for technical details.
  - removed the 'agent' page.

  [sql]

  - searching for a movie, the original titles are returned, instead of AKAs.
  - support for Foreign Keys.
  - minor changes to the db's design.
  - fixed a bug populating tables with SQLAlchemy.
  - imdbpy2sql.py shows user time and system time, along with wall time.

  [local]

  - searching for a movie, the original titles are returned, instead of AKAs.

* What's new in release 3.9 "The Strangers" (06 Jan 2009)

  [general]

  - introduced the search_episode method, to search for episodes' titles.
  - movie['year'] is now an integer, and no longer a string.
  - fixed a bug parsing company names.
  - introduced the helpers.makeTextNotes function, useful to pretty-print strings in the 'TEXT::NOTE' format.

  [http]

  - fixed a bug regarding movies listed in the Bottom 100.
  - fixed bugs about tv mini-series.
  - fixed a bug about 'series cast' using BeautifulSoup.

  [sql]

  - fixes for DB2 (with SQLAlchemy).
  - improved support for movies' aka titles (for series).
  - made imdbpy2sql.py more robust, catching exceptions even when huge amounts of data are skipped due to errors.
  - introduced CSV support in the imdbpy2sql.py script.

* What's new in release 3.8 "Quattro Carogne a Malopasso" (03 Nov 2008)

  [http]

  - fixed search system for direct hits.
  - fixed IDs so that they always are str and not unicode.
  - fixed a bug about plot without authors.
  - for pages about a single episode of a series, "Series Crew" are now separated items.
  - introduced the preprocess_dom method of the DOMParserBase class.
  - handling rowspan for DOMHTMLAwardsParser is no longer a special case.
  - first changes to remove old parsers.

  [sql]

  - introduced support for SQLAlchemy.

  [mobile]

  - fixed multiple 'nick names'.
  - added 'aspect ratio'.
  - fixed a "direct hit" bug searching for people.

  [global]

  - fixed search_* example scripts.
  - updated the documentation.

* What's new in release 3.7 "Burn After Reading" (22 Sep 2008)

  [http]

  - introduced a new set of parsers, active by default, based on DOM/XPath.
  - old parsers fixed; 'news', 'genres', 'keywords', 'ratings', 'votes', 'tech', 'taglines' and 'episodes'.

  [sql]

  - the pure python soundex function now behaves correctly.

  [general]

  - minor updates to the documentation, with an introduction to the new set of parsers and notes for packagers.

* What's new in release 3.6 "RahXephon" (08 Jun 2008)

  [general]

  - support for company objects for every data access system.
  - introduced example scripts for companies.
  - updated the documentation.

  [http and mobile]

  - changes to support the new HTML for "plot outline" and some lists of values (languages, genres, ...)
  - introduced the set_cookies method to set cookies for IMDb's account and the del_cookies method to remove the use of cookies; in the imdbpy.cfg configuration file, options "cookie_id" and "cookie_uu" can be set to the appropriate values; if "cookie_id" is None, no cookies are sent.
  - fixed parser for 'news' pages.
  - fixed minor bug fetching movie/person/character references.

  [http]

  - fixed a search problem, while not using the IMDbPYweb's account.
  - fixed bugs searching for characters.

  [mobile]

  - fixed minor bugs parsing search results.

  [sql]

  - fixed a bug handling movieIDs, when there are some inconsistencies in the plain text data files.

  [local]

  - access to 'mpaa' and 'miscellaneous companies' information.

* What's new in release 3.5 "Blade Runner" (19 Apr 2008)

  [general]

  - first changes to work on Symbian mobile phones.
  - now there is an imdb.available_access_systems() function, that can be used to get a list of available data access systems.
  - it's possible to pass 'results' as a parameter of the imdb.IMDb function; it sets the number of results to return for queries.
  - fixed summary() method in Movie and Person, to correctly handle unicode chars.
  - the helpers.makeObject2Txt function now supports recursion over dictionaries.
  - cutils.c MXLINELEN increased from 512 to 1024; some critical strcpy replaced with strncpy.
  - fixed configuration parser to be compatible with Python 2.2.
  - updated list of articles and some stats in the comments.
  - documentation updated.

  [sql]

  - fixed minor bugs in imdbpy2sql.py.
  - restores imdbIDs for characters.
  - now CharactersCache honors custom queries.
  - the imdbpy2sql.py's --mysql-force-myisam command line option can be used to force usage of MyISAM tables on InnoDB databases.
  - added some warnings to the imdbpy2sql.py script.

  [local]

  - fixed a bug in the fall-back function used to scan movie titles, when the cutils module is not available.
  - mini biographies are cut up to 2**16-1 chars, to prevent troubles with some MySQL servers.
  - fixed bug in characters4local.py, dealing with some garbage in the files.

* What's new in release 3.4 "Flatliners" (16 Dec 2007)

  [general]

  - *** NOTE FOR PACKAGERS *** in the docs directory there is the "imdbpy.cfg" configuration file, which should be installed in /etc or equivalent directory; the setup.py script *doesn't* manage its installation.
  - introduced a global configuration file to set IMDbPY's parameters.
  - supported characters using "sql" and "local" data access systems.
  - fixed a bug retrieving characterID from a character's name.

  [http]

  - fixed a bug in "release dates" parser.
  - fixed bugs in "episodes" parser.
  - fixed bugs reading "series years".
  - stricter definition for ParserBase._re_imdbIDmatch regular expression.

  [mobile]

  - fixed bugs reading "series years".
  - fixed bugs reading characters' filmography.

  [sql]

  - support for characters.

  [local]

  - support for characters.
  - introduced the characters4local.py script.

* What's new in release 3.3 "Heroes" (18 Nov 2007)

  [general]

  - first support for character pages; only for "http" and "mobile", so far.
  - support for multiple characters.
  - introduced a helper function to pretty-print objects.
  - added README.currentRole.
  - fixed minor bug in the __hash__ method of the _Container class.
  - fixed changes to some key names for movies.
  - introduced the search_character.py, get_character.py and get_first_character.py example scripts.

  [http]

  - full support for character pages.
  - fixed a bug retrieving some 'cover url'.
  - fixed a bug with multi-paragraphs biographies.
  - parsers are now instanced on demand.
  - accessSystem and modFunct are correctly set for every Movie, Person and Character object instanced.

  [mobile]

  - full support for character pages.

  [sql]

  - extended functionality of the custom queries support for the imdbpy2sql.py script to circumvent a problem with MS SQLServer.
  - introduced the "--mysql-innodb" and "--ms-sqlserver" shortcuts for the imdbpy2sql.py script.
  - introduced the "--sqlite-transactions" shortcut to activate transactions using SQLite which, otherwise, would have horrible performance.
  - fixed a minor bug with top/bottom ratings, in the imdbpy2sql.py script.

  [local]

  - filtered out some crap in the "quotes" plain text data files, which also affected sql, importing the data.

* What's new in release 3.2 "Videodrome" (25 Sep 2007)

  [global]

  - now there's a unique place where "akas.imdb.com" is set, in the main module.
  - introduced __version__ and VERSION in the main module.
  - minor improvements to the documentation.

  [http]

  - updated the main movie parser to retrieve the recently modified cast section.
  - updated the crazy credits parser.
  - fixed a bug retrieving 'cover url'.

  [mobile]

  - fixed a bug parsing people's filmography when only one duty was listed.
  - updated to retrieve series' creator.

  [sql]

  - added the ability to perform custom SQL queries at the command line of the imdbpy2sql.py script.
  - minor fixes for the imdbpy2sql.py script.

* What's new in release 3.1 "The Snake King" (18 Jul 2007)

  [global]

  - the IMDbPYweb account now returns a single item, when a search returns only one "good enough" match (this is the IMDb's default).
  - updated the documentation.
  - updated list of contributors and developers.

  [http]

  - supported the new result page for searches.
  - supported the 'synopsis' page.
  - supported the 'parents guide' page.
  - fixed a bug retrieving notes about a movie's connections.
  - fixed a bug for python2.2 (s60 mobile phones).
  - fixed a bug with 'Production Notes/Status'.
  - fixed a bug parsing role/duty and notes (also for httpThin).
  - fixed a bug retrieving user ratings.
  - fixed a bug (un)setting the proxy.
  - fixed 2 bugs in movie/person news.
  - fixed a bug in movie faqs.
  - fixed a bug in movie taglines.
  - fixed a bug in movie quotes.
  - fixed a bug in movie title, in "full cast and crew" page.
  - fixed 2 bugs in persons' other works.

  [sql]

  - hypothetical fix for a unicode problem in the imdbpy2sql.py script.
  - now the 'imdbID' fields in the Title and Name tables are restored, updating from an older version.
  - fixed a nasty bug handling utf-8 strings in the imdbpy2sql.py script.

  [mobile]

  - supported the new result page for searches.
  - fixed a bug for python2.2 (s60 mobile phones).
  - fixed a bug searching for persons with single match and no messages in the board.
  - fixed a bug parsing role/duty and notes.

* What's new in release 3.0 "Spider-Man 3" (03 May 2007)

  [global]

  - IMDbPY now works with the new IMDb's site design; a new account is used to access data; this affects a lot of code, especially in the 'http', 'httpThin' and 'mobile' data access systems.
  - every returned string should now be unicode; dictionary keywords are _not_ guaranteed to be unicode (but they are always 7bit strings).
  - fixed a bug in the __contains__ method of the Movie class.
  - fix in the analyze_title() function to handle malformed episode numbers.

  [http]

  - introduced the _in_content instance variable for objects instances of ParserBase, True when inside the
    tag. When this pair of tags is opened and closed, two methods, named _begin_content() and _end_content(), are called with no parameters (by default, they do nothing).
  - in the utils module there's the build_person function, useful to create a Person instance from the typical formats found in the IMDb's web site.
  - an analogous build_movie function can be used to instantiate Movie objects.
  - inverted the getRefs default: now, if not otherwise set, it's False.
  - added a parser for the "merchandising" ("for sale") page for persons.
  - the 'rating' parser now collects also 'rating' and 'votes' data.
  - the HTMLMovieParser class (for movies) was rewritten from zero.
  - the HTMLMaindetailsParser class (for persons) was rewritten from zero.
  - unified the "episode list" and "episodes cast" parsers.
  - fixed a bug parsing locations, which resulted in missing information.
  - locations_parser split from "tech" parser.
  - "connections" parser now handles the recently introduced notes.

  [http parser conversion]

  - these parsers worked out-of-the-box; airing, eprating, alternateversions, dvd, goofs, keywords, movie_awards, movie_faqs, person_awards, rec, releasedates, search_movie, search_person, soundclips, soundtrack, trivia, videoclips.
  - these parsers were fixed; amazonrev, connections, episodes, crazycredits, externalrev, misclinks, newsgrouprev, news, officialsites, otherworks, photosites, plot, quotes, ratings, sales, taglines, tech, business, literature, publicity, trivia, videoclips, maindetails, movie.

  [mobile]

  - fixed to work with the new design.
  - a lot of code is now shared amongst 'http' and 'mobile'.

  [sql]

  - fixes for other bugs related to unicode support.
  - minor changes to slightly improve performance.

* What's new in release 2.9 "Rodan! The Flying Monster" (21 Feb 2007)

  [global]

  - on 19 February IMDb has redesigned its site; this is the last IMDbPY's release to parse the "old layout" pages; from now on, the development will be geared to support the new web pages.
See the README.redesign file for more information.
- minor clean-ups and functions added to the helpers module.

[http]
- fixed some unicode-related problems searching for movie titles and person names; also changed the queries used to search titles/names.
- fixed a bug parsing episodes for tv series.
- fixed a bug retrieving movieID for tv series, searching for titles.

[mobile]
- fixed a problem searching exact matches (movie titles only).
- fixed a bug with cast entries, after minor changes to the IMDb's web site HTML.

[local and sql]
- fixed a bug parsing birth/death dates and notes.

[sql]
- (maybe) fixed another unicode-related bug fetching data from a MySQL database. Maybe. Maybe. Maybe.

* What's new in release 2.8 "Apollo 13" (14 Dec 2006)

[general]
- fix for environments where sys.stdin was overridden by a custom object.

[http data access system]
- added support for the movies' "FAQ" page.
- now the "full credits" (aka "full cast and crew") page can be parsed; it's mostly useful for tv series, because this page is complete while "combined details" contains only partial data. E.g. ia.update(tvSeries, 'full credits')
- added support for the movies' "on television" page (ia.update(movie, "airing")).
- fixed a bug with 'miscellaneous companies'.
- fixed a bug retrieving the list of episodes for tv series.
- fixed a bug with tv series episodes' cast.
- generic fix for XML single tags (invalid HTML tags) like
- fixed a minor bug with 'original air date'.

[sql data access system]
- fix for a unicode bug with recent versions of SQLObject and MySQL.
- fix for a nasty bug in imdbpy2sql.py that will show up splitting a data set too large to be sent in a single shot to the database.

[mobile data access system]
- fixed a bug searching titles and names, where XML char references were not converted.

* What's new in release 2.7 "Pitch Black" (26 Sep 2006)

[general]
- fixed search_movie.py and search_person.py scripts; now they return both the movieID/personID and the imdbID.
- the IMDbPY account was configured to hide the mini-headshots.
- http and mobile data access systems now try to handle queries with too many results.

[http data access system]
- fixed a minor bug retrieving information about persons, with movies in production.
- fixed support for cast list of tv series.
- fixed a bug retrieving 'plot keywords'.
- some left out company credits are now properly handled.

[mobile data access system]
- fixed a major bug with the cast list, after the changes to the IMDb web site.
- fixed support for cast list of tv series.
- fixed a minor bug retrieving information about persons, with movies in production.
- now every AKA title is correctly parsed.

[sql data access system]
- fixed a(nother) bug updating imdbID for movies and persons.
- fixed a bug retrieving personID, while handling names references.

[local data access system]
- "where now" information now correctly handles multiple lines (also affecting the imdbpy2sql.py script).

* What's new in release 2.6 "They Live" (04 Jul 2006)

[general]
- renamed sortMovies to cmpMovies and sortPeople to cmpPeople; these functions are now used to compare Movie/Person objects. The cmpMovies function also handles tv series episodes.

[http data access system]
- now information about "episodes rating" is retrieved.
- fixed a bug retrieving runtimes and akas information.
- fixed an obscure bug trying an Exact Primary Title/Name search when the provided title was wrong/incomplete.
- support for the new format of the "DVD details" page.

[sql data access system]
- now at insert-time the tables don't have indexes, which are added later, resulting in a huge improvement in the performance of the imdbpy2sql.py script.
- searching for tv series episodes now works.
- fixed a bug inserting information about top250 and bottom10 films rank.
- fixed a bug sorting movies in people's filmography.
- fixed a bug filtering out adult-only movies.
- removed unused ForeignKeys in the dbschema module.
- fixed a bug inserting data in databases that require a commit() call, after a call to executemany().
- fixed a bug inserting aka titles in databases that check for foreign keys consistency.
- fixed an obscure bug splitting data sets that are too large.
- MoviesCache and PersonsCache are now flushed few times.
- fixed a bug handling excessive recursion.
- improved the exceptions handling.

* What's new in release 2.5 "Ninja Thunderbolt" (15 May 2006)

[general]
- support for tv series episodes; see the README.series file.
- modified the DISCLAIMER.txt file to be compliant with the debian guidelines.
- fixed a bug in the get_first_movie.py script.
- Movie and Person instances are now hashable, so that they can be used as dictionary keys.
- modified functions analyze_title and build_title to support tv episodes.
- use isinstance for type checking.
- minor updates to the documentation.
- the imdbID for Movie and Person instances is now searched if either one of movieID/personID and title/name is provided.
- introduced the isSame() method for both Movie and Person classes, useful to compare objects by movieID/personID and accessSystem.
- __contains__() methods are now recursive.
- two new functions in the IMDbBase class, title2imdbID() and name2imdbID(), are used to get the imdbID, given a movie title or person name.
- two new functions in the helpers module, sortedSeasons() and sortedEpisodes(), useful to manage lists/dictionaries of tv series episodes.
- in the helpers module, the get_byURL() function can be used to retrieve a Movie or Person object for the given URL.
- renamed the "ratober" C module to "cutils".
- added CONTRIBUTORS.txt file.

[http data access system]
- fixed a bug regarding currentRole for tv series.
- fixed a bug about the "merchandising links" page.

[http and mobile data access systems]
- fixed a bug retrieving cover url for tv (mini) series.

[mobile data access system]
- fixed a bug with tv series titles.
- retrieves the number of episodes for tv series.

[local data access system]
- new get_episodes function in the cutils/ratober C module.
- search functions (both C and pure python) are now a lot faster.
- updated the documentation with work-arounds to make the mkdb program work with a recent set of plain text data files.

[sql data access system]
- uses the SQLObject ORM to support a wide range of database engines.
- added in the cutils C module the soundex() function, and a fall back Python only version in the parser.sql package.

* What's new in release 2.4 "Munich" (09 Feb 2006)

[general]
- strings are now unicode/utf8.
- unified Movie and Person classes.
- the strings used to store every kind of information about movies and persons are now modified (substituting titles and names references) only when it's really needed.
- speed improvements in functions modifyStrings, sortMovies, canonicalName, analyze_name, analyze_title.
- performance improvements in every data access system.
- removed the deepcopy of the data, updating Movie and Person information.
- moved the "ratober" C module in the imdb.parser.common package, being used by both "http" and "sql" data access systems.
- C functions in the "ratober" module are always case insensitive.
- the setup.py script contains a work-around to make installation go on even if the "ratober" C module can't be compiled (displaying a warning), since it's now optional.
- minor updates to documentation, to keep it in sync with changes in the code.
- the new helpers.py module contains functions useful to write IMDbPY-based programs.
- new doc file README.utf8, about unicode support.

[http data access system]
- the ParserBase class now inherits from sgmllib.SGMLParser, instead of htmllib.HTMLParser, resulting in a little improvement in parsing speed.
- fixed a bug in the parser for the "news" page for movies and persons.
- removed special handlers for entity and chardefs in the HTMLMovieParser class.
- fixed bugs related to non-ascii chars.
- fixed a bug retrieving the URL of the cover.
- fixed a nasty bug retrieving the title field.
- retrieve the 'merchandising links' page.
- support for the new "episodes cast" page for tv series.
- fixed a horrible bug retrieving guests information for tv series.

[sql data access system]
- fixed the imdbpy2sql.py script, to handle files with spurious lines.
- searches for names and titles are now much faster, if the imdb.parser.common.ratober C module is compiled and installed.
- imdbpy2sql.py now works also on partial data (i.e. if you've not downloaded every single plain text file).
- imdbpy2sql.py considers also a couple of files in the contrib directory.
- searching names and titles, only the first 5 chars returned from the SOUNDEX() SQL function are compared.
- should work if the database is set to unicode/utf-8.

[mobile data access system]
- fixed bugs related to non-ascii chars.
- fixed a bug retrieving the URL of the cover.
- retrieve currentRole/notes also for tv guest appearances.

[local data access system]
- it can work even if the "ratober" C module is not compiled; obviously the pure python substitute is painfully slow (a warning is issued).
* What's new in release 2.3 "Big Fish" (03 Dec 2005)

[general]
- made numerous keys uniform for Movie and Person objects.
- 'birth name' is now always in canonical form, and 'nick names' are always normalized; these changes also affect the sql data access system.

[http data access system]
- removed the 'imdb mini-biography by' key; the name of the author is now prepended to the 'mini biography' key.
- fixed an obscure bug using more than one access system (http in conjunction with mobile or httpThin).
- fixed a bug in amazon reviews.

[mobile data access system]
- corrected some bugs retrieving filmography and cast list.

[sql data access system]
- remove 'birth name' and 'nick names' from the list of 'akas'.
- in the SQL database, 'crewmembers' is now 'miscellaneous crew'.
- fixed a bug retrieving "guests" for TV Series.

* What's new in release 2.2 "The Thing" (17 Oct 2005)

[general]
- now the Person class has a 'billingPos' instance variable used to keep record of the position of the person in the list of credits (as an example, "Laurence Fishburne" is billed in 2nd position in the cast list for the "Matrix, The (1999)" movie).
- added two functions to the utils module, to sort respectively movies (by year/title/imdbIndex) and persons (by billingPos/name/imdbIndex).
- every data access system supports the 'adultSearch' argument and the do_adult_search() method to exclude the adult movies from your searches. By default, adult movies are always listed.
- renamed the scripts, appending the ".py" extension.
- added an "IMDbPY Powered" logo and a bitmap used by the Windows installer.
- now Person and Movie objects always convert name/title to the canonical format (Title, The).
- minor changes to the functions used to convert to "canonical format" names and titles; they should be faster and with better matches.
- 'title' is the first argument, instancing a Movie object (instead of 'movieID').
- 'name' is the first argument, instancing a Person object (instead of 'personID').
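
The canonical format mentioned in the 2.2 notes ("Title, The") can be illustrated with a small sketch. The helper name to_canonical_title and the short article list are assumptions made for this example only; IMDbPY's own conversion functions are not reproduced here and handle far more cases.

```python
# Hypothetical helper, not IMDbPY's implementation: move a leading
# English article to the end of the title ("The Matrix" -> "Matrix, The").
ARTICLES = ("the", "a", "an")  # assumed, deliberately short list


def to_canonical_title(title):
    """Return the title with a leading article moved to the end."""
    first, sep, rest = title.partition(" ")
    if sep and rest and first.lower() in ARTICLES:
        return "%s, %s" % (rest, first)
    # No leading article (or a one-word title): leave it unchanged.
    return title


print(to_canonical_title("The Matrix"))  # Matrix, The
print(to_canonical_title("Big Fish"))    # Big Fish
```

A year suffix such as "(1999)" is deliberately ignored in this sketch; a real conversion would need to split it off first.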
[http data access system]
- retrieves the 'guest appearances' page for TV series.
- fixed a bug retrieving newsgroup reviews urls.
- fixed a bug managing non-breaking spaces (they're truly a damnation!)
- fixed a bug with mini TV Series in people's biographies.
- now keywords are in the format 'bullet-time' and no longer 'Bullet Time'.

[mobile data access system]
- fixed a bug with direct hits, searching for a person's name.
- fixed a bug with languages and countries.

[local data access system]
- now cast entries are correctly sorted.
- new search system; it should return better matches in less time (searching people's name is still somewhat slow); it's also possible to search for "long imdb canonical title/name".
- fixed a bug retrieving information about a movie with the same person listed more than one time in a given role/duty (e.g., the same director for different episodes of a TV series). Now it works fine and it should also be a bit faster.
- 'notable tv guest appearances' in biography is now a list of Movie objects.
- writers are sorted in the right order.

[sql data access system]
- search results are now sorted in correct order; difflib is used to calculate strings similarity.
- new search SQL query and comparison algorithm; it should return much better matches.
- searches for only a surname now return much better results.
- fixed a bug in the imdbpy2sql.py script; now movie quotes are correctly managed.
- added another role, 'guests', for notable tv guest appearances.
- writers are sorted in the right order.
- put also the 'birth name' and the 'nick names' in the akanames table.

* What's new in release 2.1 "Madagascar" (30 Aug 2005)

[general]
- introduced the "sql data access system"; now you can transfer the whole content of the plain text data files (distributed by IMDb) into a SQL database (MySQL, so far).
- written a tool to insert the plain text data files in a SQL database.
- fixed a bug in items() and values() methods of Movie and Person classes.
- unified portions of code shared between "local" and "sql".

[http data access system]
- fixed a bug in the search_movie() and search_person() methods.
- parse the "external reviews", "newsgroup reviews", "misc links", "sound clips", "video clips", "amazon reviews", "news" and "photo sites" pages for movies.
- parse the "news" page for persons.
- fixed a bug retrieving personID and movieID within namesRefs and titlesRefs.

[local data access system]
- fixed a bug; 'producer' data were scanned twice.
- some tags were missing for the laserdisc entries.

[mobile data access system]
- fixed a bug retrieving cast information (sometimes introduced with "Cast overview" and sometimes with "Credited cast").
- fixed a bug in the search_movie() and search_person() methods.

* What's new in release 2.0 "Land Of The Dead" (16 Jul 2005)

[general]
- WARNING! Now, using http and mobile access methods, movie/person searches will include by default adult movie titles/pornstar names. You can still deactivate this feature by setting the adultSearch argument to false, or calling the do_adult_search() method with a false value.
- fixed a bug using the 'all' keyword of the 'update' method.

[http data access system]
- added the "recommendations" page.
- the 'notes' instance variable is now correctly used to store miscellaneous information about people in non-cast roles, replacing the 'currentRole' variable.
- the adultSearch initialization argument is by default true.
- you can supply the proxy to use with the 'proxy' initialization argument.
- retrieve the "plot outline" information.
- fixed a bug in the BasicMovieParser class, due to changes in the IMDb's html.
- the "rating details" parser now gathers information about the total number of voters, arithmetic mean, median and so on. The values are stored as integers and floats, and no longer as strings.
- dictionary keys in soundtrack are lowercase.
- fixed a bug with empty 'location' information.
[mobile data access system]
- number of votes, rating and top 250 rank are now integers/floats.
- retrieve the "plot outline" information.

[local data access system]
- number of votes, rating and top 250 rank are now integers/floats.

* What's new in release 1.9 "Ed Wood" (02 May 2005)

[general]
- introduced the new "mobile" data access system, useful for small systems. It should be from 2 to 20 times faster than "http" or "httpThin".
- the "http", "httpThin" and "mobile" data access systems can now search for adult movies. See the README.adult file.
- now it should work again with python 2.0 and 2.1.
- fixed a bug affecting performance/download time.
- unified some keywords amongst different data access systems.

[http data access system]
- fixed some bugs; now it retrieves names akas correctly.

* What's new in release 1.8 "Paths Of Glory" (24 Mar 2005)

[general]
- introduced a new data access system "httpThin", useful for systems with limited bandwidth and CPU power, like PDAs, hand-held devices and mobile phones.
- the setup.py script can be configured to not compile/install the local access system and the example scripts (useful for hand-held devices); introduced setup.cfg and MANIFEST.in files.
- updated the list of articles used to manage movie titles.
- removed the all_info tuples from Movie and Person classes, since the list of available info sets depends on the access system. I've added two methods to the IMDbBase class, get_movie_infoset() and get_person_infoset().
- removed the IMDbNotAvailable exception.
- unified some code in methods get_movie(), get_person() and update() in IMDbBase class.
- minor updates to the documentation; added a 46x46 PNG icon.
- documentation for small/mobile systems.

[Movie class]
- renamed the m['notes'] item of Movie objects to m['episodes'].

[Person class]
- the p.__contains__(m) method can be used to check if the p Person has worked in the m Movie.
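
The membership check described for the Person class can be sketched with toy classes. Everything below (the class bodies, the filmography attribute, the placeholder IDs) is an assumption made for illustration; it shows the general "m in p" idea rather than IMDbPY's actual Person and Movie classes.

```python
# Toy classes for illustration only; they are not IMDbPY's Person/Movie.
class Movie(object):
    def __init__(self, movieID, title):
        self.movieID = movieID
        self.title = title


class Person(object):
    def __init__(self, name, filmography=()):
        self.name = name
        self.filmography = list(filmography)

    def __contains__(self, movie):
        # "movie in person" is True when the person is credited in it.
        return any(m.movieID == movie.movieID for m in self.filmography)


# Placeholder IDs, made up for this example.
first = Movie("0000001", "Matrix, The (1999)")
second = Movie("0000002", "Saw (2004)")
p = Person("Fishburne, Laurence", [first])

print(first in p)   # True
print(second in p)  # False
```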
[local data access system]
- gather information about "laserdisc", "literature" and "business".
- fixed a bug in ratober.c; now the search_name() function handles search strings already in the "Surname, Name" format.
- two new methods, get_lastMovieID() and get_lastPersonID().

[http data access system]
- limit the number of results for the query; this will save a lot of bandwidth.
- fixed a bug retrieving the number of episodes of tv series.
- now it retrieves movies information about "technical specifications", "business data", "literature", "soundtrack", "dvd" and "locations".
- retrieves people information about "publicity" and "agent".

* What's new in release 1.7 "Saw" (04 Feb 2005)

[general]
- Person class has two new keys; 'canonical name' and 'long imdb canonical name', like "Gibson, Mel" and "Gibson, Mel (I)".
- now titles and names are always internally stored in the canonical format.
- search_movie() and search_person() methods return the "real" movieID or personID (handling aliases).
- Movie and Person objects have a 'notes' instance attribute, used to specify comments about the role of a person in a movie. The Movie class can also contain a ['notes'] item, used to store information about the runtime; e.g. (26 episodes).
- fixed minor bugs in the IMDbBase, Person and Movie classes.
- some performance improvements.

[http data access system]
- fixed bugs retrieving the currentRole.
- try to handle unicode chars; return unicode strings when required.
- now the searches return also "popular titles" and "popular names" from the new IMDb's search system.

[local data access system]
- information about movie connections is retrieved.
- support for multiple biographies.
- now it works with Python 2.2 or previous versions.
- fixed a minor glitch in the initialization of the ratober C module.
- fixed a pair of buffer overflows.
- fixed some (very rare) infinite loop bugs.
- it raises IMDbDataAccessError for (most of) I/O errors.
[Movie class]
- fixed a bug getting the "long imdb canonical title".

* What's new in release 1.6 "Ninja Commandments" (04 Jan 2005)

[general]
- now inside Movie and Person objects, the text strings (biography, movie plot, etc.) contain titles and names references, like "_Movie, The (1999)_ (qv)" or "'A Person' (qv)"; these references are transformed at access time with a user defined function.
- introduced _get_real_movieID and _get_real_personID methods in the IMDbBase class, to handle title/name aliases for the local access system.
- split the _normalize_id method in _normalize_movieID and _normalize_personID.
- fixed some bugs.

[Movie class]
- now you can access the 'canonical title' and 'long imdb canonical title' attributes, to get the movie title in the format "Movie Title, The".

[local data access system]
- title and name aliases now work correctly.
- now get_imdbMovieID and get_imdbPersonID methods should work in almost every case.
- people's akas are handled.

[http data access system]
- now the BasicMovieParser class can correctly gather the imdbID.

* What's new in release 1.5 "The Incredibles" (23 Dec 2004)

[local database]
- support a local installation of the IMDb database! WOW! Now you can download the plain text data files from http://imdb.com/interfaces.html and access that information through IMDbPY!

[general]
- movie titles and person names are "fully normalized"; Not "Matrix, The (1999)", but "The Matrix (1999)"; Not "Cruise, Tom" but "Tom Cruise".
- get_mop_infoSet() methods can now return a tuple with the dictionary data and a list of information sets they provided.

[http data access system]
- support for the new search system (yes, another one...)
- a lot of small fixes to stay up-to-date with the html of the IMDb web server.
- modified the personParser module so that it will no longer download both "filmoyear" and "maindetails" pages; now only the latter is parsed.
- movie search now correctly reports the movie year and index.
- gather "locations" information about a movie.
- modified the HTMLAwardsParser class so that it doesn't list empty entries.

* What's new in release 1.4 "The Village" (10 Nov 2004)

[http data access system]
- modified the personParser.HTMLMaindetailsParser class, because IMDb has changed the img tag for the headshot.
- now 'archive footage' is handled correctly.

[IMDb class]
- fixed minor glitches (missing "self" parameter in a couple of methods).

[misc]
- now distutils installs also the example scripts in ./bin/*

* What's new in release 1.3 "House of 1000 Corpses" (6 Jul 2004)

[http data access system]
- modified the BasicMovieParser and BasicPersonParser classes, because IMDb has removed the "pageflicker" from the html pages.

[general]
- the test suite was moved outside the tgz package.

* What's new in release 1.2 "Kill Bill" (2 May 2004)

[general]
- now it retrieves almost all the available information about movies and people!
- introduced the concept of "data set", to retrieve different sets of information about a movie/person (so that it's possible to fetch only the needed information).
- introduced a test suite, using the PyUnit (unittest) module.
- fixed a nasty typo; the analyze_title and build_title functions now use the strings 'tv mini series' and 'tv series' for the 'kind' key (previously the 'serie' word was used).
- new design; removed the mix-in class and used a factory pattern; imdb.IMDb is now a function, which returns an instance of a class, subclass of imdb.IMDbBase.
- introduced the build_name(name_dict) function in the utils module, which takes a dictionary and builds a long imdb name.
- fixed bugs in the analyze_name function; now it correctly raises an IMDbParserError exception for empty/all spaces strings.
- now the analyze_title function sets only the meaningful information (i.e.: no 'kind' or 'year' key, if they're not set)

[http data access system]
- removed all non-greedy regular expressions.
- removed all regular expressions in the movieParser module; now self.rawdata is no longer used to search "strange" matches.
- introduced a ParserBase class, used as base class for the parsers.
- retrieve information about the production status (pre-production, announced, in production, etc.)
- mpaa is now a string.
- now when an IMDbDataAccessError is raised it shows also the used proxy.
- minor changes to improve performance in the handle_data method of the HTMLMovieParser class.
- minor changes to achieve a major performance improvement in the BasicPersonParser class in the searchPersonParse module.

[Movie class]
- fixed a bug in isSameTitle method, now the accessSystem is correctly checked.
- fixed some typos.

[Person class]
- minor changes to the isSamePerson method (now it uses the build_name function).

* What's new in release 1.1 "Gigli" (17 Apr 2004)

[general]
- added support for persons (search & retrieve information about people).
- removed the dataSets module.
- removed the MovieTitle and the SearchMovieResults classes; now information about the title is stored directly in the Movie object and the search methods return simple lists (of Movie or Person objects).
- removed the IMDbTitleError exception.
- added the analyze_name() function in the imdb.utils module, which returns a dictionary with the 'name' and 'imdbIndex' keys from the given long imdb name string.

[http data access system]
- http search uses the new search system.
- moved the plotParser module content inside the movieParser module.
- fixed a minor bug handling AKAs for movie titles.

[IMDb class]
- introduced the update(obj) method of the IMDb class, to update the information of the given object (a Movie or Person instance).
- added the get_imdbURL(obj) method of the IMDb class, which returns the URL of the main IMDb page for the given object (a Movie or Person).
- renamed the 'kind' parameter of the IMDb class to 'accessSystem'.
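
The 1.1 notes describe analyze_name() as returning a dictionary with 'name' and 'imdbIndex' keys from a long imdb name. A rough sketch of that kind of split follows; split_long_name is a hypothetical stand-in written for this example, it is not the imdb.utils code, and it assumes the index is a roman numeral in trailing parentheses.

```python
# Hypothetical sketch, not the imdb.utils implementation: split a long
# imdb name such as "Gibson, Mel (I)" into its name and imdbIndex parts.
ROMAN_DIGITS = "IVXLCDM"


def split_long_name(long_name):
    text = long_name.strip()
    if not text:
        # Empty or all-spaces input is rejected outright.
        raise ValueError("empty name")
    result = {"name": text}
    if text.endswith(")") and " (" in text:
        name, _, index = text[:-1].rpartition(" (")
        # Only treat the suffix as an index if it looks like a roman numeral.
        if index and all(ch in ROMAN_DIGITS for ch in index):
            result = {"name": name.strip(), "imdbIndex": index}
    return result


print(split_long_name("Gibson, Mel (I)"))
print(split_long_name("Gibson, Mel"))
```

When no index is present, the sketch leaves the 'imdbIndex' key out entirely rather than storing an empty value.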
[Movie class]
- now __str__() returns only the short name; the summary() method returns a pretty-printed string for the Movie object.
- persons are no longer simple strings, but Person objects (the role/duty is stored in the currentRole variable of the object).
- isSameTitle(obj) method to compare two Movie objects even when not all information is gathered.
- new __contains__() method, to check if a given person was in a movie.

[misc]
- updated the documentation.
- corrected some syntax/grammar errors.

* What's new in release 1.0 "Equilibrium" (01 Apr 2004)

[general]
- first public release.
- retrieve data only from the web server.
- search only for movie titles.
imdbpy-6.8/docs/GPL.txt000066400000000000000000000431101351454127000147670ustar00rootroot00000000000000 GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price.
Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. 
This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. 
b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. 
You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. 
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. 
If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. 
The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Also add information on how to contact you by electronic and paper mail. 
If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Library General Public License instead of this License. imdbpy-6.8/docs/LICENSE.txt000066400000000000000000000016601351454127000154330ustar00rootroot00000000000000# IMDbPY NOTE: see also the recommendations in the "DISCLAIMER.txt" file. NOTE: for a list of persons who share the copyright over specific portions of code, see the "CONTRIBUTORS.txt" file. Copyright 2004-2017 Davide Alberani et al. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. 
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA imdbpy-6.8/docs/Makefile000066400000000000000000000011331351454127000152430ustar00rootroot00000000000000# Minimal makefile for Sphinx documentation # # You can set these variables from the command line. SPHINXOPTS = SPHINXBUILD = sphinx-build SPHINXPROJ = IMDbPY SOURCEDIR = . BUILDDIR = _build # Put it first so that "make" without argument is like "make help". help: @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) .PHONY: help Makefile # Catch-all target: route all unknown targets to Sphinx using the new # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). %: Makefile @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)imdbpy-6.8/docs/_static/000077500000000000000000000000001351454127000152335ustar00rootroot00000000000000imdbpy-6.8/docs/_static/.gitkeep000066400000000000000000000000001351454127000166520ustar00rootroot00000000000000imdbpy-6.8/docs/conf.py000066400000000000000000000115571351454127000151150ustar00rootroot00000000000000# -*- coding: utf-8 -*- # # Configuration file for the Sphinx documentation builder. # # This file does only contain a selection of the most common options. For a # full list see the documentation: # http://www.sphinx-doc.org/en/stable/config # -- Path setup -------------------------------------------------------------- # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. 
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))


# -- Project information -----------------------------------------------------

project = 'IMDbPY'
copyright = '2018, Davide Alberani, H. Turgut Uyar'
author = 'Davide Alberani, H. Turgut Uyar'

# The short X.Y version
version = ''
# The full version, including alpha/beta/rc tags
release = '6.6'


# -- General configuration ---------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'

# The master toctree document.
master_doc = 'index'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path .
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'

# A list of ignored prefixes for module index sorting.
modindex_common_prefix = ['imdb.']


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# The default sidebars (for documents that don't match any pattern) are
# defined by theme itself. Builtin themes are using these templates by
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
# 'searchbox.html']``.
#
# html_sidebars = {}


# -- Options for HTMLHelp output ---------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = 'IMDbPYdoc'


# -- Options for LaTeX output ------------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',

    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
}

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
#  author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, 'IMDbPY.tex', 'IMDbPY Documentation',
     'Davide Alberani, H. Turgut Uyar', 'manual'),
]


# -- Options for manual page output ------------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [ (master_doc, 'imdbpy', 'IMDbPY Documentation', [author], 1) ] # -- Options for Texinfo output ---------------------------------------------- # Grouping the document tree into Texinfo files. List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ (master_doc, 'IMDbPY', 'IMDbPY Documentation', author, 'IMDbPY', 'One line description of project.', 'Miscellaneous'), ] # -- Extension configuration ------------------------------------------------- imdbpy-6.8/docs/contributors/000077500000000000000000000000001351454127000163425ustar00rootroot00000000000000imdbpy-6.8/docs/contributors/credits.rst000066400000000000000000000273711351454127000205430ustar00rootroot00000000000000Credits ------- First of all, I want to thank all the package maintainers, and especially Ana Guerrero. Another big thanks to the developers who used IMDbPY for their projects and research; they can be found here: https://imdbpy.sourceforge.io/ecosystem.html Other very special thanks go to some people who followed the development of IMDbPY very closely, providing hints and insights: Ori Cohen, James Rubino, Tero Saarni, and Jesper Noer (for a lot of help, and also for the wonderful https://bitbucket.org/); and let's not forget all the translators on https://www.transifex.com/davide_alberani/imdbpy/ Below is a list of people who contributed with bug reports, small patches, and hints (kept in reverse order since IMDbPY 4.5): * Charlotte Weaver for a PR to extract posters from searches * Christian Clauss for some fixes to Python 3 compatibility * Enrique A for pull request to fix box office parser * Christian Graabæk Steenfeldt for a report about tv series ratings * h3llrais3r for a patch for Python 2 and encodings * Konstantin Danilov and guillaumedsde for a bug report about s3 dataset * mmuehlenhoff for a bug report about list of AKAs and release dates * ChakshuGautam for a bug report about MySQL import * jh888 for 
suggesting to switch to https * tlc for a bug report about search results * JohnnyD0101 and Kagandi for a bug report and a patch about currentRole and notes * Ali Momen Sani for a report about mini biography * David Runge for package tests and hints * antonioforte for bug reports and patches about XML output and SQL parser * Jakub Synowiec for multiple bug reports and patches * Piotr Staszewski for multiple bug reports and patches * tiagoaquinofl and rednibia for extensive debugging on SSL certificate issues * Tim King for a report about birth and death dates. * Lars Gustäbel for a report about series seasons. * Filip Bačić for a report about full-size headshot * Matthew Clapp for a report about pip installation * Jannik S for a report on tech parser * Brad Pirtle, Adrien C. and Markus-at-GitHub for improvements to full-size covers * Tim Belcher for a report about forgotten debug code. * Paul Jensen for many bug reports about tv series. * Andrew D Bate for documentation on how to reintroduce foreign keys. * yiqingzhu007 for a bug report about synopsis. * David Runge for managing the Arch Linux package. * enriqueav for fixes after the IMDb redesign. * Piotr Staszewski for a fix for external sites parser. * Mike Christopher for the user reviews parser. * apelord for a parser for full credits. * Mike Christopher for a patch for synopsis parser. * Damon Brodie for a bug report about technical parser. * Sam Petulla for a bug report about searching for keywords. * zoomorph for an improvement for parsing your own votes. * Fabrice Laporte for a bug report on setup.py. * Wael Sinno for a patch for parsing movie-links. * Tool Man, for a fix on sound track parsing. * Rafael Lopez for a series of fixes on top/bottom lists. * Derek Duoba for a bug report about XML output. * Cody Hatfield for a parser for the Persons's resume page. * Mystfit for a fix handling company names. * Troy Deck for a path for MySQL. * miles82 for a patch on metascore parsing. 
* Albert Claret for the parser of the critic reviews page. * Shobhit Singhal for fixes in parsing biographies and plots. * Dan Poirier for documentation improvements. * Frank Braam for a fix for MSSQL. * Darshana Umakanth for a bug report the search functions. * Osman Boyaci for a bug report on movie quotes. * Mikko Matilainen for a patch on encodings. * Roy Stead for the download_applydiffs.py script. * Matt Keenan for a report about i18n in search results. * belgabor for a patch in the goofs parser. * Ian Havelock for a bug report on charset identification. * Mikael Puhakka for a bug report about foreign language results in a search. * Wu Mao for a bug report on the GAE environment. * legrostdg for a bug report on the new search pages. * Haukur Páll Hallvarðsson for a patch on query parameters. * Arthur de Peretti-Schlomoff for a list of French articles and fixes to Spanish articles. * John Lambert, Rick Summerhill and Maciej for reports and fixes for the search query. * Kaspars "Darklow" Sprogis for an impressive amount of tests and reports about bugs parsing the plain text data files and many new ideas. * Damien Stewart for many bug reports about the Windows environment. * Vincenzo Ampolo for a bug report about the new imdbIDs save/restore queries. * Tomáš Hnyk for the idea of an option to reraise caught exceptions. * Emmanuel Tabard for ideas, code and testing on restoring imdbIDs. * Fabian Roth for a bug report about the new style of episodes list. * Y. Josuin for a bug report on missing info in crazy credits file. * Arfrever Frehtes Taifersar Arahesis for a patch for locales. * Gustaf Nilsson for bug reports about BeautifulSoup. * Jernej Kos for patches to handle "in production" information and birth/death years. * Saravanan Thirumuruganathan for a bug report about genres in mobile. * Paul Koan, for a bug report about DVD pages and movie references. * Greg Walters for a report about a bug with queries with too many results. 
* Olav Kolbu for tests and report about how the IMDb.com servers reply to queries made with and without cookies. * Jef "ofthelit", for a patch for the reduce.sh script bug reports for Windows. * Reiner Herrmann for benchmarks using SSD hard drives. * Thomas Stewart for some tests and reports about a bug with charset in the plain text data files. * Ju-Hee Bae for an important series of bug reports about the problems derived by the last IMDb's redesign. * Luis Liras and Petite Abeille for a report and a bugfix about imdbpy2sql.py used with SQLite and SQLObject. * Kevin S. Anthony for a bug report about episodes list. * Bhupinder Singh for a bug report about exception handling in Python 2.4. * Ronald Hatcher for a bug report on the GAE environment. * Ramusus for a lot of precious bug reports. * Laurent Vergne for a hint about InnoDB, MyISAM and foreign keys. * Israel Fruch for patches to support the new set of parsers. * Inf3cted MonkeY, for a bug report about 'vote details'. * Alexmipego, for suggesting to add a md5sum to titles and names. * belgabortm for a bug report about movies with multiple 'countries'. * David Kaufman for an idea to make the 'update' method more robust. * Dustin Wyatt for a bug with SQLite of Python 2.6. * Julian Scheid for bug reports about garbage in the ptdf. * Adeodato Simó for a bug report about the new imdb.com layout. * Josh Harding for a bug report about the new imdb.com layout. * Xavier Naidoo for a bug report about top250 and BeautifulSoup. * Basil Shubin for hints about a new helper function. * Mark Jeffery, for some help debugging a lxml bug. * Hieu Nguyen for a bug report about fetching real imdbIDs. * Rdian06 for a patch for movies without plot authors. * Tero Saarni, for the series 60 GUI and a lot of testing and debugging. * Ana Guerrero, for maintaining the official debian package. * H. Turgut Uyar for a number of bug reports and a lot of work on the test-suite. * Ori Cohen for some code and various hints. 
* Jesper Nøhr for a lot of testing, especially on 'sql'. * James Rubino for many bug reports. * Cesare Lasorella for a bug report about newer versions of SQLObject. * Andre LeBlanc for a bug report about airing date of tv series episodes. * aow for a note about some misleading descriptions. * Sébastien Ragons for tests and reports. * Sridhar Ratnakumar for info about PKG-INF. * neonrush for a bug parsing Malcolm McDowell filmography! * Alen Ribic for some bug reports and hints. * Joachim Selke for some bug reports with SQLAlchemy and DB2 and a lot of testing and debugging of the ibm_db driver (plus a lot of hints about how to improve the imdbpy2sql.py script). * Karl Newman for bug reports about the installer of version 4.5. * Saruke Kun and Treas0n for bug reports about 'Forbidden' errors from the imdb.com server. * Chris Thompson for some bug reports about summary() methods. * Mike Castle for performace tests with SQLite and numerous hints. * Indy (indyx) for a bug about series cast parsing using BeautifulSoup. * Yoav Aviram for a bug report about tv mini-series. * Arjan Gijsberts for a bug report and patch for a problem with movies listed in the Bottom 100. * Helio MC Pereira for a bug report about unicode. * Michael Charclo for some bug reports performing 'http' queries. * Amit Belani for bug reports about plot outline and other changes. * Matt Warnock for some tests with MySQL. * Mark Armendariz for a bug report about too long field in MySQL db and some tests/analyses. * Alexy Khrabrov, for a report about a subtle bug in imdbpy2sql.py. * Clark Bassett for bug reports and fixes about the imdbpy2sql.py script and the cutils.c C module. * mumas for reporting a bug in summary methods. * Ken R. Garland for a bug report about 'cover url' and a lot of other hints. * Steven Ovits for hints and tests with Microsoft SQL Server, SQLExpress and preliminary work on supporting diff files. * Fredrik Arnell for tests and bug reports about the imdbpy2sql.py script. 
* Arnab for a bug report in the imdbpy2sql.py script. * Elefterios Stamatogiannakis for the hint about transactions and SQLite, to obtain an impressive improvement in performances. * Jon Sabo for a bug report about unicode and the imdbpy2sql.py script and some feedback. * Andrew Pendleton for a report about a very hideous bug in the imdbpy2sql.py (garbage in the plain text data files + programming errors + utf8 strings + postgres). * Ataru Moroboshi ;-) for a bug report about role/duty and notes. * Ivan Kedrin for a bug report about the analyze_title function. * Hadley Rich for reporting bugs and providing patches for troubles parsing tv series' episodes and searching for tv series' titles. * Jamie R. Rytlewski for a suggestion about saving imbIDs in 'sql'. * Vincent Crevot, for a bug report about unicode support. * Jay Klein for a bug report and testing to fix a nasty bug in the imdbpy2sql.py script (splitting too large data sets). * Ivan Garcia for an important bug report about the use of IMDbPY within wxPython programs. * Kessia Pinheiro for a bug report about tv series list of episodes. * Michael G. Noll for a bug report and a patch to fix a bug retrieving 'plot keywords'. * Alain Michel, for a bug report about search_*.py and get_*.py scripts. * Martin Arpon and Andreas Schoenle for bug reports (and patches) about "runtime", "aka titles" and "production notes" information not being parsed. * none none (dclist at gmail.com) for a useful hint and code to retrieve a movie/person object, given an URL. * Sebastian Pölsterl, for a bug report about the cover url for tv (mini) series, and another one about search_* methods. * Martin Kirst for many hints and the work on the imdbpyweb program. * Julian Mayer, for a bug report and a patch about non-ascii chars. * Wim Schut and "eccentric", for bug reports and a patches about movies' cover url. * Alfio Ferrara, for a bug report about the get_first_movie.py script. 
* Magnus Lie Hetland for an hint about the searches in sql package. * Thomas Jadjewski for a bug report about the imdbpy2sql.py script. * Trevor MacPhail, for a bug report about search_* methods and the ParserBase.parse method. * Guillaume Wisniewski, for a bug report. * Kent Johnson, for a bug report. * Andras Bali, for the hint about the "plot outline" information. * Nick S. Novikov, who provided the Windows installer until I've managed to set up a Windows development environment. * Simone Bacciglieri, who downloaded the plain text data files for me. * Carmine Noviello, for some design hints. * "Basilius" for a bug report. * Davide for a bug report. .. _Contributors: CONTRIBUTORS.html imdbpy-6.8/docs/contributors/index.rst000066400000000000000000000040561351454127000202100ustar00rootroot00000000000000Contributors ============ Authors ------- People who contributed a substantial amount of work and share the copyright over some portions of the code: Davide Alberani erlug.linux.it> Main author and project leader. \H. Turgut Uyar tekir.org> The whole "http" data access system (using a DOM and XPath-based approach) is based on his work. The imdbpykit interface was mostly written by him and he holds the copyright over the whole code (with some portions shared with others). He provided the tox testsuite. Giuseppe "Cowo" Corbelli lugbs.linux.it> Provided a lot of code and hints to integrate IMDbPY with SQLObject, working on the imdbpy2sql.py script and the dbschema.py module. Beside Turgut, Giuseppe and me, the following people are listed as developers for the IMDbPY project on sourceforge and may share copyright on some (minor) portions of the code: Alberto Malagoli Developed the new web site, and detains the copyright of it, and provided helper functions and other code. Martin Kirst s1998.tu-chemnitz.de> Has done an important refactoring of the imdbpyweb program and shares with me the copyright on the whole program. 
Jesper Nøhr noehr.org>
  Provided extensive testing and some patches for the "http" data access
  system.

Joachim Selke tu-bs.de>
  Many tests on IBM DB2 and work on the CSV support.

Timo Schulz users.sourceforge.net>
  Did a lot of work on "sql", DB2 and CSV support, plus extensive analysis
  aimed at diff files support.

Roy Stead gmail.com>
  Provided the download_applydiffs.py script.


Translators
-----------

Additional translations were provided by:

- strel (Spanish)
- Stéphane Aulery (French)
- RainDropR (Arabic)
- Atanas Kovachki (Bulgarian)
- lukophron (French)
- Raphael (German)


.. include:: credits.rst


Donations
---------

We'd like to thank the following people for their donations:

- Paulina Wadecka
- Oleg Peil
- Diego Sarmentero
- Fabian Winter
- Lacroix Scott
imdbpy-6.8/docs/devel/000077500000000000000000000000001351454127000147045ustar00rootroot00000000000000
imdbpy-6.8/docs/devel/extend.rst000066400000000000000000000134721351454127000167340ustar00rootroot00000000000000
How to extend
-------------

To introduce a new data access system, you have to write a new package
inside the "parser" package; this new package must provide a subclass
of the imdb.IMDbBase class which must define at least the following methods:

``_search_movie(title)``
  To search for a given title; must return a list of
  (movieID, {movieData}) tuples.

``_search_episode(title)``
  To search for a given episode title; must return a list of
  (movieID, {movieData}) tuples.

``_search_person(name)``
  To search for a given name; must return a list of
  (personID, {personData}) tuples.

``_search_character(name)``
  To search for a given character's name; must return a list of
  (characterID, {characterData}) tuples.

``_search_company(name)``
  To search for a given company's name; must return a list of
  (companyID, {companyData}) tuples.

``get_movie_*(movieID)``
  A set of methods, one for every set of information defined for a Movie
  object; should return a dictionary with the related information.
This dictionary can contain some optional keys: - 'data': must be a dictionary with the movie info - 'titlesRefs': a dictionary of 'movie title': movieObj pairs - 'namesRefs': a dictionary of 'person name': personObj pairs ``get_person_*(personID)`` A set of methods, one for every set of information defined for a Person object; should return a dictionary with the related information. ``get_character_*(characterID)`` A set of methods, one for every set of information defined for a Character object; should return a dictionary with the related information. ``get_company_*(companyID)`` A set of methods, one for every set of information defined for a Company object; should return a dictionary with the related information. ``_get_top_bottom_movies(kind)`` ``kind`` can be either 'top' or 'bottom'; returns the related list of movies. ``_get_keyword(keyword)`` Return a list of Movie objects with the given keyword. ``_search_keyword(key)`` Return a list of keywords similar to the given key. ``get_imdbMovieID(movieID)`` Convert the given movieID to a string representing the imdbID, as used by the IMDb web server (e.g.: '0094226' for Brian De Palma's "The Untouchables"). ``get_imdbPersonID(personID)`` Convert the given personID to a string representing the imdbID, as used by the IMDb web server (e.g.: '0000154' for "Mel Gibson"). ``get_imdbCharacterID(characterID)`` Convert the given characterID to a string representing the imdbID, as used by the IMDb web server (e.g.: '0000001' for "Jesse James"). ``get_imdbCompanyID(companyID)`` Convert the given companyID to a string representing the imdbID, as used by the IMDb web server (e.g.: '0071509' for "Columbia Pictures [us]"). ``_normalize_movieID(movieID)`` Convert the provided movieID into a format suitable for internal use (e.g.: convert a string to an integer).
NOTE: As a rule of thumb you *always* need to provide a way to convert a "string representation of the movieID" into the internally used format, and the internally used format should *always* be convertible to a string, one way or another. Rationale: A movieID can be passed from the command line, or from a web browser. ``_normalize_personID(personID)`` idem ``_normalize_characterID(characterID)`` idem ``_normalize_companyID(companyID)`` idem ``_get_real_movieID(movieID)`` Return the true movieID; useful to handle title aliases. ``_get_real_personID(personID)`` idem ``_get_real_characterID(characterID)`` idem ``_get_real_companyID(companyID)`` idem The class should raise the appropriate exceptions, when needed: - ``IMDbDataAccessError`` must be raised when you cannot access the resource you need to retrieve movie info or you're unable to do a query (this is *not* the case when a query returns zero matches: in this situation an empty list must be returned). - ``IMDbParserError`` should be raised when an error occurs while parsing some data. Now you have to modify the ``imdb.IMDb`` function so that, when the right data access system is selected with the "accessSystem" parameter, an instance of your newly created class is returned. For example, if you want to call your new data access system "mysql" (meaning that the data are stored in a MySQL database), you have to add to the imdb.IMDb function something like: .. code-block:: python if accessSystem == 'mysql': from .parser.mysql import IMDbMysqlAccessSystem return IMDbMysqlAccessSystem(*arguments, **keywords) where "parser.mysql" is the package you've created to access the local installation, and "IMDbMysqlAccessSystem" is the subclass of imdb.IMDbBase. Then it's possible to use the new data access system like: .. code-block:: python from imdb import IMDb i = IMDb(accessSystem='mysql') results = i.search_movie('the matrix') print(results) ..
note:: This is a somewhat misleading example: we already have a data access system for SQL databases (it's called 'sql' and it supports MySQL, amongst others). Maybe I'll find a better example... A specific data access system implementation can define its own methods. As an example, the IMDbHTTPAccessSystem that is in the parser.http package defines the method ``set_proxy()`` to manage the use of a web proxy; you can use it this way: .. code-block:: python from imdb import IMDb i = IMDb(accessSystem='http') # the 'accessSystem' argument is not # really needed, since "http" is the default. i.set_proxy('http://localhost:8080/') A list of special methods provided by the imdb.IMDbBase subclass, along with their description, is always available by calling the ``get_special_methods()`` method of the IMDb class: .. code-block:: python i = IMDb(accessSystem='http') print(i.get_special_methods()) will print a dictionary with the format:: {'method_name': 'method_description', ...} imdbpy-6.8/docs/devel/index.rst000066400000000000000000000042671351454127000165560ustar00rootroot00000000000000Development =========== If you intend to do development on the IMDbPY package, it's recommended that you create a virtual environment for it. For example:: python -m venv ~/.virtualenvs/imdbpy . ~/.virtualenvs/imdbpy/bin/activate In the virtual environment, install IMDbPY in editable mode and include the extra packages. In the top level directory of the project (where the :file:`setup.py` file resides), run:: pip install -e .[dev,doc,test] .. packages linguistics Defines some functions and data useful to smartly guess the language of a movie title (internally used). parser (package) A package containing a package for every data access system implemented. http (package) Contains the IMDbHTTPAccessSystem class which is a subclass of the imdb.IMDbBase class; it provides the methods used to retrieve and manage data from the web server (using, in turn, the other modules in the package).
It defines methods to get a movie and to search for a title. The parser.sql package manages the access to the data in the SQL database, created with the imdbpy2sql.py script; see the README.sqldb file. The dbschema module contains table definitions and some useful functions. The helpers module contains functions and other goodies not directly used by the IMDbPY package, but that can be useful to develop IMDbPY-based programs. I wanted to stay independent from the source of the data for a given movie/person/character/company, so the :func:`imdb.IMDb` function returns an instance of a class that provides specific methods to access a given data source (web server, SQL database, etc.). Unfortunately this means that the ``movieID`` in the :class:`Movie` class, the ``personID`` in the :class:`Person` class, and the ``characterID`` in the :class:`Character` class depend on the data access system being used. So, when a movie, person, or character object is instantiated, the ``accessSystem`` instance variable is set to a string identifying the data access system in use. .. toctree:: :maxdepth: 2 :caption: Contents: extend test translate release imdbpy-6.8/docs/devel/release.rst000066400000000000000000000044711351454127000170640ustar00rootroot00000000000000How to make a release ===================== **During development** *setup.cfg* The ``egg_info`` section must include the lines below:: [egg_info] tag_build = dev tag_date = true *setup.py* The ``version`` variable must be set to the **next** version. *imdb/__init__.py* When a major fix or feature is committed, the ``VERSION`` and ``__version__`` variables must be updated to something in the form *{next.version}devISO8601DATE* (not mandatory, but...) *docs/Changelog.rst* When a major fix or feature is committed, the changelog must be updated. **When a new release is planned** *setup.cfg* In the ``egg_info`` section, the lines mentioned above must be commented out. *setup.py* Not touched.
*imdb/__init__.py* The *devISO8601DATE* part must be removed from the version variables. *docs/Changelog.rst* The date of the release has to be added. **How to release** - Commit the above changes. - Add an annotated tag like *major.minor*; e.g.: ``git tag -a 6.3`` (the tag message is not important). - ``python3 setup.py sdist`` - ``python3 setup.py bdist_wheel`` - ``git push`` - ``git push --tags`` - Don't forget to push both sources and tags to both the GitHub and Bitbucket repositories (they are kept in sync). - Upload to PyPI: ``twine upload dist/IMDbPY-*`` (you probably need a recent version of twine and the appropriate ~/.pypirc file) - The new tar.gz must also be uploaded to https://sourceforge.net/projects/imdbpy/ (along with a new "news"). **Communication** - access the web site with: ``sftp ${your-sourceforge-username}@frs.sourceforge.net`` and move to the *imdbpy_web/htdocs/* directory - download *index.html* and add an *article* section, removing one or more of the old ones - upload the page - add a news item on https://sourceforge.net/p/imdbpy/news/new - send an email to imdbpy-devel@lists.sourceforge.net and imdbpy-help@lists.sourceforge.net **After the release** *setup.cfg* Uncomment the two lines again. *setup.py* Bump the ``version`` variable. *imdb/__init__.py* Bump the ``VERSION`` and ``__version__`` variables, adding the *devISO8601DATE* string again. *docs/Changelog.rst* Add a new section for the next release, on top. After that, you can commit the above changes with a message like "version bump" imdbpy-6.8/docs/devel/test.rst000066400000000000000000000033231351454127000164160ustar00rootroot00000000000000.. _testing: How to test =========== IMDbPY has a test suite based on `pytest`_.
The simplest way to run the tests is to run the following command in the top level directory of the project:: pytest You can execute a specific test module:: pytest tests/test_http_movie_combined.py Or execute test functions that match a given keyword:: pytest -k cover make ---- A :file:`Makefile` is provided for easier invocation of jobs. The following targets are defined (among others, run "make" to see the full list): test Run tests quickly with the default Python. lint Check style with flake8. docs Generate Sphinx HTML documentation, including API docs. coverage Check code coverage quickly with the default Python. clean Clean everything. tox --- Multiple test environments can be tested using tox:: tox This will test all the environments listed in the :file:`tox.ini` file. If you want to run all tests for a specific environment, for example Python 3.4, supply it as an argument to tox:: tox -e py34 You can supply commands that will be executed in the given environment. For example, to run the test functions that have the string "cover" in their names using pypy3, execute:: tox -e pypy3 -- pytest -k cover Or to get a Python prompt under Python 3.5 (with IMDbPY and all dependencies already installed), execute:: tox -e py35 -- python S3 dataset ---------- The tests will use the HTTP access system by default. If you would also like to test the database generated from the S3 dataset, define the ``IMDBPY_S3_URI`` environment variable:: IMDBPY_S3_URI='postgres://imdb@localhost/imdb' pytest This will run the tests for both HTTP and S3 access systems. .. _pytest: https://pytest.org/ imdbpy-6.8/docs/devel/translate.rst000066400000000000000000000023241351454127000174340ustar00rootroot00000000000000.. _translate: How to translate ---------------- ..
note:: You can (but you don't have to) use Transifex to manage/coordinate your translations: http://www.transifex.net/projects/p/imdbpy/ The :mod:`imdb.locale` package contains some scripts that are useful for building your own internationalization files: - The :file:`generatepot.py` script should be used only when the DTD is changed; it's used to create the :file:`imdbpy.pot` file (the one that gets shipped is always up-to-date). - You can copy the :file:`imdbpy.pot` file as your language's ``.po`` file (for example :file:`imdbpy-fr.po` for French) and modify it according to your language. - Then you have to run the :file:`rebuildmo.py` script (which is automatically executed at install time) to create the ``.mo`` files. If you need to upgrade an existing translation, after changes to the ``.pot`` file (usually because the DTD was changed), you can use the ``msgmerge`` utility which is part of the GNU gettext suite:: msgmerge -N imdbpy-fr.po imdbpy.pot > new-imdbpy-fr.po If you create a new translation or update an existing one, you can send it to the mailing list, for inclusion in upcoming releases. imdbpy-6.8/docs/disclaimer.rst000066400000000000000000000007361351454127000164610ustar00rootroot00000000000000IMDbPY and its authors are not affiliated with Internet Movie Database Inc. IMDb is a trademark of Internet Movie Database Inc., and all content and data included on the IMDb's site is the property of IMDb or its content suppliers and protected by United States and international copyright laws. Please read the IMDb's conditions of use on their website: - https://www.imdb.com/conditions - https://www.imdb.com/licensing - any other notice on the https://www.imdb.com/ site imdbpy-6.8/docs/faqs.rst000066400000000000000000000134421351454127000152750ustar00rootroot00000000000000FAQs ==== :Q: Is IMDbPY compatible with Python 3? :A: Yes. Actually, the versions after 6.0 are compatible only with Python 3. 
If you need an older, unmaintained version for Python 2, see the imdbpy-legacy branch in the repository. :Q: Why is the movieID (and other IDs) used in the old "sql" database not the same as the ID used on the IMDb.com site? :A: First, a bit of nomenclature: "movieID" is the term we use for a unique identifier used by IMDbPY to manage a single movie (similar terms for other kinds of data such as "personID" for persons). An "imdbID" is the term we use for a unique identifier that the IMDb.com site uses for the same kind of data (e.g.: the 7-digit number in tt0094226, as seen in the URL for "The Untouchables"). When using IMDbPY to access the web ("http" data access system), movieIDs and imdbIDs are the same thing; beware that in this case a movieID is a string, with the leading zeroes. Unfortunately, when populating a SQL database with data from the plain text data files, we don't have access to imdbIDs, since they are not distributed at all, and so we have to generate the movieIDs ourselves (they are the "id" columns in tables like "title" or "name"). This means that these values are valid only for your current database: if you update it with a newer set of plain text data files, these IDs will surely change (and, by the way, they are integers). It's also obvious, now, that you can't exchange IDs between the "http" and the "sql" data access systems, and similarly you can't use imdbIDs with your local database or vice-versa. :Q: When using a SQL database, what's the "imdb_id" (or something like that) column in tables like "title", "name" and so on? :A: It's internally used by IMDbPY to remember the imdbID of a movie (the one used by the web site), once it has been encountered. This way, if IMDbPY is asked again about the imdbID of a movie (or person, or ...), it won't have to contact the web site again. When accessing the database, you'll use the numeric value of the "id" column, e.g. "movieID".
Note that, to update the SQL database, you have to access it as a user with write permission. As a bonus, when possible, the values of the imdbIDs are saved between updates of the SQL database (using the imdbpy2sql.py script). Beware that it's tricky and not always possible, but the script does its best to succeed. :Q: But what if I really need the imdbIDs, to use in my database? :A: No, you don't. Search for a title, get its information. Be happy! :Q: I have a great idea: Write a script to fetch all the imdbIDs from the web site! Can't you do it? :A: Yeah, I can. But I won't. :-) It would be quite easy to map every title on the web to its imdbID, but there are still lots of problems. First of all, every user will end up doing it for their own copy of the plain text data files (and this will make the imdbpy2sql.py script painfully slow and prone to all sorts of problems). Moreover, the imdbIDs are unique and never reused, true, but movie titles *do* change: to fix typos, to override working titles, to cope with a new movie with the same title released in the same year, not to mention cancelled or postponed movies. Other than that, we'd have to do the same for persons, characters, and companies. Believe me: it doesn't make sense. Work on your local database using your movieIDs (or even better: don't worry about movieIDs and think in terms of searches and Movie instances!) and retrieve imdbIDs only in the rare circumstances when you really need them (see the next FAQ). Repeat after me: I DON'T NEED ALL THE imdbIDs. :-) :Q: When using a SQL database, how can I convert a movieID (whose value is valid only locally) to an imdbID (the ID used by the imdb.com site)? :A: Various functions can be used to convert a movieID (or personID or other IDs) to the imdbID used by the web site. Example: .. code-block:: python from imdb import IMDb ia = IMDb('sql', uri=URI_TO_YOUR_SQL_DATABASE) movie = ia.search_movie('The Untouchables')[0] # a Movie instance.
print('The movieID for The Untouchables:', movie.movieID) print('The imdbID used by the site:', ia.get_imdbMovieID(movie.movieID)) print('Same ID, smarter function:', ia.get_imdbID(movie)) It goes without saying that the ``get_imdbMovieID`` method has some sibling methods: ``get_imdbPersonID``, ``get_imdbCompanyID`` and ``get_imdbCharacterID``. Also notice that the ``get_imdbID`` method is smarter, and takes any kind of instance (the other functions need a movieID, personID, ...). Another method that will try to retrieve the imdbID is ``get_imdbURL``, which works like ``get_imdbID`` but returns a URL. In case of problems, these methods will return None. :Q: I have a movie title (in the format used by the plain text data files) or other kind of data (like a person/character/company name) and I want to get its imdbID. How can I do it? :A: The safest thing is probably to do a normal search on IMDb (using the "http" data access system of IMDbPY) and see if the first item is the correct one. You can also try the ``title2imdbID`` method (and similar) of the IMDb instance (no matter if you're using "http" or "sql"), but expect some failures, in which case it will return None. :Q: I have a URL (of a movie, person or something else), how can I get a Movie/Person/... instance? :A: Import the ``imdb.helpers`` module and use the ``get_byURL`` function. :Q: I'm writing an interface based on IMDbPY and I have problems handling encoding, character conversions, replacements of references and so on. :A: See the many functions in the imdb.helpers module. imdbpy-6.8/docs/goodies/000077500000000000000000000000001351454127000152365ustar00rootroot00000000000000imdbpy-6.8/docs/goodies/README.txt000066400000000000000000000014651351454127000167420ustar00rootroot00000000000000 IMDbPY's goodies ================ Useful shell scripts, especially for developers. See the comments at the top of the files for usage and configuration options. download-from-s3: download the new alternative interface dataset.
s3-reduce: create smaller versions of .tsv.gz files. applydiffs.sh: Bash script to apply patches to a set of IMDb's plain text data files. You can use this script to apply the diffs files distributed on a (more or less) weekly basis by IMDb. download_applydiffs.py: courtesy of Roy Stead, download and apply diff files (especially suited for a Windows environment). reduce.sh: Bash script useful to create a "slimmed down" version of the IMDb's plain text data files. It's useful to create shorter versions of the plain text data files, to test the imdbpy2sql.py script faster. imdbpy-6.8/docs/goodies/applydiffs.sh000077500000000000000000000027661351454127000177510ustar00rootroot00000000000000#!/bin/sh # # applydiffs.sh: Bash script to apply patches to a set of # IMDb's plain text data files. # # Usage: copy this script into the directory with the plain text # data files and run it passing a list of diffs-file(s) as # arguments. # It's possible that the plain text data files will be left # in an inconsistent state, so a backup is probably a good idea. # # Copyright: 2009-2010 Davide Alberani # # This program is released under the terms of the GNU GPL 2 or later license. # if [ $# -lt 1 ] ; then echo "USAGE: $0 diffs-file [diffs-file...]" echo " Beware that diffs-file must be sorted from the older to the newer!" exit 1 fi COMPRESSION="1" ALL_DIFFS="$@" for DIFFS in "$@" do rm -rf diffs echo -n "Unpacking $DIFFS..." tar xfz "$DIFFS" echo " done!" for DF in diffs/*.list do fname="$(basename "$DF")" if [ -f "$fname" ] ; then wasUnpacked=1 applyTo="$fname" elif [ -f "$fname.gz" ] ; then wasUnpacked=0 applyTo="$fname.gz" else echo "NOT applying: $fname doesn't exist." continue fi if [ $wasUnpacked -eq 0 ] ; then echo -n "unzipping $applyTo..." gunzip "$applyTo" echo "done!" fi echo -n "patching $fname with $DF..." patch -s "$fname" "$DF" if [ $? -ne 0 ] ; then echo "FAILED!" continue fi echo "done!"
done echo "finished with $DIFFS" echo "" done rm -rf diffs for lfile in *.list do echo -n "gzipping $lfile..." gzip -$COMPRESSION "$lfile" echo "done!" done imdbpy-6.8/docs/goodies/download-from-s3000077500000000000000000000002511351454127000202550ustar00rootroot00000000000000#!/bin/sh today="`date +'%Y-%m-%d'`" wget --mirror --no-parent -A tsv.gz --no-host-directories --directory-prefix=imdb-dataset-${today} https://datasets.imdbws.com/ imdbpy-6.8/docs/goodies/download_applydiffs.py000077500000000000000000000537321351454127000216550ustar00rootroot00000000000000#!/usr/bin/env python3 # -*- coding: UTF-8 -*- # This script downloads and applies any and all imdb diff files which # have not already been applied to the lists in the ImdbListsPath folder # # NOTE: this is especially useful in a Windows environment; you have # to modify the paths in the 'Script configuration' section below, # according to your needs. # # The script will check the imdb list files (compressed or uncompressed) # in ImdbListsPath and assume that the imdb lists were most recently downloaded # or updated based on the most recently modified list file in that folder. # # In order to run correctly, the configuration section below needs to be # set to the location of the imdb list files and the commands required to # unGzip, UnTar, patch and Gzip files. # # Optional configuration settings are to set the imdb diff files download and/or # backup folders. If you do not want to keep or back up the downloaded imdb diff # files then set keepDiffFiles to False and diffFilesBackupFolder to None. # # If RunAfterSuccessfulUpdate is set to a value other than None then the program # specified will be run after the imdb list files have been successfully updated. # This enables, for example, the script to automatically run imdbPy to rebuild # the database once the imdb list files have been updated.
# # If a specific downloaded imdb diff file cannot be applied correctly then this script # will fail as gracefully as possible. # # Copyright 2013 (C) Roy Stead # Released under the terms of the GPL license. # import os import shutil import subprocess import re import logging from datetime import datetime, timedelta from ftplib import FTP from random import choice ############################################# # Script configuration # ############################################# # The local folders where imdb list and diffs files are stored # # If ImdbDiffsPath is set to None then a working folder, "diffs" will be created as a sub-folder of ImdbListsPath # and will be cleaned up afterwards if you also set keepDiffFiles to False ImdbListsPath = "Z:\\MovieDB\\data\\lists" ImdbDiffsPath = None # The path to the logfile, if desired logfile = 'Z:\\MovieDB\\data\\logs\\update.log' # Define the system commands to unZip, unTar, Patch and Gzip a file # Values are substituted into these template strings at runtime, in the order indicated # # Note that this script REQUIRES that the program used to apply patches MUST return 0 on success and non-zero on failure # unGzip="\"C:/Program Files/7-Zip/7z.exe\" e %s -o%s" # params = archive, destination folder unTar=unGzip # params = archive, destination folder applyPatch="\"Z:/MovieDB/Scripts/patch.exe\" --binary --force --silent %s %s" # params = listfile, diffsfile progGZip="\"Z:/MovieDB/Scripts/gzip.exe\" %s" # param = file to Gzip # Specify a program to be run after a successful update of the imdb lists, # such as a command line to execute imdbPy to rebuild the db from the updated imdb list files # # Set to None if no such program should be run RunAfterSuccessfulUpdate="\"Z:\\MovieDB\\Scripts\\Update db from imdb lists.bat\"" # Folder to copy downloaded imdb diff files to once they have been successfully applied # Note that ONLY diff files which are successfully applied will be backed up.
# # Set to None if no such folder diffFilesBackupFolder=None # Set keepDiffFiles to False if the script is to delete ImdbDiffsPath and all its files when it's finished # # If set to False and diffFilesBackupFolder is not None then diff files will be backed up before being deleted # (and will not be deleted if there's any problem with backing up the diff files) keepDiffFiles=True # Possible FTP servers for downloading imdb diff files and the path to the diff files on each server ImdbDiffsFtpServers = [ \ {'url': "ftp.fu-berlin.de", 'path': "/pub/misc/movies/database/diffs"}, \ # {'url': "ftp.sunet.se", 'path': "/pub/tv+movies/imdb/diffs"}, \ # Swedish server isn't kept up to date {'url': "ftp.funet.fi", 'path': "/pub/mirrors/ftp.imdb.com/pub/diffs"} ] # Finnish server tends to be updated first ############################################# # Script Code # ############################################# logger = None # Returns the date of the most recent Friday # The returned datetime object contains ONLY date information, all time data is set to zero def previousFriday(day): friday = datetime(day.year, day.month, day.day) - timedelta(days=day.weekday()) + timedelta(days=4) # Saturday and Sunday are a special case since Python's day of the week numbering starts at Monday = 0 # Note that if day falls on a Friday then the "previous friday" for that date is the same date if day.weekday() <= 4: friday -= timedelta(weeks=1) return friday # Delete all files and subfolders in the specified folder as well as the folder itself def deleteFolder(folder): if os.path.isdir(folder): shutil.rmtree(folder) if os.path.isdir(folder): os.rmdir(folder) # Create folder and as many parent folders as are needed to create the full path # Returns 0 on success or -1 on failure def mktree(path): import os.path as os_path paths_to_create = [] while not os_path.lexists(path): paths_to_create.insert(0, path) head,tail = os_path.split(path) if len(tail.strip())==0: # Just in case path ends with a / or \
path = head head,tail = os_path.split(path) path = head for path in paths_to_create: try: os.mkdir(path) except Exception: logger.exception("Error trying to create %s" % path) return -1 return 0 # Downloads and applies all imdb diff files which have not yet been applied to the current imdb lists def applyDiffs(): global keepDiffFiles, ImdbListsPath, ImdbDiffsPath, diffFilesBackupFolder global unGzip, unTar, applyPatch, progGZip, RunAfterSuccessfulUpdate, ImdbDiffsFtpServers if not os.path.exists(ImdbListsPath): logger.critical("Please edit this script file and set ImdbListsPath to the current location of your imdb list files") return # If no ImdbDiffsPath is specified, create a working folder for the diffs file as a sub-folder of the imdb lists repository if ImdbDiffsPath is None: ImdbDiffsPath = os.path.join(ImdbListsPath,"diffs") # Get the date of the most recent Friday (i.e. the most recently released imdb diffs) # Note Saturday and Sunday are a special case since Python's day of the week numbering starts at Monday = 0 day = datetime.now() mostrecentfriday = previousFriday(day) # Now get the date when the imdb list files in ImdbListsPath were most recently updated. # # At the end of this loop, day will contain the most recent date that a list file was # modified (Note: modified, not created, since Windows changes the creation date on file copies) # # This approach assumes that since the imdb list files were last downloaded or updated nobody has # unzipped a compressed list file and then re-zipped it again without updating all of the imdb # list files at that time (and also that nobody has manually changed the file last modified dates). # Which seem like reasonable assumptions. # # An even more robust approach would be to look inside each zipfile and read the date/time stamp # from the first line of the imdb list file itself but that seems like overkill to me.
day = None for f in os.listdir(ImdbListsPath): if re.match(r".*\.list\.gz",f) or re.match(r".*\.list",f): try: t = os.path.getmtime(os.path.join(ImdbListsPath,f)) d = datetime.fromtimestamp(t) if day is None: day = d elif d > day: day = d except Exception as e: logger.exception("Unable to read last modified date for file %s" % f) if day is None: # No imdb list files found, or unable to read their modification dates logger.critical("Problem: Unable to check imdb lists in folder %s" % ImdbListsPath) logger.critical("Solutions: Download imdb lists, change ImdbListsPath value in this script or change access settings for that folder.") return # Last update date for imdb list files is the Friday before they were downloaded imdbListsDate = previousFriday(day) logger.debug("imdb lists updated up to %s" % imdbListsDate) if imdbListsDate >= mostrecentfriday: logger.info("imdb database is already up to date") return # Create diffs file folder if it does not already exist if not os.path.isdir(ImdbDiffsPath): try: os.mkdir(ImdbDiffsPath) except Exception as e: logger.exception("Unable to create folder for imdb diff files (%s)" % ImdbDiffsPath) return # Next we check for the imdb diff files and download any which we need to apply but which are not already downloaded diffFileDate = imdbListsDate haveFTPConnection = False while 1: if diffFileDate >= mostrecentfriday: break diff = "diffs-%s.tar.gz" % diffFileDate.strftime("%y%m%d") diffFilePath = os.path.join(ImdbDiffsPath, diff) logger.debug("Need diff file %s" % diff) if not os.path.isfile(diffFilePath): # diff file is missing so we need to download it so first make sure we have an FTP connection if not haveFTPConnection: try: # Choose a random ftp server from which to download the imdb diff file(s) ImdbDiffsFtpServer = choice(ImdbDiffsFtpServers) ImdbDiffsFtp = ImdbDiffsFtpServer['url'] ImdbDiffsFtpPath = ImdbDiffsFtpServer['path'] # Connect to chosen imdb FTP server ftp = FTP(ImdbDiffsFtp) ftp.login() # Change to the diffs folder on the imdb
files server ftp.cwd(ImdbDiffsFtpPath) haveFTPConnection = True except Exception as e: logger.exception("Unable to connect to FTP server %s" % ImdbDiffsFtp) return # Now download the diffs file logger.info("Downloading ftp://%s%s/%s" % ( ImdbDiffsFtp, ImdbDiffsFtpPath, diff )) diffFile = open(diffFilePath, 'wb') try: ftp.retrbinary("RETR " + diff, diffFile.write) diffFile.close() except Exception as e: # Unable to download diff file. This may be because it's not yet available but is due for release today code = str(e).split(' ', 1)[0] if code == '550' and diffFileDate == imdbListsDate: logger.info("Diff file %s not yet available on the imdb diffs server: try again later" % diff) else: logger.exception("Unable to download %s" % diff) # Delete the diffs file placeholder since the file did not download diffFile.close() os.remove(diffFilePath) if os.path.isdir(ImdbDiffsPath) and not keepDiffFiles: os.rmdir(ImdbDiffsPath) return logger.info("Successfully downloaded %s" % diffFilePath) # Check for the following week's diff file diffFileDate += timedelta(weeks=1) # Close FTP connection if we used one if haveFTPConnection: ftp.close() # At this point, we know we need to apply one or more diff files and we # also know that we have all of the diff files which need to be applied # so next step is to uncompress our existing list files to a folder so # we can apply diffs to them. # # Note that the script will ONLY apply diffs if ALL of the diff files # needed to bring the imdb lists up to date are available. It will, however, # partially-update the imdb list files if one of the later files could not # be applied for any reason but earlier ones were applied ok (see below).
tmpListsPath = os.path.join(ImdbDiffsPath,"lists") deleteFolder(tmpListsPath) try: os.mkdir(tmpListsPath) except Exception as e: logger.exception("Unable to create temporary folder for imdb lists") return logger.info("Uncompressing imdb list files") # Uncompress list files in ImdbListsPath to our temporary folder tmpListsPath numListFiles = 0 for f in os.listdir(ImdbListsPath): if re.match(".*\.list\.gz",f): try: cmdUnGzip = unGzip % (os.path.join(ImdbListsPath,f), tmpListsPath) subprocess.call(cmdUnGzip , shell=True) except Exception as e: logger.exception("Unable to uncompress imdb list file using: %s" % cmdUnGzip) numListFiles += 1 if numListFiles == 0: # Somebody has deleted or moved the list files since we checked their datetime stamps earlier(!) logger.critical("No imdb list files found in %s." % ImdbListsPath) return # Now we loop through the diff files and apply each one in turn to the uncompressed list files patchedOKWith = None while 1: if imdbListsDate >= mostrecentfriday: break diff = "diffs-%s.tar.gz" % imdbListsDate.strftime("%y%m%d") diffFilePath = os.path.join(ImdbDiffsPath, diff) logger.info("Applying imdb diff file %s" % diff) # First uncompress the diffs file to a subdirectory. 
# # If that subdirectory already exists, delete any files from it # in case they are stale and replace them with files from the # newly-downloaded imdb diff file tmpDiffsPath = os.path.join(ImdbDiffsPath,"diffs") deleteFolder(tmpDiffsPath) os.mkdir(tmpDiffsPath) # unZip the diffs file to create a file diffs.tar try: cmdUnGzip = unGzip % (diffFilePath, tmpDiffsPath) subprocess.call(cmdUnGzip, shell=True) except Exception as e: logger.exception("Unable to unzip imdb diffs file using: %s" % cmdUnGzip) return # unTar the file diffs.tar tarFile = os.path.join(tmpDiffsPath,"diffs.tar") patchStatus = 0 if os.path.isfile(tarFile): try: cmdUnTar = unTar % (tarFile, tmpDiffsPath) subprocess.call(cmdUnTar, shell=True) except Exception as e: logger.exception("Unable to untar imdb diffs file using: %s" % cmdUnTar) return # Clean up tar file and the sub-folder which 7z may have (weirdly) created while unTarring it os.remove(tarFile) if os.path.exists(os.path.join(tmpDiffsPath,"diffs")): os.rmdir(os.path.join(tmpDiffsPath,"diffs")) # Apply all the patch files to the list files in tmpListsPath isFirstPatchFile = True for f in os.listdir(tmpDiffsPath): if re.match(".*\.list",f): logger.info("Patching imdb list file %s" % f) try: cmdApplyPatch = applyPatch % (os.path.join(tmpListsPath,f), os.path.join(tmpDiffsPath,f)) patchStatus = subprocess.call(cmdApplyPatch, shell=True) except Exception as e: logger.exception("Unable to patch imdb list file using: %s" % cmdApplyPatch) patchStatus=-1 if patchStatus != 0: # Patch failed so... 
logger.critical("Patch status %s: Wrong diff file for these imdb lists (%s)" % (patchStatus, diff)) # Delete the erroneous imdb diff file os.remove(diffFilePath) # Clean up temporary diff files deleteFolder(tmpDiffsPath) if patchedOKWith is not None and isFirstPatchFile: # The previous imdb diffs file succeeded and the current diffs file failed with the # first attempted patch, so we can keep our updated list files up to this point logger.warning("Patched OK up to and including imdb diff file %s ONLY" % patchedOKWith) break else: # We've not managed to successfully apply any imdb diff files and this was not the # first patch attempt from a diff file from this imdb diffs file so we cannot rely # on the updated imdb lists being accurate, in which case delete them and abandon logger.critical("Abandoning update: original imdb lists are unchanged") deleteFolder(tmpListsPath) return # Reset isFirstPatchFile flag since we have successfully # applied at least one patch file from this imdb diffs file isFirstPatchFile = False # Clean up the imdb diff files and their temporary folder deleteFolder(tmpDiffsPath) # Note the imdb patch file which was successfully applied, if any if patchStatus == 0: patchedOKWith = diff # Backup successfully-applied diff file if required if diffFilesBackupFolder is not None: # Create diff files backup folder if it does not already exist if not os.path.isdir(diffFilesBackupFolder): if mktree(diffFilesBackupFolder) == -1: if not keepDiffFiles: keepDiffFiles = True logger.warning("diff files will NOT be deleted but may be backed up manually") # Backup this imdb diff file to the backup folder if that folder exists and this diff file doesn't already exist there if os.path.isdir(diffFilesBackupFolder): if not os.path.isfile(os.path.join(diffFilesBackupFolder,diff)): try: shutil.copy(diffFilePath,diffFilesBackupFolder) except Exception as e: logger.exception("Unable to copy %s to backup folder %s" % (diffFilePath, diffFilesBackupFolder)) if not 
keepDiffFiles: keepDiffFiles = True logger.warning("diff files will NOT be deleted but may be backed up manually") # Clean up imdb diff file if required if not keepDiffFiles: if os.path.isfile(diffFilePath): os.remove(diffFilePath) # Next we apply the following week's imdb diff files imdbListsDate += timedelta(weeks=1) # List files are all updated so re-Gzip them up and delete the old list files for f in os.listdir(tmpListsPath): if re.match(".*\.list",f): try: cmdGZip = progGZip % os.path.join(tmpListsPath,f) subprocess.call(cmdGZip, shell=True) except Exception as e: logger.exception("Unable to Gzip imdb list file using: %s" % cmdGZip) break if os.path.isfile(os.path.join(tmpListsPath,f)): os.remove(os.path.join(tmpListsPath,f)) # Now move the updated and compressed lists to the main lists folder, replacing the old list files for f in os.listdir(tmpListsPath): if re.match(".*\.list.gz",f): # Delete the original compressed list file from ImdbListsPath if it exists if os.path.isfile(os.path.join(ImdbListsPath,f)): os.remove(os.path.join(ImdbListsPath,f)) # Move the updated compressed list file to ImdbListsPath os.rename(os.path.join(tmpListsPath,f),os.path.join(ImdbListsPath,f)) # Clean up the now-empty tmpListsPath temporary folder and anything left inside it deleteFolder(tmpListsPath) # Clean up imdb diff files if required # Note that this rmdir call will delete the folder only if it is empty. So if that folder was created, used and all # diff files deleted (possibly after being backed up) above then it should now be empty and will be removed. # # However, if the folder previously existed and contained some old diff files then those diff files will not be deleted. 
# To delete the folder and ALL of its contents regardless, replace os.rmdir() with a deleteFolder() call if not keepDiffFiles: os.rmdir(ImdbDiffsPath) # deleteFolder(ImdbDiffsPath) # If the imdb lists were successfully updated, even partially, then run my # DOS batch file "Update db from imdb lists.bat" to rebuild the imdbPy database # and relink and reintegrate my shadow tables data into it if patchedOKWith is not None: logger.info("imdb lists are updated up to imdb diffs file %s" % patchedOKWith) if RunAfterSuccessfulUpdate is not None: logger.info("Now running %s" % RunAfterSuccessfulUpdate) subprocess.call(RunAfterSuccessfulUpdate, shell=True) # Set up logging def initLogging(loggerName, logfilename): global logger logger = logging.getLogger(loggerName) logger.setLevel(logging.DEBUG) # Logger for file, if logfilename supplied if logfilename is not None: fh = logging.FileHandler(logfilename) fh.setLevel(logging.DEBUG) fh.setFormatter(logging.Formatter('%(name)s %(levelname)s %(asctime)s %(message)s\t\t\t[%(module)s line %(lineno)d: %(funcName)s%(args)s]', datefmt='%Y-%m-%d %H:%M:%S')) logger.addHandler(fh) # Logger for stdout ch = logging.StreamHandler() ch.setLevel(logging.DEBUG) ch.setFormatter(logging.Formatter('%(message)s')) logger.addHandler(ch) initLogging('__applydiffs__', logfile) applyDiffs() imdbpy-6.8/docs/goodies/reduce.sh000077500000000000000000000070101351454127000170420ustar00rootroot00000000000000#!/bin/bash # # reduce.sh: Bash script useful to create a "slimmed down" version of the # IMDb's plain text data files. # # Usage: copy this script in the directory with the plain text data files; # configure the options below and run it. # # Copyright: 2009-2010 Davide Alberani # # This program is released under the terms of the GNU GPL 2 or later license. # # Cygwin packages to install (Windows): # - util-unix for rev # - gzip for gzip, zcat, zgrep # Directory with the plain text data file. ORIG_DIR="." 
# Directory where "reduced" files will be stored; it will be created if needed.
# Beware that this directory is relative to ORIG_DIR.
DEST_DIR="./partial/"
# Percentage of the original file to keep.
KEEP_X_PERCENT="1"
# The compression ratio of the created files.
COMPRESSION="1"

# -
# Nothing to configure below.
# -

cd "$ORIG_DIR"
mkdir -p "$DEST_DIR"
DIV_BY="`expr 100 / $KEEP_X_PERCENT`"

for file in *.gz
do
    LINES="`zcat "$file" | wc -l`"
    CONSIDER="`expr $LINES / $DIV_BY`"
    FULL_CONS="$CONSIDER"
    CONSIDER="`expr $CONSIDER / 2`"
    NEWNAME="`echo "$file" | rev | cut -c 4- | rev `"

    # Tries to keep enough lines from the top of the file.
    MIN_TOP_LINES="`zgrep -a -n -m 1 "^-----------------------------------------" "$file" | cut -d : -f 1`"
    if test -z "$MIN_TOP_LINES" ; then
        MIN_TOP_LINES=0
    fi
    if test "$file" == "business.list.gz" -a $MIN_TOP_LINES -lt 260 ; then
        MIN_TOP_LINES=260
    elif test "$file" == "alternate-versions.list.gz" -a $MIN_TOP_LINES -lt 320 ; then
        MIN_TOP_LINES=320
    elif test "$file" == "cinematographers.list.gz" -a $MIN_TOP_LINES -lt 240 ; then
        MIN_TOP_LINES=240
    elif test "$file" == "complete-cast.list.gz" ; then
        MIN_TOP_LINES=140
    elif test "$file" == "complete-crew.list.gz" ; then
        MIN_TOP_LINES=150
    elif test "$file" == "composers.list.gz" -a $MIN_TOP_LINES -lt 160 ; then
        MIN_TOP_LINES=160
    elif test "$file" == "costume-designers.list.gz" -a $MIN_TOP_LINES -lt 240 ; then
        MIN_TOP_LINES=240
    elif test "$file" == "directors.list.gz" -a $MIN_TOP_LINES -lt 160 ; then
        MIN_TOP_LINES=160
    elif test "$file" == "genres.list.gz" -a $MIN_TOP_LINES -lt 400 ; then
        MIN_TOP_LINES=400
    elif test "$file" == "keywords.list.gz" -a $MIN_TOP_LINES -lt 36000 ; then
        MIN_TOP_LINES=36000
    elif test "$file" == "literature.list.gz" -a $MIN_TOP_LINES -lt 320 ; then
        MIN_TOP_LINES=320
    elif test "$file" == "mpaa-ratings-reasons.list.gz" -a $MIN_TOP_LINES -lt 400 ; then
        MIN_TOP_LINES=400
    elif test "$file" == "producers.list.gz" ; then
        MIN_TOP_LINES=220
    elif test "$file" == "production-companies.list.gz" -a $MIN_TOP_LINES -lt 270 ; then
        MIN_TOP_LINES=270
    elif test "$file" == "production-designers.list.gz" -a $MIN_TOP_LINES -lt 240 ; then
        MIN_TOP_LINES=240
    elif test "$file" == "ratings.list.gz" -a $MIN_TOP_LINES -lt 320 ; then
        MIN_TOP_LINES=320
    elif test "$file" == "special-effects-companies.list.gz" -a $MIN_TOP_LINES -lt 320 ; then
        MIN_TOP_LINES=320
    elif test "$file" == "sound-mix.list.gz" -a $MIN_TOP_LINES -lt 340 ; then
        MIN_TOP_LINES=340
    elif test "$file" == "writers.list.gz" ; then
        MIN_TOP_LINES=400
    else
        MIN_TOP_LINES="`expr $MIN_TOP_LINES + 60`"
    fi

    if test "$MIN_TOP_LINES" -gt "$CONSIDER" ; then
        TOP_CONSIDER=$MIN_TOP_LINES
    else
        TOP_CONSIDER=$CONSIDER
    fi

    HOW_MANY="`expr $TOP_CONSIDER + $CONSIDER`"
    echo "Processing $file [$KEEP_X_PERCENT%: $HOW_MANY lines]"
    zcat "$file" | head -$TOP_CONSIDER > "$DEST_DIR/$NEWNAME"
    zcat "$file" | tail -$CONSIDER >> "$DEST_DIR/$NEWNAME"
    gzip -f -$COMPRESSION "$DEST_DIR/$NEWNAME"
done
imdbpy-6.8/docs/goodies/s3-reduce000077500000000000000000000020251351454127000167550ustar00rootroot00000000000000
#!/bin/bash
# Copyright 2018 Davide Alberani
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
# s3-reduce.sh: create smaller versions of .tsv.gz files

COUNT="100"

# pushd and popd are bash builtins, hence the bash shebang above.
if [ "x$1" != "x" ] ; then
    pushd "$1"
fi

mkdir -p partials

for fname in *.tsv.gz
do
    zcat "${fname}" | head -${COUNT} | gzip -f - > "partials/${fname}"
done

if [ "x$1" != "x" ] ; then
    popd
fi
imdbpy-6.8/docs/imdbpy.cfg000066400000000000000000000035671351454127000155650ustar00rootroot00000000000000
#
# IMDbPY configuration file.
#
# This file can be placed in many locations; the first file found is
# used, _ignoring_ the content of the others.
#
# Place it in one of the following directories (in order of precedence):
#
# - imdbpy.cfg in the current directory.
# - .imdbpy.cfg in the current directory.
# - imdbpy.cfg in the user's home directory.
# - .imdbpy.cfg in the user's home directory.
# - /etc/imdbpy.cfg Unix-like systems only.
# - /etc/conf.d/imdbpy.cfg Unix-like systems only.
# - sys.prefix + imdbpy.cfg for non-Unix (e.g.: C:\Python\etc\imdbpy.cfg)
#
# If this file is not found, 'http' access system is used by default.
#
# Lines starting with #, ; and // are considered comments and ignored.
#
# Some special values are replaced with Python equivalents (case insensitive):
#
# 0, off, false, no -> False
# 1, on, true, yes -> True
# none -> None
#
# Other options, like defaultModFunct, must be passed by the code.
#

[imdbpy]
## Default.
accessSystem = http

## Optional (options common to every data access system):
# Number of results for searches (20 by default).
#results = 20
# Re-raise all caught exceptions (off, by default).
#reraiseExceptions = off
# Proxy used to access the network. If it requires authentication,
# try with: http://username:password@server_address:port/
#proxy = http://localhost:8080/

## Timeout for the connection to IMDb (30 seconds, by default).
#timeout = 30

# Base url to access pages on the IMDb.com web server.
#imdbURL_base = https://www.imdb.com/

## Set the threshold for logging messages.
# Can be one of "debug", "info", "warning", "error", "critical" (default:
# "warning").
#loggingLevel = debug

## Path to a configuration file for the logging facility;
# see: http://docs.python.org/library/logging.html#configuring-logging
#loggingConfig = ~/.imdbpy-logger.cfg
imdbpy-6.8/docs/imdbpy48.dtd000066400000000000000000000750731351454127000157560ustar00rootroot00000000000000
imdbpy-6.8/docs/imdbpyPowered.png000066400000000000000000000047601351454127000171340ustar00rootroot00000000000000
imdbpy-6.8/docs/imdbpy_new_logo.png000066400000000000000000000127221351454127000174740ustar00rootroot00000000000000
imdbpy-6.8/docs/imdbpyico.png000066400000000000000000000006331351454127000162740ustar00rootroot00000000000000
imdbpy-6.8/docs/imdbpyico.xpm000066400000000000000000000144711351454127000163210ustar00rootroot00000000000000
/* XPM */ static char * imdbpyico_xpm[] = { "32 32 264 2", " c None", ". c #000000", "+ c #160F00", "@ c #1C1300", "# c #6C6C6C", "$ c #8E8E8E", "% c #4A4A4A", "& c #141414", "* c #9B9B9B", "= c #282828", "- c #656565", "; c #737373", "> c #1B1B1B", ", c #222222", "' c #887F6C", ") c #251800", "! c #B1B1B1", "~ c #E9E9E9", "{ c #7A7A7A", "] c #212121", "^ c #FFFFFF", "/ c #434343", "( c #A6A6A6", "_ c #BCBCBC", ": c #2C2C2C", "< c #373737", "[ c #555555", "} c #6F6F6F", "| c #3A3A3A", "1 c #101010", "2 c #202020", "3 c #505050", "4 c #5A5A5A", "5 c #151515", "6 c #0A0A0A", "7 c #191919", "8 c #171717", "9 c #030303", "0 c #4E4E4E", "a c #7E7E6F", "b c #383800", "c c #757564", "d c #3F3F0D", "e c #848484", "f c #161603", "g c #262600", "h c #141400", "i c #2C2C00", "j c #676700", "k c #54541A", "l c #C6C6C6", "m c #595959", "n c #DDDD17", "o c #484827", "p c #CECE0F", "q c #A6A659", "r c #2F2F13", "s c #4E4E00", "t c #B1B143", "u c #00006F", "v c #B1B12C", "w c #0000FF", "x c #0B0B37", "y c #E8E8E8", "z c #424242", "A c #B1B103", "B c #606005", "C c #A6A602", "D c #B1B10B", "E c #494902", "F c #5B5B00", "G c #121208", "H c #F9F9F9", "I c #949494", "J c #616148", "K c #484800", "L c #686800", "M c #505000", "N c #747400", "O c #808000", "P c #272700", "Q c #6C6C65", "R c #BABABA", "S c #8F8F86", "T c #616109", "U c #757500", "V c #718F00", "W c #778900", "X c #20C800", "Y c #2C4B00", "Z c #777700", "` c #717100", " .
c #525211", ".. c #BDBDBA", "+. c #DEDEDE", "@. c #555500", "#. c #585800", "$. c #7D7D00", "%. c #758B00", "&. c #60A000", "*. c #22DE00", "=. c #29D700", "-. c #00FF00", ";. c #00B100", ">. c #383821", ",. c #333C00", "'. c #627700", "). c #7C8400", "!. c #6A7200", "~. c #778600", "{. c #47B800", "]. c #55AB00", "^. c #0DE700", "/. c #065A00", "(. c #048F00", "_. c #03FC00", ":. c #00A900", "<. c #4E5700", "[. c #636307", "}. c #006F00", "|. c #33CD00", "1. c #18E700", "2. c #22DD00", "3. c #00A800", "4. c #005100", "5. c #233900", "6. c #1EE100", "7. c #18A200", "8. c #009300", "9. c #2FD000", "0. c #6F6F00", "a. c #F5F5F5", "b. c #648664", "c. c #04CD00", "d. c #3AC500", "e. c #1BE400", "f. c #277400", "g. c #000A00", "h. c #005500", "i. c #599600", "j. c #00BA00", "k. c #001800", "l. c #59A700", "m. c #23DC00", "n. c #1B0000", "o. c #C10000", "p. c #860000", "q. c #161600", "r. c #155800", "s. c #2ED100", "t. c #07BA07", "u. c #939393", "v. c #788800", "w. c #5E5E0A", "x. c #494343", "y. c #BE0000", "z. c #9A6500", "A. c #3B6500", "B. c #1CB800", "C. c #07DE00", "D. c #006400", "E. c #829882", "F. c #C2C2C2", "G. c #1E8A1E", "H. c #3BC500", "I. c #6C9300", "J. c #FFE7E7", "K. c #ED3434", "L. c #8A3E00", "M. c #08C000", "N. c #1F831F", "O. c #566956", "P. c #CECECE", "Q. c #406B38", "R. c #679E0D", "S. c #2ECA00", "T. c #7A7A00", "U. c #53531D", "V. c #CFCFCD", "W. c #FFEEEE", "X. c #FF7F7F", "Y. c #FA2828", "Z. c #858585", "`. 
c #CACACA", " + c #97978B", ".+ c #828240", "++ c #0FD600", "@+ c #1AE500", "#+ c #6D9200", "$+ c #C5C5C5", "%+ c #B4B4B4", "&+ c #EEEEEE", "*+ c #FFF9F9", "=+ c #FFE6E6", "-+ c #FF6565", ";+ c #B5B5B1", ">+ c #3D3D2C", ",+ c #17B706", "'+ c #5BA400", ")+ c #728D00", "!+ c #7B7B00", "~+ c #393900", "{+ c #AEAEAC", "]+ c #EFEFEF", "^+ c #F1F1F1", "/+ c #1B1B13", "(+ c #1E3B0B", "_+ c #48B700", ":+ c #45BA00", "<+ c #5AA500", "[+ c #434300", "}+ c #66664B", "|+ c #DDDDDD", "1+ c #6A6A51", "2+ c #565632", "3+ c #909090", "4+ c #4E4E06", "5+ c #08C303", "6+ c #06F900", "7+ c #17E800", "8+ c #424200", "9+ c #696900", "0+ c #4F4F2E", "a+ c #6D6D00", "b+ c #5F5F00", "c+ c #3B3B06", "d+ c #DFDFDF", "e+ c #50A500", "f+ c #53AC00", "g+ c #21DE00", "h+ c #537200", "i+ c #555917", "j+ c #B7B7B7", "k+ c #00F400", "l+ c #08F700", "m+ c #54AB00", "n+ c #46BA00", "o+ c #5C5C00", "p+ c #87877B", "q+ c #6C8B6C", "r+ c #005900", "s+ c #B6BAB6", "t+ c #F8F8F8", "u+ c #505B39", "v+ c #598A00", "w+ c #4AB600", "x+ c #13ED00", "y+ c #0DEC00", "z+ c #06B700", "A+ c #346A2E", "B+ c #306E2E", "C+ c #3D5B37", "D+ c #939390", "E+ c #FCFCFC", "F+ c #FBFBFB", "G+ c #F4F4F4", "H+ c #E2E2E2", "I+ c #3F7D3F", "J+ c #1B5400", "K+ c #19660D", "L+ c #ECECEC", "M+ c #0E0E0E", "N+ c #332200", "O+ c #412B00", ". . . . . . . . . . . . . . . . + @ . . . . . . . . . . . . . . ", "# . $ % & * = . - ; . > * , . # ' ) , * > . ; - . = * & % $ . # ", "! . ~ { ] ^ / . ( _ . : ^ < . ! ! . < ^ : . _ ( . / ^ ] { ~ . ! ", "[ . } | 1 { 2 . 3 4 . 5 { > . [ [ . > { 5 . 4 3 . 2 { 1 | } . [ ", "6 ] ] ] ] ] ] ] ] 7 . 8 . 9 ] ] ] ] ] ] ] ] ] ] ] ] ] ] ] ] ] 6 ", "0 ^ ^ ^ ^ ^ ^ ^ ^ a b c b d _ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ ^ e f g h i j k l ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ ^ m n o p q r s ! ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ ^ m t u v w x s ! 
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ y z A B C D E F G _ _ H ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ I J K K L M K N O P K K Q R ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ H S T U O O O O O V W X Y Z ` ...^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ +.b @.O #.$.O %.&.*.=.-.;.s O O >.^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ +.,.'.).!.~.{.].^./.(._.:.<.O O [.$ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ +.}._.|.|.1.2.3.4.5.6.7.8.9.).O 0.} ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ a.b.c.d.d.e.f.g.h.i.j.k.l.m.%.O 0.} ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ / n.o.p.q.r.s.t.u.0 e.1.v.O w.* ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ x.y.z.A.B.C.D.E.F.G.H.I.O O >.^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ J.K.L.M.M.N.O.P.Q.R.S.v.O T.U.V.^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ W.X.Y.Z.Z.Z.`. +.+++@+#+O O U.$+^ ^ ^ ^ ^ ^ %+&+^ ^ ^ 0 ", "0 ^ ^ ^ *+=+-+^ ^ ;+>+,+6.'+)+O !+~+{+]+^ ^++.~ ^ ;+/++.^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ ^ ^ (+_+:+<+O O O [+b b }+|+1+b 2+3+4+b +.^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ ^ e 5+6+7+O O O 8+9+O O 0.0+a+O N b+$.c+d+^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ ^ m e+f+g+O O O O O O O O O O b+h+f+i+j+^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ ^ m k+-.l+m+O O O n+O O O O o+p+q+r+s+^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ ^ t+u+v+w+x+y+z+A+B+z+z+z+C+D+E+F+G+^ ^ ^ ^ ^ ^ 0 ", "0 ^ ^ ^ ^ ^ ^ ^ ^ H+I+}.J+K+3+|+|+3+3+3+L+^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 0 ", "6 ] ] ] ] ] ] ] ] ] M+. . 9 ] ] ] ] ] ] ] ] ] ] ] ] ] ] ] ] ] 6 ", "[ . } | 1 { 2 . 3 4 . 5 { > . [ [ . > { 5 . 4 3 . 2 { 1 | } . [ ", "! . ~ { ] ^ / . ( _ . : ^ < . ! ! . < ^ : . _ ( . / ^ ] { ~ . ! ", "# . $ % & * = . - ; . > * , . # # . , * > . ; - . = * & % $ . # ", ". . . . . . . . . . . . . . . . N+O+. . . . . . . . . . . . . . 
"};
imdbpy-6.8/docs/imdbpyico16x16.ico000077500000000000000000000025761351454127000170030ustar00rootroot00000000000000
+  S ]JNNJ8uJ'+\JpN8uN,ƥJ  JN J78NN+ 8JNJu7'NJ88JN'J+u+NN'JJee 7e$;J  J$$e7 +zp#S'ȥpp7'SpzJ ppƫ'zppppƥSN++ezppzE_ES8 +S;pN f,+ f#$z;;+7eff$N8e$;;Nzf;f;$ƇJuJ,;f$$;ff;' +$fNNp;J  Jz$fpN,++,$_e78e$$$z;f;f$ESJu'$f;$f_ ǺF$fFNp;z'  J;;peN,+ 'S+ 7e$$f$_Ef;p+;;EEf$}Ǻ$fN C3Lff (o6$$$f;_,' "(QQňQ;pN 0ʂDyσEtWЃstЃ;Ey5WtʴɅo(,zE$f_,  ŋrԋBfu I[<[R/t׃t϶ksВʹfEߨ[55stקksߒy,b{3o ũZPe;fEe7  B\3@ ^S 9yʂ[ߨ$y[[2ЃЃ 2tyy偂y$IPo bݝvf;Nu oBr;u I[[yym2y$RsρՉ[߽[xRs$Ru[yy;.wwL j\},  ?aPP#, Ftymymyߧ󛛃Я`yt_2[[ymx߃Чϧ˽`ty2;zBd46!a 4z;puFaw ;Nu  Ik˃yy[m222קʛ˧_y[2׃tyy;.w "!U|4 ȶf+ |4 f;,J  Issyyymm`i]t[Gtss[[yyk]yG[[$4 $(&q$z'  (q_$;p7  Iϧyϧy[߃]߽mxyy[R"c)4HHffb¡dUڶfpNY¡Hqp;fu  :my[y[mm[ρߨmЛ][y_mmm[xxm2[y[y߿;E"YqO}"q"$fEq1Y"q"S_f_SJ  Oyρtʧmmkk״ymy_25xϨmmpkm2[;$E'Y}ggS7.}IwfFHY}gIB$;+  2[y[y%z[]km_2[yxӨmmp[ϛ]k[2ze'8d}S+ v){P3$f̝ݜŠ{P3ȶ;f$e+   [[y`$`[[2z"mR`y;f[[2'8o1}L'N+8ئ($;~oBǺݦS_$7 炒yy`Ay߽m7mm[y傂G_`AAxzmm_y[[炒[yy`A bf_e҈QJ>~b#bQZf$z8 IR[tyyfϒקRkG[xӨ2[2[ϹקRRk[vjXL(f$EEbHf7 jbH NzN8 I;ѿyϞf[ʒsϹѽϒϒ߽_fyk["j0Xc,f@fz ޱq_zuHp$ff' H׃n烛t%f$z߿GWktt[_߿yk׃t-lP +lCr^'C9rE+CP9eƎ$fEe y?DEЃ[S[[[[y[[xtϮ%Ѓ,2[y[[uܢFI)f#4^SJulF\'L4o-eS  [[[ykt5k嫇e'u[[[[etև=kyy'+[ʫeJ!U|VK&!vow)PeJu |\K7wdw!Xp$ffE [ʒς7yit[ϧЛk7)uyG[[77[tt[ʧЛ y[78!U(ad\U|d37  (a6ƫ+ PX!Uj_E [yy[Wג׃ks'[2m[[ [ג׃kЛ˽[ 4ĵ3dd) 4qLad4 PCd4)b֏v 4ĵP?Ne'8 2yѨ[t[[y뷨yyt[[ϧty ǢvHHdqLdd vH)8u  4 vu78 [ ϒmm[[Չ[[[[yϒmm[Չ[ wB3wddg}oB HOB)3wdH BB)\w  [A2tt222[y[[A[tt2m2[y[ BBBvwBBBvBBr WFhF(ooZ>? (Q>B~ XN99Zb imdbpy-6.8/docs/index.rst000066400000000000000000000005041351454127000154450ustar00rootroot00000000000000IMDbPY ====== .. include:: ../README.rst .. admonition:: Disclaimer .. include:: disclaimer.rst .. 
toctree:: :maxdepth: 2 :caption: Contents: usage/index devel/index faqs contributors/index Changelog Indices and tables ================== - :ref:`genindex` - :ref:`modindex` - :ref:`search` imdbpy-6.8/docs/make.bat000066400000000000000000000014521351454127000152140ustar00rootroot00000000000000@ECHO OFF pushd %~dp0 REM Command file for Sphinx documentation if "%SPHINXBUILD%" == "" ( set SPHINXBUILD=sphinx-build ) set SOURCEDIR=. set BUILDDIR=_build set SPHINXPROJ=IMDbPY if "%1" == "" goto help %SPHINXBUILD% >NUL 2>NUL if errorlevel 9009 ( echo. echo.The 'sphinx-build' command was not found. Make sure you have Sphinx echo.installed, then set the SPHINXBUILD environment variable to point echo.to the full path of the 'sphinx-build' executable. Alternatively you echo.may add the Sphinx directory to PATH. echo. echo.If you don't have Sphinx installed, grab it from echo.http://sphinx-doc.org/ exit /b 1 ) %SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% goto end :help %SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% :end popd imdbpy-6.8/docs/modules/000077500000000000000000000000001351454127000152555ustar00rootroot00000000000000imdbpy-6.8/docs/modules/Character.rst000066400000000000000000000001431351454127000177010ustar00rootroot00000000000000:orphan: :mod:`imdb.Character` ===================== .. automodule:: imdb.Character :members: imdbpy-6.8/docs/modules/Company.rst000066400000000000000000000001351351454127000174140ustar00rootroot00000000000000:orphan: :mod:`imdb.Company` =================== .. automodule:: imdb.Company :members: imdbpy-6.8/docs/modules/Movie.rst000066400000000000000000000001271351454127000170660ustar00rootroot00000000000000:orphan: :mod:`imdb.Movie` ================= .. automodule:: imdb.Movie :members: imdbpy-6.8/docs/modules/Person.rst000066400000000000000000000001321351454127000172510ustar00rootroot00000000000000:orphan: :mod:`imdb.Person` ================== .. 
automodule:: imdb.Person :members: imdbpy-6.8/docs/modules/_exceptions.rst000066400000000000000000000001511351454127000203240ustar00rootroot00000000000000:orphan: :mod:`imdb._exceptions` ======================= .. automodule:: imdb._exceptions :members: imdbpy-6.8/docs/modules/_logging.rst000066400000000000000000000001401351454127000175670ustar00rootroot00000000000000:orphan: :mod:`imdb._logging` ==================== .. automodule:: imdb._logging :members: imdbpy-6.8/docs/modules/cli.rst000066400000000000000000000001211351454127000165500ustar00rootroot00000000000000:orphan: :mod:`imdb.cli` =============== .. automodule:: imdb.cli :members: imdbpy-6.8/docs/modules/helpers.rst000066400000000000000000000001351351454127000174500ustar00rootroot00000000000000:orphan: :mod:`imdb.helpers` =================== .. automodule:: imdb.helpers :members: imdbpy-6.8/docs/modules/imdb.rst000066400000000000000000000001051351454127000167160ustar00rootroot00000000000000:orphan: :mod:`imdb` =========== .. automodule:: imdb :members: imdbpy-6.8/docs/modules/linguistics.rst000066400000000000000000000001511351454127000203410ustar00rootroot00000000000000:orphan: :mod:`imdb.linguistics` ======================= .. automodule:: imdb.linguistics :members: imdbpy-6.8/docs/modules/locale.rst000066400000000000000000000001321351454127000172420ustar00rootroot00000000000000:orphan: :mod:`imdb.locale` ================== .. automodule:: imdb.locale :members: imdbpy-6.8/docs/modules/parser.http.companyParser.rst000066400000000000000000000002231351454127000231000ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http.companyParser` ===================================== .. automodule:: imdb.parser.http.companyParser :members: imdbpy-6.8/docs/modules/parser.http.movieParser.rst000066400000000000000000000002171351454127000225540ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http.movieParser` ===================================== .. 
automodule:: imdb.parser.http.movieParser :members: imdbpy-6.8/docs/modules/parser.http.personParser.rst000066400000000000000000000002201351454127000227350ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http.personParser` ==================================== .. automodule:: imdb.parser.http.personParser :members: imdbpy-6.8/docs/modules/parser.http.rst000066400000000000000000000001511351454127000202560ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http` ======================= .. automodule:: imdb.parser.http :members: imdbpy-6.8/docs/modules/parser.http.searchCompanyParser.rst000066400000000000000000000002451351454127000242320ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http.searchCompanyParser` =========================================== .. automodule:: imdb.parser.http.searchCompanyParser :members: imdbpy-6.8/docs/modules/parser.http.searchKeywordParser.rst000066400000000000000000000002451351454127000242500ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http.searchKeywordParser` =========================================== .. automodule:: imdb.parser.http.searchKeywordParser :members: imdbpy-6.8/docs/modules/parser.http.searchMovieParser.rst000066400000000000000000000002371351454127000237040ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http.searchMovieParser` ========================================= .. automodule:: imdb.parser.http.searchMovieParser :members: imdbpy-6.8/docs/modules/parser.http.searchPersonParser.rst000066400000000000000000000002421351454127000240670ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http.searchPersonParser` ========================================== .. automodule:: imdb.parser.http.searchPersonParser :members: imdbpy-6.8/docs/modules/parser.http.topBottomParser.rst000066400000000000000000000002311351454127000234200ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http.topBottomParser` ======================================= .. 
automodule:: imdb.parser.http.topBottomParser :members: imdbpy-6.8/docs/modules/parser.http.utils.rst000066400000000000000000000001731351454127000214210ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.http.utils` ============================= .. automodule:: imdb.parser.http.utils :members: imdbpy-6.8/docs/modules/parser.rst000066400000000000000000000001151351454127000173000ustar00rootroot00000000000000:orphan: :mod:`imdb.parser` ================== .. automodule:: imdb.parser imdbpy-6.8/docs/modules/parser.s3.rst000066400000000000000000000001431351454127000176250ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.s3` ===================== .. automodule:: imdb.parser.s3 :members: imdbpy-6.8/docs/modules/parser.s3.utils.rst000066400000000000000000000001651351454127000207700ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.s3.utils` =========================== .. automodule:: imdb.parser.s3.utils :members: imdbpy-6.8/docs/modules/parser.sql.alchemyadapter.rst000066400000000000000000000002231351454127000230600ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.sql.alchemyadapter` ===================================== .. automodule:: imdb.parser.sql.alchemyadapter :members: imdbpy-6.8/docs/modules/parser.sql.dbschema.rst000066400000000000000000000002011351454127000216370ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.sql.dbschema` =============================== .. automodule:: imdb.parser.sql.dbschema :members: imdbpy-6.8/docs/modules/parser.sql.rst000066400000000000000000000001461351454127000201020ustar00rootroot00000000000000:orphan: :mod:`imdb.parser.sql` ====================== .. automodule:: imdb.parser.sql :members: imdbpy-6.8/docs/modules/utils.rst000066400000000000000000000001271351454127000171470ustar00rootroot00000000000000:orphan: :mod:`imdb.utils` ================= .. 
automodule:: imdb.utils :members: imdbpy-6.8/docs/usage/000077500000000000000000000000001351454127000147115ustar00rootroot00000000000000imdbpy-6.8/docs/usage/access.rst000066400000000000000000000032771351454127000167150ustar00rootroot00000000000000.. _access: Access systems ============== IMDbPY supports different ways of accessing the IMDb data: - Fetching data directly from the web server. - Getting the data from a SQL database that can be created from the downloadable data sets provided by the IMDb. +------------------+-------------+----------------------+ | access system | aliases | data source | +==================+=============+======================+ | (default) 'http' | 'https' | imdb.com web server | | | | | | | 'web' | | | | | | | | 'html' | | +------------------+-------------+----------------------+ | 's3' | 's3dataset' | downloadable dataset | | | | | | | | *after Dec 2017* | +------------------+-------------+----------------------+ | 'sql' | 'db' | downloadable dataset | | | | | | | 'database' | *until Dec 2017* | +------------------+-------------+----------------------+ .. note:: Since release 3.4, the :file:`imdbpy.cfg` configuration file is available, so that you can set a system-wide (or per-user) default. The file is commented with indications of where it can be placed and how to modify it. If no :file:`imdbpy.cfg` file is found (or it is not readable or can't be parsed), 'http' will be used as the default. See the :ref:`s3` and :ref:`ptdf` documents for more information about SQL based access systems. imdbpy-6.8/docs/usage/adult.rst000066400000000000000000000004741351454127000165610ustar00rootroot00000000000000Adult movies ============ Since July 2019 you can use the **search_movie_advanced(title, adult=None, results=None, sort=None, sort_dir=None)** method to search for adult titles .. 
code-block:: python >>> ia = IMDb(accessSystem='http') >>> movies = ia.search_movie_advanced('debby does dallas', adult=True) imdbpy-6.8/docs/usage/character.rst000066400000000000000000000007641351454127000174060ustar00rootroot00000000000000:orphan: Characters ========== It works mostly like the Person class. :-) For more information about the "currentRole" attribute, see the README.currentRole file. Character associated to a person who starred in a movie, and its notes: .. code-block:: python person_in_cast = movie['cast'][0] notes = person_in_cast.notes character = person_in_cast.currentRole Check whether a person worked in a given movie or not: .. code-block:: python person in movie movie in person imdbpy-6.8/docs/usage/company.rst000066400000000000000000000003651351454127000171150ustar00rootroot00000000000000:orphan: Companies ========= It works mostly like the Person class. :-) The "currentRole" attribute is always None. As for Person/Character and Movie objects, you can test -using the "in" operator- if a Company has worked on a given Movie. imdbpy-6.8/docs/usage/data-interface.rst000066400000000000000000000145451351454127000203230ustar00rootroot00000000000000Data interface ============== The IMDbPY objects that represent movies, people and companies provide a dictionary-like interface where the key identifies the information you want to get out of the object. At this point, I have really bad news: what the keys are is a little unclear! In general, the key is the label of the section as used by the IMDb web server to present the data. If the information is grouped into subsections, such as cast members, certifications, distributor companies, etc., the subsection label in the HTML page is used as the key. The key is almost always lowercase; underscores and dashes are replaced with spaces. Some keys aren't taken from the HTML page, but are defined within the respective class. 
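The naming rule described above can be sketched as a small function. This is only an illustration of the stated rule (lowercase, with underscores and dashes turned into spaces); ``label_to_key`` is a hypothetical helper and not part of IMDbPY, and keys defined directly in the classes will not follow it:

.. code-block:: python

    def label_to_key(label):
        """Approximate the key for a section label (hypothetical helper)."""
        # Lowercase the label and turn underscores and dashes into spaces.
        key = label.strip().lower().replace('_', ' ').replace('-', ' ')
        # Collapse any runs of whitespace left behind.
        return ' '.join(key.split())

    print(label_to_key('Production_Companies'))

Since the rule has exceptions, listing the keys of a retrieved object with its ``keys()`` method is probably the more reliable way to find out what is available.
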
Information sets ---------------- IMDbPY can retrieve almost every piece of information of a movie, person or company. This can be a problem, because (at least for the "http" data access system) it means that a lot of web pages must be fetched and parsed. This can be both time- and bandwidth-consuming, especially if you're interested in only a small part of the information. The :meth:`get_movie `, :meth:`get_person ` and :meth:`get_company ` methods take an optional ``info`` parameter, which can be used to specify the kinds of data to fetch. Each group of data that gets fetched together is called an "information set". Different types of objects have their own available information sets. For example, the movie objects have a set called "vote details" for the number of votes and their demographic breakdowns, whereas person objects have a set called "other works" for miscellaneous works of the person. Available information sets for each object type can be queried using the access object: .. code-block:: python >>> from imdb import IMDb >>> ia = IMDb() >>> ia.get_movie_infoset() ['airing', 'akas', ..., 'video clips', 'vote details'] >>> ia.get_person_infoset() ['awards', 'biography', ..., 'other works', 'publicity'] >>> ia.get_company_infoset() ['main'] For each object type, only the important information will be retrieved by default: - for a movie: "main", "plot" - for a person: "main", "filmography", "biography" - for a company: "main" These defaults can be retrieved from the ``default_info`` attributes of the classes: .. code-block:: python >>> from imdb.Person import Person >>> Person.default_info ('main', 'filmography', 'biography') Each instance also has a ``current_info`` attribute for tracking the information sets that have already been retrieved: .. 
code-block:: python >>> movie = ia.get_movie('0133093') >>> movie.current_info ['main', 'plot', 'synopsis'] The list of retrieved information sets and the keys they provide can be taken from the ``infoset2keys`` attribute: .. code-block:: python >>> movie = ia.get_movie('0133093') >>> movie.infoset2keys {'main': ['cast', 'genres', ..., 'top 250 rank'], 'plot': ['plot', 'synopsis']} >>> movie = ia.get_movie('0094226', info=['taglines', 'plot']) >>> movie.infoset2keys {'taglines': ['taglines'], 'plot': ['plot', 'synopsis']} >>> movie.get('title') >>> movie.get('taglines')[0] 'The Chicago Dream is that big' Search operations retrieve a fixed set of data and don't have the concept of information sets. Therefore objects listed in searches will have even less information than the defaults. For example, if you do a movie search operation, the movie objects in the result won't have many of the keys that would be available on a movie get operation: .. code-block:: python >>> movies = ia.search_movie('matrix') >>> movie = movies[0] >>> movie >>> movie.current_info [] >>> 'genres' in movie False Once an object is retrieved (through a get or a search), its data can be updated using the :meth:`update ` method with the desired information sets. Continuing from the example above: .. code-block:: python >>> 'median' in movie False >>> ia.update(movie, info=['taglines', 'vote details']) >>> movie.current_info ['taglines', 'vote details'] >>> movie['median'] 9 >>> ia.update(movie, info=['plot']) >>> movie.current_info ['taglines', 'vote details', 'plot', 'synopsis'] Beware that the information sets vary between access systems: locally not every piece of data is accessible, whereas -for example for SQL- accessing one set of data means automatically accessing a number of other pieces of information (without major performance drawbacks).
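Because every extra information set can mean extra pages to fetch, it may be worth asking only for the sets an object does not have yet. The following is a minimal sketch of that idea; ``fetch_missing_info`` is a hypothetical helper, not part of IMDbPY, and it assumes only the ``current_info`` attribute and the ``update(obj, info=...)`` call shown in this document:

.. code-block:: python

    def fetch_missing_info(ia, obj, wanted):
        """Update ``obj`` with the sets in ``wanted`` it lacks (hypothetical helper).

        ``ia`` is an access object and ``obj`` a Movie, Person or Company.
        Returns the list of information sets that were requested.
        """
        # Keep only the sets that are not recorded in current_info yet.
        missing = [name for name in wanted if name not in obj.current_info]
        if missing:
            # A single update call for everything that is still missing.
            ia.update(obj, info=missing)
        return missing

    # Example call, assuming ``ia`` and ``movie`` come from the examples above:
    #     fetch_missing_info(ia, movie, ['plot', 'taglines'])
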
Composite data -------------- In some data, the (not-so) universal ``::`` separator is used to delimit parts of the data inside a string, like the plot of a movie and its author: .. code-block:: python >>> movie = ia.get_movie('0094226') >>> plot = movie['plot'][0] >>> plot "1920's prohibition ... way to get him.::Jeremy Perkins " As a rule, there's at most one such separator inside a string. Splitting the string will result in two logical pieces as in ``TEXT::NOTE``. The :func:`imdb.helpers.makeTextNotes` function can be used to create a custom function to pretty-print this kind of information. References ---------- Sometimes the collected data contains strings with references to other movies or persons, e.g. in the plot of a movie or the biography of a person. These references are stored in the Movie, Person, and Character instances; in the strings you will find values like _A Movie (2003)_ (qv) or 'A Person' (qv) or '#A Character# (qv)'. When these strings are accessed (like movie['plot'] or person['biography']), they will be modified using a provided function, which must take the string and two dictionaries containing titles and names references as parameters. By default the (qv) strings are converted in the "normal" format ("A Movie (2003)", "A Person" and "A Character"). You can find some examples of these functions in the imdb.utils module. The function used to modify the strings can be set with the ``defaultModFunct`` parameter of the IMDb class or with the ``modFunct`` parameter of the ``get_movie``, ``get_person``, and ``get_character`` methods: .. code-block:: python import imdb i = imdb.IMDb(defaultModFunct=imdb.utils.modHtmlLinks) or: .. code-block:: python import imdb i = imdb.IMDb() i.get_person('0000154', modFunct=imdb.utils.modHtmlLinks) imdbpy-6.8/docs/usage/index.rst000066400000000000000000000005331351454127000165530ustar00rootroot00000000000000Usage ===== Here you can find information about how you can use IMDbPY in your own programs. .. 
warning:: This document is far from complete: the code is the final documentation! ;-) .. toctree:: :maxdepth: 2 :caption: Contents: quickstart data-interface role series adult info2xml l10n access s3 ptdf imdbpy-6.8/docs/usage/info2xml.rst000066400000000000000000000055531351454127000172110ustar00rootroot00000000000000Information in XML format ========================= Since version 4.0, IMDbPY can output information of Movie, Person, Character, and Company instances in XML format. It's possible to get a single information (a key) in XML format, using the ``getAsXML(key)`` method (it will return None if the key is not found). E.g.: .. code-block:: python from imdb import IMDb ia = IMDb('http') movie = ia.get_movie(theMovieID) print(movie.getAsXML('keywords')) It's also possible to get a representation of a whole object, using the ``asXML()`` method:: print(movie.asXML()) The ``_with_add_keys`` argument of the ``asXML()`` method can be set to False (default: True) to exclude the dynamically generated keys (like 'smart canonical title' and so on). XML format ---------- Keywords are converted to tags, items in lists are enclosed in a 'item' tag, e.g.: .. code-block:: xml a keyword another keyword Except when keys are known to be not fixed (e.g.: a list of keywords), in which case this schema is used: .. code-block:: xml ... In general, the 'key' attribute is present whenever the used tag doesn't match the key name. Movie, Person, Character and Company instances are converted as follows (portions in square brackets are optional): .. code-block:: xml A Long IMDb Movie Title (YEAR) [ Name Surname [A Note About The Person] ] [A Note About The Movie] Every 'id' can be empty. The returned XML string is mostly not pretty-printed. References ---------- Some text keys can contain references to other movies, persons and characters. 
The user can provide the ``defaultModFunct`` function (see the "MOVIE TITLES AND PERSON/CHARACTER NAMES REFERENCES" section of the README.package file), to replace these references with their own strings (e.g.: a link to a web page); it's up to the user, to be sure that the output of the defaultModFunct function is valid XML. DTD --- Since version 4.1 a DTD is available; it can be found in this directory or on the web, at: http://imdbpy.sf.net/dtd/imdbpy41.dtd The version number changes with the IMDbPY version. Localization ------------ Since version 4.1 it's possible to translate the XML tags; see README.locale. Deserializing ------------- Since version 4.6, you can dump the generated XML in a string or in a file, using it -later- to rebuild the original object. In the ``imdb.helpers`` module there's the ``parseXML()`` function which takes a string as input and returns -if possible- an instance of the Movie, Person, Character or Company class. imdbpy-6.8/docs/usage/l10n.rst000066400000000000000000000056641351454127000162300ustar00rootroot00000000000000Localization ============ Since version 4.1 the labels that describe the information are translatable. .. admonition:: Limitation Internal messages or exceptions are not translatable, the internationalization is limited to the "tags" returned by the ``getAsXML`` and ``asXML`` methods of the Movie, Person, Character, or Company classes. Beware that in many cases these "tags" are not the same as the "keys" used to access information in the same class. For example, you can translate the tag "long-imdb-name" -the tag returned by the call ``person.getAsXML('long imdb name')``, but not the key "long imdb name" itself. To translate keys, you can use the :func:`helpers.translateKey ` function. If you want to add i18n to your IMDbPY-based application, all you need to do is to switch to the ``imdbpy`` text domain: .. 
code-block:: python >>> import imdb.locale >>> import gettext >>> gettext.textdomain('imdbpy') 'imdbpy' >>> from gettext import gettext as _ >>> _('art-department') 'Art department' >>> import os >>> os.environ['LANG'] = 'it_IT' >>> _('art-department') 'Dipartimento artistico' If you want to translate IMDbPY into another language, see the :ref:`translate` document for instructions. Articles in titles ------------------ To convert a title to its canonical format as in "Title, The", IMDbPY makes some assumptions about what is an article and what isn't, and this can lead to some wrong canonical titles. For example, it can canonicalize the title "Die Hard" as "Hard, Die" because it guesses "Die" as an article (and it is, in Germany...). To solve this problem, there are other keys: "smart canonical title", "smart long imdb canonical title", "smart canonical series title", "smart canonical episode title" which can be used to do a better job converting a title into its canonical format. This works, but it needs to know about articles in various languages: if you want to help, see the :attr:`linguistics.LANG_ARTICLES` and :attr:`linguistics.LANG_COUNTRIES` dictionaries. To guess the language of a movie title, call its 'guessLanguage' method (it will return None, if unable to guess). If you want to force a given language instead of the guessed one, you can call its 'smartCanonicalTitle' method, setting the 'lang' argument appropriately. Alternative titles ------------------ Sometimes it's useful to manage a title's alternatives (AKAs) knowing their languages. In the 'helpers' module there are some (hopefully) useful functions: - ``akasLanguages(movie)`` - Given a movie, return a list of tuples in (lang, AKA) format (lang can be None, if unable to detect). - ``sortAKAsBySimilarity(movie, title)`` - Sort the AKAs on a movie considering how much they are similar to a given title (see the code for more options). 
- ``getAKAsInLanguage(movie, lang)`` - Return a list of AKAs of the movie in the given language (see the code for more options). imdbpy-6.8/docs/usage/movie.rst000066400000000000000000000066001351454127000165640ustar00rootroot00000000000000:orphan: Movies ====== Below is a list of each main key, the type of its value, and a short description or an example: title (string) The "usual" title of the movie, like "The Untouchables". long imdb title (string) "Uncommon Valor (1983/II) (TV)" canonical title (string) The title in canonical format, like "Untouchables, The". long imdb canonical title (string) "Patriot, The (2000)" year (string) The release year, or '????' if unknown. kind (string) One of: 'movie', 'tv series', 'tv mini series', 'video game', 'video movie', 'tv movie', 'episode' imdbIndex (string) The roman numeral for movies with the same title/year. director (Person list) A list of directors' names, e.g.: ['Brian De Palma']. cast (Person list) A list of actors/actresses, with the currentRole instance variable set to a Character object which describes their role. cover url (string) The link to the image of the poster. writer (Person list) A list of writers, e.g.: ['Oscar Fraley (novel)']. plot (list) A list of plot summaries and their authors. rating (string) User rating on IMDb from 1 to 10, e.g. '7.8'. votes (string) Number of votes, e.g. '24,101'. runtimes (string list) List of runtimes in minutes ['119'], or something like ['USA:118', 'UK:116']. number of episodes (int) Number of episodes for a TV series. color info (string list) ["Color (Technicolor)"] countries (string list) Production's country, e.g. ['USA', 'Italy']. genres (string list) One or more of: Action, Adventure, Adult, Animation, Comedy, Crime, Documentary, Drama, Family, Fantasy, Film-Noir, Horror, Musical, Mystery, Romance, Sci-Fi, Short, Thriller, War, Western, and other genres defined by IMDb. akas (string list) List of alternative titles. languages (string list) A list of languages.
certificates (string list) ['UK:15', 'USA:R'] mpaa (string) The MPAA rating. episodes (series only) (dictionary of dictionaries) One key for every season, one key for every episode in the season. number of episodes (series only) (int) Total number of episodes. number of seasons (series only) (int) Total number of seasons. series years (series only) (string) Range of years when the series was produced. episode of (episodes only) (Movie object) The series to which the episode belongs. season (episodes only) (int) The season number. episode (episodes only) (int) The number of the episode in the season. long imdb episode title (episodes only) (string) Episode and series title. series title (string) The title of the series to which the episode belongs. canonical series title (string) The canonical title of the series to which the episode belongs. Other keys that contain a list of Person objects are: costume designer, sound crew, crewmembers, editor, production manager, visual effects, assistant director, art department, composer, art director, cinematographer, make up, stunt performer, producer, set decorator, production designer. Other keys that contain list of companies are: production companies, special effects, sound mix, special effects companies, miscellaneous companies, distributors. Converting a title to its "Title, The" canonical format, IMDbPY makes some assumptions about what is an article and what isn't, and this could lead to some wrong canonical titles. For more information on this subject, see the "ARTICLES IN TITLES" section of the README.locale file. imdbpy-6.8/docs/usage/person.rst000066400000000000000000000014771351454127000167620ustar00rootroot00000000000000:orphan: Persons ======= It works mostly like the Movie class. :-) The Movie class defines a ``__contains__()`` method, which is used to check if a given person has worked in a given movie with the syntax: .. 
code-block:: python if personObject in movieObject: print('%s worked in %s' % (personObject['name'], movieObject['title'])) The Person class defines a ``isSamePerson(otherPersonObject)`` method, which can be used to compare two person objects. This can be used to check whether an object has retrieved complete information or not, as in the case of a Person object returned by a query: .. code-block:: python if personObject.isSamePerson(otherPersonObject): print('they are the same person!') A similar method is defined for the Movie class, and it's called ``isSameTitle(otherMovieObject)``. imdbpy-6.8/docs/usage/ptdf.rst000066400000000000000000000333031351454127000164020ustar00rootroot00000000000000.. _ptdf: Old data files ============== .. warning:: Since the end of 2017, IMDb is no longer updating the data files which are described in this document. For working with the updated -but less comprehensive- downloadable data, check the :ref:`s3` document. Until the end of 2017, IMDb used to distribute some of its data as downloadable text files. IMDbPY can import this data into a database and make it accessible through its API. For this, you will first need to install `SQLAlchemy`_ and the libraries that are needed for the database server you want to use. Check out the `SQLAlchemy dialects`_ documentation for more detail. Then, follow these steps: #. Download the files from the following address and put all of them in the same directory: ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/frozendata/ You can just download the files you need instead of downloading all files. The files that are not downloaded will be skipped during import. This feature is still quite untested, so please report any bugs. .. warning:: Beware that the :file:`diffs` subdirectory contains **a lot** of files you **don't** need, so don't start mirroring everything! #. Create a database. Use a collation like ``utf8_unicode_ci``. #. 
Import the data using the :file:`imdbpy2sql.py` script:: imdbpy2sql.py -d /path/to/the/data_files_dir/ -u URI *URI* is the identifier used to access the SQL database. For example:: imdbpy2sql.py -d ~/Download/imdb-frozendata/ \ -u postgres://user:password@localhost/imdb Once the import is finished, you will have a SQL database with all the information and you can use the normal IMDbPY API: .. code-block:: python from imdb import IMDb ia = IMDb('sql', uri='postgres://user:password@localhost/imdb') results = ia.search_movie('the matrix') for result in results: print(result.movieID, result) matrix = results[0] ia.update(matrix) print(matrix.keys()) .. note:: It should be noted that the :file:`imdbpy2sql.py` script will not create any foreign keys, but only indexes. If you need foreign keys, try using the version in the "imdbpy-legacy" branch. If you need instructions on how to manually build the foreign keys, see `this comment by Andrew D Bate`_. Performance ----------- The import performance hugely depends on the underlying module used to access the database. The :file:`imdbpy2sql.py` script has a number of command line arguments for choosing presets that can improve performance in specific database servers. The fastest database appears to be MySQL, with about 200 minutes to complete on my test system (read below). A lot of memory (RAM or swap space) is required, in the range of at least 250/500 megabytes (plus more for the database server). In the end, the database requires between 2.5GB and 5GB of disk space. As said, the performance varies greatly using one database server or another. MySQL, for instance, has an ``executemany()`` method of the cursor object that accepts multiple data insertion with a single SQL statement; other databases require a call to the ``execute()`` method for every single row of data, and they will be much slower -2 to 7 times slower than MySQL. 
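The difference between the two insertion styles mentioned above can be sketched with the standard DB-API cursor calls. This is only an illustration, not the code used by :file:`imdbpy2sql.py`: the ``sqlite3`` module is used here merely because it ships with Python, the table names are made up, and how efficiently a given driver implements ``executemany()`` is an assumption that varies between database modules:

.. code-block:: python

    import sqlite3

    rows = [(1, 'The Untouchables'), (2, 'The Matrix'), (3, 'Die Hard')]

    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE batch_demo (id INTEGER, title TEXT)')
    conn.execute('CREATE TABLE loop_demo (id INTEGER, title TEXT)')

    # Batch style: one call hands the whole list of rows to the driver.
    conn.executemany('INSERT INTO batch_demo VALUES (?, ?)', rows)

    # Row-by-row style: one execute() call for every single row.
    for row in rows:
        conn.execute('INSERT INTO loop_demo VALUES (?, ?)', row)

    conn.commit()
    batch_count = conn.execute('SELECT COUNT(*) FROM batch_demo').fetchone()[0]
    loop_count = conn.execute('SELECT COUNT(*) FROM loop_demo').fetchone()[0]
    conn.close()

Both styles store the same rows; what changes is the number of calls (and, depending on the driver, of SQL statements) needed to get them there, which is where the speed difference described above comes from.
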
There are generic suggestions that can lead to better performance, such as turning off your filesystem journaling (so it can be a good idea to remount an ext3 filesystem as ext2 for example). Another option is using a ramdisk/tmpfs, if you have enough RAM. Obviously these have effect only at insert-time; during day-to-day use, you can turn journaling on again. You can also consider using CSV output as explained below, if your database server can import CSV files. I've done some tests, using an AMD Athlon 1800+, 1GB of RAM, over a complete plain text data files set (as of 11 Apr 2008, with more than 1.200.000 titles and over 2.200.000 names): +----------------------+------------------------------------------------------+ | database | time in minutes: total (insert data/create indexes) | +======================+======================================================+ | MySQL 5.0 MyISAM | 205 (160/45) | +----------------------+------------------------------------------------------+ | MySQL 5.0 InnoDB | _untested_, see NOTES below | +----------------------+------------------------------------------------------+ | PostgreSQL 8.1 | 560 (530/30) | +----------------------+------------------------------------------------------+ | SQLite 3.3 | ??? (150/???) -very slow building indexes | | | | | | Timed with the "--sqlite-transactions" command | | | | | | line option; otherwise it's _really_ slow: | | | | | | even 35 hours or more | +----------------------+------------------------------------------------------+ | SQLite 3.7 | 65/13 - with --sqlite-transactions | | | and using an SSD disk | +----------------------+------------------------------------------------------+ | SQL Server | about 3 or 4 hours | +----------------------+------------------------------------------------------+ If you have different experiences, please tell me! As expected, the most important things that you can do to improve performance are: #. Use an in-memory filesystem or an SSD disk. #. 
Use the ``-c /path/to/empty/dir`` argument to use CSV files. #. Follow the specific notes about your database server. Notes ----- [save the output] The imdbpy2sql.py will print a lot of debug information on standard output; you can save it in a file, appending (without quotes) "2>&1 | tee output.txt" [Microsoft Windows paths] It's much safer, in a Microsoft Windows environment, to use full paths for the values of the '-c' and '-d' arguments, complete with drive letter. The best thing is to use _UNIX_ path separator, and to add a leading separator, e.g.:: -d C:/path/to/imdb_files/ -c C:/path/to/csv_tmp_files/ [MySQL] In general, if you get an annoyingly high number of "TOO MANY DATA ... SPLITTING" lines, consider increasing max_allowed_packet (in the configuration of your MySQL server) to at least 8M or 16M. Otherwise, inserting the data will be very slow, and some data may be lost. [MySQL InnoDB and MyISAM] InnoDB is abysmally slow for our purposes: my suggestion is to always use MyISAM tables and -if you really want to use InnoDB- convert the tables later. The imdbpy2sql.py script provides a simple way to manage these cases, see ADVANCED FEATURES below. In my opinion, the cleaner thing to do is to set the server to use MyISAM tables or -if you can't modify the server- use the ``--mysql-force-myisam`` command line option of imdbpy2sql.py. Anyway, if you really need to use InnoDB, in the server-side settings I recommend setting innodb_file_per_table to "true". Beware that the conversion will be extremely slow (some hours), but still faster than using InnoDB from the start. You can use the "--mysql-innodb" command line option to force the creation of a database with MyISAM tables, converted at the end into InnoDB. [Microsoft SQL Server/SQLExpress] If you get an error about how wrong and against nature the blasphemous act of inserting an identity key is, you can try to fix it with the new custom queries support; see ADVANCED FEATURES below.
As a shortcut, you can use the "--ms-sqlserver" command line option to set all the needed options. [SQLite speed-up] For some reason, SQLite is really slow, except when used with transactions; you can use the "--sqlite-transactions" command line option to obtain acceptable performance. The same command also turns off "PRAGMA synchronous". SQLite seems to hugely benefit from the use of a non-journaling filesystem and/or of a ramdisk/tmpfs: see the generic suggestions discussed above in the Performance section. [SQLite failure] It seems that with older versions of the python-sqlite package, the first run may fail; if you get a DatabaseError exception saying "no such table", try running the command again with the same arguments. Double funny, huh? ;-) [data truncated] If you get an insane amount (hundreds or thousands, on various text columns) of warnings like these: imdbpy2sql.py:727: Warning: Data truncated for column 'person_role' at row 4979 CURS.executemany(self.sqlString, self.converter(self.values())) you probably have a problem with the configuration of your database. The error comes from strings that get cut at the first non-ASCII character (and so you're losing a lot of information). To solve this problem, you must be sure that your database server is set up properly, and that the library/client in use is configured to communicate with the server in a consistent way. For example, for MySQL you can set:: character-set-server = utf8 default-collation = utf8_unicode_ci default-character-set = utf8 or even:: character-set-server = latin1 default-collation = latin1_bin default-character-set = latin1 [adult titles] Beware that, while running, the imdbpy2sql.py script will output a lot of strings containing both person names and movie titles. The script has absolutely no way of knowing that the processed title is an adult-only movie, so... if you leave it on and your little daughter runs to you screaming "daddy! daddy! 
what kind of animals does Rocco train in the documentary 'Rocco: Animal Trainer 17'???"... well, it's not my fault! ;-) Advanced features ----------------- With the -e (or --execute) command line argument you can specify custom queries to be executed at certain times, with the syntax:: -e "TIME:[OPTIONAL_MODIFIER:]QUERY" where TIME is one of: 'BEGIN', 'BEFORE_DROP', 'BEFORE_CREATE', 'AFTER_CREATE', 'BEFORE_MOVIES', 'BEFORE_CAST', 'BEFORE_RESTORE', 'BEFORE_INDEXES', 'END'. The only available OPTIONAL_MODIFIER is 'FOR_EVERY_TABLE' and it means that the QUERY command will be executed for every table in the database (so it doesn't make much sense to use it with BEGIN, BEFORE_DROP or BEFORE_CREATE time...), replacing the "%(table)s" text in the QUERY with the appropriate table name. Other available TIMEs are: 'BEFORE_MOVIES_TODB', 'AFTER_MOVIES_TODB', 'BEFORE_PERSONS_TODB', 'AFTER_PERSONS_TODB', 'BEFORE_CHARACTERS_TODB', 'AFTER_CHARACTERS_TODB', 'BEFORE_SQLDATA_TODB', 'AFTER_SQLDATA_TODB', 'BEFORE_AKAMOVIES_TODB' and 'AFTER_AKAMOVIES_TODB'; they take no modifiers. Special TIMEs 'BEFORE_EVERY_TODB' and 'AFTER_EVERY_TODB' apply to every BEFORE_* and AFTER_* TIME mentioned above. These commands are executed before and after every _toDB() call in their respective objects (CACHE_MID, CACHE_PID and SQLData instances); the "%(table)s" text in the QUERY is replaced as above. You can specify as many -e arguments as you need, even if they refer to the same TIME: they will be executed from the first to the last. Also, always remember to correctly escape queries: after all you're passing them on the command line! E.g. 
(ok, quite a silly example...):: -e "AFTER_CREATE:SELECT * FROM title;" The most useful case is when you want to convert the tables of a MySQL database from MyISAM to InnoDB:: -e "END:FOR_EVERY_TABLE:ALTER TABLE %(table)s ENGINE=InnoDB;" If your system uses InnoDB by default, you can trick it with:: -e "AFTER_CREATE:FOR_EVERY_TABLE:ALTER TABLE %(table)s ENGINE=MyISAM;" -e "END:FOR_EVERY_TABLE:ALTER TABLE %(table)s ENGINE=InnoDB;" You can use the "--mysql-innodb" command line option as a shortcut for the above commands. Cool, huh? Another possible use is to fix a problem with Microsoft SQLServer/SQLExpress. To prevent errors setting IDENTITY fields, you can run something like this:: -e 'BEFORE_EVERY_TODB:SET IDENTITY_INSERT %(table)s ON' -e 'AFTER_EVERY_TODB:SET IDENTITY_INSERT %(table)s OFF' You can use the "--ms-sqlserver" command line option as a shortcut for the above commands. To use transactions to speed up SQLite, try:: -e 'BEFORE_EVERY_TODB:BEGIN TRANSACTION;' -e 'AFTER_EVERY_TODB:COMMIT;' This is the same thing that the "--sqlite-transactions" command line option does. CSV files --------- .. note:: Keep in mind that not all database servers support this. Moreover, you can run into problems. For example, if you're using PostgreSQL, your server process will need read access to the directory where the CSV files are stored. To create the database using a set of CSV files, run :file:`imdbpy2sql.py` as follows:: imdbpy2sql.py -d /dir/with/plainTextDataFiles/ -u URI \ -c /path/to/the/csv_files_dir/ The created CSV files will be imported near the end of processing. After the import is finished, you can safely remove these files. Since version 4.5, it's possible to separate the two steps involved when using CSV files: - With the ``--csv-only-write`` command line option, the old database will be truncated and the CSV files saved, along with imdbID information.
- With the ``--csv-only-load`` option, these saved files can be loaded into an existing database (this database MUST be the one left almost empty by the previous run). Beware that right now the whole procedure is not very well tested. For both commands, you still have to specify all of the ``-u URI -d /path/plainTextDataFiles/ -c /path/CSVfiles/`` arguments. .. _SQLAlchemy: https://www.sqlalchemy.org/ .. _SQLAlchemy dialects: http://docs.sqlalchemy.org/en/latest/dialects/ .. _this comment by Andrew D Bate: https://github.com/alberanid/imdbpy/issues/130#issuecomment-365707620 imdbpy-6.8/docs/usage/query.rst000066400000000000000000000110461351454127000166120ustar00rootroot00000000000000:orphan: Querying data ============= Method descriptions: ``search_movie(title)`` Searches for the given title, and returns a list of Movie objects containing only basic information like the movie title and year, and with a "movieID" instance variable: - ``movieID`` is an identifier of some kind; for the sake of simplicity you can think of it as the ID used by the IMDb web server to uniquely identify a movie (e.g.: '0094226' for Brian De Palma's "The Untouchables"), but keep in mind that it's not necessarily the same ID!!! For some implementations of the "data access system" these two IDs can be the same (as is the case for the 'http' data access system), but other access systems can use a totally different kind of movieID. The easiest (I hope!) way to understand this is to think of the movieID returned by the search_movie() method as the *thing* you have to pass to the get_movie() method, so that it can retrieve info about the referred movie. So, movieID *can* be the imdbID ('0094226') if you're accessing the web server, but with a SQL installation of the IMDb database, movieID will be an integer, as read from the id column in the database. ``search_episode(title)`` This is identical to ``search_movie()``, except that it is tailored to searching for titles of TV series episodes.
Best results are expected when searching for just the title of the episode, *without* the title of the TV series. ``get_movie(movieID)`` This will fetch the needed data and return a Movie object for the movie referenced by the given movieID. The Movie class can be found in the Movie module. A Movie object presents basically the same interface as a Python dictionary; so you can access, for example, the list of actors and actresses using the syntax ``movieObject['cast']``. The ``search_person(name)``, ``get_person(personID)``, ``search_character(name)``, ``get_character(characterID)``, ``search_company(name)``, and ``get_company(companyID)`` methods work the same way as ``search_movie(title)`` and ``get_movie(movieID)``. The ``search_keyword(string)`` method returns a list of strings that are valid keywords, similar to the one given. The ``get_keyword(keyword)`` method returns a list of Movie instances that are tagged with the given keyword. The ``get_imdbMovieID(movieID)``, ``get_imdbPersonID(personID)``, ``get_imdbCharacterID(characterID)``, and ``get_imdbCompanyID(companyID)`` methods take, respectively, a movieID, a personID, a characterID, or a companyID and return the corresponding imdbID; it's safer to use the ``get_imdbID(MovieOrPersonOrCharacterOrCompanyObject)`` method. The ``title2imdbID(title)``, ``name2imdbID(name)``, ``character2imdbID(name)``, and ``company2imdbID(name)`` methods take, respectively, a movie title (in the plain text data files format), a person name, a character name, or a company name, and return the corresponding imdbID; when possible it's safer to use the ``get_imdbID(MovieOrPersonOrCharacterOrCompanyObject)`` method. The ``get_imdbID(MovieOrPersonOrCharacterOrCompanyObject)`` method returns the imdbID for the given Movie, Person, Character or Company object.
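As a rough illustration of preferring ``get_imdbID()`` over the ID-specific methods, the sketch below collects imdbIDs for a list of objects. ``imdb_ids_for`` is a hypothetical helper, not part of IMDbPY; it assumes that ``access`` is the object returned by ``imdb.IMDb()`` and that ``get_imdbID()`` returns ``None`` when no imdbID can be determined.

```python
def imdb_ids_for(access, objects):
    """Return (object, imdbID) pairs using the access object's get_imdbID().

    Hypothetical helper, not part of IMDbPY. `access` is expected to be
    the object returned by imdb.IMDb(); `objects` is any iterable of
    Movie, Person, Character or Company instances.
    """
    pairs = []
    for obj in objects:
        imdb_id = access.get_imdbID(obj)
        # Assumption: None means the imdbID could not be determined,
        # so the object is skipped rather than reported.
        if imdb_id is not None:
            pairs.append((obj, imdb_id))
    return pairs
```

With a real access object this could be called as ``imdb_ids_for(ia, ia.search_movie('matrix'))``; with an SQL-based access system the returned imdbIDs will generally differ from the objects' ``movieID`` values.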
The ``get_imdbURL(MovieOrPersonOrCharacterOrCompanyObject)`` method returns a string with the main IMDb URL for the given Movie, Person, Character, or Company object; it does its best to retrieve the URL. The ``update(MovieOrPersonOrCharacterOrCompanyObject)`` method takes an instance of a Movie, Person, Character, or Company class, and retrieves other available information. Remember that the ``search_*(txt)`` methods will return a list of Movie, Person, Character or Company objects with only basic information, such as the movie title or the person/character name. So, ``update()`` can be used to retrieve all the other information. By default a "reasonable" set of information is retrieved: 'main', 'filmography', and 'biography' for a Person/Character object; 'main' and 'plot' for a Movie object; 'main' for a Company object. Example: .. code-block:: python # only basic information like the title will be printed. print(first_match.summary()) # update the information for this movie. i.update(first_match) # a lot of information will be printed! print(first_match.summary()) # retrieve trivia information i.update(first_match, 'trivia') print(first_match['trivia']) # retrieve both 'quotes' and 'goofs' information (with a list or tuple) i.update(first_match, ['quotes', 'goofs']) print(first_match['quotes']) print(first_match['goofs']) # retrieve all the available information. i.update(first_match, 'all') imdbpy-6.8/docs/usage/quickstart.rst000066400000000000000000000076701351454127000176470ustar00rootroot00000000000000Quick start =========== The first thing to do is to import :mod:`imdb` and call the :func:`imdb.IMDb` function to get an access object through which IMDb data can be retrieved: .. code-block:: python >>> import imdb >>> ia = imdb.IMDb() By default this will fetch the data from the IMDb web server but there are other options. See the :ref:`access systems ` document for more information.
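As a minimal sketch of choosing between the default web access and a database-backed one, the helper below builds the positional arguments for ``imdb.IMDb()``. Both ``access_args`` and ``connect`` are hypothetical names introduced here for illustration; the ``'s3'`` access system and its database URI argument are the ones described in the S3 datasets document.

```python
def access_args(db_uri=None):
    """Return the positional arguments to pass to imdb.IMDb().

    Hypothetical helper: with no URI the default (web) access system
    is used; with a URI the 's3' access system is selected.
    """
    if db_uri is None:
        return ()
    return ('s3', db_uri)


def connect(db_uri=None):
    """Create an access object; hypothetical convenience wrapper."""
    # Imported lazily so that access_args() stays usable without IMDbPY.
    import imdb
    return imdb.IMDb(*access_args(db_uri))
```

For example, ``connect()`` would behave like ``imdb.IMDb()``, while ``connect('postgres://user:password@localhost/imdb')`` would open a previously imported S3 database.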
Searching --------- You can use the :meth:`search_movie ` method of the access object to search for movies with a given (or similar) title. For example, to search for movies with titles like "matrix": .. code-block:: python >>> movies = ia.search_movie('matrix') >>> movies[0] Similarly, you can search for people and companies using the :meth:`search_person ` and the :meth:`search_company ` methods: .. code-block:: python >>> people = ia.search_person('angelina') >>> people[0] >>> companies = ia.search_company('rko') >>> companies[0] As the examples indicate, the results are lists of :class:`Movie `, :class:`Person `, or :class:`Company ` objects. These behave like dictionaries, i.e. they can be queried by giving the key of the data you want to obtain: .. code-block:: python >>> movies[0]['title'] 'The Matrix' >>> people[0]['name'] 'Angelina Jolie' >>> companies[0]['name'] 'RKO' Movie, person, and company objects have id attributes which -when fetched through the IMDb web server- store the IMDb id of the object: .. code-block:: python >>> movies[0].movieID '0133093' >>> people[0].personID '0001401' >>> companies[0].companyID '0226417' Retrieving ---------- If you know the IMDb id of a movie, you can use the :meth:`get_movie ` method to retrieve its data. For example, the movie "The Untouchables" by Brian De Palma has the id "0094226": .. code-block:: python >>> movie = ia.get_movie('0094226') >>> movie Similarly, the :meth:`get_person ` and the :meth:`get_company ` methods can be used for retrieving :class:`Person ` and :class:`Company ` data: .. code-block:: python >>> person = ia.get_person('0000206') >>> person['name'] 'Keanu Reeves' >>> person['birth date'] '1964-9-2' >>> company = ia.get_company('0017902') >>> company['name'] 'Pixar Animation Studios' Keywords -------- You can search for keywords similar to the one provided: .. 
code-block:: python >>> keywords = ia.search_keyword('dystopia') >>> keywords ['dystopia', 'dystopian-future', ..., 'dystopic-future'] And movies that match a given keyword: .. code-block:: python >>> movies = ia.get_keyword('dystopia') >>> len(movies) 50 >>> movies[0] Top / bottom movies ------------------- It's possible to retrieve the list of top 250 and bottom 100 movies: [#sql_bottom]_ .. code-block:: python >>> top = ia.get_top250_movies() >>> top[0] >>> bottom = ia.get_bottom100_movies() >>> bottom[0] Exceptions ---------- Any error related to IMDbPY can be caught by checking for the :class:`imdb.IMDbError` exception: .. code-block:: python from imdb import IMDb, IMDbError try: ia = IMDb() people = ia.search_person('Mel Gibson') except IMDbError as e: print(e) .. [#sql_bottom] Beware that in an SQL-based access system, the bottom 100 list is limited to the first 10 results. imdbpy-6.8/docs/usage/role.rst000066400000000000000000000150511351454127000164060ustar00rootroot00000000000000Roles ===== When parsing data of a movie, you'll encounter references to the people who worked on it, like its cast, director and crew members. For people in the cast (actors and actresses), the :attr:`currentRole ` attribute is set to the name of the character they played: .. code-block:: python >>> movie = ia.get_movie('0075860') >>> movie >>> actor = movie['cast'][6] >>> actor >>> actor['name'] 'Warren J. Kemmerling' >>> actor.currentRole 'Wild Bill' Miscellaneous data, such as an AKA name for the actor or an "uncredited" notice, is stored in the :attr:`notes ` attribute: .. code-block:: python >>> actor.notes '(as Warren Kemmerling)' For crew members other than the cast, the :attr:`notes ` attribute contains the description of the person's job: .. code-block:: python >>> crew_member = movie['art department'][0] >>> crew_member >>> crew_member.notes 'property master' The ``in`` operator can be used to check whether a person worked in a given movie or not: .. 
code-block:: python >>> movie >>> actor >>> actor in movie True >>> crew_member >>> crew_member in movie True >>> person >>> person in movie False Obviously these Person objects contain only information directly available upon parsing the movie pages, e.g.: the name, an imdbID, the role. So if you now write:: print(writer['actor']) to get a list of movies Mel Gibson acted in, you'll get a KeyError exception, because the Person object doesn't contain this kind of information. The same is true when parsing person data: you'll find a list of movies the person worked on and, for every movie, the currentRole instance variable is set to a string describing the role of the considered person: .. code-block:: python # Julia Roberts julia = i.get_person('0000210') # Output a list of movies she acted in and the played role # separated by '::' print([movie['title'] + '::' + movie.currentRole for movie in julia['actress']]) Here the various Movie objects only contain minimal information, like the title and the year; the latest movie with Julia Roberts: .. code-block:: python last = julia['actress'][0] # Retrieve full information i.update(last) # name of the first director print(last['director'][0]['name']) .. note:: Since the end of 2017, IMDb has removed the Character kind of information. This document is still valid, but only for the obsolete "sql" data access system. Since version 3.3, IMDbPY supports the character pages of the IMDb database; this required some substantial changes to how actors' and actresses' roles were handled. Starting with release 3.4, the "sql" data access system is supported, too - but it works a bit differently from "http". See "SQL" below. The currentRole instance attribute can be found in every instance of Person, Movie and Character classes, even though the Character class never actually uses it. The currentRole of a Person object is set to a Character instance, inside a list of people who acted in a given movie.
The currentRole of a Movie object is set to a Character instance, inside a list of movies played by a given person. The currentRole of a Movie object is set to a Person instance, inside a list of movies in which a given character was portrayed. Schema:: movie['cast'][0].currentRole -> a Character object. | +-> a Person object. person['actor'][0].currentRole -> a Character object. | +-> a Movie object. character['filmography'][0].currentRole -> a Person object. | +-> a Movie object. The roleID attribute can be used to access/set the characterID or personID instance attribute of the current currentRole. When building Movie or Person objects, you can pass the currentRole parameter and the roleID parameter (to set the ID). The currentRole parameter can be an object (Character or Person), a string (in which case a Character or Person object is automatically instantiated) or a list of objects or strings (to handle multiple characters played by the same actor/actress in a movie, or a character played by more than a single actor/actress in the same movie). Anyway, currentRole objects (Character or Person instances) can be pretty-printed easily: calling str(CharacterOrPersonObject) (unicode() on Python 2) will return a good-old-string. SQL --- When fetching data from the web, only characters with an active page on the web site will have their characterID; we don't have this information when accessing through "sql", so *every* character will have an associated characterID. This way, every character with the same name will share the same characterID, even if - in fact - they may not be portraying the same character. Goodies ------- To help get the required information from Movie, Person and Character objects, in the "helpers" module there's a new factory function, makeObject2Txt, which can be used to create your pretty-printing function.
It takes some optional parameters: movieTxt, personTxt, characterTxt and companyTxt; in these strings %(value)s items are replaced with object['value'] or with obj.value (if the first is not present). E.g.: .. code-block:: python import imdb.helpers myPrint = imdb.helpers.makeObject2Txt(personTxt=u'%(name)s ... %(currentRole)s') i = imdb.IMDb() m = i.get_movie('0057012') ps = m['cast'][0] print(myPrint(ps)) # The output will be something like: # Peter Sellers ... Group Captain Lionel Mandrake / President Merkin Muffley / Dr. Strangelove Portions of the formatting string can be stripped conditionally: if the specified condition is false, they will be removed. E.g.:: myPrint = imdb.helpers.makeObject2Txt(personTxt='%(long imdb name)s ... %(currentRole)s %(notes)s') Another useful argument is 'applyToValues': if set to a function, it will be applied to every value before the substitution; it can be useful to format strings for HTML output. imdbpy-6.8/docs/usage/s3.rst000066400000000000000000000041311351454127000157670ustar00rootroot00000000000000.. _s3: S3 datasets =========== IMDb distributes some of its data as downloadable `datasets`_. IMDbPY can import this data into a database and make it accessible through its API. [#ptdf]_ For this, you will first need to install `SQLAlchemy`_ and the libraries that are needed for the database server you want to use. Check out the `SQLAlchemy dialects`_ documentation for more detail. Then, follow these steps: #. Download the files from the following address and put all of them in the same directory: https://datasets.imdbws.com/ #. Create a database. Use a collation like ``utf8_unicode_ci``. #. Import the data using the :file:`s32imdbpy.py` script:: s32imdbpy.py /path/to/the/tsv.gz/files/ URI *URI* is the identifier used to access the SQL database.
For example:: s32imdbpy.py ~/Download/imdb-s3-dataset-2018-02-07/ \ postgres://user:password@localhost/imdb Please notice that for some database engines (like MySQL and MariaDB) you may need to specify the charset on the URI and sometimes also the dialect, with something like 'mysql+mysqldb://username:password@localhost/imdb?charset=utf8' Once the import is finished - which should take about an hour or less on a modern system - you will have a SQL database with all the information and you can use the normal IMDbPY API: .. code-block:: python from imdb import IMDb ia = IMDb('s3', 'postgres://user:password@localhost/imdb') results = ia.search_movie('the matrix') for result in results: print(result.movieID, result) matrix = results[0] ia.update(matrix) print(matrix.keys()) .. note:: Running the script again will drop the current tables and import the data again. .. [#ptdf] Until the end of 2017, IMDb used to distribute a more comprehensive subset of its data in a different format. IMDbPY can also import that data but note that the data is not being updated anymore. For more information, see :ref:`ptdf`. .. _datasets: https://www.imdb.com/interfaces/ .. _SQLAlchemy: https://www.sqlalchemy.org/ .. _SQLAlchemy dialects: http://docs.sqlalchemy.org/en/latest/dialects/ imdbpy-6.8/docs/usage/series.rst000066400000000000000000000146061351454127000167440ustar00rootroot00000000000000Series ====== As on the IMDb site, each TV series and also each of a TV series' episodes is treated as a regular title, just like a movie. The ``kind`` key can be used to distinguish series and episodes from movies: .. code-block:: python >>> series = ia.get_movie('0389564') >>> series >>> series['kind'] 'tv series' >>> episode = ia.get_movie('0502803') >>> episode >>> episode['kind'] 'episode' The episodes of a series can be fetched using the "episodes" infoset. This infoset adds an ``episodes`` key which is a dictionary from season numbers to episodes. 
And each season is a dictionary from episode numbers within the season to the episodes. Note that the season and episode numbers don't start from 0; they are the numbers given by the IMDb: .. code-block:: python >>> ia.update(series, 'episodes') >>> sorted(series['episodes'].keys()) [1, 2, 3, 4] >>> season4 = series['episodes'][4] >>> len(season4) 13 >>> episode = series['episodes'][4][2] >>> episode >>> episode['season'] 4 >>> episode['episode'] 2 The title of the episode doesn't contain the title of the series: .. code-block:: python >>> episode['title'] 'Fear Itself' >>> episode['series title'] 'The 4400' The episode also contains a key that refers to the series, but beware that, to avoid circular references, it's not the same object as the series object we started with: .. code-block:: python >>> episode['episode of'] >>> series Titles ------ The ``analyze_title()`` and ``build_title()`` functions now support TV episodes. You can pass a string to the ``analyze_title`` function in the format used by the web server (``"The Series" The Episode (2005)``) or in the format of the plain text data files (``"The Series" (2004) {The Episode (#ser.epi)}``). For example, if you call the function:: analyze_title('"The Series" The Episode (2005)') the result will be:: { 'kind': 'episode', # kind is set to 'episode' 'year': '2005', # release year of this episode 'title': 'The Episode', # episode title 'episode of': { # 'episode of' will contain 'kind': 'tv series', # information about the series 'title': 'The Series' } } The ``episode of`` key can be a dictionary or a ``Movie`` instance with the same information. 
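To make the structure of that result more concrete, here is a small sketch that turns such a dictionary back into a one-line label. ``describe_title`` is a hypothetical helper written only for illustration (in real code ``build_title()`` does this job); it relies only on item access by key, which works for a plain dictionary and for a ``Movie`` instance alike.

```python
def describe_title(info):
    """Return a one-line label for an analyze_title()-style result.

    Hypothetical helper, not part of IMDbPY.
    """
    title = info.get('title', '')
    year = info.get('year')
    label = '%s (%s)' % (title, year) if year else title
    series = info.get('episode of')
    if info.get('kind') == 'episode' and series is not None:
        # 'episode of' may be a dict or a Movie instance; both
        # support lookup by key, so no type check is needed.
        return '"%s" %s' % (series['title'], label)
    return label


parsed = {
    'kind': 'episode',
    'year': '2005',
    'title': 'The Episode',
    'episode of': {'kind': 'tv series', 'title': 'The Series'},
}
print(describe_title(parsed))  # "The Series" The Episode (2005)
```

The output mirrors the web server format of the string that was analyzed in the first place.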
The ``build_title()`` function takes an optional argument: ``ptdf``, which when set to false (the default) returns the title of the episode in the format used by the IMDb's web server ("The Series" An Episode (2006)); otherwise, it uses the format used by the plain text data files (something like "The Series" (2004) {An Episode (#2.5)}). Full credits ------------ When retrieving credits for a TV series or mini-series, you may notice that many long lists (like "cast" and "writers") are incomplete. You can fetch the complete list of cast and crew with the "full credits" data set: .. code-block:: python >>> series = ia.get_movie('0285331') >>> series >>> len(series['cast']) 50 >>> ia.update(series, 'full credits') >>> len(series['cast']) 2514 If you prefer, you can retrieve the complete cast of every episode, keeping the lists separated for each episode. Instead of retrieving with:: ia.update(series, 'episodes') use:: ia.update(series, 'episodes cast') or the equivalent:: ia.update(series, 'guests') Now you end up having the same information as if you had updated the 'episodes' info set, but every Movie object inside the dictionary of dictionaries has the complete cast, e.g.:: cast = series['episodes'][1][2]['cast'] # cast list for the second episode # of the first season. Beware that both 'episodes cast' and 'guests' will update the 'episodes' key (and not 'episodes cast' or 'guests'). Ratings ------- You can retrieve rating information about every episode in a TV series or mini series using the 'episodes rating' data set. People ------ You can retrieve information about single episodes acted/directed/... by a person. .. code-block:: python from imdb import IMDb i = IMDb() p = i.get_person('0005041') # Laura Innes p['actress'][0] # # At this point you have an entry (in keys like 'actor', 'actress', # 'director', ...) for every series the person starred/worked in, but # you know nothing about single episodes.
i.update(p, 'episodes') # updates information about single episodes. p['episodes'] # a dictionary with the format: # {: [ # , # , # ... # ], # ... # } er = p['actress'][0] # ER tv series p['episodes'][er] # list of Movie objects; one for every ER episode # she starred/worked in p['episodes'][er][0] # p['episodes'][er][0]['kind'] # 'episode' p['episodes'][er][0].currentRole # 'Dr. Kerry Weaver' Goodies ------- In the ``imdb.helpers`` module there are some functions useful to manage lists of episodes: - ``sortedSeasons(m)`` returns a sorted list of seasons of the given series, e.g.: .. code-block:: python >>> from imdb import IMDb >>> from imdb.helpers import sortedSeasons >>> i = IMDb() >>> m = i.get_movie('0411008') >>> i.update(m, 'episodes') >>> sortedSeasons(m) [1, 2] - ``sortedEpisodes(m, season=None)`` returns a sorted list of episodes of the given series for only the specified season(s) (if None, every season), e.g.: .. code-block:: python >>> from imdb import IMDb >>> from imdb.helpers import sortedEpisodes >>> i = IMDb() >>> m = i.get_movie('0411008') >>> i.update(m, 'episodes') >>> sortedEpisodes(m, season=1) [, , ...] imdbpy-6.8/imdb/000077500000000000000000000000001351454127000135705ustar00rootroot00000000000000imdbpy-6.8/imdb/Character.py000066400000000000000000000157461351454127000160530ustar00rootroot00000000000000# Copyright 2007-2019 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details.
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the Character class, used to store information about a given character. """ from __future__ import absolute_import, division, print_function, unicode_literals from copy import deepcopy from imdb._exceptions import IMDbParserError from imdb.utils import _Container, analyze_name, build_name, cmpPeople, flatten class Character(_Container): """A Character. Every information about a character can be accessed as:: characterObject['information'] to get a list of the kind of information stored in a Character object, use the keys() method; some useful aliases are defined (as "also known as" for the "akas" key); see the keys_alias dictionary. """ # The default sets of information retrieved. default_info = ('main', 'filmography', 'biography') # Aliases for some not-so-intuitive keys. keys_alias = { 'mini biography': 'biography', 'bio': 'biography', 'character biography': 'biography', 'character biographies': 'biography', 'biographies': 'biography', 'character bio': 'biography', 'aka': 'akas', 'also known as': 'akas', 'alternate names': 'akas', 'personal quotes': 'quotes', 'keys': 'keywords', 'keyword': 'keywords' } keys_tomodify_list = ('biography', 'quotes') cmpFunct = cmpPeople def _init(self, **kwds): """Initialize a Character object. *characterID* -- the unique identifier for the character. *name* -- the name of the Character, if not in the data dictionary. *myName* -- the nickname you use for this character. *myID* -- your personal id for this character. *data* -- a dictionary used to initialize the object. *notes* -- notes about the given character. *accessSystem* -- a string representing the data access system used. *titlesRefs* -- a dictionary with references to movies. *namesRefs* -- a dictionary with references to persons. 
*charactersRefs* -- a dictionary with references to characters. *modFunct* -- function called returning text fields. """ name = kwds.get('name') if name and 'name' not in self.data: self.set_name(name) self.characterID = kwds.get('characterID', None) self.myName = kwds.get('myName', '') def _reset(self): """Reset the Character object.""" self.characterID = None self.myName = '' def set_name(self, name): """Set the name of the character.""" try: d = analyze_name(name) self.data.update(d) except IMDbParserError: pass def _additional_keys(self): """Valid keys to append to the data.keys() list.""" addkeys = [] if 'name' in self.data: addkeys += ['long imdb name'] if 'headshot' in self.data: addkeys += ['full-size headshot'] return addkeys def _getitem(self, key): """Handle special keys.""" # XXX: can a character have an imdbIndex? if 'name' in self.data: if key == 'long imdb name': return build_name(self.data) return None def getID(self): """Return the characterID.""" return self.characterID def __bool__(self): """The Character is "false" if the self.data does not contain a name.""" # XXX: check the name and the characterID? 
return bool(self.data.get('name')) def __contains__(self, item): """Return true if this Character was portrayed in the given Movie or it was impersonated by the given Person.""" from .Movie import Movie from .Person import Person if isinstance(item, Person): for m in flatten(self.data, yieldDictKeys=True, scalar=Movie): if item.isSame(m.currentRole): return True elif isinstance(item, Movie): for m in flatten(self.data, yieldDictKeys=True, scalar=Movie): if item.isSame(m): return True elif isinstance(item, str): return item in self.data return False def isSameName(self, other): """Return true if two character have the same name and/or characterID.""" if not isinstance(other, self.__class__): return False if 'name' in self.data and 'name' in other.data and \ build_name(self.data, canonical=False) == build_name(other.data, canonical=False): return True if self.accessSystem == other.accessSystem and \ self.characterID is not None and \ self.characterID == other.characterID: return True return False isSameCharacter = isSameName def __deepcopy__(self, memo): """Return a deep copy of a Character instance.""" c = Character(name='', characterID=self.characterID, myName=self.myName, myID=self.myID, data=deepcopy(self.data, memo), notes=self.notes, accessSystem=self.accessSystem, titlesRefs=deepcopy(self.titlesRefs, memo), namesRefs=deepcopy(self.namesRefs, memo), charactersRefs=deepcopy(self.charactersRefs, memo)) c.current_info = list(self.current_info) c.set_mod_funct(self.modFunct) return c def __repr__(self): """String representation of a Character object.""" return '' % ( self.characterID, self.accessSystem, self.get('name') ) def __str__(self): """Simply print the short name.""" return self.get('name', '') def summary(self): """Return a string with a pretty-printed summary for the character.""" if not self: return '' s = 'Character\n=====\nName: %s\n' % self.get('name', '') bio = self.get('biography') if bio: s += 'Biography: %s\n' % bio[0] filmo = 
self.get('filmography') if filmo: a_list = [x.get('long imdb canonical title', '') for x in filmo[:5]] s += 'Last movies with this character: %s.\n' % '; '.join(a_list) return s imdbpy-6.8/imdb/Company.py000066400000000000000000000157361351454127000155640ustar00rootroot00000000000000# Copyright 2008-2017 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the company class, used to store information about a given company. """ from __future__ import absolute_import, division, print_function, unicode_literals from copy import deepcopy from imdb.utils import _Container from imdb.utils import analyze_company_name, build_company_name, cmpCompanies, flatten class Company(_Container): """A company. Every information about a company can be accessed as:: companyObject['information'] to get a list of the kind of information stored in a company object, use the keys() method; some useful aliases are defined (as "also known as" for the "akas" key); see the keys_alias dictionary. """ # The default sets of information retrieved. default_info = ('main',) # Aliases for some not-so-intuitive keys. 
keys_alias = { 'distributor': 'distributors', 'special effects company': 'special effects companies', 'other company': 'miscellaneous companies', 'miscellaneous company': 'miscellaneous companies', 'other companies': 'miscellaneous companies', 'misc companies': 'miscellaneous companies', 'misc company': 'miscellaneous companies', 'production company': 'production companies' } keys_tomodify_list = () cmpFunct = cmpCompanies def _init(self, **kwds): """Initialize a company object. *companyID* -- the unique identifier for the company. *name* -- the name of the company, if not in the data dictionary. *myName* -- the nickname you use for this company. *myID* -- your personal id for this company. *data* -- a dictionary used to initialize the object. *notes* -- notes about the given company. *accessSystem* -- a string representing the data access system used. *titlesRefs* -- a dictionary with references to movies. *namesRefs* -- a dictionary with references to persons. *charactersRefs* -- a dictionary with references to companies. *modFunct* -- function called returning text fields. """ name = kwds.get('name') if name and 'name' not in self.data: self.set_name(name) self.companyID = kwds.get('companyID', None) self.myName = kwds.get('myName', '') def _reset(self): """Reset the company object.""" self.companyID = None self.myName = '' def set_name(self, name): """Set the name of the company.""" # Company diverges a bit from other classes, being able # to directly handle its "notes". AND THAT'S PROBABLY A BAD IDEA! 
oname = name = name.strip() notes = '' if name.endswith(')'): fparidx = name.find('(') if fparidx != -1: notes = name[fparidx:] name = name[:fparidx].rstrip() if self.notes: name = oname d = analyze_company_name(name) self.data.update(d) if notes and not self.notes: self.notes = notes def _additional_keys(self): """Valid keys to append to the data.keys() list.""" if 'name' in self.data: return ['long imdb name'] return [] def _getitem(self, key): """Handle special keys.""" # XXX: can a company have an imdbIndex? if 'name' in self.data: if key == 'long imdb name': return build_company_name(self.data) return None def getID(self): """Return the companyID.""" return self.companyID def __bool__(self): """The company is "false" if the self.data does not contain a name.""" # XXX: check the name and the companyID? return bool(self.data.get('name')) def __contains__(self, item): """Return true if this company and the given Movie are related.""" from .Movie import Movie if isinstance(item, Movie): for m in flatten(self.data, yieldDictKeys=True, scalar=Movie): if item.isSame(m): return True elif isinstance(item, str): return item in self.data return False def isSameName(self, other): """Return true if two company have the same name and/or companyID.""" if not isinstance(other, self.__class__): return False if 'name' in self.data and \ 'name' in other.data and \ build_company_name(self.data) == \ build_company_name(other.data): return True if self.accessSystem == other.accessSystem and \ self.companyID is not None and \ self.companyID == other.companyID: return True return False isSameCompany = isSameName def __deepcopy__(self, memo): """Return a deep copy of a company instance.""" c = Company(name='', companyID=self.companyID, myName=self.myName, myID=self.myID, data=deepcopy(self.data, memo), notes=self.notes, accessSystem=self.accessSystem, titlesRefs=deepcopy(self.titlesRefs, memo), namesRefs=deepcopy(self.namesRefs, memo), charactersRefs=deepcopy(self.charactersRefs, 
memo)) c.current_info = list(self.current_info) c.set_mod_funct(self.modFunct) return c def __repr__(self): """String representation of a Company object.""" return '<Company id:%s[%s] name:_%s_>' % ( self.companyID, self.accessSystem, self.get('long imdb name') ) def __str__(self): """Simply print the short name.""" return self.get('name', '') def summary(self): """Return a string with a pretty-printed summary for the company.""" if not self: return '' s = 'Company\n=======\nName: %s\n' % self.get('name', '') for k in ('distributor', 'production company', 'miscellaneous company', 'special effects company'): d = self.get(k, [])[:5] if not d: continue s += 'Last movies from this company (%s): %s.\n' % ( k, '; '.join([x.get('long imdb title', '') for x in d]) ) return s imdbpy-6.8/imdb/Movie.py000066400000000000000000000326761351454127000152370ustar00rootroot00000000000000# Copyright 2004-2018 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the Movie class, used to store information about a given movie. """ from __future__ import absolute_import, division, print_function, unicode_literals from copy import deepcopy from imdb import linguistics from imdb.utils import _Container from imdb.utils import analyze_title, build_title, canonicalTitle, cmpMovies, flatten class Movie(_Container): """A Movie.
Every information about a movie can be accessed as:: movieObject['information'] to get a list of the kind of information stored in a Movie object, use the keys() method; some useful aliases are defined (as "casting" for the "casting director" key); see the keys_alias dictionary. """ # The default sets of information retrieved. default_info = ('main', 'plot') # Aliases for some not-so-intuitive keys. keys_alias = { 'tv schedule': 'airing', 'user rating': 'rating', 'plot summary': 'plot', 'plot summaries': 'plot', 'directed by': 'director', 'actors': 'cast', 'actresses': 'cast', 'aka': 'akas', 'also known as': 'akas', 'country': 'countries', 'production country': 'countries', 'production countries': 'countries', 'genre': 'genres', 'runtime': 'runtimes', 'lang': 'languages', 'color': 'color info', 'cover': 'cover url', 'full-size cover': 'full-size cover url', 'seasons': 'number of seasons', 'language': 'languages', 'certificate': 'certificates', 'certifications': 'certificates', 'certification': 'certificates', 'episodes number': 'number of episodes', 'faq': 'faqs', 'technical': 'tech', 'frequently asked questions': 'faqs' } keys_tomodify_list = ( 'plot', 'trivia', 'alternate versions', 'goofs', 'quotes', 'dvd', 'laserdisc', 'news', 'soundtrack', 'crazy credits', 'business', 'supplements', 'video review', 'faqs' ) _image_key = 'cover url' cmpFunct = cmpMovies def _init(self, **kwds): """Initialize a Movie object. *movieID* -- the unique identifier for the movie. *title* -- the title of the Movie, if not in the data dictionary. *myTitle* -- your personal title for the movie. *myID* -- your personal identifier for the movie. *data* -- a dictionary used to initialize the object. *currentRole* -- a Character instance representing the current role or duty of a person in this movie, or a Person object representing the actor/actress who played a given character in a Movie. If a string is passed, an object is automatically built.
*roleID* -- if available, the characterID/personID of the currentRole object. *roleIsPerson* -- when False (default) the currentRole is assumed to be a Character object, otherwise a Person. *notes* -- notes for the person referred in the currentRole attribute; e.g.: '(voice)'. *accessSystem* -- a string representing the data access system used. *titlesRefs* -- a dictionary with references to movies. *namesRefs* -- a dictionary with references to persons. *charactersRefs* -- a dictionary with references to characters. *modFunct* -- function called returning text fields. """ title = kwds.get('title') if title and 'title' not in self.data: self.set_title(title) self.movieID = kwds.get('movieID', None) self.myTitle = kwds.get('myTitle', '') def _reset(self): """Reset the Movie object.""" self.movieID = None self.myTitle = '' def set_title(self, title): """Set the title of the movie.""" d_title = analyze_title(title) self.data.update(d_title) def _additional_keys(self): """Valid keys to append to the data.keys() list.""" addkeys = [] if 'title' in self.data: addkeys += ['canonical title', 'long imdb title', 'long imdb canonical title', 'smart canonical title', 'smart long imdb canonical title'] if 'episode of' in self.data: addkeys += ['long imdb episode title', 'series title', 'canonical series title', 'episode title', 'canonical episode title', 'smart canonical series title', 'smart canonical episode title'] if 'cover url' in self.data: addkeys += ['full-size cover url'] return addkeys def guessLanguage(self): """Guess the language of the title of this movie; returns None if there are no hints.""" lang = self.get('languages') if lang: lang = lang[0] else: country = self.get('countries') if country: lang = linguistics.COUNTRY_LANG.get(country[0]) return lang def smartCanonicalTitle(self, title=None, lang=None): """Return the canonical title, guessing its language. 
The title can be forced with the 'title' argument (internally used) and the language can be forced with the 'lang' argument, otherwise it's auto-detected.""" if title is None: title = self.data.get('title', '') if lang is None: lang = self.guessLanguage() return canonicalTitle(title, lang=lang) def _getSeriesTitle(self, obj): """Get the title from a Movie object or return the string itself.""" if isinstance(obj, Movie): return obj.get('title', '') return obj def _getitem(self, key): """Handle special keys.""" if 'episode of' in self.data: if key == 'long imdb episode title': return build_title(self.data) elif key == 'series title': return self._getSeriesTitle(self.data['episode of']) elif key == 'canonical series title': ser_title = self._getSeriesTitle(self.data['episode of']) return canonicalTitle(ser_title) elif key == 'smart canonical series title': ser_title = self._getSeriesTitle(self.data['episode of']) return self.smartCanonicalTitle(ser_title) elif key == 'episode title': return self.data.get('title', '') elif key == 'canonical episode title': return canonicalTitle(self.data.get('title', '')) elif key == 'smart canonical episode title': return self.smartCanonicalTitle(self.data.get('title', '')) if 'title' in self.data: if key == 'title': return self.data['title'] elif key == 'long imdb title': return build_title(self.data) elif key == 'canonical title': return canonicalTitle(self.data['title']) elif key == 'smart canonical title': return self.smartCanonicalTitle(self.data['title']) elif key == 'long imdb canonical title': return build_title(self.data, canonical=True) elif key == 'smart long imdb canonical title': return build_title(self.data, canonical=True, lang=self.guessLanguage()) if key == 'full-size cover url': return self.get_fullsizeURL() return None def getID(self): """Return the movieID.""" return self.movieID def __bool__(self): """The Movie is "false" if the self.data does not contain a title.""" # XXX: check the title and the movieID?
return 'title' in self.data def isSameTitle(self, other): """Return true if this and the compared object have the same long imdb title and/or movieID. """ # XXX: obsolete? if not isinstance(other, self.__class__): return False if 'title' in self.data and 'title' in other.data and \ build_title(self.data, canonical=False) == build_title(other.data, canonical=False): return True if self.accessSystem == other.accessSystem and \ self.movieID is not None and self.movieID == other.movieID: return True return False isSameMovie = isSameTitle # XXX: just for backward compatibility. def __contains__(self, item): """Return true if the given Person object is listed in this Movie, or if the given Character is represented in this Movie.""" from .Person import Person from .Character import Character from .Company import Company if isinstance(item, Person): for p in flatten(self.data, yieldDictKeys=True, scalar=Person, toDescend=(list, dict, tuple, Movie)): if item.isSame(p): return True elif isinstance(item, Character): for p in flatten(self.data, yieldDictKeys=True, scalar=Person, toDescend=(list, dict, tuple, Movie)): if item.isSame(p.currentRole): return True elif isinstance(item, Company): for c in flatten(self.data, yieldDictKeys=True, scalar=Company, toDescend=(list, dict, tuple, Movie)): if item.isSame(c): return True elif isinstance(item, str): return item in self.data return False def __deepcopy__(self, memo): """Return a deep copy of a Movie instance.""" m = Movie(title='', movieID=self.movieID, myTitle=self.myTitle, myID=self.myID, data=deepcopy(self.data, memo), currentRole=deepcopy(self.currentRole, memo), roleIsPerson=self._roleIsPerson, notes=self.notes, accessSystem=self.accessSystem, titlesRefs=deepcopy(self.titlesRefs, memo), namesRefs=deepcopy(self.namesRefs, memo), charactersRefs=deepcopy(self.charactersRefs, memo)) m.current_info = list(self.current_info) m.set_mod_funct(self.modFunct) return m def __repr__(self): """String representation of a Movie
object.""" # XXX: add also currentRole and notes, if present? if 'long imdb episode title' in self: title = self.get('long imdb episode title') else: title = self.get('long imdb title') return '<Movie id:%s[%s] title:_%s_>' % (self.movieID, self.accessSystem, title) def __str__(self): """Simply print the short title.""" return self.get('title', '') def summary(self): """Return a string with a pretty-printed summary for the movie.""" if not self: return '' def _nameAndRole(personList, joiner=', '): """Build a pretty string with name and role.""" nl = [] for person in personList: n = person.get('name', '') if person.currentRole: n += ' (%s)' % person.currentRole nl.append(n) return joiner.join(nl) s = 'Movie\n=====\nTitle: %s\n' % self.get('long imdb canonical title', '') genres = self.get('genres') if genres: s += 'Genres: %s.\n' % ', '.join(genres) director = self.get('director') if director: s += 'Director: %s.\n' % _nameAndRole(director) writer = self.get('writer') if writer: s += 'Writer: %s.\n' % _nameAndRole(writer) cast = self.get('cast') if cast: cast = cast[:5] s += 'Cast: %s.\n' % _nameAndRole(cast) runtime = self.get('runtimes') if runtime: s += 'Runtime: %s.\n' % ', '.join(runtime) countries = self.get('countries') if countries: s += 'Country: %s.\n' % ', '.join(countries) lang = self.get('languages') if lang: s += 'Language: %s.\n' % ', '.join(lang) rating = self.get('rating') if rating: s += 'Rating: %s' % rating nr_votes = self.get('votes') if nr_votes: s += ' (%s votes)' % nr_votes s += '.\n' plot = self.get('plot') if not plot: plot = self.get('plot summary') if plot: plot = [plot] if plot: plot = plot[0] i = plot.find('::') if i != -1: plot = plot[:i] s += 'Plot: %s' % plot return s imdbpy-6.8/imdb/Person.py000066400000000000000000000250421351454127000154130ustar00rootroot00000000000000# Copyright 2004-2019 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free
Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the Person class, used to store information about a given person. """ from __future__ import absolute_import, division, print_function, unicode_literals from copy import deepcopy from imdb.utils import _Container, analyze_name, build_name, cmpPeople, flatten, normalizeName, canonicalName class Person(_Container): """A Person. Every information about a person can be accessed as:: personObject['information'] to get a list of the kind of information stored in a Person object, use the keys() method; some useful aliases are defined (as "biography" for the "mini biography" key); see the keys_alias dictionary. """ # The default sets of information retrieved. default_info = ('main', 'filmography', 'biography') # Aliases for some not-so-intuitive keys. 
keys_alias = { 'biography': 'mini biography', 'bio': 'mini biography', 'aka': 'akas', 'also known as': 'akas', 'nick name': 'nick names', 'nicks': 'nick names', 'nickname': 'nick names', 'nicknames': 'nick names', 'miscellaneouscrew': 'miscellaneous crew', 'crewmembers': 'miscellaneous crew', 'misc': 'miscellaneous crew', 'guest': 'notable tv guest appearances', 'guests': 'notable tv guest appearances', 'tv guest': 'notable tv guest appearances', 'guest appearances': 'notable tv guest appearances', 'spouses': 'spouse', 'salary': 'salary history', 'salaries': 'salary history', 'otherworks': 'other works', "maltin's biography": "biography from leonard maltin's movie encyclopedia", "leonard maltin's biography": "biography from leonard maltin's movie encyclopedia", 'real name': 'birth name', 'where are they now': 'where now', 'personal quotes': 'quotes', 'mini-biography author': 'imdb mini-biography by', 'biography author': 'imdb mini-biography by', 'genre': 'genres', 'portrayed': 'portrayed in', 'keys': 'keywords', 'trademarks': 'trade mark', 'trade mark': 'trade mark', 'trade marks': 'trade mark', 'trademark': 'trade mark', 'pictorials': 'pictorial', 'magazine covers': 'magazine cover photo', 'magazine-covers': 'magazine cover photo', 'tv series episodes': 'episodes', 'tv-series episodes': 'episodes', 'articles': 'article', 'keyword': 'keywords' } # 'nick names'??? keys_tomodify_list = ( 'mini biography', 'spouse', 'quotes', 'other works', 'salary history', 'trivia', 'trade mark', 'news', 'books', 'biographical movies', 'portrayed in', 'where now', 'interviews', 'article', "biography from leonard maltin's movie encyclopedia" ) _image_key = 'headshot' cmpFunct = cmpPeople def _init(self, **kwds): """Initialize a Person object. *personID* -- the unique identifier for the person. *name* -- the name of the Person, if not in the data dictionary. *myName* -- the nickname you use for this person. *myID* -- your personal id for this person. 
*data* -- a dictionary used to initialize the object. *currentRole* -- a Character instance representing the current role or duty of a person in this movie, or a Person object representing the actor/actress who played a given character in a Movie. If a string is passed, an object is automatically built. *roleID* -- if available, the characterID/personID of the currentRole object. *roleIsPerson* -- when False (default) the currentRole is assumed to be a Character object, otherwise a Person. *notes* -- notes about the given person for a specific movie or role (e.g.: the alias used in the movie credits). *accessSystem* -- a string representing the data access system used. *titlesRefs* -- a dictionary with references to movies. *namesRefs* -- a dictionary with references to persons. *modFunct* -- function called returning text fields. *billingPos* -- position of this person in the credits list. """ name = kwds.get('name') if name and 'name' not in self.data: self.set_name(name) self.personID = kwds.get('personID', None) self.myName = kwds.get('myName', '') self.billingPos = kwds.get('billingPos', None) def _reset(self): """Reset the Person object.""" self.personID = None self.myName = '' self.billingPos = None def _clear(self): """Reset the dictionary.""" self.billingPos = None def set_name(self, name): """Set the name of the person.""" d = analyze_name(name, canonical=False) self.data.update(d) def _additional_keys(self): """Valid keys to append to the data.keys() list.""" addkeys = [] if 'name' in self.data: addkeys += ['canonical name', 'long imdb name', 'long imdb canonical name'] if 'headshot' in self.data: addkeys += ['full-size headshot'] return addkeys def _getitem(self, key): """Handle special keys.""" if 'name' in self.data: if key == 'name': return normalizeName(self.data['name']) elif key == 'canonical name': return canonicalName(self.data['name']) elif key == 'long imdb name': return build_name(self.data, canonical=False) elif key == 'long imdb canonical
name': return build_name(self.data, canonical=True) if key == 'full-size headshot': return self.get_fullsizeURL() return None def getID(self): """Return the personID.""" return self.personID def __bool__(self): """The Person is "false" if the self.data does not contain a name.""" # XXX: check the name and the personID? return 'name' in self.data def __contains__(self, item): """Return true if this Person has worked in the given Movie, or if the given Character was played by this Person.""" from .Movie import Movie from .Character import Character if isinstance(item, Movie): for m in flatten(self.data, yieldDictKeys=True, scalar=Movie): if item.isSame(m): return True elif isinstance(item, Character): for m in flatten(self.data, yieldDictKeys=True, scalar=Movie): if item.isSame(m.currentRole): return True elif isinstance(item, str): return item in self.data return False def isSameName(self, other): """Return true if two persons have the same name and imdbIndex and/or personID. """ if not isinstance(other, self.__class__): return False if 'name' in self.data and \ 'name' in other.data and \ build_name(self.data, canonical=True) == \ build_name(other.data, canonical=True): return True if self.accessSystem == other.accessSystem and \ self.personID and self.personID == other.personID: return True return False isSamePerson = isSameName # XXX: just for backward compatibility.
def __deepcopy__(self, memo): """Return a deep copy of a Person instance.""" p = Person(name='', personID=self.personID, myName=self.myName, myID=self.myID, data=deepcopy(self.data, memo), currentRole=deepcopy(self.currentRole, memo), roleIsPerson=self._roleIsPerson, notes=self.notes, accessSystem=self.accessSystem, titlesRefs=deepcopy(self.titlesRefs, memo), namesRefs=deepcopy(self.namesRefs, memo), charactersRefs=deepcopy(self.charactersRefs, memo)) p.current_info = list(self.current_info) p.set_mod_funct(self.modFunct) p.billingPos = self.billingPos return p def __repr__(self): """String representation of a Person object.""" # XXX: add also currentRole and notes, if present? return '<Person id:%s[%s] name:_%s_>' % ( self.personID, self.accessSystem, self.get('long imdb name') ) def __str__(self): """Simply print the short name.""" return self.get('name', '') def summary(self): """Return a string with a pretty-printed summary for the person.""" if not self: return '' s = 'Person\n=====\nName: %s\n' % self.get('long imdb canonical name', '') bdate = self.get('birth date') if bdate: s += 'Birth date: %s' % bdate bnotes = self.get('birth notes') if bnotes: s += ' (%s)' % bnotes s += '.\n' ddate = self.get('death date') if ddate: s += 'Death date: %s' % ddate dnotes = self.get('death notes') if dnotes: s += ' (%s)' % dnotes s += '.\n' bio = self.get('mini biography') if bio: s += 'Biography: %s\n' % bio[0] director = self.get('director') if director: d_list = [x.get('long imdb canonical title', '') for x in director[:3]] s += 'Last movies directed: %s.\n' % '; '.join(d_list) act = self.get('actor') or self.get('actress') if act: a_list = [x.get('long imdb canonical title', '') for x in act[:5]] s += 'Last movies acted: %s.\n' % '; '.join(a_list) return s imdbpy-6.8/imdb/__init__.py000066400000000000000000001161711351454127000157100ustar00rootroot00000000000000# Copyright 2004-2019 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of
the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This package can be used to retrieve information about a movie or a person from the IMDb database. It can fetch data through different media such as the IMDb web pages, or a SQL database. """ from __future__ import absolute_import, division, print_function, unicode_literals __all__ = ['IMDb', 'IMDbError', 'Movie', 'Person', 'Character', 'Company', 'available_access_systems'] __version__ = VERSION = '6.8' import logging import os import sys from pkgutil import find_loader from types import MethodType, FunctionType import imdb._logging from imdb._exceptions import IMDbDataAccessError, IMDbError from imdb import Character, Company, Movie, Person from imdb.utils import build_company_name, build_name, build_title PY2 = sys.hexversion < 0x3000000 if PY2: import ConfigParser as configparser else: import configparser _imdb_logger = logging.getLogger('imdbpy') _aux_logger = logging.getLogger('imdbpy.aux') # URLs of the main pages for movies, persons, characters and queries. imdbURL_base = 'https://www.imdb.com/' # NOTE: the urls below will be removed in a future version. # please use the values in the 'urls' attribute # of the IMDbBase subclass instance. 
# http://www.imdb.com/title/ imdbURL_movie_base = '%stitle/' % imdbURL_base # http://www.imdb.com/title/tt%s/ imdbURL_movie_main = imdbURL_movie_base + 'tt%s/' # http://www.imdb.com/name/ imdbURL_person_base = '%sname/' % imdbURL_base # http://www.imdb.com/name/nm%s/ imdbURL_person_main = imdbURL_person_base + 'nm%s/' # http://www.imdb.com/character/ imdbURL_character_base = '%scharacter/' % imdbURL_base # http://www.imdb.com/character/ch%s/ imdbURL_character_main = imdbURL_character_base + 'ch%s/' # http://www.imdb.com/company/ imdbURL_company_base = '%scompany/' % imdbURL_base # http://www.imdb.com/company/co%s/ imdbURL_company_main = imdbURL_company_base + 'co%s/' # http://www.imdb.com/keyword/%s/ imdbURL_keyword_main = imdbURL_base + 'keyword/%s/' # http://www.imdb.com/chart/top imdbURL_top250 = imdbURL_base + 'chart/top' # http://www.imdb.com/chart/bottom imdbURL_bottom100 = imdbURL_base + 'chart/bottom' # http://www.imdb.com/find?%s imdbURL_find = imdbURL_base + 'find?%s' # Name of the configuration file. confFileName = 'imdbpy.cfg' class ConfigParserWithCase(configparser.ConfigParser): """A case-sensitive parser for configuration files.""" def __init__(self, defaults=None, confFile=None, *args, **kwds): """Initialize the parser. *defaults* -- defaults values. *confFile* -- the file (or list of files) to parse.""" if PY2: configparser.ConfigParser.__init__(self, defaults=defaults) else: super(configparser.ConfigParser, self).__init__(defaults=defaults) if confFile is None: dotFileName = '.' + confFileName # Current and home directory. 
confFile = [os.path.join(os.getcwd(), confFileName), os.path.join(os.getcwd(), dotFileName), os.path.join(os.path.expanduser('~'), confFileName), os.path.join(os.path.expanduser('~'), dotFileName)] if os.name == 'posix': sep = getattr(os.path, 'sep', '/') # /etc/ and /etc/conf.d/ confFile.append(os.path.join(sep, 'etc', confFileName)) confFile.append(os.path.join(sep, 'etc', 'conf.d', confFileName)) else: # etc subdirectory of sys.prefix, for non-unix systems. confFile.append(os.path.join(sys.prefix, 'etc', confFileName)) for fname in confFile: try: self.read(fname) except (configparser.MissingSectionHeaderError, configparser.ParsingError) as e: _aux_logger.warning('Troubles reading config file: %s' % e) # Stop at the first valid file. if self.has_section('imdbpy'): break def optionxform(self, optionstr): """Option names are case sensitive.""" return optionstr def _manageValue(self, value): """Custom substitutions for values.""" if not isinstance(value, str): return value vlower = value.lower() if vlower in ('1', 'on', 'false', '0', 'off', 'yes', 'no', 'true'): return self._convert_to_boolean(vlower) elif vlower == 'none': return None return value def get(self, section, option, *args, **kwds): """Return the value of an option from a given section.""" value = configparser.ConfigParser.get(self, section, option, *args, **kwds) return self._manageValue(value) def items(self, section, *args, **kwds): """Return a list of (key, value) tuples of items of the given section.""" if section != 'DEFAULT' and not self.has_section(section): return [] keys = configparser.ConfigParser.options(self, section) return [(k, self.get(section, k, *args, **kwds)) for k in keys] def getDict(self, section): """Return a dictionary of items of the specified section.""" return dict(self.items(section)) def IMDb(accessSystem=None, *arguments, **keywords): """Return an instance of the appropriate class.
The accessSystem parameter is used to specify the kind of the preferred access system.""" if accessSystem is None or accessSystem in ('auto', 'config'): try: cfg_file = ConfigParserWithCase(*arguments, **keywords) # Parameters set by the code take precedence. kwds = cfg_file.getDict('imdbpy') if 'accessSystem' in kwds: accessSystem = kwds['accessSystem'] del kwds['accessSystem'] else: accessSystem = 'http' kwds.update(keywords) keywords = kwds except Exception as e: _imdb_logger.warning('Unable to read configuration file; complete error: %s' % e) # It just LOOKS LIKE a bad habit: we tried to read config # options from some files, but something has gone horribly # wrong: ignore everything and pretend we were called with # the 'http' accessSystem. accessSystem = 'http' if 'loggingLevel' in keywords: imdb._logging.setLevel(keywords['loggingLevel']) del keywords['loggingLevel'] if 'loggingConfig' in keywords: logCfg = keywords['loggingConfig'] del keywords['loggingConfig'] try: import logging.config logging.config.fileConfig(os.path.expanduser(logCfg)) except Exception as e: _imdb_logger.warning('unable to read logger config: %s' % e) if accessSystem in ('http', 'https', 'web', 'html'): from .parser.http import IMDbHTTPAccessSystem return IMDbHTTPAccessSystem(*arguments, **keywords) if accessSystem in ('s3', 's3dataset', 'imdbws'): from .parser.s3 import IMDbS3AccessSystem return IMDbS3AccessSystem(*arguments, **keywords) elif accessSystem in ('sql', 'db', 'database'): try: from .parser.sql import IMDbSqlAccessSystem except ImportError: raise IMDbError('the sql access system is not installed') return IMDbSqlAccessSystem(*arguments, **keywords) else: raise IMDbError('unknown kind of data access system: "%s"' % accessSystem) def available_access_systems(): """Return the list of available data access systems.""" asList = [] if find_loader('imdb.parser.http') is not None: asList.append('http') if find_loader('imdb.parser.sql') is not None: asList.append('sql') return asList #
XXX: I'm not sure this is a good guess. # I suppose that an argument of the IMDb function can be used to # set a default encoding for the output, and then Movie, Person and # Character objects can use this default encoding, returning strings. # Anyway, passing unicode strings to search_movie(), search_person() # and search_character() methods is always safer. encoding = getattr(sys.stdin, 'encoding', '') or sys.getdefaultencoding() class IMDbBase: """The base class used to search for a movie/person/character and to get a Movie/Person/Character object. This class cannot directly fetch data of any kind, so you have to look for the "real" code in a subclass.""" # The name of the preferred access system (MUST be overridden # in the subclasses). accessSystem = 'UNKNOWN' # Whether to re-raise caught exceptions or not. _reraise_exceptions = False def __init__(self, defaultModFunct=None, results=20, keywordsResults=100, *arguments, **keywords): """Initialize the access system. If specified, defaultModFunct is the function used by default by the Person, Movie and Character objects, when accessing their text fields. """ # The function used to output the strings that need modification (the # ones containing references to movie titles and person names). self._defModFunct = defaultModFunct # Number of results to get.
try: results = int(results) except (TypeError, ValueError): results = 20 if results < 1: results = 20 self._results = results try: keywordsResults = int(keywordsResults) except (TypeError, ValueError): keywordsResults = 100 if keywordsResults < 1: keywordsResults = 100 self._keywordsResults = keywordsResults self._reraise_exceptions = keywords.get('reraiseExceptions') or False self.set_imdb_urls(keywords.get('imdbURL_base') or imdbURL_base) def set_imdb_urls(self, imdbURL_base): """Set the urls used accessing the IMDb site.""" imdbURL_base = imdbURL_base.strip().strip('"\'') if not imdbURL_base.startswith(('https://', 'http://')): imdbURL_base = 'https://%s' % imdbURL_base if not imdbURL_base.endswith('/'): imdbURL_base = '%s/' % imdbURL_base # http://www.imdb.com/title/ imdbURL_movie_base = '%stitle/' % imdbURL_base # http://www.imdb.com/title/tt%s/ imdbURL_movie_main = imdbURL_movie_base + 'tt%s/' # http://www.imdb.com/name/ imdbURL_person_base = '%sname/' % imdbURL_base # http://www.imdb.com/name/nm%s/ imdbURL_person_main = imdbURL_person_base + 'nm%s/' # http://www.imdb.com/character/ imdbURL_character_base = '%scharacter/' % imdbURL_base # http://www.imdb.com/character/ch%s/ imdbURL_character_main = imdbURL_character_base + 'ch%s/' # http://www.imdb.com/company/ imdbURL_company_base = '%scompany/' % imdbURL_base # http://www.imdb.com/company/co%s/ imdbURL_company_main = imdbURL_company_base + 'co%s/' # http://www.imdb.com/keyword/%s/ imdbURL_keyword_main = imdbURL_base + 'keyword/%s/' # http://www.imdb.com/chart/top imdbURL_top250 = imdbURL_base + 'chart/top' # http://www.imdb.com/chart/bottom imdbURL_bottom100 = imdbURL_base + 'chart/bottom' # http://www.imdb.com/find?%s imdbURL_find = imdbURL_base + 'find?%s' # http://www.imdb.com/search/title?%s imdbURL_search_movie_advanced = imdbURL_base + 'search/title/?%s' self.urls = dict( movie_base=imdbURL_movie_base, movie_main=imdbURL_movie_main, person_base=imdbURL_person_base, person_main=imdbURL_person_main, 
character_base=imdbURL_character_base, character_main=imdbURL_character_main, company_base=imdbURL_company_base, company_main=imdbURL_company_main, keyword_main=imdbURL_keyword_main, top250=imdbURL_top250, bottom100=imdbURL_bottom100, find=imdbURL_find, search_movie_advanced=imdbURL_search_movie_advanced) def _normalize_movieID(self, movieID): """Normalize the given movieID.""" # By default, do nothing. return movieID def _normalize_personID(self, personID): """Normalize the given personID.""" # By default, do nothing. return personID def _normalize_characterID(self, characterID): """Normalize the given characterID.""" # By default, do nothing. return characterID def _normalize_companyID(self, companyID): """Normalize the given companyID.""" # By default, do nothing. return companyID def _get_real_movieID(self, movieID): """Handle title aliases.""" # By default, do nothing. return movieID def _get_real_personID(self, personID): """Handle name aliases.""" # By default, do nothing. return personID def _get_real_characterID(self, characterID): """Handle character name aliases.""" # By default, do nothing. return characterID def _get_real_companyID(self, companyID): """Handle company name aliases.""" # By default, do nothing. 
return companyID def _get_infoset(self, prefname): """Return methods with the name starting with prefname.""" infoset = [] excludes = ('%sinfoset' % prefname,) preflen = len(prefname) for name in dir(self.__class__): if name.startswith(prefname) and name not in excludes: member = getattr(self.__class__, name) if isinstance(member, (MethodType, FunctionType)): infoset.append(name[preflen:].replace('_', ' ')) return infoset def get_movie_infoset(self): """Return the list of info set available for movies.""" return self._get_infoset('get_movie_') def get_person_infoset(self): """Return the list of info set available for persons.""" return self._get_infoset('get_person_') def get_character_infoset(self): """Return the list of info set available for characters.""" return self._get_infoset('get_character_') def get_company_infoset(self): """Return the list of info set available for companies.""" return self._get_infoset('get_company_') def get_movie(self, movieID, info=Movie.Movie.default_info, modFunct=None): """Return a Movie object for the given movieID. The movieID is something used to univocally identify a movie; it can be the imdbID used by the IMDb web server, a file pointer, a line number in a file, an ID in a database, etc. info is the list of sets of information to retrieve. If specified, modFunct will be the function used by the Movie object when accessing its text fields (like 'plot').""" movieID = self._normalize_movieID(movieID) movieID = self._get_real_movieID(movieID) movie = Movie.Movie(movieID=movieID, accessSystem=self.accessSystem) modFunct = modFunct or self._defModFunct if modFunct is not None: movie.set_mod_funct(modFunct) self.update(movie, info) return movie get_episode = get_movie def _search_movie(self, title, results): """Return a list of tuples (movieID, {movieData})""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. 
raise NotImplementedError('override this method') def search_movie(self, title, results=None, _episodes=False): """Return a list of Movie objects for a query for the given title. The results argument is the maximum number of results to return.""" if results is None: results = self._results try: results = int(results) except (ValueError, OverflowError): results = 20 if not _episodes: res = self._search_movie(title, results) else: res = self._search_episode(title, results) return [Movie.Movie(movieID=self._get_real_movieID(mi), data=md, modFunct=self._defModFunct, accessSystem=self.accessSystem) for mi, md in res][:results] def _search_movie_advanced(self, title=None, adult=None, results=None, sort=None, sort_dir=None): """Return a list of tuples (movieID, {movieData})""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. raise NotImplementedError('override this method') def search_movie_advanced(self, title=None, adult=None, results=None, sort=None, sort_dir=None): """Return a list of Movie objects for a query for the given title. The results argument is the maximum number of results to return.""" if results is None: results = self._results try: results = int(results) except (ValueError, OverflowError): results = 20 res = self._search_movie_advanced(title=title, adult=adult, results=results, sort=sort, sort_dir=sort_dir) return [Movie.Movie(movieID=self._get_real_movieID(mi), data=md, modFunct=self._defModFunct, accessSystem=self.accessSystem) for mi, md in res][:results] def _search_episode(self, title, results): """Return a list of tuples (movieID, {movieData})""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. raise NotImplementedError('override this method') def search_episode(self, title, results=None): """Return a list of Movie objects for a query for the given title. 
The results argument is the maximum number of results to return; this method searches only for titles of tv (mini) series' episodes.""" return self.search_movie(title, results=results, _episodes=True) def get_person(self, personID, info=Person.Person.default_info, modFunct=None): """Return a Person object for the given personID. The personID is something used to univocally identify a person; it can be the imdbID used by the IMDb web server, a file pointer, a line number in a file, an ID in a database, etc. info is the list of sets of information to retrieve. If specified, modFunct will be the function used by the Person object when accessing its text fields (like 'mini biography').""" personID = self._normalize_personID(personID) personID = self._get_real_personID(personID) person = Person.Person(personID=personID, accessSystem=self.accessSystem) modFunct = modFunct or self._defModFunct if modFunct is not None: person.set_mod_funct(modFunct) self.update(person, info) return person def _search_person(self, name, results): """Return a list of tuples (personID, {personData})""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. raise NotImplementedError('override this method') def search_person(self, name, results=None): """Return a list of Person objects for a query for the given name. The results argument is the maximum number of results to return.""" if results is None: results = self._results try: results = int(results) except (ValueError, OverflowError): results = 20 res = self._search_person(name, results) return [Person.Person(personID=self._get_real_personID(pi), data=pd, modFunct=self._defModFunct, accessSystem=self.accessSystem) for pi, pd in res][:results] def get_character(self, characterID, info=Character.Character.default_info, modFunct=None): """Return a Character object for the given characterID. 
The characterID is something used to univocally identify a character; it can be the imdbID used by the IMDb web server, a file pointer, a line number in a file, an ID in a database, etc. info is the list of sets of information to retrieve. If specified, modFunct will be the function used by the Character object when accessing its text fields (like 'biography').""" characterID = self._normalize_characterID(characterID) characterID = self._get_real_characterID(characterID) character = Character.Character(characterID=characterID, accessSystem=self.accessSystem) modFunct = modFunct or self._defModFunct if modFunct is not None: character.set_mod_funct(modFunct) self.update(character, info) return character def _search_character(self, name, results): """Return a list of tuples (characterID, {characterData})""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. raise NotImplementedError('override this method') def search_character(self, name, results=None): """Return a list of Character objects for a query for the given name. The results argument is the maximum number of results to return.""" if results is None: results = self._results try: results = int(results) except (ValueError, OverflowError): results = 20 res = self._search_character(name, results) return [Character.Character(characterID=self._get_real_characterID(pi), data=pd, modFunct=self._defModFunct, accessSystem=self.accessSystem) for pi, pd in res][:results] def get_company(self, companyID, info=Company.Company.default_info, modFunct=None): """Return a Company object for the given companyID. The companyID is something used to univocally identify a company; it can be the imdbID used by the IMDb web server, a file pointer, a line number in a file, an ID in a database, etc. info is the list of sets of information to retrieve. 
If specified, modFunct will be the function used by the Company object when accessing its text fields (none, so far).""" companyID = self._normalize_companyID(companyID) companyID = self._get_real_companyID(companyID) company = Company.Company(companyID=companyID, accessSystem=self.accessSystem) modFunct = modFunct or self._defModFunct if modFunct is not None: company.set_mod_funct(modFunct) self.update(company, info) return company def _search_company(self, name, results): """Return a list of tuples (companyID, {companyData})""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. raise NotImplementedError('override this method') def search_company(self, name, results=None): """Return a list of Company objects for a query for the given name. The results argument is the maximum number of results to return.""" if results is None: results = self._results try: results = int(results) except (ValueError, OverflowError): results = 20 res = self._search_company(name, results) return [Company.Company(companyID=self._get_real_companyID(pi), data=pd, modFunct=self._defModFunct, accessSystem=self.accessSystem) for pi, pd in res][:results] def _search_keyword(self, keyword, results): """Return a list of 'keyword' strings.""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. raise NotImplementedError('override this method') def search_keyword(self, keyword, results=None): """Search for existing keywords, similar to the given one.""" if results is None: results = self._keywordsResults try: results = int(results) except (ValueError, OverflowError): results = 100 return self._search_keyword(keyword, results) def _get_keyword(self, keyword, results): """Return a list of tuples (movieID, {movieData})""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. 
raise NotImplementedError('override this method') def get_keyword(self, keyword, results=None): """Return a list of movies for the given keyword.""" if results is None: results = self._keywordsResults try: results = int(results) except (ValueError, OverflowError): results = 100 res = self._get_keyword(keyword, results) return [Movie.Movie(movieID=self._get_real_movieID(mi), data=md, modFunct=self._defModFunct, accessSystem=self.accessSystem) for mi, md in res][:results] def _get_top_bottom_movies(self, kind): """Return the list of the top 250 or bottom 100 movies.""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. # This method must return a list of (movieID, {movieDict}) # tuples. The kind parameter can be 'top' or 'bottom'. raise NotImplementedError('override this method') def get_top250_movies(self): """Return the list of the top 250 movies.""" res = self._get_top_bottom_movies('top') return [Movie.Movie(movieID=self._get_real_movieID(mi), data=md, modFunct=self._defModFunct, accessSystem=self.accessSystem) for mi, md in res] def get_bottom100_movies(self): """Return the list of the bottom 100 movies.""" res = self._get_top_bottom_movies('bottom') return [Movie.Movie(movieID=self._get_real_movieID(mi), data=md, modFunct=self._defModFunct, accessSystem=self.accessSystem) for mi, md in res] def new_movie(self, *arguments, **keywords): """Return a Movie object.""" # XXX: not really useful... return Movie.Movie(accessSystem=self.accessSystem, *arguments, **keywords) def new_person(self, *arguments, **keywords): """Return a Person object.""" # XXX: not really useful... return Person.Person(accessSystem=self.accessSystem, *arguments, **keywords) def new_character(self, *arguments, **keywords): """Return a Character object.""" # XXX: not really useful... 
return Character.Character(accessSystem=self.accessSystem, *arguments, **keywords) def new_company(self, *arguments, **keywords): """Return a Company object.""" # XXX: not really useful... return Company.Company(accessSystem=self.accessSystem, *arguments, **keywords) def update(self, mop, info=None, override=0): """Given a Movie, Person, Character or Company object with only partial information, retrieve the required set of information. info is the list of sets of information to retrieve. If override is set, the information are retrieved and updated even if they're already in the object.""" # XXX: should this be a method of the Movie/Person/Character/Company # classes? NO! What for instances created by external functions? mopID = None prefix = '' if isinstance(mop, Movie.Movie): mopID = mop.movieID prefix = 'movie' elif isinstance(mop, Person.Person): mopID = mop.personID prefix = 'person' elif isinstance(mop, Character.Character): mopID = mop.characterID prefix = 'character' elif isinstance(mop, Company.Company): mopID = mop.companyID prefix = 'company' else: raise IMDbError('object ' + repr(mop) + ' is not a Movie, Person, Character or Company instance') if mopID is None: # XXX: enough? It's obvious that there are Characters # objects without characterID, so I think they should # just do nothing, when an i.update(character) is tried. 
if prefix == 'character': return raise IMDbDataAccessError('supplied object has null movieID, personID or companyID') if mop.accessSystem == self.accessSystem: aSystem = self else: aSystem = IMDb(mop.accessSystem) if info is None: info = mop.default_info elif info == 'all': if isinstance(mop, Movie.Movie): info = self.get_movie_infoset() elif isinstance(mop, Person.Person): info = self.get_person_infoset() elif isinstance(mop, Character.Character): info = self.get_character_infoset() else: info = self.get_company_infoset() if not isinstance(info, (tuple, list)): info = (info,) res = {} for i in info: if i in mop.current_info and not override: continue if not i: continue _imdb_logger.debug('retrieving "%s" info set', i) try: method = getattr(aSystem, 'get_%s_%s' % (prefix, i.replace(' ', '_'))) except AttributeError: _imdb_logger.error('unknown information set "%s"', i) # Keeps going. method = lambda *x: {} try: ret = method(mopID) except Exception: _imdb_logger.critical( 'caught an exception retrieving or parsing "%s" info set' ' for mopID "%s" (accessSystem: %s)', i, mopID, mop.accessSystem, exc_info=True ) ret = {} # If requested by the user, reraise the exception. if self._reraise_exceptions: raise keys = None if 'data' in ret: res.update(ret['data']) if isinstance(ret['data'], dict): keys = list(ret['data'].keys()) if 'info sets' in ret: for ri in ret['info sets']: mop.add_to_current_info(ri, keys, mainInfoset=i) else: mop.add_to_current_info(i, keys) if 'titlesRefs' in ret: mop.update_titlesRefs(ret['titlesRefs']) if 'namesRefs' in ret: mop.update_namesRefs(ret['namesRefs']) if 'charactersRefs' in ret: mop.update_charactersRefs(ret['charactersRefs']) mop.set_data(res, override=0) def get_imdbMovieID(self, movieID): """Translate a movieID in an imdbID (the ID used by the IMDb web server); must be overridden by the subclass.""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. 
raise NotImplementedError('override this method') def get_imdbPersonID(self, personID): """Translate a personID into an imdbID (the ID used by the IMDb web server); must be overridden by the subclass.""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. raise NotImplementedError('override this method') def get_imdbCharacterID(self, characterID): """Translate a characterID into an imdbID (the ID used by the IMDb web server); must be overridden by the subclass.""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. raise NotImplementedError('override this method') def get_imdbCompanyID(self, companyID): """Translate a companyID into an imdbID (the ID used by the IMDb web server); must be overridden by the subclass.""" # XXX: for the real implementation, see the method of the # subclass, somewhere under the imdb.parser package. raise NotImplementedError('override this method') def _searchIMDb(self, kind, ton, title_kind=None): """Search the IMDb www server for the given title or name.""" if not ton: return None ton = ton.strip('"') aSystem = IMDb() if kind == 'tt': searchFunct = aSystem.search_movie check = 'long imdb title' elif kind == 'nm': searchFunct = aSystem.search_person check = 'long imdb name' elif kind == 'char': searchFunct = aSystem.search_character check = 'long imdb name' elif kind == 'co': # XXX: are [COUNTRY] codes included in the results? searchFunct = aSystem.search_company check = 'long imdb name' try: searchRes = searchFunct(ton) except IMDbError: return None # When only one result is returned, assume it was from an # exact match. if len(searchRes) == 1: return searchRes[0].getID() title_only_matches = [] for item in searchRes: # Return the first perfect match.
if item[check].strip('"') == ton: # For titles do additional check for kind if kind != 'tt' or title_kind == item['kind']: return item.getID() elif kind == 'tt': title_only_matches.append(item.getID()) # imdbpy2sql.py could have detected the wrong type, so if no result # matches both title and kind, collect all the results that match the # title only. # Return a list of IDs if there are multiple matches (this can happen # when searching titles with no title_kind specified). # Example: DB: Band of Brothers "tv series" vs "tv mini-series" if title_only_matches: if len(title_only_matches) == 1: return title_only_matches[0] else: return title_only_matches return None def title2imdbID(self, title, kind=None): """Translate a movie title (in the plain text data files format) to an imdbID. Try an Exact Primary Title search on IMDb; return None if it's unable to get the imdbID. Always specify kind (movie, tv series, video game, etc.), otherwise the search can return a list of IDs if multiple matches are found. """ return self._searchIMDb('tt', title, kind) def name2imdbID(self, name): """Translate a person name into an imdbID. Try an Exact Primary Name search on IMDb; return None if it's unable to get the imdbID.""" return self._searchIMDb('nm', name) def character2imdbID(self, name): """Translate a character name into an imdbID. Try an Exact Primary Name search on IMDb; return None if it's unable to get the imdbID.""" return self._searchIMDb('char', name) def company2imdbID(self, name): """Translate a company name into an imdbID.
Try an Exact Primary Name search on IMDb; return None if it's unable to get the imdbID.""" return self._searchIMDb('co', name) def get_imdbID(self, mop): """Return the imdbID for the given Movie, Person, Character or Company object.""" imdbID = None if mop.accessSystem == self.accessSystem: aSystem = self else: aSystem = IMDb(mop.accessSystem) if isinstance(mop, Movie.Movie): if mop.movieID is not None: imdbID = aSystem.get_imdbMovieID(mop.movieID) else: imdbID = aSystem.title2imdbID(build_title(mop, canonical=0, ptdf=0, appendKind=False), mop['kind']) elif isinstance(mop, Person.Person): if mop.personID is not None: imdbID = aSystem.get_imdbPersonID(mop.personID) else: imdbID = aSystem.name2imdbID(build_name(mop, canonical=False)) elif isinstance(mop, Character.Character): if mop.characterID is not None: imdbID = aSystem.get_imdbCharacterID(mop.characterID) else: # canonical=0 ? imdbID = aSystem.character2imdbID(build_name(mop, canonical=False)) elif isinstance(mop, Company.Company): if mop.companyID is not None: imdbID = aSystem.get_imdbCompanyID(mop.companyID) else: imdbID = aSystem.company2imdbID(build_company_name(mop)) else: raise IMDbError('object ' + repr(mop) + ' is not a Movie, Person, Character or Company instance') return imdbID def get_imdbURL(self, mop): """Return the main IMDb URL for the given Movie, Person, Character or Company object, or None if unable to get it.""" imdbID = self.get_imdbID(mop) if imdbID is None: return None if isinstance(mop, Movie.Movie): url_firstPart = imdbURL_movie_main elif isinstance(mop, Person.Person): url_firstPart = imdbURL_person_main elif isinstance(mop, Character.Character): url_firstPart = imdbURL_character_main elif isinstance(mop, Company.Company): url_firstPart = imdbURL_company_main else: raise IMDbError('object ' + repr(mop) + ' is not a Movie, Person, Character or Company instance') return url_firstPart % imdbID def get_special_methods(self): """Return the special methods defined by the subclass.""" sm_dict = {}
base_methods = [] for name in dir(IMDbBase): member = getattr(IMDbBase, name) if isinstance(member, (MethodType, FunctionType)): base_methods.append(name) for name in dir(self.__class__): if name.startswith('_') or name in base_methods or \ name.startswith('get_movie_') or \ name.startswith('get_person_') or \ name.startswith('get_company_') or \ name.startswith('get_character_'): continue member = getattr(self.__class__, name) if isinstance(member, (MethodType, FunctionType)): sm_dict.update({name: member.__doc__}) return sm_dict imdbpy-6.8/imdb/_exceptions.py000066400000000000000000000032511351454127000164630ustar00rootroot00000000000000# Copyright 2004-2017 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the exception hierarchy used by the imdb package. """ from __future__ import absolute_import, division, print_function, unicode_literals import logging class IMDbError(Exception): """Base class for every exception raised by the imdb package.""" _logger = logging.getLogger('imdbpy') def __init__(self, *args, **kwargs): """Initialize the exception and pass the message to the log system.""" # Every raised exception also dispatches a critical log.
self._logger.critical('%s exception raised; args: %s; kwds: %s', self.__class__.__name__, args, kwargs, exc_info=True) Exception.__init__(self, *args, **kwargs) class IMDbDataAccessError(IMDbError): """Exception raised when it is not possible to access the needed data.""" pass class IMDbParserError(IMDbError): """Exception raised when an error occurred parsing the data.""" pass imdbpy-6.8/imdb/_logging.py000066400000000000000000000033051351454127000157300ustar00rootroot00000000000000# Copyright 2009-2017 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the logging facilities used by the imdb package.
""" from __future__ import absolute_import, division, print_function, unicode_literals import logging LEVELS = { 'debug': logging.DEBUG, 'info': logging.INFO, 'warn': logging.WARNING, 'warning': logging.WARNING, 'error': logging.ERROR, 'critical': logging.CRITICAL } imdbpyLogger = logging.getLogger('imdbpy') imdbpyStreamHandler = logging.StreamHandler() imdbpyFormatter = logging.Formatter( '%(asctime)s %(levelname)s [%(name)s] %(pathname)s:%(lineno)d: %(message)s' ) imdbpyStreamHandler.setFormatter(imdbpyFormatter) imdbpyLogger.addHandler(imdbpyStreamHandler) def setLevel(level): """Set logging level for the main logger.""" level = level.lower().strip() imdbpyLogger.setLevel(LEVELS.get(level, logging.NOTSET)) imdbpyLogger.log(logging.INFO, 'set logging threshold to "%s"', logging.getLevelName(imdbpyLogger.level)) imdbpy-6.8/imdb/cli.py000066400000000000000000000145241351454127000147170ustar00rootroot00000000000000# Copyright 2017 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the command line interface for IMDbPY. 
""" from __future__ import absolute_import, division, print_function, unicode_literals import sys from argparse import ArgumentParser from imdb import VERSION, IMDb DEFAULT_RESULT_SIZE = 20 def list_results(items, type_, n=None): field = 'title' if type_ == 'movie' else 'name' print(' # IMDb id %s' % field) print('=== ======= %s' % ('=' * len(field),)) for i, item in enumerate(items[:n]): print('%(index)3d %(imdb_id)7s %(title)s' % { 'index': i + 1, 'imdb_id': getattr(item, type_ + 'ID'), 'title': item['long imdb ' + field] }) def search_item(args): connection = IMDb() n = args.n if args.n is not None else DEFAULT_RESULT_SIZE if args.type == 'keyword': items = connection.search_keyword(args.key) if args.first: items = connection.get_keyword(items[0]) list_results(items, type_='movie', n=n) else: print(' # keyword') print('=== =======') for i, keyword in enumerate(items[:n]): print('%(index)3d %(kw)s' % {'index': i + 1, 'kw': keyword}) else: if args.type == 'movie': items = connection.search_movie(args.key) elif args.type == 'person': items = connection.search_person(args.key) elif args.type == 'character': items = connection.search_character(args.key) elif args.type == 'company': items = connection.search_company(args.key) if args.first: connection.update(items[0]) print(items[0].summary()) else: list_results(items, type_=args.type, n=args.n) def get_item(args): connection = IMDb() if args.type == 'keyword': n = args.n if args.n is not None else DEFAULT_RESULT_SIZE items = connection.get_keyword(args.key, results=n) list_results(items, type_='movie') else: if args.type == 'movie': item = connection.get_movie(args.key) elif args.type == 'person': item = connection.get_person(args.key) elif args.type == 'character': item = connection.get_character(args.key) elif args.type == 'company': item = connection.get_company(args.key) print(item.summary()) def list_ranking(items, n=None): print(' # rating votes IMDb id title') print('=== ====== ======= ======= =====') n = n if 
n is not None else DEFAULT_RESULT_SIZE for i, movie in enumerate(items[:n]): print('%(index)3d %(rating)s %(votes)7s %(imdb_id)7s %(title)s' % { 'index': i + 1, 'rating': movie.get('rating'), 'votes': movie.get('votes'), 'imdb_id': movie.movieID, 'title': movie.get('long imdb title') }) def get_top_movies(args): connection = IMDb() items = connection.get_top250_movies() if args.first: connection.update(items[0]) print(items[0].summary()) else: list_ranking(items, n=args.n) def get_bottom_movies(args): connection = IMDb() items = connection.get_bottom100_movies() if args.first: connection.update(items[0]) print(items[0].summary()) else: list_ranking(items, n=args.n) def make_parser(prog): parser = ArgumentParser(prog) parser.add_argument('--version', action='version', version='%(prog)s ' + VERSION) command_parsers = parser.add_subparsers(metavar='command', dest='command') command_parsers.required = True command_search_parser = command_parsers.add_parser('search', help='search for items') command_search_parser.add_argument('type', help='type of item to search for', choices=['movie', 'person', 'character', 'company', 'keyword']) command_search_parser.add_argument('key', help='title or name of item to search for') command_search_parser.add_argument('-n', type=int, help='number of items to list') command_search_parser.add_argument('--first', action='store_true', help='display only the first result') command_search_parser.set_defaults(func=search_item) command_get_parser = command_parsers.add_parser('get', help='retrieve information about an item') command_get_parser.add_argument('type', help='type of item to retrieve', choices=['movie', 'person', 'character', 'company', 'keyword']) command_get_parser.add_argument('key', help='IMDb id (or keyword name) of item to retrieve') command_get_parser.add_argument('-n', type=int, help='number of movies to list (only for keywords)') command_get_parser.set_defaults(func=get_item) command_top_parser = 
command_parsers.add_parser('top', help='get top ranked movies') command_top_parser.add_argument('-n', type=int, help='number of movies to list') command_top_parser.add_argument('--first', action='store_true', help='display only the first result') command_top_parser.set_defaults(func=get_top_movies) command_bottom_parser = command_parsers.add_parser('bottom', help='get bottom ranked movies') command_bottom_parser.add_argument('-n', type=int, help='number of movies to list') command_bottom_parser.add_argument('--first', action='store_true', help='display only the first result') command_bottom_parser.set_defaults(func=get_bottom_movies) return parser def main(argv=None): argv = argv if argv is not None else sys.argv parser = make_parser(prog='imdbpy') arguments = parser.parse_args(argv[1:]) arguments.func(arguments) if __name__ == '__main__': main() imdbpy-6.8/imdb/helpers.py000066400000000000000000000543421351454127000156140ustar00rootroot00000000000000# Copyright 2006-2018 Davide Alberani # 2012 Alberto Malagoli # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides functions not used directly by the imdb package, but useful for IMDbPY-based programs. """ # XXX: Find better names for the functions in this module. 
from __future__ import absolute_import, division, print_function, unicode_literals

import difflib
import gettext
import re
from cgi import escape
from gettext import gettext as _

# The modClearRefs can be used to strip names and titles references from
# the strings in Movie and Person objects.
from imdb import IMDb, imdbURL_character_base, imdbURL_movie_base, imdbURL_person_base
from imdb.Character import Character
from imdb.Company import Company
from imdb.linguistics import COUNTRY_LANG
from imdb.Movie import Movie
from imdb.Person import Person
from imdb.utils import _tagAttr, re_characterRef, re_nameRef, re_titleRef
from imdb.utils import TAGS_TO_MODIFY


gettext.textdomain('imdbpy')


# An URL, more or less.
_re_href = re.compile(r'(http://.+?)(?=\s|$)', re.I)
_re_hrefsub = _re_href.sub


def makeCgiPrintEncoding(encoding):
    """Make a function to pretty-print strings for the web."""
    def cgiPrint(s):
        """Encode the given string using the %s encoding, and replace
        chars outside the given charset with XML char references.""" % encoding
        s = escape(s, quote=1)
        if isinstance(s, str):
            s = s.encode(encoding, 'xmlcharrefreplace')
        return s
    return cgiPrint


# cgiPrint uses the utf8 encoding.
cgiPrint = makeCgiPrintEncoding('utf8')

# Regular expression for %(varname)s substitutions.
re_subst = re.compile(r'%\((.+?)\)s')
# Regular expression for <if condition> ... </if condition> clauses.
# Group 1 is the condition (a key name), group 2 is the enclosed text.
re_conditional = re.compile(r'<if\s+(.+?)\s*>(.+?)</if\s+\1\s*>')


def makeTextNotes(replaceTxtNotes):
    """Create a function useful to handle text[::optional_note] values.
    replaceTxtNotes is a format string, which can include the following
    values: %(text)s and %(notes)s.
    Portions of the text can be conditionally excluded, if one of the
    values is absent. E.g.: <if notes>[%(notes)s]</if notes> will be
    replaced with '[notes]' if notes exists, or by an empty string otherwise.
    The returned function is suitable to be passed as the applyToValues
    argument of the makeObject2Txt function."""
    def _replacer(s):
        outS = replaceTxtNotes
        if not isinstance(s, str):
            return s
        ssplit = s.split('::', 1)
        text = ssplit[0]
        # Used to keep track of text and note existence.
        keysDict = {}
        if text:
            keysDict['text'] = True
        outS = outS.replace('%(text)s', text)
        if len(ssplit) == 2:
            keysDict['notes'] = True
            outS = outS.replace('%(notes)s', ssplit[1])
        else:
            outS = outS.replace('%(notes)s', '')

        def _excludeFalseConditionals(matchobj):
            # Return an empty string if the conditional is false/empty.
            if matchobj.group(1) in keysDict:
                return matchobj.group(2)
            return ''

        while re_conditional.search(outS):
            outS = re_conditional.sub(_excludeFalseConditionals, outS)
        return outS

    return _replacer


def makeObject2Txt(movieTxt=None, personTxt=None, characterTxt=None,
                   companyTxt=None, joiner=' / ',
                   applyToValues=lambda x: x, _recurse=True):
    """Return a function useful to pretty-print Movie, Person,
    Character and Company instances.

    *movieTxt* -- how to format a Movie object.
    *personTxt* -- how to format a Person object.
    *characterTxt* -- how to format a Character object.
    *companyTxt* -- how to format a Company object.
    *joiner* -- string used to join a list of objects.
    *applyToValues* -- function to apply to string values.
    *_recurse* -- if True (default), non-string values are formatted
                  recursively; otherwise they are converted with str().
    """
    # Some useful defaults.
    if movieTxt is None:
        movieTxt = '%(long imdb title)s'
    if personTxt is None:
        personTxt = '%(long imdb name)s'
    if characterTxt is None:
        characterTxt = '%(long imdb name)s'
    if companyTxt is None:
        companyTxt = '%(long imdb name)s'

    def object2txt(obj, _limitRecursion=None):
        """Pretty-print objects."""
        # Prevent unlimited recursion.
if _limitRecursion is None: _limitRecursion = 0 elif _limitRecursion > 5: return '' _limitRecursion += 1 if isinstance(obj, (list, tuple)): return joiner.join([object2txt(o, _limitRecursion=_limitRecursion) for o in obj]) elif isinstance(obj, dict): # XXX: not exactly nice, neither useful, I fear. return joiner.join( ['%s::%s' % (object2txt(k, _limitRecursion=_limitRecursion), object2txt(v, _limitRecursion=_limitRecursion)) for k, v in list(obj.items())] ) objData = {} if isinstance(obj, Movie): objData['movieID'] = obj.movieID outs = movieTxt elif isinstance(obj, Person): objData['personID'] = obj.personID outs = personTxt elif isinstance(obj, Character): objData['characterID'] = obj.characterID outs = characterTxt elif isinstance(obj, Company): objData['companyID'] = obj.companyID outs = companyTxt else: return obj def _excludeFalseConditionals(matchobj): # Return an empty string if the conditional is false/empty. condition = matchobj.group(1) proceed = obj.get(condition) or getattr(obj, condition, None) if proceed: return matchobj.group(2) else: return '' while re_conditional.search(outs): outs = re_conditional.sub(_excludeFalseConditionals, outs) for key in re_subst.findall(outs): value = obj.get(key) or getattr(obj, key, None) if not isinstance(value, str): if not _recurse: if value: value = str(value) if value: value = object2txt(value, _limitRecursion=_limitRecursion) elif value: value = applyToValues(str(value)) if not value: value = '' elif not isinstance(value, str): value = str(value) outs = outs.replace('%(' + key + ')s', value) return outs return object2txt def makeModCGILinks(movieTxt, personTxt, characterTxt=None, encoding='utf8'): """Make a function used to pretty-print movies and persons refereces; movieTxt and personTxt are the strings used for the substitutions. 
    movieTxt must contain %(movieID)s and %(title)s, while personTxt
    must contain %(personID)s and %(name)s, and characterTxt must contain
    %(characterID)s and %(name)s; characterTxt is optional, for
    backward compatibility."""
    _cgiPrint = makeCgiPrintEncoding(encoding)

    def modCGILinks(s, titlesRefs, namesRefs, characterRefs=None):
        """Substitute movies and persons references."""
        if characterRefs is None:
            characterRefs = {}

        # XXX: look ma'... more nested scopes!
        def _replaceMovie(match):
            to_replace = match.group(1)
            item = titlesRefs.get(to_replace)
            if item:
                movieID = item.movieID
                to_replace = movieTxt % {
                    'movieID': movieID,
                    'title': str(_cgiPrint(to_replace), encoding, 'xmlcharrefreplace')
                }
            return to_replace

        def _replacePerson(match):
            to_replace = match.group(1)
            item = namesRefs.get(to_replace)
            if item:
                personID = item.personID
                to_replace = personTxt % {
                    'personID': personID,
                    'name': str(_cgiPrint(to_replace), encoding, 'xmlcharrefreplace')
                }
            return to_replace

        def _replaceCharacter(match):
            to_replace = match.group(1)
            if characterTxt is None:
                return to_replace
            item = characterRefs.get(to_replace)
            if item:
                characterID = item.characterID
                if characterID is None:
                    return to_replace
                to_replace = characterTxt % {
                    'characterID': characterID,
                    'name': str(_cgiPrint(to_replace), encoding, 'xmlcharrefreplace')
                }
            return to_replace

        # Escape angle brackets, then turn bare URLs into anchors.
        s = s.replace('<', '&lt;').replace('>', '&gt;')
        s = _re_hrefsub(r'<a href="\1">\1</a>', s)
        s = re_titleRef.sub(_replaceMovie, s)
        s = re_nameRef.sub(_replacePerson, s)
        s = re_characterRef.sub(_replaceCharacter, s)
        return s

    modCGILinks.movieTxt = movieTxt
    modCGILinks.personTxt = personTxt
    modCGILinks.characterTxt = characterTxt
    return modCGILinks


# links to the imdb.com web site.
_movieTxt = '<a href="' + imdbURL_movie_base + 'tt%(movieID)s">%(title)s</a>'
_personTxt = '<a href="' + imdbURL_person_base + 'nm%(personID)s">%(name)s</a>'
_characterTxt = '<a href="' + imdbURL_character_base + 'ch%(characterID)s">%(name)s</a>'
modHtmlLinks = makeModCGILinks(movieTxt=_movieTxt, personTxt=_personTxt,
                               characterTxt=_characterTxt)
modHtmlLinksASCII = makeModCGILinks(movieTxt=_movieTxt, personTxt=_personTxt,
                                    characterTxt=_characterTxt,
                                    encoding='ascii')


def sortedSeasons(m):
    """Return a sorted list of seasons of the given series."""
    seasons = list(m.get('episodes', {}).keys())
    seasons.sort()
    return seasons


def sortedEpisodes(m, season=None):
    """Return a sorted list of episodes of the given series,
    considering only the specified season(s) (every season, if None)."""
    episodes = []
    seasons = season
    if season is None:
        seasons = sortedSeasons(m)
    else:
        if not isinstance(season, (tuple, list)):
            seasons = [season]
    for s in seasons:
        eps_indx = list(m.get('episodes', {}).get(s, {}).keys())
        eps_indx.sort()
        for e in eps_indx:
            episodes.append(m['episodes'][s][e])
    return episodes


# Idea and portions of the code courtesy of none none (dclist at gmail.com)
_re_imdbIDurl = re.compile(r'\b(nm|tt|ch|co)([0-9]{7})\b')


def get_byURL(url, info=None, args=None, kwds=None):
    """Return a Movie, Person, Character or Company object for the given URL;
    info is the info set to retrieve, args and kwds are respectively a list
    and a dictionary of arguments to initialize the data access system.
    Returns None if unable to correctly parse the url; can raise
    exceptions if unable to retrieve the data."""
    if args is None:
        args = []
    if kwds is None:
        kwds = {}
    ia = IMDb(*args, **kwds)
    match = _re_imdbIDurl.search(url)
    if not match:
        return None
    imdbtype = match.group(1)
    imdbID = match.group(2)
    if imdbtype == 'tt':
        return ia.get_movie(imdbID, info=info)
    elif imdbtype == 'nm':
        return ia.get_person(imdbID, info=info)
    elif imdbtype == 'ch':
        return ia.get_character(imdbID, info=info)
    elif imdbtype == 'co':
        return ia.get_company(imdbID, info=info)
    return None


# Idea and portions of code courtesy of Basil Shubin.
# Beware that these information are now available directly by # the Movie/Person/Character instances. def fullSizeCoverURL(obj): """Given an URL string or a Movie, Person or Character instance, returns an URL to the full-size version of the cover/headshot, or None otherwise. This function is obsolete: the same information are available as keys: 'full-size cover url' and 'full-size headshot', respectively for movies and persons/characters.""" return obj.get_fullsizeURL() def keyToXML(key): """Return a key (the ones used to access information in Movie and other classes instances) converted to the style of the XML output.""" return _tagAttr(key, '')[0] def translateKey(key): """Translate a given key.""" return _(keyToXML(key)) # Maps tags to classes. _MAP_TOP_OBJ = { 'person': Person, 'movie': Movie, 'character': Character, 'company': Company } # Tags to be converted to lists. _TAGS_TO_LIST = dict([(x[0], None) for x in list(TAGS_TO_MODIFY.values())]) _TAGS_TO_LIST.update(_MAP_TOP_OBJ) def tagToKey(tag): """Return the name of the tag, taking it from the 'key' attribute, if present.""" keyAttr = tag.get('key') if keyAttr: if tag.get('keytype') == 'int': keyAttr = int(keyAttr) return keyAttr return tag.tag def _valueWithType(tag, tagValue): """Return tagValue, handling some type conversions.""" tagType = tag.get('type') if tagType == 'int': tagValue = int(tagValue) elif tagType == 'float': tagValue = float(tagValue) return tagValue # Extra tags to get (if values were not already read from title/name). _titleTags = ('imdbindex', 'kind', 'year') _nameTags = ('imdbindex',) _companyTags = ('imdbindex', 'country') def parseTags(tag, _topLevel=True, _as=None, _infoset2keys=None, _key2infoset=None): """Recursively parse a tree of tags.""" # The returned object (usually a _Container subclass, but it can # be a string, an int, a float, a list or a dictionary). 
item = None if _infoset2keys is None: _infoset2keys = {} if _key2infoset is None: _key2infoset = {} name = tagToKey(tag) firstChild = (tag.getchildren() or [None])[0] tagStr = (tag.text or '').strip() if not tagStr and name == 'item': # Handles 'item' tags containing text and a 'notes' sub-tag. tagContent = tag.getchildren() if tagContent and tagContent[0].text: tagStr = (tagContent[0].text or '').strip() infoset = tag.get('infoset') if infoset: _key2infoset[name] = infoset _infoset2keys.setdefault(infoset, []).append(name) # Here we use tag.name to avoid tags like if tag.tag in _MAP_TOP_OBJ: # One of the subclasses of _Container. item = _MAP_TOP_OBJ[name]() itemAs = tag.get('access-system') if itemAs: if not _as: _as = itemAs else: itemAs = _as item.accessSystem = itemAs tagsToGet = [] theID = tag.get('id') if name == 'movie': item.movieID = theID tagsToGet = _titleTags ttitle = tag.find('title') if ttitle is not None: item.set_title(ttitle.text) tag.remove(ttitle) else: if name == 'person': item.personID = theID tagsToGet = _nameTags theName = tag.find('long imdb canonical name') if not theName: theName = tag.find('name') elif name == 'character': item.characterID = theID tagsToGet = _nameTags theName = tag.find('name') elif name == 'company': item.companyID = theID tagsToGet = _companyTags theName = tag.find('name') if theName is not None: item.set_name(theName.text) tag.remove(theName) for t in tagsToGet: if t in item.data: continue dataTag = tag.find(t) if dataTag is not None: item.data[tagToKey(dataTag)] = _valueWithType(dataTag, dataTag.text) notesTag = tag.find('notes') if notesTag is not None: item.notes = notesTag.text tag.remove(notesTag) episodeOf = tag.find('episode-of') if episodeOf is not None: item.data['episode of'] = parseTags(episodeOf, _topLevel=False, _as=_as, _infoset2keys=_infoset2keys, _key2infoset=_key2infoset) tag.remove(episodeOf) cRole = tag.find('current-role') if cRole is not None: cr = parseTags(cRole, _topLevel=False, _as=_as, 
_infoset2keys=_infoset2keys, _key2infoset=_key2infoset) item.currentRole = cr tag.remove(cRole) # XXX: big assumption, here. What about Movie instances used # as keys in dictionaries? What about other keys (season and # episode number, for example?) if not _topLevel: # tag.extract() return item _adder = lambda key, value: item.data.update({key: value}) elif tagStr: tagNotes = tag.find('notes') if tagNotes is not None: notes = (tagNotes.text or '').strip() if notes: tagStr += '::%s' % notes else: tagStr = _valueWithType(tag, tagStr) return tagStr elif firstChild is not None: firstChildName = tagToKey(firstChild) if firstChildName in _TAGS_TO_LIST: item = [] _adder = lambda key, value: item.append(value) else: item = {} _adder = lambda key, value: item.update({key: value}) else: item = {} _adder = lambda key, value: item.update({name: value}) for subTag in tag.getchildren(): subTagKey = tagToKey(subTag) # Exclude dinamically generated keys. if tag.tag in _MAP_TOP_OBJ and subTagKey in item._additional_keys(): continue subItem = parseTags(subTag, _topLevel=False, _as=_as, _infoset2keys=_infoset2keys, _key2infoset=_key2infoset) if subItem: _adder(subTagKey, subItem) if _topLevel and name in _MAP_TOP_OBJ: # Add information about 'info sets', but only to the top-level object. item.infoset2keys = _infoset2keys item.key2infoset = _key2infoset item.current_info = list(_infoset2keys.keys()) return item def parseXML(xml): """Parse a XML string, returning an appropriate object (usually an instance of a subclass of _Container.""" import lxml.etree return parseTags(lxml.etree.fromstring(xml)) _re_akas_lang = re.compile('(?:[(])([a-zA-Z]+?)(?: title[)])') _re_akas_country = re.compile('\(.*?\)') # akasLanguages, sortAKAsBySimilarity and getAKAsInLanguage code # copyright of Alberto Malagoli (refactoring by Davide Alberani). 
def akasLanguages(movie): """Given a movie, return a list of tuples in (lang, AKA) format; lang can be None, if unable to detect.""" lang_and_aka = [] akas = set((movie.get('akas') or []) + (movie.get('akas from release info') or [])) for aka in akas: # split aka aka = aka.split('::') # sometimes there is no countries information if len(aka) == 2: # search for something like "(... title)" where ... is a language language = _re_akas_lang.search(aka[1]) if language: language = language.groups()[0] else: # split countries using , and keep only the first one (it's sufficient) country = aka[1].split(',')[0] # remove parenthesis country = _re_akas_country.sub('', country).strip() # given the country, get corresponding language from dictionary language = COUNTRY_LANG.get(country) else: language = None lang_and_aka.append((language, aka[0])) return lang_and_aka def sortAKAsBySimilarity(movie, title, _titlesOnly=True, _preferredLang=None): """Return a list of movie AKAs, sorted by their similarity to the given title. If _titlesOnly is not True, similarity information are returned. If _preferredLang is specified, AKAs in the given language will get a higher score. 
    The return value is a list of titles, or a list of tuples if
    _titlesOnly is False."""
    language = movie.guessLanguage()
    # estimate string distance between current title and given title
    m_title = movie['title'].lower()
    l_title = title.lower()
    scores = []
    score = difflib.SequenceMatcher(None, m_title, l_title).ratio()
    # set original title and corresponding score as the best match for given title
    scores.append((score, movie['title'], None))
    for language, aka in akasLanguages(movie):
        # estimate string distance between current title and given title
        m_title = aka.lower()
        score = difflib.SequenceMatcher(None, m_title, l_title).ratio()
        # if current language is the same as the given one, increase score
        if _preferredLang and _preferredLang == language:
            score += 1
        scores.append((score, aka, language))
    scores.sort(reverse=True)
    if _titlesOnly:
        return [x[1] for x in scores]
    return scores


def getAKAsInLanguage(movie, lang, _searchedTitle=None):
    """Return a list of AKAs of a movie, in the specified language.
    If _searchedTitle is given, the AKAs are sorted by their similarity
    to it."""
    akas = []
    for language, aka in akasLanguages(movie):
        if lang == language:
            akas.append(aka)
    if _searchedTitle:
        scores = []
        searched = _searchedTitle.lower()
        for aka in akas:
            # Store a (similarity ratio, aka) tuple, so that the list can
            # be sorted by score and the title read back from x[1].
            ratio = difflib.SequenceMatcher(None, aka.lower(), searched).ratio()
            scores.append((ratio, aka))
        scores.sort(reverse=True)
        akas = [x[1] for x in scores]
    return akas
imdbpy-6.8/imdb/linguistics.py000066400000000000000000000224161351454127000165040ustar00rootroot00000000000000
# -*- coding: utf-8 -*-
# Copyright 2009-2017 Davide Alberani
#           2012 Alberto Malagoli
#           2009 H. Turgut Uyar
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides functions and data to handle languages and articles (in various languages) at the beginning of movie titles in a smart way. """ from __future__ import absolute_import, division, print_function, unicode_literals # List of generic articles used when the language of the title is unknown (or # we don't have information about articles in that language). # XXX: Managing titles in a lot of different languages, a function to recognize # an initial article can't be perfect; sometimes we'll stumble upon a short # word that is an article in some language, but it's not in another; in these # situations we have to choose if we want to interpret this little word # as an article or not (remember that we don't know what the original language # of the title was). # Example: 'en' is (I suppose) an article in Some Language. Unfortunately it # seems also to be a preposition in other languages (French?). # Running a script over the whole list of titles (and aliases), I've found # that 'en' is used as an article only 376 times, and as another thing 594 # times, so I've decided to _always_ consider 'en' as a non article. 
# # Here is a list of words that are _never_ considered as articles, complete # with the cound of times they are used in a way or another: # 'en' (376 vs 594), 'to' (399 vs 727), 'as' (198 vs 276), 'et' (79 vs 99), # 'des' (75 vs 150), 'al' (78 vs 304), 'ye' (14 vs 70), # 'da' (23 vs 298), "'n" (8 vs 12) # # I've left in the list 'i' (1939 vs 2151) and 'uno' (52 vs 56) # I'm not sure what '-al' is, and so I've left it out... # # Generic list of articles in unicode: GENERIC_ARTICLES = ( 'the', 'la', 'a', 'die', 'der', 'le', 'el', "l'", 'il', 'das', 'les', 'i', 'o', 'ein', 'un', 'de', 'los', 'an', 'una', 'las', 'eine', 'den', 'het', 'gli', 'lo', 'os', 'ang', 'oi', 'az', 'een', 'ha-', 'det', 'ta', 'al-', 'mga', "un'", 'uno', 'ett', 'dem', 'egy', 'els', 'eines', 'Ï', 'Ç', 'Ôï', 'Ïé' ) # Lists of articles separated by language. If possible, the list should # be sorted by frequency (not very important, but...) # If you want to add a list of articles for another language, mail it # it at imdbpy-devel@lists.sourceforge.net LANG_ARTICLES = { 'English': ('the', 'a', 'an'), 'Italian': ('la', 'le', "l'", 'il', 'i', 'un', 'una', 'gli', 'lo', "un'", 'uno'), 'Spanish': ( 'la', 'lo', 'el', 'las', 'un', 'los', 'una', 'al', 'del', 'unos', 'unas', 'uno' ), 'French': ('le', "l'", 'la', 'les', 'un', 'une', 'des', 'au', 'du', 'à la', 'de la', 'aux'), 'Portuguese': ('a', 'as', 'o', 'os', 'um', 'uns', 'uma', 'umas'), 'Turkish': () # Some languages doesn't have articles. } LANG_ARTICLESget = LANG_ARTICLES.get # Maps a language to countries where it is the main language. # If you want to add an entry for another language or country, mail it at # imdbpy-devel@lists.sourceforge.net . LANG_COUNTRIES = { 'English': ( 'Canada', 'Swaziland', 'Ghana', 'St. Lucia', 'Liberia', 'Jamaica', 'Bahamas', 'New Zealand', 'Lesotho', 'Kenya', 'Solomon Islands', 'United States', 'South Africa', 'St. Vincent and the Grenadines', 'Fiji', 'UK', 'Nigeria', 'Australia', 'USA', 'St. 
Kitts and Nevis', 'Belize', 'Sierra Leone', 'Gambia', 'Namibia', 'Micronesia', 'Kiribati', 'Grenada', 'Antigua and Barbuda', 'Barbados', 'Malta', 'Zimbabwe', 'Ireland', 'Uganda', 'Trinidad and Tobago', 'South Sudan', 'Guyana', 'Botswana', 'United Kingdom', 'Zambia' ), 'Italian': ('Italy', 'San Marino', 'Vatican City'), 'Spanish': ( 'Spain', 'Mexico', 'Argentina', 'Bolivia', 'Guatemala', 'Uruguay', 'Peru', 'Cuba', 'Dominican Republic', 'Panama', 'Costa Rica', 'Ecuador', 'El Salvador', 'Chile', 'Equatorial Guinea', 'Spain', 'Colombia', 'Nicaragua', 'Venezuela', 'Honduras', 'Paraguay' ), 'French': ( 'Cameroon', 'Burkina Faso', 'Dominica', 'Gabon', 'Monaco', 'France', "Cote d'Ivoire", 'Benin', 'Togo', 'Central African Republic', 'Mali', 'Niger', 'Congo, Republic of', 'Guinea', 'Congo, Democratic Republic of the', 'Luxembourg', 'Haiti', 'Chad', 'Burundi', 'Madagascar', 'Comoros', 'Senegal' ), 'Portuguese': ( 'Portugal', 'Brazil', 'Sao Tome and Principe', 'Cape Verde', 'Angola', 'Mozambique', 'Guinea-Bissau' ), 'German': ( 'Liechtenstein', 'Austria', 'West Germany', 'Switzerland', 'East Germany', 'Germany' ), 'Arabic': ( 'Saudi Arabia', 'Kuwait', 'Jordan', 'Oman', 'Yemen', 'United Arab Emirates', 'Mauritania', 'Lebanon', 'Bahrain', 'Libya', 'Palestinian State (proposed)', 'Qatar', 'Algeria', 'Morocco', 'Iraq', 'Egypt', 'Djibouti', 'Sudan', 'Syria', 'Tunisia' ), 'Turkish': ('Turkey', 'Azerbaijan'), 'Swahili': ('Tanzania',), 'Swedish': ('Sweden',), 'Icelandic': ('Iceland',), 'Estonian': ('Estonia',), 'Romanian': ('Romania',), 'Samoan': ('Samoa',), 'Slovenian': ('Slovenia',), 'Tok Pisin': ('Papua New Guinea',), 'Palauan': ('Palau',), 'Macedonian': ('Macedonia',), 'Hindi': ('India',), 'Dutch': ('Netherlands', 'Belgium', 'Suriname'), 'Marshallese': ('Marshall Islands',), 'Korean': ('Korea, North', 'Korea, South', 'North Korea', 'South Korea'), 'Vietnamese': ('Vietnam',), 'Danish': ('Denmark',), 'Khmer': ('Cambodia',), 'Lao': ('Laos',), 'Somali': ('Somalia',), 'Filipino': 
('Philippines',), 'Hungarian': ('Hungary',), 'Ukrainian': ('Ukraine',), 'Bosnian': ('Bosnia and Herzegovina',), 'Georgian': ('Georgia',), 'Lithuanian': ('Lithuania',), 'Malay': ('Brunei',), 'Tetum': ('East Timor',), 'Norwegian': ('Norway',), 'Armenian': ('Armenia',), 'Russian': ('Russia',), 'Slovak': ('Slovakia',), 'Thai': ('Thailand',), 'Croatian': ('Croatia',), 'Turkmen': ('Turkmenistan',), 'Nepali': ('Nepal',), 'Finnish': ('Finland',), 'Uzbek': ('Uzbekistan',), 'Albanian': ('Albania', 'Kosovo'), 'Hebrew': ('Israel',), 'Bulgarian': ('Bulgaria',), 'Greek': ('Cyprus', 'Greece'), 'Burmese': ('Myanmar',), 'Latvian': ('Latvia',), 'Serbian': ('Serbia',), 'Afar': ('Eritrea',), 'Catalan': ('Andorra',), 'Chinese': ('China', 'Taiwan'), 'Czech': ('Czech Republic', 'Czechoslovakia'), 'Bislama': ('Vanuatu',), 'Japanese': ('Japan',), 'Kinyarwanda': ('Rwanda',), 'Amharic': ('Ethiopia',), 'Persian': ('Afghanistan', 'Iran'), 'Tajik': ('Tajikistan',), 'Mongolian': ('Mongolia',), 'Dzongkha': ('Bhutan',), 'Urdu': ('Pakistan',), 'Polish': ('Poland',), 'Sinhala': ('Sri Lanka',), } # Maps countries to their main language. 
COUNTRY_LANG = {} for lang in LANG_COUNTRIES: for country in LANG_COUNTRIES[lang]: COUNTRY_LANG[country] = lang def toUTF8(articles): """Convert a list of unicode articles to utf-8 encoded strings.""" return tuple([art.encode('utf8') for art in articles]) def toDicts(articles): """Given a list of unicode encoded articles, build two dictionary (one utf-8 encoded and another one with unicode keys) for faster matches.""" utf8Articles = toUTF8(articles) return dict([(x, x) for x in utf8Articles]), dict([(x, x) for x in articles]) def addTrailingSpace(articles): """From the given list of unicode articles, return two lists (one utf-8 encoded and another one in unicode) where a space is added at the end - if the last char is not ' or -.""" _spArticles = [] _spUnicodeArticles = [] for article in articles: if article[-1] not in ("'", '-'): article += ' ' _spArticles.append(article.encode('utf8')) _spUnicodeArticles.append(article) return _spArticles, _spUnicodeArticles # Caches. _ART_CACHE = {} _SP_ART_CACHE = {} def articlesDictsForLang(lang): """Return dictionaries of articles specific for the given language, or the default one if the language is not known.""" if lang in _ART_CACHE: return _ART_CACHE[lang] artDicts = toDicts(LANG_ARTICLESget(lang, GENERIC_ARTICLES)) _ART_CACHE[lang] = artDicts return artDicts def spArticlesForLang(lang): """Return lists of articles (plus optional spaces) specific for the given language, or the default one if the language is not known.""" if lang in _SP_ART_CACHE: return _SP_ART_CACHE[lang] spArticles = addTrailingSpace(LANG_ARTICLESget(lang, GENERIC_ARTICLES)) _SP_ART_CACHE[lang] = spArticles return spArticles imdbpy-6.8/imdb/locale/000077500000000000000000000000001351454127000150275ustar00rootroot00000000000000imdbpy-6.8/imdb/locale/__init__.py000066400000000000000000000020101351454127000171310ustar00rootroot00000000000000# Copyright 2009 H. 
Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This package provides scripts and files for internationalization of IMDbPY. """ from __future__ import absolute_import, division, print_function, unicode_literals import gettext import os LOCALE_DIR = os.path.dirname(__file__) gettext.bindtextdomain('imdbpy', LOCALE_DIR) imdbpy-6.8/imdb/locale/generatepot.py000077500000000000000000000044631351454127000177300ustar00rootroot00000000000000# Copyright 2009 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This script generates the imdbpy.pot file, from the DTD. 
""" import re import sys from datetime import datetime as dt DEFAULT_MESSAGES = {} ELEMENT_PATTERN = r"""\n" "Language-Team: TEAM NAME \n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=1; plural=0;\n" "Language-Code: en\n" "Language-Name: English\n" "Preferred-Encodings: utf-8\n" "Domain: imdbpy\n" """ if len(sys.argv) != 2: print("Usage: %s dtd_file" % sys.argv[0]) sys.exit() dtdfilename = sys.argv[1] dtd = open(dtdfilename).read() elements = re_element.findall(dtd) uniq = set(elements) elements = list(uniq) print(POT_HEADER_TEMPLATE % { 'now': dt.strftime(dt.now(), "%Y-%m-%d %H:%M+0000") }) for element in sorted(elements): if element in DEFAULT_MESSAGES: print('# Default: %s' % DEFAULT_MESSAGES[element]) else: print('# Default: %s' % element.replace('-', ' ').capitalize()) print('msgid "%s"' % element) print('msgstr ""') # use this part instead of the line above to generate the po file for English # if element in DEFAULT_MESSAGES: # print 'msgstr "%s"' % DEFAULT_MESSAGES[element] # else: # print 'msgstr "%s"' % element.replace('-', ' ').capitalize() print() imdbpy-6.8/imdb/locale/imdbpy-ar.po000066400000000000000000000464611351454127000172660ustar00rootroot00000000000000# Gettext message file for imdbpy # Translators: # Rajaa Jalil , 2013 msgid "" msgstr "" "Project-Id-Version: IMDbPY\n" "POT-Creation-Date: 2010-03-18 14:35+0000\n" "PO-Revision-Date: 2016-03-28 20:40+0000\n" "Last-Translator: Rajaa Jalil \n" "Language-Team: Arabic (http://www.transifex.com/davide_alberani/imdbpy/language/ar/)\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Domain: imdbpy\n" "Language: ar\n" "Language-Code: en\n" "Language-Name: English\n" "Plural-Forms: nplurals=6; plural=n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 : n%100>=11 && n%100<=99 ? 
4 : 5;\n" "Preferred-Encodings: utf-8\n" # Default: Actor msgid "actor" msgstr "ممثل" # Default: Actress msgid "actress" msgstr "ممثلة" # Default: Adaption msgid "adaption" msgstr "إقتباس" # Default: Additional information msgid "additional-information" msgstr "معلومات إضافية" # Default: Admissions msgid "admissions" msgstr "" # Default: Agent address msgid "agent-address" msgstr "" # Default: Airing msgid "airing" msgstr "" # Default: Akas msgid "akas" msgstr "" # Default: Akas from release info msgid "akas-from-release-info" msgstr "" # Default: All products msgid "all-products" msgstr "" # Default: Alternate language version of msgid "alternate-language-version-of" msgstr "" # Default: Alternate versions msgid "alternate-versions" msgstr "" # Default: Amazon reviews msgid "amazon-reviews" msgstr "" # Default: Analog left msgid "analog-left" msgstr "" # Default: Analog right msgid "analog-right" msgstr "" # Default: Animation department msgid "animation-department" msgstr "" # Default: Archive footage msgid "archive-footage" msgstr "" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr "" # Default: Art department msgid "art-department" msgstr "" # Default: Art direction msgid "art-direction" msgstr "" # Default: Art director msgid "art-director" msgstr "" # Default: Article msgid "article" msgstr "مقال" # Default: Asin msgid "asin" msgstr "" # Default: Aspect ratio msgid "aspect-ratio" msgstr "" # Default: Assigner msgid "assigner" msgstr "" # Default: Assistant director msgid "assistant-director" msgstr "" # Default: Auctions msgid "auctions" msgstr "" # Default: Audio noise msgid "audio-noise" msgstr "" # Default: Audio quality msgid "audio-quality" msgstr "جودة الصوت" # Default: Award msgid "award" msgstr "جائزة" # Default: Awards msgid "awards" msgstr "جوائز" # Default: Biographical movies msgid "biographical-movies" msgstr "" # Default: Biography msgid "biography" msgstr "" # Default: Biography print msgid "biography-print" msgstr "" # Default: Birth 
date msgid "birth-date" msgstr "تاريخ الميلاد" # Default: Birth name msgid "birth-name" msgstr "" # Default: Birth notes msgid "birth-notes" msgstr "" # Default: Body msgid "body" msgstr "جسد" # Default: Book msgid "book" msgstr "كتاب" # Default: Books msgid "books" msgstr "كتب" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "" # Default: Budget msgid "budget" msgstr "" # Default: Business msgid "business" msgstr "" # Default: By arrangement with msgid "by-arrangement-with" msgstr "" # Default: Camera msgid "camera" msgstr "كاميرا" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "" # Default: Canonical episode title msgid "canonical-episode-title" msgstr "" # Default: Canonical name msgid "canonical-name" msgstr "" # Default: Canonical series title msgid "canonical-series-title" msgstr "" # Default: Canonical title msgid "canonical-title" msgstr "" # Default: Cast msgid "cast" msgstr "" # Default: Casting department msgid "casting-department" msgstr "" # Default: Casting director msgid "casting-director" msgstr "" # Default: Catalog number msgid "catalog-number" msgstr "" # Default: Category msgid "category" msgstr "فئة" # Default: Certificate msgid "certificate" msgstr "شهادة" # Default: Certificates msgid "certificates" msgstr "شهادات" # Default: Certification msgid "certification" msgstr "" # Default: Channel msgid "channel" msgstr "قناة" # Default: Character msgid "character" msgstr "شخصية" # Default: Cinematographer msgid "cinematographer" msgstr "" # Default: Cinematographic process msgid "cinematographic-process" msgstr "" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "" # Default: Color info msgid "color-info" msgstr "" # Default: Color information msgid "color-information" msgstr "" # Default: Color rendition msgid "color-rendition" msgstr "" # Default: Company msgid "company" msgstr "شركة" # Default: Complete cast msgid "complete-cast" msgstr "" # Default: 
Complete crew msgid "complete-crew" msgstr "" # Default: Composer msgid "composer" msgstr "مؤلف" # Default: Connections msgid "connections" msgstr "" # Default: Contrast msgid "contrast" msgstr "" # Default: Copyright holder msgid "copyright-holder" msgstr "" # Default: Costume department msgid "costume-department" msgstr "" # Default: Costume designer msgid "costume-designer" msgstr "" # Default: Countries msgid "countries" msgstr "بلدان" # Default: Country msgid "country" msgstr "بلد" # Default: Courtesy of msgid "courtesy-of" msgstr "" # Default: Cover msgid "cover" msgstr "" # Default: Cover url msgid "cover-url" msgstr "" # Default: Crazy credits msgid "crazy-credits" msgstr "" # Default: Creator msgid "creator" msgstr "" # Default: Current role msgid "current-role" msgstr "" # Default: Database msgid "database" msgstr "قاعدة البيانات" # Default: Date msgid "date" msgstr "تاريخ" # Default: Death date msgid "death-date" msgstr "" # Default: Death notes msgid "death-notes" msgstr "" # Default: Demographic msgid "demographic" msgstr "ديموغرافي" # Default: Description msgid "description" msgstr "وصف" # Default: Dialogue intellegibility msgid "dialogue-intellegibility" msgstr "" # Default: Digital sound msgid "digital-sound" msgstr "" # Default: Director msgid "director" msgstr "مخرج" # Default: Disc format msgid "disc-format" msgstr "" # Default: Disc size msgid "disc-size" msgstr "" # Default: Distributors msgid "distributors" msgstr "موزعون" # Default: Dvd msgid "dvd" msgstr "dvd" # Default: Dvd features msgid "dvd-features" msgstr "" # Default: Dvd format msgid "dvd-format" msgstr "" # Default: Dvds msgid "dvds" msgstr "" # Default: Dynamic range msgid "dynamic-range" msgstr "" # Default: Edited from msgid "edited-from" msgstr "" # Default: Edited into msgid "edited-into" msgstr "" # Default: Editor msgid "editor" msgstr "محرر" # Default: Editorial department msgid "editorial-department" msgstr "" # Default: Episode msgid "episode" msgstr "حلقة" # Default: 
Episode of msgid "episode-of" msgstr "" # Default: Episode title msgid "episode-title" msgstr "" # Default: Episodes msgid "episodes" msgstr "حلقات" # Default: Episodes rating msgid "episodes-rating" msgstr "" # Default: Essays msgid "essays" msgstr "" # Default: External reviews msgid "external-reviews" msgstr "" # Default: Faqs msgid "faqs" msgstr "" # Default: Feature msgid "feature" msgstr "" # Default: Featured in msgid "featured-in" msgstr "" # Default: Features msgid "features" msgstr "" # Default: Film negative format msgid "film-negative-format" msgstr "" # Default: Filming dates msgid "filming-dates" msgstr "" # Default: Filmography msgid "filmography" msgstr "" # Default: Followed by msgid "followed-by" msgstr "" # Default: Follows msgid "follows" msgstr "" # Default: For msgid "for" msgstr "ل" # Default: Frequency response msgid "frequency-response" msgstr "" # Default: From msgid "from" msgstr "من" # Default: Full article link msgid "full-article-link" msgstr "" # Default: Full size cover url msgid "full-size-cover-url" msgstr "" # Default: Full size headshot msgid "full-size-headshot" msgstr "" # Default: Genres msgid "genres" msgstr "أنواع" # Default: Goofs msgid "goofs" msgstr "" # Default: Gross msgid "gross" msgstr "" # Default: Group genre msgid "group-genre" msgstr "" # Default: Headshot msgid "headshot" msgstr "" # Default: Height msgid "height" msgstr "" # Default: Imdbindex msgid "imdbindex" msgstr "" # Default: In development msgid "in-development" msgstr "" # Default: Interview msgid "interview" msgstr "حوار" # Default: Interviews msgid "interviews" msgstr "" # Default: Introduction msgid "introduction" msgstr "مقدمة" # Default: Item msgid "item" msgstr "" # Default: Keywords msgid "keywords" msgstr "" # Default: Kind msgid "kind" msgstr "" # Default: Label msgid "label" msgstr "" # Default: Laboratory msgid "laboratory" msgstr "مختبر" # Default: Language msgid "language" msgstr "لغة" # Default: Languages msgid "languages" msgstr "لغات" # 
Default: Laserdisc msgid "laserdisc" msgstr "" # Default: Laserdisc title msgid "laserdisc-title" msgstr "" # Default: Length msgid "length" msgstr "" # Default: Line msgid "line" msgstr "" # Default: Link msgid "link" msgstr "رابط" # Default: Link text msgid "link-text" msgstr "" # Default: Literature msgid "literature" msgstr "" # Default: Locations msgid "locations" msgstr "مواقع" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "" # Default: Long imdb name msgid "long-imdb-name" msgstr "" # Default: Long imdb title msgid "long-imdb-title" msgstr "" # Default: Magazine cover photo msgid "magazine-cover-photo" msgstr "" # Default: Make up msgid "make-up" msgstr "" # Default: Master format msgid "master-format" msgstr "" # Default: Median msgid "median" msgstr "" # Default: Merchandising links msgid "merchandising-links" msgstr "" # Default: Mini biography msgid "mini-biography" msgstr "" # Default: Misc links msgid "misc-links" msgstr "" # Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "" # Default: Movie msgid "movie" msgstr "فيلم" # Default: Mpaa msgid "mpaa" msgstr "" # Default: Music department msgid "music-department" msgstr "" # Default: Name msgid "name" msgstr "إسم" # Default: News msgid "news" msgstr "أخبار" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "" # Default: Nick names msgid "nick-names" msgstr "" # Default: Notes msgid "notes" msgstr "" # Default: Novel msgid "novel" msgstr "رواية" # Default: Number msgid "number" msgstr "رقم" # Default: Number of chapter stops msgid "number-of-chapter-stops" msgstr "" # Default: Number of episodes msgid "number-of-episodes" msgstr "" # Default: Number of seasons msgid "number-of-seasons" msgstr "" # Default: Number of 
sides msgid "number-of-sides" msgstr "" # Default: Number of votes msgid "number-of-votes" msgstr "" # Default: Official retail price msgid "official-retail-price" msgstr "" # Default: Official sites msgid "official-sites" msgstr "" # Default: Opening weekend msgid "opening-weekend" msgstr "" # Default: Original air date msgid "original-air-date" msgstr "" # Default: Original music msgid "original-music" msgstr "" # Default: Original title msgid "original-title" msgstr "" # Default: Other literature msgid "other-literature" msgstr "" # Default: Other works msgid "other-works" msgstr "" # Default: Parents guide msgid "parents-guide" msgstr "" # Default: Performed by msgid "performed-by" msgstr "" # Default: Person msgid "person" msgstr "شخص" # Default: Photo sites msgid "photo-sites" msgstr "" # Default: Pictorial msgid "pictorial" msgstr "" # Default: Picture format msgid "picture-format" msgstr "" # Default: Plot msgid "plot" msgstr "حبكة" # Default: Plot outline msgid "plot-outline" msgstr "" # Default: Portrayed in msgid "portrayed-in" msgstr "" # Default: Pressing plant msgid "pressing-plant" msgstr "" # Default: Printed film format msgid "printed-film-format" msgstr "" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "" # Default: Producer msgid "producer" msgstr "منتج" # Default: Production companies msgid "production-companies" msgstr "" # Default: Production country msgid "production-country" msgstr "" # Default: Production dates msgid "production-dates" msgstr "" # Default: Production design msgid "production-design" msgstr "" # Default: Production designer msgid "production-designer" msgstr "" # Default: Production manager msgid "production-manager" msgstr "" # Default: Production process protocol msgid "production-process-protocol" msgstr "" # Default: Quality of source msgid "quality-of-source" msgstr "" # Default: Quality program msgid "quality-program" msgstr "" # Default: Quote msgid "quote" msgstr "مقتبس" # Default: Quotes msgid 
"quotes" msgstr "مقتبسات" # Default: Rating msgid "rating" msgstr "تقييم" # Default: Recommendations msgid "recommendations" msgstr "" # Default: Referenced in msgid "referenced-in" msgstr "" # Default: References msgid "references" msgstr "" # Default: Region msgid "region" msgstr "" # Default: Release country msgid "release-country" msgstr "" # Default: Release date msgid "release-date" msgstr "" # Default: Release dates msgid "release-dates" msgstr "" # Default: Remade as msgid "remade-as" msgstr "" # Default: Remake of msgid "remake-of" msgstr "" # Default: Rentals msgid "rentals" msgstr "" # Default: Result msgid "result" msgstr "نتيجة" # Default: Review msgid "review" msgstr "" # Default: Review author msgid "review-author" msgstr "" # Default: Review kind msgid "review-kind" msgstr "" # Default: Runtime msgid "runtime" msgstr "" # Default: Runtimes msgid "runtimes" msgstr "" # Default: Salary history msgid "salary-history" msgstr "" # Default: Screenplay teleplay msgid "screenplay-teleplay" msgstr "" # Default: Season msgid "season" msgstr "موسم" # Default: Second unit director or assistant director msgid "second-unit-director-or-assistant-director" msgstr "" # Default: Self msgid "self" msgstr "" # Default: Series animation department msgid "series-animation-department" msgstr "" # Default: Series art department msgid "series-art-department" msgstr "" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "" # Default: Series camera department msgid "series-camera-department" msgstr "" # Default: Series casting department msgid "series-casting-department" msgstr "" # Default: Series cinematographers msgid "series-cinematographers" msgstr "" # Default: Series costume department msgid "series-costume-department" msgstr "" # Default: Series editorial department msgid "series-editorial-department" msgstr "" # Default: Series editors msgid "series-editors" msgstr "" # Default: Series make up department msgid "series-make-up-department" 
msgstr "" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "" # Default: Series music department msgid "series-music-department" msgstr "" # Default: Series producers msgid "series-producers" msgstr "" # Default: Series production designers msgid "series-production-designers" msgstr "" # Default: Series production managers msgid "series-production-managers" msgstr "" # Default: Series sound department msgid "series-sound-department" msgstr "" # Default: Series special effects department msgid "series-special-effects-department" msgstr "" # Default: Series stunts msgid "series-stunts" msgstr "" # Default: Series title msgid "series-title" msgstr "" # Default: Series transportation department msgid "series-transportation-department" msgstr "" # Default: Series visual effects department msgid "series-visual-effects-department" msgstr "" # Default: Series writers msgid "series-writers" msgstr "" # Default: Series years msgid "series-years" msgstr "" # Default: Set decoration msgid "set-decoration" msgstr "" # Default: Sharpness msgid "sharpness" msgstr "" # Default: Similar to msgid "similar-to" msgstr "" # Default: Smart canonical episode title msgid "smart-canonical-episode-title" msgstr "" # Default: Smart canonical series title msgid "smart-canonical-series-title" msgstr "" # Default: Smart canonical title msgid "smart-canonical-title" msgstr "" # Default: Smart long imdb canonical title msgid "smart-long-imdb-canonical-title" msgstr "" # Default: Sound clips msgid "sound-clips" msgstr "" # Default: Sound crew msgid "sound-crew" msgstr "" # Default: Sound encoding msgid "sound-encoding" msgstr "" # Default: Sound mix msgid "sound-mix" msgstr "" # Default: Soundtrack msgid "soundtrack" msgstr "" # Default: Spaciality msgid "spaciality" msgstr "إختصاص" # Default: Special effects msgid "special-effects" msgstr "" # Default: Special effects companies msgid "special-effects-companies" msgstr "" # Default: Special effects department msgid 
"special-effects-department" msgstr "" # Default: Spin off msgid "spin-off" msgstr "" # Default: Spin off from msgid "spin-off-from" msgstr "" # Default: Spoofed in msgid "spoofed-in" msgstr "" # Default: Spoofs msgid "spoofs" msgstr "" # Default: Spouse msgid "spouse" msgstr "زوج" # Default: Status of availablility msgid "status-of-availablility" msgstr "" # Default: Studio msgid "studio" msgstr "استوديو" # Default: Studios msgid "studios" msgstr "استوديوهات" # Default: Stunt performer msgid "stunt-performer" msgstr "" # Default: Stunts msgid "stunts" msgstr "" # Default: Subtitles msgid "subtitles" msgstr "" # Default: Supplement msgid "supplement" msgstr "" # Default: Supplements msgid "supplements" msgstr "" # Default: Synopsis msgid "synopsis" msgstr "" # Default: Taglines msgid "taglines" msgstr "" # Default: Tech info msgid "tech-info" msgstr "" # Default: Thanks msgid "thanks" msgstr "بفضل" # Default: Time msgid "time" msgstr "وقت" # Default: Title msgid "title" msgstr "عنوان" # Default: Titles in this product msgid "titles-in-this-product" msgstr "" # Default: To msgid "to" msgstr "إلى" # Default: Top 250 rank msgid "top-250-rank" msgstr "" # Default: Trade mark msgid "trade-mark" msgstr "" # Default: Transportation department msgid "transportation-department" msgstr "" # Default: Trivia msgid "trivia" msgstr "" # Default: Tv msgid "tv" msgstr "تلفزيون" # Default: Under license from msgid "under-license-from" msgstr "" # Default: Unknown link msgid "unknown-link" msgstr "" # Default: Upc msgid "upc" msgstr "upc" # Default: Version of msgid "version-of" msgstr "" # Default: Vhs msgid "vhs" msgstr "vhs" # Default: Video msgid "video" msgstr "فيديو" # Default: Video artifacts msgid "video-artifacts" msgstr "" # Default: Video clips msgid "video-clips" msgstr "" # Default: Video noise msgid "video-noise" msgstr "" # Default: Video quality msgid "video-quality" msgstr "" # Default: Video standard msgid "video-standard" msgstr "" # Default: Visual effects msgid 
"visual-effects" msgstr "" # Default: Votes msgid "votes" msgstr "أصوات" # Default: Votes distribution msgid "votes-distribution" msgstr "" # Default: Weekend gross msgid "weekend-gross" msgstr "" # Default: Where now msgid "where-now" msgstr "" # Default: With msgid "with" msgstr "مع" # Default: Writer msgid "writer" msgstr "كاتب" # Default: Written by msgid "written-by" msgstr "" # Default: Year msgid "year" msgstr "سنة" # Default: Zshops msgid "zshops" msgstr "" imdbpy-6.8/imdb/locale/imdbpy-bg.po000066400000000000000000000512151351454127000172450ustar00rootroot00000000000000# Gettext message file for imdbpy # Translators: # Atanas Kovachki , 2014 msgid "" msgstr "" "Project-Id-Version: IMDbPY\n" "POT-Creation-Date: 2010-03-18 14:35+0000\n" "PO-Revision-Date: 2016-03-28 20:40+0000\n" "Last-Translator: Atanas Kovachki \n" "Language-Team: Bulgarian (http://www.transifex.com/davide_alberani/imdbpy/language/bg/)\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Domain: imdbpy\n" "Language: bg\n" "Language-Code: en\n" "Language-Name: English\n" "Plural-Forms: nplurals=2; plural=(n != 1);\n" "Preferred-Encodings: utf-8\n" # Default: Actor msgid "actor" msgstr "актьор" # Default: Actress msgid "actress" msgstr "актриса" # Default: Adaption msgid "adaption" msgstr "адаптация" # Default: Additional information msgid "additional-information" msgstr "допълнителна информация" # Default: Admissions msgid "admissions" msgstr "" # Default: Agent address msgid "agent-address" msgstr "" # Default: Airing msgid "airing" msgstr "" # Default: Akas msgid "akas" msgstr "" # Default: Akas from release info msgid "akas-from-release-info" msgstr "" # Default: All products msgid "all-products" msgstr "всички продукти" # Default: Alternate language version of msgid "alternate-language-version-of" msgstr "" # Default: Alternate versions msgid "alternate-versions" msgstr "" # Default: Amazon reviews msgid "amazon-reviews" msgstr "" # 
Default: Analog left msgid "analog-left" msgstr "" # Default: Analog right msgid "analog-right" msgstr "" # Default: Animation department msgid "animation-department" msgstr "" # Default: Archive footage msgid "archive-footage" msgstr "" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr "" # Default: Art department msgid "art-department" msgstr "" # Default: Art direction msgid "art-direction" msgstr "" # Default: Art director msgid "art-director" msgstr "арт директор" # Default: Article msgid "article" msgstr "" # Default: Asin msgid "asin" msgstr "" # Default: Aspect ratio msgid "aspect-ratio" msgstr "" # Default: Assigner msgid "assigner" msgstr "" # Default: Assistant director msgid "assistant-director" msgstr "" # Default: Auctions msgid "auctions" msgstr "" # Default: Audio noise msgid "audio-noise" msgstr "" # Default: Audio quality msgid "audio-quality" msgstr "" # Default: Award msgid "award" msgstr "награда" # Default: Awards msgid "awards" msgstr "награди" # Default: Biographical movies msgid "biographical-movies" msgstr "" # Default: Biography msgid "biography" msgstr "биография" # Default: Biography print msgid "biography-print" msgstr "" # Default: Birth date msgid "birth-date" msgstr "рождена дата" # Default: Birth name msgid "birth-name" msgstr "" # Default: Birth notes msgid "birth-notes" msgstr "" # Default: Body msgid "body" msgstr "" # Default: Book msgid "book" msgstr "книга" # Default: Books msgid "books" msgstr "книги" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "" # Default: Budget msgid "budget" msgstr "бюджет" # Default: Business msgid "business" msgstr "" # Default: By arrangement with msgid "by-arrangement-with" msgstr "" # Default: Camera msgid "camera" msgstr "камера" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "" # Default: Canonical episode title msgid "canonical-episode-title" msgstr "" # Default: Canonical name msgid "canonical-name" msgstr "" # Default: 
Canonical series title msgid "canonical-series-title" msgstr "" # Default: Canonical title msgid "canonical-title" msgstr "" # Default: Cast msgid "cast" msgstr "" # Default: Casting department msgid "casting-department" msgstr "" # Default: Casting director msgid "casting-director" msgstr "кастинг директор" # Default: Catalog number msgid "catalog-number" msgstr "" # Default: Category msgid "category" msgstr "категория" # Default: Certificate msgid "certificate" msgstr "сертификат" # Default: Certificates msgid "certificates" msgstr "сертификати" # Default: Certification msgid "certification" msgstr "сертифициране" # Default: Channel msgid "channel" msgstr "канал" # Default: Character msgid "character" msgstr "характер" # Default: Cinematographer msgid "cinematographer" msgstr "кинематограф" # Default: Cinematographic process msgid "cinematographic-process" msgstr "" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "" # Default: Color info msgid "color-info" msgstr "" # Default: Color information msgid "color-information" msgstr "" # Default: Color rendition msgid "color-rendition" msgstr "" # Default: Company msgid "company" msgstr "компания" # Default: Complete cast msgid "complete-cast" msgstr "" # Default: Complete crew msgid "complete-crew" msgstr "" # Default: Composer msgid "composer" msgstr "композитор" # Default: Connections msgid "connections" msgstr "връзки" # Default: Contrast msgid "contrast" msgstr "контраст" # Default: Copyright holder msgid "copyright-holder" msgstr "" # Default: Costume department msgid "costume-department" msgstr "" # Default: Costume designer msgid "costume-designer" msgstr "" # Default: Countries msgid "countries" msgstr "страни" # Default: Country msgid "country" msgstr "страна" # Default: Courtesy of msgid "courtesy-of" msgstr "" # Default: Cover msgid "cover" msgstr "" # Default: Cover url msgid "cover-url" msgstr "" # Default: Crazy credits msgid "crazy-credits" msgstr "" # Default: 
Creator msgid "creator" msgstr "създател" # Default: Current role msgid "current-role" msgstr "" # Default: Database msgid "database" msgstr "база данни" # Default: Date msgid "date" msgstr "дата" # Default: Death date msgid "death-date" msgstr "дата на смъртта" # Default: Death notes msgid "death-notes" msgstr "" # Default: Demographic msgid "demographic" msgstr "" # Default: Description msgid "description" msgstr "описание" # Default: Dialogue intellegibility msgid "dialogue-intellegibility" msgstr "" # Default: Digital sound msgid "digital-sound" msgstr "" # Default: Director msgid "director" msgstr "директор" # Default: Disc format msgid "disc-format" msgstr "формат на диска" # Default: Disc size msgid "disc-size" msgstr "размер на диска" # Default: Distributors msgid "distributors" msgstr "дистрибутори" # Default: Dvd msgid "dvd" msgstr "dvd" # Default: Dvd features msgid "dvd-features" msgstr "dvd характеристики" # Default: Dvd format msgid "dvd-format" msgstr "dvd формат" # Default: Dvds msgid "dvds" msgstr "dvd-та" # Default: Dynamic range msgid "dynamic-range" msgstr "динамичен обхват" # Default: Edited from msgid "edited-from" msgstr "" # Default: Edited into msgid "edited-into" msgstr "" # Default: Editor msgid "editor" msgstr "редактор" # Default: Editorial department msgid "editorial-department" msgstr "" # Default: Episode msgid "episode" msgstr "епизод" # Default: Episode of msgid "episode-of" msgstr "епизод от" # Default: Episode title msgid "episode-title" msgstr "име на епизода" # Default: Episodes msgid "episodes" msgstr "епизоди" # Default: Episodes rating msgid "episodes-rating" msgstr "рейтинг на епизодите" # Default: Essays msgid "essays" msgstr "" # Default: External reviews msgid "external-reviews" msgstr "външни рецензии" # Default: Faqs msgid "faqs" msgstr "" # Default: Feature msgid "feature" msgstr "" # Default: Featured in msgid "featured-in" msgstr "" # Default: Features msgid "features" msgstr "" # Default: Film negative format msgid 
"film-negative-format" msgstr "" # Default: Filming dates msgid "filming-dates" msgstr "" # Default: Filmography msgid "filmography" msgstr "филмография" # Default: Followed by msgid "followed-by" msgstr "последван от" # Default: Follows msgid "follows" msgstr "последователи" # Default: For msgid "for" msgstr "" # Default: Frequency response msgid "frequency-response" msgstr "" # Default: From msgid "from" msgstr "" # Default: Full article link msgid "full-article-link" msgstr "" # Default: Full size cover url msgid "full-size-cover-url" msgstr "" # Default: Full size headshot msgid "full-size-headshot" msgstr "" # Default: Genres msgid "genres" msgstr "жанрове" # Default: Goofs msgid "goofs" msgstr "" # Default: Gross msgid "gross" msgstr "" # Default: Group genre msgid "group-genre" msgstr "" # Default: Headshot msgid "headshot" msgstr "" # Default: Height msgid "height" msgstr "" # Default: Imdbindex msgid "imdbindex" msgstr "imdb индекс" # Default: In development msgid "in-development" msgstr "" # Default: Interview msgid "interview" msgstr "интервю" # Default: Interviews msgid "interviews" msgstr "интервюта" # Default: Introduction msgid "introduction" msgstr "въведение" # Default: Item msgid "item" msgstr "елемент" # Default: Keywords msgid "keywords" msgstr "ключови думи" # Default: Kind msgid "kind" msgstr "" # Default: Label msgid "label" msgstr "етикет" # Default: Laboratory msgid "laboratory" msgstr "лаборатория" # Default: Language msgid "language" msgstr "език" # Default: Languages msgid "languages" msgstr "езици" # Default: Laserdisc msgid "laserdisc" msgstr "лазерен диск" # Default: Laserdisc title msgid "laserdisc-title" msgstr "име на лазерения диск" # Default: Length msgid "length" msgstr "продължителност" # Default: Line msgid "line" msgstr "" # Default: Link msgid "link" msgstr "връзка" # Default: Link text msgid "link-text" msgstr "текст за връзката" # Default: Literature msgid "literature" msgstr "литература" # Default: Locations msgid 
"locations" msgstr "местоположения" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "" # Default: Long imdb name msgid "long-imdb-name" msgstr "" # Default: Long imdb title msgid "long-imdb-title" msgstr "" # Default: Magazine cover photo msgid "magazine-cover-photo" msgstr "" # Default: Make up msgid "make-up" msgstr "" # Default: Master format msgid "master-format" msgstr "" # Default: Median msgid "median" msgstr "" # Default: Merchandising links msgid "merchandising-links" msgstr "" # Default: Mini biography msgid "mini-biography" msgstr "" # Default: Misc links msgid "misc-links" msgstr "" # Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "" # Default: Movie msgid "movie" msgstr "филм" # Default: Mpaa msgid "mpaa" msgstr "mpaa" # Default: Music department msgid "music-department" msgstr "музикален департамент" # Default: Name msgid "name" msgstr "име" # Default: News msgid "news" msgstr "новини" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "" # Default: Nick names msgid "nick-names" msgstr "" # Default: Notes msgid "notes" msgstr "бележки" # Default: Novel msgid "novel" msgstr "роман" # Default: Number msgid "number" msgstr "номер" # Default: Number of chapter stops msgid "number-of-chapter-stops" msgstr "" # Default: Number of episodes msgid "number-of-episodes" msgstr "" # Default: Number of seasons msgid "number-of-seasons" msgstr "" # Default: Number of sides msgid "number-of-sides" msgstr "" # Default: Number of votes msgid "number-of-votes" msgstr "" # Default: Official retail price msgid "official-retail-price" msgstr "" # Default: Official sites msgid "official-sites" msgstr "официалени сайтове" # Default: Opening weekend msgid "opening-weekend" msgstr "" # 
Default: Original air date msgid "original-air-date" msgstr "" # Default: Original music msgid "original-music" msgstr "" # Default: Original title msgid "original-title" msgstr "" # Default: Other literature msgid "other-literature" msgstr "" # Default: Other works msgid "other-works" msgstr "" # Default: Parents guide msgid "parents-guide" msgstr "" # Default: Performed by msgid "performed-by" msgstr "" # Default: Person msgid "person" msgstr "персона" # Default: Photo sites msgid "photo-sites" msgstr "" # Default: Pictorial msgid "pictorial" msgstr "" # Default: Picture format msgid "picture-format" msgstr "" # Default: Plot msgid "plot" msgstr "" # Default: Plot outline msgid "plot-outline" msgstr "" # Default: Portrayed in msgid "portrayed-in" msgstr "" # Default: Pressing plant msgid "pressing-plant" msgstr "" # Default: Printed film format msgid "printed-film-format" msgstr "" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "" # Default: Producer msgid "producer" msgstr "продуцент" # Default: Production companies msgid "production-companies" msgstr "производствени компании" # Default: Production country msgid "production-country" msgstr "производствена страна" # Default: Production dates msgid "production-dates" msgstr "" # Default: Production design msgid "production-design" msgstr "" # Default: Production designer msgid "production-designer" msgstr "" # Default: Production manager msgid "production-manager" msgstr "" # Default: Production process protocol msgid "production-process-protocol" msgstr "" # Default: Quality of source msgid "quality-of-source" msgstr "" # Default: Quality program msgid "quality-program" msgstr "" # Default: Quote msgid "quote" msgstr "цитат" # Default: Quotes msgid "quotes" msgstr "цитати" # Default: Rating msgid "rating" msgstr "рейтинг" # Default: Recommendations msgid "recommendations" msgstr "препоръки" # Default: Referenced in msgid "referenced-in" msgstr "" # Default: References msgid "references" 
msgstr "референции" # Default: Region msgid "region" msgstr "регион" # Default: Release country msgid "release-country" msgstr "" # Default: Release date msgid "release-date" msgstr "" # Default: Release dates msgid "release-dates" msgstr "" # Default: Remade as msgid "remade-as" msgstr "" # Default: Remake of msgid "remake-of" msgstr "" # Default: Rentals msgid "rentals" msgstr "" # Default: Result msgid "result" msgstr "резултат" # Default: Review msgid "review" msgstr "рецензия" # Default: Review author msgid "review-author" msgstr "автор на рецензията" # Default: Review kind msgid "review-kind" msgstr "" # Default: Runtime msgid "runtime" msgstr "времетраене" # Default: Runtimes msgid "runtimes" msgstr "" # Default: Salary history msgid "salary-history" msgstr "" # Default: Screenplay teleplay msgid "screenplay-teleplay" msgstr "" # Default: Season msgid "season" msgstr "сезон" # Default: Second unit director or assistant director msgid "second-unit-director-or-assistant-director" msgstr "" # Default: Self msgid "self" msgstr "" # Default: Series animation department msgid "series-animation-department" msgstr "" # Default: Series art department msgid "series-art-department" msgstr "" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "" # Default: Series camera department msgid "series-camera-department" msgstr "" # Default: Series casting department msgid "series-casting-department" msgstr "" # Default: Series cinematographers msgid "series-cinematographers" msgstr "" # Default: Series costume department msgid "series-costume-department" msgstr "" # Default: Series editorial department msgid "series-editorial-department" msgstr "" # Default: Series editors msgid "series-editors" msgstr "" # Default: Series make up department msgid "series-make-up-department" msgstr "" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "" # Default: Series music department msgid "series-music-department" msgstr "" # Default: Series 
producers msgid "series-producers" msgstr "" # Default: Series production designers msgid "series-production-designers" msgstr "" # Default: Series production managers msgid "series-production-managers" msgstr "" # Default: Series sound department msgid "series-sound-department" msgstr "" # Default: Series special effects department msgid "series-special-effects-department" msgstr "" # Default: Series stunts msgid "series-stunts" msgstr "" # Default: Series title msgid "series-title" msgstr "име на серията" # Default: Series transportation department msgid "series-transportation-department" msgstr "" # Default: Series visual effects department msgid "series-visual-effects-department" msgstr "" # Default: Series writers msgid "series-writers" msgstr "" # Default: Series years msgid "series-years" msgstr "" # Default: Set decoration msgid "set-decoration" msgstr "" # Default: Sharpness msgid "sharpness" msgstr "" # Default: Similar to msgid "similar-to" msgstr "" # Default: Smart canonical episode title msgid "smart-canonical-episode-title" msgstr "" # Default: Smart canonical series title msgid "smart-canonical-series-title" msgstr "" # Default: Smart canonical title msgid "smart-canonical-title" msgstr "" # Default: Smart long imdb canonical title msgid "smart-long-imdb-canonical-title" msgstr "" # Default: Sound clips msgid "sound-clips" msgstr "" # Default: Sound crew msgid "sound-crew" msgstr "" # Default: Sound encoding msgid "sound-encoding" msgstr "" # Default: Sound mix msgid "sound-mix" msgstr "" # Default: Soundtrack msgid "soundtrack" msgstr "саундтрак" # Default: Spaciality msgid "spaciality" msgstr "" # Default: Special effects msgid "special-effects" msgstr "специални ефекти" # Default: Special effects companies msgid "special-effects-companies" msgstr "" # Default: Special effects department msgid "special-effects-department" msgstr "" # Default: Spin off msgid "spin-off" msgstr "" # Default: Spin off from msgid "spin-off-from" msgstr "" # Default: 
Spoofed in msgid "spoofed-in" msgstr "" # Default: Spoofs msgid "spoofs" msgstr "" # Default: Spouse msgid "spouse" msgstr "" # Default: Status of availablility msgid "status-of-availablility" msgstr "" # Default: Studio msgid "studio" msgstr "студио" # Default: Studios msgid "studios" msgstr "студиа" # Default: Stunt performer msgid "stunt-performer" msgstr "" # Default: Stunts msgid "stunts" msgstr "" # Default: Subtitles msgid "subtitles" msgstr "субтитри" # Default: Supplement msgid "supplement" msgstr "допълнение" # Default: Supplements msgid "supplements" msgstr "допълнения" # Default: Synopsis msgid "synopsis" msgstr "синопсис" # Default: Taglines msgid "taglines" msgstr "подзаглавия" # Default: Tech info msgid "tech-info" msgstr "" # Default: Thanks msgid "thanks" msgstr "" # Default: Time msgid "time" msgstr "време" # Default: Title msgid "title" msgstr "име" # Default: Titles in this product msgid "titles-in-this-product" msgstr "" # Default: To msgid "to" msgstr "" # Default: Top 250 rank msgid "top-250-rank" msgstr "" # Default: Trade mark msgid "trade-mark" msgstr "търговска марка" # Default: Transportation department msgid "transportation-department" msgstr "" # Default: Trivia msgid "trivia" msgstr "любопитно" # Default: Tv msgid "tv" msgstr "тв" # Default: Under license from msgid "under-license-from" msgstr "" # Default: Unknown link msgid "unknown-link" msgstr "" # Default: Upc msgid "upc" msgstr "" # Default: Version of msgid "version-of" msgstr "" # Default: Vhs msgid "vhs" msgstr "" # Default: Video msgid "video" msgstr "видео" # Default: Video artifacts msgid "video-artifacts" msgstr "" # Default: Video clips msgid "video-clips" msgstr "" # Default: Video noise msgid "video-noise" msgstr "" # Default: Video quality msgid "video-quality" msgstr "" # Default: Video standard msgid "video-standard" msgstr "" # Default: Visual effects msgid "visual-effects" msgstr "" # Default: Votes msgid "votes" msgstr "гласа" # Default: Votes distribution msgid 
"votes-distribution" msgstr "" # Default: Weekend gross msgid "weekend-gross" msgstr "" # Default: Where now msgid "where-now" msgstr "" # Default: With msgid "with" msgstr "" # Default: Writer msgid "writer" msgstr "сценарист" # Default: Written by msgid "written-by" msgstr "" # Default: Year msgid "year" msgstr "година" # Default: Zshops msgid "zshops" msgstr "" imdbpy-6.8/imdb/locale/imdbpy-de.po000066400000000000000000000471321351454127000172500ustar00rootroot00000000000000# Gettext message file for imdbpy # Translators: # Nils Welzk, 2013 # Raphael, 2014 msgid "" msgstr "" "Project-Id-Version: IMDbPY\n" "POT-Creation-Date: 2010-03-18 14:35+0000\n" "PO-Revision-Date: 2016-03-28 20:40+0000\n" "Last-Translator: Raphael\n" "Language-Team: German (http://www.transifex.com/davide_alberani/imdbpy/language/de/)\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Domain: imdbpy\n" "Language: de\n" "Language-Code: en\n" "Language-Name: English\n" "Plural-Forms: nplurals=2; plural=(n != 1);\n" "Preferred-Encodings: utf-8\n" # Default: Actor msgid "actor" msgstr "Schauspieler" # Default: Actress msgid "actress" msgstr "Schauspielerin" # Default: Adaption msgid "adaption" msgstr "" # Default: Additional information msgid "additional-information" msgstr "zusätzliche Information" # Default: Admissions msgid "admissions" msgstr "" # Default: Agent address msgid "agent-address" msgstr "" # Default: Airing msgid "airing" msgstr "" # Default: Akas msgid "akas" msgstr "Pseudonüme" # Default: Akas from release info msgid "akas-from-release-info" msgstr "" # Default: All products msgid "all-products" msgstr "Alle Produkte" # Default: Alternate language version of msgid "alternate-language-version-of" msgstr "" # Default: Alternate versions msgid "alternate-versions" msgstr "" # Default: Amazon reviews msgid "amazon-reviews" msgstr "Amazon Rezensionen" # Default: Analog left msgid "analog-left" msgstr "" # Default: Analog right 
msgid "analog-right" msgstr "" # Default: Animation department msgid "animation-department" msgstr "" # Default: Archive footage msgid "archive-footage" msgstr "" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr "" # Default: Art department msgid "art-department" msgstr "" # Default: Art direction msgid "art-direction" msgstr "" # Default: Art director msgid "art-director" msgstr "Art Director" # Default: Article msgid "article" msgstr "Artikel" # Default: Asin msgid "asin" msgstr "" # Default: Aspect ratio msgid "aspect-ratio" msgstr "Seitenverhältnis" # Default: Assigner msgid "assigner" msgstr "" # Default: Assistant director msgid "assistant-director" msgstr "" # Default: Auctions msgid "auctions" msgstr "" # Default: Audio noise msgid "audio-noise" msgstr "" # Default: Audio quality msgid "audio-quality" msgstr "Audio Qualität" # Default: Award msgid "award" msgstr "Auszeichnung" # Default: Awards msgid "awards" msgstr "Auszeichnungen" # Default: Biographical movies msgid "biographical-movies" msgstr "" # Default: Biography msgid "biography" msgstr "Biographie" # Default: Biography print msgid "biography-print" msgstr "" # Default: Birth date msgid "birth-date" msgstr "Geburtsdatum" # Default: Birth name msgid "birth-name" msgstr "Geburtsname" # Default: Birth notes msgid "birth-notes" msgstr "" # Default: Body msgid "body" msgstr "" # Default: Book msgid "book" msgstr "Buch" # Default: Books msgid "books" msgstr "Bücher" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "" # Default: Budget msgid "budget" msgstr "Kosten" # Default: Business msgid "business" msgstr "Geschäft" # Default: By arrangement with msgid "by-arrangement-with" msgstr "" # Default: Camera msgid "camera" msgstr "Kamera" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "" # Default: Canonical episode title msgid "canonical-episode-title" msgstr "" # Default: Canonical name msgid "canonical-name" msgstr "" # Default: Canonical 
series title msgid "canonical-series-title" msgstr "" # Default: Canonical title msgid "canonical-title" msgstr "" # Default: Cast msgid "cast" msgstr "Besetzung" # Default: Casting department msgid "casting-department" msgstr "" # Default: Casting director msgid "casting-director" msgstr "" # Default: Catalog number msgid "catalog-number" msgstr "" # Default: Category msgid "category" msgstr "Kategorie" # Default: Certificate msgid "certificate" msgstr "Zertifikat" # Default: Certificates msgid "certificates" msgstr "Zertifikate" # Default: Certification msgid "certification" msgstr "Bescheinigung" # Default: Channel msgid "channel" msgstr "Kanal" # Default: Character msgid "character" msgstr "" # Default: Cinematographer msgid "cinematographer" msgstr "" # Default: Cinematographic process msgid "cinematographic-process" msgstr "" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "" # Default: Color info msgid "color-info" msgstr "" # Default: Color information msgid "color-information" msgstr "" # Default: Color rendition msgid "color-rendition" msgstr "" # Default: Company msgid "company" msgstr "" # Default: Complete cast msgid "complete-cast" msgstr "" # Default: Complete crew msgid "complete-crew" msgstr "" # Default: Composer msgid "composer" msgstr "" # Default: Connections msgid "connections" msgstr "" # Default: Contrast msgid "contrast" msgstr "" # Default: Copyright holder msgid "copyright-holder" msgstr "" # Default: Costume department msgid "costume-department" msgstr "" # Default: Costume designer msgid "costume-designer" msgstr "" # Default: Countries msgid "countries" msgstr "Länder" # Default: Country msgid "country" msgstr "Land" # Default: Courtesy of msgid "courtesy-of" msgstr "" # Default: Cover msgid "cover" msgstr "Cover" # Default: Cover url msgid "cover-url" msgstr "" # Default: Crazy credits msgid "crazy-credits" msgstr "" # Default: Creator msgid "creator" msgstr "Ersteller" # Default: Current role msgid 
"current-role" msgstr "" # Default: Database msgid "database" msgstr "Datenbank" # Default: Date msgid "date" msgstr "Datum" # Default: Death date msgid "death-date" msgstr "" # Default: Death notes msgid "death-notes" msgstr "" # Default: Demographic msgid "demographic" msgstr "" # Default: Description msgid "description" msgstr "Beschreibung" # Default: Dialogue intellegibility msgid "dialogue-intellegibility" msgstr "" # Default: Digital sound msgid "digital-sound" msgstr "" # Default: Director msgid "director" msgstr "" # Default: Disc format msgid "disc-format" msgstr "" # Default: Disc size msgid "disc-size" msgstr "" # Default: Distributors msgid "distributors" msgstr "Händler" # Default: Dvd msgid "dvd" msgstr "DVD" # Default: Dvd features msgid "dvd-features" msgstr "" # Default: Dvd format msgid "dvd-format" msgstr "" # Default: Dvds msgid "dvds" msgstr "DVDs" # Default: Dynamic range msgid "dynamic-range" msgstr "" # Default: Edited from msgid "edited-from" msgstr "" # Default: Edited into msgid "edited-into" msgstr "" # Default: Editor msgid "editor" msgstr "" # Default: Editorial department msgid "editorial-department" msgstr "" # Default: Episode msgid "episode" msgstr "Episode" # Default: Episode of msgid "episode-of" msgstr "" # Default: Episode title msgid "episode-title" msgstr "Episodentitel" # Default: Episodes msgid "episodes" msgstr "Episoden" # Default: Episodes rating msgid "episodes-rating" msgstr "Episoden Bewertung" # Default: Essays msgid "essays" msgstr "" # Default: External reviews msgid "external-reviews" msgstr "" # Default: Faqs msgid "faqs" msgstr "FAQs" # Default: Feature msgid "feature" msgstr "" # Default: Featured in msgid "featured-in" msgstr "" # Default: Features msgid "features" msgstr "" # Default: Film negative format msgid "film-negative-format" msgstr "" # Default: Filming dates msgid "filming-dates" msgstr "" # Default: Filmography msgid "filmography" msgstr "Filmografie" # Default: Followed by msgid "followed-by" 
msgstr "gefolgt von" # Default: Follows msgid "follows" msgstr "folgt" # Default: For msgid "for" msgstr "für" # Default: Frequency response msgid "frequency-response" msgstr "" # Default: From msgid "from" msgstr "von" # Default: Full article link msgid "full-article-link" msgstr "" # Default: Full size cover url msgid "full-size-cover-url" msgstr "" # Default: Full size headshot msgid "full-size-headshot" msgstr "" # Default: Genres msgid "genres" msgstr "Genres" # Default: Goofs msgid "goofs" msgstr "" # Default: Gross msgid "gross" msgstr "" # Default: Group genre msgid "group-genre" msgstr "" # Default: Headshot msgid "headshot" msgstr "Portrait" # Default: Height msgid "height" msgstr "Höhe" # Default: Imdbindex msgid "imdbindex" msgstr "" # Default: In development msgid "in-development" msgstr "" # Default: Interview msgid "interview" msgstr "Interview" # Default: Interviews msgid "interviews" msgstr "Interviews" # Default: Introduction msgid "introduction" msgstr "Vorstellung" # Default: Item msgid "item" msgstr "" # Default: Keywords msgid "keywords" msgstr "Schlüsselwörter" # Default: Kind msgid "kind" msgstr "" # Default: Label msgid "label" msgstr "" # Default: Laboratory msgid "laboratory" msgstr "" # Default: Language msgid "language" msgstr "Sprache" # Default: Languages msgid "languages" msgstr "Sprachen" # Default: Laserdisc msgid "laserdisc" msgstr "Laserdisc" # Default: Laserdisc title msgid "laserdisc-title" msgstr "" # Default: Length msgid "length" msgstr "Länge" # Default: Line msgid "line" msgstr "" # Default: Link msgid "link" msgstr "Link" # Default: Link text msgid "link-text" msgstr "" # Default: Literature msgid "literature" msgstr "Literatur" # Default: Locations msgid "locations" msgstr "Standorte" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "" # 
Default: Long imdb name msgid "long-imdb-name" msgstr "" # Default: Long imdb title msgid "long-imdb-title" msgstr "" # Default: Magazine cover photo msgid "magazine-cover-photo" msgstr "" # Default: Make up msgid "make-up" msgstr "" # Default: Master format msgid "master-format" msgstr "" # Default: Median msgid "median" msgstr "" # Default: Merchandising links msgid "merchandising-links" msgstr "" # Default: Mini biography msgid "mini-biography" msgstr "" # Default: Misc links msgid "misc-links" msgstr "" # Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "" # Default: Movie msgid "movie" msgstr "Film" # Default: Mpaa msgid "mpaa" msgstr "" # Default: Music department msgid "music-department" msgstr "" # Default: Name msgid "name" msgstr "Name" # Default: News msgid "news" msgstr "Nachrichten" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "" # Default: Nick names msgid "nick-names" msgstr "Spitznamen" # Default: Notes msgid "notes" msgstr "Anmerkungen" # Default: Novel msgid "novel" msgstr "" # Default: Number msgid "number" msgstr "Zahl" # Default: Number of chapter stops msgid "number-of-chapter-stops" msgstr "" # Default: Number of episodes msgid "number-of-episodes" msgstr "" # Default: Number of seasons msgid "number-of-seasons" msgstr "" # Default: Number of sides msgid "number-of-sides" msgstr "" # Default: Number of votes msgid "number-of-votes" msgstr "" # Default: Official retail price msgid "official-retail-price" msgstr "" # Default: Official sites msgid "official-sites" msgstr "" # Default: Opening weekend msgid "opening-weekend" msgstr "" # Default: Original air date msgid "original-air-date" msgstr "" # Default: Original music msgid "original-music" msgstr "" # Default: Original title msgid "original-title" msgstr "" # Default: Other literature msgid "other-literature" msgstr "" # Default: Other works msgid "other-works" msgstr "" # Default: 
Parents guide msgid "parents-guide" msgstr "" # Default: Performed by msgid "performed-by" msgstr "" # Default: Person msgid "person" msgstr "" # Default: Photo sites msgid "photo-sites" msgstr "" # Default: Pictorial msgid "pictorial" msgstr "" # Default: Picture format msgid "picture-format" msgstr "" # Default: Plot msgid "plot" msgstr "Handlung" # Default: Plot outline msgid "plot-outline" msgstr "" # Default: Portrayed in msgid "portrayed-in" msgstr "" # Default: Pressing plant msgid "pressing-plant" msgstr "" # Default: Printed film format msgid "printed-film-format" msgstr "" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "" # Default: Producer msgid "producer" msgstr "Produzent" # Default: Production companies msgid "production-companies" msgstr "" # Default: Production country msgid "production-country" msgstr "" # Default: Production dates msgid "production-dates" msgstr "" # Default: Production design msgid "production-design" msgstr "" # Default: Production designer msgid "production-designer" msgstr "" # Default: Production manager msgid "production-manager" msgstr "" # Default: Production process protocol msgid "production-process-protocol" msgstr "" # Default: Quality of source msgid "quality-of-source" msgstr "" # Default: Quality program msgid "quality-program" msgstr "" # Default: Quote msgid "quote" msgstr "Zitat" # Default: Quotes msgid "quotes" msgstr "Zitate" # Default: Rating msgid "rating" msgstr "Bewertung" # Default: Recommendations msgid "recommendations" msgstr "" # Default: Referenced in msgid "referenced-in" msgstr "" # Default: References msgid "references" msgstr "" # Default: Region msgid "region" msgstr "Region" # Default: Release country msgid "release-country" msgstr "" # Default: Release date msgid "release-date" msgstr "Veröffentlichungsdatum" # Default: Release dates msgid "release-dates" msgstr "Veröffentlichungstermine" # Default: Remade as msgid "remade-as" msgstr "" # Default: Remake of msgid 
"remake-of" msgstr "Remake von" # Default: Rentals msgid "rentals" msgstr "Leigebühr" # Default: Result msgid "result" msgstr "Ergebnis" # Default: Review msgid "review" msgstr "Kritik" # Default: Review author msgid "review-author" msgstr "Kritik Autor" # Default: Review kind msgid "review-kind" msgstr "Kritik Art" # Default: Runtime msgid "runtime" msgstr "Laufzeit" # Default: Runtimes msgid "runtimes" msgstr "Laufzeiten" # Default: Salary history msgid "salary-history" msgstr "" # Default: Screenplay teleplay msgid "screenplay-teleplay" msgstr "" # Default: Season msgid "season" msgstr "" # Default: Second unit director or assistant director msgid "second-unit-director-or-assistant-director" msgstr "" # Default: Self msgid "self" msgstr "" # Default: Series animation department msgid "series-animation-department" msgstr "" # Default: Series art department msgid "series-art-department" msgstr "" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "" # Default: Series camera department msgid "series-camera-department" msgstr "" # Default: Series casting department msgid "series-casting-department" msgstr "" # Default: Series cinematographers msgid "series-cinematographers" msgstr "" # Default: Series costume department msgid "series-costume-department" msgstr "" # Default: Series editorial department msgid "series-editorial-department" msgstr "" # Default: Series editors msgid "series-editors" msgstr "" # Default: Series make up department msgid "series-make-up-department" msgstr "" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "" # Default: Series music department msgid "series-music-department" msgstr "" # Default: Series producers msgid "series-producers" msgstr "" # Default: Series production designers msgid "series-production-designers" msgstr "" # Default: Series production managers msgid "series-production-managers" msgstr "" # Default: Series sound department msgid "series-sound-department" msgstr "" # 
Default: Series special effects department msgid "series-special-effects-department" msgstr "" # Default: Series stunts msgid "series-stunts" msgstr "" # Default: Series title msgid "series-title" msgstr "" # Default: Series transportation department msgid "series-transportation-department" msgstr "" # Default: Series visual effects department msgid "series-visual-effects-department" msgstr "" # Default: Series writers msgid "series-writers" msgstr "" # Default: Series years msgid "series-years" msgstr "" # Default: Set decoration msgid "set-decoration" msgstr "" # Default: Sharpness msgid "sharpness" msgstr "" # Default: Similar to msgid "similar-to" msgstr "" # Default: Smart canonical episode title msgid "smart-canonical-episode-title" msgstr "" # Default: Smart canonical series title msgid "smart-canonical-series-title" msgstr "" # Default: Smart canonical title msgid "smart-canonical-title" msgstr "" # Default: Smart long imdb canonical title msgid "smart-long-imdb-canonical-title" msgstr "" # Default: Sound clips msgid "sound-clips" msgstr "" # Default: Sound crew msgid "sound-crew" msgstr "" # Default: Sound encoding msgid "sound-encoding" msgstr "" # Default: Sound mix msgid "sound-mix" msgstr "" # Default: Soundtrack msgid "soundtrack" msgstr "Soundtrack" # Default: Spaciality msgid "spaciality" msgstr "" # Default: Special effects msgid "special-effects" msgstr "" # Default: Special effects companies msgid "special-effects-companies" msgstr "" # Default: Special effects department msgid "special-effects-department" msgstr "" # Default: Spin off msgid "spin-off" msgstr "Nebenprodukt" # Default: Spin off from msgid "spin-off-from" msgstr "Nebenprodukt von" # Default: Spoofed in msgid "spoofed-in" msgstr "Parodiert in" # Default: Spoofs msgid "spoofs" msgstr "Parodie" # Default: Spouse msgid "spouse" msgstr "Gattin" # Default: Status of availablility msgid "status-of-availablility" msgstr "Verfügbarkeitsstatus" # Default: Studio msgid "studio" msgstr 
"Studio" # Default: Studios msgid "studios" msgstr "Studios" # Default: Stunt performer msgid "stunt-performer" msgstr "Stunt-Darsteller" # Default: Stunts msgid "stunts" msgstr "Stunts" # Default: Subtitles msgid "subtitles" msgstr "Untertitel" # Default: Supplement msgid "supplement" msgstr "Ergänzung" # Default: Supplements msgid "supplements" msgstr "Ergänzungen" # Default: Synopsis msgid "synopsis" msgstr "Zusammenfassung" # Default: Taglines msgid "taglines" msgstr "Slogan" # Default: Tech info msgid "tech-info" msgstr "" # Default: Thanks msgid "thanks" msgstr "Danke" # Default: Time msgid "time" msgstr "Zeit" # Default: Title msgid "title" msgstr "Titel" # Default: Titles in this product msgid "titles-in-this-product" msgstr "" # Default: To msgid "to" msgstr "" # Default: Top 250 rank msgid "top-250-rank" msgstr "Top 250 platzierung" # Default: Trade mark msgid "trade-mark" msgstr "Warenzeichen" # Default: Transportation department msgid "transportation-department" msgstr "" # Default: Trivia msgid "trivia" msgstr "Nichtigkeiten" # Default: Tv msgid "tv" msgstr "TV" # Default: Under license from msgid "under-license-from" msgstr "lizensiert von" # Default: Unknown link msgid "unknown-link" msgstr "" # Default: Upc msgid "upc" msgstr "" # Default: Version of msgid "version-of" msgstr "" # Default: Vhs msgid "vhs" msgstr "VHS" # Default: Video msgid "video" msgstr "Video" # Default: Video artifacts msgid "video-artifacts" msgstr "" # Default: Video clips msgid "video-clips" msgstr "" # Default: Video noise msgid "video-noise" msgstr "" # Default: Video quality msgid "video-quality" msgstr "Video Qualität" # Default: Video standard msgid "video-standard" msgstr "Video Standart" # Default: Visual effects msgid "visual-effects" msgstr "Visuelle Effekte" # Default: Votes msgid "votes" msgstr "Stimmen" # Default: Votes distribution msgid "votes-distribution" msgstr "" # Default: Weekend gross msgid "weekend-gross" msgstr "" # Default: Where now msgid "where-now" 
msgstr "" # Default: With msgid "with" msgstr "mit" # Default: Writer msgid "writer" msgstr "Autor" # Default: Written by msgid "written-by" msgstr "" # Default: Year msgid "year" msgstr "Jahr" # Default: Zshops msgid "zshops" msgstr "" imdbpy-6.8/imdb/locale/imdbpy-en.po000066400000000000000000000527261351454127000172670ustar00rootroot00000000000000# Gettext message file for imdbpy msgid "" msgstr "" "Project-Id-Version: imdbpy\n" "POT-Creation-Date: 2009-04-16 14:27+0000\n" "PO-Revision-Date: YYYY-MM-DD HH:MM+0000\n" "Last-Translator: YOUR NAME \n" "Language-Team: TEAM NAME \n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=2; plural=(n != 1);\n" "Language-Code: en\n" "Language-Name: English\n" "Preferred-Encodings: utf-8\n" "Domain: imdbpy\n" # Default: Actor msgid "actor" msgstr "Actor" # Default: Actress msgid "actress" msgstr "Actress" # Default: Adaption msgid "adaption" msgstr "Adaption" # Default: Additional information msgid "additional-information" msgstr "Additional information" # Default: Admissions msgid "admissions" msgstr "Admissions" # Default: Agent address msgid "agent-address" msgstr "Agent address" # Default: Airing msgid "airing" msgstr "Airing" # Default: Akas msgid "akas" msgstr "Akas" # Default: All products msgid "all-products" msgstr "All products" # Default: Alternate language version of msgid "alternate-language-version-of" msgstr "Alternate language version of" # Default: Alternate versions msgid "alternate-versions" msgstr "Alternate versions" # Default: Amazon reviews msgid "amazon-reviews" msgstr "Amazon reviews" # Default: Analog left msgid "analog-left" msgstr "Analog left" # Default: Analog right msgid "analog-right" msgstr "Analog right" # Default: Animation department msgid "animation-department" msgstr "Animation department" # Default: Archive footage msgid "archive-footage" msgstr "Archive footage" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr
"Arithmetic mean" # Default: Art department msgid "art-department" msgstr "Art department" # Default: Art direction msgid "art-direction" msgstr "Art direction" # Default: Art director msgid "art-director" msgstr "Art director" # Default: Article msgid "article" msgstr "Article" # Default: Asin msgid "asin" msgstr "Asin" # Default: Aspect ratio msgid "aspect-ratio" msgstr "Aspect ratio" # Default: Assigner msgid "assigner" msgstr "Assigner" # Default: Assistant director msgid "assistant-director" msgstr "Assistant director" # Default: Auctions msgid "auctions" msgstr "Auctions" # Default: Audio noise msgid "audio-noise" msgstr "Audio noise" # Default: Audio quality msgid "audio-quality" msgstr "Audio quality" # Default: Award msgid "award" msgstr "Award" # Default: Awards msgid "awards" msgstr "Awards" # Default: Biographical movies msgid "biographical-movies" msgstr "Biographical movies" # Default: Biography msgid "biography" msgstr "Biography" # Default: Biography print msgid "biography-print" msgstr "Biography print" # Default: Birth date msgid "birth-date" msgstr "Birth date" # Default: Birth name msgid "birth-name" msgstr "Birth name" # Default: Birth notes msgid "birth-notes" msgstr "Birth notes" # Default: Body msgid "body" msgstr "Body" # Default: Book msgid "book" msgstr "Book" # Default: Books msgid "books" msgstr "Books" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "Bottom 100 rank" # Default: Budget msgid "budget" msgstr "Budget" # Default: Business msgid "business" msgstr "Business" # Default: By arrangement with msgid "by-arrangement-with" msgstr "By arrangement with" # Default: Camera msgid "camera" msgstr "Camera" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "Camera and electrical department" # Default: Canonical episode title msgid "canonical-episode-title" msgstr "Canonical episode title" # Default: Canonical name msgid "canonical-name" msgstr "Canonical name" # Default: Canonical series 
title msgid "canonical-series-title" msgstr "Canonical series title" # Default: Canonical title msgid "canonical-title" msgstr "Canonical title" # Default: Cast msgid "cast" msgstr "Cast" # Default: Casting department msgid "casting-department" msgstr "Casting department" # Default: Casting director msgid "casting-director" msgstr "Casting director" # Default: Catalog number msgid "catalog-number" msgstr "Catalog number" # Default: Category msgid "category" msgstr "Category" # Default: Certificate msgid "certificate" msgstr "Certificate" # Default: Certificates msgid "certificates" msgstr "Certificates" # Default: Certification msgid "certification" msgstr "Certification" # Default: Channel msgid "channel" msgstr "Channel" # Default: Character msgid "character" msgstr "Character" # Default: Cinematographer msgid "cinematographer" msgstr "Cinematographer" # Default: Cinematographic process msgid "cinematographic-process" msgstr "Cinematographic process" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "Close captions teletext ld g" # Default: Color info msgid "color-info" msgstr "Color info" # Default: Color information msgid "color-information" msgstr "Color information" # Default: Color rendition msgid "color-rendition" msgstr "Color rendition" # Default: Company msgid "company" msgstr "Company" # Default: Complete cast msgid "complete-cast" msgstr "Complete cast" # Default: Complete crew msgid "complete-crew" msgstr "Complete crew" # Default: Composer msgid "composer" msgstr "Composer" # Default: Connections msgid "connections" msgstr "Connections" # Default: Contrast msgid "contrast" msgstr "Contrast" # Default: Copyright holder msgid "copyright-holder" msgstr "Copyright holder" # Default: Costume department msgid "costume-department" msgstr "Costume department" # Default: Costume designer msgid "costume-designer" msgstr "Costume designer" # Default: Countries msgid "countries" msgstr "Countries" # Default: Country msgid 
"country" msgstr "Country" # Default: Courtesy of msgid "courtesy-of" msgstr "Courtesy of" # Default: Cover msgid "cover" msgstr "Cover" # Default: Cover url msgid "cover-url" msgstr "Cover url" # Default: Crazy credits msgid "crazy-credits" msgstr "Crazy credits" # Default: Creator msgid "creator" msgstr "Creator" # Default: Current role msgid "current-role" msgstr "Current role" # Default: Database msgid "database" msgstr "Database" # Default: Date msgid "date" msgstr "Date" # Default: Death date msgid "death-date" msgstr "Death date" # Default: Death notes msgid "death-notes" msgstr "Death notes" # Default: Demographic msgid "demographic" msgstr "Demographic" # Default: Description msgid "description" msgstr "Description" # Default: Dialogue intellegibility msgid "dialogue-intellegibility" msgstr "Dialogue intellegibility" # Default: Digital sound msgid "digital-sound" msgstr "Digital sound" # Default: Director msgid "director" msgstr "Director" # Default: Disc format msgid "disc-format" msgstr "Disc format" # Default: Disc size msgid "disc-size" msgstr "Disc size" # Default: Distributors msgid "distributors" msgstr "Distributors" # Default: Dvd msgid "dvd" msgstr "Dvd" # Default: Dvd features msgid "dvd-features" msgstr "Dvd features" # Default: Dvd format msgid "dvd-format" msgstr "Dvd format" # Default: Dvds msgid "dvds" msgstr "Dvds" # Default: Dynamic range msgid "dynamic-range" msgstr "Dynamic range" # Default: Edited from msgid "edited-from" msgstr "Edited from" # Default: Edited into msgid "edited-into" msgstr "Edited into" # Default: Editor msgid "editor" msgstr "Editor" # Default: Editorial department msgid "editorial-department" msgstr "Editorial department" # Default: Episode msgid "episode" msgstr "Episode" # Default: Episode of msgid "episode-of" msgstr "Episode of" # Default: Episode title msgid "episode-title" msgstr "Episode title" # Default: Episodes msgid "episodes" msgstr "Episodes" # Default: Episodes rating msgid "episodes-rating" msgstr 
"Episodes rating" # Default: Essays msgid "essays" msgstr "Essays" # Default: External reviews msgid "external-reviews" msgstr "External reviews" # Default: Faqs msgid "faqs" msgstr "Faqs" # Default: Featured in msgid "featured-in" msgstr "Featured in" # Default: Features msgid "features" msgstr "Features" # Default: Film negative format msgid "film-negative-format" msgstr "Film negative format" # Default: Filming dates msgid "filming-dates" msgstr "Filming dates" # Default: Filmography msgid "filmography" msgstr "Filmography" # Default: Followed by msgid "followed-by" msgstr "Followed by" # Default: Follows msgid "follows" msgstr "Follows" # Default: For msgid "for" msgstr "For" # Default: Frequency response msgid "frequency-response" msgstr "Frequency response" # Default: From msgid "from" msgstr "From" # Default: Full article link msgid "full-article-link" msgstr "Full article link" # Default: Genres msgid "genres" msgstr "Genres" # Default: Goofs msgid "goofs" msgstr "Goofs" # Default: Gross msgid "gross" msgstr "Gross" # Default: Group genre msgid "group-genre" msgstr "Group genre" # Default: Headshot msgid "headshot" msgstr "Headshot" # Default: Height msgid "height" msgstr "Height" # Default: Imdbindex msgid "imdbindex" msgstr "Imdbindex" # Default: Interview msgid "interview" msgstr "Interview" # Default: Interviews msgid "interviews" msgstr "Interviews" # Default: Introduction msgid "introduction" msgstr "Introduction" # Default: Item msgid "item" msgstr "Item" # Default: Keywords msgid "keywords" msgstr "Keywords" # Default: Kind msgid "kind" msgstr "Kind" # Default: Label msgid "label" msgstr "Label" # Default: Laboratory msgid "laboratory" msgstr "Laboratory" # Default: Language msgid "language" msgstr "Language" # Default: Languages msgid "languages" msgstr "Languages" # Default: Laserdisc msgid "laserdisc" msgstr "Laserdisc" # Default: Laserdisc title msgid "laserdisc-title" msgstr "Laserdisc title" # Default: Length msgid "length" msgstr "Length" # 
Default: Line msgid "line" msgstr "Line" # Default: Link msgid "link" msgstr "Link" # Default: Link text msgid "link-text" msgstr "Link text" # Default: Literature msgid "literature" msgstr "Literature" # Default: Locations msgid "locations" msgstr "Locations" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "Long imdb canonical name" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "Long imdb canonical title" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "Long imdb episode title" # Default: Long imdb name msgid "long-imdb-name" msgstr "Long imdb name" # Default: Long imdb title msgid "long-imdb-title" msgstr "Long imdb title" # Default: Magazine cover photo msgid "magazine-cover-photo" msgstr "Magazine cover photo" # Default: Make up msgid "make-up" msgstr "Make up" # Default: Master format msgid "master-format" msgstr "Master format" # Default: Median msgid "median" msgstr "Median" # Default: Merchandising links msgid "merchandising-links" msgstr "Merchandising links" # Default: Mini biography msgid "mini-biography" msgstr "Mini biography" # Default: Misc links msgid "misc-links" msgstr "Misc links" # Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "Miscellaneous companies" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "Miscellaneous crew" # Default: Movie msgid "movie" msgstr "Movie" # Default: Mpaa msgid "mpaa" msgstr "Mpaa" # Default: Music department msgid "music-department" msgstr "Music department" # Default: Name msgid "name" msgstr "Name" # Default: News msgid "news" msgstr "News" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "Newsgroup reviews" # Default: Nick names msgid "nick-names" msgstr "Nick names" # Default: Notes msgid "notes" msgstr "Notes" # Default: Novel msgid "novel" msgstr "Novel" # Default: Number msgid "number" msgstr "Number" # Default: Number of chapter stops msgid "number-of-chapter-stops" msgstr 
"Number of chapter stops" # Default: Number of episodes msgid "number-of-episodes" msgstr "Number of episodes" # Default: Number of seasons msgid "number-of-seasons" msgstr "Number of seasons" # Default: Number of sides msgid "number-of-sides" msgstr "Number of sides" # Default: Number of votes msgid "number-of-votes" msgstr "Number of votes" # Default: Official retail price msgid "official-retail-price" msgstr "Official retail price" # Default: Official sites msgid "official-sites" msgstr "Official sites" # Default: Opening weekend msgid "opening-weekend" msgstr "Opening weekend" # Default: Original air date msgid "original-air-date" msgstr "Original air date" # Default: Original music msgid "original-music" msgstr "Original music" # Default: Original title msgid "original-title" msgstr "Original title" # Default: Other literature msgid "other-literature" msgstr "Other literature" # Default: Other works msgid "other-works" msgstr "Other works" # Default: Parents guide msgid "parents-guide" msgstr "Parents guide" # Default: Performed by msgid "performed-by" msgstr "Performed by" # Default: Person msgid "person" msgstr "Person" # Default: Photo sites msgid "photo-sites" msgstr "Photo sites" # Default: Pictorial msgid "pictorial" msgstr "Pictorial" # Default: Picture format msgid "picture-format" msgstr "Picture format" # Default: Plot msgid "plot" msgstr "Plot" # Default: Plot outline msgid "plot-outline" msgstr "Plot outline" # Default: Portrayed in msgid "portrayed-in" msgstr "Portrayed in" # Default: Pressing plant msgid "pressing-plant" msgstr "Pressing plant" # Default: Printed film format msgid "printed-film-format" msgstr "Printed film format" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "Printed media reviews" # Default: Producer msgid "producer" msgstr "Producer" # Default: Production companies msgid "production-companies" msgstr "Production companies" # Default: Production country msgid "production-country" msgstr "Production 
country" # Default: Production dates msgid "production-dates" msgstr "Production dates" # Default: Production design msgid "production-design" msgstr "Production design" # Default: Production designer msgid "production-designer" msgstr "Production designer" # Default: Production manager msgid "production-manager" msgstr "Production manager" # Default: Production process protocol msgid "production-process-protocol" msgstr "Production process protocol" # Default: Quality of source msgid "quality-of-source" msgstr "Quality of source" # Default: Quality program msgid "quality-program" msgstr "Quality program" # Default: Quote msgid "quote" msgstr "Quote" # Default: Quotes msgid "quotes" msgstr "Quotes" # Default: Rating msgid "rating" msgstr "Rating" # Default: Recommendations msgid "recommendations" msgstr "Recommendations" # Default: Referenced in msgid "referenced-in" msgstr "Referenced in" # Default: References msgid "references" msgstr "References" # Default: Region msgid "region" msgstr "Region" # Default: Release country msgid "release-country" msgstr "Release country" # Default: Release date msgid "release-date" msgstr "Release date" # Default: Release dates msgid "release-dates" msgstr "Release dates" # Default: Remade as msgid "remade-as" msgstr "Remade as" # Default: Remake of msgid "remake-of" msgstr "Remake of" # Default: Rentals msgid "rentals" msgstr "Rentals" # Default: Result msgid "result" msgstr "Result" # Default: Review msgid "review" msgstr "Review" # Default: Review author msgid "review-author" msgstr "Review author" # Default: Review kind msgid "review-kind" msgstr "Review kind" # Default: Runtime msgid "runtime" msgstr "Runtime" # Default: Runtimes msgid "runtimes" msgstr "Runtimes" # Default: Salary history msgid "salary-history" msgstr "Salary history" # Default: Screenplay teleplay msgid "screenplay-teleplay" msgstr "Screenplay teleplay" # Default: Season msgid "season" msgstr "Season" # Default: Second unit director or assistant director 
msgid "second-unit-director-or-assistant-director" msgstr "Second unit director or assistant director" # Default: Self msgid "self" msgstr "Self" # Default: Series animation department msgid "series-animation-department" msgstr "Series animation department" # Default: Series art department msgid "series-art-department" msgstr "Series art department" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "Series assistant directors" # Default: Series camera department msgid "series-camera-department" msgstr "Series camera department" # Default: Series casting department msgid "series-casting-department" msgstr "Series casting department" # Default: Series cinematographers msgid "series-cinematographers" msgstr "Series cinematographers" # Default: Series costume department msgid "series-costume-department" msgstr "Series costume department" # Default: Series editorial department msgid "series-editorial-department" msgstr "Series editorial department" # Default: Series editors msgid "series-editors" msgstr "Series editors" # Default: Series make up department msgid "series-make-up-department" msgstr "Series make up department" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "Series miscellaneous" # Default: Series music department msgid "series-music-department" msgstr "Series music department" # Default: Series producers msgid "series-producers" msgstr "Series producers" # Default: Series production designers msgid "series-production-designers" msgstr "Series production designers" # Default: Series production managers msgid "series-production-managers" msgstr "Series production managers" # Default: Series sound department msgid "series-sound-department" msgstr "Series sound department" # Default: Series special effects department msgid "series-special-effects-department" msgstr "Series special effects department" # Default: Series stunts msgid "series-stunts" msgstr "Series stunts" # Default: Series title msgid 
"series-title" msgstr "Series title" # Default: Series transportation department msgid "series-transportation-department" msgstr "Series transportation department" # Default: Series visual effects department msgid "series-visual-effects-department" msgstr "Series visual effects department" # Default: Series writers msgid "series-writers" msgstr "Series writers" # Default: Series years msgid "series-years" msgstr "Series years" # Default: Set decoration msgid "set-decoration" msgstr "Set decoration" # Default: Sharpness msgid "sharpness" msgstr "Sharpness" # Default: Similar to msgid "similar-to" msgstr "Similar to" # Default: Sound clips msgid "sound-clips" msgstr "Sound clips" # Default: Sound crew msgid "sound-crew" msgstr "Sound crew" # Default: Sound encoding msgid "sound-encoding" msgstr "Sound encoding" # Default: Sound mix msgid "sound-mix" msgstr "Sound mix" # Default: Soundtrack msgid "soundtrack" msgstr "Soundtrack" # Default: Spaciality msgid "spaciality" msgstr "Spaciality" # Default: Special effects msgid "special-effects" msgstr "Special effects" # Default: Special effects companies msgid "special-effects-companies" msgstr "Special effects companies" # Default: Special effects department msgid "special-effects-department" msgstr "Special effects department" # Default: Spin off msgid "spin-off" msgstr "Spin off" # Default: Spin off from msgid "spin-off-from" msgstr "Spin off from" # Default: Spoofed in msgid "spoofed-in" msgstr "Spoofed in" # Default: Spoofs msgid "spoofs" msgstr "Spoofs" # Default: Spouse msgid "spouse" msgstr "Spouse" # Default: Status of availablility msgid "status-of-availablility" msgstr "Status of availablility" # Default: Studio msgid "studio" msgstr "Studio" # Default: Studios msgid "studios" msgstr "Studios" # Default: Stunt performer msgid "stunt-performer" msgstr "Stunt performer" # Default: Stunts msgid "stunts" msgstr "Stunts" # Default: Subtitles msgid "subtitles" msgstr "Subtitles" # Default: Supplement msgid 
"supplement" msgstr "Supplement" # Default: Supplements msgid "supplements" msgstr "Supplements" # Default: Synopsis msgid "synopsis" msgstr "Synopsis" # Default: Taglines msgid "taglines" msgstr "Taglines" # Default: Tech info msgid "tech-info" msgstr "Tech info" # Default: Thanks msgid "thanks" msgstr "Thanks" # Default: Time msgid "time" msgstr "Time" # Default: Title msgid "title" msgstr "Title" # Default: Titles in this product msgid "titles-in-this-product" msgstr "Titles in this product" # Default: To msgid "to" msgstr "To" # Default: Top 250 rank msgid "top-250-rank" msgstr "Top 250 rank" # Default: Trade mark msgid "trade-mark" msgstr "Trade mark" # Default: Transportation department msgid "transportation-department" msgstr "Transportation department" # Default: Trivia msgid "trivia" msgstr "Trivia" # Default: Under license from msgid "under-license-from" msgstr "Under license from" # Default: Unknown link msgid "unknown-link" msgstr "Unknown link" # Default: Upc msgid "upc" msgstr "Upc" # Default: Version of msgid "version-of" msgstr "Version of" # Default: Vhs msgid "vhs" msgstr "Vhs" # Default: Video artifacts msgid "video-artifacts" msgstr "Video artifacts" # Default: Video clips msgid "video-clips" msgstr "Video clips" # Default: Video noise msgid "video-noise" msgstr "Video noise" # Default: Video quality msgid "video-quality" msgstr "Video quality" # Default: Video standard msgid "video-standard" msgstr "Video standard" # Default: Visual effects msgid "visual-effects" msgstr "Visual effects" # Default: Votes msgid "votes" msgstr "Votes" # Default: Votes distribution msgid "votes-distribution" msgstr "Votes distribution" # Default: Weekend gross msgid "weekend-gross" msgstr "Weekend gross" # Default: Where now msgid "where-now" msgstr "Where now" # Default: With msgid "with" msgstr "With" # Default: Writer msgid "writer" msgstr "Writer" # Default: Written by msgid "written-by" msgstr "Written by" # Default: Year msgid "year" msgstr "Year" # Default: 
Zshops msgid "zshops" msgstr "Zshops" imdbpy-6.8/imdb/locale/imdbpy-es.po000066400000000000000000000572751351454127000173000ustar00rootroot00000000000000# Gettext message file for imdbpy # Translators: # strel, 2013 msgid "" msgstr "" "Project-Id-Version: IMDbPY\n" "POT-Creation-Date: 2010-03-18 14:35+0000\n" "PO-Revision-Date: 2016-03-28 20:40+0000\n" "Last-Translator: strel\n" "Language-Team: Spanish (http://www.transifex.com/davide_alberani/imdbpy/language/es/)\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Domain: imdbpy\n" "Language: es\n" "Language-Code: en\n" "Language-Name: English\n" "Plural-Forms: nplurals=2; plural=(n != 1);\n" "Preferred-Encodings: utf-8\n" # Default: Actor msgid "actor" msgstr "actor" # Default: Actress msgid "actress" msgstr "actriz" # Default: Adaption msgid "adaption" msgstr "adaptación" # Default: Additional information msgid "additional-information" msgstr "información-adicional" # Default: Admissions msgid "admissions" msgstr "facturación" # Default: Agent address msgid "agent-address" msgstr "dirección-del-agente" # Default: Airing msgid "airing" msgstr "estreno-televisivo" # Default: Akas msgid "akas" msgstr "alias" # Default: Akas from release info msgid "akas-from-release-info" msgstr "alias-en-los-datos-de-publicación" # Default: All products msgid "all-products" msgstr "todos-los-productos" # Default: Alternate language version of msgid "alternate-language-version-of" msgstr "versión-con-distinto-idioma-de" # Default: Alternate versions msgid "alternate-versions" msgstr "versiones-distintas" # Default: Amazon reviews msgid "amazon-reviews" msgstr "revisiones-de-amazon" # Default: Analog left msgid "analog-left" msgstr "izquierda-analógico" # Default: Analog right msgid "analog-right" msgstr "análogo-derecha" # Default: Animation department msgid "animation-department" msgstr "departamento-de-animación" # Default: Archive footage msgid "archive-footage" msgstr 
"video-de-archivo" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr "media-aritmética" # Default: Art department msgid "art-department" msgstr "departamento-artístico" # Default: Art direction msgid "art-direction" msgstr "dirección-artística" # Default: Art director msgid "art-director" msgstr "director-artístico" # Default: Article msgid "article" msgstr "artículo" # Default: Asin msgid "asin" msgstr "asin" # Default: Aspect ratio msgid "aspect-ratio" msgstr "relación-de-aspecto" # Default: Assigner msgid "assigner" msgstr "director" # Default: Assistant director msgid "assistant-director" msgstr "asistente-de-dirección" # Default: Auctions msgid "auctions" msgstr "subastas" # Default: Audio noise msgid "audio-noise" msgstr "audio-ruido" # Default: Audio quality msgid "audio-quality" msgstr "calidad-del-audio" # Default: Award msgid "award" msgstr "premio" # Default: Awards msgid "awards" msgstr "premios" # Default: Biographical movies msgid "biographical-movies" msgstr "películas-biográficas" # Default: Biography msgid "biography" msgstr "biografía" # Default: Biography print msgid "biography-print" msgstr "biografía-impresa" # Default: Birth date msgid "birth-date" msgstr "fecha-de-nacimiento" # Default: Birth name msgid "birth-name" msgstr "nombre-de-pila" # Default: Birth notes msgid "birth-notes" msgstr "notas-de-nacimiento" # Default: Body msgid "body" msgstr "cuerpo" # Default: Book msgid "book" msgstr "libro" # Default: Books msgid "books" msgstr "libros" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "ranking-de-los-últimos-100" # Default: Budget msgid "budget" msgstr "presupuesto" # Default: Business msgid "business" msgstr "negocio" # Default: By arrangement with msgid "by-arrangement-with" msgstr "por-acuerdo-con" # Default: Camera msgid "camera" msgstr "cámara" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "departamento-de-cámara-y-eléctrico" # Default: Canonical episode title msgid 
"canonical-episode-title" msgstr "título-canónico-de-episodio" # Default: Canonical name msgid "canonical-name" msgstr "nombre-canónico" # Default: Canonical series title msgid "canonical-series-title" msgstr "título-canónico-de-serie" # Default: Canonical title msgid "canonical-title" msgstr "título-canónico" # Default: Cast msgid "cast" msgstr "selección" # Default: Casting department msgid "casting-department" msgstr "departamento-de-selección" # Default: Casting director msgid "casting-director" msgstr "director-de-selección" # Default: Catalog number msgid "catalog-number" msgstr "número-de-catálogo" # Default: Category msgid "category" msgstr "categoría" # Default: Certificate msgid "certificate" msgstr "certificado" # Default: Certificates msgid "certificates" msgstr "certificados" # Default: Certification msgid "certification" msgstr "certificación" # Default: Channel msgid "channel" msgstr "canal" # Default: Character msgid "character" msgstr "personaje" # Default: Cinematographer msgid "cinematographer" msgstr "técnico-de-cámara" # Default: Cinematographic process msgid "cinematographic-process" msgstr "proceso-de-rodaje" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "cerrar-subtítulos-teletexto-ld-g" # Default: Color info msgid "color-info" msgstr "info-de-color" # Default: Color information msgid "color-information" msgstr "información-de-color" # Default: Color rendition msgid "color-rendition" msgstr "fidelidad-de-color" # Default: Company msgid "company" msgstr "compañía" # Default: Complete cast msgid "complete-cast" msgstr "selección-completa" # Default: Complete crew msgid "complete-crew" msgstr "equipo-completo" # Default: Composer msgid "composer" msgstr "compositor" # Default: Connections msgid "connections" msgstr "conexiones" # Default: Contrast msgid "contrast" msgstr "contraste" # Default: Copyright holder msgid "copyright-holder" msgstr "tenerdor-del-copyright" # Default: Costume department msgid 
"costume-department" msgstr "departamento-de-vestuario" # Default: Costume designer msgid "costume-designer" msgstr "diseñador-de-vestuario" # Default: Countries msgid "countries" msgstr "países" # Default: Country msgid "country" msgstr "país" # Default: Courtesy of msgid "courtesy-of" msgstr "cortesía-de" # Default: Cover msgid "cover" msgstr "cubierta" # Default: Cover url msgid "cover-url" msgstr "url-de-la-cubierta" # Default: Crazy credits msgid "crazy-credits" msgstr "créditos-locos" # Default: Creator msgid "creator" msgstr "creador" # Default: Current role msgid "current-role" msgstr "papel-actual" # Default: Database msgid "database" msgstr "base-de-datos" # Default: Date msgid "date" msgstr "fecha" # Default: Death date msgid "death-date" msgstr "fecha-de-fallecimiento" # Default: Death notes msgid "death-notes" msgstr "notas-de-fallecimiento" # Default: Demographic msgid "demographic" msgstr "demografía" # Default: Description msgid "description" msgstr "descripción" # Default: Dialogue intellegibility msgid "dialogue-intellegibility" msgstr "diálogo-inteligible" # Default: Digital sound msgid "digital-sound" msgstr "sonido-digital" # Default: Director msgid "director" msgstr "director" # Default: Disc format msgid "disc-format" msgstr "formato-de-disco" # Default: Disc size msgid "disc-size" msgstr "tamaño-de-disco" # Default: Distributors msgid "distributors" msgstr "distribuidor" # Default: Dvd msgid "dvd" msgstr "dvd" # Default: Dvd features msgid "dvd-features" msgstr "características-de-dvd" # Default: Dvd format msgid "dvd-format" msgstr "formato-de-dvd" # Default: Dvds msgid "dvds" msgstr "dvds" # Default: Dynamic range msgid "dynamic-range" msgstr "rango-dinámico" # Default: Edited from msgid "edited-from" msgstr "editado-desde" # Default: Edited into msgid "edited-into" msgstr "editado-a" # Default: Editor msgid "editor" msgstr "editor" # Default: Editorial department msgid "editorial-department" msgstr "departamento-editorial" # Default: 
Episode msgid "episode" msgstr "episodio" # Default: Episode of msgid "episode-of" msgstr "episodio-de" # Default: Episode title msgid "episode-title" msgstr "título-del-episodio" # Default: Episodes msgid "episodes" msgstr "episodios" # Default: Episodes rating msgid "episodes-rating" msgstr "valoración-de-episodios" # Default: Essays msgid "essays" msgstr "ensayos" # Default: External reviews msgid "external-reviews" msgstr "reseñas-externas" # Default: Faqs msgid "faqs" msgstr "faqs-(preguntas-frecuentes)" # Default: Feature msgid "feature" msgstr "aparición" # Default: Featured in msgid "featured-in" msgstr "aparecido-en" # Default: Features msgid "features" msgstr "apariciones" # Default: Film negative format msgid "film-negative-format" msgstr "formato-de-negativo-de-película" # Default: Filming dates msgid "filming-dates" msgstr "fechas-de-filmación" # Default: Filmography msgid "filmography" msgstr "filmografía" # Default: Followed by msgid "followed-by" msgstr "seguido-por" # Default: Follows msgid "follows" msgstr "sigue-a" # Default: For msgid "for" msgstr "para" # Default: Frequency response msgid "frequency-response" msgstr "respuesta-de-frecuencia" # Default: From msgid "from" msgstr "desde" # Default: Full article link msgid "full-article-link" msgstr "enlace-al-artículo-completo" # Default: Full size cover url msgid "full-size-cover-url" msgstr "url-a-la-carátula-de-tamaño-completo" # Default: Full size headshot msgid "full-size-headshot" msgstr "retrato-de-tamaño-completo" # Default: Genres msgid "genres" msgstr "géneros" # Default: Goofs msgid "goofs" msgstr "gazapos" # Default: Gross msgid "gross" msgstr "recaudación-bruta" # Default: Group genre msgid "group-genre" msgstr "género-del-grupo" # Default: Headshot msgid "headshot" msgstr "retrato" # Default: Height msgid "height" msgstr "altura" # Default: Imdbindex msgid "imdbindex" msgstr "índice-imdb" # Default: In development msgid "in-development" msgstr "en-desarrollo" # Default: Interview
msgid "interview" msgstr "entrevista" # Default: Interviews msgid "interviews" msgstr "entrevistas" # Default: Introduction msgid "introduction" msgstr "introducción" # Default: Item msgid "item" msgstr "elemento" # Default: Keywords msgid "keywords" msgstr "palabras-clave" # Default: Kind msgid "kind" msgstr "clase" # Default: Label msgid "label" msgstr "etiqueta" # Default: Laboratory msgid "laboratory" msgstr "laboratorio" # Default: Language msgid "language" msgstr "idioma" # Default: Languages msgid "languages" msgstr "idiomas" # Default: Laserdisc msgid "laserdisc" msgstr "laserdisc" # Default: Laserdisc title msgid "laserdisc-title" msgstr "título-del-laserdisc" # Default: Length msgid "length" msgstr "duración" # Default: Line msgid "line" msgstr "línea" # Default: Link msgid "link" msgstr "enlace" # Default: Link text msgid "link-text" msgstr "texto-del-enlace" # Default: Literature msgid "literature" msgstr "escritos" # Default: Locations msgid "locations" msgstr "localizaciones" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "nombre-canónico-largo-de-imdb" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "título-canónico-largo-de-imdb" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "título-largo-de-episodio-de-imdb" # Default: Long imdb name msgid "long-imdb-name" msgstr "nombre-largo-de-imdb" # Default: Long imdb title msgid "long-imdb-title" msgstr "título-largo-de-imdb" # Default: Magazine cover photo msgid "magazine-cover-photo" msgstr "foto-de-cubierta-de-magazine" # Default: Make up msgid "make-up" msgstr "maquillaje" # Default: Master format msgid "master-format" msgstr "formato-maestro" # Default: Median msgid "median" msgstr "mediana" # Default: Merchandising links msgid "merchandising-links" msgstr "enlaces-de-merchandising" # Default: Mini biography msgid "mini-biography" msgstr "mini-biografía" # Default: Misc links msgid "misc-links" msgstr "enlaces-varios" #
Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "compañías-varias" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "personal-vario" # Default: Movie msgid "movie" msgstr "película" # Default: Mpaa msgid "mpaa" msgstr "mpaa" # Default: Music department msgid "music-department" msgstr "departamento-musical" # Default: Name msgid "name" msgstr "nombre" # Default: News msgid "news" msgstr "noticias" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "reseñas-de-grupos-de-noticias" # Default: Nick names msgid "nick-names" msgstr "apodos" # Default: Notes msgid "notes" msgstr "notas" # Default: Novel msgid "novel" msgstr "novela" # Default: Number msgid "number" msgstr "número" # Default: Number of chapter stops msgid "number-of-chapter-stops" msgstr "número-de-pausas-del-capítulo" # Default: Number of episodes msgid "number-of-episodes" msgstr "número-de-episodios" # Default: Number of seasons msgid "number-of-seasons" msgstr "número-de-temporadas" # Default: Number of sides msgid "number-of-sides" msgstr "número-de-caras" # Default: Number of votes msgid "number-of-votes" msgstr "número-de-votos" # Default: Official retail price msgid "official-retail-price" msgstr "precio-minorista-oficial" # Default: Official sites msgid "official-sites" msgstr "sitios-oficiales" # Default: Opening weekend msgid "opening-weekend" msgstr "fin-de-semana-inaugural" # Default: Original air date msgid "original-air-date" msgstr "fecha-de-emisión-original" # Default: Original music msgid "original-music" msgstr "música-original" # Default: Original title msgid "original-title" msgstr "título-original" # Default: Other literature msgid "other-literature" msgstr "otros-escritos" # Default: Other works msgid "other-works" msgstr "otros-trabajos" # Default: Parents guide msgid "parents-guide" msgstr "guía-parental" # Default: Performed by msgid "performed-by" msgstr "interpretado-por" # Default: Person msgid "person" msgstr "persona" # 
Default: Photo sites msgid "photo-sites" msgstr "lugares-fotografiados" # Default: Pictorial msgid "pictorial" msgstr "reportaje-fotográfico" # Default: Picture format msgid "picture-format" msgstr "formato-de-fotografía" # Default: Plot msgid "plot" msgstr "trama" # Default: Plot outline msgid "plot-outline" msgstr "resumen-de-la-trama" # Default: Portrayed in msgid "portrayed-in" msgstr "representado-en" # Default: Pressing plant msgid "pressing-plant" msgstr "fábrica-de-copias" # Default: Printed film format msgid "printed-film-format" msgstr "formato-de-película-impresa" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "reseñas-en-medios-escritos" # Default: Producer msgid "producer" msgstr "productor" # Default: Production companies msgid "production-companies" msgstr "compañías-de-la-producción" # Default: Production country msgid "production-country" msgstr "país-de-la-producción" # Default: Production dates msgid "production-dates" msgstr "fechas-de-producción" # Default: Production design msgid "production-design" msgstr "diseño-de-producción" # Default: Production designer msgid "production-designer" msgstr "diseñador-de-producción" # Default: Production manager msgid "production-manager" msgstr "director-de-producción" # Default: Production process protocol msgid "production-process-protocol" msgstr "protocolo-de-proceso-de-producción" # Default: Quality of source msgid "quality-of-source" msgstr "calidad-del-original" # Default: Quality program msgid "quality-program" msgstr "programa-de-calidad" # Default: Quote msgid "quote" msgstr "cita" # Default: Quotes msgid "quotes" msgstr "citas" # Default: Rating msgid "rating" msgstr "valoración" # Default: Recommendations msgid "recommendations" msgstr "recomendaciones" # Default: Referenced in msgid "referenced-in" msgstr "referenciado-en" # Default: References msgid "references" msgstr "referencias" # Default: Region msgid "region" msgstr "región" # Default: Release country msgid 
"release-country" msgstr "país-de-estreno" # Default: Release date msgid "release-date" msgstr "fecha-de-estreno" # Default: Release dates msgid "release-dates" msgstr "fechas-de-estreno" # Default: Remade as msgid "remade-as" msgstr "reversionado-como" # Default: Remake of msgid "remake-of" msgstr "refrito-de" # Default: Rentals msgid "rentals" msgstr "recaudación-por-alquileres" # Default: Result msgid "result" msgstr "resultado" # Default: Review msgid "review" msgstr "reseña" # Default: Review author msgid "review-author" msgstr "autor-de-la-reseña" # Default: Review kind msgid "review-kind" msgstr "tipo-de-reseña" # Default: Runtime msgid "runtime" msgstr "duración" # Default: Runtimes msgid "runtimes" msgstr "duraciones" # Default: Salary history msgid "salary-history" msgstr "historial-salarial" # Default: Screenplay teleplay msgid "screenplay-teleplay" msgstr "guiones-cinematrográfico-y-televisivo" # Default: Season msgid "season" msgstr "temporada" # Default: Second unit director or assistant director msgid "second-unit-director-or-assistant-director" msgstr "segundo-director-de-unidad-o-asistente-de-dirección" # Default: Self msgid "self" msgstr "auto" # Default: Series animation department msgid "series-animation-department" msgstr "departamento-de-animación-de-la-serie" # Default: Series art department msgid "series-art-department" msgstr "departamento-artístico-de-la-serie" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "asistentes-de-dirección-de-la-serie" # Default: Series camera department msgid "series-camera-department" msgstr "departamento-de-cámaras-de-la-serie" # Default: Series casting department msgid "series-casting-department" msgstr "departamento-de-selección-de-la-serie" # Default: Series cinematographers msgid "series-cinematographers" msgstr "técnicos-de-cámara-de-la-serie" # Default: Series costume department msgid "series-costume-department" msgstr "departamento-de-vestuario-de-la-serie" # Default: 
Series editorial department msgid "series-editorial-department" msgstr "departamento-editorial-de-la-serie" # Default: Series editors msgid "series-editors" msgstr "editores-de-la-serie" # Default: Series make up department msgid "series-make-up-department" msgstr "departamento-de-maquillaje-de-la-serie" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "series-varias" # Default: Series music department msgid "series-music-department" msgstr "departamento-musical-de-la-serie" # Default: Series producers msgid "series-producers" msgstr "productores-de-la-serie" # Default: Series production designers msgid "series-production-designers" msgstr "diseñadores-de-producción-de-la-serie" # Default: Series production managers msgid "series-production-managers" msgstr "directores-de-producción-de-la-serie" # Default: Series sound department msgid "series-sound-department" msgstr "departamento-de-sonido-de-la-serie" # Default: Series special effects department msgid "series-special-effects-department" msgstr "departamento-de-efectos-especiales-de-la-serie" # Default: Series stunts msgid "series-stunts" msgstr "acrobacias-de-la-serie" # Default: Series title msgid "series-title" msgstr "título" # Default: Series transportation department msgid "series-transportation-department" msgstr "departamento-de-transporte-de-la-serie" # Default: Series visual effects department msgid "series-visual-effects-department" msgstr "departamento-de-efectos-visuales-de-la-serie" # Default: Series writers msgid "series-writers" msgstr "guionistas-de-la-serie" # Default: Series years msgid "series-years" msgstr "años-de-la-serie" # Default: Set decoration msgid "set-decoration" msgstr "decoración-del-set" # Default: Sharpness msgid "sharpness" msgstr "agudeza" # Default: Similar to msgid "similar-to" msgstr "similar-a" # Default: Smart canonical episode title msgid "smart-canonical-episode-title" msgstr "título-canónico-inteligente-del-episodio" # Default: Smart canonical series 
title msgid "smart-canonical-series-title" msgstr "título-canónico-inteligente-de-la-serie" # Default: Smart canonical title msgid "smart-canonical-title" msgstr "título-canónico-inteligente" # Default: Smart long imdb canonical title msgid "smart-long-imdb-canonical-title" msgstr "título-canónico-inteligente-largo-de-imdb" # Default: Sound clips msgid "sound-clips" msgstr "audio-clips" # Default: Sound crew msgid "sound-crew" msgstr "equipo-de-audio" # Default: Sound encoding msgid "sound-encoding" msgstr "compresión-de-audio" # Default: Sound mix msgid "sound-mix" msgstr "mezcla-de-audio" # Default: Soundtrack msgid "soundtrack" msgstr "banda-sonora" # Default: Spaciality msgid "spaciality" msgstr "espacialidad" # Default: Special effects msgid "special-effects" msgstr "efectos-especiales" # Default: Special effects companies msgid "special-effects-companies" msgstr "compañías-de-efectos-especiales" # Default: Special effects department msgid "special-effects-department" msgstr "departamento-de-efectos-especiales" # Default: Spin off msgid "spin-off" msgstr "secuela" # Default: Spin off from msgid "spin-off-from" msgstr "secuela-de" # Default: Spoofed in msgid "spoofed-in" msgstr "parodiado-en" # Default: Spoofs msgid "spoofs" msgstr "parodias" # Default: Spouse msgid "spouse" msgstr "esposa" # Default: Status of availablility msgid "status-of-availablility" msgstr "estado-de-disponibilidad" # Default: Studio msgid "studio" msgstr "estudio" # Default: Studios msgid "studios" msgstr "estudios" # Default: Stunt performer msgid "stunt-performer" msgstr "especialista-de-acrobacias" # Default: Stunts msgid "stunts" msgstr "acrobacias" # Default: Subtitles msgid "subtitles" msgstr "subtítulos" # Default: Supplement msgid "supplement" msgstr "suplemento" # Default: Supplements msgid "supplements" msgstr "suplementos" # Default: Synopsis msgid "synopsis" msgstr "sinopsis" # Default: Taglines msgid "taglines" msgstr "eslogan" # Default: Tech info msgid "tech-info" msgstr
"información-técnica" # Default: Thanks msgid "thanks" msgstr "gracias" # Default: Time msgid "time" msgstr "hora" # Default: Title msgid "title" msgstr "título" # Default: Titles in this product msgid "titles-in-this-product" msgstr "títulos-en-este-producto" # Default: To msgid "to" msgstr "a" # Default: Top 250 rank msgid "top-250-rank" msgstr "primeros-250-de-la-clasificación" # Default: Trade mark msgid "trade-mark" msgstr "marca-registrada" # Default: Transportation department msgid "transportation-department" msgstr "departamento-de-transporte" # Default: Trivia msgid "trivia" msgstr "curiosidades" # Default: Tv msgid "tv" msgstr "tv" # Default: Under license from msgid "under-license-from" msgstr "bajo-licencia-de" # Default: Unknown link msgid "unknown-link" msgstr "enlace-desconocido" # Default: Upc msgid "upc" msgstr "upc" # Default: Version of msgid "version-of" msgstr "versión-de" # Default: Vhs msgid "vhs" msgstr "vhs" # Default: Video msgid "video" msgstr "vídeo" # Default: Video artifacts msgid "video-artifacts" msgstr "efectos-de-vídeo" # Default: Video clips msgid "video-clips" msgstr "vídeo-clips" # Default: Video noise msgid "video-noise" msgstr "vídeo-ruido" # Default: Video quality msgid "video-quality" msgstr "calidad-de-vídeo" # Default: Video standard msgid "video-standard" msgstr "estandar-de-vídeo" # Default: Visual effects msgid "visual-effects" msgstr "efectos-visuales" # Default: Votes msgid "votes" msgstr "votos" # Default: Votes distribution msgid "votes-distribution" msgstr "distribución-de-votos" # Default: Weekend gross msgid "weekend-gross" msgstr "recaudación-bruta-del-fin-de-semana" # Default: Where now msgid "where-now" msgstr "dónde-está-ahora" # Default: With msgid "with" msgstr "con" # Default: Writer msgid "writer" msgstr "escritor" # Default: Written by msgid "written-by" msgstr "escrito-por" # Default: Year msgid "year" msgstr "año" # Default: Zshops msgid "zshops" msgstr "amazon-zshops" 
imdbpy-6.8/imdb/locale/imdbpy-fr.po000066400000000000000000000463401351454127000172670ustar00rootroot00000000000000# Gettext message file for imdbpy # Translators: # lukophron, 2014-2016 # Rajaa Jalil , 2013 # lkppo, 2012 msgid "" msgstr "" "Project-Id-Version: IMDbPY\n" "POT-Creation-Date: 2010-03-18 14:35+0000\n" "PO-Revision-Date: 2016-03-20 05:27+0000\n" "Last-Translator: lukophron\n" "Language-Team: French (http://www.transifex.com/davide_alberani/imdbpy/language/fr/)\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Domain: imdbpy\n" "Language: fr\n" "Language-Code: fr\n" "Language-Name: French\n" "Plural-Forms: nplurals=2; plural=(n > 1);\n" "Preferred-Encodings: utf-8\n" # Default: Actor msgid "actor" msgstr "acteur" # Default: Actress msgid "actress" msgstr "actrice" # Default: Adaption msgid "adaption" msgstr "adaptation" # Default: Additional information msgid "additional-information" msgstr "information-additionnelle" # Default: Admissions msgid "admissions" msgstr "entrées" # Default: Agent address msgid "agent-address" msgstr "" # Default: Airing msgid "airing" msgstr "en-diffusion" # Default: Akas msgid "akas" msgstr "alias" # Default: Akas from release info msgid "akas-from-release-info" msgstr "alias-depuis-info-sortie" # Default: All products msgid "all-products" msgstr "" # Default: Alternate language version of msgid "alternate-language-version-of" msgstr "" # Default: Alternate versions msgid "alternate-versions" msgstr "" # Default: Amazon reviews msgid "amazon-reviews" msgstr "critiques-amazon" # Default: Analog left msgid "analog-left" msgstr "" # Default: Analog right msgid "analog-right" msgstr "" # Default: Animation department msgid "animation-department" msgstr "département-animation" # Default: Archive footage msgid "archive-footage" msgstr "" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr "" # Default: Art department msgid "art-department" msgstr "" # Default: Art 
direction msgid "art-direction" msgstr "" # Default: Art director msgid "art-director" msgstr "" # Default: Article msgid "article" msgstr "article" # Default: Asin msgid "asin" msgstr "asin" # Default: Aspect ratio msgid "aspect-ratio" msgstr "" # Default: Assigner msgid "assigner" msgstr "" # Default: Assistant director msgid "assistant-director" msgstr "" # Default: Auctions msgid "auctions" msgstr "" # Default: Audio noise msgid "audio-noise" msgstr "" # Default: Audio quality msgid "audio-quality" msgstr "" # Default: Award msgid "award" msgstr "récompense" # Default: Awards msgid "awards" msgstr "récompenses" # Default: Biographical movies msgid "biographical-movies" msgstr "" # Default: Biography msgid "biography" msgstr "biographie" # Default: Biography print msgid "biography-print" msgstr "" # Default: Birth date msgid "birth-date" msgstr "" # Default: Birth name msgid "birth-name" msgstr "" # Default: Birth notes msgid "birth-notes" msgstr "" # Default: Body msgid "body" msgstr "Corps" # Default: Book msgid "book" msgstr "livre" # Default: Books msgid "books" msgstr "livres" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "" # Default: Budget msgid "budget" msgstr "budget" # Default: Business msgid "business" msgstr "" # Default: By arrangement with msgid "by-arrangement-with" msgstr "" # Default: Camera msgid "camera" msgstr "camera" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "" # Default: Canonical episode title msgid "canonical-episode-title" msgstr "" # Default: Canonical name msgid "canonical-name" msgstr "" # Default: Canonical series title msgid "canonical-series-title" msgstr "" # Default: Canonical title msgid "canonical-title" msgstr "" # Default: Cast msgid "cast" msgstr "" # Default: Casting department msgid "casting-department" msgstr "" # Default: Casting director msgid "casting-director" msgstr "" # Default: Catalog number msgid "catalog-number" msgstr "" # Default: Category msgid 
"category" msgstr "catégorie" # Default: Certificate msgid "certificate" msgstr "certificat" # Default: Certificates msgid "certificates" msgstr "" # Default: Certification msgid "certification" msgstr "" # Default: Channel msgid "channel" msgstr "chaîne" # Default: Character msgid "character" msgstr "" # Default: Cinematographer msgid "cinematographer" msgstr "" # Default: Cinematographic process msgid "cinematographic-process" msgstr "" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "" # Default: Color info msgid "color-info" msgstr "" # Default: Color information msgid "color-information" msgstr "" # Default: Color rendition msgid "color-rendition" msgstr "" # Default: Company msgid "company" msgstr "société" # Default: Complete cast msgid "complete-cast" msgstr "" # Default: Complete crew msgid "complete-crew" msgstr "" # Default: Composer msgid "composer" msgstr "" # Default: Connections msgid "connections" msgstr "" # Default: Contrast msgid "contrast" msgstr "" # Default: Copyright holder msgid "copyright-holder" msgstr "" # Default: Costume department msgid "costume-department" msgstr "" # Default: Costume designer msgid "costume-designer" msgstr "" # Default: Countries msgid "countries" msgstr "pays" # Default: Country msgid "country" msgstr "pays" # Default: Courtesy of msgid "courtesy-of" msgstr "" # Default: Cover msgid "cover" msgstr "couverture" # Default: Cover url msgid "cover-url" msgstr "" # Default: Crazy credits msgid "crazy-credits" msgstr "" # Default: Creator msgid "creator" msgstr "créateur" # Default: Current role msgid "current-role" msgstr "" # Default: Database msgid "database" msgstr "base de données" # Default: Date msgid "date" msgstr "date" # Default: Death date msgid "death-date" msgstr "" # Default: Death notes msgid "death-notes" msgstr "" # Default: Demographic msgid "demographic" msgstr "" # Default: Description msgid "description" msgstr "description" # Default: Dialogue intellegibility 
msgid "dialogue-intellegibility" msgstr "" # Default: Digital sound msgid "digital-sound" msgstr "" # Default: Director msgid "director" msgstr "directeur" # Default: Disc format msgid "disc-format" msgstr "" # Default: Disc size msgid "disc-size" msgstr "" # Default: Distributors msgid "distributors" msgstr "distributeurs" # Default: Dvd msgid "dvd" msgstr "dvd" # Default: Dvd features msgid "dvd-features" msgstr "" # Default: Dvd format msgid "dvd-format" msgstr "" # Default: Dvds msgid "dvds" msgstr "dvds" # Default: Dynamic range msgid "dynamic-range" msgstr "" # Default: Edited from msgid "edited-from" msgstr "" # Default: Edited into msgid "edited-into" msgstr "" # Default: Editor msgid "editor" msgstr "éditeur" # Default: Editorial department msgid "editorial-department" msgstr "" # Default: Episode msgid "episode" msgstr "épisode" # Default: Episode of msgid "episode-of" msgstr "" # Default: Episode title msgid "episode-title" msgstr "" # Default: Episodes msgid "episodes" msgstr "épisodes" # Default: Episodes rating msgid "episodes-rating" msgstr "" # Default: Essays msgid "essays" msgstr "" # Default: External reviews msgid "external-reviews" msgstr "" # Default: Faqs msgid "faqs" msgstr "faqs" # Default: Feature msgid "feature" msgstr "Caractéristique" # Default: Featured in msgid "featured-in" msgstr "" # Default: Features msgid "features" msgstr "caractéristiques" # Default: Film negative format msgid "film-negative-format" msgstr "" # Default: Filming dates msgid "filming-dates" msgstr "" # Default: Filmography msgid "filmography" msgstr "" # Default: Followed by msgid "followed-by" msgstr "" # Default: Follows msgid "follows" msgstr "" # Default: For msgid "for" msgstr "pour" # Default: Frequency response msgid "frequency-response" msgstr "" # Default: From msgid "from" msgstr "de" # Default: Full article link msgid "full-article-link" msgstr "" # Default: Full size cover url msgid "full-size-cover-url" msgstr "" # Default: Full size headshot msgid 
"full-size-headshot" msgstr "" # Default: Genres msgid "genres" msgstr "genres" # Default: Goofs msgid "goofs" msgstr "" # Default: Gross msgid "gross" msgstr "" # Default: Group genre msgid "group-genre" msgstr "" # Default: Headshot msgid "headshot" msgstr "" # Default: Height msgid "height" msgstr "hauteur" # Default: Imdbindex msgid "imdbindex" msgstr "" # Default: In development msgid "in-development" msgstr "" # Default: Interview msgid "interview" msgstr "interview" # Default: Interviews msgid "interviews" msgstr "" # Default: Introduction msgid "introduction" msgstr "introduction" # Default: Item msgid "item" msgstr "élément" # Default: Keywords msgid "keywords" msgstr "" # Default: Kind msgid "kind" msgstr "" # Default: Label msgid "label" msgstr "" # Default: Laboratory msgid "laboratory" msgstr "laboratoire" # Default: Language msgid "language" msgstr "langue" # Default: Languages msgid "languages" msgstr "langues" # Default: Laserdisc msgid "laserdisc" msgstr "" # Default: Laserdisc title msgid "laserdisc-title" msgstr "" # Default: Length msgid "length" msgstr "" # Default: Line msgid "line" msgstr "ligne" # Default: Link msgid "link" msgstr "lien" # Default: Link text msgid "link-text" msgstr "" # Default: Literature msgid "literature" msgstr "" # Default: Locations msgid "locations" msgstr "" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "" # Default: Long imdb name msgid "long-imdb-name" msgstr "" # Default: Long imdb title msgid "long-imdb-title" msgstr "" # Default: Magazine cover photo msgid "magazine-cover-photo" msgstr "" # Default: Make up msgid "make-up" msgstr "" # Default: Master format msgid "master-format" msgstr "" # Default: Median msgid "median" msgstr "" # Default: Merchandising links msgid "merchandising-links" msgstr "" # Default: Mini biography 
msgid "mini-biography" msgstr "" # Default: Misc links msgid "misc-links" msgstr "" # Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "" # Default: Movie msgid "movie" msgstr "film" # Default: Mpaa msgid "mpaa" msgstr "mpaa" # Default: Music department msgid "music-department" msgstr "" # Default: Name msgid "name" msgstr "nom" # Default: News msgid "news" msgstr "actualités" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "" # Default: Nick names msgid "nick-names" msgstr "" # Default: Notes msgid "notes" msgstr "notes" # Default: Novel msgid "novel" msgstr "nouvelle" # Default: Number msgid "number" msgstr "numéro" # Default: Number of chapter stops msgid "number-of-chapter-stops" msgstr "" # Default: Number of episodes msgid "number-of-episodes" msgstr "" # Default: Number of seasons msgid "number-of-seasons" msgstr "" # Default: Number of sides msgid "number-of-sides" msgstr "" # Default: Number of votes msgid "number-of-votes" msgstr "" # Default: Official retail price msgid "official-retail-price" msgstr "" # Default: Official sites msgid "official-sites" msgstr "" # Default: Opening weekend msgid "opening-weekend" msgstr "" # Default: Original air date msgid "original-air-date" msgstr "" # Default: Original music msgid "original-music" msgstr "" # Default: Original title msgid "original-title" msgstr "" # Default: Other literature msgid "other-literature" msgstr "" # Default: Other works msgid "other-works" msgstr "" # Default: Parents guide msgid "parents-guide" msgstr "" # Default: Performed by msgid "performed-by" msgstr "" # Default: Person msgid "person" msgstr "" # Default: Photo sites msgid "photo-sites" msgstr "" # Default: Pictorial msgid "pictorial" msgstr "" # Default: Picture format msgid "picture-format" msgstr "" # Default: Plot msgid "plot" msgstr "" # Default: Plot outline msgid "plot-outline" msgstr "" # Default: Portrayed in msgid 
"portrayed-in" msgstr "" # Default: Pressing plant msgid "pressing-plant" msgstr "" # Default: Printed film format msgid "printed-film-format" msgstr "" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "" # Default: Producer msgid "producer" msgstr "producteur" # Default: Production companies msgid "production-companies" msgstr "" # Default: Production country msgid "production-country" msgstr "" # Default: Production dates msgid "production-dates" msgstr "" # Default: Production design msgid "production-design" msgstr "" # Default: Production designer msgid "production-designer" msgstr "" # Default: Production manager msgid "production-manager" msgstr "" # Default: Production process protocol msgid "production-process-protocol" msgstr "" # Default: Quality of source msgid "quality-of-source" msgstr "" # Default: Quality program msgid "quality-program" msgstr "" # Default: Quote msgid "quote" msgstr "citation" # Default: Quotes msgid "quotes" msgstr "citations" # Default: Rating msgid "rating" msgstr "" # Default: Recommendations msgid "recommendations" msgstr "" # Default: Referenced in msgid "referenced-in" msgstr "" # Default: References msgid "references" msgstr "références" # Default: Region msgid "region" msgstr "" # Default: Release country msgid "release-country" msgstr "" # Default: Release date msgid "release-date" msgstr "" # Default: Release dates msgid "release-dates" msgstr "" # Default: Remade as msgid "remade-as" msgstr "" # Default: Remake of msgid "remake-of" msgstr "" # Default: Rentals msgid "rentals" msgstr "" # Default: Result msgid "result" msgstr "résultat" # Default: Review msgid "review" msgstr "revue" # Default: Review author msgid "review-author" msgstr "" # Default: Review kind msgid "review-kind" msgstr "" # Default: Runtime msgid "runtime" msgstr "" # Default: Runtimes msgid "runtimes" msgstr "" # Default: Salary history msgid "salary-history" msgstr "" # Default: Screenplay teleplay msgid "screenplay-teleplay" 
msgstr "" # Default: Season msgid "season" msgstr "saison" # Default: Second unit director or assistant director msgid "second-unit-director-or-assistant-director" msgstr "" # Default: Self msgid "self" msgstr "" # Default: Series animation department msgid "series-animation-department" msgstr "" # Default: Series art department msgid "series-art-department" msgstr "" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "" # Default: Series camera department msgid "series-camera-department" msgstr "" # Default: Series casting department msgid "series-casting-department" msgstr "" # Default: Series cinematographers msgid "series-cinematographers" msgstr "" # Default: Series costume department msgid "series-costume-department" msgstr "" # Default: Series editorial department msgid "series-editorial-department" msgstr "" # Default: Series editors msgid "series-editors" msgstr "" # Default: Series make up department msgid "series-make-up-department" msgstr "" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "" # Default: Series music department msgid "series-music-department" msgstr "" # Default: Series producers msgid "series-producers" msgstr "" # Default: Series production designers msgid "series-production-designers" msgstr "" # Default: Series production managers msgid "series-production-managers" msgstr "" # Default: Series sound department msgid "series-sound-department" msgstr "" # Default: Series special effects department msgid "series-special-effects-department" msgstr "" # Default: Series stunts msgid "series-stunts" msgstr "" # Default: Series title msgid "series-title" msgstr "" # Default: Series transportation department msgid "series-transportation-department" msgstr "" # Default: Series visual effects department msgid "series-visual-effects-department" msgstr "" # Default: Series writers msgid "series-writers" msgstr "" # Default: Series years msgid "series-years" msgstr "" # Default: Set decoration msgid 
"set-decoration" msgstr "" # Default: Sharpness msgid "sharpness" msgstr "" # Default: Similar to msgid "similar-to" msgstr "" # Default: Smart canonical episode title msgid "smart-canonical-episode-title" msgstr "" # Default: Smart canonical series title msgid "smart-canonical-series-title" msgstr "" # Default: Smart canonical title msgid "smart-canonical-title" msgstr "" # Default: Smart long imdb canonical title msgid "smart-long-imdb-canonical-title" msgstr "" # Default: Sound clips msgid "sound-clips" msgstr "" # Default: Sound crew msgid "sound-crew" msgstr "" # Default: Sound encoding msgid "sound-encoding" msgstr "" # Default: Sound mix msgid "sound-mix" msgstr "" # Default: Soundtrack msgid "soundtrack" msgstr "" # Default: Spaciality msgid "spaciality" msgstr "" # Default: Special effects msgid "special-effects" msgstr "" # Default: Special effects companies msgid "special-effects-companies" msgstr "" # Default: Special effects department msgid "special-effects-department" msgstr "" # Default: Spin off msgid "spin-off" msgstr "" # Default: Spin off from msgid "spin-off-from" msgstr "" # Default: Spoofed in msgid "spoofed-in" msgstr "" # Default: Spoofs msgid "spoofs" msgstr "" # Default: Spouse msgid "spouse" msgstr "époux" # Default: Status of availablility msgid "status-of-availablility" msgstr "" # Default: Studio msgid "studio" msgstr "studio" # Default: Studios msgid "studios" msgstr "studios" # Default: Stunt performer msgid "stunt-performer" msgstr "" # Default: Stunts msgid "stunts" msgstr "" # Default: Subtitles msgid "subtitles" msgstr "sous-titres" # Default: Supplement msgid "supplement" msgstr "bonus" # Default: Supplements msgid "supplements" msgstr "bonus" # Default: Synopsis msgid "synopsis" msgstr "synopsis" # Default: Taglines msgid "taglines" msgstr "" # Default: Tech info msgid "tech-info" msgstr "" # Default: Thanks msgid "thanks" msgstr "" # Default: Time msgid "time" msgstr "temps" # Default: Title msgid "title" msgstr "titre" # 
Default: Titles in this product msgid "titles-in-this-product" msgstr "" # Default: To msgid "to" msgstr "" # Default: Top 250 rank msgid "top-250-rank" msgstr "" # Default: Trade mark msgid "trade-mark" msgstr "" # Default: Transportation department msgid "transportation-department" msgstr "" # Default: Trivia msgid "trivia" msgstr "" # Default: Tv msgid "tv" msgstr "tv" # Default: Under license from msgid "under-license-from" msgstr "" # Default: Unknown link msgid "unknown-link" msgstr "" # Default: Upc msgid "upc" msgstr "upc" # Default: Version of msgid "version-of" msgstr "" # Default: Vhs msgid "vhs" msgstr "" # Default: Video msgid "video" msgstr "video" # Default: Video artifacts msgid "video-artifacts" msgstr "" # Default: Video clips msgid "video-clips" msgstr "" # Default: Video noise msgid "video-noise" msgstr "" # Default: Video quality msgid "video-quality" msgstr "" # Default: Video standard msgid "video-standard" msgstr "" # Default: Visual effects msgid "visual-effects" msgstr "" # Default: Votes msgid "votes" msgstr "" # Default: Votes distribution msgid "votes-distribution" msgstr "" # Default: Weekend gross msgid "weekend-gross" msgstr "" # Default: Where now msgid "where-now" msgstr "" # Default: With msgid "with" msgstr "avec" # Default: Writer msgid "writer" msgstr "auteur" # Default: Written by msgid "written-by" msgstr "" # Default: Year msgid "year" msgstr "année" # Default: Zshops msgid "zshops" msgstr "" imdbpy-6.8/imdb/locale/imdbpy-it.po000066400000000000000000000553471351454127000173030ustar00rootroot00000000000000# Gettext message file for imdbpy msgid "" msgstr "" "Project-Id-Version: imdbpy\n" "POT-Creation-Date: 2010-03-18 14:35+0000\n" "PO-Revision-Date: 2009-07-03 13:00+0000\n" "Last-Translator: Davide Alberani \n" "Language-Team: Davide Alberani \n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=2; plural=(n != 1);\n" "Language-Code: it\n" 
"Language-Name: Italian\n" "Preferred-Encodings: utf-8\n" "Domain: imdbpy\n" # Default: Actor msgid "actor" msgstr "Attore" # Default: Actress msgid "actress" msgstr "Attrice" # Default: Adaption msgid "adaption" msgstr "Adattamento" # Default: Additional information msgid "additional-information" msgstr "Ulteriori informazioni" # Default: Admissions msgid "admissions" msgstr "Biglietti venduti" # Default: Agent address msgid "agent-address" msgstr "Indirizzo dell'agente" # Default: Airing msgid "airing" msgstr "In onda" # Default: Akas msgid "akas" msgstr "Alias" # Default: Akas from release info msgid "akas-from-release-info" msgstr "Alias dalle informazioni di rilascio" # Default: All products msgid "all-products" msgstr "Tutti i prodotti" # Default: Alternate language version of msgid "alternate-language-version-of" msgstr "Versione in altra lingua di" # Default: Alternate versions msgid "alternate-versions" msgstr "Versioni alternative" # Default: Amazon reviews msgid "amazon-reviews" msgstr "Recensione di Amazon" # Default: Analog left msgid "analog-left" msgstr "Analogico sinistro" # Default: Analog right msgid "analog-right" msgstr "Analogico destro" # Default: Animation department msgid "animation-department" msgstr "Dipartimento animazione" # Default: Archive footage msgid "archive-footage" msgstr "Materiale d'archivio" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr "Media aritmetica" # Default: Art department msgid "art-department" msgstr "Dipartimento artistico" # Default: Art direction msgid "art-direction" msgstr "Direzione artistica" # Default: Art director msgid "art-director" msgstr "Direttore artistico" # Default: Article msgid "article" msgstr "Articolo" # Default: Asin msgid "asin" msgstr "Asin" # Default: Aspect ratio msgid "aspect-ratio" msgstr "Rapporto d'aspetto" # Default: Assigner msgid "assigner" msgstr "Assegnatario" # Default: Assistant director msgid "assistant-director" msgstr "Assistente regista" # Default: Auctions msgid 
"auctions" msgstr "Aste" # Default: Audio noise msgid "audio-noise" msgstr "Rumore audio" # Default: Audio quality msgid "audio-quality" msgstr "Qualità audio" # Default: Award msgid "award" msgstr "Premio" # Default: Awards msgid "awards" msgstr "Premi" # Default: Biographical movies msgid "biographical-movies" msgstr "Film biografici" # Default: Biography msgid "biography" msgstr "Biografia" # Default: Biography print msgid "biography-print" msgstr "Biografia" # Default: Birth date msgid "birth-date" msgstr "Data di nascita" # Default: Birth name msgid "birth-name" msgstr "Nome di nascita" # Default: Birth notes msgid "birth-notes" msgstr "Note di nascita" # Default: Body msgid "body" msgstr "Corpo" # Default: Book msgid "book" msgstr "Libro" # Default: Books msgid "books" msgstr "Libri" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "Posizione nella bottom 100" # Default: Budget msgid "budget" msgstr "Bilancio" # Default: Business msgid "business" msgstr "Affari" # Default: By arrangement with msgid "by-arrangement-with" msgstr "Arrangiamento con" # Default: Camera msgid "camera" msgstr "Cinepresa" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "Cinepresa e dipartimento elettrico" # Default: Canonical episode title msgid "canonical-episode-title" msgstr "Titolo dell'episodio in forma canonica" # Default: Canonical name msgid "canonical-name" msgstr "Nome in forma canonica" # Default: Canonical series title msgid "canonical-series-title" msgstr "Titolo della serie in forma canonica" # Default: Canonical title msgid "canonical-title" msgstr "Titolo in forma canonica" # Default: Cast msgid "cast" msgstr "Cast" # Default: Casting department msgid "casting-department" msgstr "Casting" # Default: Casting director msgid "casting-director" msgstr "Direttore del casting" # Default: Catalog number msgid "catalog-number" msgstr "Numero di catalogo" # Default: Category msgid "category" msgstr "Categoria" # Default: 
Certificate msgid "certificate" msgstr "Certificazione" # Default: Certificates msgid "certificates" msgstr "Certificazioni" # Default: Certification msgid "certification" msgstr "Certificazioni" # Default: Channel msgid "channel" msgstr "Canale" # Default: Character msgid "character" msgstr "Personaggio" # Default: Cinematographer msgid "cinematographer" msgstr "Fotografia" # Default: Cinematographic process msgid "cinematographic-process" msgstr "Processo cinematografico" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "" # Default: Color info msgid "color-info" msgstr "Colore" # Default: Color information msgid "color-information" msgstr "Informazioni sul colore" # Default: Color rendition msgid "color-rendition" msgstr "Resa dei colori" # Default: Company msgid "company" msgstr "Compagnia" # Default: Complete cast msgid "complete-cast" msgstr "Cast completo" # Default: Complete crew msgid "complete-crew" msgstr "Troupe completa" # Default: Composer msgid "composer" msgstr "Compositore" # Default: Connections msgid "connections" msgstr "Collegamenti" # Default: Contrast msgid "contrast" msgstr "Contrasto" # Default: Copyright holder msgid "copyright-holder" msgstr "Detentore dei diritti d'autore" # Default: Costume department msgid "costume-department" msgstr "Dipartimento costumi" # Default: Costume designer msgid "costume-designer" msgstr "Costumista" # Default: Countries msgid "countries" msgstr "Paesi" # Default: Country msgid "country" msgstr "Paese" # Default: Courtesy of msgid "courtesy-of" msgstr "Cortesia di" # Default: Cover msgid "cover" msgstr "Copertina" # Default: Cover url msgid "cover-url" msgstr "Locandina" # Default: Crazy credits msgid "crazy-credits" msgstr "Titoli pazzi" # Default: Creator msgid "creator" msgstr "Creatore" # Default: Current role msgid "current-role" msgstr "Ruolo" # Default: Database msgid "database" msgstr "Database" # Default: Date msgid "date" msgstr "Data" # Default: Death date msgid 
"death-date" msgstr "Data di morte" # Default: Death notes msgid "death-notes" msgstr "Note di morte" # Default: Demographic msgid "demographic" msgstr "Spaccato demografico" # Default: Description msgid "description" msgstr "Descrizione" # Default: Dialogue intellegibility msgid "dialogue-intellegibility" msgstr "Comprensibilità dei dialoghi" # Default: Digital sound msgid "digital-sound" msgstr "Suono digitale" # Default: Director msgid "director" msgstr "Regista" # Default: Disc format msgid "disc-format" msgstr "Formato del disco" # Default: Disc size msgid "disc-size" msgstr "Dimensione del disco" # Default: Distributors msgid "distributors" msgstr "Distributori" # Default: Dvd msgid "dvd" msgstr "Dvd" # Default: Dvd features msgid "dvd-features" msgstr "Caratteristiche del DVD" # Default: Dvd format msgid "dvd-format" msgstr "Formato del DVD" # Default: Dvds msgid "dvds" msgstr "Dvd" # Default: Dynamic range msgid "dynamic-range" msgstr "Intervallo dinamico" # Default: Edited from msgid "edited-from" msgstr "Tratto da" # Default: Edited into msgid "edited-into" msgstr "Montato in" # Default: Editor msgid "editor" msgstr "Editore" # Default: Editorial department msgid "editorial-department" msgstr "Dipartimento editoriale" # Default: Episode msgid "episode" msgstr "Episodio" # Default: Episode of msgid "episode-of" msgstr "Episodio di" # Default: Episode title msgid "episode-title" msgstr "Titolo dell'episodio" # Default: Episodes msgid "episodes" msgstr "Episodi" # Default: Episodes rating msgid "episodes-rating" msgstr "Voto degli episodi" # Default: Essays msgid "essays" msgstr "Saggi" # Default: External reviews msgid "external-reviews" msgstr "Recensioni esterne" # Default: Faqs msgid "faqs" msgstr "Domande ricorrenti" # Default: Feature msgid "feature" msgstr "Caratteristica" # Default: Featured in msgid "featured-in" msgstr "Ripreso in" # Default: Features msgid "features" msgstr "Caratteristiche" # Default: Film negative format msgid 
"film-negative-format" msgstr "Formato del negativo" # Default: Filming dates msgid "filming-dates" msgstr "Data delle riprese" # Default: Filmography msgid "filmography" msgstr "Filmografia" # Default: Followed by msgid "followed-by" msgstr "Seguito da" # Default: Follows msgid "follows" msgstr "Segue" # Default: For msgid "for" msgstr "Per" # Default: Frequency response msgid "frequency-response" msgstr "Frequenze di risposta" # Default: From msgid "from" msgstr "Da" # Default: Full article link msgid "full-article-link" msgstr "Collegamento all'articolo completo" # Default: Full size cover url msgid "full-size-cover-url" msgstr "URL della copertina nelle dimensioni originali" # Default: Full size headshot msgid "full-size-headshot" msgstr "Ritratto nelle dimensioni originali" # Default: Genres msgid "genres" msgstr "Generi" # Default: Goofs msgid "goofs" msgstr "Errori" # Default: Gross msgid "gross" msgstr "Lordo" # Default: Group genre msgid "group-genre" msgstr "" # Default: Headshot msgid "headshot" msgstr "Foto" # Default: Height msgid "height" msgstr "Altezza" # Default: Imdbindex msgid "imdbindex" msgstr "" # Default: In development msgid "in-development" msgstr "In sviluppo" # Default: Interview msgid "interview" msgstr "Intervista" # Default: Interviews msgid "interviews" msgstr "Interviste" # Default: Introduction msgid "introduction" msgstr "Introduzione" # Default: Item msgid "item" msgstr "Elemento" # Default: Keywords msgid "keywords" msgstr "Parole chiave" # Default: Kind msgid "kind" msgstr "Tipo" # Default: Label msgid "label" msgstr "Etichetta" # Default: Laboratory msgid "laboratory" msgstr "Laboratorio" # Default: Language msgid "language" msgstr "Lingua" # Default: Languages msgid "languages" msgstr "Lingue" # Default: Laserdisc msgid "laserdisc" msgstr "Laserdisc" # Default: Laserdisc title msgid "laserdisc-title" msgstr "Titolo del laserdisc" # Default: Length msgid "length" msgstr "Durata" # Default: Line msgid "line" msgstr "Battuta" # 
Default: Link msgid "link" msgstr "Collegamento" # Default: Link text msgid "link-text" msgstr "Testo del link" # Default: Literature msgid "literature" msgstr "Letteratura" # Default: Locations msgid "locations" msgstr "Luoghi" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "Nome canonico IMDb lungo" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "Titolo canonico IMDb lungo" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "Titolo dell'episodio canonico IMDb lungo" # Default: Long imdb name msgid "long-imdb-name" msgstr "Nome IMDb lungo" # Default: Long imdb title msgid "long-imdb-title" msgstr "Titolo IMDb lungo" # Default: Magazine cover photo msgid "magazine-cover-photo" msgstr "Foto di copertina" # Default: Make up msgid "make-up" msgstr "Trucco" # Default: Master format msgid "master-format" msgstr "Formato del master" # Default: Median msgid "median" msgstr "Mediana" # Default: Merchandising links msgid "merchandising-links" msgstr "Collegamenti al merchandising" # Default: Mini biography msgid "mini-biography" msgstr "Biografia" # Default: Misc links msgid "misc-links" msgstr "Altri collegamenti" # Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "Altre compagnie" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "Altra troupe" # Default: Movie msgid "movie" msgstr "Film" # Default: Mpaa msgid "mpaa" msgstr "Visto MPAA" # Default: Music department msgid "music-department" msgstr "Dipartimento musicale" # Default: Name msgid "name" msgstr "Nome" # Default: News msgid "news" msgstr "Notizie" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "Recensioni dai gruppi di discussione" # Default: Nick names msgid "nick-names" msgstr "Soprannomi" # Default: Notes msgid "notes" msgstr "Note" # Default: Novel msgid "novel" msgstr "Novella" # Default: Number msgid "number" msgstr "Numero" # Default: Number of chapter stops msgid 
"number-of-chapter-stops" msgstr "Numero di interruzioni di capitolo" # Default: Number of episodes msgid "number-of-episodes" msgstr "Numero di episodi" # Default: Number of seasons msgid "number-of-seasons" msgstr "Numero di stagioni" # Default: Number of sides msgid "number-of-sides" msgstr "Numero di lati" # Default: Number of votes msgid "number-of-votes" msgstr "Numero di voti" # Default: Official retail price msgid "official-retail-price" msgstr "Prezzo ufficiale al pubblico" # Default: Official sites msgid "official-sites" msgstr "Siti ufficiali" # Default: Opening weekend msgid "opening-weekend" msgstr "Weekend d'apertura" # Default: Original air date msgid "original-air-date" msgstr "Data della prima messa in onda" # Default: Original music msgid "original-music" msgstr "Musica originale" # Default: Original title msgid "original-title" msgstr "Titolo originale" # Default: Other literature msgid "other-literature" msgstr "Altre opere letterarie" # Default: Other works msgid "other-works" msgstr "Altri lavori" # Default: Parents guide msgid "parents-guide" msgstr "Guida per i genitori" # Default: Performed by msgid "performed-by" msgstr "Eseguito da" # Default: Person msgid "person" msgstr "Persona" # Default: Photo sites msgid "photo-sites" msgstr "Siti con fotografie" # Default: Pictorial msgid "pictorial" msgstr "Ritratto" # Default: Picture format msgid "picture-format" msgstr "Formato dell'immagine" # Default: Plot msgid "plot" msgstr "Trama" # Default: Plot outline msgid "plot-outline" msgstr "Trama in breve" # Default: Portrayed in msgid "portrayed-in" msgstr "Rappresentato in" # Default: Pressing plant msgid "pressing-plant" msgstr "Impianto di stampa" # Default: Printed film format msgid "printed-film-format" msgstr "Formato della pellicola" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "Recensioni su carta stampata" # Default: Producer msgid "producer" msgstr "Produttore" # Default: Production companies msgid 
"production-companies" msgstr "Compagnie di produzione" # Default: Production country msgid "production-country" msgstr "Paese di produzione" # Default: Production dates msgid "production-dates" msgstr "Date di produzione" # Default: Production design msgid "production-design" msgstr "Design di produzione" # Default: Production designer msgid "production-designer" msgstr "Designer di produzione" # Default: Production manager msgid "production-manager" msgstr "Manager di produzione" # Default: Production process protocol msgid "production-process-protocol" msgstr "Controllo del processo di produzione" # Default: Quality of source msgid "quality-of-source" msgstr "Qualità dell'originale" # Default: Quality program msgid "quality-program" msgstr "Programma di Qualità" # Default: Quote msgid "quote" msgstr "Citazione" # Default: Quotes msgid "quotes" msgstr "Citazioni" # Default: Rating msgid "rating" msgstr "Voto" # Default: Recommendations msgid "recommendations" msgstr "Raccomandazioni" # Default: Referenced in msgid "referenced-in" msgstr "Citato in" # Default: References msgid "references" msgstr "Cita" # Default: Region msgid "region" msgstr "Regione" # Default: Release country msgid "release-country" msgstr "Paese d'uscita" # Default: Release date msgid "release-date" msgstr "Data d'uscita" # Default: Release dates msgid "release-dates" msgstr "Date d'uscita" # Default: Remade as msgid "remade-as" msgstr "Rifatto come" # Default: Remake of msgid "remake-of" msgstr "Rifacimento di" # Default: Rentals msgid "rentals" msgstr "Noleggi" # Default: Result msgid "result" msgstr "Risultato" # Default: Review msgid "review" msgstr "Recensione" # Default: Review author msgid "review-author" msgstr "Autore della recensione" # Default: Review kind msgid "review-kind" msgstr "Tipo di recensione" # Default: Runtime msgid "runtime" msgstr "Durata" # Default: Runtimes msgid "runtimes" msgstr "Durate" # Default: Salary history msgid "salary-history" msgstr "Stipendi" # Default: 
Screenplay teleplay msgid "screenplay-teleplay" msgstr "" # Default: Season msgid "season" msgstr "Stagione" # Default: Second unit director or assistant director msgid "second-unit-director-or-assistant-director" msgstr "Regista della seconda unità o aiuto regista" # Default: Self msgid "self" msgstr "Se stesso" # Default: Series animation department msgid "series-animation-department" msgstr "Dipartimento animazione della serie" # Default: Series art department msgid "series-art-department" msgstr "Dipartimento artistico della serie" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "Assistenti registi della serie" # Default: Series camera department msgid "series-camera-department" msgstr "" # Default: Series casting department msgid "series-casting-department" msgstr "" # Default: Series cinematographers msgid "series-cinematographers" msgstr "" # Default: Series costume department msgid "series-costume-department" msgstr "" # Default: Series editorial department msgid "series-editorial-department" msgstr "" # Default: Series editors msgid "series-editors" msgstr "" # Default: Series make up department msgid "series-make-up-department" msgstr "" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "" # Default: Series music department msgid "series-music-department" msgstr "" # Default: Series producers msgid "series-producers" msgstr "" # Default: Series production designers msgid "series-production-designers" msgstr "" # Default: Series production managers msgid "series-production-managers" msgstr "" # Default: Series sound department msgid "series-sound-department" msgstr "Dipartimento sonoro della serie" # Default: Series special effects department msgid "series-special-effects-department" msgstr "Dipartimento effetti speciali della serie" # Default: Series stunts msgid "series-stunts" msgstr "Controfigure della serie" # Default: Series title msgid "series-title" msgstr "Titolo della serie" # Default: Series 
transportation department msgid "series-transportation-department" msgstr "" # Default: Series visual effects department msgid "series-visual-effects-department" msgstr "" # Default: Series writers msgid "series-writers" msgstr "Scrittori della serie" # Default: Series years msgid "series-years" msgstr "Anni della serie" # Default: Set decoration msgid "set-decoration" msgstr "Decorazione del set" # Default: Sharpness msgid "sharpness" msgstr "" # Default: Similar to msgid "similar-to" msgstr "Simile a" # Default: Smart canonical episode title msgid "smart-canonical-episode-title" msgstr "Titolo canonico intelligente dell'episodio" # Default: Smart canonical series title msgid "smart-canonical-series-title" msgstr "Titolo canonico intelligente della serie" # Default: Smart canonical title msgid "smart-canonical-title" msgstr "Titolo canonico intelligente" # Default: Smart long imdb canonical title msgid "smart-long-imdb-canonical-title" msgstr "Titolo canonico lungo intelligente" # Default: Sound clips msgid "sound-clips" msgstr "" # Default: Sound crew msgid "sound-crew" msgstr "" # Default: Sound encoding msgid "sound-encoding" msgstr "Codifica sonora" # Default: Sound mix msgid "sound-mix" msgstr "Mix audio" # Default: Soundtrack msgid "soundtrack" msgstr "Colonna sonora" # Default: Spaciality msgid "spaciality" msgstr "Specialità" # Default: Special effects msgid "special-effects" msgstr "Effetti speciali" # Default: Special effects companies msgid "special-effects-companies" msgstr "Compagnie di effetti speciali" # Default: Special effects department msgid "special-effects-department" msgstr "Dipartimento effetti speciali" # Default: Spin off msgid "spin-off" msgstr "Derivati" # Default: Spin off from msgid "spin-off-from" msgstr "Deriva da" # Default: Spoofed in msgid "spoofed-in" msgstr "Preso in giro in" # Default: Spoofs msgid "spoofs" msgstr "Prende in giro" # Default: Spouse msgid "spouse" msgstr "Coniuge" # Default: Status of availablility msgid 
"status-of-availablility" msgstr "Disponibilità" # Default: Studio msgid "studio" msgstr "Studio" # Default: Studios msgid "studios" msgstr "Studi" # Default: Stunt performer msgid "stunt-performer" msgstr "" # Default: Stunts msgid "stunts" msgstr "Stuntman" # Default: Subtitles msgid "subtitles" msgstr "Sottotitoli" # Default: Supplement msgid "supplement" msgstr "Extra" # Default: Supplements msgid "supplements" msgstr "Extra" # Default: Synopsis msgid "synopsis" msgstr "Compendio della trama" # Default: Taglines msgid "taglines" msgstr "Slogan" # Default: Tech info msgid "tech-info" msgstr "Informazioni tecniche" # Default: Thanks msgid "thanks" msgstr "Ringraziamenti" # Default: Time msgid "time" msgstr "Tempo" # Default: Title msgid "title" msgstr "Titolo" # Default: Titles in this product msgid "titles-in-this-product" msgstr "Titoli in questo prodotto" # Default: To msgid "to" msgstr "A" # Default: Top 250 rank msgid "top-250-rank" msgstr "Posizione nella top 250" # Default: Trade mark msgid "trade-mark" msgstr "Marchio registrato" # Default: Transportation department msgid "transportation-department" msgstr "Dipartimento trasporti" # Default: Trivia msgid "trivia" msgstr "Frivolezze" # Default: Tv msgid "tv" msgstr "Tv" # Default: Under license from msgid "under-license-from" msgstr "Sotto licenza da" # Default: Unknown link msgid "unknown-link" msgstr "Collegamento sconosciuto" # Default: Upc msgid "upc" msgstr "" # Default: Version of msgid "version-of" msgstr "Versione di" # Default: Vhs msgid "vhs" msgstr "VHS" # Default: Video msgid "video" msgstr "Video" # Default: Video artifacts msgid "video-artifacts" msgstr "Imperfezioni video" # Default: Video clips msgid "video-clips" msgstr "Video clips" # Default: Video noise msgid "video-noise" msgstr "Rumore video" # Default: Video quality msgid "video-quality" msgstr "Qualità video" # Default: Video standard msgid "video-standard" msgstr "Standard video" # Default: Visual effects msgid "visual-effects" 
msgstr "Effetti visivi" # Default: Votes msgid "votes" msgstr "Voti" # Default: Votes distribution msgid "votes-distribution" msgstr "Distribuzione dei voti" # Default: Weekend gross msgid "weekend-gross" msgstr "Lordo del primo fine settimana" # Default: Where now msgid "where-now" msgstr "Cosa sta facendo ora" # Default: With msgid "with" msgstr "Con" # Default: Writer msgid "writer" msgstr "Scrittore" # Default: Written by msgid "written-by" msgstr "Scritto da" # Default: Year msgid "year" msgstr "Anno" # Default: Zshops msgid "zshops" msgstr "" imdbpy-6.8/imdb/locale/imdbpy-pt_BR.po000066400000000000000000000454251351454127000176710ustar00rootroot00000000000000# Gettext message file for imdbpy # Translators: # Wagner Marques Oliveira , 2015 msgid "" msgstr "" "Project-Id-Version: IMDbPY\n" "POT-Creation-Date: 2010-03-18 14:35+0000\n" "PO-Revision-Date: 2016-03-28 20:40+0000\n" "Last-Translator: Wagner Marques Oliveira \n" "Language-Team: Portuguese (Brazil) (http://www.transifex.com/davide_alberani/imdbpy/language/pt_BR/)\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Domain: imdbpy\n" "Language: pt_BR\n" "Language-Code: pt_BR\n" "Language-Name: Portuguese (Brazil)\n" "Plural-Forms: nplurals=2; plural=(n > 1);\n" "Preferred-Encodings: utf-8\n" # Default: Actor msgid "actor" msgstr "ator" # Default: Actress msgid "actress" msgstr "atriz" # Default: Adaption msgid "adaption" msgstr "adaptação" # Default: Additional information msgid "additional-information" msgstr "informação-adicional" # Default: Admissions msgid "admissions" msgstr "admissões" # Default: Agent address msgid "agent-address" msgstr "endereço-de-agente" # Default: Airing msgid "airing" msgstr "no ar" # Default: Akas msgid "akas" msgstr "mais conhecido como" # Default: Akas from release info msgid "akas-from-release-info" msgstr "mais conhecido como-para-lançamento-informação" # Default: All products msgid "all-products" msgstr "todos-produtos" # Default:
Alternate language version of msgid "alternate-language-version-of" msgstr "" # Default: Alternate versions msgid "alternate-versions" msgstr "" # Default: Amazon reviews msgid "amazon-reviews" msgstr "" # Default: Analog left msgid "analog-left" msgstr "" # Default: Analog right msgid "analog-right" msgstr "" # Default: Animation department msgid "animation-department" msgstr "" # Default: Archive footage msgid "archive-footage" msgstr "" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr "" # Default: Art department msgid "art-department" msgstr "" # Default: Art direction msgid "art-direction" msgstr "" # Default: Art director msgid "art-director" msgstr "" # Default: Article msgid "article" msgstr "" # Default: Asin msgid "asin" msgstr "" # Default: Aspect ratio msgid "aspect-ratio" msgstr "" # Default: Assigner msgid "assigner" msgstr "" # Default: Assistant director msgid "assistant-director" msgstr "" # Default: Auctions msgid "auctions" msgstr "" # Default: Audio noise msgid "audio-noise" msgstr "" # Default: Audio quality msgid "audio-quality" msgstr "" # Default: Award msgid "award" msgstr "" # Default: Awards msgid "awards" msgstr "" # Default: Biographical movies msgid "biographical-movies" msgstr "" # Default: Biography msgid "biography" msgstr "" # Default: Biography print msgid "biography-print" msgstr "" # Default: Birth date msgid "birth-date" msgstr "" # Default: Birth name msgid "birth-name" msgstr "" # Default: Birth notes msgid "birth-notes" msgstr "" # Default: Body msgid "body" msgstr "" # Default: Book msgid "book" msgstr "" # Default: Books msgid "books" msgstr "" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "" # Default: Budget msgid "budget" msgstr "" # Default: Business msgid "business" msgstr "" # Default: By arrangement with msgid "by-arrangement-with" msgstr "" # Default: Camera msgid "camera" msgstr "" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "" # Default: 
Canonical episode title msgid "canonical-episode-title" msgstr "" # Default: Canonical name msgid "canonical-name" msgstr "" # Default: Canonical series title msgid "canonical-series-title" msgstr "" # Default: Canonical title msgid "canonical-title" msgstr "" # Default: Cast msgid "cast" msgstr "" # Default: Casting department msgid "casting-department" msgstr "" # Default: Casting director msgid "casting-director" msgstr "" # Default: Catalog number msgid "catalog-number" msgstr "" # Default: Category msgid "category" msgstr "" # Default: Certificate msgid "certificate" msgstr "" # Default: Certificates msgid "certificates" msgstr "" # Default: Certification msgid "certification" msgstr "" # Default: Channel msgid "channel" msgstr "" # Default: Character msgid "character" msgstr "" # Default: Cinematographer msgid "cinematographer" msgstr "" # Default: Cinematographic process msgid "cinematographic-process" msgstr "" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "" # Default: Color info msgid "color-info" msgstr "" # Default: Color information msgid "color-information" msgstr "" # Default: Color rendition msgid "color-rendition" msgstr "" # Default: Company msgid "company" msgstr "" # Default: Complete cast msgid "complete-cast" msgstr "" # Default: Complete crew msgid "complete-crew" msgstr "" # Default: Composer msgid "composer" msgstr "" # Default: Connections msgid "connections" msgstr "" # Default: Contrast msgid "contrast" msgstr "" # Default: Copyright holder msgid "copyright-holder" msgstr "" # Default: Costume department msgid "costume-department" msgstr "" # Default: Costume designer msgid "costume-designer" msgstr "" # Default: Countries msgid "countries" msgstr "" # Default: Country msgid "country" msgstr "" # Default: Courtesy of msgid "courtesy-of" msgstr "" # Default: Cover msgid "cover" msgstr "" # Default: Cover url msgid "cover-url" msgstr "" # Default: Crazy credits msgid "crazy-credits" msgstr "" # 
Default: Creator msgid "creator" msgstr "" # Default: Current role msgid "current-role" msgstr "" # Default: Database msgid "database" msgstr "" # Default: Date msgid "date" msgstr "" # Default: Death date msgid "death-date" msgstr "" # Default: Death notes msgid "death-notes" msgstr "" # Default: Demographic msgid "demographic" msgstr "" # Default: Description msgid "description" msgstr "" # Default: Dialogue intellegibility msgid "dialogue-intellegibility" msgstr "" # Default: Digital sound msgid "digital-sound" msgstr "" # Default: Director msgid "director" msgstr "" # Default: Disc format msgid "disc-format" msgstr "" # Default: Disc size msgid "disc-size" msgstr "" # Default: Distributors msgid "distributors" msgstr "" # Default: Dvd msgid "dvd" msgstr "" # Default: Dvd features msgid "dvd-features" msgstr "" # Default: Dvd format msgid "dvd-format" msgstr "" # Default: Dvds msgid "dvds" msgstr "" # Default: Dynamic range msgid "dynamic-range" msgstr "" # Default: Edited from msgid "edited-from" msgstr "" # Default: Edited into msgid "edited-into" msgstr "" # Default: Editor msgid "editor" msgstr "" # Default: Editorial department msgid "editorial-department" msgstr "" # Default: Episode msgid "episode" msgstr "" # Default: Episode of msgid "episode-of" msgstr "" # Default: Episode title msgid "episode-title" msgstr "" # Default: Episodes msgid "episodes" msgstr "" # Default: Episodes rating msgid "episodes-rating" msgstr "" # Default: Essays msgid "essays" msgstr "" # Default: External reviews msgid "external-reviews" msgstr "" # Default: Faqs msgid "faqs" msgstr "" # Default: Feature msgid "feature" msgstr "" # Default: Featured in msgid "featured-in" msgstr "" # Default: Features msgid "features" msgstr "" # Default: Film negative format msgid "film-negative-format" msgstr "" # Default: Filming dates msgid "filming-dates" msgstr "" # Default: Filmography msgid "filmography" msgstr "" # Default: Followed by msgid "followed-by" msgstr "" # Default: Follows 
msgid "follows" msgstr "" # Default: For msgid "for" msgstr "" # Default: Frequency response msgid "frequency-response" msgstr "" # Default: From msgid "from" msgstr "" # Default: Full article link msgid "full-article-link" msgstr "" # Default: Full size cover url msgid "full-size-cover-url" msgstr "" # Default: Full size headshot msgid "full-size-headshot" msgstr "" # Default: Genres msgid "genres" msgstr "" # Default: Goofs msgid "goofs" msgstr "" # Default: Gross msgid "gross" msgstr "" # Default: Group genre msgid "group-genre" msgstr "" # Default: Headshot msgid "headshot" msgstr "" # Default: Height msgid "height" msgstr "" # Default: Imdbindex msgid "imdbindex" msgstr "" # Default: In development msgid "in-development" msgstr "" # Default: Interview msgid "interview" msgstr "" # Default: Interviews msgid "interviews" msgstr "" # Default: Introduction msgid "introduction" msgstr "" # Default: Item msgid "item" msgstr "" # Default: Keywords msgid "keywords" msgstr "" # Default: Kind msgid "kind" msgstr "" # Default: Label msgid "label" msgstr "" # Default: Laboratory msgid "laboratory" msgstr "" # Default: Language msgid "language" msgstr "" # Default: Languages msgid "languages" msgstr "" # Default: Laserdisc msgid "laserdisc" msgstr "" # Default: Laserdisc title msgid "laserdisc-title" msgstr "" # Default: Length msgid "length" msgstr "" # Default: Line msgid "line" msgstr "" # Default: Link msgid "link" msgstr "" # Default: Link text msgid "link-text" msgstr "" # Default: Literature msgid "literature" msgstr "" # Default: Locations msgid "locations" msgstr "" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "" # Default: Long imdb name msgid "long-imdb-name" msgstr "" # Default: Long imdb title msgid "long-imdb-title" msgstr "" # Default: Magazine cover photo msgid 
"magazine-cover-photo" msgstr "" # Default: Make up msgid "make-up" msgstr "" # Default: Master format msgid "master-format" msgstr "" # Default: Median msgid "median" msgstr "" # Default: Merchandising links msgid "merchandising-links" msgstr "" # Default: Mini biography msgid "mini-biography" msgstr "" # Default: Misc links msgid "misc-links" msgstr "" # Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "" # Default: Movie msgid "movie" msgstr "" # Default: Mpaa msgid "mpaa" msgstr "" # Default: Music department msgid "music-department" msgstr "" # Default: Name msgid "name" msgstr "" # Default: News msgid "news" msgstr "" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "" # Default: Nick names msgid "nick-names" msgstr "" # Default: Notes msgid "notes" msgstr "" # Default: Novel msgid "novel" msgstr "" # Default: Number msgid "number" msgstr "" # Default: Number of chapter stops msgid "number-of-chapter-stops" msgstr "" # Default: Number of episodes msgid "number-of-episodes" msgstr "" # Default: Number of seasons msgid "number-of-seasons" msgstr "" # Default: Number of sides msgid "number-of-sides" msgstr "" # Default: Number of votes msgid "number-of-votes" msgstr "" # Default: Official retail price msgid "official-retail-price" msgstr "" # Default: Official sites msgid "official-sites" msgstr "" # Default: Opening weekend msgid "opening-weekend" msgstr "" # Default: Original air date msgid "original-air-date" msgstr "" # Default: Original music msgid "original-music" msgstr "" # Default: Original title msgid "original-title" msgstr "" # Default: Other literature msgid "other-literature" msgstr "" # Default: Other works msgid "other-works" msgstr "" # Default: Parents guide msgid "parents-guide" msgstr "" # Default: Performed by msgid "performed-by" msgstr "" # Default: Person msgid "person" msgstr "" # Default: Photo sites msgid "photo-sites" msgstr "" # 
Default: Pictorial msgid "pictorial" msgstr "" # Default: Picture format msgid "picture-format" msgstr "" # Default: Plot msgid "plot" msgstr "" # Default: Plot outline msgid "plot-outline" msgstr "" # Default: Portrayed in msgid "portrayed-in" msgstr "" # Default: Pressing plant msgid "pressing-plant" msgstr "" # Default: Printed film format msgid "printed-film-format" msgstr "" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "" # Default: Producer msgid "producer" msgstr "" # Default: Production companies msgid "production-companies" msgstr "" # Default: Production country msgid "production-country" msgstr "" # Default: Production dates msgid "production-dates" msgstr "" # Default: Production design msgid "production-design" msgstr "" # Default: Production designer msgid "production-designer" msgstr "" # Default: Production manager msgid "production-manager" msgstr "" # Default: Production process protocol msgid "production-process-protocol" msgstr "" # Default: Quality of source msgid "quality-of-source" msgstr "" # Default: Quality program msgid "quality-program" msgstr "" # Default: Quote msgid "quote" msgstr "" # Default: Quotes msgid "quotes" msgstr "" # Default: Rating msgid "rating" msgstr "" # Default: Recommendations msgid "recommendations" msgstr "" # Default: Referenced in msgid "referenced-in" msgstr "" # Default: References msgid "references" msgstr "" # Default: Region msgid "region" msgstr "" # Default: Release country msgid "release-country" msgstr "" # Default: Release date msgid "release-date" msgstr "" # Default: Release dates msgid "release-dates" msgstr "" # Default: Remade as msgid "remade-as" msgstr "" # Default: Remake of msgid "remake-of" msgstr "" # Default: Rentals msgid "rentals" msgstr "" # Default: Result msgid "result" msgstr "" # Default: Review msgid "review" msgstr "" # Default: Review author msgid "review-author" msgstr "" # Default: Review kind msgid "review-kind" msgstr "" # Default: Runtime msgid 
"runtime" msgstr "" # Default: Runtimes msgid "runtimes" msgstr "" # Default: Salary history msgid "salary-history" msgstr "" # Default: Screenplay teleplay msgid "screenplay-teleplay" msgstr "" # Default: Season msgid "season" msgstr "" # Default: Second unit director or assistant director msgid "second-unit-director-or-assistant-director" msgstr "" # Default: Self msgid "self" msgstr "" # Default: Series animation department msgid "series-animation-department" msgstr "" # Default: Series art department msgid "series-art-department" msgstr "" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "" # Default: Series camera department msgid "series-camera-department" msgstr "" # Default: Series casting department msgid "series-casting-department" msgstr "" # Default: Series cinematographers msgid "series-cinematographers" msgstr "" # Default: Series costume department msgid "series-costume-department" msgstr "" # Default: Series editorial department msgid "series-editorial-department" msgstr "" # Default: Series editors msgid "series-editors" msgstr "" # Default: Series make up department msgid "series-make-up-department" msgstr "" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "" # Default: Series music department msgid "series-music-department" msgstr "" # Default: Series producers msgid "series-producers" msgstr "" # Default: Series production designers msgid "series-production-designers" msgstr "" # Default: Series production managers msgid "series-production-managers" msgstr "" # Default: Series sound department msgid "series-sound-department" msgstr "" # Default: Series special effects department msgid "series-special-effects-department" msgstr "" # Default: Series stunts msgid "series-stunts" msgstr "" # Default: Series title msgid "series-title" msgstr "" # Default: Series transportation department msgid "series-transportation-department" msgstr "" # Default: Series visual effects department msgid 
"series-visual-effects-department" msgstr "" # Default: Series writers msgid "series-writers" msgstr "" # Default: Series years msgid "series-years" msgstr "" # Default: Set decoration msgid "set-decoration" msgstr "" # Default: Sharpness msgid "sharpness" msgstr "" # Default: Similar to msgid "similar-to" msgstr "" # Default: Smart canonical episode title msgid "smart-canonical-episode-title" msgstr "" # Default: Smart canonical series title msgid "smart-canonical-series-title" msgstr "" # Default: Smart canonical title msgid "smart-canonical-title" msgstr "" # Default: Smart long imdb canonical title msgid "smart-long-imdb-canonical-title" msgstr "" # Default: Sound clips msgid "sound-clips" msgstr "" # Default: Sound crew msgid "sound-crew" msgstr "" # Default: Sound encoding msgid "sound-encoding" msgstr "" # Default: Sound mix msgid "sound-mix" msgstr "" # Default: Soundtrack msgid "soundtrack" msgstr "" # Default: Spaciality msgid "spaciality" msgstr "" # Default: Special effects msgid "special-effects" msgstr "" # Default: Special effects companies msgid "special-effects-companies" msgstr "" # Default: Special effects department msgid "special-effects-department" msgstr "" # Default: Spin off msgid "spin-off" msgstr "" # Default: Spin off from msgid "spin-off-from" msgstr "" # Default: Spoofed in msgid "spoofed-in" msgstr "" # Default: Spoofs msgid "spoofs" msgstr "" # Default: Spouse msgid "spouse" msgstr "" # Default: Status of availablility msgid "status-of-availablility" msgstr "" # Default: Studio msgid "studio" msgstr "" # Default: Studios msgid "studios" msgstr "" # Default: Stunt performer msgid "stunt-performer" msgstr "" # Default: Stunts msgid "stunts" msgstr "" # Default: Subtitles msgid "subtitles" msgstr "" # Default: Supplement msgid "supplement" msgstr "" # Default: Supplements msgid "supplements" msgstr "" # Default: Synopsis msgid "synopsis" msgstr "" # Default: Taglines msgid "taglines" msgstr "" # Default: Tech info msgid "tech-info" 
msgstr "" # Default: Thanks msgid "thanks" msgstr "" # Default: Time msgid "time" msgstr "" # Default: Title msgid "title" msgstr "" # Default: Titles in this product msgid "titles-in-this-product" msgstr "" # Default: To msgid "to" msgstr "" # Default: Top 250 rank msgid "top-250-rank" msgstr "" # Default: Trade mark msgid "trade-mark" msgstr "" # Default: Transportation department msgid "transportation-department" msgstr "" # Default: Trivia msgid "trivia" msgstr "" # Default: Tv msgid "tv" msgstr "" # Default: Under license from msgid "under-license-from" msgstr "" # Default: Unknown link msgid "unknown-link" msgstr "" # Default: Upc msgid "upc" msgstr "" # Default: Version of msgid "version-of" msgstr "" # Default: Vhs msgid "vhs" msgstr "" # Default: Video msgid "video" msgstr "" # Default: Video artifacts msgid "video-artifacts" msgstr "" # Default: Video clips msgid "video-clips" msgstr "" # Default: Video noise msgid "video-noise" msgstr "" # Default: Video quality msgid "video-quality" msgstr "" # Default: Video standard msgid "video-standard" msgstr "" # Default: Visual effects msgid "visual-effects" msgstr "" # Default: Votes msgid "votes" msgstr "" # Default: Votes distribution msgid "votes-distribution" msgstr "" # Default: Weekend gross msgid "weekend-gross" msgstr "" # Default: Where now msgid "where-now" msgstr "" # Default: With msgid "with" msgstr "" # Default: Writer msgid "writer" msgstr "" # Default: Written by msgid "written-by" msgstr "" # Default: Year msgid "year" msgstr "" # Default: Zshops msgid "zshops" msgstr "" imdbpy-6.8/imdb/locale/imdbpy-tr.po000066400000000000000000000534331351454127000173060ustar00rootroot00000000000000# Gettext message file for imdbpy msgid "" msgstr "" "Project-Id-Version: imdbpy\n" "POT-Creation-Date: 2010-03-18 14:35+0000\n" "PO-Revision-Date: 2009-04-21 19:04+0200\n" "Last-Translator: H. 
Turgut Uyar \n" "Language-Team: IMDbPY Türkçe \n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=1; plural=0;\n" "Language-Code: tr\n" "Language-Name: Türkçe\n" "Preferred-Encodings: utf-8\n" "Domain: imdbpy\n" # Default: Actor msgid "actor" msgstr "Oyuncu" # Default: Actress msgid "actress" msgstr "Oyuncu" # Default: Adaption msgid "adaption" msgstr "" # Default: Additional information msgid "additional-information" msgstr "Ek bilgi" # Default: Admissions msgid "admissions" msgstr "" # Default: Agent address msgid "agent-address" msgstr "" # Default: Airing msgid "airing" msgstr "Yayımlanma" # Default: Akas msgid "akas" msgstr "Diğer başlıklar" # Default: Akas from release info msgid "akas-from-release-info" msgstr "" # Default: All products msgid "all-products" msgstr "Bütün ürünler" # Default: Alternate language version of msgid "alternate-language-version-of" msgstr "" # Default: Alternate versions msgid "alternate-versions" msgstr "" # Default: Amazon reviews msgid "amazon-reviews" msgstr "Amazon eleştirileri" # Default: Analog left msgid "analog-left" msgstr "Analog sol" # Default: Analog right msgid "analog-right" msgstr "Analog sağ" # Default: Animation department msgid "animation-department" msgstr "Animasyon departmanı" # Default: Archive footage msgid "archive-footage" msgstr "Arşiv çekimleri" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr "Aritmetik ortalama" # Default: Art department msgid "art-department" msgstr "Sanat departmanı" # Default: Art direction msgid "art-direction" msgstr "Sanat yönetmenliği" # Default: Art director msgid "art-director" msgstr "Sanat yönetmeni" # Default: Article msgid "article" msgstr "" # Default: Asin msgid "asin" msgstr "ASIN" # Default: Aspect ratio msgid "aspect-ratio" msgstr "En-boy oranı" # Default: Assigner msgid "assigner" msgstr "Veren" # Default: Assistant director msgid "assistant-director" msgstr "Yardımcı yönetmen" # 
Default: Auctions msgid "auctions" msgstr "Açık artırmalar" # Default: Audio noise msgid "audio-noise" msgstr "Ses gürültüsü" # Default: Audio quality msgid "audio-quality" msgstr "Ses kalitesi" # Default: Award msgid "award" msgstr "Ödül" # Default: Awards msgid "awards" msgstr "Ödüller" # Default: Biographical movies msgid "biographical-movies" msgstr "Biyografik filmler" # Default: Biography msgid "biography" msgstr "Biyografi" # Default: Biography print msgid "biography-print" msgstr "Basılı biyografi" # Default: Birth date msgid "birth-date" msgstr "Doğum tarihi" # Default: Birth name msgid "birth-name" msgstr "Asıl ismi" # Default: Birth notes msgid "birth-notes" msgstr "Doğum notları" # Default: Body msgid "body" msgstr "Metin" # Default: Book msgid "book" msgstr "Kitap" # Default: Books msgid "books" msgstr "Kitaplar" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "En kötü 100 içindeki sırası" # Default: Budget msgid "budget" msgstr "Bütçe" # Default: Business msgid "business" msgstr "Gişe" # Default: By arrangement with msgid "by-arrangement-with" msgstr "" # Default: Camera msgid "camera" msgstr "Kamera" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "Kamera ve elektrik departmanı" # Default: Canonical episode title msgid "canonical-episode-title" msgstr "" # Default: Canonical name msgid "canonical-name" msgstr "" # Default: Canonical series title msgid "canonical-series-title" msgstr "" # Default: Canonical title msgid "canonical-title" msgstr "" # Default: Cast msgid "cast" msgstr "Oynayanlar" # Default: Casting department msgid "casting-department" msgstr "Oyuncu seçme departmanı" # Default: Casting director msgid "casting-director" msgstr "Oyuncu seçme yönetmeni" # Default: Catalog number msgid "catalog-number" msgstr "Katalog numarası" # Default: Category msgid "category" msgstr "Kategori" # Default: Certificate msgid "certificate" msgstr "Sertifika" # Default: Certificates msgid "certificates" 
msgstr "Sertifikalar" # Default: Certification msgid "certification" msgstr "" # Default: Channel msgid "channel" msgstr "Kanal" # Default: Character msgid "character" msgstr "Karakter" # Default: Cinematographer msgid "cinematographer" msgstr "Kameraman" # Default: Cinematographic process msgid "cinematographic-process" msgstr "" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "" # Default: Color info msgid "color-info" msgstr "Renk bilgisi" # Default: Color information msgid "color-information" msgstr "Renk bilgisi" # Default: Color rendition msgid "color-rendition" msgstr "" # Default: Company msgid "company" msgstr "Şirket" # Default: Complete cast msgid "complete-cast" msgstr "Bütün oynayanlar" # Default: Complete crew msgid "complete-crew" msgstr "Bütün çalışanlar" # Default: Composer msgid "composer" msgstr "Besteci" # Default: Connections msgid "connections" msgstr "Bağlantılar" # Default: Contrast msgid "contrast" msgstr "Kontrast" # Default: Copyright holder msgid "copyright-holder" msgstr "Telif sahibi" # Default: Costume department msgid "costume-department" msgstr "Kostüm departmanı" # Default: Costume designer msgid "costume-designer" msgstr "Kostüm tasarımcısı" # Default: Countries msgid "countries" msgstr "Ülkeler" # Default: Country msgid "country" msgstr "Ülke" # Default: Courtesy of msgid "courtesy-of" msgstr "" # Default: Cover msgid "cover" msgstr "Poster" # Default: Cover url msgid "cover-url" msgstr "Poster adresi" # Default: Crazy credits msgid "crazy-credits" msgstr "" # Default: Creator msgid "creator" msgstr "Yaratıcı" # Default: Current role msgid "current-role" msgstr "Şimdiki rol" # Default: Database msgid "database" msgstr "Veritabanı" # Default: Date msgid "date" msgstr "Tarih" # Default: Death date msgid "death-date" msgstr "Ölüm tarihi" # Default: Death notes msgid "death-notes" msgstr "Ölüm notları" # Default: Demographic msgid "demographic" msgstr "Demografi" # Default: Description msgid 
"description" msgstr "Tarif" # Default: Dialogue intellegibility msgid "dialogue-intellegibility" msgstr "" # Default: Digital sound msgid "digital-sound" msgstr "Dijital ses" # Default: Director msgid "director" msgstr "Yönetmen" # Default: Disc format msgid "disc-format" msgstr "Disk formatı" # Default: Disc size msgid "disc-size" msgstr "Disk boyu" # Default: Distributors msgid "distributors" msgstr "Dağıtıcılar" # Default: Dvd msgid "dvd" msgstr "DVD" # Default: Dvd features msgid "dvd-features" msgstr "DVD özellikleri" # Default: Dvd format msgid "dvd-format" msgstr "DVD formatı" # Default: Dvds msgid "dvds" msgstr "DVD'ler" # Default: Dynamic range msgid "dynamic-range" msgstr "" # Default: Edited from msgid "edited-from" msgstr "" # Default: Edited into msgid "edited-into" msgstr "" # Default: Editor msgid "editor" msgstr "Montajcı" # Default: Editorial department msgid "editorial-department" msgstr "Montaj departmanı" # Default: Episode msgid "episode" msgstr "Bölüm" # Default: Episode of msgid "episode-of" msgstr "Dizi" # Default: Episode title msgid "episode-title" msgstr "Bölüm başlığı" # Default: Episodes msgid "episodes" msgstr "Bölümler" # Default: Episodes rating msgid "episodes-rating" msgstr "Bölüm puanı" # Default: Essays msgid "essays" msgstr "Denemeler" # Default: External reviews msgid "external-reviews" msgstr "Harici eleştiriler" # Default: Faqs msgid "faqs" msgstr "SSS" # Default: Feature msgid "feature" msgstr "" # Default: Featured in msgid "featured-in" msgstr "" # Default: Features msgid "features" msgstr "" # Default: Film negative format msgid "film-negative-format" msgstr "Film negatif formatı" # Default: Filming dates msgid "filming-dates" msgstr "Çekim tarihleri" # Default: Filmography msgid "filmography" msgstr "Filmografi" # Default: Followed by msgid "followed-by" msgstr "Peşinden gelen film" # Default: Follows msgid "follows" msgstr "Peşinden geldiği film" # Default: For msgid "for" msgstr "Film" # Default: Frequency response 
msgid "frequency-response" msgstr "" # Default: From msgid "from" msgstr "" # Default: Full article link msgid "full-article-link" msgstr "" # Default: Full size cover url msgid "full-size-cover-url" msgstr "" # Default: Full size headshot msgid "full-size-headshot" msgstr "" # Default: Genres msgid "genres" msgstr "Türler" # Default: Goofs msgid "goofs" msgstr "Hatalar" # Default: Gross msgid "gross" msgstr "Hasılat" # Default: Group genre msgid "group-genre" msgstr "" # Default: Headshot msgid "headshot" msgstr "Resim" # Default: Height msgid "height" msgstr "Boy" # Default: Imdbindex msgid "imdbindex" msgstr "" # Default: In development msgid "in-development" msgstr "" # Default: Interview msgid "interview" msgstr "Söyleşi" # Default: Interviews msgid "interviews" msgstr "Söyleşiler" # Default: Introduction msgid "introduction" msgstr "İlk filmi" # Default: Item msgid "item" msgstr "" # Default: Keywords msgid "keywords" msgstr "Anahtar sözcükler" # Default: Kind msgid "kind" msgstr "Tip" # Default: Label msgid "label" msgstr "" # Default: Laboratory msgid "laboratory" msgstr "Laboratuar" # Default: Language msgid "language" msgstr "Dil" # Default: Languages msgid "languages" msgstr "Diller" # Default: Laserdisc msgid "laserdisc" msgstr "Lazer Disk" # Default: Laserdisc title msgid "laserdisc-title" msgstr "" # Default: Length msgid "length" msgstr "Süre" # Default: Line msgid "line" msgstr "Replik" # Default: Link msgid "link" msgstr "Bağlantı" # Default: Link text msgid "link-text" msgstr "Bağlantı metni" # Default: Literature msgid "literature" msgstr "Edebiyat" # Default: Locations msgid "locations" msgstr "Çekim yerleri" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "IMDb uzun bölüm başlığı" # Default: Long imdb name msgid "long-imdb-name" msgstr "IMDb uzun ismi" # 
Default: Long imdb title msgid "long-imdb-title" msgstr "IMDb uzun başlığı" # Default: Magazine cover photo msgid "magazine-cover-photo" msgstr "Dergi kapağı resmi" # Default: Make up msgid "make-up" msgstr "Makyaj" # Default: Master format msgid "master-format" msgstr "Master format" # Default: Median msgid "median" msgstr "Orta değer" # Default: Merchandising links msgid "merchandising-links" msgstr "" # Default: Mini biography msgid "mini-biography" msgstr "Mini biyografi" # Default: Misc links msgid "misc-links" msgstr "" # Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "" # Default: Movie msgid "movie" msgstr "Film" # Default: Mpaa msgid "mpaa" msgstr "MPAA" # Default: Music department msgid "music-department" msgstr "Müzik departmanı" # Default: Name msgid "name" msgstr "İsim" # Default: News msgid "news" msgstr "Haberler" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "Haber grubu eleştirileri" # Default: Nick names msgid "nick-names" msgstr "Takma isimler" # Default: Notes msgid "notes" msgstr "Notlar" # Default: Novel msgid "novel" msgstr "Roman" # Default: Number msgid "number" msgstr "Sayı" # Default: Number of chapter stops msgid "number-of-chapter-stops" msgstr "" # Default: Number of episodes msgid "number-of-episodes" msgstr "Bölüm sayısı" # Default: Number of seasons msgid "number-of-seasons" msgstr "Sezon sayısı" # Default: Number of sides msgid "number-of-sides" msgstr "" # Default: Number of votes msgid "number-of-votes" msgstr "Oy sayısı" # Default: Official retail price msgid "official-retail-price" msgstr "Resmi perakende satış fiyatı" # Default: Official sites msgid "official-sites" msgstr "Resmi siteler" # Default: Opening weekend msgid "opening-weekend" msgstr "Açılış haftasonu" # Default: Original air date msgid "original-air-date" msgstr "İlk yayımlanma tarihi" # Default: Original music msgid "original-music" msgstr "Orijinal müzik" 
# Default: Original title msgid "original-title" msgstr "" # Default: Other literature msgid "other-literature" msgstr "" # Default: Other works msgid "other-works" msgstr "Diğer çalışmalar" # Default: Parents guide msgid "parents-guide" msgstr "Ana-baba kılavuzu" # Default: Performed by msgid "performed-by" msgstr "İcra eden" # Default: Person msgid "person" msgstr "Kişi" # Default: Photo sites msgid "photo-sites" msgstr "Fotoğraf siteleri" # Default: Pictorial msgid "pictorial" msgstr "" # Default: Picture format msgid "picture-format" msgstr "Resim formatı" # Default: Plot msgid "plot" msgstr "Konu" # Default: Plot outline msgid "plot-outline" msgstr "Konu kısa özeti" # Default: Portrayed in msgid "portrayed-in" msgstr "" # Default: Pressing plant msgid "pressing-plant" msgstr "" # Default: Printed film format msgid "printed-film-format" msgstr "Basılı film formatı" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "Basın eleştirileri" # Default: Producer msgid "producer" msgstr "Yapımcı" # Default: Production companies msgid "production-companies" msgstr "Yapım şirketleri" # Default: Production country msgid "production-country" msgstr "Yapımcı ülke" # Default: Production dates msgid "production-dates" msgstr "Yapım tarihleri" # Default: Production design msgid "production-design" msgstr "Yapım tasarımı" # Default: Production designer msgid "production-designer" msgstr "Yapım tasarımcısı" # Default: Production manager msgid "production-manager" msgstr "Yapım yöneticisi" # Default: Production process protocol msgid "production-process-protocol" msgstr "" # Default: Quality of source msgid "quality-of-source" msgstr "" # Default: Quality program msgid "quality-program" msgstr "" # Default: Quote msgid "quote" msgstr "Alıntı" # Default: Quotes msgid "quotes" msgstr "Alıntılar" # Default: Rating msgid "rating" msgstr "Puan" # Default: Recommendations msgid "recommendations" msgstr "Tavsiyeler" # Default: Referenced in msgid "referenced-in" 
msgstr "Gönderme yapılan filmler" # Default: References msgid "references" msgstr "Gönderme yaptığı filmler" # Default: Region msgid "region" msgstr "Bölge" # Default: Release country msgid "release-country" msgstr "" # Default: Release date msgid "release-date" msgstr "" # Default: Release dates msgid "release-dates" msgstr "" # Default: Remade as msgid "remade-as" msgstr "Yeniden çekilişi" # Default: Remake of msgid "remake-of" msgstr "Yeniden çekimi olduğu film" # Default: Rentals msgid "rentals" msgstr "Kiralamalar" # Default: Result msgid "result" msgstr "Sonuç" # Default: Review msgid "review" msgstr "Eleştiri" # Default: Review author msgid "review-author" msgstr "Eleştiri yazarı" # Default: Review kind msgid "review-kind" msgstr "Eleştiri tipi" # Default: Runtime msgid "runtime" msgstr "Süre" # Default: Runtimes msgid "runtimes" msgstr "Süreler" # Default: Salary history msgid "salary-history" msgstr "Ücret tarihçesi" # Default: Screenplay teleplay msgid "screenplay-teleplay" msgstr "Senaryo" # Default: Season msgid "season" msgstr "Sezon" # Default: Second unit director or assistant director msgid "second-unit-director-or-assistant-director" msgstr "İkinci birim yönetmeni ya da yardımcı yönetmen" # Default: Self msgid "self" msgstr "Kendisi" # Default: Series animation department msgid "series-animation-department" msgstr "Dizinin animasyon departmanı" # Default: Series art department msgid "series-art-department" msgstr "Dizinin sanat departmanı" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "Dizinin yardımcı yönetmenleri" # Default: Series camera department msgid "series-camera-department" msgstr "Dizinin kamera departmanı" # Default: Series casting department msgid "series-casting-department" msgstr "Dizinin oyuncu seçimi departmanı" # Default: Series cinematographers msgid "series-cinematographers" msgstr "Dizinin kameramanları" # Default: Series costume department msgid "series-costume-department" msgstr "Dizinin
kostüm departmanı" # Default: Series editorial department msgid "series-editorial-department" msgstr "Dizinin montaj departmanı" # Default: Series editors msgid "series-editors" msgstr "Dizinin montajcıları" # Default: Series make up department msgid "series-make-up-department" msgstr "Dizinin makyaj departmanı" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "" # Default: Series music department msgid "series-music-department" msgstr "Dizinin müzik departmanı" # Default: Series producers msgid "series-producers" msgstr "Dizinin yapımcıları" # Default: Series production designers msgid "series-production-designers" msgstr "Dizinin yapım tasarımcıları" # Default: Series production managers msgid "series-production-managers" msgstr "Dizinin yapım yöneticileri" # Default: Series sound department msgid "series-sound-department" msgstr "Dizinin ses departmanı" # Default: Series special effects department msgid "series-special-effects-department" msgstr "Dizinin özel efekt departmanı" # Default: Series stunts msgid "series-stunts" msgstr "Dizinin dublörleri" # Default: Series title msgid "series-title" msgstr "Dizinin başlığı" # Default: Series transportation department msgid "series-transportation-department" msgstr "Dizinin ulaşım departmanı" # Default: Series visual effects department msgid "series-visual-effects-department" msgstr "Dizinin görsel efekt departmanı" # Default: Series writers msgid "series-writers" msgstr "Dizinin yazarları" # Default: Series years msgid "series-years" msgstr "Dizinin yılları" # Default: Set decoration msgid "set-decoration" msgstr "Set dekorasyonu" # Default: Sharpness msgid "sharpness" msgstr "Keskinlik" # Default: Similar to msgid "similar-to" msgstr "Benzer" # Default: Smart canonical episode title msgid "smart-canonical-episode-title" msgstr "" # Default: Smart canonical series title msgid "smart-canonical-series-title" msgstr "" # Default: Smart canonical title msgid "smart-canonical-title" msgstr "" # Default: 
Smart long imdb canonical title msgid "smart-long-imdb-canonical-title" msgstr "" # Default: Sound clips msgid "sound-clips" msgstr "Ses klipleri" # Default: Sound crew msgid "sound-crew" msgstr "Ses ekibi" # Default: Sound encoding msgid "sound-encoding" msgstr "Ses kodlaması" # Default: Sound mix msgid "sound-mix" msgstr "" # Default: Soundtrack msgid "soundtrack" msgstr "Film müzikleri" # Default: Spaciality msgid "spaciality" msgstr "" # Default: Special effects msgid "special-effects" msgstr "Özel efektler" # Default: Special effects companies msgid "special-effects-companies" msgstr "Özel efekt şirketleri" # Default: Special effects department msgid "special-effects-department" msgstr "Özel efekt departmanı" # Default: Spin off msgid "spin-off" msgstr "" # Default: Spin off from msgid "spin-off-from" msgstr "" # Default: Spoofed in msgid "spoofed-in" msgstr "Dalga geçildiği filmler" # Default: Spoofs msgid "spoofs" msgstr "Dalga geçtiği filmler" # Default: Spouse msgid "spouse" msgstr "Eşi" # Default: Status of availablility msgid "status-of-availablility" msgstr "" # Default: Studio msgid "studio" msgstr "Stüdyo" # Default: Studios msgid "studios" msgstr "Stüdyolar" # Default: Stunt performer msgid "stunt-performer" msgstr "" # Default: Stunts msgid "stunts" msgstr "Dublörler" # Default: Subtitles msgid "subtitles" msgstr "Altyazılar" # Default: Supplement msgid "supplement" msgstr "" # Default: Supplements msgid "supplements" msgstr "" # Default: Synopsis msgid "synopsis" msgstr "Sinopsis" # Default: Taglines msgid "taglines" msgstr "Spotlar" # Default: Tech info msgid "tech-info" msgstr "Teknik bilgi" # Default: Thanks msgid "thanks" msgstr "Teşekkürler" # Default: Time msgid "time" msgstr "Zaman" # Default: Title msgid "title" msgstr "Başlık" # Default: Titles in this product msgid "titles-in-this-product" msgstr "Bu üründeki başlıklar" # Default: To msgid "to" msgstr "Alan" # Default: Top 250 rank msgid "top-250-rank" msgstr "En iyi 250 içindeki sırası" 
# Default: Trade mark msgid "trade-mark" msgstr "Kendine has özelliği" # Default: Transportation department msgid "transportation-department" msgstr "Ulaşım departmanı" # Default: Trivia msgid "trivia" msgstr "İlginç notlar" # Default: Tv msgid "tv" msgstr "" # Default: Under license from msgid "under-license-from" msgstr "" # Default: Unknown link msgid "unknown-link" msgstr "" # Default: Upc msgid "upc" msgstr "" # Default: Version of msgid "version-of" msgstr "" # Default: Vhs msgid "vhs" msgstr "VHS" # Default: Video msgid "video" msgstr "" # Default: Video artifacts msgid "video-artifacts" msgstr "" # Default: Video clips msgid "video-clips" msgstr "Video klipleri" # Default: Video noise msgid "video-noise" msgstr "Video gürültüsü" # Default: Video quality msgid "video-quality" msgstr "Video kalitesi" # Default: Video standard msgid "video-standard" msgstr "Video standardı" # Default: Visual effects msgid "visual-effects" msgstr "Görsel efektler" # Default: Votes msgid "votes" msgstr "Oylar" # Default: Votes distribution msgid "votes-distribution" msgstr "Oyların dağılımı" # Default: Weekend gross msgid "weekend-gross" msgstr "Haftasonu hasılatı" # Default: Where now msgid "where-now" msgstr "Şu anda nerede" # Default: With msgid "with" msgstr "" # Default: Writer msgid "writer" msgstr "Yazar" # Default: Written by msgid "written-by" msgstr "Yazan" # Default: Year msgid "year" msgstr "Yıl" # Default: Zshops msgid "zshops" msgstr "ZShops" imdbpy-6.8/imdb/locale/imdbpy.pot000066400000000000000000000446631351454127000170540ustar00rootroot00000000000000# Gettext message file for imdbpy msgid "" msgstr "" "Project-Id-Version: imdbpy\n" "POT-Creation-Date: 2010-03-18 14:35+0000\n" "PO-Revision-Date: YYYY-MM-DD HH:MM+0000\n" "Last-Translator: YOUR NAME \n" "Language-Team: TEAM NAME \n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=1; plural=0;\n" "Language-Code: en\n" "Language-Name: 
English\n" "Preferred-Encodings: utf-8\n" "Domain: imdbpy\n" # Default: Actor msgid "actor" msgstr "" # Default: Actress msgid "actress" msgstr "" # Default: Adaption msgid "adaption" msgstr "" # Default: Additional information msgid "additional-information" msgstr "" # Default: Admissions msgid "admissions" msgstr "" # Default: Agent address msgid "agent-address" msgstr "" # Default: Airing msgid "airing" msgstr "" # Default: Akas msgid "akas" msgstr "" # Default: Akas from release info msgid "akas-from-release-info" msgstr "" # Default: All products msgid "all-products" msgstr "" # Default: Alternate language version of msgid "alternate-language-version-of" msgstr "" # Default: Alternate versions msgid "alternate-versions" msgstr "" # Default: Amazon reviews msgid "amazon-reviews" msgstr "" # Default: Analog left msgid "analog-left" msgstr "" # Default: Analog right msgid "analog-right" msgstr "" # Default: Animation department msgid "animation-department" msgstr "" # Default: Archive footage msgid "archive-footage" msgstr "" # Default: Arithmetic mean msgid "arithmetic-mean" msgstr "" # Default: Art department msgid "art-department" msgstr "" # Default: Art direction msgid "art-direction" msgstr "" # Default: Art director msgid "art-director" msgstr "" # Default: Article msgid "article" msgstr "" # Default: Asin msgid "asin" msgstr "" # Default: Aspect ratio msgid "aspect-ratio" msgstr "" # Default: Assigner msgid "assigner" msgstr "" # Default: Assistant director msgid "assistant-director" msgstr "" # Default: Auctions msgid "auctions" msgstr "" # Default: Audio noise msgid "audio-noise" msgstr "" # Default: Audio quality msgid "audio-quality" msgstr "" # Default: Award msgid "award" msgstr "" # Default: Awards msgid "awards" msgstr "" # Default: Biographical movies msgid "biographical-movies" msgstr "" # Default: Biography msgid "biography" msgstr "" # Default: Biography print msgid "biography-print" msgstr "" # Default: Birth date msgid "birth-date" msgstr "" 
# Default: Birth name msgid "birth-name" msgstr "" # Default: Birth notes msgid "birth-notes" msgstr "" # Default: Body msgid "body" msgstr "" # Default: Book msgid "book" msgstr "" # Default: Books msgid "books" msgstr "" # Default: Bottom 100 rank msgid "bottom-100-rank" msgstr "" # Default: Budget msgid "budget" msgstr "" # Default: Business msgid "business" msgstr "" # Default: By arrangement with msgid "by-arrangement-with" msgstr "" # Default: Camera msgid "camera" msgstr "" # Default: Camera and electrical department msgid "camera-and-electrical-department" msgstr "" # Default: Canonical episode title msgid "canonical-episode-title" msgstr "" # Default: Canonical name msgid "canonical-name" msgstr "" # Default: Canonical series title msgid "canonical-series-title" msgstr "" # Default: Canonical title msgid "canonical-title" msgstr "" # Default: Cast msgid "cast" msgstr "" # Default: Casting department msgid "casting-department" msgstr "" # Default: Casting director msgid "casting-director" msgstr "" # Default: Catalog number msgid "catalog-number" msgstr "" # Default: Category msgid "category" msgstr "" # Default: Certificate msgid "certificate" msgstr "" # Default: Certificates msgid "certificates" msgstr "" # Default: Certification msgid "certification" msgstr "" # Default: Channel msgid "channel" msgstr "" # Default: Character msgid "character" msgstr "" # Default: Cinematographer msgid "cinematographer" msgstr "" # Default: Cinematographic process msgid "cinematographic-process" msgstr "" # Default: Close captions teletext ld g msgid "close-captions-teletext-ld-g" msgstr "" # Default: Color info msgid "color-info" msgstr "" # Default: Color information msgid "color-information" msgstr "" # Default: Color rendition msgid "color-rendition" msgstr "" # Default: Company msgid "company" msgstr "" # Default: Complete cast msgid "complete-cast" msgstr "" # Default: Complete crew msgid "complete-crew" msgstr "" # Default: Composer msgid "composer" msgstr "" # 
Default: Connections msgid "connections" msgstr "" # Default: Contrast msgid "contrast" msgstr "" # Default: Copyright holder msgid "copyright-holder" msgstr "" # Default: Costume department msgid "costume-department" msgstr "" # Default: Costume designer msgid "costume-designer" msgstr "" # Default: Countries msgid "countries" msgstr "" # Default: Country msgid "country" msgstr "" # Default: Courtesy of msgid "courtesy-of" msgstr "" # Default: Cover msgid "cover" msgstr "" # Default: Cover url msgid "cover-url" msgstr "" # Default: Crazy credits msgid "crazy-credits" msgstr "" # Default: Creator msgid "creator" msgstr "" # Default: Current role msgid "current-role" msgstr "" # Default: Database msgid "database" msgstr "" # Default: Date msgid "date" msgstr "" # Default: Death date msgid "death-date" msgstr "" # Default: Death notes msgid "death-notes" msgstr "" # Default: Demographic msgid "demographic" msgstr "" # Default: Description msgid "description" msgstr "" # Default: Dialogue intellegibility msgid "dialogue-intellegibility" msgstr "" # Default: Digital sound msgid "digital-sound" msgstr "" # Default: Director msgid "director" msgstr "" # Default: Disc format msgid "disc-format" msgstr "" # Default: Disc size msgid "disc-size" msgstr "" # Default: Distributors msgid "distributors" msgstr "" # Default: Dvd msgid "dvd" msgstr "" # Default: Dvd features msgid "dvd-features" msgstr "" # Default: Dvd format msgid "dvd-format" msgstr "" # Default: Dvds msgid "dvds" msgstr "" # Default: Dynamic range msgid "dynamic-range" msgstr "" # Default: Edited from msgid "edited-from" msgstr "" # Default: Edited into msgid "edited-into" msgstr "" # Default: Editor msgid "editor" msgstr "" # Default: Editorial department msgid "editorial-department" msgstr "" # Default: Episode msgid "episode" msgstr "" # Default: Episode of msgid "episode-of" msgstr "" # Default: Episode title msgid "episode-title" msgstr "" # Default: Episodes msgid "episodes" msgstr "" # Default: Episodes 
rating msgid "episodes-rating" msgstr "" # Default: Essays msgid "essays" msgstr "" # Default: External reviews msgid "external-reviews" msgstr "" # Default: Faqs msgid "faqs" msgstr "" # Default: Feature msgid "feature" msgstr "" # Default: Featured in msgid "featured-in" msgstr "" # Default: Features msgid "features" msgstr "" # Default: Film negative format msgid "film-negative-format" msgstr "" # Default: Filming dates msgid "filming-dates" msgstr "" # Default: Filmography msgid "filmography" msgstr "" # Default: Followed by msgid "followed-by" msgstr "" # Default: Follows msgid "follows" msgstr "" # Default: For msgid "for" msgstr "" # Default: Frequency response msgid "frequency-response" msgstr "" # Default: From msgid "from" msgstr "" # Default: Full article link msgid "full-article-link" msgstr "" # Default: Full size cover url msgid "full-size-cover-url" msgstr "" # Default: Full size headshot msgid "full-size-headshot" msgstr "" # Default: Genres msgid "genres" msgstr "" # Default: Goofs msgid "goofs" msgstr "" # Default: Gross msgid "gross" msgstr "" # Default: Group genre msgid "group-genre" msgstr "" # Default: Headshot msgid "headshot" msgstr "" # Default: Height msgid "height" msgstr "" # Default: Imdbindex msgid "imdbindex" msgstr "" # Default: In development msgid "in-development" msgstr "" # Default: Interview msgid "interview" msgstr "" # Default: Interviews msgid "interviews" msgstr "" # Default: Introduction msgid "introduction" msgstr "" # Default: Item msgid "item" msgstr "" # Default: Keywords msgid "keywords" msgstr "" # Default: Kind msgid "kind" msgstr "" # Default: Label msgid "label" msgstr "" # Default: Laboratory msgid "laboratory" msgstr "" # Default: Language msgid "language" msgstr "" # Default: Languages msgid "languages" msgstr "" # Default: Laserdisc msgid "laserdisc" msgstr "" # Default: Laserdisc title msgid "laserdisc-title" msgstr "" # Default: Length msgid "length" msgstr "" # Default: Line msgid "line" msgstr "" # 
Default: Link msgid "link" msgstr "" # Default: Link text msgid "link-text" msgstr "" # Default: Literature msgid "literature" msgstr "" # Default: Locations msgid "locations" msgstr "" # Default: Long imdb canonical name msgid "long-imdb-canonical-name" msgstr "" # Default: Long imdb canonical title msgid "long-imdb-canonical-title" msgstr "" # Default: Long imdb episode title msgid "long-imdb-episode-title" msgstr "" # Default: Long imdb name msgid "long-imdb-name" msgstr "" # Default: Long imdb title msgid "long-imdb-title" msgstr "" # Default: Magazine cover photo msgid "magazine-cover-photo" msgstr "" # Default: Make up msgid "make-up" msgstr "" # Default: Master format msgid "master-format" msgstr "" # Default: Median msgid "median" msgstr "" # Default: Merchandising links msgid "merchandising-links" msgstr "" # Default: Mini biography msgid "mini-biography" msgstr "" # Default: Misc links msgid "misc-links" msgstr "" # Default: Miscellaneous companies msgid "miscellaneous-companies" msgstr "" # Default: Miscellaneous crew msgid "miscellaneous-crew" msgstr "" # Default: Movie msgid "movie" msgstr "" # Default: Mpaa msgid "mpaa" msgstr "" # Default: Music department msgid "music-department" msgstr "" # Default: Name msgid "name" msgstr "" # Default: News msgid "news" msgstr "" # Default: Newsgroup reviews msgid "newsgroup-reviews" msgstr "" # Default: Nick names msgid "nick-names" msgstr "" # Default: Notes msgid "notes" msgstr "" # Default: Novel msgid "novel" msgstr "" # Default: Number msgid "number" msgstr "" # Default: Number of chapter stops msgid "number-of-chapter-stops" msgstr "" # Default: Number of episodes msgid "number-of-episodes" msgstr "" # Default: Number of seasons msgid "number-of-seasons" msgstr "" # Default: Number of sides msgid "number-of-sides" msgstr "" # Default: Number of votes msgid "number-of-votes" msgstr "" # Default: Official retail price msgid "official-retail-price" msgstr "" # Default: Official sites msgid "official-sites" 
msgstr "" # Default: Opening weekend msgid "opening-weekend" msgstr "" # Default: Original air date msgid "original-air-date" msgstr "" # Default: Original music msgid "original-music" msgstr "" # Default: Original title msgid "original-title" msgstr "" # Default: Other literature msgid "other-literature" msgstr "" # Default: Other works msgid "other-works" msgstr "" # Default: Parents guide msgid "parents-guide" msgstr "" # Default: Performed by msgid "performed-by" msgstr "" # Default: Person msgid "person" msgstr "" # Default: Photo sites msgid "photo-sites" msgstr "" # Default: Pictorial msgid "pictorial" msgstr "" # Default: Picture format msgid "picture-format" msgstr "" # Default: Plot msgid "plot" msgstr "" # Default: Plot outline msgid "plot-outline" msgstr "" # Default: Portrayed in msgid "portrayed-in" msgstr "" # Default: Pressing plant msgid "pressing-plant" msgstr "" # Default: Printed film format msgid "printed-film-format" msgstr "" # Default: Printed media reviews msgid "printed-media-reviews" msgstr "" # Default: Producer msgid "producer" msgstr "" # Default: Production companies msgid "production-companies" msgstr "" # Default: Production country msgid "production-country" msgstr "" # Default: Production dates msgid "production-dates" msgstr "" # Default: Production design msgid "production-design" msgstr "" # Default: Production designer msgid "production-designer" msgstr "" # Default: Production manager msgid "production-manager" msgstr "" # Default: Production process protocol msgid "production-process-protocol" msgstr "" # Default: Quality of source msgid "quality-of-source" msgstr "" # Default: Quality program msgid "quality-program" msgstr "" # Default: Quote msgid "quote" msgstr "" # Default: Quotes msgid "quotes" msgstr "" # Default: Rating msgid "rating" msgstr "" # Default: Recommendations msgid "recommendations" msgstr "" # Default: Referenced in msgid "referenced-in" msgstr "" # Default: References msgid "references" msgstr "" # 
Default: Region msgid "region" msgstr "" # Default: Release country msgid "release-country" msgstr "" # Default: Release date msgid "release-date" msgstr "" # Default: Release dates msgid "release-dates" msgstr "" # Default: Remade as msgid "remade-as" msgstr "" # Default: Remake of msgid "remake-of" msgstr "" # Default: Rentals msgid "rentals" msgstr "" # Default: Result msgid "result" msgstr "" # Default: Review msgid "review" msgstr "" # Default: Review author msgid "review-author" msgstr "" # Default: Review kind msgid "review-kind" msgstr "" # Default: Runtime msgid "runtime" msgstr "" # Default: Runtimes msgid "runtimes" msgstr "" # Default: Salary history msgid "salary-history" msgstr "" # Default: Screenplay teleplay msgid "screenplay-teleplay" msgstr "" # Default: Season msgid "season" msgstr "" # Default: Second unit director or assistant director msgid "second-unit-director-or-assistant-director" msgstr "" # Default: Self msgid "self" msgstr "" # Default: Series animation department msgid "series-animation-department" msgstr "" # Default: Series art department msgid "series-art-department" msgstr "" # Default: Series assistant directors msgid "series-assistant-directors" msgstr "" # Default: Series camera department msgid "series-camera-department" msgstr "" # Default: Series casting department msgid "series-casting-department" msgstr "" # Default: Series cinematographers msgid "series-cinematographers" msgstr "" # Default: Series costume department msgid "series-costume-department" msgstr "" # Default: Series editorial department msgid "series-editorial-department" msgstr "" # Default: Series editors msgid "series-editors" msgstr "" # Default: Series make up department msgid "series-make-up-department" msgstr "" # Default: Series miscellaneous msgid "series-miscellaneous" msgstr "" # Default: Series music department msgid "series-music-department" msgstr "" # Default: Series producers msgid "series-producers" msgstr "" # Default: Series production 
designers msgid "series-production-designers" msgstr "" # Default: Series production managers msgid "series-production-managers" msgstr "" # Default: Series sound department msgid "series-sound-department" msgstr "" # Default: Series special effects department msgid "series-special-effects-department" msgstr "" # Default: Series stunts msgid "series-stunts" msgstr "" # Default: Series title msgid "series-title" msgstr "" # Default: Series transportation department msgid "series-transportation-department" msgstr "" # Default: Series visual effects department msgid "series-visual-effects-department" msgstr "" # Default: Series writers msgid "series-writers" msgstr "" # Default: Series years msgid "series-years" msgstr "" # Default: Set decoration msgid "set-decoration" msgstr "" # Default: Sharpness msgid "sharpness" msgstr "" # Default: Similar to msgid "similar-to" msgstr "" # Default: Smart canonical episode title msgid "smart-canonical-episode-title" msgstr "" # Default: Smart canonical series title msgid "smart-canonical-series-title" msgstr "" # Default: Smart canonical title msgid "smart-canonical-title" msgstr "" # Default: Smart long imdb canonical title msgid "smart-long-imdb-canonical-title" msgstr "" # Default: Sound clips msgid "sound-clips" msgstr "" # Default: Sound crew msgid "sound-crew" msgstr "" # Default: Sound encoding msgid "sound-encoding" msgstr "" # Default: Sound mix msgid "sound-mix" msgstr "" # Default: Soundtrack msgid "soundtrack" msgstr "" # Default: Spaciality msgid "spaciality" msgstr "" # Default: Special effects msgid "special-effects" msgstr "" # Default: Special effects companies msgid "special-effects-companies" msgstr "" # Default: Special effects department msgid "special-effects-department" msgstr "" # Default: Spin off msgid "spin-off" msgstr "" # Default: Spin off from msgid "spin-off-from" msgstr "" # Default: Spoofed in msgid "spoofed-in" msgstr "" # Default: Spoofs msgid "spoofs" msgstr "" # Default: Spouse msgid "spouse" 
msgstr "" # Default: Status of availablility msgid "status-of-availablility" msgstr "" # Default: Studio msgid "studio" msgstr "" # Default: Studios msgid "studios" msgstr "" # Default: Stunt performer msgid "stunt-performer" msgstr "" # Default: Stunts msgid "stunts" msgstr "" # Default: Subtitles msgid "subtitles" msgstr "" # Default: Supplement msgid "supplement" msgstr "" # Default: Supplements msgid "supplements" msgstr "" # Default: Synopsis msgid "synopsis" msgstr "" # Default: Taglines msgid "taglines" msgstr "" # Default: Tech info msgid "tech-info" msgstr "" # Default: Thanks msgid "thanks" msgstr "" # Default: Time msgid "time" msgstr "" # Default: Title msgid "title" msgstr "" # Default: Titles in this product msgid "titles-in-this-product" msgstr "" # Default: To msgid "to" msgstr "" # Default: Top 250 rank msgid "top-250-rank" msgstr "" # Default: Trade mark msgid "trade-mark" msgstr "" # Default: Transportation department msgid "transportation-department" msgstr "" # Default: Trivia msgid "trivia" msgstr "" # Default: Tv msgid "tv" msgstr "" # Default: Under license from msgid "under-license-from" msgstr "" # Default: Unknown link msgid "unknown-link" msgstr "" # Default: Upc msgid "upc" msgstr "" # Default: Version of msgid "version-of" msgstr "" # Default: Vhs msgid "vhs" msgstr "" # Default: Video msgid "video" msgstr "" # Default: Video artifacts msgid "video-artifacts" msgstr "" # Default: Video clips msgid "video-clips" msgstr "" # Default: Video noise msgid "video-noise" msgstr "" # Default: Video quality msgid "video-quality" msgstr "" # Default: Video standard msgid "video-standard" msgstr "" # Default: Visual effects msgid "visual-effects" msgstr "" # Default: Votes msgid "votes" msgstr "" # Default: Votes distribution msgid "votes-distribution" msgstr "" # Default: Weekend gross msgid "weekend-gross" msgstr "" # Default: Where now msgid "where-now" msgstr "" # Default: With msgid "with" msgstr "" # Default: Writer msgid "writer" msgstr "" 
# Default: Written by msgid "written-by" msgstr "" # Default: Year msgid "year" msgstr "" # Default: Zshops msgid "zshops" msgstr "" imdbpy-6.8/imdb/locale/msgfmt.py000066400000000000000000000156201351454127000167020ustar00rootroot00000000000000#!/usr/bin/env python3 # -*- coding: utf-8 -*- # Written by Martin v. Löwis """Generate binary message catalog from textual translation description. This program converts a textual Uniforum-style message catalog (.po file) into a binary GNU catalog (.mo file). This is essentially the same function as the GNU msgfmt program, however, it is a simpler implementation. Usage: msgfmt.py [OPTIONS] filename.po Options: -o file --output-file=file Specify the output file to write to. If omitted, output will go to a file named filename.mo (based off the input file name). -h --help Print this message and exit. -V --version Display version information and exit. """ import os import sys import ast import getopt import struct import array from email.parser import HeaderParser __version__ = "1.1" MESSAGES = {} def usage(code, msg=''): print(__doc__, file=sys.stderr) if msg: print(msg, file=sys.stderr) sys.exit(code) def add(id, str, fuzzy): "Add a non-fuzzy translation to the dictionary." global MESSAGES if not fuzzy and str: MESSAGES[id] = str def generate(): "Return the generated output." global MESSAGES # the keys are sorted in the .mo file keys = sorted(MESSAGES.keys()) offsets = [] ids = strs = b'' for id in keys: # For each string, we need size and file offset. Each string is NUL # terminated; the NUL does not count into the size. offsets.append((len(ids), len(id), len(strs), len(MESSAGES[id]))) ids += id + b'\0' strs += MESSAGES[id] + b'\0' output = '' # The header is 7 32-bit unsigned integers. We don't use hash tables, so # the keys start right after the index tables. # translated string. 
keystart = 7*4+16*len(keys) # and the values start after the keys valuestart = keystart + len(ids) koffsets = [] voffsets = [] # The string table first has the list of keys, then the list of values. # Each entry has first the size of the string, then the file offset. for o1, l1, o2, l2 in offsets: koffsets += [l1, o1+keystart] voffsets += [l2, o2+valuestart] offsets = koffsets + voffsets output = struct.pack("Iiiiiii", 0x950412de, # Magic 0, # Version len(keys), # # of entries 7*4, # start of key index 7*4+len(keys)*8, # start of value index 0, 0) # size and offset of hash table output += array.array("i", offsets).tobytes() output += ids output += strs return output def make(filename, outfile): ID = 1 STR = 2 # Compute .mo name from .po name and arguments if filename.endswith('.po'): infile = filename else: infile = filename + '.po' if outfile is None: outfile = os.path.splitext(infile)[0] + '.mo' try: lines = open(infile, 'rb').readlines() except IOError as msg: print(msg, file=sys.stderr) sys.exit(1) section = None fuzzy = 0 # Start off assuming Latin-1, so everything decodes without failure, # until we know the exact encoding encoding = 'latin-1' # Parse the catalog lno = 0 for l in lines: l = l.decode(encoding) lno += 1 # If we get a comment line after a msgstr, this is a new entry if l[0] == '#' and section == STR: add(msgid, msgstr, fuzzy) section = None fuzzy = 0 # Record a fuzzy mark if l[:2] == '#,' and 'fuzzy' in l: fuzzy = 1 # Skip comments if l[0] == '#': continue # Now we are in a msgid section, output previous section if l.startswith('msgid') and not l.startswith('msgid_plural'): if section == STR: add(msgid, msgstr, fuzzy) if not msgid: # See whether there is an encoding declaration p = HeaderParser() charset = p.parsestr(msgstr.decode(encoding)).get_content_charset() if charset: encoding = charset section = ID l = l[5:] msgid = msgstr = b'' is_plural = False # This is a message with plural forms elif l.startswith('msgid_plural'): if section != ID: 
print('msgid_plural not preceded by msgid on %s:%d' % (infile, lno), file=sys.stderr) sys.exit(1) l = l[12:] msgid += b'\0' # separator of singular and plural is_plural = True # Now we are in a msgstr section elif l.startswith('msgstr'): section = STR if l.startswith('msgstr['): if not is_plural: print('plural without msgid_plural on %s:%d' % (infile, lno), file=sys.stderr) sys.exit(1) l = l.split(']', 1)[1] if msgstr: msgstr += b'\0' # Separator of the various plural forms else: if is_plural: print('indexed msgstr required for plural on %s:%d' % (infile, lno), file=sys.stderr) sys.exit(1) l = l[6:] # Skip empty lines l = l.strip() if not l: continue l = ast.literal_eval(l) if section == ID: msgid += l.encode(encoding) elif section == STR: msgstr += l.encode(encoding) else: print('Syntax error on %s:%d' % (infile, lno), \ 'before:', file=sys.stderr) print(l, file=sys.stderr) sys.exit(1) # Add last entry if section == STR: add(msgid, msgstr, fuzzy) # Compute output output = generate() try: open(outfile,"wb").write(output) except IOError as msg: print(msg, file=sys.stderr) def main(): try: opts, args = getopt.getopt(sys.argv[1:], 'hVo:', ['help', 'version', 'output-file=']) except getopt.error as msg: usage(1, msg) outfile = None # parse options for opt, arg in opts: if opt in ('-h', '--help'): usage(0) elif opt in ('-V', '--version'): print("msgfmt.py", __version__) sys.exit(0) elif opt in ('-o', '--output-file'): outfile = arg # do it if not args: print('No input file given', file=sys.stderr) print("Try `msgfmt --help' for more information.", file=sys.stderr) return for filename in args: make(filename, outfile) if __name__ == '__main__': main() imdbpy-6.8/imdb/locale/rebuildmo.py000077500000000000000000000026521351454127000173730ustar00rootroot00000000000000# Copyright 2009 H. 
Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This script builds the .mo files, from the .po files. """ import glob import os import msgfmt def rebuildmo(): lang_glob = 'imdbpy-*.po' created = [] for input_file in sorted(glob.glob(lang_glob)): lang = input_file[7:-3] if not os.path.exists(lang): os.mkdir(lang) mo_dir = os.path.join(lang, 'LC_MESSAGES') if not os.path.exists(mo_dir): os.mkdir(mo_dir) output_file = os.path.join(mo_dir, 'imdbpy.mo') msgfmt.make(input_file, output_file) created.append(lang) return created if __name__ == '__main__': languages = rebuildmo() print('Created locale for: %s.' % ' '.join(languages)) imdbpy-6.8/imdb/parser/000077500000000000000000000000001351454127000150645ustar00rootroot00000000000000imdbpy-6.8/imdb/parser/__init__.py000066400000000000000000000021031351454127000171710ustar00rootroot00000000000000# Copyright 2004-2017 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This package provides various parsers to access IMDb data, such as a parser for the web/http interface, a parser for the SQL database interface, etc. So far, the http, s3 and sql parsers are implemented. """ from __future__ import absolute_import, division, print_function, unicode_literals __all__ = ['http', 'sql', 's3'] imdbpy-6.8/imdb/parser/http/000077500000000000000000000000001351454127000160435ustar00rootroot00000000000000imdbpy-6.8/imdb/parser/http/__init__.py000066400000000000000000000711501351454127000201600ustar00rootroot00000000000000# Copyright 2004-2019 Davide Alberani # 2008 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This package provides the IMDbHTTPAccessSystem class used to access IMDb's data through the web interface. The :func:`imdb.IMDb` function will return an instance of this class when called with the ``accessSystem`` argument set to "http", "web" or "html" (this is the default). 
""" from __future__ import absolute_import, division, print_function, unicode_literals import logging import socket import ssl from codecs import lookup import warnings from imdb import PY2 from imdb import IMDbBase from imdb.utils import analyze_title from imdb._exceptions import IMDbDataAccessError, IMDbParserError from . import ( companyParser, movieParser, personParser, searchMovieParser, searchMovieAdvancedParser, searchPersonParser, searchCompanyParser, searchKeywordParser, topBottomParser ) if PY2: from urllib import quote_plus from urllib2 import HTTPSHandler, ProxyHandler, build_opener else: from urllib.parse import quote_plus from urllib.request import HTTPSHandler, ProxyHandler, build_opener # Logger for miscellaneous functions. _aux_logger = logging.getLogger('imdbpy.parser.http.aux') class _ModuleProxy: """A proxy to instantiate and access parsers.""" def __init__(self, module, defaultKeys=None): """Initialize a proxy for the given module; defaultKeys, if set, muste be a dictionary of values to set for instanced objects.""" if defaultKeys is None: defaultKeys = {} self._defaultKeys = defaultKeys self._module = module def __getattr__(self, name): """Called only when no look-up is found.""" _sm = self._module # Read the _OBJECTS dictionary to build the asked parser. if name in _sm._OBJECTS: _entry = _sm._OBJECTS[name] # Initialize the parser. kwds = {} parserClass = _entry[0][0] obj = parserClass(**kwds) attrsToSet = self._defaultKeys.copy() attrsToSet.update(_entry[1] or {}) # Set attribute to the object. for key in attrsToSet: setattr(obj, key, attrsToSet[key]) setattr(self, name, obj) return obj return getattr(_sm, name) class _FakeURLOpener(object): """Fake URLOpener object, used to return empty strings instead of errors. 
""" def __init__(self, url, headers): self.url = url self.headers = headers def read(self, *args, **kwds): return '' def close(self, *args, **kwds): pass def info(self, *args, **kwds): return self.headers class IMDbHTTPSHandler(HTTPSHandler, object): """HTTPSHandler that ignores the SSL certificate.""" def __init__(self, logger=None, *args, **kwds): self._logger = logger context = ssl.create_default_context() context.check_hostname = False context.verify_mode = ssl.CERT_NONE super(IMDbHTTPSHandler, self).__init__(context=context) def http_error_default(self, url, fp, errcode, errmsg, headers): if errcode == 404: if self._logger: self._logger.warn('404 code returned for %s: %s (headers: %s)', url, errmsg, headers) return _FakeURLOpener(url, headers) raise IMDbDataAccessError( {'url': 'http:%s' % url, 'errcode': errcode, 'errmsg': errmsg, 'headers': headers, 'error type': 'http_error_default', 'proxy': self.get_proxy()} ) def open_unknown(self, fullurl, data=None): raise IMDbDataAccessError( {'fullurl': fullurl, 'data': str(data), 'error type': 'open_unknown', 'proxy': self.get_proxy()} ) def open_unknown_proxy(self, proxy, fullurl, data=None): raise IMDbDataAccessError( {'proxy': str(proxy), 'fullurl': fullurl, 'error type': 'open_unknown_proxy', 'data': str(data)} ) class IMDbURLopener: """Fetch web pages and handle errors.""" _logger = logging.getLogger('imdbpy.parser.http.urlopener') def __init__(self, *args, **kwargs): self._last_url = '' self.https_handler = IMDbHTTPSHandler(logger=self._logger) self.proxies = {} self.addheaders = [] for header in ('User-Agent', 'User-agent', 'user-agent'): self.del_header(header) self.set_header('User-Agent', 'Mozilla/5.0') self.set_header('Accept-Language', 'en-us,en;q=0.5') def get_proxy(self): """Return the used proxy, or an empty string.""" return self.proxies.get('http', '') def set_proxy(self, proxy): """Set the proxy.""" if not proxy: if 'http' in self.proxies: del self.proxies['http'] else: if not 
proxy.lower().startswith('http://'): proxy = 'http://%s' % proxy self.proxies['http'] = proxy def set_header(self, header, value, _overwrite=True): """Set a default header.""" if _overwrite: self.del_header(header) self.addheaders.append((header, value)) def get_header(self, header): """Return the first value of a header, or None if not present.""" for index in range(len(self.addheaders)): if self.addheaders[index][0] == header: return self.addheaders[index][1] return None def del_header(self, header): """Remove a default header.""" for index in range(len(self.addheaders)): if self.addheaders[index][0] == header: del self.addheaders[index] break def retrieve_unicode(self, url, size=-1): """Retrieves the given URL, and returns a unicode string, trying to guess the encoding of the data (assuming utf8 by default)""" encode = None try: if size != -1: self.set_header('Range', 'bytes=0-%d' % size) handlers = [] if 'http' in self.proxies: proxy_handler = ProxyHandler({ 'http': self.proxies['http'], 'https': self.proxies['http'] }) handlers.append(proxy_handler) handlers.append(self.https_handler) uopener = build_opener(*handlers) uopener.addheaders = list(self.addheaders) response = uopener.open(url) content = response.read() self._last_url = response.url # Maybe the server is so nice to tell us the charset... if PY2: server_encode = response.headers.getparam('charset') or None else: server_encode = response.headers.get_content_charset(None) # Otherwise, look at the content-type HTML meta tag. # content is a byte string, so search it with bytes and decode # the charset name before passing it to codecs.lookup(). if server_encode is None and content: begin_h = content.find(b'text/html; charset=') if begin_h != -1: end_h = content[19 + begin_h:].find(b'"') if end_h != -1: server_encode = content[19 + begin_h:19 + begin_h + end_h].decode('ascii', 'replace') if server_encode: try: if lookup(server_encode): encode = server_encode except (LookupError, ValueError, TypeError): pass if size != -1: self.del_header('Range') response.close() except IOError as e: if size != -1: # Ensure that the Range header is removed. 
self.del_header('Range') raise IMDbDataAccessError( {'errcode': e.errno, 'errmsg': str(e.strerror), 'url': url, 'proxy': self.get_proxy(), 'exception type': 'IOError', 'original exception': e} ) if encode is None: encode = 'utf8' # The detection of the encoding is error prone... self._logger.warn('Unable to detect the encoding of the retrieved page [%s];' ' falling back to default utf8.', encode) if isinstance(content, str): return content return str(content, encode, 'replace') class IMDbHTTPAccessSystem(IMDbBase): """The class used to access IMDb's data through the web.""" accessSystem = 'http' _http_logger = logging.getLogger('imdbpy.parser.http') def __init__(self, adultSearch=True, proxy=-1, cookie_id=-1, timeout=30, cookie_uu=None, *arguments, **keywords): """Initialize the access system.""" IMDbBase.__init__(self, *arguments, **keywords) self.urlOpener = IMDbURLopener() self._getRefs = True self._mdparse = False self.set_timeout(timeout) if proxy != -1: self.set_proxy(proxy) _def = {'_modFunct': self._defModFunct, '_as': self.accessSystem} # Proxy objects. 
self.smProxy = _ModuleProxy(searchMovieParser, defaultKeys=_def) self.smaProxy = _ModuleProxy(searchMovieAdvancedParser, defaultKeys=_def) self.spProxy = _ModuleProxy(searchPersonParser, defaultKeys=_def) self.scompProxy = _ModuleProxy(searchCompanyParser, defaultKeys=_def) self.skProxy = _ModuleProxy(searchKeywordParser, defaultKeys=_def) self.mProxy = _ModuleProxy(movieParser, defaultKeys=_def) self.pProxy = _ModuleProxy(personParser, defaultKeys=_def) self.compProxy = _ModuleProxy(companyParser, defaultKeys=_def) self.topBottomProxy = _ModuleProxy(topBottomParser, defaultKeys=_def) def _normalize_movieID(self, movieID): """Normalize the given movieID.""" try: return '%07d' % int(movieID) except ValueError as e: raise IMDbParserError('invalid movieID "%s": %s' % (movieID, e)) def _normalize_personID(self, personID): """Normalize the given personID.""" try: return '%07d' % int(personID) except ValueError as e: raise IMDbParserError('invalid personID "%s": %s' % (personID, e)) def _normalize_companyID(self, companyID): """Normalize the given companyID.""" try: return '%07d' % int(companyID) except ValueError as e: raise IMDbParserError('invalid companyID "%s": %s' % (companyID, e)) def get_imdbMovieID(self, movieID): """Translate a movieID in an imdbID; in this implementation the movieID _is_ the imdbID. """ return movieID def get_imdbPersonID(self, personID): """Translate a personID in an imdbID; in this implementation the personID _is_ the imdbID. """ return personID def get_imdbCompanyID(self, companyID): """Translate a companyID in an imdbID; in this implementation the companyID _is_ the imdbID. """ return companyID def get_proxy(self): """Return the used proxy or an empty string.""" return self.urlOpener.get_proxy() def set_proxy(self, proxy): """Set the web proxy to use. It should be a string like 'http://localhost:8080/'; if the string is empty, no proxy will be used. If set, the value of the environment variable HTTP_PROXY is automatically used. 
""" self.urlOpener.set_proxy(proxy) def set_timeout(self, timeout): """Set the default timeout, in seconds, of the connection.""" try: timeout = int(timeout) except Exception: timeout = 0 if timeout <= 0: timeout = None socket.setdefaulttimeout(timeout) def set_cookies(self, cookie_id, cookie_uu): """Set a cookie to access an IMDb's account.""" warnings.warn("set_cookies has been deprecated") def del_cookies(self): """Remove the used cookie.""" warnings.warn("del_cookies has been deprecated") def do_adult_search(self, doAdult, cookie_id=None, cookie_uu=None): """If doAdult is true, 'adult' movies are included in the search results; cookie_id and cookie_uu are optional parameters to select a specific account (see your cookie or cookies.txt file.""" return def _retrieve(self, url, size=-1, _noCookies=False): """Retrieve the given URL.""" self._http_logger.debug('fetching url %s (size: %d)', url, size) ret = self.urlOpener.retrieve_unicode(url, size=size) if PY2 and isinstance(ret, str): ret = ret.decode('utf-8') return ret def _get_search_content(self, kind, ton, results): """Retrieve the web page for a given search. kind can be 'tt' (for titles), 'nm' (for names), or 'co' (for companies). ton is the title or the name to search. results is the maximum number of results to be retrieved.""" if PY2: params = 'q=%s&s=%s' % (quote_plus(ton.encode('utf8'), safe=''.encode('utf8')), kind.encode('utf8')) else: params = 'q=%s&s=%s' % (quote_plus(ton, safe=''), kind) if kind == 'ep': params = params.replace('s=ep&', 's=tt&ttype=ep&', 1) cont = self._retrieve(self.urls['find'] % params) # print 'URL:', imdbURL_find % params if cont.find('Your search returned more than') == -1 or \ cont.find("displayed the exact matches") == -1: return cont # The retrieved page contains no results, because too many # titles or names contain the string we're looking for. 
params = 'q=%s&ls=%s&lm=0' % (quote_plus(ton, safe=''), kind) size = 131072 + results * 512 return self._retrieve(self.urls['find'] % params, size=size) def _search_movie(self, title, results): cont = self._get_search_content('tt', title, results) return self.smProxy.search_movie_parser.parse(cont, results=results)['data'] def _get_search_movie_advanced_content(self, title=None, adult=None, results=None, sort=None, sort_dir=None): """Retrieve the web page for a given search. results is the maximum number of results to be retrieved.""" criteria = {} if title is not None: criteria['title'] = quote_plus(title, safe='') if adult: criteria['adult'] = 'include' if results is not None: criteria['count'] = str(results) if sort is not None: criteria['sort'] = sort if sort_dir is not None: criteria['sort'] = sort + ',' + sort_dir params = '&'.join(['%s=%s' % (k, v) for k, v in criteria.items()]) return self._retrieve(self.urls['search_movie_advanced'] % params) def _search_movie_advanced(self, title=None, adult=None, results=None, sort=None, sort_dir=None): cont = self._get_search_movie_advanced_content(title=title, adult=adult, results=results, sort=sort, sort_dir=sort_dir) return self.smaProxy.search_movie_advanced_parser.parse(cont, results=results)['data'] def _search_episode(self, title, results): t_dict = analyze_title(title) if t_dict['kind'] == 'episode': title = t_dict['title'] cont = self._get_search_content('ep', title, results) return self.smProxy.search_movie_parser.parse(cont, results=results)['data'] def get_movie_main(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'reference') return self.mProxy.movie_parser.parse(cont, mdparse=self._mdparse) def get_movie_full_credits(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'fullcredits') return self.mProxy.full_credits_parser.parse(cont) def get_movie_plot(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'plotsummary') ret = 
self.mProxy.plot_parser.parse(cont, getRefs=self._getRefs) ret['info sets'] = ('plot', 'synopsis') return ret def get_movie_awards(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'awards') return self.mProxy.movie_awards_parser.parse(cont) def get_movie_taglines(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'taglines') return self.mProxy.taglines_parser.parse(cont) def get_movie_keywords(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'keywords') return self.mProxy.keywords_parser.parse(cont) def get_movie_alternate_versions(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'alternateversions') return self.mProxy.alternateversions_parser.parse(cont, getRefs=self._getRefs) def get_movie_crazy_credits(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'crazycredits') return self.mProxy.crazycredits_parser.parse(cont, getRefs=self._getRefs) def get_movie_goofs(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'goofs') return self.mProxy.goofs_parser.parse(cont, getRefs=self._getRefs) def get_movie_quotes(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'quotes') return self.mProxy.quotes_parser.parse(cont, getRefs=self._getRefs) def get_movie_release_dates(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'releaseinfo') ret = self.mProxy.releasedates_parser.parse(cont) ret['info sets'] = ('release dates', 'akas') return ret get_movie_akas = get_movie_release_dates get_movie_release_info = get_movie_release_dates def get_movie_vote_details(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'ratings') return self.mProxy.ratings_parser.parse(cont) def get_movie_trivia(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'trivia') return self.mProxy.trivia_parser.parse(cont, getRefs=self._getRefs) def get_movie_connections(self, 
movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'movieconnections') return self.mProxy.connections_parser.parse(cont) def get_movie_technical(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'technical') return self.mProxy.tech_parser.parse(cont) def get_movie_locations(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'locations') return self.mProxy.locations_parser.parse(cont) def get_movie_soundtrack(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'soundtrack') return self.mProxy.soundtrack_parser.parse(cont) def get_movie_reviews(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'reviews?count=9999999&start=0') return self.mProxy.reviews_parser.parse(cont) def get_movie_critic_reviews(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'criticreviews') return self.mProxy.criticrev_parser.parse(cont) def get_movie_external_reviews(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'externalreviews') return self.mProxy.externalrev_parser.parse(cont) def get_movie_external_sites(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'externalsites') ret = self.mProxy.externalsites_parser.parse(cont) ret['info sets'] = ('external sites', 'misc sites', 'sound clips', 'video sites', 'photo sites', 'official sites') return ret def get_movie_official_sites(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'officialsites') ret = self.mProxy.officialsites_parser.parse(cont) ret['info sets'] = ('external sites', 'misc sites', 'sound clips', 'video sites', 'photo sites', 'official sites') return ret def get_movie_misc_sites(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'miscsites') ret = self.mProxy.misclinks_parser.parse(cont) ret['info sets'] = ('external sites', 'misc sites', 'sound clips', 'video sites', 'photo sites', 'official sites') 
return ret def get_movie_sound_clips(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'soundsites') ret = self.mProxy.soundclips_parser.parse(cont) ret['info sets'] = ('external sites', 'misc sites', 'sound clips', 'video sites', 'photo sites', 'official sites') return ret def get_movie_video_clips(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'videosites') ret = self.mProxy.videoclips_parser.parse(cont) ret['info sets'] = ('external sites', 'misc sites', 'sound clips', 'video sites', 'photo sites', 'official sites') return ret def get_movie_photo_sites(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'photosites') ret = self.mProxy.photosites_parser.parse(cont) ret['info sets'] = ('external sites', 'misc sites', 'sound clips', 'video sites', 'photo sites', 'official sites') return ret def get_movie_news(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'news') return self.mProxy.news_parser.parse(cont, getRefs=self._getRefs) def _purge_seasons_data(self, data_d): if '_current_season' in data_d['data']: del data_d['data']['_current_season'] if '_seasons' in data_d['data']: del data_d['data']['_seasons'] return data_d def get_movie_episodes(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'episodes') data_d = self.mProxy.season_episodes_parser.parse(cont) # Bail out when the parser returned nothing usable. if not data_d or 'data' not in data_d: return {} _current_season = data_d['data'].get('_current_season', '') _seasons = data_d['data'].get('_seasons') or [] data_d = self._purge_seasons_data(data_d) data_d['data'].setdefault('episodes', {}) nr_eps = len(data_d['data']['episodes'].get(_current_season) or []) for season in _seasons: if season == _current_season: continue other_cont = self._retrieve( self.urls['movie_main'] % movieID + 'episodes?season=' + str(season) ) other_d = self.mProxy.season_episodes_parser.parse(other_cont) other_d = self._purge_seasons_data(other_d) 
other_d['data'].setdefault('episodes', {}) if not (other_d and other_d['data'] and other_d['data']['episodes'][season]): continue nr_eps += len(other_d['data']['episodes'].get(season) or []) data_d['data']['episodes'][season] = other_d['data']['episodes'][season] data_d['data']['number of episodes'] = nr_eps return data_d def get_movie_faqs(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'faq') return self.mProxy.movie_faqs_parser.parse(cont, getRefs=self._getRefs) def get_movie_airing(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'tvschedule') return self.mProxy.airing_parser.parse(cont) get_movie_tv_schedule = get_movie_airing def get_movie_synopsis(self, movieID): return self.get_movie_plot(movieID) def get_movie_parents_guide(self, movieID): cont = self._retrieve(self.urls['movie_main'] % movieID + 'parentalguide') return self.mProxy.parentsguide_parser.parse(cont) def _search_person(self, name, results): cont = self._get_search_content('nm', name, results) return self.spProxy.search_person_parser.parse(cont, results=results)['data'] def get_person_main(self, personID): cont = self._retrieve(self.urls['person_main'] % personID) ret = self.pProxy.maindetails_parser.parse(cont) ret['info sets'] = ('main', 'filmography') return ret def get_person_filmography(self, personID): return self.get_person_main(personID) def get_person_biography(self, personID): cont = self._retrieve(self.urls['person_main'] % personID + 'bio') return self.pProxy.bio_parser.parse(cont, getRefs=self._getRefs) def get_person_awards(self, personID): cont = self._retrieve(self.urls['person_main'] % personID + 'awards') return self.pProxy.person_awards_parser.parse(cont) def get_person_other_works(self, personID): cont = self._retrieve(self.urls['person_main'] % personID + 'otherworks') return self.pProxy.otherworks_parser.parse(cont, getRefs=self._getRefs) def get_person_publicity(self, personID): cont = 
self._retrieve(self.urls['person_main'] % personID + 'publicity') return self.pProxy.publicity_parser.parse(cont) def get_person_official_sites(self, personID): cont = self._retrieve(self.urls['person_main'] % personID + 'officialsites') return self.pProxy.person_officialsites_parser.parse(cont) def get_person_news(self, personID): cont = self._retrieve(self.urls['person_main'] % personID + 'news') return self.pProxy.news_parser.parse(cont) def get_person_genres_links(self, personID): cont = self._retrieve(self.urls['person_main'] % personID + 'filmogenre') return self.pProxy.person_genres_parser.parse(cont) def get_person_keywords_links(self, personID): cont = self._retrieve(self.urls['person_main'] % personID + 'filmokey') return self.pProxy.person_keywords_parser.parse(cont) def _search_company(self, name, results): cont = self._get_search_content('co', name, results) url = self.urlOpener._last_url return self.scompProxy.search_company_parser.parse(cont, url=url, results=results)['data'] def get_company_main(self, companyID): cont = self._retrieve(self.urls['company_main'] % companyID) ret = self.compProxy.company_main_parser.parse(cont) return ret def _search_keyword(self, keyword, results): # XXX: the IMDb web server seems to have some serious problem with # non-ascii keyword. # E.g.: http://www.imdb.com/keyword/fianc%E9/ # will return a 500 Internal Server Error: Redirect Recursion. 
try: cont = self._get_search_content('kw', keyword, results) except IMDbDataAccessError: self._http_logger.warn('unable to search for keyword %s', keyword, exc_info=True) return [] return self.skProxy.search_keyword_parser.parse(cont, results=results)['data'] def _get_keyword(self, keyword, results): try: cont = self._retrieve(self.urls['keyword_main'] % keyword) except IMDbDataAccessError: self._http_logger.warn('unable to get keyword %s', keyword, exc_info=True) return [] return self.skProxy.search_moviekeyword_parser.parse(cont, results=results)['data'] def _get_top_bottom_movies(self, kind): if kind == 'top': parser = self.topBottomProxy.top250_parser url = self.urls['top250'] elif kind == 'bottom': parser = self.topBottomProxy.bottom100_parser url = self.urls['bottom100'] else: return [] cont = self._retrieve(url) return parser.parse(cont)['data'] imdbpy-6.8/imdb/parser/http/companyParser.py000066400000000000000000000102351351454127000212410ustar00rootroot00000000000000# Copyright 2008-2019 Davide Alberani # 2008-2018 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the classes (and the instances) that are used to parse the IMDb pages on the www.imdb.com server about a company. 
For example, for "Columbia Pictures [us]" the referred page would be: main details http://www.imdb.com/company/co0071509/ """ from __future__ import absolute_import, division, print_function, unicode_literals import re from .piculet import Path, Rule, Rules from .utils import DOMParserBase, analyze_imdbid, build_movie _re_company_name = re.compile('With\s+(.+)\s+\(Sorted by.*', re.I | re.M) def clean_company_title(title): """Extract company name""" name = _re_company_name.findall(title or '') if name and name[0]: return name[0] class DOMCompanyParser(DOMParserBase): """Parser for the main page of a given company. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: cparser = DOMCompanyParser() result = cparser.parse(company_html_string) """ _containsObjects = True rules = [ Rule( key='name', extractor=Path( '//h1[@class="header"]/text()', transform=lambda x: clean_company_title(x) ) ), Rule( key='filmography', extractor=Rules( foreach='//b/a[@name]', rules=[ Rule( key=Path('./text()', transform=str.lower), extractor=Rules( foreach='../following-sibling::ol[1]/li', rules=[ Rule( key='link', extractor=Path('./a[1]/@href') ), Rule( key='title', extractor=Path('./a[1]/text()') ), Rule( key='year', extractor=Path('./text()[1]') ) ], transform=lambda x: build_movie( '%s %s' % (x.get('title'), x.get('year').strip()), movieID=analyze_imdbid(x.get('link') or ''), _parsingCompany=True ) ) ) ] ) ) ] preprocessors = [ (re.compile('(\1') ] def postprocess_data(self, data): for key in ['name']: if (key in data) and isinstance(data[key], dict): subdata = data[key] del data[key] data.update(subdata) for key in list(data.keys()): new_key = key.replace('company', 'companies') new_key = new_key.replace('other', 'miscellaneous') new_key = new_key.replace('distributor', 'distributors') if new_key != key: data[new_key] = data[key] del data[key] return data _OBJECTS = { 
'company_main_parser': ((DOMCompanyParser,), None) } imdbpy-6.8/imdb/parser/http/movieParser.py000066400000000000000000002635151351454127000207250ustar00rootroot00000000000000# -*- coding: utf-8 -*- # Copyright 2004-2019 Davide Alberani # 2008-2018 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the classes (and the instances) that are used to parse the IMDb pages on the www.imdb.com server about a movie. For example, for Brian De Palma's "The Untouchables", the referred pages would be: combined details http://www.imdb.com/title/tt0094226/reference plot summary http://www.imdb.com/title/tt0094226/plotsummary ...and so on. """ from __future__ import absolute_import, division, print_function, unicode_literals import functools import re from imdb import PY2 from imdb import imdbURL_base from imdb.Company import Company from imdb.Movie import Movie from imdb.Person import Person from imdb.utils import _Container, KIND_MAP from .piculet import Path, Rule, Rules, preprocessors, transformers from .utils import DOMParserBase, analyze_imdbid, build_person if PY2: from urllib import unquote else: from urllib.parse import unquote # Dictionary used to convert some section's names. 
_SECT_CONV = { 'directed': 'director', 'directed by': 'director', 'directors': 'director', 'editors': 'editor', 'writing credits': 'writer', 'writers': 'writer', 'produced': 'producer', 'cinematography': 'cinematographer', 'film editing': 'editor', 'casting': 'casting director', 'costume design': 'costume designer', 'makeup department': 'make up', 'production management': 'production manager', 'second unit director or assistant director': 'assistant director', 'costume and wardrobe department': 'costume department', 'costume departmen': 'costume department', 'sound department': 'sound crew', 'stunts': 'stunt performer', 'other crew': 'miscellaneous crew', 'also known as': 'akas', 'country': 'countries', 'runtime': 'runtimes', 'language': 'languages', 'certification': 'certificates', 'genre': 'genres', 'created': 'creator', 'creators': 'creator', 'color': 'color info', 'plot': 'plot outline', 'art directors': 'art direction', 'assistant directors': 'assistant director', 'set decorators': 'set decoration', 'visual effects department': 'visual effects', 'miscellaneous': 'miscellaneous crew', 'make up department': 'make up', 'plot summary': 'plot outline', 'cinematographers': 'cinematographer', 'camera department': 'camera and electrical department', 'costume designers': 'costume designer', 'production designers': 'production design', 'production managers': 'production manager', 'music original': 'original music', 'casting directors': 'casting director', 'other companies': 'miscellaneous companies', 'producers': 'producer', 'special effects by': 'special effects department', 'special effects': 'special effects companies' } re_space = re.compile(r'\s+') def _manageRoles(mo): """Perform some transformation on the html, so that roleIDs can be easily retrieved.""" firstHalf = mo.group(1) secondHalf = mo.group(2) newRoles = [] roles = secondHalf.split(' / ') for role in roles: role = role.strip() if not role: continue roleID = analyze_imdbid(role) if roleID is None: roleID 
= '/' else: roleID += '/' newRoles.append('
<div class="_imdbpyrole" roleid="%s">%s</div>
' % ( roleID, role.strip() )) return firstHalf + ' / '.join(newRoles) + mo.group(3) _reRolesMovie = re.compile(r'()(.*?)()', re.I | re.M | re.S) def makeSplitter(lstrip=None, sep='|', comments=True, origNotesSep=' (', newNotesSep='::(', strip=None): """Return a splitter function suitable for a given set of data.""" def splitter(x): if not x: return x x = x.strip() if not x: return x if lstrip is not None: x = x.lstrip(lstrip).lstrip() lx = x.split(sep) lx[:] = [_f for _f in [j.strip() for j in lx] if _f] if comments: lx[:] = [j.replace(origNotesSep, newNotesSep, 1) for j in lx] if strip: lx[:] = [j.strip(strip) for j in lx] return lx return splitter def _toInt(val, replace=()): """Return the value, converted to integer, or None; if present, 'replace' must be a list of tuples of values to replace.""" for before, after in replace: val = val.replace(before, after) try: return int(val) except (TypeError, ValueError): return None _re_og_title = re.compile( r'(.*) \((?:(?:(.+)(?= ))? ?(\d{4})(?:(–)(\d{4}| ))?|(.+))\)', re.UNICODE ) def analyze_og_title(og_title): data = {} match = _re_og_title.match(og_title) if og_title and not match: # assume it's a title in production, missing release date information return {'title': og_title} data['title'] = match.group(1) if match.group(3): data['year'] = int(match.group(3)) kind = match.group(2) or match.group(6) if kind is None: kind = 'movie' else: kind = kind.lower() kind = KIND_MAP.get(kind, kind) data['kind'] = kind year_separator = match.group(4) # There is a year separator so assume an ongoing or ended series if year_separator is not None: end_year = match.group(5) if end_year is not None: data['series years'] = '%(year)d-%(end_year)s' % { 'year': data['year'], 'end_year': end_year.strip(), } elif kind.endswith('series'): data['series years'] = '%(year)d-' % {'year': data['year']} # No year separator and series, so assume that it ended the same year elif kind.endswith('series') and 'year' in data: data['series years'] = 
'%(year)d-%(year)d' % {'year': data['year']} if data['kind'] == 'episode' and data['title'][0] == '"': quote_end = data['title'].find('"', 1) data['tv series title'] = data['title'][1:quote_end] data['title'] = data['title'][quote_end + 1:].strip() return data def analyze_certificates(certificates): def reducer(acc, el): cert_re = re.compile(r'^(.+):(.+)$', re.UNICODE) if cert_re.match(el): acc.append(el) elif acc: acc[-1] = u'{}::{}'.format( acc[-1], el, ) return acc certificates = [el.strip() for el in certificates.split('\n') if el.strip()] return functools.reduce(reducer, certificates, []) def clean_akas(aka): aka = re_space.sub(' ', aka).strip() if aka.lower().startswith('see more'): aka = '' return aka class DOMHTMLMovieParser(DOMParserBase): """Parser for the "reference" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: mparser = DOMHTMLMovieParser() result = mparser.parse(reference_html_string) """ _containsObjects = True rules = [ Rule( key='title', extractor=Path('//meta[@property="og:title"]/@content', transform=analyze_og_title) ), # parser for misc sections like 'casting department', 'stunts', ... 
Rule( key='misc sections', extractor=Rules( foreach='//h4[contains(@class, "ipl-header__content")]', rules=[ Rule( key=Path('./@name', transform=lambda x: x.replace('_', ' ').strip()), extractor=Rules( foreach='../../following-sibling::table[1]//tr', rules=[ Rule( key='person', extractor=Path('.//text()') ), Rule( key='link', extractor=Path('./td[1]/a[@href]/@href') ) ], transform=lambda x: build_person( x.get('person') or '', personID=analyze_imdbid(x.get('link')) ) ) ) ] ) ), Rule( key='cast', extractor=Rules( foreach='//table[@class="cast_list"]//tr', rules=[ Rule( key='person', extractor=Path('.//text()') ), Rule( key='link', extractor=Path('./td[2]/a/@href') ), Rule( key='roleID', extractor=Path('./td[4]//div[@class="_imdbpyrole"]/@roleid') ) ], transform=lambda x: build_person( x.get('person') or '', personID=analyze_imdbid(x.get('link')), roleID=(x.get('roleID') or '').split('/') ) ) ), Rule( key='myrating', extractor=Path('//span[@id="voteuser"]//text()') ), Rule( key='plot summary', extractor=Path('//td[starts-with(text(), "Plot")]/..//p/text()', transform=lambda x: x.strip().rstrip('|').rstrip()) ), Rule( key='genres', extractor=Path( foreach='//td[starts-with(text(), "Genre")]/..//li/a', path='./text()' ) ), Rule( key='runtimes', extractor=Path( foreach='//td[starts-with(text(), "Runtime")]/..//li', path='./text()', transform=lambda x: x.strip().replace(' min', '') ) ), Rule( key='countries', extractor=Path( foreach='//td[starts-with(text(), "Countr")]/..//li/a', path='./text()' ) ), Rule( key='country codes', extractor=Path( foreach='//td[starts-with(text(), "Countr")]/..//li/a', path='./@href', transform=lambda x: x.split('/')[2].strip().lower() ) ), Rule( key='language', extractor=Path( foreach='//td[starts-with(text(), "Language")]/..//li/a', path='./text()' ) ), Rule( key='language codes', extractor=Path( foreach='//td[starts-with(text(), "Language")]/..//li/a', path='./@href', transform=lambda x: x.split('/')[2].strip() ) ), Rule( key='color info', 
extractor=Path( foreach='//td[starts-with(text(), "Color")]/..//li/a', path='./text()', transform=lambda x: x.replace(' (', '::(') ) ), Rule( key='aspect ratio', extractor=Path( '//td[starts-with(text(), "Aspect")]/..//li/text()', transform=transformers.strip ) ), Rule( key='sound mix', extractor=Path( foreach='//td[starts-with(text(), "Sound Mix")]/..//li/a', path='./text()', transform=lambda x: x.replace(' (', '::(') ) ), Rule( key='box office', extractor=Rules( foreach='//section[contains(@class, "titlereference-section-box-office")]' '//table[contains(@class, "titlereference-list")]//tr', rules=[ Rule( key='box_office_title', extractor=Path('./td[1]/text()') ), Rule( key='box_office_detail', extractor=Path('./td[2]/text()') ) ], transform=lambda x: (x['box_office_title'].strip(), x['box_office_detail'].strip()) ), ), Rule( key='certificates', extractor=Path( '//td[starts-with(text(), "Certificat")]/..//text()', transform=analyze_certificates ) ), # Collects akas not enclosed in <i> tags.
Rule( key='other akas', extractor=Path( foreach='//section[contains(@class, "listo")]//td[starts-with(text(), "Also Known As")]/..//ul/li', path='.//text()', transform=clean_akas ) ), Rule( key='creator', extractor=Rules( foreach='//td[starts-with(text(), "Creator")]/..//a', rules=[ Rule( key='name', extractor=Path('./text()') ), Rule( key='link', extractor=Path('./@href') ) ], transform=lambda x: build_person( x.get('name') or '', personID=analyze_imdbid(x.get('link')) ) ) ), Rule( key='thin writer', extractor=Rules( foreach='//div[starts-with(normalize-space(text()), "Writer")]/ul/li[1]/a', rules=[ Rule( key='name', extractor=Path('./text()') ), Rule( key='link', extractor=Path('./@href') ) ], transform=lambda x: build_person( x.get('name') or '', personID=analyze_imdbid(x.get('link')) ) ) ), Rule( key='thin director', extractor=Rules( foreach='//div[starts-with(normalize-space(text()), "Director")]/ul/li[1]/a', rules=[ Rule( key='name', extractor=Path('./text()') ), Rule( key='link', extractor=Path('./@href') ) ], transform=lambda x: build_person( x.get('name') or '', personID=analyze_imdbid(x.get('link')) ) ) ), Rule( key='top/bottom rank', extractor=Path( '//li[@class="ipl-inline-list__item"]//a[starts-with(@href, "/chart/")]/text()' ) ), Rule( key='original air date', extractor=Path('//span[@imdbpy="airdate"]/text()') ), Rule( key='series years', extractor=Path( '//div[@id="tn15title"]//span[starts-with(text(), "TV series")]/text()', transform=lambda x: x.replace('TV series', '').strip() ) ), Rule( key='season/episode', extractor=Path( '//div[@class="titlereference-overview-season-episode-section"]/ul//text()', transform=transformers.strip ) ), Rule( key='number of episodes', extractor=Path( '//a[starts-with(text(), "All Episodes")]/text()', transform=lambda x: int(x.replace('All Episodes', '').strip()[1:-1]) ) ), Rule( key='episode number', extractor=Path( '//div[@id="tn15epnav"]/text()', transform=lambda x: int(re.sub(r'[^a-z0-9 ]', '', 
x.lower()).strip().split()[0])) ), Rule( key='previous episode', extractor=Path( '//span[@class="titlereference-overview-episodes-links"]' '//a[contains(text(), "Previous")]/@href', transform=analyze_imdbid ) ), Rule( key='next episode', extractor=Path( '//span[@class="titlereference-overview-episodes-links"]' '//a[contains(text(), "Next")]/@href', transform=analyze_imdbid ) ), Rule( key='number of seasons', extractor=Path( '//span[@class="titlereference-overview-years-links"]/../a[1]/text()', transform=int ) ), Rule( key='tv series link', extractor=Path('//a[starts-with(text(), "All Episodes")]/@href') ), Rule( key='akas', extractor=Path( foreach='//i[@class="transl"]', path='./text()', transform=lambda x: x .replace(' ', ' ') .rstrip('-') .replace('" - ', '"::', 1) .strip('"') .replace(' ', ' ') ) ), Rule( key='production status', extractor=Path( '//td[starts-with(text(), "Status:")]/..//div[@class="info-content"]//text()', transform=lambda x: x.strip().split('|')[0].strip().lower() ) ), Rule( key='production status updated', extractor=Path( '//td[starts-with(text(), "Status Updated:")]/' '..//div[@class="info-content"]//text()', transform=transformers.strip ) ), Rule( key='production comments', extractor=Path( '//td[starts-with(text(), "Comments:")]/' '..//div[@class="info-content"]//text()', transform=transformers.strip ) ), Rule( key='production note', extractor=Path( '//td[starts-with(text(), "Note:")]/' '..//div[@class="info-content"]//text()', transform=transformers.strip ) ), Rule( key='companies', extractor=Rules( foreach="//ul[@class='simpleList']", rules=[ Rule( key=Path('preceding-sibling::header[1]/div/h4/text()', transform=transformers.lower), extractor=Rules( foreach='./li', rules=[ Rule( key='name', extractor=Path('./a//text()') ), Rule( key='comp-link', extractor=Path('./a/@href') ), Rule( key='notes', extractor=Path('./text()') ) ], transform=lambda x: Company( name=x.get('name') or '', accessSystem='http', 
companyID=analyze_imdbid(x.get('comp-link')), notes=(x.get('notes') or '').strip() ) ) ) ] ) ), Rule( key='rating', extractor=Path('(//span[@class="ipl-rating-star__rating"])[1]/text()') ), Rule( key='votes', extractor=Path('//span[@class="ipl-rating-star__total-votes"][1]/text()') ), Rule( key='cover url', extractor=Path('//img[@alt="Poster"]/@src') ) ] preprocessors = [ ('/releaseinfo">', '">'), (re.compile(r'(.+?)', re.I), r'
\1'), ('Full cast and crew for
', ''), (' ', '...'), (re.compile(r'TV mini-series(\s+.*?)', re.I), r'TV series\1 (mini)'), (_reRolesMovie, _manageRoles) ] def preprocess_dom(self, dom): # Handle series information. xpath = self.xpath(dom, "//b[text()='Series Crew']") if xpath: b = xpath[-1] # In doubt, take the last one. for a in self.xpath(b, "./following::h5/a[@class='glossary']"): name = a.get('name') if name: a.set('name', 'series %s' % name) # Remove links to IMDbPro. preprocessors.remove(dom, '//span[@class="pro-link"]') # Remove some 'more' links (keep others, like the one around # the number of votes). preprocessors.remove(dom, '//a[@class="tn15more"][starts-with(@href, "/title/")]') # Remove the "rest of list" in cast. preprocessors.remove(dom, '//td[@colspan="4"]/..') return dom re_space = re.compile(r'\s+') re_airdate = re.compile(r'(.*)\s*\(season (\d+), episode (\d+)\)', re.I) def postprocess_data(self, data): # Convert section names. for sect in list(data.keys()): if sect in _SECT_CONV: data[_SECT_CONV[sect]] = data[sect] del data[sect] sect = _SECT_CONV[sect] # Filter out fake values. 
for key in data: value = data[key] if isinstance(value, list) and value: if isinstance(value[0], Person): data[key] = [x for x in value if x.personID is not None] if isinstance(value[0], _Container): for obj in data[key]: obj.accessSystem = self._as obj.modFunct = self._modFunct for key in ['title']: if (key in data) and isinstance(data[key], dict): subdata = data[key] del data[key] data.update(subdata) misc_sections = data.get('misc sections') if misc_sections is not None: for section in misc_sections: # skip sections with their own parsers if 'cast' in section.keys(): continue data.update(section) del data['misc sections'] if 'akas' in data or 'other akas' in data: akas = data.get('akas') or [] other_akas = data.get('other akas') or [] akas += other_akas nakas = [] for aka in akas: aka = aka.strip() if not aka: continue if aka.endswith('" -'): aka = aka[:-3].rstrip() nakas.append(aka) if 'akas' in data: del data['akas'] if 'other akas' in data: del data['other akas'] if nakas: data['akas'] = nakas if 'runtimes' in data: data['runtimes'] = [x.replace(' min', '') for x in data['runtimes']] if 'number of seasons' in data: data['seasons'] = [str(i) for i in range(1, data['number of seasons'] + 1)] if 'season/episode' in data: tokens = data['season/episode'].split('Episode') try: data['season'] = int(tokens[0].split('Season')[1]) except: data['season'] = 'unknown' try: data['episode'] = int(tokens[1]) except: data['episode'] = 'unknown' del data['season/episode'] for k in ('writer', 'director'): t_k = 'thin %s' % k if t_k not in data: continue if k not in data: data[k] = data[t_k] del data[t_k] if 'top/bottom rank' in data: tbVal = data['top/bottom rank'].lower() if tbVal.startswith('top'): tbKey = 'top 250 rank' tbVal = _toInt(tbVal, [('top rated movies: #', '')]) else: tbKey = 'bottom 100 rank' tbVal = _toInt(tbVal, [('bottom rated movies: #', '')]) if tbVal: data[tbKey] = tbVal del data['top/bottom rank'] if 'year' in data and data['year'] == '????': del 
data['year'] if 'tv series link' in data: if 'tv series title' in data: data['episode of'] = Movie(title=data['tv series title'], movieID=analyze_imdbid(data['tv series link']), accessSystem=self._as, modFunct=self._modFunct) data['episode of']['kind'] = 'tv series' del data['tv series title'] del data['tv series link'] if 'rating' in data: try: data['rating'] = float(data['rating'].replace('/10', '')) except (TypeError, ValueError): pass if data['rating'] == 0: del data['rating'] if 'votes' in data: try: votes = data['votes'].replace('(', '').replace(')', '').replace(',', '').replace('votes', '') data['votes'] = int(votes) except (TypeError, ValueError): pass companies = data.get('companies') if companies: for section in companies: for key, value in section.items(): if key in data: key = '%s companies' % key data.update({key: value}) del data['companies'] if 'box office' in data: data['box office'] = dict(data['box office']) return data def _process_plotsummary(x): """Process a plot (contributed by Rdian06).""" xauthor = x.get('author') xplot = x.get('plot', '').strip() if xauthor: xplot += '::%s' % xauthor return xplot class DOMHTMLPlotParser(DOMParserBase): """Parser for the "plot summary" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a 'plot' key, containing a list of strings with the structure: 'summary::summary_author'. Example:: pparser = DOMHTMLPlotParser() result = pparser.parse(plot_summary_html_string) """ _defGetRefs = True # Note that IMDb now puts the author's email only in the # link, which is not collected here.
rules = [ Rule( key='plot', extractor=Rules( foreach='//ul[@id="plot-summaries-content"]/li', rules=[ Rule( key='plot', extractor=Path('./p//text()') ), Rule( key='author', extractor=Path('.//div[@class="author-container"]//a/text()') ) ], transform=_process_plotsummary ) ), Rule( key='synopsis', extractor=Path( foreach='//ul[@id="plot-synopsis-content"]', path='.//li//text()' ) ) ] def preprocess_dom(self, dom): preprocessors.remove(dom, '//li[@id="no-summary-content"]') return dom def postprocess_data(self, data): if 'synopsis' in data and data['synopsis'][0] and 'a Synopsis for this title' in data['synopsis'][0]: del data['synopsis'] return data def _process_award(x): award = {} _award = x.get('award') if _award is not None: _award = _award.strip() award['award'] = _award if not award['award']: return {} award['year'] = x.get('year').strip() if award['year'] and award['year'].isdigit(): award['year'] = int(award['year']) award['result'] = x.get('result').strip() category = x.get('category').strip() if category: award['category'] = category received_with = x.get('with') if received_with is not None: award['with'] = received_with.strip() notes = x.get('notes') if notes is not None: notes = notes.strip() if notes: award['notes'] = notes award['anchor'] = x.get('anchor') return award class DOMHTMLAwardsParser(DOMParserBase): """Parser for the "awards" page of a given person or movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: awparser = DOMHTMLAwardsParser() result = awparser.parse(awards_html_string) """ subject = 'title' _containsObjects = True rules = [ Rule( key='awards', extractor=Rules( foreach='//table//big', rules=[ Rule( key=Path('./a'), extractor=Rules( foreach='./ancestor::tr[1]/following-sibling::tr/td[last()][not(@colspan)]', rules=[ Rule( key='year', extractor=Path('./td[1]/a/text()') ), Rule( key='result', extractor=Path('../td[2]/b/text()') ), Rule( key='award', extractor=Path('./td[3]/text()') ), Rule( key='category', extractor=Path('./text()[1]') ), Rule( key='with', extractor=Path( './small[starts-with(text(), "Shared with:")]/' 'following-sibling::a[1]/text()' ) ), Rule( key='notes', extractor=Path('./small[last()]//text()') ), Rule( key='anchor', extractor=Path('.//text()') ) ], transform=_process_award ) ) ] ) ), Rule( key='recipients', extractor=Rules( foreach='//table//big', rules=[ Rule( key=Path('./a'), extractor=Rules( foreach='./ancestor::tr[1]/following-sibling::tr' '/td[last()]/small[1]/preceding-sibling::a', rules=[ Rule( key='name', extractor=Path('./text()') ), Rule( key='link', extractor=Path('./@href') ), Rule( key='anchor', extractor=Path('..//text()') ) ] ) ) ] ) ) ] preprocessors = [ (re.compile('(]*>.*?\n\n)', re.I), r'\1'), (re.compile('(]*>\n\n.*?)', re.I), r'\1'), (re.compile('(]*>\n\n)
(.*?)
(.*?\n\n)(\2') ] def preprocess_dom(self, dom): """Repeat td elements according to their rowspan attributes in subsequent tr elements. """ cols = self.xpath(dom, "//td[@rowspan]") for col in cols: span = int(col.get('rowspan')) del col.attrib['rowspan'] position = len(self.xpath(col, "./preceding-sibling::td")) row = col.getparent() for tr in self.xpath(row, "./following-sibling::tr")[:span - 1]: # if not cloned, child will be moved to new parent clone = self.clone(col) tr.insert(position, clone) return dom def postprocess_data(self, data): if len(data) == 0: return {} nd = [] for key in list(data.keys()): dom = self.get_dom(key) assigner = self.xpath(dom, "//a/text()")[0] for entry in data[key]: if 'name' not in entry: if not entry: continue # this is an award, not a recipient entry['assigner'] = assigner.strip() # find the recipients matches = [p for p in data[key] if 'name' in p and (entry['anchor'] == p['anchor'])] if self.subject == 'title': recipients = [ Person(name=recipient['name'], personID=analyze_imdbid(recipient['link'])) for recipient in matches ] entry['to'] = recipients elif self.subject == 'name': recipients = [ Movie(title=recipient['name'], movieID=analyze_imdbid(recipient['link'])) for recipient in matches ] entry['for'] = recipients nd.append(entry) del entry['anchor'] return {'awards': nd} class DOMHTMLTaglinesParser(DOMParserBase): """Parser for the "taglines" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: tparser = DOMHTMLTaglinesParser() result = tparser.parse(taglines_html_string) """ rules = [ Rule( key='taglines', extractor=Path( foreach='//div[@id="taglines_content"]/div', path='.//text()' ) ) ] def preprocess_dom(self, dom): preprocessors.remove(dom, '//div[@id="taglines_content"]/div[@class="header"]') preprocessors.remove(dom, '//div[@id="taglines_content"]/div[@id="no_content"]') return dom def postprocess_data(self, data): if 'taglines' in data: data['taglines'] = [tagline.strip() for tagline in data['taglines']] return data class DOMHTMLKeywordsParser(DOMParserBase): """Parser for the "keywords" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: kwparser = DOMHTMLKeywordsParser() result = kwparser.parse(keywords_html_string) """ rules = [ Rule( key='keywords', extractor=Path( foreach='//a[starts-with(@href, "/search/keyword?keywords=")]', path='./text()', transform=lambda x: x.lower().replace(' ', '-') ) ) ] class DOMHTMLAlternateVersionsParser(DOMParserBase): """Parser for the "alternate versions" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: avparser = DOMHTMLAlternateVersionsParser() result = avparser.parse(alternateversions_html_string) """ _defGetRefs = True rules = [ Rule( key='alternate versions', extractor=Path( foreach='//ul[@class="trivia"]/li', path='.//text()', transform=transformers.strip ) ) ] class DOMHTMLTriviaParser(DOMParserBase): """Parser for the "trivia" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: tparser = DOMHTMLTriviaParser() result = tparser.parse(trivia_html_string) """ _defGetRefs = True rules = [ Rule( key='trivia', extractor=Path( foreach='//div[@class="sodatext"]', path='.//text()', transform=transformers.strip ) ) ] def preprocess_dom(self, dom): # Remove "link this quote" links. preprocessors.remove(dom, '//span[@class="linksoda"]') return dom class DOMHTMLSoundtrackParser(DOMParserBase): """Parser for the "soundtrack" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: stparser = DOMHTMLSoundtrackParser() result = stparser.parse(soundtrack_html_string) """ _defGetRefs = True preprocessors = [('
', '\n'), ('
', '\n')] rules = [ Rule( key='soundtrack', extractor=Path( foreach='//div[@class="list"]//div', path='.//text()', transform=transformers.strip ) ) ] def postprocess_data(self, data): if 'soundtrack' in data: nd = [] for x in data['soundtrack']: ds = x.split('\n') title = ds[0] if title[0] == '"' and title[-1] == '"': title = title[1:-1] nds = [] newData = {} for l in ds[1:]: if ' with ' in l or ' by ' in l or ' from ' in l \ or ' of ' in l or l.startswith('From '): nds.append(l) else: if nds: nds[-1] += l else: nds.append(l) newData[title] = {} for l in nds: skip = False for sep in ('From ',): if l.startswith(sep): fdix = len(sep) kind = l[:fdix].rstrip().lower() info = l[fdix:].lstrip() newData[title][kind] = info skip = True if not skip: for sep in ' with ', ' by ', ' from ', ' of ': fdix = l.find(sep) if fdix != -1: fdix = fdix + len(sep) kind = l[:fdix].rstrip().lower() info = l[fdix:].lstrip() newData[title][kind] = info break nd.append(newData) data['soundtrack'] = nd return data class DOMHTMLCrazyCreditsParser(DOMParserBase): """Parser for the "crazy credits" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: ccparser = DOMHTMLCrazyCreditsParser() result = ccparser.parse(crazycredits_html_string) """ _defGetRefs = True rules = [ Rule( key='crazy credits', extractor=Path( foreach='//ul/li/tt', path='.//text()', transform=lambda x: x.replace('\n', ' ').replace(' ', ' ') ) ) ] def _process_goof(x): if x['spoiler_category']: return x['spoiler_category'].strip() + ': SPOILER: ' + x['text'].strip() else: return x['category'].strip() + ': ' + x['text'].strip() class DOMHTMLGoofsParser(DOMParserBase): """Parser for the "goofs" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: gparser = DOMHTMLGoofsParser() result = gparser.parse(goofs_html_string) """ _defGetRefs = True rules = [ Rule( key='goofs', extractor=Rules( foreach='//div[@class="soda odd"]', rules=[ Rule( key='text', extractor=Path('./text()') ), Rule( key='category', extractor=Path('./preceding-sibling::h4[1]/text()') ), Rule( key='spoiler_category', extractor=Path('./h4/text()') ) ], transform=_process_goof ) ) ] class DOMHTMLQuotesParser(DOMParserBase): """Parser for the "memorable quotes" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: qparser = DOMHTMLQuotesParser() result = qparser.parse(quotes_html_string) """ _defGetRefs = True rules = [ Rule( key='quotes_odd', extractor=Path( foreach='//div[@class="quote soda odd"]', path='.//text()', transform=lambda x: x .strip() .replace(' \n', '::') .replace('::\n', '::') .replace('\n', ' ') ) ), Rule( key='quotes_even', extractor=Path( foreach='//div[@class="quote soda even"]', path='.//text()', transform=lambda x: x .strip() .replace(' \n', '::') .replace('::\n', '::') .replace('\n', ' ') ) ) ] preprocessors = [ (re.compile('

', re.I), '') ] def preprocess_dom(self, dom): # Remove "link this quote" links. preprocessors.remove(dom, '//span[@class="linksoda"]') preprocessors.remove(dom, '//div[@class="sharesoda_pre"]') return dom def postprocess_data(self, data): quotes = data.get('quotes_odd', []) + data.get('quotes_even', []) if not quotes: return {} quotes = [q.split('::') for q in quotes] return {'quotes': quotes} class DOMHTMLReleaseinfoParser(DOMParserBase): """Parser for the "release dates" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: rdparser = DOMHTMLReleaseinfoParser() result = rdparser.parse(releaseinfo_html_string) """ rules = [ Rule( key='release dates', extractor=Rules( foreach='//table[contains(@class, "release-dates-table-test-only")]//tr', rules=[ Rule( key='country', extractor=Path('.//td[1]//text()') ), Rule( key='date', extractor=Path('.//td[2]//text()') ), Rule( key='notes', extractor=Path('.//td[3]//text()') ) ] ) ), Rule( key='akas', extractor=Rules( foreach='//table[contains(@class, "akas-table-test-only")]//tr', rules=[ Rule( key='countries', extractor=Path('./td[1]/text()') ), Rule( key='title', extractor=Path('./td[2]/text()') ) ] ) ) ] preprocessors = [ (re.compile('(
)', re.I | re.M | re.S), r'
\1
') ] def postprocess_data(self, data): if not ('release dates' in data or 'akas' in data): return data releases = data.get('release dates') or [] rl = [] for i in releases: country = i.get('country') date = i.get('date') if not (country and date): continue country = country.strip() date = date.strip() if not (country and date): continue notes = i.get('notes') info = '%s::%s' % (country, date) if notes: notes = notes.replace('\n', '') i['notes'] = notes info += notes rl.append(info) if releases: data['raw release dates'] = data['release dates'] del data['release dates'] if rl: data['release dates'] = rl akas = data.get('akas') or [] nakas = [] for aka in akas: title = (aka.get('title') or '').strip() if not title: continue countries = (aka.get('countries') or '').split(',') if not countries: nakas.append(title) else: for country in countries: nakas.append('%s %s' % (title, country.strip())) if akas: data['raw akas'] = data['akas'] del data['akas'] if nakas: data['akas'] = data['akas from release info'] = nakas return data class DOMHTMLRatingsParser(DOMParserBase): """Parser for the "user ratings" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: rparser = DOMHTMLRatingsParser() result = rparser.parse(userratings_html_string) """ re_means = re.compile(r'mean\s*=\s*([0-9]\.[0-9])\s*median\s*=\s*([0-9])', re.I) rules = [ Rule( key='votes', extractor=Rules( foreach='//th[@class="firstTableCoulmn"]/../../tr', rules=[ Rule( key='ordinal', extractor=Path('./td[1]/div//text()') ), Rule( key='votes', extractor=Path('./td[3]/div/div//text()') ) ] ) ), Rule( key='mean and median', extractor=Path( '//div[starts-with(normalize-space(text()), "Arithmetic mean")]/text()' ) ), Rule( key='demographics', extractor=Rules( foreach='//div[@class="smallcell"]', rules=[ Rule( key='link', extractor=Path('./a/@href') ), Rule( key='rating', extractor=Path('..//div[@class="bigcell"]//text()') ), Rule( key='votes', extractor=Path('./a/text()') ) ] ) ) ] def postprocess_data(self, data): nd = {} demographics = data.get('demographics') if demographics: dem = {} for dem_data in demographics: link = (dem_data.get('link') or '').strip() votes = (dem_data.get('votes') or '').strip() rating = (dem_data.get('rating') or '').strip() if not (link and votes and rating): continue eq_idx = link.rfind('=') if eq_idx == -1: continue info = link[eq_idx + 1:].replace('_', ' ') try: votes = int(votes.replace(',', '')) except Exception: continue try: rating = float(rating) except Exception: continue dem[info] = {'votes': votes, 'rating': rating} nd['demographics'] = dem votes = data.get('votes', []) if votes: nd['number of votes'] = {} for v_info in votes: ordinal = v_info.get('ordinal') nr_votes = v_info.get('votes') if not (ordinal and nr_votes): continue try: ordinal = int(ordinal) except Exception: continue try: nr_votes = int(nr_votes.replace(',', '')) except Exception: continue nd['number of votes'][ordinal] = nr_votes mean = data.get('mean and median', '') if mean: means = self.re_means.findall(mean) if means and len(means[0]) == 2: am, med = means[0] try: am = float(am) except (ValueError, OverflowError): pass if isinstance(am, float): 
nd['arithmetic mean'] = am try: med = int(med) except (ValueError, OverflowError): pass if isinstance(med, int): nd['median'] = med return nd def _normalize_href(href): if (href is not None) and (not href.lower().startswith('http://')): if href.startswith('/'): href = href[1:] # TODO: imdbURL_base may be set by the user! href = '%s%s' % (imdbURL_base, href) return href class DOMHTMLCriticReviewsParser(DOMParserBase): """Parser for the "critic reviews" pages of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: crparser = DOMHTMLCriticReviewsParser() result = crparser.parse(criticreviews_html_string) """ kind = 'critic reviews' rules = [ Rule( key='metascore', extractor=Path('//div[@class="metascore_wrap"]/div/span//text()') ), Rule( key='metacritic url', extractor=Path('//div[@class="article"]/div[@class="see-more"]/a/@href') ) ] class DOMHTMLReviewsParser(DOMParserBase): """Parser for the "reviews" pages of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: rparser = DOMHTMLReviewsParser() result = rparser.parse(reviews_html_string) """ rules = [ Rule( key='reviews', extractor=Rules( foreach='//div[@class="review-container"]', rules=[ Rule( key='text', extractor=Path('.//div[@class="text show-more__control"]//text()') ), Rule( key='helpful', extractor=Path('.//div[@class="text-muted"]/text()[1]') ), Rule( key='title', extractor=Path('.//div[@class="title"]//text()') ), Rule( key='author', extractor=Path('.//span[@class="display-name-link"]/a/@href') ), Rule( key='date', extractor=Path('.//span[@class="review-date"]//text()') ), Rule( key='rating', extractor=Path('.//span[@class="point-scale"]/preceding-sibling::span[1]/text()') ) ], transform=lambda x: ({ 'content': x.get('text', '').replace('\n', ' ').replace(' ', ' ').strip(), 'helpful': [int(s) for s in x.get('helpful', '').split() if s.isdigit()], 'title': x.get('title', '').strip(), 'author': analyze_imdbid(x.get('author')), 'date': x.get('date', '').strip(), 'rating': x.get('rating', '').strip() }) ) ) ] preprocessors = [('
', '
\n')] def postprocess_data(self, data): for review in data.get('reviews', []): if review.get('rating') and len(review['rating']) == 2: review['rating'] = int(review['rating'][0]) else: review['rating'] = None if review.get('helpful') and len(review['helpful']) == 2: review['not_helpful'] = review['helpful'][1] - review['helpful'][0] review['helpful'] = review['helpful'][0] else: review['helpful'] = 0 review['not_helpful'] = 0 review['author'] = "ur%s" % review['author'] return data class DOMHTMLFullCreditsParser(DOMParserBase): """Parser for the "full credits" (series cast section) page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: fcparser = DOMHTMLFullCreditsParser() result = fcparser.parse(fullcredits_html_string) """ kind = 'full credits' rules = [ Rule( key='cast', extractor=Rules( foreach='//table[@class="cast_list"]//tr[@class="odd" or @class="even"]', rules=[ Rule( key='person', extractor=Path('.//text()') ), Rule( key='link', extractor=Path('./td[2]/a/@href') ), Rule( key='roleID', extractor=Path('./td[4]//div[@class="_imdbpyrole"]/@roleid') ), Rule( key='headshot', extractor=Path('./td[@class="primary_photo"]/a/img/@loadlate') ) ], transform=lambda x: build_person( x.get('person', ''), personID=analyze_imdbid(x.get('link')), roleID=(x.get('roleID', '')).split('/'), headshot=(x.get('headshot', '')) ) ) ) ] preprocessors = [ (_reRolesMovie, _manageRoles) ] def postprocess_data(self, data): clean_cast = [] for person in data.get('cast', []): if person.personID and person.get('name'): clean_cast.append(person) if clean_cast: data['cast'] = clean_cast return data class DOMHTMLOfficialsitesParser(DOMParserBase): """Parser for the "official sites", "external reviews", "miscellaneous links", "sound clips", "video clips" and "photographs" pages of a given movie. 
The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: osparser = DOMHTMLOfficialsitesParser() result = osparser.parse(officialsites_html_string) """ rules = [ Rule( foreach='//h4[@class="li_group"]', key=Path( './text()', transform=lambda x: x.strip().lower() ), extractor=Rules( foreach='./following::ul[1]/li/a', rules=[ Rule( key='link', extractor=Path('./@href') ), Rule( key='info', extractor=Path('./text()') ) ], transform=lambda x: ( x.get('info').strip(), unquote(_normalize_href(x.get('link'))) ) ) ) ] class DOMHTMLConnectionParser(DOMParserBase): """Parser for the "connections" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: connparser = DOMHTMLConnectionParser() result = connparser.parse(connections_html_string) """ _containsObjects = True rules = [ Rule( key='connection', extractor=Rules( foreach='//div[@class="_imdbpy"]', rules=[ Rule( key=Path('./h5/text()', transform=transformers.lower), extractor=Rules( foreach='./a', rules=[ Rule( key='title', extractor=Path('./text()') ), Rule( key='movieID', extractor=Path('./@href') ) ] ) ) ] ) ) ] preprocessors = [ ('
', '
'), # To get the movie's year. (' (', ' ('), ('\n
', ''), ('
- ', '::') ] def postprocess_data(self, data): for key in list(data.keys()): nl = [] for v in data[key]: title = v['title'] ts = title.split('::', 1) title = ts[0].strip() notes = '' if len(ts) == 2: notes = ts[1].strip() m = Movie(title=title, movieID=analyze_imdbid(v['movieID']), accessSystem=self._as, notes=notes, modFunct=self._modFunct) nl.append(m) data[key] = nl if not data: return {} return {'connections': data} class DOMHTMLLocationsParser(DOMParserBase): """Parser for the "locations" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: lparser = DOMHTMLLocationsParser() result = lparser.parse(locations_html_string) """ rules = [ Rule( key='locations', extractor=Rules( foreach='//dt', rules=[ Rule( key='place', extractor=Path('.//text()') ), Rule( key='note', extractor=Path('./following-sibling::dd[1]//text()') ) ], transform=lambda x: ('%s::%s' % (x['place'].strip(), (x['note'] or '').strip())).strip(':') ) ) ] class DOMHTMLTechParser(DOMParserBase): """Parser for the "technical", "publicity" (for people) and "contacts" (for people) pages of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: tparser = DOMHTMLTechParser() result = tparser.parse(technical_html_string) """ kind = 'tech' re_space = re.compile(r'\s+') rules = [ Rule( key='tech', extractor=Rules( foreach='//table//tr/td[@class="label"]', rules=[ Rule( key=Path( './text()', transform=lambda x: x.lower().strip()), extractor=Path( '..//td[2]//text()', transform=lambda x: [t.strip() for t in x.split(':::') if t.strip()] ) ) ] ) ) ] preprocessors = [ (re.compile('(
.*?
)', re.I), r'
\1
'), (re.compile('((
|

|))\n?
(?!'), # the ones below are for the publicity parser (re.compile('

(.*?)

', re.I), r'\1
'), (re.compile('()', re.I), r'\1::'), (re.compile('()', re.I), r'\n\1'), (re.compile(r'\|', re.I), r':::'), (re.compile('
', re.I), r':::') # this is for splitting individual entries ] def postprocess_data(self, data): info = {} for section in data.get('tech', []): info.update(section) for key, value in info.items(): if isinstance(value, list): info[key] = [self.re_space.sub(' ', x).strip() for x in value] else: info[key] = self.re_space.sub(' ', value).strip() return {self.kind: info} class DOMHTMLNewsParser(DOMParserBase): """Parser for the "news" page of a given movie or person. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: nwparser = DOMHTMLNewsParser() result = nwparser.parse(news_html_string) """ _defGetRefs = True rules = [ Rule( key='news', extractor=Rules( foreach='//h2', rules=[ Rule( key='title', extractor=Path('./text()') ), Rule( key='fromdate', extractor=Path('./following-sibling::p[1]/small//text()') ), Rule( key='body', extractor=Path('../following-sibling::p[2]//text()') ), Rule( key='link', extractor=Path('../..//a[text()="Permalink"]/@href') ), Rule( key='fulllink', extractor=Path('../..//a[starts-with(text(), "See full article at")]/@href') ) ], transform=lambda x: { 'title': x.get('title').strip(), 'date': x.get('fromdate').split('|')[0].strip(), 'from': x.get('fromdate').split('|')[1].replace('From ', '').strip(), 'body': (x.get('body') or '').strip(), 'link': _normalize_href(x.get('link')), 'full article link': _normalize_href(x.get('fulllink')) } ) ) ] preprocessors = [ (re.compile('(]+>

)', re.I), r'
\1'), (re.compile('(
)', re.I), r'
\1'), (re.compile('

', re.I), r'') ] def postprocess_data(self, data): if 'news' not in data: return {} for news in data['news']: if 'full article link' in news: if news['full article link'] is None: del news['full article link'] return data def _parse_review(x): result = {} title = x.get('title').strip() if title[-1] == ':': title = title[:-1] result['title'] = title result['link'] = _normalize_href(x.get('link')) kind = x.get('kind').strip() if kind[-1] == ':': kind = kind[:-1] result['review kind'] = kind text = x.get('review').replace('\n\n', '||').replace('\n', ' ').split('||') review = '\n'.join(text) if x.get('author') is not None: author = x.get('author').strip() review = review.split(author)[0].strip() result['review author'] = author[2:] if x.get('item') is not None: item = x.get('item').strip() review = review[len(item):].strip() review = "%s: %s" % (item, review) result['review'] = review return result class DOMHTMLSeasonEpisodesParser(DOMParserBase): """Parser for the "episode list" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: sparser = DOMHTMLSeasonEpisodesParser() result = sparser.parse(episodes_html_string) """ rules = [ Rule( key='series link', extractor=Path('//div[@class="parent"]//a/@href') ), Rule( key='series title', extractor=Path('//head/meta[@property="og:title"]/@content') ), Rule( key='_seasons', extractor=Path( foreach='//select[@id="bySeason"]//option', path='./@value' ) ), Rule( key='_current_season', extractor=Path('//select[@id="bySeason"]//option[@selected]/@value') ), Rule( key='episodes', extractor=Rules( foreach='//div[@class="info"]', rules=[ Rule( key=Path('.//meta/@content', transform=lambda x: 'episode %s' % x), extractor=Rules( rules=[ Rule( key='link', extractor=Path('.//strong//a[@href][1]/@href') ), Rule( key='original air date', extractor=Path('.//div[@class="airdate"]/text()') ), Rule( key='title', extractor=Path('.//strong//text()') ), Rule( key='rating', extractor=Path( './/div[contains(@class, "ipl-rating-star")][1]' '/span[@class="ipl-rating-star__rating"][1]/text()' ) ), Rule( key='votes', extractor=Path( './/div[contains(@class, "ipl-rating-star")][1]' '/span[@class="ipl-rating-star__total-votes"][1]/text()' ) ), Rule( key='plot', extractor=Path('.//div[@class="item_description"]//text()') ) ] ) ) ] ) ) ] def postprocess_data(self, data): series_id = analyze_imdbid(data.get('series link')) series_title = data.get('series title', '').strip() selected_season = data.get('_current_season', 'unknown season').strip() if not (series_id and series_title): return {} series = Movie(title=series_title, movieID=str(series_id), accessSystem=self._as, modFunct=self._modFunct) if series.get('kind') == 'movie': series['kind'] = 'tv series' try: selected_season = int(selected_season) except ValueError: pass nd = {selected_season: {}} if 'episode -1' in data: counter = 1 for episode in data['episode -1']: while 'episode %d' % counter in data: counter += 1 k = 'episode %d' % counter data[k] = [episode] del data['episode -1'] episodes = data.get('episodes', 
[]) for ep in episodes: if not ep: continue episode_nr, episode = list(ep.items())[0] if not episode_nr.startswith('episode '): continue episode_nr = episode_nr[8:].rstrip() try: episode_nr = int(episode_nr) except ValueError: pass episode_id = analyze_imdbid(episode.get('link', '')) episode_air_date = episode.get('original air date', '').strip() episode_title = episode.get('title', '').strip() episode_plot = episode.get('plot', '') episode_rating = episode.get('rating', '') episode_votes = episode.get('votes', '') if not (episode_nr is not None and episode_id and episode_title): continue ep_obj = Movie(movieID=episode_id, title=episode_title, accessSystem=self._as, modFunct=self._modFunct) ep_obj['kind'] = 'episode' ep_obj['episode of'] = series ep_obj['season'] = selected_season ep_obj['episode'] = episode_nr if episode_rating: try: ep_obj['rating'] = float(episode_rating) except Exception: pass if episode_votes: try: ep_obj['votes'] = int(episode_votes.replace(',', '') .replace('.', '').replace('(', '').replace(')', '')) except Exception: pass if episode_air_date: ep_obj['original air date'] = episode_air_date if episode_air_date[-4:].isdigit(): ep_obj['year'] = episode_air_date[-4:] if episode_plot: ep_obj['plot'] = episode_plot nd[selected_season][episode_nr] = ep_obj _seasons = data.get('_seasons') or [] for idx, season in enumerate(_seasons): try: _seasons[idx] = int(season) except ValueError: pass return {'episodes': nd, '_seasons': _seasons, '_current_season': selected_season} def _build_episode(x): """Create a Movie object for a given series' episode.""" episode_id = analyze_imdbid(x.get('link')) episode_title = x.get('title') e = Movie(movieID=episode_id, title=episode_title) e['kind'] = 'episode' oad = x.get('oad') if oad: e['original air date'] = oad.strip() year = x.get('year') if year is not None: year = year[5:] if year == 'unknown': year = '????' 
if year and year.isdigit(): year = int(year) e['year'] = year else: if oad and oad[-4:].isdigit(): e['year'] = int(oad[-4:]) epinfo = x.get('episode') if epinfo is not None: season, episode = epinfo.split(':')[0].split(',') e['season'] = int(season[7:]) e['episode'] = int(episode[8:]) else: e['season'] = 'unknown' e['episode'] = 'unknown' plot = x.get('plot') if plot: e['plot'] = plot.strip() return e class DOMHTMLEpisodesParser(DOMParserBase): """Parser for the "episode list" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: eparser = DOMHTMLEpisodesParser() result = eparser.parse(episodes_html_string) """ # XXX: no more used for the list of episodes parser, # but only for the episodes cast parser (see below). _containsObjects = True kind = 'episodes list' _episodes_path = "..//h4" _oad_path = "./following-sibling::span/strong[1]/text()" def _init(self): self.rules = [ Rule( key='series title', extractor=Path('//title/text()') ), Rule( key='series movieID', extractor=Path( './/h1/a[@class="main"]/@href', transform=analyze_imdbid ) ), Rule( key='episodes', extractor=Rules( foreach='//div[@class="_imdbpy"]/h3', rules=[ Rule( key='./a/@name', extractor=Rules( foreach=self._episodes_path, rules=[ Rule( key='link', extractor=Path('./a/@href') ), Rule( key='title', extractor=Path('./a/text()') ), Rule( key='year', extractor=Path('./preceding-sibling::a[1]/@name') ), Rule( key='episode', extractor=Path('./text()[1]') ), Rule( key='oad', extractor=Path(self._oad_path) ), Rule( key='plot', extractor=Path('./following-sibling::text()[1]') ) ], transform=_build_episode ) ) ] ) ) ] if self.kind == 'episodes cast': self.rules += [ Rule( key='cast', extractor=Rules( foreach='//h4', rules=[ Rule( key=Path('./text()[1]', transform=transformers.strip), extractor=Rules( foreach='./following-sibling::table[1]//td[@class="nm"]', rules=[ Rule( 
key='person', extractor=Path('..//text()') ), Rule( key='link', extractor=Path('./a/@href') ), Rule( key='roleID', extractor=Path('../td[4]//div[@class="_imdbpyrole"]/@roleid') ) ], transform=lambda x: build_person( x.get('person') or '', personID=analyze_imdbid(x.get('link')), roleID=(x.get('roleID') or '').split('/'), accessSystem=self._as, modFunct=self._modFunct ) ) ) ] ) ) ] preprocessors = [ (re.compile('(
\n)(

)', re.I), r'

\1
\2'), (re.compile('(

\n\n)
', re.I), r'\1'), (re.compile('

(.*?)

', re.I), r'

\1

'), (_reRolesMovie, _manageRoles), (re.compile('(

\n)(
)', re.I), r'\1\2') ] def postprocess_data(self, data): # A bit extreme? if 'series title' not in data: return {} if 'series movieID' not in data: return {} stitle = data['series title'].replace('- Episode list', '') stitle = stitle.replace('- Episodes list', '') stitle = stitle.replace('- Episode cast', '') stitle = stitle.replace('- Episodes cast', '') stitle = stitle.strip() if not stitle: return {} seriesID = data['series movieID'] if seriesID is None: return {} series = Movie(title=stitle, movieID=str(seriesID), accessSystem=self._as, modFunct=self._modFunct) nd = {} for key in list(data.keys()): if key.startswith('filter-season-') or key.startswith('season-'): season_key = key.replace('filter-season-', '').replace('season-', '') try: season_key = int(season_key) except ValueError: pass nd[season_key] = {} ep_counter = 1 for episode in data[key]: if not episode: continue episode_key = episode.get('episode') if episode_key is None: continue if not isinstance(episode_key, int): episode_key = ep_counter ep_counter += 1 cast_key = 'Season %s, Episode %s:' % (season_key, episode_key) if cast_key in data: cast = data[cast_key] for i in range(len(cast)): cast[i].billingPos = i + 1 episode['cast'] = cast episode['episode of'] = series nd[season_key][episode_key] = episode if len(nd) == 0: return {} return {'episodes': nd} class DOMHTMLFaqsParser(DOMParserBase): """Parser for the "FAQ" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: fparser = DOMHTMLFaqsParser() result = fparser.parse(faqs_html_string) """ _defGetRefs = True rules = [ Rule( key='faqs', extractor=Rules( foreach='//div[@class="section"]', rules=[ Rule( key='question', extractor=Path('./h3/a/span/text()') ), Rule( key='answer', extractor=Path('../following-sibling::div[1]//text()') ) ], transform=lambda x: '%s::%s' % ( x.get('question').strip(), '\n\n'.join(x.get('answer').replace('\n\n', '\n').strip().split('||')) ) ) ) ] preprocessors = [ (re.compile('

', re.I), r'||'), (re.compile('

(.*?)

\n', re.I), r'||\1--'), (re.compile('(.*?)', re.I), r'[spoiler]\1[/spoiler]') ] class DOMHTMLAiringParser(DOMParserBase): """Parser for the "airing" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: aparser = DOMHTMLAiringParser() result = aparser.parse(airing_html_string) """ _containsObjects = True rules = [ Rule( key='series title', extractor=Path( '//title/text()', transform=lambda x: x.replace(' - TV schedule', '') ) ), Rule( key='series id', extractor=Path('//h1/a[@href]/@href') ), Rule( key='tv airings', extractor=Rules( foreach='//tr[@class]', rules=[ Rule( key='date', extractor=Path('./td[1]//text()') ), Rule( key='time', extractor=Path('./td[2]//text()') ), Rule( key='channel', extractor=Path('./td[3]//text()') ), Rule( key='link', extractor=Path('./td[4]/a[1]/@href') ), Rule( key='title', extractor=Path('./td[4]//text()') ), Rule( key='season', extractor=Path('./td[5]//text()') ) ], transform=lambda x: { 'date': x.get('date'), 'time': x.get('time'), 'channel': x.get('channel').strip(), 'link': x.get('link'), 'title': x.get('title'), 'season': (x.get('season') or '').strip() } ) ) ] def postprocess_data(self, data): if len(data) == 0: return {} seriesTitle = data.get('series title') or '' seriesID = analyze_imdbid(data.get('series id')) if seriesID and 'airing' in data: for airing in data['airing']: title = airing.get('title', '').strip() if not title: epsTitle = seriesTitle if seriesID is None: continue epsID = seriesID else: epsTitle = '%s {%s}' % (data['series title'], airing['title']) epsID = analyze_imdbid(airing['link']) e = Movie(title=epsTitle, movieID=epsID) airing['episode'] = e del airing['link'] del airing['title'] if not airing['season']: del airing['season'] if 'series title' in data: del data['series title'] if 'series id' in data: del data['series id'] if 'airing' in data: data['airing'] = [_f for _f 
in data['airing'] if _f] if 'airing' not in data or not data['airing']: return {} return data class DOMHTMLParentsGuideParser(DOMParserBase): """Parser for the "parents guide" page of a given movie. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: pgparser = DOMHTMLParentsGuideParser() result = pgparser.parse(parentsguide_html_string) """ rules = [ Rule( key='parents guide', extractor=Rules( foreach='//tr[@class="ipl-zebra-list__item"]', rules=[ Rule( key=Path( './td[1]/text()', transform=transformers.lower ), extractor=Path( path='./td[2]//text()', transform=lambda x: [ re_space.sub(' ', t) for t in x.split('\n') if t.strip() ] ) ) ] ) ) ] def postprocess_data(self, data): ret = {} for sect in data.get('parents guide', []): for key, value in sect.items(): ret[key] = value if isinstance(ret.get('mpaa'), list): ret['mpaa'] = ret['mpaa'][0] return ret _OBJECTS = { 'movie_parser': ((DOMHTMLMovieParser,), None), 'full_credits_parser': ((DOMHTMLFullCreditsParser,), None), 'plot_parser': ((DOMHTMLPlotParser,), None), 'movie_awards_parser': ((DOMHTMLAwardsParser,), None), 'taglines_parser': ((DOMHTMLTaglinesParser,), None), 'keywords_parser': ((DOMHTMLKeywordsParser,), None), 'crazycredits_parser': ((DOMHTMLCrazyCreditsParser,), None), 'goofs_parser': ((DOMHTMLGoofsParser,), None), 'alternateversions_parser': ((DOMHTMLAlternateVersionsParser,), None), 'trivia_parser': ((DOMHTMLTriviaParser,), None), 'soundtrack_parser': ((DOMHTMLSoundtrackParser,), None), 'quotes_parser': ((DOMHTMLQuotesParser,), None), 'releasedates_parser': ((DOMHTMLReleaseinfoParser,), None), 'ratings_parser': ((DOMHTMLRatingsParser,), None), 'criticrev_parser': ((DOMHTMLCriticReviewsParser,), {'kind': 'critic reviews'}), 'reviews_parser': ((DOMHTMLReviewsParser,), {'kind': 'reviews'}), 'externalsites_parser': ((DOMHTMLOfficialsitesParser,), None), 'officialsites_parser': 
((DOMHTMLOfficialsitesParser,), None), 'externalrev_parser': ((DOMHTMLOfficialsitesParser,), None), 'misclinks_parser': ((DOMHTMLOfficialsitesParser,), None), 'soundclips_parser': ((DOMHTMLOfficialsitesParser,), None), 'videoclips_parser': ((DOMHTMLOfficialsitesParser,), None), 'photosites_parser': ((DOMHTMLOfficialsitesParser,), None), 'connections_parser': ((DOMHTMLConnectionParser,), None), 'tech_parser': ((DOMHTMLTechParser,), None), 'locations_parser': ((DOMHTMLLocationsParser,), None), 'news_parser': ((DOMHTMLNewsParser,), None), 'episodes_parser': ((DOMHTMLEpisodesParser,), None), 'season_episodes_parser': ((DOMHTMLSeasonEpisodesParser,), None), 'movie_faqs_parser': ((DOMHTMLFaqsParser,), None), 'airing_parser': ((DOMHTMLAiringParser,), None), 'parentsguide_parser': ((DOMHTMLParentsGuideParser,), None) } imdbpy-6.8/imdb/parser/http/personParser.py000066400000000000000000000437601351454127000211120ustar00rootroot00000000000000# Copyright 2004-2019 Davide Alberani # 2008-2018 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the classes (and the instances) that are used to parse the IMDb pages on the www.imdb.com server about a person. 
For example, for "Mel Gibson" the referred pages would be:

categorized
    http://www.imdb.com/name/nm0000154/maindetails

biography
    http://www.imdb.com/name/nm0000154/bio

...and so on.
"""

from __future__ import absolute_import, division, print_function, unicode_literals

import re

from imdb.utils import analyze_name

from .movieParser import (
    DOMHTMLAwardsParser,
    DOMHTMLNewsParser,
    DOMHTMLOfficialsitesParser,
    DOMHTMLTechParser
)
from .piculet import Path, Rule, Rules, transformers
from .utils import DOMParserBase, analyze_imdbid, build_movie


_re_spaces = re.compile(r'\s+')
_reRoles = re.compile(r'(
  • .*? \.\.\.\. )(.*?)(
  • |
    )', re.I | re.M | re.S) class DOMHTMLMaindetailsParser(DOMParserBase): """Parser for the "categorized" (maindetails) page of a given person. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: cparser = DOMHTMLMaindetailsParser() result = cparser.parse(categorized_html_string) """ _containsObjects = True _name_imdb_index = re.compile(r'\([IVXLCDM]+\)') _birth_rules = [ Rule( key='birth date', extractor=Path('.//time[@itemprop="birthDate"]/@datetime') ), Rule( key='birth place', extractor=Path('.//a[starts-with(@href, "/search/name?birth_place=")]/text()') ) ] _death_rules = [ Rule( key='death date', extractor=Path('.//time[@itemprop="deathDate"]/@datetime') ), Rule( key='death place', extractor=Path('.//a[starts-with(@href, "/search/name?death_place=")]/text()') ) ] _film_rules = [ Rule( key='link', extractor=Path('./b/a[1]/@href') ), Rule( key='title', extractor=Path('./b/a[1]/text()') ), Rule( key='notes', extractor=Path('./b/following-sibling::text()') ), Rule( key='year', extractor=Path('./span[@class="year_column"]/text()') ), Rule( key='status', extractor=Path('./a[@class="in_production"]/text()') ), Rule( key='rolesNoChar', extractor=Path('.//br/following-sibling::text()') ), Rule( key='chrRoles', extractor=Path('./a[@imdbpyname]/@imdbpyname') ) ] rules = [ Rule( key='name', extractor=Path( '//h1[@class="header"]//text()', transform=lambda x: analyze_name(x) ) ), Rule( key='name_index', extractor=Path('//h1[@class="header"]/span[1]/text()') ), Rule( key='birth info', extractor=Rules( section='//div[h4="Born:"]', rules=_birth_rules ) ), Rule( key='death info', extractor=Rules( section='//div[h4="Died:"]', rules=_death_rules, ) ), Rule( key='headshot', extractor=Path('//td[@id="img_primary"]//div[@class="image"]/a/img/@src') ), Rule( key='akas', extractor=Path( '//div[h4="Alternate Names:"]/text()', transform=lambda x: x.strip().split(' ') ) 
), Rule( key='filmography', extractor=Rules( foreach='//div[starts-with(@id, "filmo-head-")]', rules=[ Rule( key=Path( './a[@name]/text()', transform=lambda x: x.lower().replace(': ', ' ') ), extractor=Rules( foreach='./following-sibling::div[1]/div[starts-with(@class, "filmo-row")]', rules=_film_rules, transform=lambda x: build_movie( x.get('title') or '', year=x.get('year'), movieID=analyze_imdbid(x.get('link') or ''), rolesNoChar=(x.get('rolesNoChar') or '').strip(), chrRoles=(x.get('chrRoles') or '').strip(), additionalNotes=x.get('notes'), status=x.get('status') or None ) ) ) ] ) ), Rule( key='in development', extractor=Rules( foreach='//div[starts-with(@class,"devitem")]', rules=[ Rule( key='link', extractor=Path('./a/@href') ), Rule( key='title', extractor=Path('./a/text()') ) ], transform=lambda x: build_movie( x.get('title') or '', movieID=analyze_imdbid(x.get('link') or ''), roleID=(x.get('roleID') or '').split('/'), status=x.get('status') or None ) ) ) ] preprocessors = [ ('
    ', ''), ('
    ', '
    ') ] def postprocess_data(self, data): for key in ['name']: if (key in data) and isinstance(data[key], dict): subdata = data[key] del data[key] data.update(subdata) for what in 'birth date', 'death date': if what in data and not data[what]: del data[what] name_index = (data.get('name_index') or '').strip() if name_index: if self._name_imdb_index.match(name_index): data['imdbIndex'] = name_index[1:-1] del data['name_index'] # XXX: the code below is for backwards compatibility # probably could be removed for key in list(data.keys()): if key.startswith('actor '): if 'actor' not in data: data['actor'] = [] data['actor'].extend(data[key]) del data[key] if key.startswith('actress '): if 'actress' not in data: data['actress'] = [] data['actress'].extend(data[key]) del data[key] if key.startswith('self '): if 'self' not in data: data['self'] = [] data['self'].extend(data[key]) del data[key] if key == 'birth place': data['birth notes'] = data[key] del data[key] if key == 'death place': data['death notes'] = data[key] del data[key] return data class DOMHTMLBioParser(DOMParserBase): """Parser for the "biography" page of a given person. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: bioparser = DOMHTMLBioParser() result = bioparser.parse(biography_html_string) """ _defGetRefs = True _birth_rules = [ Rule( key='birth date', extractor=Path( './time/@datetime', transform=lambda s: '%4d-%02d-%02d' % tuple(map(int, s.split('-'))) ) ), Rule( key='birth notes', extractor=Path('./a[starts-with(@href, "/search/name?birth_place=")]/text()') ) ] _death_rules = [ Rule( key='death date', extractor=Path( './time/@datetime', transform=lambda s: '%4d-%02d-%02d' % tuple(map(int, s.split('-'))) ) ), Rule( key='death cause', extractor=Path( './text()', transform=lambda x: ''.join(x).strip()[2:].lstrip() ) ), Rule( key='death notes', extractor=Path( '..//text()', transform=lambda x: _re_spaces.sub(' ', (x or '').strip().split('\n')[-1]) ) ) ] rules = [ Rule( key='headshot', extractor=Path('//img[@class="poster"]/@src') ), Rule( key='birth info', extractor=Rules( section='//table[@id="overviewTable"]' '//td[text()="Born"]/following-sibling::td[1]', rules=_birth_rules ) ), Rule( key='death info', extractor=Rules( section='//table[@id="overviewTable"]' '//td[text()="Died"]/following-sibling::td[1]', rules=_death_rules ) ), Rule( key='nick names', extractor=Path( '//table[@id="overviewTable"]' '//td[starts-with(text(), "Nickname")]/following-sibling::td[1]/text()', reduce=lambda xs: '|'.join(xs), transform=lambda x: [ n.strip().replace(' (', '::(', 1) for n in x.split('|') if n.strip() ] ) ), Rule( key='birth name', extractor=Path( '//table[@id="overviewTable"]' '//td[text()="Birth Name"]/following-sibling::td[1]/text()', transform=lambda x: x.strip() ) ), Rule( key='height', extractor=Path( '//table[@id="overviewTable"]' '//td[text()="Height"]/following-sibling::td[1]/text()', transform=transformers.strip ) ), Rule( key='mini biography', extractor=Rules( foreach='//h4[starts-with(text(), "Mini Bio")]/following-sibling::div', rules=[ Rule( key='bio', extractor=Path('.//text()') ), Rule( key='by', extractor=Path('.//a[@name="ba"]//text()') ) ], 
transform=lambda x: "%s::%s" % ( (x.get('bio') or '').split('- IMDb Mini Biography By:')[0].strip(), (x.get('by') or '').strip() or 'Anonymous' ) ) ), Rule( key='spouse', extractor=Rules( foreach='//a[@name="spouse"]/following::table[1]//tr', rules=[ Rule( key='name', extractor=Path('./td[1]//text()') ), Rule( key='info', extractor=Path('./td[2]//text()') ) ], transform=lambda x: ("%s::%s" % ( x.get('name').strip(), (_re_spaces.sub(' ', x.get('info') or '')).strip())).strip(':') ) ), Rule( key='trade mark', extractor=Path( foreach='//div[@class="_imdbpyh4"]/h4[starts-with(text(), "Trade Mark")]' '/.././div[contains(@class, "soda")]', path='.//text()', transform=transformers.strip ) ), Rule( key='trivia', extractor=Path( foreach='//div[@class="_imdbpyh4"]/h4[starts-with(text(), "Trivia")]' '/.././div[contains(@class, "soda")]', path='.//text()', transform=transformers.strip ) ), Rule( key='quotes', extractor=Path( foreach='//div[@class="_imdbpyh4"]/h4[starts-with(text(), "Personal Quotes")]' '/.././div[contains(@class, "soda")]', path='.//text()', transform=transformers.strip ) ), Rule( key='salary history', extractor=Rules( foreach='//a[@name="salary"]/following::table[1]//tr', rules=[ Rule( key='title', extractor=Path('./td[1]//text()') ), Rule( key='info', extractor=Path('./td[2]//text()') ) ], transform=lambda x: "%s::%s" % ( x.get('title').strip(), _re_spaces.sub(' ', (x.get('info') or '')).strip()) ) ) ] preprocessors = [ (re.compile('(
    )', re.I), r'
    \1'), (re.compile('(
    \1'), (re.compile('(\n
    \s+)
    ', re.I + re.DOTALL), r'\1'), (re.compile('(
    )'), r'
    \1'), (re.compile('\.

    ([^\s])', re.I), r'. \1') ] def postprocess_data(self, data): for key in ['birth info', 'death info']: if key in data and isinstance(data[key], dict): subdata = data[key] del data[key] data.update(subdata) for what in 'birth date', 'death date', 'death cause': if what in data and not data[what]: del data[what] return data class DOMHTMLOtherWorksParser(DOMParserBase): """Parser for the "other works" page of a given person. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. Example:: owparser = DOMHTMLOtherWorksParser() result = owparser.parse(otherworks_html_string) """ _defGetRefs = True rules = [ Rule( key='other works', extractor=Path( foreach='//li[@class="ipl-zebra-list__item"]', path='.//text()', transform=transformers.strip ) ) ] class DOMHTMLPersonGenresParser(DOMParserBase): """Parser for the "by genre" and "by keywords" pages of a given person. The page should be provided as a string, as taken from the www.imdb.com server. The final result will be a dictionary, with a key for every relevant section. 
Example:: gparser = DOMHTMLPersonGenresParser() result = gparser.parse(bygenre_html_string) """ kind = 'genres' _containsObjects = True rules = [ Rule( key='genres', extractor=Rules( foreach='//b/a[@name]/following-sibling::a[1]', rules=[ Rule( key=Path('./text()', transform=str.lower), extractor=Rules( foreach='../../following-sibling::ol[1]/li//a[1]', rules=[ Rule( key='link', extractor=Path('./@href') ), Rule( key='title', extractor=Path('./text()') ), Rule( key='info', extractor=Path('./following-sibling::text()') ) ], transform=lambda x: build_movie( x.get('title') + x.get('info').split('[')[0], analyze_imdbid(x.get('link'))) ) ) ] ) ) ] def postprocess_data(self, data): if len(data) == 0: return {} return {self.kind: data} _OBJECTS = { 'maindetails_parser': ((DOMHTMLMaindetailsParser,), None), 'bio_parser': ((DOMHTMLBioParser,), None), 'otherworks_parser': ((DOMHTMLOtherWorksParser,), None), 'person_officialsites_parser': ((DOMHTMLOfficialsitesParser,), None), 'person_awards_parser': ((DOMHTMLAwardsParser,), {'subject': 'name'}), 'publicity_parser': ((DOMHTMLTechParser,), {'kind': 'publicity'}), 'person_contacts_parser': ((DOMHTMLTechParser,), {'kind': 'contacts'}), 'person_genres_parser': ((DOMHTMLPersonGenresParser,), None), 'person_keywords_parser': ((DOMHTMLPersonGenresParser,), {'kind': 'keywords'}), 'news_parser': ((DOMHTMLNewsParser,), None), } imdbpy-6.8/imdb/parser/http/piculet.py000066400000000000000000001004171351454127000200650ustar00rootroot00000000000000# Copyright (C) 2014-2018 H. Turgut Uyar # # Piculet is free software: you can redistribute it and/or modify # it under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # Piculet is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the # GNU Lesser General Public License for more details. # # You should have received a copy of the GNU Lesser General Public License # along with Piculet. If not, see <http://www.gnu.org/licenses/>. """Piculet is a module for scraping XML and HTML documents using XPath queries. It consists of this single source file with no dependencies other than the standard library, which makes it very easy to integrate into applications. It has been tested with Python 2.7, Python 3.4+, PyPy2 5.7+, and PyPy3 5.7+. For more information, please refer to the documentation: https://piculet.readthedocs.io/ """ from __future__ import absolute_import, division, print_function, unicode_literals import json import logging import os import re import sys from argparse import ArgumentParser from collections import deque from functools import partial from operator import itemgetter from pkgutil import find_loader PY2 = sys.version_info < (3, 0) if PY2: str, bytes = unicode, str if PY2: from cgi import escape as html_escape from HTMLParser import HTMLParser from StringIO import StringIO from htmlentitydefs import name2codepoint from urllib2 import urlopen else: from html import escape as html_escape from html.parser import HTMLParser from io import StringIO from urllib.request import urlopen if PY2: from contextlib import contextmanager @contextmanager def redirect_stdout(new_stdout): """Context manager for temporarily redirecting stdout.""" old_stdout, sys.stdout = sys.stdout, new_stdout try: yield new_stdout finally: sys.stdout = old_stdout else: from contextlib import redirect_stdout _logger = logging.getLogger(__name__) ########################################################### # HTML OPERATIONS ########################################################### # TODO: this is too fragile _CHARSET_TAGS = [ b'<meta http-equiv="content-type" content="text/html; charset=', b'<meta charset="' ] def decode_html(content, charset=None, fallback_charset='utf-8'): """Decode an HTML document according to a character set. :sig: (bytes, Optional[str], Optional[str]) -> str :param content: Content of HTML document to decode. :param charset: Character set of the page. :param fallback_charset: Character set to use if it can't be figured out.
:return: Decoded content of the document. """ if charset is None: for tag in _CHARSET_TAGS: start = content.find(tag) if start >= 0: charset_start = start + len(tag) charset_end = content.find(b'"', charset_start) charset = content[charset_start:charset_end].decode('ascii') _logger.debug('charset found in "meta": "%s"', charset) break else: _logger.debug('charset not found, using fallback: "%s"', fallback_charset) charset = fallback_charset _logger.debug('decoding for charset: "%s"', charset) return content.decode(charset) class HTMLNormalizer(HTMLParser): """HTML cleaner and XHTML converter. DOCTYPE declarations and comments are removed. """ SELF_CLOSING_TAGS = {'br', 'hr', 'img', 'input', 'link', 'meta'} """Tags to handle as self-closing.""" def __init__(self, omit_tags=None, omit_attrs=None): """Initialize this normalizer. :sig: (Optional[Iterable[str]], Optional[Iterable[str]]) -> None :param omit_tags: Tags to remove, along with all their content. :param omit_attrs: Attributes to remove.
""" if PY2: HTMLParser.__init__(self) else: super().__init__(convert_charrefs=True) self.omit_tags = set(omit_tags) if omit_tags is not None else set() # sig: Set[str] self.omit_attrs = set(omit_attrs) if omit_attrs is not None else set() # sig: Set[str] # stacks used during normalization self._open_tags = deque() self._open_omitted_tags = deque() def handle_starttag(self, tag, attrs): """Process the starting of a new element.""" if tag in self.omit_tags: _logger.debug('omitting: "%s"', tag) self._open_omitted_tags.append(tag) if not self._open_omitted_tags: # stack empty -> not in omit mode if '@' in tag: # email address in angle brackets, escaped so that the output stays well-formed print('&lt;%s&gt;' % tag, end='') return if (tag == 'li') and (self._open_tags[-1] == 'li'): _logger.debug('opened "li" without closing previous "li", adding closing tag') self.handle_endtag('li') attributes = [] for attr_name, attr_value in attrs: if attr_name in self.omit_attrs: _logger.debug('omitting "%s" attribute of "%s"', attr_name, tag) continue if attr_value is None: _logger.debug('no value for "%s" attribute of "%s", adding empty value', attr_name, tag) attr_value = '' markup = '%(name)s="%(value)s"' % { 'name': attr_name, 'value': html_escape(attr_value, quote=True) } attributes.append(markup) line = '<%(tag)s%(attrs)s%(slash)s>' % { 'tag': tag, 'attrs': (' ' + ' '.join(attributes)) if len(attributes) > 0 else '', 'slash': ' /' if tag in self.SELF_CLOSING_TAGS else '' } print(line, end='') if tag not in self.SELF_CLOSING_TAGS: self._open_tags.append(tag) def handle_endtag(self, tag): """Process the ending of an element.""" if not self._open_omitted_tags: # stack empty -> not in omit mode if tag not in self.SELF_CLOSING_TAGS: last = self._open_tags[-1] if (tag == 'ul') and (last == 'li'): _logger.debug('closing "ul" without closing last "li", adding closing tag') self.handle_endtag('li') if tag == last: # expected end tag print('</%(tag)s>' % {'tag': tag}, end='') self._open_tags.pop() elif tag not in self._open_tags:
_logger.debug('closing tag "%s" without opening tag', tag) # XXX: for overlapping tags (closed in the wrong order), this case gets invoked after the case below
elif tag == self._open_tags[-2]: _logger.debug('unexpected closing tag "%s" instead of "%s", closing both', tag, last) print('</%(tag)s>' % {'tag': last}, end='') print('</%(tag)s>' % {'tag': tag}, end='') self._open_tags.pop() self._open_tags.pop() elif (tag in self.omit_tags) and (tag == self._open_omitted_tags[-1]): # end of expected omitted tag self._open_omitted_tags.pop() def handle_data(self, data): """Process collected character data.""" if not self._open_omitted_tags: # stack empty -> not in omit mode line = html_escape(data) print(line.decode('utf-8') if PY2 and isinstance(line, bytes) else line, end='') def handle_entityref(self, name): """Process an entity reference.""" # XXX: doesn't get called if convert_charrefs=True num = name2codepoint.get(name) # we are sure we're on PY2 here if num is not None: print('&#%(ref)d;' % {'ref': num}, end='') def handle_charref(self, name): """Process a character reference.""" # XXX: doesn't get called if convert_charrefs=True print('&#%(ref)s;' % {'ref': name}, end='') # def feed(self, data): # super().feed(data) # # close all remaining open tags # for tag in reversed(self._open_tags): # print('</%(tag)s>' % {'tag': tag}, end='') def html_to_xhtml(document, omit_tags=None, omit_attrs=None): """Clean HTML and convert to XHTML. :sig: (str, Optional[Iterable[str]], Optional[Iterable[str]]) -> str :param document: HTML document to clean and convert. :param omit_tags: Tags to exclude from the output. :param omit_attrs: Attributes to exclude from the output. :return: Normalized XHTML content.
""" out = StringIO() normalizer = HTMLNormalizer(omit_tags=omit_tags, omit_attrs=omit_attrs) with redirect_stdout(out): normalizer.feed(document) return out.getvalue() ########################################################### # DATA EXTRACTION OPERATIONS ########################################################### # sigalias: XPathResult = Union[Sequence[str], Sequence[Element]] _USE_LXML = find_loader('lxml') is not None if _USE_LXML: _logger.info('using lxml') from lxml import etree as ElementTree from lxml.etree import Element XPath = ElementTree.XPath xpath = ElementTree._Element.xpath else: from xml.etree import ElementTree from xml.etree.ElementTree import Element class XPath: """An XPath expression evaluator. This class is mainly needed to compensate for the lack of ``text()`` and ``@attr`` axis queries in ElementTree XPath support. """ def __init__(self, path): """Initialize this evaluator. :sig: (str) -> None :param path: XPath expression to evaluate. """ def descendant(element): # strip trailing '//text()' return [t for e in element.findall(path[:-8]) for t in e.itertext() if t] def child(element): # strip trailing '/text()' return [t for e in element.findall(path[:-7]) for t in ([e.text] + [c.tail if c.tail else '' for c in e]) if t] def attribute(element, subpath, attr): result = [e.attrib.get(attr) for e in element.findall(subpath)] return [r for r in result if r is not None] if path[0] == '/': # ElementTree doesn't support absolute paths # TODO: handle this properly, find root of tree path = '.' 
+ path if path.endswith('//text()'): _apply = descendant elif path.endswith('/text()'): _apply = child else: steps = path.split('/') front, last = steps[:-1], steps[-1] # after dropping PY2: *front, last = path.split('/') if last.startswith('@'): _apply = partial(attribute, subpath='/'.join(front), attr=last[1:]) else: _apply = partial(Element.findall, path=path) self._apply = _apply # sig: Callable[[Element], XPathResult] def __call__(self, element): """Apply this evaluator to an element. :sig: (Element) -> XPathResult :param element: Element to apply this expression to. :return: Elements or strings resulting from the query. """ return self._apply(element) xpath = lambda e, p: XPath(p)(e) _EMPTY = {} # empty result singleton # sigalias: Reducer = Callable[[Sequence[str]], str] # sigalias: PathTransformer = Callable[[str], Any] # sigalias: MapTransformer = Callable[[Mapping[str, Any]], Any] # sigalias: Transformer = Union[PathTransformer, MapTransformer] # sigalias: ExtractedItem = Union[str, Mapping[str, Any]] class Extractor: """Abstract base extractor for getting data out of an XML element.""" def __init__(self, transform=None, foreach=None): """Initialize this extractor. :sig: (Optional[Transformer], Optional[str]) -> None :param transform: Function to transform the extracted value. :param foreach: Path to apply for generating a collection of values. """ self.transform = transform # sig: Optional[Transformer] """Function to transform the extracted value.""" self.foreach = XPath(foreach) if foreach is not None else None # sig: Optional[XPath] """Path to apply for generating a collection of values.""" def apply(self, element): """Get the raw data from an element using this extractor. :sig: (Element) -> ExtractedItem :param element: Element to apply this extractor to. :return: Extracted raw data. 
""" raise NotImplementedError('Concrete extractors must implement this method') def extract(self, element, transform=True): """Get the processed data from an element using this extractor. :sig: (Element, Optional[bool]) -> Any :param element: Element to extract the data from. :param transform: Whether the transformation will be applied or not. :return: Extracted and optionally transformed data. """ value = self.apply(element) if (value is None) or (value is _EMPTY) or (not transform): return value return value if self.transform is None else self.transform(value) @staticmethod def from_map(item): """Generate an extractor from a description map. :sig: (Mapping[str, Any]) -> Extractor :param item: Extractor description. :return: Extractor object. :raise ValueError: When reducer or transformer names are unknown. """ transformer = item.get('transform') if transformer is None: transform = None else: transform = transformers.get(transformer) if transform is None: raise ValueError('Unknown transformer') foreach = item.get('foreach') path = item.get('path') if path is not None: reducer = item.get('reduce') if reducer is None: reduce = None else: reduce = reducers.get(reducer) if reduce is None: raise ValueError('Unknown reducer') extractor = Path(path, reduce, transform=transform, foreach=foreach) else: items = item.get('items') # TODO: check for None rules = [Rule.from_map(i) for i in items] extractor = Rules(rules, section=item.get('section'), transform=transform, foreach=foreach) return extractor class Path(Extractor): """An extractor for getting text out of an XML element.""" def __init__(self, path, reduce=None, transform=None, foreach=None): """Initialize this extractor. :sig: ( str, Optional[Reducer], Optional[PathTransformer], Optional[str] ) -> None :param path: Path to apply to get the data. :param reduce: Function to reduce selected texts into a single string. :param transform: Function to transform extracted value. 
:param foreach: Path to apply for generating a collection of data. """ if PY2: Extractor.__init__(self, transform=transform, foreach=foreach) else: super().__init__(transform=transform, foreach=foreach) self.path = XPath(path) # sig: XPath """XPath evaluator to apply to get the data.""" if reduce is None: reduce = reducers.concat self.reduce = reduce # sig: Reducer """Function to reduce selected texts into a single string.""" def apply(self, element): """Apply this extractor to an element. :sig: (Element) -> str :param element: Element to apply this extractor to. :return: Extracted text. """ # _logger.debug('applying path "%s" on "%s" element', self.path, element.tag) selected = self.path(element) if len(selected) == 0: # _logger.debug('no match') value = None else: # _logger.debug('selected elements: "%s"', selected) value = self.reduce(selected) # _logger.debug('reduced using "%s": "%s"', self.reduce, value) return value class Rules(Extractor): """An extractor for getting data items out of an XML element.""" def __init__(self, rules, section=None, transform=None, foreach=None): """Initialize this extractor. :sig: ( Sequence[Rule], str, Optional[MapTransformer], Optional[str] ) -> None :param rules: Rules for generating the data items. :param section: Path for setting the root of this section. :param transform: Function to transform extracted value. :param foreach: Path for generating multiple items. """ if PY2: Extractor.__init__(self, transform=transform, foreach=foreach) else: super().__init__(transform=transform, foreach=foreach) self.rules = rules # sig: Sequence[Rule] """Rules for generating the data items.""" self.section = XPath(section) if section is not None else None # sig: Optional[XPath] """XPath expression for selecting a subroot for this section.""" def apply(self, element): """Apply this extractor to an element. :sig: (Element) -> Mapping[str, Any] :param element: Element to apply the extractor to. :return: Extracted mapping. 
""" if self.section is None: subroot = element else: subroots = self.section(element) if len(subroots) == 0: _logger.debug('No section root found') return _EMPTY if len(subroots) > 1: raise ValueError('Section path should select exactly one element') subroot = subroots[0] _logger.debug('Moving root to %s element', subroot.tag) data = {} for rule in self.rules: extracted = rule.extract(subroot) data.update(extracted) return data if len(data) > 0 else _EMPTY class Rule: """A rule describing how to get a data item out of an XML element.""" def __init__(self, key, extractor, foreach=None): """Initialize this rule. :sig: (Union[str, Extractor], Extractor, Optional[str]) -> None :param key: Name to distinguish this data item. :param extractor: Extractor that will generate this data item. :param foreach: Path for generating multiple items. """ self.key = key # sig: Union[str, Extractor] """Name to distinguish this data item.""" self.extractor = extractor # sig: Extractor """Extractor that will generate this data item.""" self.foreach = XPath(foreach) if foreach is not None else None # sig: Optional[XPath] """XPath evaluator for generating multiple items.""" @staticmethod def from_map(item): """Generate a rule from a description map. :sig: (Mapping[str, Any]) -> Rule :param item: Item description. :return: Rule object. """ item_key = item['key'] key = item_key if isinstance(item_key, str) else Extractor.from_map(item_key) value = Extractor.from_map(item['value']) return Rule(key=key, extractor=value, foreach=item.get('foreach')) def extract(self, element): """Extract data out of an element using this rule. :sig: (Element) -> Mapping[str, Any] :param element: Element to extract the data from. :return: Extracted data. 
""" data = {} subroots = [element] if self.foreach is None else self.foreach(element) for subroot in subroots: # _logger.debug('setting section element to: "%s"', section.tag) key = self.key if isinstance(self.key, str) else self.key.extract(subroot) if key is None: # _logger.debug('no value generated for key name') continue # _logger.debug('extracting key: "%s"', key) if self.extractor.foreach is None: value = self.extractor.extract(subroot) if (value is None) or (value is _EMPTY): # _logger.debug('no value generated for key') continue data[key] = value # _logger.debug('extracted value for "%s": "%s"', key, data[key]) else: # don't try to transform list items by default, it might waste a lot of time raw_values = [self.extractor.extract(r, transform=False) for r in self.extractor.foreach(subroot)] values = [v for v in raw_values if (v is not None) and (v is not _EMPTY)] if len(values) == 0: # _logger.debug('no items found in list') continue data[key] = values if self.extractor.transform is None else \ list(map(self.extractor.transform, values)) # _logger.debug('extracted value for "%s": "%s"', key, data[key]) return data def remove_elements(root, path): """Remove selected elements from the tree. :sig: (Element, str) -> None :param root: Root element of the tree. :param path: XPath to select the elements to remove. """ if _USE_LXML: get_parent = ElementTree._Element.getparent else: # ElementTree doesn't support parent queries, so we'll build a map for it get_parent = root.attrib.get('_get_parent') if get_parent is None: get_parent = {e: p for p in root.iter() for e in p}.get root.attrib['_get_parent'] = get_parent elements = XPath(path)(root) _logger.debug('removing %s elements using path: "%s"', len(elements), path) if len(elements) > 0: for element in elements: _logger.debug('removing element: "%s"', element.tag) # XXX: could this be hazardous? parent removed in earlier iteration? 
get_parent(element).remove(element) def set_element_attr(root, path, name, value): """Set an attribute for selected elements. :sig: ( Element, str, Union[str, Mapping[str, Any]], Union[str, Mapping[str, Any]] ) -> None :param root: Root element of the tree. :param path: XPath to select the elements to set attributes for. :param name: Description for name generation. :param value: Description for value generation. """ elements = XPath(path)(root) _logger.debug('updating %s elements using path: "%s"', len(elements), path) for element in elements: attr_name = name if isinstance(name, str) else \ Extractor.from_map(name).extract(element) if attr_name is None: _logger.debug('no attribute name generated for "%s" element', element.tag) continue attr_value = value if isinstance(value, str) else \ Extractor.from_map(value).extract(element) if attr_value is None: _logger.debug('no attribute value generated for "%s" element', element.tag) continue _logger.debug('setting "%s" attribute to "%s" on "%s" element', attr_name, attr_value, element.tag) element.attrib[attr_name] = attr_value def set_element_text(root, path, text): """Set the text for selected elements. :sig: (Element, str, Union[str, Mapping[str, Any]]) -> None :param root: Root element of the tree. :param path: XPath to select the elements to set the text for. :param text: Description for text generation. """ elements = XPath(path)(root) _logger.debug('updating %s elements using path: "%s"', len(elements), path) for element in elements: element_text = text if isinstance(text, str) else \ Extractor.from_map(text).extract(element) # note that the text can be None in which case the existing text will be cleared _logger.debug('setting text to "%s" on "%s" element', element_text, element.tag) element.text = element_text def build_tree(document, force_html=False): """Build a tree from an XML document. :sig: (str, Optional[bool]) -> Element :param document: XML document to build the tree from.
:param force_html: Force to parse from HTML without converting. :return: Root element of the XML tree. """ content = document.encode('utf-8') if PY2 else document if _USE_LXML and force_html: _logger.info('using lxml html builder') import lxml.html return lxml.html.fromstring(content) return ElementTree.fromstring(content) class Registry: """A simple, attribute-based namespace.""" def __init__(self, entries): """Initialize this registry. :sig: (Mapping[str, Any]) -> None :param entries: Entries to add to this registry. """ self.__dict__.update(entries) def get(self, item): """Get the value of an entry from this registry. :sig: (str) -> Any :param item: Entry to get the value for. :return: Value of entry. """ return self.__dict__.get(item) def register(self, key, value): """Register a new entry in this registry. :sig: (str, Any) -> None :param key: Key to search the entry in this registry. :param value: Value to store for the entry. """ self.__dict__[key] = value _PREPROCESSORS = { 'remove': remove_elements, 'set_attr': set_element_attr, 'set_text': set_element_text } preprocessors = Registry(_PREPROCESSORS) # sig: Registry """Predefined preprocessors.""" _REDUCERS = { 'first': itemgetter(0), 'concat': partial(str.join, ''), 'clean': lambda xs: re.sub(r'\s+', ' ', ''.join(xs).replace('\xa0', ' ')).strip(), 'normalize': lambda xs: re.sub(r'[^a-z0-9_]', '', ''.join(xs).lower().replace(' ', '_')) } reducers = Registry(_REDUCERS) # sig: Registry """Predefined reducers.""" _TRANSFORMERS = { 'int': int, 'float': float, 'bool': bool, 'len': len, 'lower': str.lower, 'upper': str.upper, 'capitalize': str.capitalize, 'lstrip': str.lstrip, 'rstrip': str.rstrip, 'strip': str.strip } transformers = Registry(_TRANSFORMERS) # sig: Registry """Predefined transformers.""" def preprocess(root, pre): """Process a tree before starting extraction. :sig: (Element, Sequence[Mapping[str, Any]]) -> None :param root: Root of tree to process. :param pre: Descriptions for processing operations.
""" for step in pre: op = step['op'] if op == 'remove': remove_elements(root, step['path']) elif op == 'set_attr': set_element_attr(root, step['path'], name=step['name'], value=step['value']) elif op == 'set_text': set_element_text(root, step['path'], text=step['text']) else: raise ValueError('Unknown preprocessing operation') def extract(element, items, section=None): """Extract data from an XML element. :sig: ( Element, Sequence[Mapping[str, Any]], Optional[str] ) -> Mapping[str, Any] :param element: Element to extract the data from. :param items: Descriptions for extracting items. :param section: Path to select the root element for these items. :return: Extracted data. """ rules = Rules([Rule.from_map(item) for item in items], section=section) return rules.extract(element) def scrape(document, spec): """Extract data from a document after optionally preprocessing it. :sig: (str, Mapping[str, Any]) -> Mapping[str, Any] :param document: Document to scrape. :param spec: Extraction specification. :return: Extracted data. """ root = build_tree(document) pre = spec.get('pre') if pre is not None: preprocess(root, pre) data = extract(root, spec.get('items'), section=spec.get('section')) return data ########################################################### # COMMAND-LINE INTERFACE ########################################################### def h2x(source): """Convert an HTML file into XHTML and print. :sig: (str) -> None :param source: Path of HTML file to convert. """ if source == '-': _logger.debug('reading from stdin') content = sys.stdin.read() else: _logger.debug('reading from file: "%s"', os.path.abspath(source)) with open(source, 'rb') as f: content = decode_html(f.read()) print(html_to_xhtml(content), end='') def scrape_document(address, spec, content_format='xml'): """Scrape data from a file path or a URL and print. :sig: (str, str, Optional[str]) -> None :param address: File path or URL of document to scrape. :param spec: Path of spec file. 
:param content_format: Whether the content is XML or HTML. """ _logger.debug('loading spec from file: "%s"', os.path.abspath(spec)) if os.path.splitext(spec)[-1] == '.yaml': if find_loader('yaml') is None: raise RuntimeError('YAML support not available') import yaml # safe_load avoids constructing arbitrary objects from the spec file spec_loader = yaml.safe_load else: spec_loader = json.loads with open(spec) as f: spec_map = spec_loader(f.read()) if address.startswith(('http://', 'https://')): _logger.debug('loading url: "%s"', address) with urlopen(address) as f: content = f.read() else: _logger.debug('loading file: "%s"', os.path.abspath(address)) with open(address, 'rb') as f: content = f.read() document = decode_html(content) if content_format == 'html': _logger.debug('converting html document to xhtml') document = html_to_xhtml(document) # _logger.debug('=== CONTENT START ===\n%s\n=== CONTENT END===', document) data = scrape(document, spec_map) print(json.dumps(data, indent=2, sort_keys=True)) def make_parser(prog): """Build a parser for command line arguments. :sig: (str) -> ArgumentParser :param prog: Name of program. :return: Parser for arguments.
""" parser = ArgumentParser(prog=prog) parser.add_argument('--version', action='version', version='%(prog)s 1.0b7') parser.add_argument('--debug', action='store_true', help='enable debug messages') commands = parser.add_subparsers(metavar='command', dest='command') commands.required = True h2x_parser = commands.add_parser('h2x', help='convert HTML to XHTML') h2x_parser.add_argument('file', help='file to convert') h2x_parser.set_defaults(func=lambda a: h2x(a.file)) scrape_parser = commands.add_parser('scrape', help='scrape a document') scrape_parser.add_argument('document', help='file path or URL of document to scrape') scrape_parser.add_argument('-s', '--spec', required=True, help='spec file') scrape_parser.add_argument('--html', action='store_true', help='document is in HTML format') scrape_parser.set_defaults(func=lambda a: scrape_document( a.document, a.spec, content_format='html' if a.html else 'xml' )) return parser def main(argv=None): """Entry point of the command line utility. :sig: (Optional[List[str]]) -> None :param argv: Command line arguments. """ argv = argv if argv is not None else sys.argv parser = make_parser(prog='piculet') arguments = parser.parse_args(argv[1:]) # set debug mode if arguments.debug: logging.basicConfig(level=logging.DEBUG) _logger.debug('running in debug mode') # run the handler for the selected command try: arguments.func(arguments) except Exception as e: print(e, file=sys.stderr) sys.exit(1) if __name__ == '__main__': main() imdbpy-6.8/imdb/parser/http/searchCompanyParser.py000066400000000000000000000045461351454127000223770ustar00rootroot00000000000000# Copyright 2008-2018 Davide Alberani # 2008-2018 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. 
# # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the classes (and the instances) that are used to parse the results of a search for a given company. For example, when searching for the name "Columbia Pictures", the parsed page would be: http://www.imdb.com/find?q=Columbia+Pictures&s=co """ from __future__ import absolute_import, division, print_function, unicode_literals from imdb.utils import analyze_company_name from .piculet import Path, Rule, Rules, reducers from .searchMovieParser import DOMHTMLSearchMovieParser from .utils import analyze_imdbid class DOMHTMLSearchCompanyParser(DOMHTMLSearchMovieParser): """A parser for the company search page.""" rules = [ Rule( key='data', extractor=Rules( foreach='//td[@class="result_text"]', rules=[ Rule( key='link', extractor=Path('./a/@href', reduce=reducers.first) ), Rule( key='name', extractor=Path('./a/text()') ), Rule( key='notes', extractor=Path('./text()') ) ], transform=lambda x: ( analyze_imdbid(x.get('link')), analyze_company_name(x.get('name') + x.get('notes', ''), stripNotes=True) ) ) ) ] _OBJECTS = { 'search_company_parser': ((DOMHTMLSearchCompanyParser,), {'kind': 'company'}) } imdbpy-6.8/imdb/parser/http/searchKeywordParser.py000066400000000000000000000070131351454127000224050ustar00rootroot00000000000000# Copyright 2009-2018 Davide Alberani # 2018 H. 
Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the classes (and the instances) that are used to parse the results of a search for a given keyword. For example, when searching for the keyword "alabama", the parsed page would be: http://www.imdb.com/find?q=alabama&s=kw """ from __future__ import absolute_import, division, print_function, unicode_literals from imdb.utils import analyze_title from .piculet import Path, Rule, Rules, reducers from .searchMovieParser import DOMHTMLSearchMovieParser from .utils import analyze_imdbid class DOMHTMLSearchKeywordParser(DOMHTMLSearchMovieParser): """A parser for the keyword search page.""" rules = [ Rule( key='data', extractor=Path( foreach='//td[@class="result_text"]', path='./a/text()' ) ) ] def custom_analyze_title4kwd(title, yearNote, outline): """Return a dictionary with the needed info.""" title = title.strip() if not title: return {} if yearNote: yearNote = '%s)' % yearNote.split(' ')[0] title = title + ' ' + yearNote retDict = analyze_title(title) if outline: retDict['plot outline'] = outline return retDict class DOMHTMLSearchMovieKeywordParser(DOMHTMLSearchMovieParser): """A parser for the movie search by keyword page.""" rules = [ Rule( key='data', extractor=Rules( foreach='//h3[@class="lister-item-header"]', rules=[ Rule( key='link', 
extractor=Path('./a/@href', reduce=reducers.first) ), Rule( key='info', extractor=Path('./a//text()') ), Rule( key='ynote', extractor=Path('./span[@class="lister-item-year text-muted unbold"]/text()') ), Rule( key='outline', extractor=Path('./span[@class="outline"]//text()') ) ], transform=lambda x: ( analyze_imdbid(x.get('link')), custom_analyze_title4kwd( x.get('info', ''), x.get('ynote', ''), x.get('outline', '') ) ) ) ) ] def preprocess_string(self, html_string): return html_string.replace(' + >', '>') _OBJECTS = { 'search_keyword_parser': ((DOMHTMLSearchKeywordParser,), {'kind': 'keyword'}), 'search_moviekeyword_parser': ((DOMHTMLSearchMovieKeywordParser,), None) } imdbpy-6.8/imdb/parser/http/searchMovieAdvancedParser.py000066400000000000000000000225631351454127000234750ustar00rootroot00000000000000# -*- coding: utf-8 -*- # Copyright 2019 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the classes (and the instances) that are used to parse the results of an advanced search for a given title. 
For example, when searching for the title "the passion", the parsed page would be: http://www.imdb.com/search/title/?title=the+passion """ from __future__ import absolute_import, division, print_function, unicode_literals import re from .piculet import Path, Rule, Rules, preprocessors, reducers from .utils import DOMParserBase, analyze_imdbid, build_movie, build_person _re_secondary_info = re.compile( r'''(\(([IVXLCM]+)\)\s+)?\((\d{4})(–(\s|(\d{4})))?(\s+(.*))?\)''' ) _KIND_MAP = { 'tv short': 'tv short movie', 'video': 'video movie' } def _parse_secondary_info(info): parsed = {} match = _re_secondary_info.match(info) kind = None if match.group(2): parsed['imdbIndex'] = match.group(2) if match.group(3): parsed['year'] = int(match.group(3)) if match.group(4): kind = 'tv series' if match.group(6): parsed['end_year'] = int(match.group(6)) if match.group(8): kind = match.group(8).lower() if kind is None: kind = 'movie' parsed['kind'] = _KIND_MAP.get(kind, kind) return parsed class DOMHTMLSearchMovieAdvancedParser(DOMParserBase): """A parser for the title search page.""" person_rules = [ Rule(key='name', extractor=Path('./text()', reduce=reducers.first)), Rule(key='link', extractor=Path('./@href', reduce=reducers.first)) ] rules = [ Rule( key='data', extractor=Rules( foreach='//div[@class="lister-item-content"]', rules=[ Rule( key='link', extractor=Path('./h3/a/@href', reduce=reducers.first) ), Rule( key='title', extractor=Path('./h3/a/text()', reduce=reducers.first) ), Rule( key='secondary_info', extractor=Path('./h3/span[@class="lister-item-year text-muted unbold"]/text()', reduce=reducers.first) ), Rule( key='state', extractor=Path('.//b/text()', reduce=reducers.first) ), Rule( key='certificates', extractor=Path('.//span[@class="certificate"]/text()', reduce=reducers.first, transform=lambda s: [s]) ), Rule( key='runtimes', extractor=Path('.//span[@class="runtime"]/text()', reduce=reducers.first, transform=lambda s: [[w for w in s.split() if w.isdigit()][0]]) ), Rule( 
key='genres', extractor=Path('.//span[@class="genre"]/text()', reduce=reducers.first, transform=lambda s: [w.strip() for w in s.split(',')]) ), Rule( key='rating', extractor=Path('.//div[@name="ir"]/@data-value', reduce=reducers.first, transform=float) ), Rule( key='votes', extractor=Path('.//span[@name="nv"]/@data-value', reduce=reducers.first, transform=int) ), Rule( key='metascore', extractor=Path('.//span[@class="metascore favorable"]/text()', reduce=reducers.first, transform=int) ), Rule( key='gross', extractor=Path('.//span[@name="GROSS"]/@data-value', reduce=reducers.normalize, transform=int) ), Rule( key='plot', extractor=Path('./p[@class="text-muted"]//text()', reduce=reducers.clean) ), Rule( key='directors', extractor=Rules( foreach='.//div[@class="DIRECTORS"]/a', rules=person_rules, transform=lambda x: build_person(x['name'], personID=analyze_imdbid(x['link'])) ) ), Rule( key='cast', extractor=Rules( foreach='.//div[@class="STARS"]/a', rules=person_rules, transform=lambda x: build_person(x['name'], personID=analyze_imdbid(x['link'])) ) ), Rule( key='cover url', extractor=Path('..//a/img/@loadlate') ), Rule( key='episode', extractor=Rules( rules=[ Rule(key='link', extractor=Path('./h3/small/a/@href', reduce=reducers.first)), Rule(key='title', extractor=Path('./h3/small/a/text()', reduce=reducers.first)), Rule(key='secondary_info', extractor=Path('./h3/small/span[@class="lister-item-year text-muted unbold"]/text()', reduce=reducers.first)), ] ) ) ] ) ) ] def _init(self): self.url = '' def _reset(self): self.url = '' preprocessors = [ (re.compile(r'Directors?:(.*?)()', re.DOTALL), r'
    \1
    \2'), (re.compile(r'Stars?:(.*?)()', re.DOTALL), r'
    \1
    \2'), (re.compile(r'(Gross:.*?'), (re.compile(r'(Episode:)(
    )(.*?)(
    )', re.DOTALL), r'\1\3\2\4') ] def preprocess_dom(self, dom): preprocessors.remove(dom, '//br[@class="ADD_A_PLOT"]/../..') return dom def postprocess_data(self, data): if 'data' not in data: data['data'] = [] results = getattr(self, 'results', None) if results is not None: data['data'][:] = data['data'][:results] result = [] for movie in data['data']: episode = movie.pop('episode', None) if episode is not None: series = build_movie(movie.get('title'), movieID=analyze_imdbid(movie['link'])) series['kind'] = 'tv series' series_secondary = movie.get('secondary_info') if series_secondary: series.update(_parse_secondary_info(series_secondary)) movie['episode of'] = series movie['link'] = episode['link'] movie['title'] = episode['title'] ep_secondary = episode.get('secondary_info') if ep_secondary is not None: movie['secondary_info'] = ep_secondary secondary_info = movie.pop('secondary_info', None) if secondary_info is not None: secondary = _parse_secondary_info(secondary_info) movie.update(secondary) if episode is not None: movie['kind'] = 'episode' result.append((analyze_imdbid(movie.pop('link')), movie)) data['data'] = result return data def add_refs(self, data): return data _OBJECTS = { 'search_movie_advanced_parser': ((DOMHTMLSearchMovieAdvancedParser,), None) } imdbpy-6.8/imdb/parser/http/searchMovieParser.py000066400000000000000000000071721351454127000220460ustar00rootroot00000000000000# Copyright 2004-2018 Davide Alberani # 2008-2018 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the classes (and the instances) that are used to parse the results of a search for a given title. For example, when searching for the title "the passion", the parsed page would be: http://www.imdb.com/find?q=the+passion&s=tt """ from __future__ import absolute_import, division, print_function, unicode_literals from imdb.utils import analyze_title from .piculet import Path, Rule, Rules, reducers from .utils import DOMParserBase, analyze_imdbid class DOMHTMLSearchMovieParser(DOMParserBase): """A parser for the title search page.""" rules = [ Rule( key='data', extractor=Rules( foreach='//td[@class="result_text"]', rules=[ Rule( key='link', extractor=Path('./a/@href', reduce=reducers.first) ), Rule( key='info', extractor=Path('.//text()') ), Rule( key='akas', extractor=Path(foreach='./i', path='./text()') ), Rule( key='cover url', extractor=Path('../td[@class="primary_photo"]/a/img/@src') ) ], transform=lambda x: ( analyze_imdbid(x.get('link')), analyze_title(x.get('info', '')), x.get('akas'), x.get('cover url') ) ) ) ] def _init(self): self.url = '' self.img_type = 'cover url' def _reset(self): self.url = '' def postprocess_data(self, data): if 'data' not in data: data['data'] = [] results = getattr(self, 'results', None) if results is not None: data['data'][:] = data['data'][:results] # Horrible hack to support AKAs. 
data['data'] = [x for x in data['data'] if x[0] and x[1]] if data and data['data'] and len(data['data'][0]) == 4 and isinstance(data['data'][0], tuple): for idx, datum in enumerate(data['data']): if not isinstance(datum, tuple): continue if not datum[0] and datum[1]: continue if datum[2] is not None: akas = [aka[1:-1] for aka in datum[2]] # remove the quotes datum[1]['akas'] = akas if datum[3] is not None: datum[1][self.img_type] = datum[3] data['data'][idx] = (datum[0], datum[1]) return data def add_refs(self, data): return data _OBJECTS = { 'search_movie_parser': ((DOMHTMLSearchMovieParser,), None) } imdbpy-6.8/imdb/parser/http/searchPersonParser.py000066400000000000000000000055021351454127000222300ustar00rootroot00000000000000# Copyright 2004-2019 Davide Alberani # 2008-2018 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the classes (and the instances) that are used to parse the results of a search for a given person. 
For example, when searching for the name "Mel Gibson", the parsed page would be: http://www.imdb.com/find?q=Mel+Gibson&s=nm """ from __future__ import absolute_import, division, print_function, unicode_literals from imdb.utils import analyze_name from .piculet import Path, Rule, Rules, reducers from .searchMovieParser import DOMHTMLSearchMovieParser from .utils import analyze_imdbid class DOMHTMLSearchPersonParser(DOMHTMLSearchMovieParser): """A parser for the name search page.""" rules = [ Rule( key='data', extractor=Rules( foreach='//td[@class="result_text"]', rules=[ Rule( key='link', extractor=Path('./a/@href', reduce=reducers.first) ), Rule( key='name', extractor=Path('./a/text()') ), Rule( key='index', extractor=Path('./text()') ), Rule( key='akas', extractor=Path(foreach='./i', path='./text()') ), Rule( key='headshot', extractor=Path('../td[@class="primary_photo"]/a/img/@src') ) ], transform=lambda x: ( analyze_imdbid(x.get('link')), analyze_name(x.get('name', '') + x.get('index', ''), canonical=1), x.get('akas'), x.get('headshot') ) ) ) ] def _init(self): super(DOMHTMLSearchPersonParser, self)._init() self.img_type = 'headshot' _OBJECTS = { 'search_person_parser': ((DOMHTMLSearchPersonParser,), {'kind': 'person'}) } imdbpy-6.8/imdb/parser/http/topBottomParser.py000066400000000000000000000074501351454127000215670ustar00rootroot00000000000000# Copyright 2009-2017 Davide Alberani # 2018 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the classes (and the instances) that are used to parse the pages for the lists of top 250 and bottom 100 movies. Pages: http://www.imdb.com/chart/top http://www.imdb.com/chart/bottom """ from __future__ import absolute_import, division, print_function, unicode_literals from imdb.utils import analyze_title from .piculet import Path, Rule, Rules, reducers from .utils import DOMParserBase, analyze_imdbid class DOMHTMLTop250Parser(DOMParserBase): """A parser for the "top 250 movies" page.""" ranktext = 'top 250 rank' rules = [ Rule( key='chart', extractor=Rules( foreach='//tbody[@class="lister-list"]/tr', rules=[ Rule( key='rank', extractor=Path('.//span[@name="rk"]/@data-value', reduce=reducers.first, transform=int) ), Rule( key='rating', extractor=Path('.//span[@name="ir"]/@data-value', reduce=reducers.first, transform=lambda x: round(float(x), 1)) ), Rule( key='movieID', extractor=Path('./td[@class="titleColumn"]/a/@href', reduce=reducers.first) ), Rule( key='title', extractor=Path('./td[@class="titleColumn"]/a/text()') ), Rule( key='year', extractor=Path('./td[@class="titleColumn"]/span/text()') ), Rule( key='votes', extractor=Path('.//span[@name="nv"]/@data-value', reduce=reducers.first, transform=int) ) ] ) ) ] def postprocess_data(self, data): if (not data) or ('chart' not in data): return [] movies = [] for entry in data['chart']: if ('movieID' not in entry) or ('rank' not in entry) or ('title' not in entry): continue movie_id = analyze_imdbid(entry['movieID']) if movie_id is None: continue del entry['movieID'] entry[self.ranktext] = entry['rank'] del entry['rank'] title = analyze_title(entry['title'] + ' ' + entry.get('year', '')) entry.update(title) movies.append((movie_id, entry)) return movies class 
DOMHTMLBottom100Parser(DOMHTMLTop250Parser): """A parser for the "bottom 100 movies" page.""" ranktext = 'bottom 100 rank' _OBJECTS = { 'top250_parser': ((DOMHTMLTop250Parser,), None), 'bottom100_parser': ((DOMHTMLBottom100Parser,), None) } imdbpy-6.8/imdb/parser/http/utils.py000066400000000000000000000543621351454127000175670ustar00rootroot00000000000000# Copyright 2004-2018 Davide Alberani # 2008-2018 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides miscellaneous utilities used by the components in the :mod:`imdb.parser.http` package. """ from __future__ import absolute_import, division, print_function, unicode_literals import logging import re from imdb import PY2 from imdb.Character import Character from imdb.Movie import Movie from imdb.Person import Person from imdb.utils import _Container, flatten from .piculet import _USE_LXML, ElementTree, Rules, build_tree, html_to_xhtml from .piculet import xpath as piculet_xpath from .piculet import Rule, Path if PY2: from collections import Callable else: from collections.abc import Callable # Year, imdbIndex and kind. 
re_yearKind_index = re.compile( r'(\([0-9\?]{4}(?:/[IVXLCDM]+)?\)(?: \(mini\)| \(TV\)| \(V\)| \(VG\))?)' ) # Match imdb ids in href tags re_imdbid = re.compile(r'(title/tt|name/nm|company/co|user/ur)([0-9]+)') def analyze_imdbid(href): """Return an imdbID from a URL.""" if not href: return None match = re_imdbid.search(href) if not match: return None return str(match.group(2)) _modify_keys = list(Movie.keys_tomodify_list) + list(Person.keys_tomodify_list) def _putRefs(d, re_titles, re_names, lastKey=None): """Iterate over the strings inside list items or dictionary values, replacing movie titles and person names with the (qv) references.""" if isinstance(d, list): for i in range(len(d)): if isinstance(d[i], str): if lastKey in _modify_keys: if re_names: d[i] = re_names.sub(r"'\1' (qv)", d[i]) if re_titles: d[i] = re_titles.sub(r'_\1_ (qv)', d[i]) elif isinstance(d[i], (list, dict)): _putRefs(d[i], re_titles, re_names, lastKey=lastKey) elif isinstance(d, dict): for k, v in list(d.items()): lastKey = k if isinstance(v, str): if lastKey in _modify_keys: if re_names: d[k] = re_names.sub(r"'\1' (qv)", v) if re_titles: d[k] = re_titles.sub(r'_\1_ (qv)', v) elif isinstance(v, (list, dict)): _putRefs(d[k], re_titles, re_names, lastKey=lastKey) _b_p_logger = logging.getLogger('imdbpy.parser.http.build_person') def build_person(txt, personID=None, billingPos=None, roleID=None, accessSystem='http', modFunct=None, headshot=None): """Return a Person instance from the typical ... strings found on the IMDb web site.""" # if personID is None # _b_p_logger.debug('empty name or personID for "%s"', txt) notes = '' role = '' # Search the (optional) separator between name and role/notes. if txt.find('....') != -1: sep = '....' elif txt.find('...') != -1: sep = '...' else: sep = '...' # Replace the first parenthesis, assuming there are only notes, after. # Rationale: no imdbIndex is (ever?) shown on the web site.
txt = txt.replace('(', '...(', 1) txt_split = txt.split(sep, 1) if isinstance(roleID, list): roleID = [r for r in roleID if r] if not roleID: roleID = [''] name = txt_split[0].strip() if len(txt_split) == 2: role_comment = re_spaces.sub(' ', txt_split[1]).strip() re_episodes = re.compile(r'(\d+ episodes.*)', re.I | re.M | re.S) ep_match = re_episodes.search(role_comment) if ep_match and (not ep_match.start() or role_comment[ep_match.start() - 1] != '('): role_comment = re_episodes.sub(r'(\1)', role_comment) # Strip common endings. if role_comment[-4:] == ' and': role_comment = role_comment[:-4].rstrip() elif role_comment[-2:] == ' &': role_comment = role_comment[:-2].rstrip() elif role_comment[-6:] == '& ....': role_comment = role_comment[:-6].rstrip() # Get the notes. if roleID is not None: if not isinstance(roleID, list): cmt_idx = role_comment.find('(') if cmt_idx != -1: role = role_comment[:cmt_idx].rstrip() notes = role_comment[cmt_idx:] else: # Just a role, without notes. role = role_comment else: role = role_comment else: # We're managing something that doesn't have a 'role', so # everything is notes. notes = role_comment if role == '....': role = '' roleNotes = [] # Manages multiple roleIDs. if isinstance(roleID, list): rolesplit = role.split('/') role = [] for r in rolesplit: nidx = r.find('(') if nidx != -1: role.append(r[:nidx].rstrip()) roleNotes.append(r[nidx:]) else: role.append(r) roleNotes.append(None) lr = len(role) lrid = len(roleID) if lr > lrid: roleID += [None] * (lr - lrid) elif lr < lrid: roleID = roleID[:lr] for i, rid in enumerate(roleID): if rid is not None: roleID[i] = str(rid) if lr == 1: role = role[0] roleID = roleID[0] notes = roleNotes[0] or '' elif roleID is not None: roleID = str(roleID) if personID is not None: personID = str(personID) if (not name) or (personID is None): # Set to 'debug', since build_person is expected to receive some crap.
_b_p_logger.debug('empty name or personID for "%s"', txt) if role: if isinstance(role, list): role = [r.strip() for r in role] else: role = role.strip() if notes: if isinstance(notes, list): notes = [n.strip() for n in notes] else: notes = notes.strip() # XXX: return None if something strange is detected? data = {} if headshot: data['headshot'] = headshot person = Person(name=name, personID=personID, currentRole=role, roleID=roleID, notes=notes, billingPos=billingPos, modFunct=modFunct, accessSystem=accessSystem, data=data) if roleNotes and len(roleNotes) == len(roleID): for idx, role in enumerate(person.currentRole): if roleNotes[idx]: role.notes = roleNotes[idx] elif person.currentRole and isinstance(person.currentRole, Character) and \ not person.currentRole.notes and notes: person.currentRole.notes = notes return person _re_chrIDs = re.compile('[0-9]{7}') _b_m_logger = logging.getLogger('imdbpy.parser.http.build_movie') # To shrink spaces. re_spaces = re.compile(r'\s+') def build_movie(txt, movieID=None, roleID=None, status=None, accessSystem='http', modFunct=None, _parsingCharacter=False, _parsingCompany=False, year=None, chrRoles=None, rolesNoChar=None, additionalNotes=None): """Given a string as normally seen on the "categorized" page of a person on the IMDb's web site, returns a Movie instance.""" # FIXME: Oook, lets face it: build_movie and build_person are now # two horrible sets of patches to support the new IMDb design. They # must be rewritten from scratch. if _parsingCompany: _defSep = ' ... ' else: _defSep = ' .... ' title = re_spaces.sub(' ', txt).strip() # Split the role/notes from the movie title. 
tsplit = title.split(_defSep, 1) role = '' notes = '' roleNotes = [] if len(tsplit) == 2: title = tsplit[0].rstrip() role = tsplit[1].lstrip() if title[-9:] == 'TV Series': title = title[:-9].rstrip() # elif title[-7:] == '(short)': # title = title[:-7].rstrip() # elif title[-11:] == '(TV series)': # title = title[:-11].rstrip() # elif title[-10:] == '(TV movie)': # title = title[:-10].rstrip() elif title[-14:] == 'TV mini-series': title = title[:-14] + ' (mini)' if title and title.endswith(_defSep.rstrip()): title = title[:-len(_defSep) + 1] # Try to understand where the movie title ends. while True: if year: break if title[-1:] != ')': # Ignore the silly "TV Series" notice. if title[-9:] == 'TV Series': title = title[:-9].rstrip() continue else: # Just a title: stop here. break # Try to match paired parentheses; yes: sometimes there are # parentheses inside comments... nidx = title.rfind('(') while nidx != -1 and title[nidx:].count('(') != title[nidx:].count(')'): nidx = title[:nidx].rfind('(') # Unbalanced parentheses: stop here. if nidx == -1: break # The last item in parentheses seems to be a year: stop here. first4 = title[nidx + 1:nidx + 5] if (first4.isdigit() or first4 == '????') and title[nidx + 5:nidx + 6] in (')', '/'): break # The last item in parentheses is a known kind: stop here. if title[nidx + 1:-1] in ('TV', 'V', 'mini', 'VG', 'TV movie', 'TV series', 'short'): break # Else, in parentheses there are some notes. # XXX: should the notes in the role half be kept separated # from the notes in the movie title half? 
if notes: notes = '%s %s' % (title[nidx:], notes) else: notes = title[nidx:] title = title[:nidx].rstrip() if year: year = year.strip() if title[-1:] == ')': fpIdx = title.rfind('(') if fpIdx != -1: if notes: notes = '%s %s' % (title[fpIdx:], notes) else: notes = title[fpIdx:] title = title[:fpIdx].rstrip() title = '%s (%s)' % (title, year) if not roleID: roleID = None elif len(roleID) == 1: roleID = roleID[0] if not role and chrRoles and isinstance(roleID, str): roleID = _re_chrIDs.findall(roleID) role = ' / '.join([_f for _f in chrRoles.split('@@') if _f]) # Manages multiple roleIDs. if isinstance(roleID, list): tmprole = role.split('/') role = [] for r in tmprole: nidx = r.find('(') if nidx != -1: role.append(r[:nidx].rstrip()) roleNotes.append(r[nidx:]) else: role.append(r) roleNotes.append(None) lr = len(role) lrid = len(roleID) if lr > lrid: roleID += [None] * (lr - lrid) elif lr < lrid: roleID = roleID[:lr] for i, rid in enumerate(roleID): if rid is not None: roleID[i] = str(rid) if lr == 1: role = role[0] roleID = roleID[0] elif roleID is not None: roleID = str(roleID) if movieID is not None: movieID = str(movieID) if (not title) or (movieID is None): _b_m_logger.error('empty title or movieID for "%s"', txt) if rolesNoChar: rolesNoChar = [_f for _f in [x.strip() for x in rolesNoChar.split('/')] if _f] if not role: role = [] elif not isinstance(role, list): role = [role] role += rolesNoChar notes = notes.strip() if additionalNotes: additionalNotes = re_spaces.sub(' ', additionalNotes).strip() if notes: notes += ' ' notes += additionalNotes if role and isinstance(role, list) and notes.endswith(role[-1].replace('\n', ' ')): role = role[:-1] m = Movie(title=title, movieID=movieID, notes=notes, currentRole=role, roleID=roleID, roleIsPerson=_parsingCharacter, modFunct=modFunct, accessSystem=accessSystem) if additionalNotes: if '(TV Series)' in additionalNotes: m['kind'] = 'tv series' elif '(Video Game)' in additionalNotes: m['kind'] = 'video game' elif '(TV
Movie)' in additionalNotes: m['kind'] = 'tv movie' elif '(TV Short)' in additionalNotes: m['kind'] = 'tv short' if roleNotes and len(roleNotes) == len(roleID): for idx, role in enumerate(m.currentRole): try: if roleNotes[idx]: role.notes = roleNotes[idx] except IndexError: break # Status can't be checked here, and must be detected by the parser. if status: m['status'] = status return m class DOMParserBase(object): """Base parser to handle HTML data from the IMDb's web server.""" _defGetRefs = False _containsObjects = False preprocessors = [] rules = [] _logger = logging.getLogger('imdbpy.parser.http.domparser') def __init__(self): """Initialize the parser.""" self._modFunct = None self._as = 'http' self._cname = self.__class__.__name__ self._init() self.reset() def reset(self): """Reset the parser.""" # Names and titles references. self._namesRefs = {} self._titlesRefs = {} self._reset() def _init(self): """Subclasses can override this method, if needed.""" pass def _reset(self): """Subclasses can override this method, if needed.""" pass def parse(self, html_string, getRefs=None, **kwds): """Return the dictionary generated from the given html string; getRefs can be used to force the gathering of movies/persons references.""" self.reset() if getRefs is not None: self.getRefs = getRefs else: self.getRefs = self._defGetRefs if PY2 and isinstance(html_string, str): html_string = html_string.decode('utf-8') # Temporary fix: self.parse_dom must work even for empty strings. 
html_string = self.preprocess_string(html_string) if html_string: html_string = html_string.replace('&nbsp;', ' ') dom = self.get_dom(html_string) try: dom = self.preprocess_dom(dom) except Exception: self._logger.error('%s: caught exception preprocessing DOM', self._cname, exc_info=True) if self.getRefs: try: self.gather_refs(dom) except Exception: self._logger.warning('%s: unable to gather refs', self._cname, exc_info=True) data = self.parse_dom(dom) else: data = {} try: data = self.postprocess_data(data) except Exception: self._logger.error('%s: caught exception postprocessing data', self._cname, exc_info=True) if self._containsObjects: self.set_objects_params(data) data = self.add_refs(data) return data def get_dom(self, html_string): """Return a dom object, from the given string.""" try: if not _USE_LXML: html_string = html_to_xhtml(html_string, omit_tags={"script"}) dom = build_tree(html_string, force_html=True) if dom is None: dom = build_tree('') self._logger.error('%s: using a fake empty DOM', self._cname) return dom except Exception: self._logger.error('%s: caught exception parsing DOM', self._cname, exc_info=True) return build_tree('') def xpath(self, element, path): """Return elements matching the given XPath.""" try: return piculet_xpath(element, path) except Exception: self._logger.error('%s: caught exception extracting XPath "%s"', self._cname, path, exc_info=True) return [] def tostring(self, element): """Convert the element to a string.""" if isinstance(element, str): return str(element) else: try: return ElementTree.tostring(element, encoding='utf8') except Exception: self._logger.error('%s: unable to convert to string', self._cname, exc_info=True) return '' def clone(self, element): """Clone an element.""" return build_tree(self.tostring(element)) def preprocess_string(self, html_string): """Here we can modify the text, before it's parsed.""" if not html_string: return html_string try: preprocessors = self.preprocessors except AttributeError: return
html_string for src, sub in preprocessors: # re._pattern_type is present only since Python 2.5. if isinstance(getattr(src, 'sub', None), Callable): html_string = src.sub(sub, html_string) elif isinstance(src, str) or isinstance(src, unicode): html_string = html_string.replace(src, sub) elif isinstance(src, Callable): try: html_string = src(html_string) except Exception: _msg = '%s: caught exception preprocessing html' self._logger.error(_msg, self._cname, exc_info=True) continue return html_string def gather_refs(self, dom): """Collect references.""" grParser = GatherRefs() grParser._as = self._as grParser._modFunct = self._modFunct refs = grParser.parse_dom(dom) refs = grParser.postprocess_data(refs) self._namesRefs = refs['names refs'] self._titlesRefs = refs['titles refs'] def preprocess_dom(self, dom): """Last chance to modify the dom, before the rules are applied.""" return dom def parse_dom(self, dom): """Parse the given dom according to the rules specified in self.rules.""" return Rules(self.rules).extract(dom) def postprocess_data(self, data): """Here we can modify the data.""" return data def set_objects_params(self, data): """Set parameters of Movie/Person/... 
instances, since they are not always set in the parser's code.""" for obj in flatten(data, yieldDictKeys=True, scalar=_Container): obj.accessSystem = self._as obj.modFunct = self._modFunct def add_refs(self, data): """Modify data according to the expected output.""" if self.getRefs: titl_re = r'(%s)' % '|'.join( [re.escape(x) for x in list(self._titlesRefs.keys())] ) if titl_re != r'()': re_titles = re.compile(titl_re, re.U) else: re_titles = None nam_re = r'(%s)' % '|'.join( [re.escape(x) for x in list(self._namesRefs.keys())] ) if nam_re != r'()': re_names = re.compile(nam_re, re.U) else: re_names = None _putRefs(data, re_titles, re_names) return {'data': data, 'titlesRefs': self._titlesRefs, 'namesRefs': self._namesRefs } def _parse_ref(text, link, info): """Manage links to references.""" if link.find('/title/tt') != -1: yearK = re_yearKind_index.match(info) if yearK and yearK.start() == 0: text += ' %s' % info[:yearK.end()] return text.replace('\n', ' '), link class GatherRefs(DOMParserBase): """Parser used to gather references to movies, persons.""" _common_rules = [ Rule( key='text', extractor=Path('./text()') ), Rule( key='link', extractor=Path('./@href') ), Rule( key='info', extractor=Path('./following::text()[1]') ) ] _common_transform = lambda x: _parse_ref( x.get('text') or '', x.get('link') or '', (x.get('info') or '').strip() ) rules = [ Rule( key='names refs', extractor=Rules( foreach='//a[starts-with(@href, "/name/nm")]', rules=_common_rules, transform=_common_transform ) ), Rule( key='titles refs', extractor=Rules( foreach='//a[starts-with(@href, "/title/tt")]', rules=_common_rules, transform=_common_transform ) ) ] def postprocess_data(self, data): result = {} for item in ('names refs', 'titles refs'): result[item] = {} for k, v in data.get(item, []): k = k.strip() v = v.strip() if not (k and v): continue imdbID = analyze_imdbid(v) if item == 'names refs': obj = Person(personID=imdbID, name=k, accessSystem=self._as, modFunct=self._modFunct) elif 
item == 'titles refs': obj = Movie(movieID=imdbID, title=k, accessSystem=self._as, modFunct=self._modFunct) result[item][k] = obj return result def add_refs(self, data): return data imdbpy-6.8/imdb/parser/s3/000077500000000000000000000000001351454127000154115ustar00rootroot00000000000000imdbpy-6.8/imdb/parser/s3/__init__.py000066400000000000000000000263471351454127000175360ustar00rootroot00000000000000# -*- coding: utf-8 -*- # Copyright 2017-2019 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This package provides the IMDbS3AccessSystem class used to access IMDb's data through the Amazon S3 dataset. The :func:`imdb.IMDb` function will return an instance of this class when called with the ``accessSystem`` parameter is set to "s3" or "s3dataset". """ from __future__ import absolute_import, division, print_function, unicode_literals import logging import sqlalchemy from operator import itemgetter from imdb import IMDbBase from .utils import DB_TRANSFORM, title_soundex, name_soundexes, scan_titles, scan_names from imdb.Movie import Movie from imdb.Person import Person def split_array(text): """Split a string assuming it's an array. 
    :param text: the text to split
    :type text: str
    :returns: list of split strings
    :rtype: list
    """
    if not isinstance(text, str):
        return text
    # for some reason, titles.akas.tsv.gz contains \x02 as a separator
    sep = ',' if ',' in text else '\x02'
    return text.split(sep)


class IMDbS3AccessSystem(IMDbBase):
    """The class used to access IMDb's data through the s3 dataset."""

    accessSystem = 's3'
    _s3_logger = logging.getLogger('imdbpy.parser.s3')
    _metadata = sqlalchemy.MetaData()

    def __init__(self, uri, adultSearch=True, *arguments, **keywords):
        """Initialize the access system."""
        IMDbBase.__init__(self, *arguments, **keywords)
        self._engine = sqlalchemy.create_engine(uri, encoding='utf-8', echo=False)
        self._metadata.bind = self._engine
        self._metadata.reflect()
        self.T = self._metadata.tables

    def _rename(self, table, data):
        for column, conf in DB_TRANSFORM.get(table, {}).items():
            if 'rename' not in conf:
                continue
            if column not in data:
                continue
            data[conf['rename']] = data[column]
            del data[column]
        return data

    def _clean(self, data, keys_to_remove=None):
        if keys_to_remove is None:
            keys_to_remove = []
        for key in list(data.keys()):
            if key in keys_to_remove or data[key] in (None, '', []):
                del data[key]
        return data

    def _base_title_info(self, movieID, movies_cache=None, persons_cache=None):
        if movies_cache is None:
            movies_cache = {}
        if persons_cache is None:
            persons_cache = {}
        if movieID in movies_cache:
            return movies_cache[movieID]
        tb = self.T['title_basics']
        movie = tb.select(tb.c.tconst == movieID).execute().fetchone() or {}
        data = self._rename('title_basics', dict(movie))
        data['year'] = str(data.get('startYear') or '')
        if 'endYear' in data and data['endYear']:
            data['year'] += '-%s' % data['endYear']
        genres = data.get('genres') or ''
        data['genres'] = split_array(genres.lower())
        if 'runtimes' in data and data['runtimes']:
            data['runtimes'] = [data['runtimes']]
        self._clean(data, ('startYear', 'endYear', 'movieID'))
        movies_cache[movieID] = data
        return data

    def _base_person_info(self,
            personID, movies_cache=None, persons_cache=None):
        if movies_cache is None:
            movies_cache = {}
        if persons_cache is None:
            persons_cache = {}
        if personID in persons_cache:
            return persons_cache[personID]
        nb = self.T['name_basics']
        person = nb.select(nb.c.nconst == personID).execute().fetchone() or {}
        data = self._rename('name_basics', dict(person))
        movies = []
        for movieID in split_array(data.get('known for') or ''):
            if not movieID:
                continue
            movieID = int(movieID)
            movie_data = self._base_title_info(movieID, movies_cache=movies_cache, persons_cache=persons_cache)
            movie = Movie(movieID=movieID, data=movie_data, accessSystem=self.accessSystem)
            movies.append(movie)
        data['known for'] = movies
        self._clean(data, ('ns_soundex', 'sn_soundex', 's_soundex', 'personID'))
        persons_cache[personID] = data
        return data

    def get_movie_main(self, movieID):
        movieID = int(movieID)
        data = self._base_title_info(movieID)
        _movies_cache = {movieID: data}
        _persons_cache = {}

        tc = self.T['title_crew']
        movie = tc.select(tc.c.tconst == movieID).execute().fetchone() or {}
        tc_data = self._rename('title_crew', dict(movie))
        writers = []
        directors = []
        for key, target in (('director', directors), ('writer', writers)):
            for personID in split_array(tc_data.get(key) or ''):
                if not personID:
                    continue
                personID = int(personID)
                person_data = self._base_person_info(personID,
                                                     movies_cache=_movies_cache,
                                                     persons_cache=_persons_cache)
                person = Person(personID=personID, data=person_data, accessSystem=self.accessSystem)
                target.append(person)
        tc_data['director'] = directors
        tc_data['writer'] = writers
        data.update(tc_data)

        te = self.T['title_episode']
        # Select from title_episode itself: selecting from title_crew with a
        # title_episode condition never returns the episode columns.
        movie = te.select(te.c.tconst == movieID).execute().fetchone() or {}
        te_data = self._rename('title_episode', dict(movie))
        if 'parentTconst' in te_data:
            te_data['episodes of'] = self._base_title_info(te_data['parentTconst'])
        self._clean(te_data, ('parentTconst',))
        data.update(te_data)

        tp = self.T['title_principals']
        movie_rows = tp.select(tp.c.tconst ==
movieID).execute().fetchall() or {} roles = {} for movie_row in movie_rows: movie_row = dict(movie_row) tp_data = self._rename('title_principals', dict(movie_row)) category = tp_data.get('category') if not category: continue if category in ('actor', 'actress', 'self'): category = 'cast' roles.setdefault(category, []).append(movie_row) for role in roles: roles[role].sort(key=itemgetter('ordering')) persons = [] for person_info in roles[role]: personID = person_info.get('nconst') if not personID: continue person_data = self._base_person_info(personID, movies_cache=_movies_cache, persons_cache=_persons_cache) person = Person(personID=personID, data=person_data, billingPos=person_info.get('ordering'), currentRole=person_info.get('characters'), notes=person_info.get('job'), accessSystem=self.accessSystem) persons.append(person) data[role] = persons tr = self.T['title_ratings'] movie = tr.select(tr.c.tconst == movieID).execute().fetchone() or {} tr_data = self._rename('title_ratings', dict(movie)) data.update(tr_data) ta = self.T['title_akas'] akas = ta.select(ta.c.titleId == movieID).execute() akas_list = [] for aka in akas: ta_data = self._rename('title_akas', dict(aka)) or {} for key in list(ta_data.keys()): if not ta_data[key]: del ta_data[key] for key in 't_soundex', 'movieID': if key in ta_data: del ta_data[key] for key in 'types', 'attributes': if key not in ta_data: continue ta_data[key] = split_array(ta_data[key]) akas_list.append(ta_data) if akas_list: data['akas'] = akas_list self._clean(data, ('movieID', 't_soundex')) return {'data': data, 'info sets': self.get_movie_infoset()} # we don't really have plot information, yet get_movie_plot = get_movie_main def get_person_main(self, personID): personID = int(personID) data = self._base_person_info(personID) self._clean(data, ('personID',)) return {'data': data, 'info sets': self.get_person_infoset()} get_person_filmography = get_person_main get_person_biography = get_person_main def _search_movie(self, title, 
results, _episodes=False): title = title.strip() if not title: return [] results = [] t_soundex = title_soundex(title) tb = self.T['title_basics'] conditions = [tb.c.t_soundex == t_soundex] if _episodes: conditions.append(tb.c.titleType == 'episode') results = tb.select(sqlalchemy.and_(*conditions)).execute() results = [(x['tconst'], self._clean(self._rename('title_basics', dict(x)), ('t_soundex',))) for x in results] # Also search the AKAs ta = self.T['title_akas'] ta_conditions = [ta.c.t_soundex == t_soundex] ta_results = ta.select(sqlalchemy.and_(*ta_conditions)).execute() ta_results = [(x['titleId'], self._clean(self._rename('title_akas', dict(x)), ('t_soundex',))) for x in ta_results] results += ta_results results = scan_titles(results, title) results = [x[1] for x in results] return results def _search_movie_advanced(self, title=None, adult=None, results=None, sort=None, sort_dir=None): return self._search_movie(title, results) def _search_episode(self, title, results): return self._search_movie(title, results=results, _episodes=True) def _search_person(self, name, results): name = name.strip() if not name: return [] results = [] ns_soundex, sn_soundex, s_soundex = name_soundexes(name) nb = self.T['name_basics'] conditions = [nb.c.ns_soundex == ns_soundex] if sn_soundex: conditions.append(nb.c.sn_soundex == sn_soundex) if s_soundex: conditions.append(nb.c.s_soundex == s_soundex) results = nb.select(sqlalchemy.or_(*conditions)).execute() results = [(x['nconst'], self._clean(self._rename('name_basics', dict(x)), ('ns_soundex', 'sn_soundex', 's_soundex'))) for x in results] results = scan_names(results, name) results = [x[1] for x in results] return results imdbpy-6.8/imdb/parser/s3/utils.py000066400000000000000000000276211351454127000171330ustar00rootroot00000000000000# Copyright 2018 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software 
# Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

"""
This module provides utilities for the s3 dataset.
"""

from __future__ import absolute_import, division, print_function, unicode_literals

import re
import sqlalchemy
from difflib import SequenceMatcher
from imdb.utils import canonicalName, canonicalTitle, _unicodeArticles

SOUNDEX_LENGTH = 5
RO_THRESHOLD = 0.6
STRING_MAXLENDIFFER = 0.7
re_imdbids = re.compile(r'(nm|tt)')
re_characters = re.compile(r'"(.+?)"')


def transf_imdbid(x):
    return int(x[2:])


def transf_multi_imdbid(x):
    if not x:
        return x
    return re_imdbids.sub('', x)


def transf_multi_character(x):
    if not x:
        return x
    # Return the joined character names (the result was previously discarded).
    return ' / '.join(re_characters.findall(x))


def transf_int(x):
    try:
        return int(x)
    except Exception:
        return None


def transf_float(x):
    try:
        return float(x)
    except Exception:
        return None


def transf_bool(x):
    try:
        return x == '1'
    except Exception:
        return None


KIND = {
    'tvEpisode': 'episode',
    'tvMiniSeries': 'tv mini series',
    'tvSeries': 'tv series',
    'tvShort': 'tv short',
    'tvSpecial': 'tv special',
    'videoGame': 'video game'
}


def transf_kind(x):
    return KIND.get(x, x)


# Database mapping.
# 'type' force a conversion to a specific SQL type # 'transform' applies a conversion to the content (changes the data in the database) # 'rename' is applied when reading the column names (the columns names are unchanged, in the database) # 'index' mark the columns that need to be indexed # 'length' is applied to VARCHAR fields DB_TRANSFORM = { 'title_basics': { 'tconst': {'type': sqlalchemy.Integer, 'transform': transf_imdbid, 'rename': 'movieID', 'index': True}, 'titleType': {'type': sqlalchemy.String, 'transform': transf_kind, 'rename': 'kind', 'length': 16, 'index': True}, 'primaryTitle': {'rename': 'title'}, 'originalTitle': {'rename': 'original title'}, 'isAdult': {'type': sqlalchemy.Boolean, 'transform': transf_bool, 'rename': 'adult', 'index': True}, 'startYear': {'type': sqlalchemy.Integer, 'transform': transf_int, 'index': True}, 'endYear': {'type': sqlalchemy.Integer, 'transform': transf_int}, 'runtimeMinutes': {'type': sqlalchemy.Integer, 'transform': transf_int, 'rename': 'runtimes', 'index': True}, 't_soundex': {'type': sqlalchemy.String, 'length': 5, 'index': True} }, 'name_basics': { 'nconst': {'type': sqlalchemy.Integer, 'transform': transf_imdbid, 'rename': 'personID', 'index': True}, 'primaryName': {'rename': 'name'}, 'birthYear': {'type': sqlalchemy.Integer, 'transform': transf_int, 'rename': 'birth date', 'index': True}, 'deathYear': {'type': sqlalchemy.Integer, 'transform': transf_int, 'rename': 'death date', 'index': True}, 'primaryProfession': {'rename': 'primary profession'}, 'knownForTitles': {'transform': transf_multi_imdbid, 'rename': 'known for'}, 'ns_soundex': {'type': sqlalchemy.String, 'length': 5, 'index': True}, 'sn_soundex': {'type': sqlalchemy.String, 'length': 5, 'index': True}, 's_soundex': {'type': sqlalchemy.String, 'length': 5, 'index': True}, }, 'title_akas': { 'titleId': {'type': sqlalchemy.Integer, 'transform': transf_imdbid, 'rename': 'movieID', 'index': True}, 'ordering': {'type': sqlalchemy.Integer, 'transform': 
transf_int}, 'title': {}, 'region': {'type': sqlalchemy.String, 'length': 5, 'index': True}, 'language': {'type': sqlalchemy.String, 'length': 5, 'index': True}, 'types': {'type': sqlalchemy.String, 'length': 31, 'index': True}, 'attributes': {'type': sqlalchemy.String, 'length': 127}, 'isOriginalTitle': {'type': sqlalchemy.Boolean, 'transform': transf_bool, 'rename': 'original', 'index': True}, 't_soundex': {'type': sqlalchemy.String, 'length': 5, 'index': True} }, 'title_crew': { 'tconst': {'type': sqlalchemy.Integer, 'transform': transf_imdbid, 'rename': 'movieID', 'index': True}, 'directors': {'transform': transf_multi_imdbid, 'rename': 'director'}, 'writers': {'transform': transf_multi_imdbid, 'rename': 'writer'} }, 'title_episode': { 'tconst': {'type': sqlalchemy.Integer, 'transform': transf_imdbid, 'rename': 'movieID', 'index': True}, 'parentTconst': {'type': sqlalchemy.Integer, 'transform': transf_imdbid, 'index': True}, 'seasonNumber': {'type': sqlalchemy.Integer, 'transform': transf_int, 'rename': 'seasonNr'}, 'episodeNumber': {'type': sqlalchemy.Integer, 'transform': transf_int, 'rename': 'episodeNr'} }, 'title_principals': { 'tconst': {'type': sqlalchemy.Integer, 'transform': transf_imdbid, 'rename': 'movieID', 'index': True}, 'ordering': {'type': sqlalchemy.Integer, 'transform': transf_int}, 'nconst': {'type': sqlalchemy.Integer, 'transform': transf_imdbid, 'rename': 'personID', 'index': True}, 'category': {'type': sqlalchemy.String, 'length': 64}, 'job': {'type': sqlalchemy.String, 'length': 1024}, 'characters': {'type': sqlalchemy.String, 'length': 1024, 'transform': transf_multi_character} }, 'title_ratings': { 'tconst': {'type': sqlalchemy.Integer, 'transform': transf_imdbid, 'rename': 'movieID', 'index': True}, 'averageRating': {'type': sqlalchemy.Float, 'transform': transf_float, 'rename': 'rating', 'index': True}, 'numVotes': {'type': sqlalchemy.Integer, 'transform': transf_int, 'rename': 'votes', 'index': True} } } _translate = dict(B='1', 
C='2', D='3', F='1', G='2', J='2', K='2', L='4', M='5', N='5', P='1', Q='2', R='6', S='2', T='3', V='1', X='2', Z='2') _translateget = _translate.get _re_non_ascii = re.compile(r'^[^a-z]*', re.I) def soundex(s, length=SOUNDEX_LENGTH): """Return the soundex code for the given string. :param s: the string to convert to soundex :type s: str :param length: length of the soundex code to generate :type length: int :returns: the soundex code :rtype: str""" s = _re_non_ascii.sub('', s) if not s: return None s = s.upper() soundCode = s[0] count = 1 for c in s[1:]: if count >= length: break cw = _translateget(c, '0') if cw != '0' and soundCode[-1] != cw: soundCode += cw count += 1 return soundCode or None def title_soundex(title): """Return the soundex code for the given title; the (optional) starting article is pruned. :param title: movie title :type title: str :returns: soundex of the title (without the article, if any) :rtype: str """ if not title: return None title = canonicalTitle(title) ts = title.split(', ') if ts[-1].lower() in _unicodeArticles: title = ', '.join(ts[:-1]) return soundex(title) def name_soundexes(name): """Return three soundex codes for the given name. :param name: person name :type name: str :returns: tuple of soundex codes: (S(Name Surname), S(Surname Name), S(Surname)) :rtype: tuple """ if not name: return None, None, None s1 = soundex(name) canonical_name = canonicalName(name) s2 = soundex(canonical_name) if s1 == s2: s2 = None s3 = soundex(canonical_name.split(', ')[0]) if s3 and s3 in (s1, s2): s3 = None return s1, s2, s3 def ratcliff(s1, s2, sm): """Ratcliff-Obershelp similarity. 
:param s1: first string to compare :type s1: str :param s2: second string to compare :type s2: str :param sm: sequence matcher to use for the comparison :type sm: :class:`difflib.SequenceMatcher` :returns: 0.0-1.0 similarity :rtype: float""" s1len = len(s1) s2len = len(s2) if s1len < s2len: threshold = float(s1len) / s2len else: threshold = float(s2len) / s1len if threshold < STRING_MAXLENDIFFER: return 0.0 sm.set_seq2(s2.lower()) return sm.ratio() def scan_names(name_list, name, results=0, ro_threshold=RO_THRESHOLD): """Scan a list of names, searching for best matches against some variations. :param name_list: list of (personID, {person_data}) tuples :type name_list: list :param name: searched name :type name: str :results: returns at most as much results (all, if 0) :type results: int :param ro_threshold: ignore results with a score lower than this value :type ro_threshold: float :returns: list of results sorted by similarity :rtype: list""" canonical_name = canonicalName(name).replace(',', '') sm1 = SequenceMatcher() sm2 = SequenceMatcher() sm1.set_seq1(name.lower()) sm2.set_seq1(canonical_name.lower()) resd = {} for i, n_data in name_list: nil = n_data['name'] # Distance with the canonical name. ratios = [ratcliff(name, nil, sm1) + 0.1, ratcliff(name, canonicalName(nil).replace(',', ''), sm2)] ratio = max(ratios) if ratio >= ro_threshold: if i in resd: if ratio > resd[i][0]: resd[i] = (ratio, (i, n_data)) else: resd[i] = (ratio, (i, n_data)) res = list(resd.values()) res.sort() res.reverse() if results > 0: res[:] = res[:results] return res def strip_article(title): no_article_title = canonicalTitle(title) t2s = no_article_title.split(', ') if t2s[-1].lower() in _unicodeArticles: no_article_title = ', '.join(t2s[:-1]) return no_article_title def scan_titles(titles_list, title, results=0, ro_threshold=RO_THRESHOLD): """Scan a list of titles, searching for best matches amongst some variations. 
    :param titles_list: list of (movieID, {movie_data}) tuples
    :type titles_list: list
    :param title: searched title
    :type title: str
    :param results: return at most this many results (all, if 0)
    :type results: int
    :param ro_threshold: ignore results with a score lower than this value
    :type ro_threshold: float
    :returns: list of results sorted by similarity
    :rtype: list"""
    no_article_title = strip_article(title)
    sm1 = SequenceMatcher()
    sm1.set_seq1(title.lower())
    sm2 = SequenceMatcher()
    # The searched title is the fixed first sequence; ratcliff() sets the
    # second sequence for every candidate.
    sm2.set_seq1(no_article_title.lower())
    resd = {}
    for i, t_data in titles_list:
        til = t_data['title']
        ratios = [ratcliff(title, til, sm1) + 0.1,
                  ratcliff(no_article_title, strip_article(til), sm2)]
        ratio = max(ratios)
        if t_data.get('kind') == 'episode':
            ratio -= .2
        if ratio >= ro_threshold:
            if i in resd:
                if ratio > resd[i][0]:
                    resd[i] = (ratio, (i, t_data))
            else:
                resd[i] = (ratio, (i, t_data))
    res = list(resd.values())
    res.sort()
    res.reverse()
    if results > 0:
        res[:] = res[:results]
    return res
imdbpy-6.8/imdb/parser/sql/000077500000000000000000000000001351454127000156635ustar00rootroot00000000000000
imdbpy-6.8/imdb/parser/sql/__init__.py000066400000000000000000001750141351454127000200040ustar00rootroot00000000000000
# Copyright 2005-2019 Davide Alberani
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This package provides the IMDbSqlAccessSystem class used to access IMDb's data through a SQL database. Every database supported by the SQLAlchemy object relational mapper is available. The :func:`imdb.IMDb` function will return an instance of this class when called with the ``accessSystem`` parameter is set to "sql", "database" or "db". """ from __future__ import absolute_import, division, print_function, unicode_literals import re import logging from difflib import SequenceMatcher from codecs import lookup from imdb import IMDbBase from imdb.utils import normalizeName, normalizeTitle, build_title, \ build_name, analyze_name, analyze_title, \ canonicalTitle, canonicalName, re_titleRef, \ build_company_name, re_episodes, _unicodeArticles, \ analyze_company_name, re_year_index, re_nameRef from imdb.Person import Person from imdb.Movie import Movie from imdb.Company import Company from imdb._exceptions import IMDbDataAccessError, IMDbError # Logger for miscellaneous functions. _aux_logger = logging.getLogger('imdbpy.parser.sql.aux') # ============================= # Things that once upon a time were in imdb.parser.common.locsql. def titleVariations(title, fromPtdf=False): """Build title variations useful for searches; if fromPtdf is true, the input is assumed to be in the plain text data files format.""" if fromPtdf: title1 = '' else: title1 = title title2 = title3 = '' if fromPtdf or re_year_index.search(title): # If it appears to have a (year[/imdbIndex]) indication, # assume that a long imdb canonical name was provided. titldict = analyze_title(title, canonical=1) # title1: the canonical name. title1 = titldict['title'] if titldict['kind'] != 'episode': # title3: the long imdb canonical name. 
if fromPtdf: title3 = title else: title3 = build_title(titldict, canonical=1, ptdf=1) else: title1 = normalizeTitle(title1) title3 = build_title(titldict, canonical=1, ptdf=1) else: # Just a title. # title1: the canonical title. title1 = canonicalTitle(title) title3 = '' # title2 is title1 without the article, or title1 unchanged. if title1: title2 = title1 t2s = title2.split(', ') if t2s[-1].lower() in _unicodeArticles: title2 = ', '.join(t2s[:-1]) _aux_logger.debug('title variations: 1:[%s] 2:[%s] 3:[%s]', title1, title2, title3) return title1, title2, title3 re_nameIndex = re.compile(r'\(([IVXLCDM]+)\)') def nameVariations(name, fromPtdf=False): """Build name variations useful for searches; if fromPtdf is true, the input is assumed to be in the plain text data files format.""" name1 = name2 = name3 = '' if fromPtdf or re_nameIndex.search(name): # We've a name with an (imdbIndex) namedict = analyze_name(name, canonical=1) # name1 is the name in the canonical format. name1 = namedict['name'] # name3 is the canonical name with the imdbIndex. if fromPtdf: if 'imdbIndex' in namedict: name3 = name else: name3 = build_name(namedict, canonical=1) else: # name1 is the name in the canonical format. name1 = canonicalName(name) name3 = '' # name2 is the name in the normal format, if it differs from name1. 
name2 = normalizeName(name1) if name1 == name2: name2 = '' _aux_logger.debug('name variations: 1:[%s] 2:[%s] 3:[%s]', name1, name2, name3) return name1, name2, name3 def ratcliff(s1, s2, sm): """Ratcliff-Obershelp similarity.""" STRING_MAXLENDIFFER = 0.7 s1len = len(s1) s2len = len(s2) if s1len < s2len: threshold = float(s1len) / s2len else: threshold = float(s2len) / s1len if threshold < STRING_MAXLENDIFFER: return 0.0 sm.set_seq2(s2.lower()) return sm.ratio() def merge_roles(mop): """Merge multiple roles.""" new_list = [] for m in mop: m_isinnewlist = False m_indexinnewlist = None if isinstance(m, Person): for i, person in enumerate(new_list): if person.isSamePerson(m): m_isinnewlist = True m_indexinnewlist = i break else: if m in new_list: m_isinnewlist = True m_indexinnewlist = new_list.index(m) if m_isinnewlist: keep_this = new_list[m_indexinnewlist] if not isinstance(keep_this.currentRole, list): keep_this.currentRole = [keep_this.currentRole] keep_this.currentRole.append(m.currentRole) else: new_list.append(m) return new_list def scan_names(name_list, name1, name2, name3, results=0, ro_thresold=None, _scan_character=False): """Scan a list of names, searching for best matches against the given variations.""" if ro_thresold is not None: RO_THRESHOLD = ro_thresold else: RO_THRESHOLD = 0.6 sm1 = SequenceMatcher() sm2 = SequenceMatcher() sm3 = SequenceMatcher() sm1.set_seq1(name1.lower()) if name2: sm2.set_seq1(name2.lower()) if name3: sm3.set_seq1(name3.lower()) resd = {} for i, n_data in name_list: nil = n_data['name'] # Distance with the canonical name. ratios = [ratcliff(name1, nil, sm1) + 0.05] namesurname = '' if not _scan_character: nils = nil.split(', ', 1) surname = nils[0] if len(nils) == 2: namesurname = '%s %s' % (nils[1], surname) else: nils = nil.split(' ', 1) surname = nils[-1] namesurname = nil if surname != nil: # Distance with the "Surname" in the database. 
            ratios.append(ratcliff(name1, surname, sm1))
            if not _scan_character:
                ratios.append(ratcliff(name1, namesurname, sm1))
            if name2:
                ratios.append(ratcliff(name2, surname, sm2))
                # Distance with the "Name Surname" in the database.
                if namesurname:
                    ratios.append(ratcliff(name2, namesurname, sm2))
        if name3:
            # Distance with the long imdb canonical name.
            ratios.append(ratcliff(name3,
                                   build_name(n_data, canonical=1), sm3) + 0.1)
        ratio = max(ratios)
        if ratio >= RO_THRESHOLD:
            if i in resd:
                if ratio > resd[i][0]:
                    resd[i] = (ratio, (i, n_data))
            else:
                resd[i] = (ratio, (i, n_data))
    res = list(resd.values())
    res.sort()
    res.reverse()
    if results > 0:
        res[:] = res[:results]
    return res


def scan_titles(titles_list, title1, title2, title3, results=0,
                searchingEpisode=0, onlyEpisodes=0, ro_thresold=None):
    """Scan a list of titles, searching for best matches against
    the given variations."""
    if ro_thresold is not None:
        RO_THRESHOLD = ro_thresold
    else:
        RO_THRESHOLD = 0.6
    sm1 = SequenceMatcher()
    sm2 = SequenceMatcher()
    sm3 = SequenceMatcher()
    sm1.set_seq1(title1.lower())
    # The searched variation is the fixed first sequence; ratcliff() sets
    # the second sequence for every candidate.
    sm2.set_seq1(title2.lower())
    if title3:
        sm3.set_seq1(title3.lower())
        if title3[-1] == '}':
            searchingEpisode = 1
    hasArt = 0
    if title2 != title1:
        hasArt = 1
    resd = {}
    for i, t_data in titles_list:
        if onlyEpisodes:
            if t_data.get('kind') != 'episode':
                continue
            til = t_data['title']
            if til[-1] == ')':
                dateIdx = til.rfind('(')
                if dateIdx != -1:
                    til = til[:dateIdx].rstrip()
            if not til:
                continue
            ratio = ratcliff(title1, til, sm1)
            if ratio >= RO_THRESHOLD:
                resd[i] = (ratio, (i, t_data))
            continue
        if searchingEpisode:
            if t_data.get('kind') != 'episode':
                continue
        elif t_data.get('kind') == 'episode':
            continue
        til = t_data['title']
        # Distance with the canonical title (with or without article).
        #   titleS      -> titleR
        #   titleS, the -> titleR, the
        if not searchingEpisode:
            til = canonicalTitle(til)
            ratios = [ratcliff(title1, til, sm1) + 0.05]
            # til2 is til without the article, if present.
til2 = til tils = til2.split(', ') matchHasArt = 0 if tils[-1].lower() in _unicodeArticles: til2 = ', '.join(tils[:-1]) matchHasArt = 1 if hasArt and not matchHasArt: # titleS[, the] -> titleR ratios.append(ratcliff(title2, til, sm2)) elif matchHasArt and not hasArt: # titleS -> titleR[, the] ratios.append(ratcliff(title1, til2, sm1)) else: ratios = [0.0] if title3: # Distance with the long imdb canonical title. ratios.append(ratcliff(title3, build_title(t_data, canonical=1, ptdf=1), sm3) + 0.1) ratio = max(ratios) if ratio >= RO_THRESHOLD: if i in resd: if ratio > resd[i][0]: resd[i] = (ratio, (i, t_data)) else: resd[i] = (ratio, (i, t_data)) res = list(resd.values()) res.sort() res.reverse() if results > 0: res[:] = res[:results] return res def scan_company_names(name_list, name1, results=0, ro_thresold=None): """Scan a list of company names, searching for best matches against the given name. Notice that this function takes a list of strings, and not a list of dictionaries.""" if ro_thresold is not None: RO_THRESHOLD = ro_thresold else: RO_THRESHOLD = 0.6 sm1 = SequenceMatcher() sm1.set_seq1(name1.lower()) resd = {} withoutCountry = not name1.endswith(']') for i, n in name_list: o_name = n var = 0.0 if withoutCountry and n.endswith(']'): cidx = n.rfind('[') if cidx != -1: n = n[:cidx].rstrip() var = -0.05 # Distance with the company name. 
ratio = ratcliff(name1, n, sm1) + var if ratio >= RO_THRESHOLD: if i in resd: if ratio > resd[i][0]: resd[i] = (ratio, (i, analyze_company_name(o_name))) else: resd[i] = (ratio, (i, analyze_company_name(o_name))) res = list(resd.values()) res.sort() res.reverse() if results > 0: res[:] = res[:results] return res _translate = dict(B='1', C='2', D='3', F='1', G='2', J='2', K='2', L='4', M='5', N='5', P='1', Q='2', R='6', S='2', T='3', V='1', X='2', Z='2') _translateget = _translate.get _re_non_ascii = re.compile(r'^[^a-z]*', re.I) SOUNDEX_LEN = 5 def soundex(s): """Return the soundex code for the given string.""" # Maximum length of the soundex code. s = _re_non_ascii.sub('', s) if not s: return None s = s.upper() soundCode = s[0] for c in s[1:]: cw = _translateget(c, '0') if cw != '0' and soundCode[-1] != cw: soundCode += cw return soundCode[:SOUNDEX_LEN] or None def _sortKeywords(keyword, kwds): """Sort a list of keywords, based on the searched one.""" sm = SequenceMatcher() sm.set_seq1(keyword.lower()) ratios = [(ratcliff(keyword, k, sm), k) for k in kwds] checkContained = False if len(keyword) > 4: checkContained = True for idx, data in enumerate(ratios): ratio, key = data if key.startswith(keyword): ratios[idx] = (ratio + 0.5, key) elif checkContained and keyword in key: ratios[idx] = (ratio + 0.3, key) ratios.sort() ratios.reverse() return [r[1] for r in ratios] def filterSimilarKeywords(keyword, kwdsIterator): """Return a sorted list of keywords similar to the one given.""" seenDict = {} kwdSndx = soundex(keyword) matches = [] matchesappend = matches.append checkContained = False if len(keyword) > 4: checkContained = True for movieID, key in kwdsIterator: if key in seenDict: continue seenDict[key] = None if checkContained and keyword in key: matchesappend(key) continue if kwdSndx == soundex(key): matchesappend(key) return _sortKeywords(keyword, matches) # ============================= _litlist = ['screenplay/teleplay', 'novel', 'adaption', 'book', 'production 
process protocol', 'interviews', 'printed media reviews', 'essays', 'other literature'] _litd = dict([(x, ('literature', x)) for x in _litlist]) _buslist = ['budget', 'weekend gross', 'gross', 'opening weekend', 'rentals', 'admissions', 'filming dates', 'production dates', 'studios', 'copyright holder'] _busd = dict([(x, ('business', x)) for x in _buslist]) def _reGroupDict(d, newgr): """Regroup keys in the d dictionary in subdictionaries, based on the scheme in the newgr dictionary. E.g.: in the newgr, an entry 'LD label': ('laserdisc', 'label') tells the _reGroupDict() function to take the entry with label 'LD label' (as received from the sql database) and put it in the subsection (another dictionary) named 'laserdisc', using the key 'label'.""" r = {} newgrks = list(newgr.keys()) for k, v in list(d.items()): if k in newgrks: r.setdefault(newgr[k][0], {})[newgr[k][1]] = v else: r[k] = v return r def _groupListBy(l, index): """Regroup items in a list in a list of lists, grouped by the value at the given index.""" tmpd = {} for item in l: tmpd.setdefault(item[index], []).append(item) res = list(tmpd.values()) return res def sub_dict(d, keys): """Return the subdictionary of 'd', with just the keys listed in 'keys'.""" return dict([(k, d[k]) for k in keys if k in d]) def get_movie_data(movieID, kindDict, fromAka=0, _table=None): """Return a dictionary containing data about the given movieID; if fromAka is true, the AkaTitle table is searched; _table is reserved for the imdbpy2sql.py script.""" if _table is not None: Table = _table else: if not fromAka: Table = Title else: Table = AkaTitle try: m = Table.get(movieID) except Exception as e: _aux_logger.warning('Unable to fetch information for movieID %s: %s', movieID, e) mdict = {} return mdict mdict = {'title': m.title, 'kind': kindDict[m.kindID], 'year': m.productionYear, 'imdbIndex': m.imdbIndex, 'season': m.seasonNr, 'episode': m.episodeNr} if not fromAka: if m.seriesYears is not None: mdict['series years'] =
str(m.seriesYears) if mdict['imdbIndex'] is None: del mdict['imdbIndex'] if mdict['year'] is None: del mdict['year'] else: try: mdict['year'] = int(mdict['year']) except (TypeError, ValueError): del mdict['year'] if mdict['season'] is None: del mdict['season'] else: try: mdict['season'] = int(mdict['season']) except: pass if mdict['episode'] is None: del mdict['episode'] else: try: mdict['episode'] = int(mdict['episode']) except: pass episodeOfID = m.episodeOfID if episodeOfID is not None: ser_dict = get_movie_data(episodeOfID, kindDict, fromAka) mdict['episode of'] = Movie(data=ser_dict, movieID=episodeOfID, accessSystem='sql') if fromAka: ser_note = AkaTitle.get(episodeOfID).note if ser_note: mdict['episode of'].notes = ser_note return mdict def _iterKeywords(results): """Iterate over (key.id, key.keyword) columns of a selection of the Keyword table.""" for key in results: yield key.id, key.keyword def getSingleInfo(table, movieID, infoType, notAList=False): """Return a dictionary in the form {infoType: infoListOrString}, retrieving a single set of information about a given movie, from the specified table.""" infoTypeID = InfoType.select(InfoType.q.info == infoType) if infoTypeID.count() == 0: return {} res = table.select(AND(table.q.movieID == movieID, table.q.infoTypeID == infoTypeID[0].id)) retList = [] for r in res: info = r.info note = r.note if note: info += '::%s' % note retList.append(info) if not retList: return {} if not notAList: return {infoType: retList} else: return {infoType: retList[0]} def _cmpTop(a, b, what='top 250 rank'): """Compare function used to sort top 250/bottom 10 rank.""" av = int(a[1].get(what)) bv = int(b[1].get(what)) if av == bv: return 0 return (-1, 1)[av > bv] def _cmpBottom(a, b): """Compare function used to sort top 250/bottom 10 rank.""" return _cmpTop(a, b, what='bottom 10 rank') class IMDbSqlAccessSystem(IMDbBase): """The class used to access IMDb's data through a SQL database.""" accessSystem = 'sql' _sql_logger = 
logging.getLogger('imdbpy.parser.sql') def __init__(self, uri, adultSearch=True, *arguments, **keywords): """Initialize the access system.""" IMDbBase.__init__(self, *arguments, **keywords) DB_TABLES = [] try: from .alchemyadapter import getDBTables, NotFoundError, \ setConnection, AND, OR, IN, \ ISNULL, CONTAINSSTRING, toUTF8 # XXX: look ma'... black magic! It's used to make # TableClasses and some functions accessible # through the whole module. for k, v in [('NotFoundError', NotFoundError), ('AND', AND), ('OR', OR), ('IN', IN), ('ISNULL', ISNULL), ('CONTAINSSTRING', CONTAINSSTRING)]: globals()[k] = v self.toUTF8 = toUTF8 DB_TABLES = getDBTables(uri) for t in DB_TABLES: globals()[t._imdbpyName] = t except ImportError as e: raise IMDbError('unable to import SQLAlchemy') # Set the connection to the database. self._sql_logger.debug('connecting to %s', uri) try: self._connection = setConnection(uri, DB_TABLES) except AssertionError as e: raise IMDbDataAccessError( 'unable to connect to the database server; ' + 'complete message: "%s"' % str(e)) self.Error = self._connection.module.Error # Maps some IDs to the corresponding strings. self._kind = {} self._kindRev = {} self._sql_logger.debug('reading constants from the database') try: for kt in KindType.select(): self._kind[kt.id] = kt.kind self._kindRev[str(kt.kind)] = kt.id except self.Error: # NOTE: you can also get the error, but - at least with # MySQL - it also contains the password, and I don't # like the idea to print it out. 
raise IMDbDataAccessError( 'unable to connect to the database server') self._role = {} for rl in RoleType.select(): self._role[rl.id] = str(rl.role) self._info = {} self._infoRev = {} for inf in InfoType.select(): self._info[inf.id] = str(inf.info) self._infoRev[str(inf.info)] = inf.id self._compType = {} for cType in CompanyType.select(): self._compType[cType.id] = cType.kind info = [(it.id, it.info) for it in InfoType.select()] self._compcast = {} for cc in CompCastType.select(): self._compcast[cc.id] = str(cc.kind) self._link = {} for lt in LinkType.select(): self._link[lt.id] = str(lt.link) self._moviesubs = {} # Build self._moviesubs, a dictionary used to rearrange # the data structure for a movie object. for vid, vinfo in info: if not vinfo.startswith('LD '): continue self._moviesubs[vinfo] = ('laserdisc', vinfo[3:]) self._moviesubs.update(_litd) self._moviesubs.update(_busd) self.do_adult_search(adultSearch) def _findRefs(self, o, trefs, nrefs): """Find titles or names references in strings.""" if isinstance(o, str): for title in re_titleRef.findall(o): a_title = analyze_title(title, canonical=0) rtitle = build_title(a_title, ptdf=1) if rtitle in trefs: continue movieID = self._getTitleID(rtitle) if movieID is None: movieID = self._getTitleID(title) if movieID is None: continue m = Movie(title=rtitle, movieID=movieID, accessSystem=self.accessSystem) trefs[rtitle] = m rtitle2 = canonicalTitle(a_title.get('title', '')) if rtitle2 and rtitle2 != rtitle and rtitle2 != title: trefs[rtitle2] = m if title != rtitle: trefs[title] = m for name in re_nameRef.findall(o): a_name = analyze_name(name, canonical=1) rname = build_name(a_name, canonical=1) if rname in nrefs: continue personID = self._getNameID(rname) if personID is None: personID = self._getNameID(name) if personID is None: continue p = Person(name=rname, personID=personID, accessSystem=self.accessSystem) nrefs[rname] = p rname2 = normalizeName(a_name.get('name', '')) if rname2 and rname2 != rname: 
nrefs[rname2] = p if name != rname and name != rname2: nrefs[name] = p elif isinstance(o, (list, tuple)): for item in o: self._findRefs(item, trefs, nrefs) elif isinstance(o, dict): for value in list(o.values()): self._findRefs(value, trefs, nrefs) return trefs, nrefs def _extractRefs(self, o): """Scan for titles or names references in strings.""" trefs = {} nrefs = {} try: return self._findRefs(o, trefs, nrefs) except RuntimeError as e: # Symbian/python 2.2 has a poor regexp implementation. import warnings warnings.warn('RuntimeError in ' "imdb.parser.sql.IMDbSqlAccessSystem; " "if it's not a recursion limit exceeded and we're not " "running in a Symbian environment, it's a bug:\n%s" % e) return trefs, nrefs def _changeAKAencoding(self, akanotes, akatitle): """Return akatitle in the correct charset, as specified in the akanotes field; if akatitle doesn't need to be modified, return None.""" oti = akanotes.find('(original ') if oti == -1: return None ote = akanotes[oti + 10:].find(' title)') if ote != -1: cs_info = akanotes[oti + 10:oti + 10 + ote].lower().split() for e in cs_info: # excludes some strings that clearly are not encoding. 
if e in ('script', '', 'cyrillic', 'greek'): continue if e.startswith('iso-') and e.find('latin') != -1: e = e[4:].replace('-', '') try: lookup(e) lat1 = akatitle.encode('latin_1', 'replace') return str(lat1, e, 'replace') except (LookupError, ValueError, TypeError): continue return None def _buildNULLCondition(self, col, val): """Build a comparison for columns where values can be NULL.""" if val is None: return ISNULL(col) else: if isinstance(val, int): return col == val else: return col == self.toUTF8(val) def _getTitleID(self, title): """Given a long imdb canonical title, returns a movieID or None if not found.""" td = analyze_title(title) condition = None if td['kind'] == 'episode': epof = td['episode of'] seriesID = [s.id for s in Title.select( AND(Title.q.title == self.toUTF8(epof['title']), self._buildNULLCondition(Title.q.imdbIndex, epof.get('imdbIndex')), Title.q.kindID == self._kindRev[epof['kind']], self._buildNULLCondition(Title.q.productionYear, epof.get('year'))))] if seriesID: condition = AND(IN(Title.q.episodeOfID, seriesID), Title.q.title == self.toUTF8(td['title']), self._buildNULLCondition(Title.q.imdbIndex, td.get('imdbIndex')), Title.q.kindID == self._kindRev[td['kind']], self._buildNULLCondition(Title.q.productionYear, td.get('year'))) if condition is None: condition = AND(Title.q.title == self.toUTF8(td['title']), self._buildNULLCondition(Title.q.imdbIndex, td.get('imdbIndex')), Title.q.kindID == self._kindRev[td['kind']], self._buildNULLCondition(Title.q.productionYear, td.get('year'))) res = Title.select(condition) try: if res.count() != 1: return None except (UnicodeDecodeError, TypeError): return None return res[0].id def _getNameID(self, name): """Given a long imdb canonical name, returns a personID or None if not found.""" nd = analyze_name(name) res = Name.select(AND(Name.q.name == self.toUTF8(nd['name']), self._buildNULLCondition(Name.q.imdbIndex, nd.get('imdbIndex')))) try: if res.count() != 1: return None except (UnicodeDecodeError, 
TypeError): return None return res[0].id def _normalize_movieID(self, movieID): """Normalize the given movieID.""" try: return int(movieID) except (ValueError, OverflowError): raise IMDbError('movieID "%s" can\'t be converted to integer' % movieID) def _normalize_personID(self, personID): """Normalize the given personID.""" try: return int(personID) except (ValueError, OverflowError): raise IMDbError('personID "%s" can\'t be converted to integer' % personID) def _normalize_characterID(self, characterID): """Normalize the given characterID.""" try: return int(characterID) except (ValueError, OverflowError): raise IMDbError('characterID "%s" can\'t be converted to integer' % characterID) def _normalize_companyID(self, companyID): """Normalize the given companyID.""" try: return int(companyID) except (ValueError, OverflowError): raise IMDbError('companyID "%s" can\'t be converted to integer' % companyID) def get_imdbMovieID(self, movieID): """Translate a movieID into an imdbID. If not in the database, try an Exact Primary Title search on IMDb; return None if it's unable to get the imdbID. """ try: movie = Title.get(movieID) except NotFoundError: return None imdbID = movie.imdbID if imdbID is not None: return '%07d' % imdbID m_dict = get_movie_data(movie.id, self._kind) titline = build_title(m_dict, ptdf=False) imdbID = self.title2imdbID(titline, m_dict['kind']) # If the imdbID was retrieved from the web and was not in the # database, update the database (ignoring errors, because it's # possible that the current user lacks update privileges). # There're times when I think I'm a genius; this is one of # those times... if imdbID is not None and not isinstance(imdbID, list): try: movie.imdbID = int(imdbID) except: pass return imdbID def get_imdbPersonID(self, personID): """Translate a personID into an imdbID. If not in the database, try an Exact Primary Name search on IMDb; return None if it's unable to get the imdbID.
""" try: person = Name.get(personID) except NotFoundError: return None imdbID = person.imdbID if imdbID is not None: return '%07d' % imdbID n_dict = {'name': person.name, 'imdbIndex': person.imdbIndex} namline = build_name(n_dict, canonical=False) imdbID = self.name2imdbID(namline) if imdbID is not None and not isinstance(imdbID, list): try: person.imdbID = int(imdbID) except: pass return imdbID def get_imdbCharacterID(self, characterID): """Translate a characterID into an imdbID. If not in the database, try an Exact Primary Name search on IMDb; return None if it's unable to get the imdbID. """ try: character = CharName.get(characterID) except NotFoundError: return None imdbID = character.imdbID if imdbID is not None: return '%07d' % imdbID n_dict = {'name': character.name, 'imdbIndex': character.imdbIndex} namline = build_name(n_dict, canonical=False) imdbID = self.character2imdbID(namline) if imdbID is not None and not isinstance(imdbID, list): try: character.imdbID = int(imdbID) except: pass return imdbID def get_imdbCompanyID(self, companyID): """Translate a companyID into an imdbID. If not in the database, try an Exact Primary Name search on IMDb; return None if it's unable to get the imdbID.
""" try: company = CompanyName.get(companyID) except NotFoundError: return None imdbID = company.imdbID if imdbID is not None: return '%07d' % imdbID n_dict = {'name': company.name, 'country': company.countryCode} namline = build_company_name(n_dict) imdbID = self.company2imdbID(namline) if imdbID is not None and not isinstance(imdbID, list): try: company.imdbID = int(imdbID) except: pass return imdbID def do_adult_search(self, doAdult): """If set to 0 or False, movies in the Adult category are not shown in the results of a search.""" self.doAdult = doAdult def _search_movie(self, title, results, _episodes=False): title = title.strip() if not title: return [] title_dict = analyze_title(title, canonical=1) s_title = title_dict['title'] if not s_title: return [] episodeOf = title_dict.get('episode of') if episodeOf: _episodes = False s_title_split = s_title.split(', ') if len(s_title_split) > 1 and \ s_title_split[-1].lower() in _unicodeArticles: s_title_rebuilt = ', '.join(s_title_split[:-1]) if s_title_rebuilt: s_title = s_title_rebuilt soundexCode = soundex(s_title) # XXX: improve the search restricting the kindID if the # "kind" of the input differs from "movie"? condition = conditionAka = None if _episodes: condition = AND(Title.q.phoneticCode == soundexCode, Title.q.kindID == self._kindRev['episode']) conditionAka = AND(AkaTitle.q.phoneticCode == soundexCode, AkaTitle.q.kindID == self._kindRev['episode']) elif title_dict['kind'] == 'episode' and episodeOf is not None: # set canonical=0 ? Should not make much difference. series_title = build_title(episodeOf, canonical=1) # XXX: is it safe to get "results" results? # Too many? Too few?
serRes = results if serRes < 3 or serRes > 10: serRes = 10 searchSeries = self._search_movie(series_title, serRes) seriesIDs = [result[0] for result in searchSeries] if seriesIDs: condition = AND(Title.q.phoneticCode == soundexCode, IN(Title.q.episodeOfID, seriesIDs), Title.q.kindID == self._kindRev['episode']) conditionAka = AND(AkaTitle.q.phoneticCode == soundexCode, IN(AkaTitle.q.episodeOfID, seriesIDs), AkaTitle.q.kindID == self._kindRev['episode']) else: # XXX: bad situation: we have found no matching series; # try searching everything (both episodes and # non-episodes) for the title. condition = AND(Title.q.phoneticCode == soundexCode, IN(Title.q.episodeOfID, seriesIDs)) conditionAka = AND(AkaTitle.q.phoneticCode == soundexCode, IN(AkaTitle.q.episodeOfID, seriesIDs)) if condition is None: # XXX: excludes episodes? condition = AND(Title.q.kindID != self._kindRev['episode'], Title.q.phoneticCode == soundexCode) conditionAka = AND(AkaTitle.q.kindID != self._kindRev['episode'], AkaTitle.q.phoneticCode == soundexCode) # Up to 3 variations of the title are searched, plus the # long imdb canonical title, if provided. 
if not _episodes: title1, title2, title3 = titleVariations(title) else: title1 = title title2 = '' title3 = '' try: qr = [(q.id, get_movie_data(q.id, self._kind)) for q in Title.select(condition)] q2 = [(q.movieID, get_movie_data(q.id, self._kind, fromAka=1)) for q in AkaTitle.select(conditionAka)] qr += q2 except NotFoundError as e: raise IMDbDataAccessError( 'unable to search the database: "%s"' % str(e)) resultsST = results * 3 res = scan_titles(qr, title1, title2, title3, resultsST, searchingEpisode=episodeOf is not None, onlyEpisodes=_episodes, ro_thresold=0.0) res[:] = [x[1] for x in res] if res and not self.doAdult: mids = [x[0] for x in res] genreID = self._infoRev['genres'] adultlist = [al.movieID for al in MovieInfo.select( AND(MovieInfo.q.infoTypeID == genreID, MovieInfo.q.info == 'Adult', IN(MovieInfo.q.movieID, mids)))] res[:] = [x for x in res if x[0] not in adultlist] new_res = [] # XXX: can there be duplicates? for r in res: if r not in q2: new_res.append(r) continue mdict = r[1] aka_title = build_title(mdict, ptdf=1) orig_dict = get_movie_data(r[0], self._kind) orig_title = build_title(orig_dict, ptdf=1) if aka_title == orig_title: new_res.append(r) continue orig_dict['akas'] = [aka_title] new_res.append((r[0], orig_dict)) if results > 0: new_res[:] = new_res[:results] return new_res def _search_movie_advanced(self, title=None, adult=None, results=None, sort=None, sort_dir=None): return self._search_movie(title, results) def _search_episode(self, title, results): return self._search_movie(title, results, _episodes=True) def get_movie_main(self, movieID): # Every movie information is retrieved from here. infosets = self.get_movie_infoset() try: res = get_movie_data(movieID, self._kind) except NotFoundError as e: raise IMDbDataAccessError( 'unable to get movieID "%s": "%s"' % (movieID, str(e))) if not res: raise IMDbDataAccessError('unable to get movieID "%s"' % movieID) # Collect cast information. 
castdata = [[cd.personID, cd.personRoleID, cd.note, cd.nrOrder, self._role[cd.roleID]] for cd in CastInfo.select(CastInfo.q.movieID == movieID)] for p in castdata: person = Name.get(p[0]) p += [person.name, person.imdbIndex] if p[4] in ('actor', 'actress'): p[4] = 'cast' # Regroup by role/duty (cast, writer, director, ...) castdata[:] = _groupListBy(castdata, 4) for group in castdata: duty = group[0][4] for pdata in group: curRole = pdata[1] curRoleID = None if curRole is not None: robj = CharName.get(curRole) curRole = robj.name curRoleID = robj.id p = Person(personID=pdata[0], name=pdata[5], currentRole=curRole or '', roleID=curRoleID, notes=pdata[2] or '', accessSystem='sql') if pdata[6]: p['imdbIndex'] = pdata[6] p.billingPos = pdata[3] res.setdefault(duty, []).append(p) if duty == 'cast': res[duty] = merge_roles(res[duty]) res[duty].sort() # Info about the movie. minfo = [(self._info[m.infoTypeID], m.info, m.note) for m in MovieInfo.select(MovieInfo.q.movieID == movieID)] minfo += [('keywords', Keyword.get(m.keywordID).keyword, None) for m in MovieKeyword.select(MovieKeyword.q.movieID == movieID)] minfo = _groupListBy(minfo, 0) for group in minfo: sect = group[0][0] for mdata in group: data = mdata[1] if mdata[2]: data += '::%s' % mdata[2] res.setdefault(sect, []).append(data) # Companies info about a movie. cinfo = [(self._compType[m.companyTypeID], m.companyID, m.note) for m in MovieCompanies.select(MovieCompanies.q.movieID == movieID)] cinfo = _groupListBy(cinfo, 0) for group in cinfo: sect = group[0][0] for mdata in group: cDb = CompanyName.get(mdata[1]) cDbTxt = cDb.name if cDb.countryCode: cDbTxt += ' %s' % cDb.countryCode company = Company(name=cDbTxt, companyID=mdata[1], notes=mdata[2] or '', accessSystem=self.accessSystem) res.setdefault(sect, []).append(company) # AKA titles. 
akat = [(get_movie_data(at.id, self._kind, fromAka=1), at.note) for at in AkaTitle.select(AkaTitle.q.movieID == movieID)] if akat: res['akas'] = [] for td, note in akat: nt = build_title(td, ptdf=1) if note: net = self._changeAKAencoding(note, nt) if net is not None: nt = net nt += '::%s' % note if nt not in res['akas']: res['akas'].append(nt) # Complete cast/crew. compcast = [(self._compcast[cc.subjectID], self._compcast[cc.statusID]) for cc in CompleteCast.select(CompleteCast.q.movieID == movieID)] if compcast: for entry in compcast: val = str(entry[1]) res['complete %s' % entry[0]] = val # Movie connections. mlinks = [[ml.linkedMovieID, self._link[ml.linkTypeID]] for ml in MovieLink.select(MovieLink.q.movieID == movieID)] if mlinks: for ml in mlinks: lmovieData = get_movie_data(ml[0], self._kind) if lmovieData: m = Movie(movieID=ml[0], data=lmovieData, accessSystem='sql') ml[0] = m res['connections'] = {} mlinks[:] = _groupListBy(mlinks, 1) for group in mlinks: lt = group[0][1] res['connections'][lt] = [i[0] for i in group] # Episodes. 
episodes = {} eps_list = list(Title.select(Title.q.episodeOfID == movieID)) eps_list.sort(key=lambda e: '%s.%s' % (e.seasonNr or '', e.episodeNr or '')) if eps_list: ps_data = {'title': res['title'], 'kind': res['kind'], 'year': res.get('year'), 'imdbIndex': res.get('imdbIndex')} parentSeries = Movie(movieID=movieID, data=ps_data, accessSystem='sql') for episode in eps_list: episodeID = episode.id episode_data = get_movie_data(episodeID, self._kind) m = Movie(movieID=episodeID, data=episode_data, accessSystem='sql') m['episode of'] = parentSeries season = episode_data.get('season', 'UNKNOWN') if season not in episodes: episodes[season] = {} ep_number = episode_data.get('episode') if ep_number is None: ep_number = max((list(episodes[season].keys()) or [0])) + 1 episodes[season][ep_number] = m res['episodes'] = episodes res['number of episodes'] = sum([len(x) for x in list(episodes.values())]) res['number of seasons'] = len(list(episodes.keys())) # Regroup laserdisc information. res = _reGroupDict(res, self._moviesubs) # Do some transformation to preserve consistency with other # data access systems. 
if 'quotes' in res: for idx, quote in enumerate(res['quotes']): res['quotes'][idx] = quote.split('::') if 'runtimes' in res and len(res['runtimes']) > 0: rt = res['runtimes'][0] episodes = re_episodes.findall(rt) if episodes: res['runtimes'][0] = re_episodes.sub('', rt) if res['runtimes'][0][-2:] == '::': res['runtimes'][0] = res['runtimes'][0][:-2] if 'votes' in res: res['votes'] = int(res['votes'][0]) if 'rating' in res: res['rating'] = float(res['rating'][0]) if 'votes distribution' in res: res['votes distribution'] = res['votes distribution'][0] if 'mpaa' in res: res['mpaa'] = res['mpaa'][0] if 'top 250 rank' in res: try: res['top 250 rank'] = int(res['top 250 rank']) except: pass if 'bottom 10 rank' in res: try: res['bottom 100 rank'] = int(res['bottom 10 rank']) except: pass del res['bottom 10 rank'] for old, new in [('guest', 'guests'), ('trademarks', 'trade-mark'), ('articles', 'article'), ('pictorials', 'pictorial'), ('magazine-covers', 'magazine-cover-photo')]: if old in res: res[new] = res[old] del res[old] trefs, nrefs = {}, {} trefs, nrefs = self._extractRefs(sub_dict(res, Movie.keys_tomodify_list)) return {'data': res, 'titlesRefs': trefs, 'namesRefs': nrefs, 'info sets': infosets} # Just to know what kind of information are available. 
get_movie_alternate_versions = get_movie_main get_movie_business = get_movie_main get_movie_connections = get_movie_main get_movie_crazy_credits = get_movie_main get_movie_goofs = get_movie_main get_movie_keywords = get_movie_main get_movie_literature = get_movie_main get_movie_locations = get_movie_main get_movie_plot = get_movie_main get_movie_quotes = get_movie_main get_movie_release_dates = get_movie_main get_movie_soundtrack = get_movie_main get_movie_taglines = get_movie_main get_movie_technical = get_movie_main get_movie_trivia = get_movie_main get_movie_vote_details = get_movie_main get_movie_episodes = get_movie_main def _search_person(self, name, results): name = name.strip() if not name: return [] s_name = analyze_name(name)['name'] if not s_name: return [] soundexCode = soundex(s_name) name1, name2, name3 = nameVariations(name) # If the soundex is None, compare only with the first # phoneticCode column. if soundexCode is not None: condition = IN(soundexCode, [Name.q.namePcodeCf, Name.q.namePcodeNf, Name.q.surnamePcode]) conditionAka = IN(soundexCode, [AkaName.q.namePcodeCf, AkaName.q.namePcodeNf, AkaName.q.surnamePcode]) else: condition = ISNULL(Name.q.namePcodeCf) conditionAka = ISNULL(AkaName.q.namePcodeCf) try: qr = [(q.id, {'name': q.name, 'imdbIndex': q.imdbIndex}) for q in Name.select(condition)] q2 = [(q.personID, {'name': q.name, 'imdbIndex': q.imdbIndex}) for q in AkaName.select(conditionAka)] qr += q2 except NotFoundError as e: raise IMDbDataAccessError( 'unable to search the database: "%s"' % str(e)) res = scan_names(qr, name1, name2, name3, results) res[:] = [x[1] for x in res] # Purge empty imdbIndex. returnl = [] for x in res: tmpd = x[1] if tmpd['imdbIndex'] is None: del tmpd['imdbIndex'] returnl.append((x[0], tmpd)) new_res = [] # XXX: can there be duplicates? 
for r in returnl: if r not in q2: new_res.append(r) continue pdict = r[1] aka_name = build_name(pdict, canonical=1) p = Name.get(r[0]) orig_dict = {'name': p.name, 'imdbIndex': p.imdbIndex} if orig_dict['imdbIndex'] is None: del orig_dict['imdbIndex'] orig_name = build_name(orig_dict, canonical=1) if aka_name == orig_name: new_res.append(r) continue orig_dict['akas'] = [aka_name] new_res.append((r[0], orig_dict)) if results > 0: new_res[:] = new_res[:results] return new_res def get_person_main(self, personID): # Every person information is retrieved from here. infosets = self.get_person_infoset() try: p = Name.get(personID) except NotFoundError as e: raise IMDbDataAccessError( 'unable to get personID "%s": "%s"' % (personID, str(e))) res = {'name': p.name, 'imdbIndex': p.imdbIndex} if res['imdbIndex'] is None: del res['imdbIndex'] if not res: raise IMDbDataAccessError('unable to get personID "%s"' % personID) # Collect cast information. castdata = [(cd.movieID, cd.personRoleID, cd.note, self._role[cd.roleID], get_movie_data(cd.movieID, self._kind)) for cd in CastInfo.select(CastInfo.q.personID == personID)] # Regroup by role/duty (cast, writer, director, ...) 
castdata[:] = _groupListBy(castdata, 3) episodes = {} seenDuties = [] for group in castdata: for mdata in group: duty = orig_duty = group[0][3] if duty not in seenDuties: seenDuties.append(orig_duty) note = mdata[2] or '' if 'episode of' in mdata[4]: duty = 'episodes' if orig_duty not in ('actor', 'actress'): if note: note = ' %s' % note note = '[%s]%s' % (orig_duty, note) curRole = mdata[1] curRoleID = None if curRole is not None: robj = CharName.get(curRole) curRole = robj.name curRoleID = robj.id m = Movie(movieID=mdata[0], data=mdata[4], currentRole=curRole or '', roleID=curRoleID, notes=note, accessSystem='sql') if duty != 'episodes': res.setdefault(duty, []).append(m) else: episodes.setdefault(m['episode of'], []).append(m) if episodes: for k in episodes: episodes[k].sort() episodes[k].reverse() res['episodes'] = episodes for duty in seenDuties: if duty in res: if duty in ('actor', 'actress', 'himself', 'herself', 'themselves'): res[duty] = merge_roles(res[duty]) res[duty].sort() # Info about the person. pinfo = [(self._info[pi.infoTypeID], pi.info, pi.note) for pi in PersonInfo.select(PersonInfo.q.personID == personID)] # Regroup by duty. pinfo = _groupListBy(pinfo, 0) for group in pinfo: sect = group[0][0] for pdata in group: data = pdata[1] if pdata[2]: data += '::%s' % pdata[2] res.setdefault(sect, []).append(data) # AKA names. akan = [(an.name, an.imdbIndex) for an in AkaName.select(AkaName.q.personID == personID)] if akan: res['akas'] = [] for n in akan: nd = {'name': n[0]} if n[1]: nd['imdbIndex'] = n[1] nt = build_name(nd, canonical=1) res['akas'].append(nt) # Do some transformation to preserve consistency with other # data access systems. 
for key in ('birth date', 'birth notes', 'death date', 'death notes', 'birth name', 'height'): if key in res: res[key] = res[key][0] if 'guest' in res: res['notable tv guest appearances'] = res['guest'] del res['guest'] miscnames = res.get('nick names', []) if 'birth name' in res: miscnames.append(res['birth name']) if 'akas' in res: for mname in miscnames: if mname in res['akas']: res['akas'].remove(mname) if not res['akas']: del res['akas'] trefs, nrefs = self._extractRefs(sub_dict(res, Person.keys_tomodify_list)) return {'data': res, 'titlesRefs': trefs, 'namesRefs': nrefs, 'info sets': infosets} # Just to know what kind of information are available. get_person_filmography = get_person_main get_person_biography = get_person_main get_person_other_works = get_person_main get_person_episodes = get_person_main def _search_character(self, name, results): name = name.strip() if not name: return [] s_name = analyze_name(name)['name'] if not s_name: return [] s_name = normalizeName(s_name) soundexCode = soundex(s_name) surname = s_name.split(' ')[-1] surnameSoundex = soundex(surname) name2 = '' soundexName2 = None nsplit = s_name.split() if len(nsplit) > 1: name2 = '%s %s' % (nsplit[-1], ' '.join(nsplit[:-1])) if s_name == name2: name2 = '' else: soundexName2 = soundex(name2) # If the soundex is None, compare only with the first # phoneticCode column. 
if soundexCode is not None: if soundexName2 is not None: condition = OR(surnameSoundex == CharName.q.surnamePcode, IN(CharName.q.namePcodeNf, [soundexCode, soundexName2]), IN(CharName.q.surnamePcode, [soundexCode, soundexName2])) else: condition = OR(surnameSoundex == CharName.q.surnamePcode, IN(soundexCode, [CharName.q.namePcodeNf, CharName.q.surnamePcode])) else: condition = ISNULL(CharName.q.namePcodeNf) try: qr = [(q.id, {'name': q.name, 'imdbIndex': q.imdbIndex}) for q in CharName.select(condition)] except NotFoundError as e: raise IMDbDataAccessError( 'unable to search the database: "%s"' % str(e)) res = scan_names(qr, s_name, name2, '', results, _scan_character=True) res[:] = [x[1] for x in res] # Purge empty imdbIndex. returnl = [] for x in res: tmpd = x[1] if tmpd['imdbIndex'] is None: del tmpd['imdbIndex'] returnl.append((x[0], tmpd)) return returnl def get_character_main(self, characterID, results=1000): # All character information is retrieved from here. infosets = self.get_character_infoset() try: c = CharName.get(characterID) except NotFoundError as e: raise IMDbDataAccessError( 'unable to get characterID "%s": "%s"' % (characterID, e)) res = {'name': c.name, 'imdbIndex': c.imdbIndex} if res['imdbIndex'] is None: del res['imdbIndex'] if not res: raise IMDbDataAccessError('unable to get characterID "%s"' % characterID) # Collect filmography information.
items = CastInfo.select(CastInfo.q.personRoleID == characterID) if results > 0: items = items[:results] filmodata = [(cd.movieID, cd.personID, cd.note, get_movie_data(cd.movieID, self._kind)) for cd in items if self._role[cd.roleID] in ('actor', 'actress')] fdata = [] for f in filmodata: curRole = None curRoleID = f[1] note = f[2] or '' if curRoleID is not None: robj = Name.get(curRoleID) curRole = robj.name m = Movie(movieID=f[0], data=f[3], currentRole=curRole or '', roleID=curRoleID, roleIsPerson=True, notes=note, accessSystem='sql') fdata.append(m) fdata = merge_roles(fdata) fdata.sort() if fdata: res['filmography'] = fdata return {'data': res, 'info sets': infosets} get_character_filmography = get_character_main get_character_biography = get_character_main def _search_company(self, name, results): name = name.strip() if not name: return [] soundexCode = soundex(name) # If the soundex is None, compare only with the first # phoneticCode column. if soundexCode is None: condition = ISNULL(CompanyName.q.namePcodeNf) else: if name.endswith(']'): condition = CompanyName.q.namePcodeSf == soundexCode else: condition = CompanyName.q.namePcodeNf == soundexCode try: qr = [(q.id, {'name': q.name, 'country': q.countryCode}) for q in CompanyName.select(condition)] except NotFoundError as e: raise IMDbDataAccessError( 'unable to search the database: "%s"' % str(e)) qr[:] = [(x[0], build_company_name(x[1])) for x in qr] res = scan_company_names(qr, name, results) res[:] = [x[1] for x in res] # Purge empty country keys. returnl = [] for x in res: tmpd = x[1] country = tmpd.get('country') if country is None and 'country' in tmpd: del tmpd['country'] returnl.append((x[0], tmpd)) return returnl def get_company_main(self, companyID, results=0): # Every company information is retrieved from here. 
infosets = self.get_company_infoset() try: c = CompanyName.get(companyID) except NotFoundError as e: raise IMDbDataAccessError( 'unable to get companyID "%s": "%s"' % (companyID, e)) res = {'name': c.name, 'country': c.countryCode} if res['country'] is None: del res['country'] if not res: raise IMDbDataAccessError('unable to get companyID "%s"' % companyID) # Collect filmography information. items = MovieCompanies.select(MovieCompanies.q.companyID == companyID) if results > 0: items = items[:results] filmodata = [(cd.movieID, cd.companyID, self._compType[cd.companyTypeID], cd.note, get_movie_data(cd.movieID, self._kind)) for cd in items] filmodata = _groupListBy(filmodata, 2) for group in filmodata: ctype = group[0][2] for movieID, companyID, ctype, note, movieData in group: movie = Movie(data=movieData, movieID=movieID, notes=note or '', accessSystem=self.accessSystem) res.setdefault(ctype, []).append(movie) res.get(ctype, []).sort() return {'data': res, 'info sets': infosets} def _search_keyword(self, keyword, results): constr = OR(Keyword.q.phoneticCode == soundex(keyword), CONTAINSSTRING(Keyword.q.keyword, self.toUTF8(keyword))) return filterSimilarKeywords(keyword, _iterKeywords(Keyword.select(constr)))[:results] def _get_keyword(self, keyword, results): keyID = Keyword.select(Keyword.q.keyword == keyword) if keyID.count() == 0: return [] keyID = keyID[0].id movies = MovieKeyword.select(MovieKeyword.q.keywordID == keyID)[:results] return [(m.movieID, get_movie_data(m.movieID, self._kind)) for m in movies] def _get_top_bottom_movies(self, kind): if kind == 'top': kind = 'top 250 rank' elif kind == 'bottom': # Not a refuse: the plain text data files contains only # the bottom 10 movies. 
kind = 'bottom 10 rank' else: return [] infoID = InfoType.select(InfoType.q.info == kind) if infoID.count() == 0: return [] infoID = infoID[0].id movies = MovieInfo.select(MovieInfo.q.infoTypeID == infoID) ml = [] for m in movies: minfo = get_movie_data(m.movieID, self._kind) for k in kind, 'votes', 'rating', 'votes distribution': valueDict = getSingleInfo(MovieInfo, m.movieID, k, notAList=True) if k in (kind, 'votes') and k in valueDict: valueDict[k] = int(valueDict[k]) elif k == 'rating' and k in valueDict: valueDict[k] = float(valueDict[k]) minfo.update(valueDict) ml.append((m.movieID, minfo)) sorter = (_cmpBottom, _cmpTop)[kind == 'top 250 rank'] ml.sort(sorter) return ml def __del__(self): """Ensure that the connection is closed.""" # TODO: on Python 3, using mysql+pymysql, raises random exceptions; # for now, skip it and hope it's garbage-collected. return if not hasattr(self, '_connection'): return self._sql_logger.debug('closing connection to the database') try: self._connection.close() except: pass imdbpy-6.8/imdb/parser/sql/alchemyadapter.py000066400000000000000000000402641351454127000212260ustar00rootroot00000000000000# Copyright 2008-2017 Davide Alberani # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module adapts the SQLAlchemy ORM to the internal mechanism. 
""" from __future__ import absolute_import, division, print_function, unicode_literals import re import sys import logging from sqlalchemy import * from sqlalchemy import schema try: from sqlalchemy import exc # 0.5 except ImportError: from sqlalchemy import exceptions as exc # 0.4 _alchemy_logger = logging.getLogger('imdbpy.parser.sql.alchemy') from imdb._exceptions import IMDbDataAccessError from .dbschema import * # Used to convert table and column names. re_upper = re.compile(r'([A-Z])') # XXX: I'm not sure at all that this is the best method to connect # to the database and bind that connection to every table. metadata = MetaData() # Maps our placeholders to SQLAlchemy's column types. MAP_COLS = { INTCOL: Integer, UNICODECOL: UnicodeText, STRINGCOL: String } class NotFoundError(IMDbDataAccessError): """Exception raised when Table.get(id) returns no value.""" pass def _renameTable(tname): """Build the name of a table, as done by SQLObject.""" tname = re_upper.sub(r'_\1', tname) if tname.startswith('_'): tname = tname[1:] return tname.lower() def _renameColumn(cname): """Build the name of a column, as done by SQLObject.""" cname = cname.replace('ID', 'Id') return _renameTable(cname) class DNNameObj(object): """Used to access table.sqlmeta.columns[column].dbName (a string).""" def __init__(self, dbName): self.dbName = dbName def __repr__(self): return '' % (self.dbName, id(self)) class DNNameDict(object): """Used to access table.sqlmeta.columns (a dictionary).""" def __init__(self, colMap): self.colMap = colMap def __getitem__(self, key): return DNNameObj(self.colMap[key]) def __repr__(self): return '' % (self.colMap, id(self)) class SQLMetaAdapter(object): """Used to access table.sqlmeta (an object with .table, .columns and .idName attributes).""" def __init__(self, table, colMap=None): self.table = table if colMap is None: colMap = {} self.colMap = colMap def __getattr__(self, name): if name == 'table': return getattr(self.table, name) if name == 'columns': 
return DNNameDict(self.colMap) if name == 'idName': return self.colMap.get('id', 'id') return None def __repr__(self): return '<SQLMetaAdapter(table=%s, colMap=%s) [id=%s]>' % \ (repr(self.table), repr(self.colMap), id(self)) class QAdapter(object): """Used to access table.q attribute (remapped to SQLAlchemy table.c).""" def __init__(self, table, colMap=None): self.table = table if colMap is None: colMap = {} self.colMap = colMap def __getattr__(self, name): try: return getattr(self.table.c, self.colMap[name]) except KeyError: raise AttributeError("unable to get '%s'" % name) def __repr__(self): return '<QAdapter(table=%s, colMap=%s) [id=%s]>' % \ (repr(self.table), repr(self.colMap), id(self)) class RowAdapter(object): """Adapter for a SQLAlchemy RowProxy object.""" def __init__(self, row, table, colMap=None): self.row = row # FIXME: it's OBSCENE that 'table' should be passed from # TableAdapter through ResultAdapter only to land here, # where it's used to directly update a row item. self.table = table if colMap is None: colMap = {} self.colMap = colMap self.colMapKeys = list(colMap.keys()) def __getattr__(self, name): try: return getattr(self.row, self.colMap[name]) except KeyError: raise AttributeError("unable to get '%s'" % name) def __setattr__(self, name, value): # FIXME: I can't even think about how much performance suffers # because of this horrible hack (and it's used so rarely...) # For sure something like a "property" to map column names # to getter/setter functions would be much better, but it's # not possible (or at least not easy) to build them for a # single instance. if name in self.__dict__.get('colMapKeys', ()): # Trying to update a value in the database.
row = self.__dict__['row'] table = self.__dict__['table'] colMap = self.__dict__['colMap'] params = {colMap[name]: value} table.update(table.c.id == row.id).execute(**params) # XXX: minor bug: after a value is assigned with the # 'rowAdapterInstance.colName = value' syntax, for some # reason rowAdapterInstance.colName still returns the # previous value (even if the database is updated). # Fix it? I'm not even sure it's ever used. return # For every other attribute. object.__setattr__(self, name, value) def __repr__(self): return '<RowAdapter(row=%s, table=%s, colMap=%s) [id=%s]>' % \ (repr(self.row), repr(self.table), repr(self.colMap), id(self)) class ResultAdapter(object): """Adapter for a SQLAlchemy ResultProxy object.""" def __init__(self, result, table, colMap=None): self.result = result self.table = table if colMap is None: colMap = {} self.colMap = colMap def count(self): return len(self) def __len__(self): # FIXME: why does sqlite return -1? (that's wrong!) if self.result.rowcount == -1: return 0 return self.result.rowcount def __getitem__(self, key): rlist = list(self.result) res = rlist[key] if not isinstance(key, slice): # A single item. return RowAdapter(res, self.table, colMap=self.colMap) else: # A (possibly empty) list of items. return [RowAdapter(x, self.table, colMap=self.colMap) for x in res] def __iter__(self): for item in self.result: yield RowAdapter(item, self.table, colMap=self.colMap) def __repr__(self): return '<ResultAdapter(result=%s, table=%s, colMap=%s) [id=%s]>' % \ (repr(self.result), repr(self.table), repr(self.colMap), id(self)) class TableAdapter(object): """Adapter for a SQLAlchemy Table object, to mimic a SQLObject class.""" def __init__(self, table, uri=None): """Initialize a TableAdapter object.""" self._imdbpySchema = table self._imdbpyName = table.name self.connectionURI = uri self.colMap = {} columns = [] for col in table.cols: # Column's parameters.
params = {'nullable': True} params.update(col.params) if col.name == 'id': params['primary_key'] = True if 'notNone' in params: params['nullable'] = not params['notNone'] del params['notNone'] cname = _renameColumn(col.name) self.colMap[col.name] = cname colClass = MAP_COLS[col.kind] colKindParams = {} if 'length' in params: colKindParams['length'] = params['length'] del params['length'] colKind = colClass(**colKindParams) if 'alternateID' in params: # There's no need to handle them here. del params['alternateID'] # Create a column. colObj = Column(cname, colKind, **params) columns.append(colObj) self.tableName = _renameTable(table.name) # Create the table. self.table = Table(self.tableName, metadata, *columns) self._ta_insert = self.table.insert() self._ta_select = self.table.select # Adapters for special attributes. self.q = QAdapter(self.table, colMap=self.colMap) self.sqlmeta = SQLMetaAdapter(self.table, colMap=self.colMap) def select(self, conditions=None): """Return a list of results.""" result = self._ta_select(conditions).execute() return ResultAdapter(result, self.table, colMap=self.colMap) def get(self, theID): """Get an object given its ID.""" result = self.select(self.table.c.id == theID) # if not result: # raise NotFoundError, 'no data for ID %s' % theID # FIXME: isn't this a bit risky? We can't check len(result), # because sqlite returns -1... # What about converting it to a list and getting the first item? try: return result[0] except IndexError: raise NotFoundError('no data for ID %s' % theID) def dropTable(self, checkfirst=True): """Drop the table.""" dropParams = {'checkfirst': checkfirst} # Guess what? Another work-around for a ibm_db bug. if self.table.bind.engine.url.drivername.startswith('ibm_db'): del dropParams['checkfirst'] try: self.table.drop(**dropParams) except exc.ProgrammingError: # As above: re-raise the exception, but only if it's not ibm_db. 
if not self.table.bind.engine.url.drivername.startswith('ibm_db'): raise def createTable(self, checkfirst=True): """Create the table.""" self.table.create(checkfirst=checkfirst) # Create indexes for alternateID columns (other indexes will be # created later, at explicit request for performances reasons). for col in self._imdbpySchema.cols: if col.name == 'id': continue if col.params.get('alternateID', False): self._createIndex(col, checkfirst=checkfirst) def _createIndex(self, col, checkfirst=True): """Create an index for a given (schema) column.""" idx_name = '%s_%s' % (self.table.name, col.index or col.name) if checkfirst: for index in self.table.indexes: if index.name == idx_name: return index_args = {} if self.connectionURI.startswith('mysql'): if col.indexLen: index_args['mysql_length'] = col.indexLen elif col.kind in (UNICODECOL, STRINGCOL): index_args['mysql_length'] = min(5, col.params.get('length') or 5) idx = Index(idx_name, getattr(self.table.c, self.colMap[col.name]), **index_args) # XXX: beware that exc.OperationalError can be raised, is some # strange circumstances; that's why the index name doesn't # follow the SQLObject convention, but includes the table name: # sqlite, for example, expects index names to be unique at # db-level. 
try: idx.create() except exc.OperationalError as e: _alchemy_logger.warning('Skipping creation of the %s.%s index: %s' % (self.sqlmeta.table, col.name, e)) def addIndexes(self, ifNotExists=True): """Create all required indexes.""" for col in self._imdbpySchema.cols: if col.index: self._createIndex(col, checkfirst=ifNotExists) def __call__(self, *args, **kwds): """To insert a new row with the syntax: TableClass(key=value, ...)""" taArgs = {} for key, value in list(kwds.items()): taArgs[self.colMap.get(key, key)] = value self._ta_insert.execute(*args, **taArgs) def __repr__(self): return '<TableAdapter(table=%s) [id=%s]>' % (repr(self.table), id(self)) # Module-level "cache" for SQLObject classes, to prevent # "Table 'tableName' is already defined for this MetaData instance" errors, # when two or more connections to the database are made. # XXX: is this the best way to act? TABLES_REPOSITORY = {} def getDBTables(uri=None): """Return a list of TableAdapter objects to be used to access the database through the SQLAlchemy ORM. The connection uri is optional, and can be used to tailor the db schema to specific needs.""" DB_TABLES = [] for table in DB_SCHEMA: if table.name in TABLES_REPOSITORY: DB_TABLES.append(TABLES_REPOSITORY[table.name]) continue tableAdapter = TableAdapter(table, uri) DB_TABLES.append(tableAdapter) TABLES_REPOSITORY[table.name] = tableAdapter return DB_TABLES # Functions used to emulate SQLObject's logical operators. def AND(*params): """Emulate SQLObject's AND.""" return and_(*params) def OR(*params): """Emulate SQLObject's OR.""" return or_(*params) def IN(item, inList): """Emulate SQLObject's IN.""" if not isinstance(item, schema.Column): return OR(*[x == item for x in inList]) else: return item.in_(inList) def ISNULL(x): """Emulate SQLObject's ISNULL.""" # XXX: Should we use null()? Can null() be a global instance? # XXX: Is it safe to test None with the == operator, in this case?
return x == None def ISNOTNULL(x): """Emulate SQLObject's ISNOTNULL.""" return x != None def CONTAINSSTRING(expr, pattern): """Emulate SQLObject's CONTAINSSTRING.""" return expr.like('%%%s%%' % pattern) def toUTF8(s): """For some strange reason, sometimes SQLObject wants utf8 strings instead of unicode; with SQLAlchemy we just return the unicode text.""" return s class _AlchemyConnection(object): """A proxy for the connection object, required since _ConnectionFairy uses __slots__.""" def __init__(self, conn): self.conn = conn def __getattr__(self, name): return getattr(self.conn, name) def setConnection(uri, tables, encoding='utf8', debug=False): """Set connection for every table.""" params = {'encoding': encoding} # FIXME: why on earth MySQL requires an additional parameter, # is well beyond my understanding... if uri.startswith('mysql'): if '?' in uri: uri += '&' else: uri += '?' uri += 'charset=%s' % encoding if debug: params['echo'] = True if uri.startswith('ibm_db'): # Try to work-around a possible bug of the ibm_db DB2 driver. params['convert_unicode'] = True # XXX: is this the best way to connect? engine = create_engine(uri, **params) metadata.bind = engine eng_conn = engine.connect() if uri.startswith('sqlite'): major = sys.version_info[0] minor = sys.version_info[1] if major > 2 or (major == 2 and minor > 5): eng_conn.connection.connection.text_factory = str # XXX: OH MY, THAT'S A MESS! # We need to return a "connection" object, with the .dbName # attribute set to the db engine name (e.g. "mysql"), .paramstyle # set to the style of the parameters for query() calls, and the # .module attribute set to a module (?) with .OperationalError and # .IntegrityError attributes. # Another attribute of "connection" is the getConnection() function, # used to return an object with a .cursor() method.
connection = _AlchemyConnection(eng_conn.connection) paramstyle = eng_conn.dialect.paramstyle connection.module = eng_conn.dialect.dbapi connection.paramstyle = paramstyle connection.getConnection = lambda: connection.connection connection.dbName = engine.url.drivername return connection imdbpy-6.8/imdb/parser/sql/dbschema.py000066400000000000000000000455631351454127000200200ustar00rootroot00000000000000# Copyright 2005-2017 Davide Alberani # 2006 Giuseppe "Cowo" Corbelli lugbs.linux.it> # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA """ This module provides the schema used to describe the layout of the database used by the :mod:`imdb.parser.sql` package; functions to create/drop tables and indexes are also provided. """ from __future__ import absolute_import, division, print_function, unicode_literals import logging _dbschema_logger = logging.getLogger('imdbpy.parser.sql.dbschema') # Placeholders for column types. INTCOL = 1 UNICODECOL = 2 STRINGCOL = 3 _strMap = {1: 'INTCOL', 2: 'UNICODECOL', 3: 'STRINGCOL'} class DBCol(object): """Define column objects.""" def __init__(self, name, kind, **params): self.name = name self.kind = kind self.index = None self.indexLen = None # If not None, two notations are accepted: 'TableName' # and 'TableName.ColName'; in the first case, 'id' is assumed # as the name of the pointed column. 
self.foreignKey = None if 'index' in params: self.index = params['index'] del params['index'] if 'indexLen' in params: self.indexLen = params['indexLen'] del params['indexLen'] if 'foreignKey' in params: self.foreignKey = params['foreignKey'] del params['foreignKey'] self.params = params def __str__(self): """Class representation.""" s = '' % ( self.name, len(self.cols), sum([len(v) for v in list(self.values.values())]) ) def __repr__(self): """Class representation.""" s = '').lstrip('<') for col in self.cols]) if col_s: s += ', %s' % col_s if self.values: s += ', values=%s' % self.values s += ')>' return s # Default values to insert in some tables: {'column': (list, of, values, ...)} kindTypeDefs = { 'kind': ( 'movie', 'tv series', 'tv movie', 'video movie', 'tv mini series', 'video game', 'episode', 'short', 'tv short' ) } companyTypeDefs = { 'kind': ( 'distributors', 'production companies', 'special effects companies', 'miscellaneous companies' ) } infoTypeDefs = { 'info': ( 'runtimes', 'color info', 'genres', 'languages', 'certificates', 'sound mix', 'tech info', 'countries', 'taglines', 'keywords', 'alternate versions', 'crazy credits', 'goofs', 'soundtrack', 'quotes', 'release dates', 'trivia', 'locations', 'mini biography', 'birth notes', 'birth date', 'height', 'death date', 'spouse', 'other works', 'birth name', 'salary history', 'nick names', 'books', 'agent address', 'biographical movies', 'portrayed in', 'where now', 'trade mark', 'interviews', 'article', 'magazine cover photo', 'pictorial', 'death notes', 'LD disc format', 'LD year', 'LD digital sound', 'LD official retail price', 'LD frequency response', 'LD pressing plant', 'LD length', 'LD language', 'LD review', 'LD spaciality', 'LD release date', 'LD production country', 'LD contrast', 'LD color rendition', 'LD picture format', 'LD video noise', 'LD video artifacts', 'LD release country', 'LD sharpness', 'LD dynamic range', 'LD audio noise', 'LD color information', 'LD group genre', 'LD quality 
program', 'LD close captions-teletext-ld-g', 'LD category', 'LD analog left', 'LD certification', 'LD audio quality', 'LD video quality', 'LD aspect ratio', 'LD analog right', 'LD additional information', 'LD number of chapter stops', 'LD dialogue intellegibility', 'LD disc size', 'LD master format', 'LD subtitles', 'LD status of availablility', 'LD quality of source', 'LD number of sides', 'LD video standard', 'LD supplement', 'LD original title', 'LD sound encoding', 'LD number', 'LD label', 'LD catalog number', 'LD laserdisc title', 'screenplay-teleplay', 'novel', 'adaption', 'book', 'production process protocol', 'printed media reviews', 'essays', 'other literature', 'mpaa', 'plot', 'votes distribution', 'votes', 'rating', 'production dates', 'copyright holder', 'filming dates', 'budget', 'weekend gross', 'gross', 'opening weekend', 'rentals', 'admissions', 'studios', 'top 250 rank', 'bottom 10 rank' ) } compCastTypeDefs = { 'kind': ('cast', 'crew', 'complete', 'complete+verified') } linkTypeDefs = { 'link': ( 'follows', 'followed by', 'remake of', 'remade as', 'references', 'referenced in', 'spoofs', 'spoofed in', 'features', 'featured in', 'spin off from', 'spin off', 'version of', 'similar to', 'edited into', 'edited from', 'alternate language version of', 'unknown link' ) } roleTypeDefs = { 'role': ( 'actor', 'actress', 'producer', 'writer', 'cinematographer', 'composer', 'costume designer', 'director', 'editor', 'miscellaneous crew', 'production designer', 'guest' ) } # Schema of tables in our database. # XXX: Foreign keys can be used to create constrains between tables, # but they create indexes in the database, and this # means poor performances at insert-time. DB_SCHEMA = [ DBTable('Name', # namePcodeCf is the soundex of the name in the canonical format. # namePcodeNf is the soundex of the name in the normal format, if # different from namePcodeCf. # surnamePcode is the soundex of the surname, if different from the # other two values. 
# The 'id' column is simply skipped by SQLObject (it's a default); # the alternateID attribute here will be ignored by SQLAlchemy. DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('name', UNICODECOL, notNone=True, index='idx_name', indexLen=6), DBCol('imdbIndex', STRINGCOL, length=12, default=None), DBCol('imdbID', INTCOL, default=None, index='idx_imdb_id'), DBCol('gender', STRINGCOL, length=1, default=None, index='idx_gender'), DBCol('namePcodeCf', STRINGCOL, length=5, default=None, index='idx_pcodecf'), DBCol('namePcodeNf', STRINGCOL, length=5, default=None, index='idx_pcodenf'), DBCol('surnamePcode', STRINGCOL, length=5, default=None, index='idx_pcode'), DBCol('md5sum', STRINGCOL, length=32, default=None, index='idx_md5')), DBTable('CharName', # namePcodeNf is the soundex of the name in the normal format. # surnamePcode is the soundex of the surname, if different # from namePcodeNf. DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('name', UNICODECOL, notNone=True, index='idx_name', indexLen=6), DBCol('imdbIndex', STRINGCOL, length=12, default=None), DBCol('imdbID', INTCOL, default=None, index='idx_imdb_id'), DBCol('namePcodeNf', STRINGCOL, length=5, default=None, index='idx_pcodenf'), DBCol('surnamePcode', STRINGCOL, length=5, default=None, index='idx_pcode'), DBCol('md5sum', STRINGCOL, length=32, default=None, index='idx_md5')), DBTable('CompanyName', # namePcodeNf is the soundex of the name in the normal format. # namePcodeSf is the soundex of the name plus the country code. 
DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('name', UNICODECOL, notNone=True, index='idx_name', indexLen=6), DBCol('countryCode', STRINGCOL, length=255, default=None, index='idx_ccode'), DBCol('imdbID', INTCOL, default=None, index='idx_imdb_id'), DBCol('namePcodeNf', STRINGCOL, length=5, default=None, index='idx_pcodenf'), DBCol('namePcodeSf', STRINGCOL, length=5, default=None, index='idx_pcodesf'), DBCol('md5sum', STRINGCOL, length=32, default=None, index='idx_md5')), DBTable('KindType', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('kind', STRINGCOL, length=15, default=None, alternateID=True), values=kindTypeDefs), DBTable('Title', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('title', UNICODECOL, notNone=True, index='idx_title', indexLen=10), DBCol('imdbIndex', STRINGCOL, length=12, default=None), DBCol('kindID', INTCOL, notNone=True, index='idx_kindid'), DBCol('productionYear', INTCOL, default=None, index='idx_year'), DBCol('imdbID', INTCOL, default=None, index="idx_imdb_id"), DBCol('phoneticCode', STRINGCOL, length=5, default=None, index='idx_pcode'), DBCol('episodeOfID', INTCOL, default=None, index='idx_epof'), DBCol('seasonNr', INTCOL, default=None, index="idx_season_nr"), DBCol('episodeNr', INTCOL, default=None, index="idx_episode_nr"), # Maximum observed length is 44; 49 can store 5 comma-separated # year-year pairs. 
DBCol('seriesYears', STRINGCOL, length=49, default=None), DBCol('md5sum', STRINGCOL, length=32, default=None, index='idx_md5')), DBTable('CompanyType', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('kind', STRINGCOL, length=32, default=None, alternateID=True), values=companyTypeDefs), DBTable('AkaName', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('personID', INTCOL, notNone=True, index='idx_person'), DBCol('name', UNICODECOL, notNone=True, index='idx_name', indexLen=6), DBCol('imdbIndex', STRINGCOL, length=12, default=None), DBCol('namePcodeCf', STRINGCOL, length=5, default=None, index='idx_pcodecf'), DBCol('namePcodeNf', STRINGCOL, length=5, default=None, index='idx_pcodenf'), DBCol('surnamePcode', STRINGCOL, length=5, default=None, index='idx_pcode'), DBCol('md5sum', STRINGCOL, length=32, default=None, index='idx_md5')), DBTable('AkaTitle', # XXX: It's safer to set notNone to False, here. # alias for akas are stored completely in the AkaTitle table; # this means that episodes will set also a "tv series" alias name. # Reading the aka-title.list file it looks like there are # episode titles with aliases to different titles for both # the episode and the series title, while for just the series # there are no aliases. 
# E.g.: # aka title original title # "Series, The" (2005) {The Episode} "Other Title" (2005) {Other Title} # But there is no: # "Series, The" (2005) "Other Title" (2005) DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('movieID', INTCOL, notNone=True, index='idx_movieid'), DBCol('title', UNICODECOL, notNone=True, index='idx_title', indexLen=10), DBCol('imdbIndex', STRINGCOL, length=12, default=None), DBCol('kindID', INTCOL, notNone=True, index='idx_kindid'), DBCol('productionYear', INTCOL, default=None, index='idx_year'), DBCol('phoneticCode', STRINGCOL, length=5, default=None, index='idx_pcode'), DBCol('episodeOfID', INTCOL, default=None, index='idx_epof'), DBCol('seasonNr', INTCOL, default=None), DBCol('episodeNr', INTCOL, default=None), DBCol('note', UNICODECOL, default=None), DBCol('md5sum', STRINGCOL, length=32, default=None, index='idx_md5')), DBTable('RoleType', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('role', STRINGCOL, length=32, notNone=True, alternateID=True), values=roleTypeDefs), DBTable('CastInfo', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('personID', INTCOL, notNone=True, index='idx_pid'), DBCol('movieID', INTCOL, notNone=True, index='idx_mid'), DBCol('personRoleID', INTCOL, default=None, index='idx_cid'), DBCol('note', UNICODECOL, default=None), DBCol('nrOrder', INTCOL, default=None), DBCol('roleID', INTCOL, notNone=True, index='idx_rid')), DBTable('CompCastType', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('kind', STRINGCOL, length=32, notNone=True, alternateID=True), values=compCastTypeDefs), DBTable('CompleteCast', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('movieID', INTCOL, index='idx_mid'), DBCol('subjectID', INTCOL, notNone=True, index='idx_sid'), DBCol('statusID', INTCOL, notNone=True)), DBTable('InfoType', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('info', STRINGCOL, length=32, notNone=True, alternateID=True), values=infoTypeDefs), 
DBTable('LinkType', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('link', STRINGCOL, length=32, notNone=True, alternateID=True), values=linkTypeDefs), DBTable('Keyword', DBCol('id', INTCOL, notNone=True, alternateID=True), # XXX: can't use alternateID=True, because it would create # a UNIQUE index; unfortunately (at least with a common # collation like utf8_unicode_ci) MySQL will consider # some different keywords identical - like # "fiancée" and "fiancee". DBCol('keyword', UNICODECOL, notNone=True, index='idx_keyword', indexLen=5), DBCol('phoneticCode', STRINGCOL, length=5, default=None, index='idx_pcode')), DBTable('MovieKeyword', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('movieID', INTCOL, notNone=True, index='idx_mid'), DBCol('keywordID', INTCOL, notNone=True, index='idx_keywordid')), DBTable('MovieLink', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('movieID', INTCOL, notNone=True, index='idx_mid'), DBCol('linkedMovieID', INTCOL, notNone=True, index='idx_lmid'), DBCol('linkTypeID', INTCOL, notNone=True, index='idx_ltypeid')), DBTable('MovieInfo', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('movieID', INTCOL, notNone=True, index='idx_mid'), DBCol('infoTypeID', INTCOL, notNone=True, index='idx_infotypeid'), DBCol('info', UNICODECOL, notNone=True, index='idx_info', indexLen=10), DBCol('note', UNICODECOL, default=None)), DBTable('MovieCompanies', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('movieID', INTCOL, notNone=True, index='idx_mid'), DBCol('companyID', INTCOL, notNone=True, index='idx_cid'), DBCol('companyTypeID', INTCOL, notNone=True, index='idx_ctypeid'), DBCol('note', UNICODECOL, default=None)), DBTable('PersonInfo', DBCol('id', INTCOL, notNone=True, alternateID=True), DBCol('personID', INTCOL, notNone=True, index='idx_pid'), DBCol('infoTypeID', INTCOL, notNone=True, index='idx_itypeid'), DBCol('info', UNICODECOL, notNone=True), DBCol('note', UNICODECOL, default=None)) ] # Functions 
to manage tables. def dropTables(tables, ifExists=True): """Drop the tables.""" # In reverse order (useful to avoid errors about foreign keys). DB_TABLES_DROP = list(tables) DB_TABLES_DROP.reverse() for table in DB_TABLES_DROP: _dbschema_logger.info('dropping table %s', table._imdbpyName) table.dropTable(ifExists) def createTables(tables, ifNotExists=True): """Create the tables and insert default values.""" for table in tables: # Create the table. _dbschema_logger.info('creating table %s', table._imdbpyName) table.createTable(ifNotExists) # Insert default values, if any. if table._imdbpySchema.values: _dbschema_logger.info('inserting values into table %s', table._imdbpyName) for key in table._imdbpySchema.values: for value in table._imdbpySchema.values[key]: table(**{key: str(value)}) def createIndexes(tables, ifNotExists=True): """Create the indexes in the database. Return a list of errors, if any.""" errors = [] for table in tables: _dbschema_logger.info('creating indexes for table %s', table._imdbpyName) try: table.addIndexes(ifNotExists) except Exception as e: errors.append(e) continue return errors imdbpy-6.8/imdb/utils.py000066400000000000000000001656171351454127000153220ustar00rootroot00000000000000# Copyright 2004-2019 Davide Alberani # 2009 H. Turgut Uyar # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

"""
This module provides basic utilities for the imdb package.
"""

from __future__ import absolute_import, division, print_function, unicode_literals

import logging
import re
import string
import sys
from copy import copy, deepcopy
from functools import total_ordering
from time import strftime, strptime

from imdb import VERSION
from imdb import linguistics
from imdb._exceptions import IMDbParserError


PY2 = sys.hexversion < 0x3000000

# Logger for imdb.utils module.
_utils_logger = logging.getLogger('imdbpy.utils')

# The regular expression for the "long" year format of IMDb, like
# "(1998)" and "(1986/II)", where the optional roman number (that I call
# "imdbIndex") after the slash is used for movies with the same title
# and year of release.
# XXX: probably L, C, D and M are far too much! ;-)
re_year_index = re.compile(r'\(([0-9\?]{4}(/[IVXLCDM]+)?)\)')
re_m_episode = re.compile(r'\(TV Episode\)\s+-\s+', re.I)
re_m_series = re.compile(r'Season\s+(\d+)\s+\|\s+Episode\s+(\d+)\s+-', re.I)
re_m_imdbIndex = re.compile(r'\(([IVXLCDM]+)\)')
re_m_kind = re.compile(
    r'\((TV episode|TV Series|TV mini-series|mini|TV|Video|Video Game|VG|Short|TV Movie|TV Short|V)\)',
    re.I
)

KIND_MAP = {
    'tv': 'tv movie',
    'tv episode': 'episode',
    'v': 'video movie',
    'video': 'video movie',
    'vg': 'video game',
    'mini': 'tv mini series',
    'tv mini-series': 'tv mini series'
}

# Match only the imdbIndex (for name strings).
re_index = re.compile(r'^\(([IVXLCDM]+)\)$')

# Match things inside parentheses.
re_parentheses = re.compile(r'(\(.*\))')

# Match the number of episodes.
re_episodes = re.compile(r'\s?\((\d+) episodes\)', re.I)
re_episode_info = re.compile(
    r'{\s*(.+?)?\s?(\([0-9\?]{4}-[0-9\?]{1,2}-[0-9\?]{1,2}\))?\s?(\(#[0-9]+\.[0-9]+\))?}'
)

# Common suffixes in surnames.
_sname_suffixes = ('de', 'la', 'der', 'den', 'del', 'y', 'da', 'van', 'e', 'von', 'the', 'di', 'du', 'el', 'al') def canonicalName(name): """Return the given name in canonical "Surname, Name" format. It assumes that name is in the 'Name Surname' format.""" # XXX: some statistics (as of 17 Apr 2008, over 2288622 names): # - just a surname: 69476 # - single surname, single name: 2209656 # - composed surname, composed name: 9490 # - composed surname, single name: 67606 # (2: 59764, 3: 6862, 4: 728) # - single surname, composed name: 242310 # (2: 229467, 3: 9901, 4: 2041, 5: 630) # - Jr.: 8025 # Don't convert names already in the canonical format. if name.find(', ') != -1: return name joiner = '%s, %s' sur_joiner = '%s %s' sur_space = ' %s' space = ' ' sname = name.split(' ') snl = len(sname) if snl == 2: # Just a name and a surname: how boring... name = joiner % (sname[1], sname[0]) elif snl > 2: lsname = [x.lower() for x in sname] if snl == 3: _indexes = (0, snl - 2) else: _indexes = (0, snl - 2, snl - 3) # Check for common surname prefixes at the beginning and near the end. for index in _indexes: if lsname[index] not in _sname_suffixes: continue try: # Build the surname. surn = sur_joiner % (sname[index], sname[index + 1]) del sname[index] del sname[index] try: # Handle the "Jr." after the name. if lsname[index + 2].startswith('jr'): surn += sur_space % sname[index] del sname[index] except (IndexError, ValueError): pass name = joiner % (surn, space.join(sname)) break except ValueError: continue else: name = joiner % (sname[-1], space.join(sname[:-1])) return name def normalizeName(name): """Return a name in the normal "Name Surname" format.""" joiner = '%s %s' sname = name.split(', ') if len(sname) == 2: name = joiner % (sname[1], sname[0]) return name def analyze_name(name, canonical=None): """Return a dictionary with the name and the optional imdbIndex keys, from the given string. If canonical is None (default), the name is stored in its own style. 
If canonical is True, the name is converted to canonical style. If canonical is False, the name is converted to normal format. raise an IMDbParserError exception if the name is not valid. """ original_n = name name = name.split(' aka ')[0].strip() res = {} imdbIndex = '' opi = name.rfind('(') cpi = name.rfind(')') # Strip notes (but not if the name starts with a parenthesis). if opi not in (-1, 0) and cpi > opi: if re_index.match(name[opi:cpi + 1]): imdbIndex = name[opi + 1:cpi] name = name[:opi].rstrip() else: # XXX: for the birth and death dates case like " (1926-2004)" name = re_parentheses.sub('', name).strip() if not name: raise IMDbParserError('invalid name: "%s"' % original_n) if canonical is not None: if canonical: name = canonicalName(name) else: name = normalizeName(name) res['name'] = name if imdbIndex: res['imdbIndex'] = imdbIndex return res def build_name(name_dict, canonical=None): """Given a dictionary that represents a "long" IMDb name, return a string. If canonical is None (default), the name is returned in the stored style. If canonical is True, the name is converted to canonical style. If canonical is False, the name is converted to normal format. """ name = name_dict.get('canonical name') or name_dict.get('name', '') if not name: return '' if canonical is not None: if canonical: name = canonicalName(name) else: name = normalizeName(name) imdbIndex = name_dict.get('imdbIndex') if imdbIndex: name += ' (%s)' % imdbIndex return name # XXX: here only for backward compatibility. Find and remove any dependency. _unicodeArticles = linguistics.GENERIC_ARTICLES _articles = linguistics.toUTF8(_unicodeArticles) articlesDicts = linguistics.articlesDictsForLang(None) spArticles = linguistics.spArticlesForLang(None) def canonicalTitle(title, lang=None, imdbIndex=None): """Return the title in the canonic format 'Movie Title, The'; beware that it doesn't handle long imdb titles. The 'lang' argument can be used to specify the language of the title. 
""" isUnicode = isinstance(title, str) articlesDicts = linguistics.articlesDictsForLang(lang) try: if title.split(', ')[-1].lower() in articlesDicts[isUnicode]: return title except IndexError: pass _format = '%s%s, %s' ltitle = title.lower() if imdbIndex: imdbIndex = ' (%s)' % imdbIndex else: imdbIndex = '' spArticles = linguistics.spArticlesForLang(lang) for article in spArticles[isUnicode]: if ltitle.startswith(article): lart = len(article) title = _format % (title[lart:], imdbIndex, title[:lart]) if article[-1] == ' ': title = title[:-1] break return title def normalizeTitle(title, lang=None): """Return the title in the normal "The Title" format; beware that it doesn't handle long imdb titles, but only the title portion, without year[/imdbIndex] or special markup. The 'lang' argument can be used to specify the language of the title. """ isUnicode = isinstance(title, str) stitle = title.split(', ') articlesDicts = linguistics.articlesDictsForLang(lang) if len(stitle) > 1 and stitle[-1].lower() in articlesDicts[isUnicode]: sep = ' ' if stitle[-1][-1] in ("'", '-'): sep = '' _format = '%s%s%s' _joiner = ', ' title = _format % (stitle[-1], sep, _joiner.join(stitle[:-1])) return title def _split_series_episode(title): """Return the series and the episode titles; if this is not a series' episode, the returned series title is empty. This function recognize two different styles: "The Series" An Episode (2005) "The Series" (2004) {An Episode (2005) (#season.episode)}""" series_title = '' episode_or_year = '' if title[-1:] == '}': # Title of the episode, as in the plain text data files. begin_eps = title.rfind('{') if begin_eps == -1: return '', '' series_title = title[:begin_eps].rstrip() # episode_or_year is returned with the {...} episode_or_year = title[begin_eps:].strip() if episode_or_year[:12] == '{SUSPENDED}}': return '', '' # XXX: works only with tv series; it's still unclear whether # IMDb will support episodes for tv mini series and tv movies... 
elif title[0:1] == '"': second_quot = title[1:].find('"') + 2 if second_quot != 1: # a second " was found. episode_or_year = title[second_quot:].lstrip() first_char = episode_or_year[0:1] if not first_char: return '', '' if first_char != '(': # There is not a (year) but the title of the episode; # that means this is an episode title, as returned by # the web server. series_title = title[:second_quot] return series_title, episode_or_year def is_series_episode(title): """Return True if 'title' is an series episode.""" return bool(_split_series_episode(title.strip())[0]) def analyze_title(title, canonical=None, canonicalSeries=None, canonicalEpisode=None): """Analyze the given title and return a dictionary with the "stripped" title, the kind of the show ("movie", "tv series", etc.), the year of production and the optional imdbIndex (a roman number used to distinguish between movies with the same title and year). If canonical is None (default), the title is stored in its own style. If canonical is True, the title is converted to canonical style. If canonical is False, the title is converted to normal format. raise an IMDbParserError exception if the title is not valid. """ # XXX: introduce the 'lang' argument? if canonical is not None: canonicalSeries = canonicalEpisode = canonical original_t = title result = {} title = title.split(' aka ')[0].strip() year = '' kind = '' imdbIndex = '' series_title, episode_or_year = _split_series_episode(title) if series_title: # It's an episode of a series. series_d = analyze_title(series_title, canonical=canonicalSeries) oad = sen = ep_year = '' # Plain text data files format. 
if episode_or_year[0:1] == '{' and episode_or_year[-1:] == '}': match = re_episode_info.findall(episode_or_year) if match: # Episode title, original air date and #season.episode episode_or_year, oad, sen = match[0] episode_or_year = episode_or_year.strip() if not oad: # No year, but the title is something like (2005-04-12) if episode_or_year and episode_or_year[0] == '(' and \ episode_or_year[-1:] == ')' and \ episode_or_year[1:2] != '#': oad = episode_or_year if oad[1:5] and oad[5:6] == '-': try: ep_year = int(oad[1:5]) except (TypeError, ValueError): pass if not oad and not sen and episode_or_year.startswith('(#'): sen = episode_or_year elif episode_or_year.startswith('Episode dated'): oad = episode_or_year[14:] if oad[-4:].isdigit(): try: ep_year = int(oad[-4:]) except (TypeError, ValueError): pass episode_d = analyze_title(episode_or_year, canonical=canonicalEpisode) episode_d['kind'] = 'episode' episode_d['episode of'] = series_d if oad: episode_d['original air date'] = oad[1:-1] if ep_year and episode_d.get('year') is None: episode_d['year'] = ep_year if sen and sen[2:-1].find('.') != -1: seas, epn = sen[2:-1].split('.') if seas: # Set season and episode. try: seas = int(seas) except ValueError: pass try: epn = int(epn) except ValueError: pass episode_d['season'] = seas if epn: episode_d['episode'] = epn return episode_d # First of all, search for the kind of show. # XXX: Number of entries at 17 Apr 2008: # movie: 379,871 # episode: 483,832 # tv movie: 61,119 # tv series: 44,795 # video movie: 57,915 # tv mini series: 5,497 # video game: 5,490 # More up-to-date statistics: http://us.imdb.com/database_statistics epindex = re_m_episode.search(title) if epindex: # It's an episode of a series. 
kind = 'episode' series_title = title[epindex.end():] season_episode_match = re_m_series.match(series_title) if season_episode_match: result['season'] = int(season_episode_match.groups()[0]) result['episode'] = int(season_episode_match.groups()[1]) series_title = re_m_series.sub('', series_title) series_info = analyze_title(series_title) result['episode of'] = series_info.get('title') result['series year'] = series_info.get('year') title = title[:epindex.start()].strip() else: detected_kind = re_m_kind.findall(title) if detected_kind: kind = detected_kind[-1].lower().replace('-', '') kind = KIND_MAP.get(kind, kind) title = re_m_kind.sub('', title).strip() # Search for the year and the optional imdbIndex (a roman number). yi = re_year_index.findall(title) if yi: last_yi = yi[-1] year = last_yi[0] if last_yi[1]: imdbIndex = last_yi[1][1:] year = year[:-len(imdbIndex) - 1] i = title.rfind('(%s)' % last_yi[0]) if i != -1: title = title[:i - 1].rstrip() if not imdbIndex: detect_imdbIndex = re_m_imdbIndex.findall(title) if detect_imdbIndex: imdbIndex = detect_imdbIndex[-1] title = re_m_imdbIndex.sub('', title).strip() # This is a tv (mini) series: strip the '"' at the begin and at the end. # XXX: strip('"') is not used for compatibility with Python 2.0. 
    if title and title[0] == title[-1] == '"':
        if not kind:
            kind = 'tv series'
        title = title[1:-1].strip()
    if not title:
        raise IMDbParserError('invalid title: "%s"' % original_t)
    if canonical is not None:
        if canonical:
            title = canonicalTitle(title)
        else:
            title = normalizeTitle(title)
    result['title'] = title
    result['kind'] = kind or 'movie'
    if year and year != '????':
        if '-' in year:
            result['series years'] = year
            year = year[:4]
        try:
            result['year'] = int(year)
        except (TypeError, ValueError):
            pass
    if imdbIndex:
        result['imdbIndex'] = imdbIndex
    return result


_web_format = '%d %B %Y'
_ptdf_format = '(%Y-%m-%d)'


def _convertTime(title, fromPTDFtoWEB=True):
    """Convert a time expressed in the plain text data files, to
    the 'Episode dated ...' format used on the web site; if
    fromPTDFtoWEB is false, the inverse conversion is applied."""
    try:
        if fromPTDFtoWEB:
            from_format = _ptdf_format
            to_format = _web_format
        else:
            from_format = 'Episode dated %s' % _web_format
            to_format = _ptdf_format
        t = strptime(title, from_format)
        title = strftime(to_format, t)
        if fromPTDFtoWEB:
            if title[0] == '0':
                title = title[1:]
            title = 'Episode dated %s' % title
    except ValueError:
        pass
    return title


def build_title(title_dict, canonical=None, canonicalSeries=None,
                canonicalEpisode=None, ptdf=False, lang=None, _doYear=True, appendKind=True):
    """Given a dictionary that represents a "long" IMDb title,
    return a string.

    If canonical is None (default), the title is returned in the stored style.
    If canonical is True, the title is converted to canonical style.
    If canonical is False, the title is converted to normal format.

    lang can be used to specify the language of the title.
    If ptdf is true, the plain text data files format is used.
    """
    if canonical is not None:
        canonicalSeries = canonical
    pre_title = ''
    kind = title_dict.get('kind')
    episode_of = title_dict.get('episode of')
    if kind == 'episode' and episode_of is not None:
        # Works with both Movie instances and plain dictionaries.
        doYear = 0
        if ptdf:
            doYear = 1
        # XXX: for results coming from the new search page.
        if not isinstance(episode_of, (dict, _Container)):
            episode_of = {'title': episode_of, 'kind': 'tv series'}
            if 'series year' in title_dict:
                episode_of['year'] = title_dict['series year']
        pre_title = build_title(episode_of, canonical=canonicalSeries,
                                ptdf=False, _doYear=doYear)
        ep_dict = {'title': title_dict.get('title', ''),
                   'imdbIndex': title_dict.get('imdbIndex')}
        ep_title = ep_dict['title']
        if not ptdf:
            doYear = 1
            ep_dict['year'] = title_dict.get('year', '????')
            if ep_title[0:1] == '(' and ep_title[-1:] == ')' and \
                    ep_title[1:5].isdigit():
                ep_dict['title'] = _convertTime(ep_title, fromPTDFtoWEB=True)
        else:
            doYear = 0
            if ep_title.startswith('Episode dated'):
                ep_dict['title'] = _convertTime(ep_title, fromPTDFtoWEB=False)
        episode_title = build_title(ep_dict,
                                    canonical=canonicalEpisode, ptdf=ptdf,
                                    _doYear=doYear)
        if ptdf:
            oad = title_dict.get('original air date', '')
            if len(oad) == 10 and oad[4] == '-' and oad[7] == '-' and \
                    episode_title.find(oad) == -1:
                episode_title += ' (%s)' % oad
            seas = title_dict.get('season')
            if seas is not None:
                episode_title += ' (#%s' % seas
                episode = title_dict.get('episode')
                if episode is not None:
                    episode_title += '.%s' % episode
                episode_title += ')'
            episode_title = '{%s}' % episode_title
        return '%s %s' % (pre_title, episode_title)
    title = title_dict.get('title', '')
    imdbIndex = title_dict.get('imdbIndex', '')
    if not title:
        return ''
    if canonical is not None:
        if canonical:
            title = canonicalTitle(title, lang=lang, imdbIndex=imdbIndex)
        else:
            title = normalizeTitle(title, lang=lang)
    if pre_title:
        title = '%s %s' % (pre_title, title)
    if kind in ('tv series', 'tv mini series'):
        title = '"%s"' % title
    if _doYear:
        # Fall back to '????' when the year is missing: str(None) is the
        # truthy string 'None', so the fallback must be applied before str().
        year = str(title_dict.get('year') or '????')
        imdbIndex = title_dict.get('imdbIndex')
        if not ptdf:
            if imdbIndex and (canonical is None or canonical):
                title += ' (%s)' % imdbIndex
            title += ' (%s)' % year
        else:
            title += ' (%s' % year
            if imdbIndex and (canonical is None or canonical):
                title += '/%s' % imdbIndex
            title += ')'
    if appendKind and kind:
        if kind == 'tv movie':
            title += ' (TV)'
        elif kind == 'video movie':
            title += ' (V)'
        elif kind == 'tv mini series':
            title += ' (mini)'
        elif kind == 'video game':
            title += ' (VG)'
    return title


def split_company_name_notes(name):
    """Return two strings, the first representing the company name,
    and the other representing the (optional) notes."""
    name = name.strip()
    notes = ''
    if name.endswith(')'):
        fpidx = name.find('(')
        if fpidx != -1:
            notes = name[fpidx:]
            name = name[:fpidx].rstrip()
    return name, notes


def analyze_company_name(name, stripNotes=False):
    """Return a dictionary with the name and the optional 'country'
    keys, from the given string.
    If stripNotes is true, tries to not consider optional notes.

    Raise an IMDbParserError exception if the name is not valid.
    """
    if stripNotes:
        name = split_company_name_notes(name)[0]
    o_name = name
    name = name.strip()
    country = None
    if name.startswith('['):
        name = re.sub(r'[!@#$\(\)\[\]]', '', name)
    else:
        if name.endswith(']'):
            idx = name.rfind('[')
            if idx != -1:
                country = name[idx:]
                name = name[:idx].rstrip()
    if not name:
        raise IMDbParserError('invalid name: "%s"' % o_name)
    result = {'name': name}
    if country:
        result['country'] = country
    return result


def build_company_name(name_dict):
    """Given a dictionary that represents a "long" IMDb company name,
    return a string.
""" name = name_dict.get('name') if not name: return '' country = name_dict.get('country') if country is not None: name += ' %s' % country return name @total_ordering class _LastC: """Size matters.""" def __lt__(self, other): return False def __eq__(self, other): return not isinstance(other, self.__class__) _last = _LastC() def cmpMovies(m1, m2): """Compare two movies by year, in reverse order; the imdbIndex is checked for movies with the same year of production and title.""" # Sort tv series' episodes. m1e = m1.get('episode of') m2e = m2.get('episode of') if m1e is not None and m2e is not None: cmp_series = cmpMovies(m1e, m2e) if cmp_series != 0: return cmp_series m1s = m1.get('season') m2s = m2.get('season') if m1s is not None and m2s is not None: if m1s < m2s: return 1 elif m1s > m2s: return -1 m1p = m1.get('episode') m2p = m2.get('episode') if m1p < m2p: return 1 elif m1p > m2p: return -1 try: if m1e is None: m1y = int(m1.get('year', 0)) else: m1y = int(m1e.get('year', 0)) except ValueError: m1y = 0 try: if m2e is None: m2y = int(m2.get('year', 0)) else: m2y = int(m2e.get('year', 0)) except ValueError: m2y = 0 if m1y > m2y: return -1 if m1y < m2y: return 1 # Ok, these movies have the same production year... # m1t = m1.get('canonical title', _last) # m2t = m2.get('canonical title', _last) # It should works also with normal dictionaries (returned from searches). # if m1t is _last and m2t is _last: m1t = m1.get('title', _last) m2t = m2.get('title', _last) if m1t < m2t: return -1 if m1t > m2t: return 1 # Ok, these movies have the same title... m1i = m1.get('imdbIndex', _last) m2i = m2.get('imdbIndex', _last) if m1i > m2i: return -1 if m1i < m2i: return 1 m1id = getattr(m1, 'movieID', None) # Introduce this check even for other comparisons functions? # XXX: is it safe to check without knowning the data access system? # probably not a great idea. Check for 'kind', instead? 
if m1id is not None: m2id = getattr(m2, 'movieID', None) if m1id > m2id: return -1 elif m1id < m2id: return 1 return 0 def cmpPeople(p1, p2): """Compare two people by billingPos, name and imdbIndex.""" p1b = getattr(p1, 'billingPos', None) or _last p2b = getattr(p2, 'billingPos', None) or _last if p1b > p2b: return 1 if p1b < p2b: return -1 p1n = p1.get('canonical name', _last) p2n = p2.get('canonical name', _last) if p1n is _last and p2n is _last: p1n = p1.get('name', _last) p2n = p2.get('name', _last) if p1n > p2n: return 1 if p1n < p2n: return -1 p1i = p1.get('imdbIndex', _last) p2i = p2.get('imdbIndex', _last) if p1i > p2i: return 1 if p1i < p2i: return -1 return 0 def cmpCompanies(p1, p2): """Compare two companies.""" p1n = p1.get('long imdb name', _last) p2n = p2.get('long imdb name', _last) if p1n is _last and p2n is _last: p1n = p1.get('name', _last) p2n = p2.get('name', _last) if p1n > p2n: return 1 if p1n < p2n: return -1 p1i = p1.get('country', _last) p2i = p2.get('country', _last) if p1i > p2i: return 1 if p1i < p2i: return -1 return 0 # References to titles, names and characters. # XXX: find better regexp! re_titleRef = re.compile( r'_(.+?(?: \([0-9\?]{4}(?:/[IVXLCDM]+)?\))?(?: \(mini\)| \(TV\)| \(V\)| \(VG\))?)_ \(qv\)' ) # FIXME: doesn't match persons with ' in the name. re_nameRef = re.compile(r"'([^']+?)' \(qv\)") # XXX: good choice? Are there characters with # in the name? re_characterRef = re.compile(r"#([^']+?)# \(qv\)") # Functions used to filter the text strings. 
def modNull(s, titlesRefs, namesRefs, charactersRefs): """Do nothing.""" return s def modClearTitleRefs(s, titlesRefs, namesRefs, charactersRefs): """Remove titles references.""" return re_titleRef.sub(r'\1', s) def modClearNameRefs(s, titlesRefs, namesRefs, charactersRefs): """Remove names references.""" return re_nameRef.sub(r'\1', s) def modClearCharacterRefs(s, titlesRefs, namesRefs, charactersRefs): """Remove characters references""" return re_characterRef.sub(r'\1', s) def modClearRefs(s, titlesRefs, namesRefs, charactersRefs): """Remove titles, names and characters references.""" s = modClearTitleRefs(s, {}, {}, {}) s = modClearCharacterRefs(s, {}, {}, {}) return modClearNameRefs(s, {}, {}, {}) def modifyStrings(o, modFunct, titlesRefs, namesRefs, charactersRefs): """Modify a string (or string values in a dictionary or strings in a list), using the provided modFunct function and titlesRefs namesRefs and charactersRefs references dictionaries.""" # Notice that it doesn't go any deeper than the first two levels in a list. 
    if isinstance(o, str):
        return modFunct(o, titlesRefs, namesRefs, charactersRefs)
    elif isinstance(o, (list, tuple, dict)):
        _stillorig = 1
        if isinstance(o, (list, tuple)):
            keys = range(len(o))
        else:
            keys = list(o.keys())
        for i in keys:
            v = o[i]
            if isinstance(v, str):
                if _stillorig:
                    o = copy(o)
                    _stillorig = 0
                o[i] = modFunct(v, titlesRefs, namesRefs, charactersRefs)
            elif isinstance(v, (list, tuple)):
                modifyStrings(o[i], modFunct, titlesRefs, namesRefs, charactersRefs)
    return o


def date_and_notes(s):
    """Parse (birth|death) date and notes; returns a tuple in the
    form (date, notes)."""
    s = s.strip()
    if not s:
        return '', ''
    notes = ''
    if s[0].isdigit() or s.split()[0].lower() in (
            'c.', 'january', 'february', 'march', 'april', 'may', 'june',
            'july', 'august', 'september', 'october', 'november', 'december',
            'ca.', 'circa', '????,'):
        i = s.find(',')
        if i != -1:
            notes = s[i + 1:].strip()
            s = s[:i]
    else:
        notes = s
        s = ''
    if s == '????':
        s = ''
    return s, notes


class RolesList(list):
    """A list of Person or Character instances, used for the currentRole
    property."""
    @property
    def notes(self):
        return self._notes

    @notes.setter
    def notes(self, notes):
        self._notes = notes

    def __init__(self, *args, **kwds):
        self._notes = None
        super(RolesList, self).__init__(*args, **kwds)

    def __str__(self):
        return ' / '.join([str(x) for x in self])


# Replace & with &amp;, but only if it's not already part of a charref.
# _re_amp = re.compile(r'(&)(?!\w+;)', re.I)
# _re_amp = re.compile(r'(?<=\W)&(?=[^a-zA-Z0-9_#])')
_re_amp = re.compile(r'&(?![^a-zA-Z0-9_#]{1,5};)')


def escape4xml(value):
    """Escape some chars that can't be present in a XML value."""
    if isinstance(value, (int, float)):
        value = str(value)
    value = _re_amp.sub('&amp;', value)
    value = value.replace('"', '&quot;').replace("'", '&apos;')
    value = value.replace('<', '&lt;').replace('>', '&gt;')
    if isinstance(value, bytes):
        value = value.decode('utf-8', 'xmlcharrefreplace')
    return value


def _refsToReplace(value, modFunct, titlesRefs, namesRefs, charactersRefs):
    """Return three lists - for movie titles, persons and characters names -
    with two items tuples: the first item is the reference once escaped
    by the user-provided modFunct function, the second is the same
    reference un-escaped."""
    mRefs = []
    for refRe, refTemplate in [(re_titleRef, '_%s_ (qv)'),
                               (re_nameRef, "'%s' (qv)"),
                               (re_characterRef, '#%s# (qv)')]:
        theseRefs = []
        for theRef in refRe.findall(value):
            # refTemplate % theRef values don't change for a single
            # _Container instance, so this is a good candidate for a
            # cache or something - even if it's so rarely used that...
            # Moreover, it can grow - ia.update(...) - and change if
            # modFunct is modified.
            goodValue = modFunct(refTemplate % theRef, titlesRefs, namesRefs, charactersRefs)
            # Prevents problems with crap in plain text data files.
            # We should probably exclude invalid chars and string that
            # are too long in the re_*Ref expressions.
            if '_' in goodValue or len(goodValue) > 128:
                continue
            toReplace = escape4xml(goodValue)
            # Only the 'value' portion is replaced.
            replaceWith = goodValue.replace(theRef, escape4xml(theRef))
            theseRefs.append((toReplace, replaceWith))
        mRefs.append(theseRefs)
    return mRefs


def _handleTextNotes(s):
    """Split text::notes strings."""
    ssplit = s.split('::', 1)
    if len(ssplit) == 1:
        return s
    return '%s<notes>%s</notes>' % (ssplit[0], ssplit[1])


def _normalizeValue(value, withRefs=False, modFunct=None, titlesRefs=None,
                    namesRefs=None, charactersRefs=None):
    """Replace some chars that can't be present in a XML text."""
    if not withRefs:
        value = _handleTextNotes(escape4xml(value))
    else:
        # Replace references that were accidentally escaped.
        replaceLists = _refsToReplace(value, modFunct, titlesRefs,
                                      namesRefs, charactersRefs)
        value = modFunct(value, titlesRefs or {}, namesRefs or {}, charactersRefs or {})
        value = _handleTextNotes(escape4xml(value))
        for replaceList in replaceLists:
            for toReplace, replaceWith in replaceList:
                value = value.replace(toReplace, replaceWith)
    return value


def _tag4TON(ton, addAccessSystem=False, _containerOnly=False):
    """Build a tag for the given _Container instance;
    both open and close tags are returned."""
    tag = ton.__class__.__name__.lower()
    what = 'name'
    if tag == 'movie':
        value = ton.get('long imdb title') or ton.get('title', '')
        what = 'title'
    else:
        value = ton.get('long imdb name') or ton.get('name', '')
    value = _normalizeValue(value)
    extras = ''
    crl = ton.currentRole
    if crl:
        if not isinstance(crl, list):
            crl = [crl]
        for cr in crl:
            crTag = cr.__class__.__name__.lower()
            if PY2 and isinstance(cr, unicode):
                crValue = cr
                crID = None
            else:
                crValue = cr.get('long imdb name') or ''
                crID = cr.getID()
            crValue = _normalizeValue(crValue)
            if crID is not None:
                extras += '<current-role><%s id="%s"><name>%s</name></%s>' % (
                    crTag, crID, crValue, crTag
                )
            else:
                extras += '<current-role><%s><name>%s</name></%s>' % (crTag, crValue, crTag)
            if hasattr(cr, 'notes'):
                extras += '<notes>%s</notes>' % _normalizeValue(cr.notes)
            extras += '</current-role>'
    theID = ton.getID()
    if theID is not None:
        beginTag = '<%s id="%s"' % (tag, theID)
        if addAccessSystem and ton.accessSystem:
            beginTag += ' access-system="%s"' % ton.accessSystem
        if not _containerOnly:
            beginTag += '><%s>%s</%s>' % (what, value, what)
        else:
            beginTag += '>'
    else:
        if not _containerOnly:
            beginTag = '<%s><%s>%s</%s>' % (tag, what, value, what)
        else:
            beginTag = '<%s>' % tag
    beginTag += extras
    if ton.notes:
        beginTag += '<notes>%s</notes>' % _normalizeValue(ton.notes)
    return beginTag, '</%s>' % tag


TAGS_TO_MODIFY = {
    'movie.parents-guide': ('item', True),
    'movie.number-of-votes': ('item', True),
    'movie.soundtrack.item': ('item', True),
    'movie.soundtrack.item.item': ('item', True),
    'movie.quotes': ('quote', False),
    'movie.quotes.quote': ('line', False),
    'movie.demographic': ('item', True),
    'movie.episodes': ('season', True),
    'movie.episodes.season': ('episode', True),
    'person.merchandising-links': ('item', True),
    'person.genres': ('item', True),
    'person.quotes': ('quote', False),
    'person.keywords': ('item', True),
    'character.quotes': ('item', True),
    'character.quotes.item': ('quote', False),
    'character.quotes.item.quote': ('line', False)
}


_valid_chars = string.ascii_lowercase + '-' + string.digits
_translator = str.maketrans(_valid_chars, _valid_chars) if not PY2 else \
    string.maketrans(_valid_chars, _valid_chars)


def _tagAttr(key, fullpath):
    """Return a tuple with a tag name and a (possibly empty) attribute,
    applying the conversions specified in TAGS_TO_MODIFY and checking
    that the tag is safe for a XML document."""
    attrs = {}
    _escapedKey = escape4xml(key)
    if fullpath in TAGS_TO_MODIFY:
        tagName, useTitle = TAGS_TO_MODIFY[fullpath]
        if useTitle:
            attrs['key'] = _escapedKey
    elif not isinstance(key, str):
        strType = str(type(key)).replace("<type '", "").replace("'>", "")
        attrs['keytype'] = strType
        tagName = str(key)
    else:
        tagName = key
    if isinstance(key, int):
        attrs['keytype'] = 'int'
    origTagName = tagName
    tagName = tagName.lower().replace(' ', '-')
    tagName = str(tagName).translate(_translator)
    if origTagName != tagName:
        if 'key' not in attrs:
            attrs['key'] = _escapedKey
    if (not tagName) or tagName[0].isdigit() or tagName[0] == '-':
        # This is a fail-safe: we should never be here, since unpredictable
        # keys must be listed in TAGS_TO_MODIFY.
        # This will probably break the DTD/schema, but at least it will
        # produce a valid XML.
        tagName = 'item'
        _utils_logger.error('invalid tag: %s [%s]' % (_escapedKey, fullpath))
        attrs['key'] = _escapedKey
    return tagName, ' '.join(['%s="%s"' % i for i in list(attrs.items())])


def _seq2xml(seq, _l=None, withRefs=False, modFunct=None,
             titlesRefs=None, namesRefs=None, charactersRefs=None,
             _topLevel=True, key2infoset=None, fullpath=''):
    """Convert a sequence or a dictionary to a list of XML
    unicode strings."""
    if _l is None:
        _l = []
    if isinstance(seq, dict):
        for key in seq:
            value = seq[key]
            if isinstance(key, _Container):
                # Here we're assuming that a _Container is never a top-level
                # key (otherwise we should handle key2infoset).
                openTag, closeTag = _tag4TON(key)
                # So that fullpath will contain something meaningful.
                tagName = key.__class__.__name__.lower()
            else:
                tagName, attrs = _tagAttr(key, fullpath)
                openTag = '<%s' % tagName
                if attrs:
                    openTag += ' %s' % attrs
                if _topLevel and key2infoset and key in key2infoset:
                    openTag += ' infoset="%s"' % key2infoset[key]
                if isinstance(value, int):
                    openTag += ' type="int"'
                elif isinstance(value, float):
                    openTag += ' type="float"'
                openTag += '>'
                closeTag = '</%s>' % tagName
            _l.append(openTag)
            _seq2xml(value, _l, withRefs, modFunct, titlesRefs,
                     namesRefs, charactersRefs, _topLevel=False,
                     fullpath='%s.%s' % (fullpath, tagName))
            _l.append(closeTag)
    elif isinstance(seq, (list, tuple)):
        tagName, attrs = _tagAttr('item', fullpath)
        beginTag = '<%s' % tagName
        if attrs:
            beginTag += ' %s' % attrs
        # beginTag += u'>'
        closeTag = '</%s>' % tagName
        for item in seq:
            if isinstance(item, _Container):
                _seq2xml(item, _l, withRefs, modFunct, titlesRefs,
                         namesRefs, charactersRefs, _topLevel=False,
                         fullpath='%s.%s' % (fullpath, item.__class__.__name__.lower()))
            else:
                openTag = beginTag
                if isinstance(item, int):
                    openTag += ' type="int"'
                elif isinstance(item, float):
                    openTag += ' type="float"'
                openTag += '>'
                _l.append(openTag)
                _seq2xml(item, _l, withRefs, modFunct, titlesRefs,
                         namesRefs, charactersRefs, _topLevel=False,
                         fullpath='%s.%s' % (fullpath, tagName))
                _l.append(closeTag)
    else:
        if isinstance(seq, _Container):
            _l.extend(_tag4TON(seq))
        elif seq:
            # Text, ints, floats and the like.
            _l.append(_normalizeValue(seq, withRefs=withRefs,
                                      modFunct=modFunct,
                                      titlesRefs=titlesRefs,
                                      namesRefs=namesRefs,
                                      charactersRefs=charactersRefs))
    return _l


_xmlHead = """ """
_xmlHead = _xmlHead.replace('{VERSION}', VERSION.replace('.', '').split('dev')[0][:2])


@total_ordering
class _Container(object):
    """Base class for Movie, Person, Character and Company classes."""
    # The default sets of information retrieved.
    default_info = ()

    # Aliases for some not-so-intuitive keys.
    keys_alias = {}

    # List of keys to modify.
    keys_tomodify_list = ()

    # Function used to compare two instances of this class.
    cmpFunct = None

    # Key that contains the cover/headshot.
    _image_key = None

    def __init__(self, myID=None, data=None, notes='',
                 currentRole='', roleID=None, roleIsPerson=False,
                 accessSystem=None, titlesRefs=None, namesRefs=None,
                 charactersRefs=None, modFunct=None, *args, **kwds):
        """Initialize a Movie, Person, Character or Company object.
        *myID* -- your personal identifier for this object.
        *data* -- a dictionary used to initialize the object.
        *notes* -- notes for the person referred in the currentRole
                    attribute; e.g.: '(voice)' or the alias used in the
                    movie credits.
        *accessSystem* -- a string representing the data access system used.
        *currentRole* -- a Character instance representing the current role
                         or duty of a person in this movie, or a Person
                         object representing the actor/actress who played
                         a given character in a Movie. If a string is
                         passed, an object is automatically built.
        *roleID* -- if available, the characterID/personID of the currentRole
                    object.
        *roleIsPerson* -- when False (default) the currentRole is assumed
                          to be a Character object, otherwise a Person.
*titlesRefs* -- a dictionary with references to movies. *namesRefs* -- a dictionary with references to persons. *charactersRefs* -- a dictionary with references to characters. *modFunct* -- function called returning text fields. """ self.reset() self.accessSystem = accessSystem self.myID = myID if data is None: data = {} self.set_data(data, override=True) self.notes = notes if titlesRefs is None: titlesRefs = {} self.update_titlesRefs(titlesRefs) if namesRefs is None: namesRefs = {} self.update_namesRefs(namesRefs) if charactersRefs is None: charactersRefs = {} self.update_charactersRefs(charactersRefs) self.set_mod_funct(modFunct) self.keys_tomodify = {} for item in self.keys_tomodify_list: self.keys_tomodify[item] = None self._roleIsPerson = roleIsPerson if not roleIsPerson: from imdb.Character import Character self._roleClass = Character else: from imdb.Person import Person self._roleClass = Person self.currentRole = currentRole if roleID: self.roleID = roleID self._init(*args, **kwds) def _get_roleID(self): """Return the characterID or personID of the currentRole object.""" if not self.__role: return None if isinstance(self.__role, list): return [x.getID() for x in self.__role] return self.currentRole.getID() def _set_roleID(self, roleID): """Set the characterID or personID of the currentRole object.""" if not self.__role: # XXX: needed? Just ignore it? It's probably safer to # ignore it, to prevent some bugs in the parsers. # raise IMDbError,"Can't set ID of an empty Character/Person object." 
pass if not self._roleIsPerson: if not isinstance(roleID, (list, tuple)): if not(PY2 and isinstance(self.currentRole, unicode)): self.currentRole.characterID = roleID else: for index, item in enumerate(roleID): r = self.__role[index] if PY2 and isinstance(r, unicode): continue r.characterID = item else: if not isinstance(roleID, (list, tuple)): self.currentRole.personID = roleID else: for index, item in enumerate(roleID): r = self.__role[index] if PY2 and isinstance(r, unicode): continue r.personID = item roleID = property(_get_roleID, _set_roleID, doc="the characterID or personID of the currentRole object.") def _get_currentRole(self): """Return a Character or Person instance.""" if self.__role: return self.__role return self._roleClass(name='', accessSystem=self.accessSystem, modFunct=self.modFunct) def _set_currentRole(self, role): """Set self.currentRole to a Character or Person instance.""" if isinstance(role, str): if not role: self.__role = None else: self.__role = self._roleClass(name=role, modFunct=self.modFunct, accessSystem=self.accessSystem) elif isinstance(role, (list, tuple)): self.__role = RolesList() for item in role: if isinstance(item, str): self.__role.append(self._roleClass(name=item, accessSystem=self.accessSystem, modFunct=self.modFunct)) else: self.__role.append(item) if not self.__role: self.__role = None else: self.__role = role currentRole = property(_get_currentRole, _set_currentRole, doc="The role of a Person in a Movie" " or the interpreter of a Character in a Movie.") def _init(self, **kwds): pass def get_fullsizeURL(self): """Return the full-size URL for this object.""" if not (self._image_key and self._image_key in self.data): return None url = self.data[self._image_key] or '' ext_idx = url.rfind('.') if ext_idx == -1: return url if '@' in url: return url[:url.rindex('@')+1] + url[ext_idx:] else: prev_dot = url[:ext_idx].rfind('.') if prev_dot == -1: return url return url[:prev_dot] + url[ext_idx:] def reset(self): """Reset the 
object.""" self.data = {} self.myID = None self.notes = '' self.titlesRefs = {} self.namesRefs = {} self.charactersRefs = {} self.modFunct = modClearRefs self.current_info = [] self.infoset2keys = {} self.key2infoset = {} self.__role = None self._reset() def _reset(self): pass def clear(self): """Reset the dictionary.""" self.data.clear() self.notes = '' self.titlesRefs = {} self.namesRefs = {} self.charactersRefs = {} self.current_info = [] self.infoset2keys = {} self.key2infoset = {} self.__role = None self._clear() def _clear(self): pass def get_current_info(self): """Return the current set of information retrieved.""" return self.current_info def update_infoset_map(self, infoset, keys, mainInfoset): """Update the mappings between infoset and keys.""" if keys is None: keys = [] if mainInfoset is not None: theIS = mainInfoset else: theIS = infoset self.infoset2keys[theIS] = keys for key in keys: self.key2infoset[key] = theIS def set_current_info(self, ci): """Set the current set of information retrieved.""" # XXX:Remove? It's never used and there's no way to update infoset2keys. 
self.current_info = ci def add_to_current_info(self, val, keys=None, mainInfoset=None): """Add a set of information to the current list.""" if val not in self.current_info: self.current_info.append(val) self.update_infoset_map(val, keys, mainInfoset) def has_current_info(self, val): """Return true if the given set of information is in the list.""" return val in self.current_info def set_mod_funct(self, modFunct): """Set the function used to modify the strings.""" if modFunct is None: modFunct = modClearRefs self.modFunct = modFunct def update_titlesRefs(self, titlesRefs): """Update the dictionary with the references to movies.""" self.titlesRefs.update(titlesRefs) def get_titlesRefs(self): """Return the dictionary with the references to movies.""" return self.titlesRefs def update_namesRefs(self, namesRefs): """Update the dictionary with the references to names.""" self.namesRefs.update(namesRefs) def get_namesRefs(self): """Return the dictionary with the references to names.""" return self.namesRefs def update_charactersRefs(self, charactersRefs): """Update the dictionary with the references to characters.""" self.charactersRefs.update(charactersRefs) def get_charactersRefs(self): """Return the dictionary with the references to characters.""" return self.charactersRefs def set_data(self, data, override=False): """Set the movie data to the given dictionary; if 'override' is set, the previous data is removed, otherwise the two dictionaries are merged. 
""" if not override: self.data.update(data) else: self.data = data def getID(self): """Return movieID, personID, characterID or companyID.""" raise NotImplementedError('override this method') def __lt__(self, other): """Compare two Movie, Person, Character or Company objects.""" if self.cmpFunct is None: return False if not isinstance(other, self.__class__): return False return self.cmpFunct(other) == -1 def __eq__(self, other): """Compare two Movie, Person, Character or Company objects.""" if self.cmpFunct is None: return False if not isinstance(other, self.__class__): return False return self.cmpFunct(other) def __hash__(self): """Hash for this object.""" # XXX: does it always work correctly? theID = self.getID() if theID is not None and self.accessSystem not in ('UNKNOWN', None): # Handle 'http' and 'mobile' as they are the same access system. acs = self.accessSystem if acs in ('mobile', 'httpThin'): acs = 'http' # There must be some indication of the kind of the object, too. s4h = '%s:%s[%s]' % (self.__class__.__name__, theID, acs) else: s4h = repr(self) return hash(s4h) def isSame(self, other): """Return True if the two represent the same object.""" return isinstance(other, self.__class__) and hash(self) == hash(other) def __len__(self): """Number of items in the data dictionary.""" return len(self.data) def getAsXML(self, key, _with_add_keys=True): """Return an XML representation of the specified key, or None if empty. If _with_add_keys is False, dynamically generated keys are excluded.""" # Prevent modifyStrings in __getitem__ from being called; if needed, # it will be called by the _normalizeValue function. origModFunct = self.modFunct self.modFunct = modNull # XXX: not totally sure it's a good idea, but could prevent # problems (i.e.: the returned string always contains # a DTD valid tag, and not something that can be only in # the keys_alias map). 
key = self.keys_alias.get(key, key) if (not _with_add_keys) and (key in self._additional_keys()): self.modFunct = origModFunct return None try: withRefs = False if key in self.keys_tomodify and \ origModFunct not in (None, modNull): withRefs = True value = self.get(key) if value is None: return None tag = self.__class__.__name__.lower() return ''.join(_seq2xml({key: value}, withRefs=withRefs, modFunct=origModFunct, titlesRefs=self.titlesRefs, namesRefs=self.namesRefs, charactersRefs=self.charactersRefs, key2infoset=self.key2infoset, fullpath=tag)) finally: self.modFunct = origModFunct def asXML(self, _with_add_keys=True): """Return an XML representation of the whole object. If _with_add_keys is False, dynamically generated keys are excluded.""" beginTag, endTag = _tag4TON(self, addAccessSystem=True, _containerOnly=True) resList = [beginTag] for key in list(self.keys()): value = self.getAsXML(key, _with_add_keys=_with_add_keys) if not value: continue resList.append(value) resList.append(endTag) head = _xmlHead % self.__class__.__name__.lower() return head + ''.join(resList) def _getitem(self, key): """Handle special keys.""" return None def __getitem__(self, key): """Return the value for a given key, checking key aliases; a KeyError exception is raised if the key is not found. """ value = self._getitem(key) if value is not None: return value # Handle key aliases. 
key = self.keys_alias.get(key, key) rawData = self.data[key] if key in self.keys_tomodify and \ self.modFunct not in (None, modNull): try: return modifyStrings(rawData, self.modFunct, self.titlesRefs, self.namesRefs, self.charactersRefs) except RuntimeError as e: import warnings warnings.warn("RuntimeError in imdb.utils._Container.__getitem__;" " if it's not a recursion limit exceeded, it's a bug:\n%s" % e) return rawData def __setitem__(self, key, item): """Directly store the item with the given key.""" self.data[key] = item def __delitem__(self, key): """Remove the given section or key.""" # XXX: how to remove an item of a section? del self.data[key] def _additional_keys(self): """Valid keys to append to the data.keys() list.""" return [] def keys(self): """Return a list of valid keys.""" return list(self.data.keys()) + self._additional_keys() def items(self): """Return the items in the dictionary.""" return [(k, self.get(k)) for k in list(self.keys())] # XXX: is this enough? def iteritems(self): return iter(self.data.items()) def iterkeys(self): return iter(self.data.keys()) def itervalues(self): return iter(self.data.values()) def values(self): """Return the values in the dictionary.""" return [self.get(k) for k in list(self.keys())] def has_key(self, key): """Return true if a given section is defined.""" try: self.__getitem__(key) except KeyError: return False return True # XXX: really useful??? # consider also that this will confuse people who meant to # call ia.update(movieObject, 'data set') instead. 
def update(self, dict): self.data.update(dict) def get(self, key, failobj=None): """Return the given section, or default if it's not found.""" try: return self.__getitem__(key) except KeyError: return failobj def setdefault(self, key, failobj=None): if key not in self: self[key] = failobj return self[key] def pop(self, key, *args): return self.data.pop(key, *args) def popitem(self): return self.data.popitem() def __repr__(self): """String representation of an object.""" raise NotImplementedError('override this method') def __str__(self): """Movie title or person name.""" raise NotImplementedError('override this method') def __contains__(self, key): raise NotImplementedError('override this method') def append_item(self, key, item): """The item is appended to the list identified by the given key.""" self.data.setdefault(key, []).append(item) def set_item(self, key, item): """Directly store the item with the given key.""" self.data[key] = item def __bool__(self): """Return true if self.data contains something.""" return bool(self.data) def __deepcopy__(self, memo): raise NotImplementedError('override this method') def copy(self): """Return a deep copy of the object itself.""" return deepcopy(self) def flatten(seq, toDescend=(list, dict, tuple), yieldDictKeys=False, onlyKeysType=(_Container,), scalar=None): """Iterate over nested lists and dictionaries; toDescend is a list or a tuple of types to be considered non-scalar; if yieldDictKeys is true, also dictionaries' keys are yielded; if scalar is not None, only items of the given type(s) are yielded.""" if scalar is None or isinstance(seq, scalar): yield seq if isinstance(seq, toDescend): if isinstance(seq, (dict, _Container)): if yieldDictKeys: # Yield also the keys of the dictionary. 
for key in seq.keys(): for k in flatten(key, toDescend=toDescend, yieldDictKeys=yieldDictKeys, onlyKeysType=onlyKeysType, scalar=scalar): if onlyKeysType and isinstance(k, onlyKeysType): yield k for value in seq.values(): for v in flatten(value, toDescend=toDescend, yieldDictKeys=yieldDictKeys, onlyKeysType=onlyKeysType, scalar=scalar): yield v elif not isinstance(seq, (str, bytes, int, float)): for item in seq: for i in flatten(item, toDescend=toDescend, yieldDictKeys=yieldDictKeys, onlyKeysType=onlyKeysType, scalar=scalar): yield i imdbpy-6.8/requirements.txt000066400000000000000000000003071351454127000161410ustar00rootroot00000000000000atomicwrites>=1.1.5 attrs>=18.1.0 lxml>=4.2.3 more-itertools>=4.2.0 packaging>=17.1 pluggy>=0.7.1 py>=1.5.4 pyparsing>=2.2.0 pytest>=3.6.4 six>=1.11.0 SQLAlchemy>=1.3.0 tox>=3.1.2 virtualenv>=16.0.0 imdbpy-6.8/setup.cfg000066400000000000000000000012321351454127000144740ustar00rootroot00000000000000[egg_info] #tag_build = dev #tag_date = true [bdist_rpm] vendor = Davide Alberani # Comment out the doc_files entry if you don't want to install # the documentation. doc_files = docs/* # Comment out the icon entry if you don't want to install the icon. icon = docs/imdbpyico.xpm [bdist_wininst] # Bitmap for the installer. bitmap = docs/imdbpywin.bmp [flake8] max-line-length = 120 exclude = sql,msgfmt.py builtins = unicode ignore = E731, I001 [isort] line_length = 96 lines_after_imports = 2 known_first_party = imdb known_pytest = pytest sections = FUTURE,PYTEST,STDLIB,THIRDPARTY,FIRSTPARTY,LOCALFOLDER [coverage:report] omit = */sql/* imdbpy-6.8/setup.py000077500000000000000000000145171351454127000144020ustar00rootroot00000000000000import distutils.sysconfig import os import sys import setuptools # version of the software; in the code repository this represents # the _next_ release. setuptools will automatically add 'dev-rREVISION'. 
version = '6.8' home_page = 'https://imdbpy.sourceforge.io/' long_desc = """IMDbPY is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies. Platform-independent and written in Python 3 it can retrieve data from both the IMDb's web server and a local copy of the whole database. IMDbPY package can be very easily used by programmers and developers to provide access to the IMDb's data to their programs. Some simple example scripts - useful for the end users - are included in this package; other IMDbPY-based programs are available at the home page: %s """ % home_page dwnl_url = 'https://imdbpy.sourceforge.io/downloads.html' classifiers = """\ Development Status :: 5 - Production/Stable Environment :: Console Environment :: Web Environment Intended Audience :: Developers Intended Audience :: End Users/Desktop License :: OSI Approved :: GNU General Public License (GPL) Natural Language :: English Natural Language :: Italian Natural Language :: Turkish Programming Language :: Python Programming Language :: Python :: 3.7 Programming Language :: Python :: 3.6 Programming Language :: Python :: 3.5 Programming Language :: Python :: 3.4 Programming Language :: Python :: 2.7 Programming Language :: Python :: Implementation :: CPython Programming Language :: Python :: Implementation :: PyPy Operating System :: OS Independent Topic :: Database :: Front-Ends Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries Topic :: Software Development :: Libraries :: Python Modules """ keywords = ['imdb', 'movie', 'people', 'database', 'cinema', 'film', 'person', 'cast', 'actor', 'actress', 'director', 'sql', 'character', 'company', 'package', 'plain text data files', 'keywords', 'top250', 'bottom100', 'xml'] scripts = [ './bin/get_first_movie.py', './bin/imdbpy2sql.py', './bin/s32imdbpy.py', './bin/get_movie.py', './bin/search_movie.py', './bin/get_first_person.py', './bin/get_person.py', 
'./bin/search_person.py', './bin/get_character.py', './bin/get_first_character.py', './bin/get_company.py', './bin/search_character.py', './bin/search_company.py', './bin/get_first_company.py', './bin/get_keyword.py', './bin/search_keyword.py', './bin/get_top_bottom_movies.py' ] data_files = [] featSQLAlchemy = setuptools.dist.Feature( 'SQLAlchemy dependency', standard=True, install_requires=['SQLAlchemy'] ) params = { # Meta-information. 'name': 'IMDbPY', 'version': version, 'description': 'Python package to access the IMDb\'s database', 'long_description': long_desc, 'author': 'Davide Alberani', 'author_email': 'da@erlug.linux.it', 'contact': 'IMDbPY-devel mailing list', 'contact_email': 'imdbpy-devel@lists.sourceforge.net', 'maintainer': 'Davide Alberani', 'maintainer_email': 'da@erlug.linux.it', 'license': 'GPL', 'platforms': 'any', 'keywords': keywords, 'classifiers': [_f for _f in classifiers.split("\n") if _f], 'url': home_page, 'download_url': dwnl_url, 'scripts': scripts, 'data_files': data_files, 'install_requires': ['SQLAlchemy', 'lxml'], 'extras_require': { 'dev': [ 'flake8', 'flake8-isort', 'readme_renderer' ], 'doc': [ 'sphinx', 'sphinx_rtd_theme' ], 'test': [ 'pytest', 'pytest-cov', 'pytest-profiling' ] }, 'features': {'sqlalchemy': featSQLAlchemy}, 'packages': setuptools.find_packages(), 'entry_points': """ [console_scripts] imdbpy=imdb.cli:main """ } ERR_MSG = """ ==================================================================== ERROR ===== Aaargh! An error! An error! Curse my metal body, I wasn't fast enough. It's all my fault! Anyway, if you were trying to build a package or install IMDbPY to your system, looks like we're unable to fetch or install some dependencies. The best solution is to resolve these dependencies (maybe you're not connected to Internet?) 
The caught exception is re-raised below: """ REBUILDMO_DIR = os.path.join('imdb', 'locale') REBUILDMO_NAME = 'rebuildmo' def runRebuildmo(): """Call the function to rebuild the locales.""" cwd = os.getcwd() path = list(sys.path) languages = [] try: import imp scriptPath = os.path.dirname(__file__) modulePath = os.path.join(cwd, scriptPath, REBUILDMO_DIR) sys.path += [modulePath, '.', cwd] modInfo = imp.find_module(REBUILDMO_NAME, [modulePath, '.', cwd]) rebuildmo = imp.load_module('rebuildmo', *modInfo) os.chdir(modulePath) languages = rebuildmo.rebuildmo() print('Created locale for: %s.' % ' '.join(languages)) except Exception as e: print('ERROR: unable to rebuild .mo files; caught exception %s' % e) sys.path = path os.chdir(cwd) return languages def hasCommand(): """Return true if at least one command is found on the command line.""" args = sys.argv[1:] if '--help' in args: return False if '-h' in args: return False for arg in args: if arg and not arg.startswith('-'): return True return False try: if hasCommand(): languages = runRebuildmo() else: languages = [] if languages: data_files.append((os.path.join(distutils.sysconfig.get_python_lib(), 'imdb/locale'), ['imdb/locale/imdbpy.pot'])) for lang in languages: files_found = setuptools.findall('imdb/locale/%s' % lang) if not files_found: continue base_dir = os.path.dirname(files_found[0]) data_files.append((os.path.join(distutils.sysconfig.get_python_lib(), 'imdb/locale'), ['imdb/locale/imdbpy-%s.po' % lang])) if not base_dir: continue data_files.append((os.path.join(distutils.sysconfig.get_python_lib(), base_dir), files_found)) except SystemExit: print(ERR_MSG) setuptools.setup(**params) imdbpy-6.8/tests/000077500000000000000000000000001351454127000140175ustar00rootroot00000000000000imdbpy-6.8/tests/conftest.py000066400000000000000000000022361351454127000162210ustar00rootroot00000000000000from pytest import fixture import logging import os from hashlib import md5 from imdb import IMDb from imdb.parser.http import 
IMDbURLopener logging.raiseExceptions = False cache_dir = os.path.join(os.path.dirname(__file__), '.cache') if not os.path.exists(cache_dir): os.makedirs(cache_dir) retrieve_unicode_orig = IMDbURLopener.retrieve_unicode def retrieve_unicode_cached(self, url, size=-1): key = md5(url.encode('utf-8')).hexdigest() cache_file = os.path.join(cache_dir, key) if os.path.exists(cache_file): with open(cache_file, 'r') as f: content = f.read() else: content = retrieve_unicode_orig(self, url, size=size) with open(cache_file, 'w') as f: f.write(content) return content s3_uri = os.getenv('IMDBPY_S3_URI') @fixture(params=['http'] + (['s3'] if s3_uri is not None else [])) def ia(request): """Access to IMDb data.""" if request.param == 'http': IMDbURLopener.retrieve_unicode = retrieve_unicode_cached yield IMDb('http') IMDbURLopener.retrieve_unicode = retrieve_unicode_orig elif request.param == 's3': yield IMDb('s3', uri=s3_uri) imdbpy-6.8/tests/test_http_chart_bottom.py000066400000000000000000000024271351454127000211610ustar00rootroot00000000000000def test_bottom_chart_should_contain_100_entries(ia): chart = ia.get_bottom100_movies() assert len(chart) == 100 def test_bottom_chart_entries_should_have_rank(ia): movies = ia.get_bottom100_movies() for rank, movie in enumerate(movies): assert movie['bottom 100 rank'] == rank + 1 def test_bottom_chart_entries_should_have_movie_id(ia): movies = ia.get_bottom100_movies() for movie in movies: assert movie.movieID.isdigit() def test_bottom_chart_entries_should_have_title(ia): movies = ia.get_bottom100_movies() for movie in movies: assert 'title' in movie def test_bottom_chart_entries_should_be_movies(ia): movies = ia.get_bottom100_movies() for movie in movies: assert movie['kind'] == 'movie' def test_bottom_chart_entries_should_have_year(ia): movies = ia.get_bottom100_movies() for movie in movies: assert isinstance(movie['year'], int) def test_bottom_chart_entries_should_have_low_ratings(ia): movies = ia.get_bottom100_movies() for movie in 
movies: assert movie['rating'] < 5.0 def test_bottom_chart_entries_should_have_minimal_number_of_votes(ia): movies = ia.get_bottom100_movies() for movie in movies: assert movie['votes'] >= 1500 # limit stated by IMDb imdbpy-6.8/tests/test_http_chart_top.py000066400000000000000000000023461351454127000204570ustar00rootroot00000000000000def test_top_chart_should_contain_250_entries(ia): chart = ia.get_top250_movies() assert len(chart) == 250 def test_top_chart_entries_should_have_rank(ia): movies = ia.get_top250_movies() for rank, movie in enumerate(movies): assert movie['top 250 rank'] == rank + 1 def test_top_chart_entries_should_have_movie_id(ia): movies = ia.get_top250_movies() for movie in movies: assert movie.movieID.isdigit() def test_top_chart_entries_should_have_title(ia): movies = ia.get_top250_movies() for movie in movies: assert 'title' in movie def test_top_chart_entries_should_be_movies(ia): movies = ia.get_top250_movies() for movie in movies: assert movie['kind'] == 'movie' def test_top_chart_entries_should_have_year(ia): movies = ia.get_top250_movies() for movie in movies: assert isinstance(movie['year'], int) def test_top_chart_entries_should_have_high_ratings(ia): movies = ia.get_top250_movies() for movie in movies: assert movie['rating'] > 7.5 def test_top_chart_entries_should_have_minimal_number_of_votes(ia): movies = ia.get_top250_movies() for movie in movies: assert movie['votes'] >= 25000 # limit stated by IMDb imdbpy-6.8/tests/test_http_company_main.py000066400000000000000000000002431351454127000211400ustar00rootroot00000000000000def test_company_name_should_not_include_country(ia): data = ia.get_company('0017902', info=['main']) assert data.get('name') == 'Pixar Animation Studios' imdbpy-6.8/tests/test_http_gather_refs.py000066400000000000000000000006401351454127000207600ustar00rootroot00000000000000def test_references_to_titles_should_be_a_list(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts titles_refs = 
person.get_titlesRefs() assert 70 < len(titles_refs) < 100 def test_references_to_names_should_be_a_list(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts names_refs = person.get_namesRefs() assert 100 < len(names_refs) < 150 imdbpy-6.8/tests/test_http_movie_combined.py000066400000000000000000000506361351454127000214600ustar00rootroot00000000000000from pytest import mark import re from imdb.Movie import Movie from imdb.Person import Person months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'] re_date = re.compile(r'[0-9]{1,2} (%s) [0-9]{4}' % '|'.join(months), re.I) def test_movie_cover_url_should_be_an_image_link(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert re.match(r'^https?://.*\.jpg$', movie.get('cover url')) def test_cover_url_if_none_should_be_excluded(ia): movie = ia.get_movie('3629794', info=['main']) # Aslan assert 'cover url' not in movie def test_movie_directors_should_be_a_list_of_persons(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix directors = [p for p in movie.get('directors', [])] assert len(directors) == 2 for p in directors: assert isinstance(p, Person) def test_movie_directors_should_contain_correct_people(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix directorIDs = [p.personID for p in movie.get('directors', [])] assert directorIDs == ['0905154', '0905152'] def test_movie_directors_should_contain_person_names(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix director_names = [p.get('name') for p in movie.get('directors', [])] assert director_names == ['Lana Wachowski', 'Lilly Wachowski'] def test_movie_writers_should_be_a_list_of_persons(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix writers = [p for p in movie.get('writers', [])] assert len(writers) == 2 for p in writers: assert isinstance(p, Person) def test_movie_writers_should_contain_correct_people(ia): movie = ia.get_movie('0133093', info=['main']) # 
Matrix writerIDs = [p.personID for p in movie.get('writer', [])] assert writerIDs == ['0905152', '0905154'] def test_movie_writers_should_contain_person_names(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix writer_names = [p.get('name') for p in movie.get('writers', [])] assert writer_names == ['Lilly Wachowski', 'Lana Wachowski'] def test_movie_title_should_not_have_year(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('title') == 'The Matrix' def test_movie_title_tv_movie_should_not_include_type(ia): movie = ia.get_movie('0389150', info=['main']) # Matrix (TV) assert movie.get('title') == 'The Matrix Defence' def test_movie_title_video_movie_should_not_include_type(ia): movie = ia.get_movie('0109151', info=['main']) # Matrix (V) assert movie.get('title') == 'Armitage III: Polymatrix' def test_movie_title_video_game_should_not_include_type(ia): movie = ia.get_movie('0390244', info=['main']) # Matrix (VG) assert movie.get('title') == 'The Matrix Online' def test_movie_title_tv_series_should_not_have_quotes(ia): movie = ia.get_movie('0436992', info=['main']) # Doctor Who assert movie.get('title') == 'Doctor Who' def test_movie_title_tv_mini_series_should_not_have_quotes(ia): movie = ia.get_movie('0185906', info=['main']) # Band of Brothers assert movie.get('title') == 'Band of Brothers' def test_movie_title_tv_episode_should_not_be_series_title(ia): movie = ia.get_movie('1000252', info=['main']) # Doctor Who - Blink assert movie.get('title') == 'Blink' def test_movie_year_should_be_an_integer(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('year') == 1999 def test_movie_year_followed_by_kind_in_full_title_should_be_ok(ia): movie = ia.get_movie('0109151', info=['main']) # Matrix (V) assert movie.get('year') == 1996 def test_movie_year_if_none_should_be_excluded(ia): movie = ia.get_movie('3629794', info=['main']) # Aslan assert 'year' not in movie @mark.skip(reason="imdb index is not included 
anymore") def test_movie_imdb_index_should_be_a_roman_number(ia): movie = ia.get_movie('3698420', info=['main']) # Mother's Day IV assert movie.get('imdbIndex') == 'IV' @mark.skip(reason="imdb index is not included anymore") def test_movie_imdb_index_none_should_be_excluded(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert 'imdbIndex' not in movie def test_movie_kind_none_should_be_movie(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('kind') == 'movie' def test_movie_kind_tv_movie_should_be_tv_movie(ia): movie = ia.get_movie('0389150', info=['main']) # Matrix (TV) assert movie.get('kind') == 'tv movie' def test_movie_kind_video_movie_should_be_video_movie(ia): movie = ia.get_movie('0109151', info=['main']) # Matrix (V) assert movie.get('kind') == 'video movie' def test_movie_kind_video_game_should_be_video_game(ia): movie = ia.get_movie('0390244', info=['main']) # Matrix (VG) assert movie.get('kind') == 'video game' def test_movie_kind_tv_series_should_be_tv_series(ia): movie = ia.get_movie('0436992', info=['main']) # Doctor Who assert movie.get('kind') == 'tv series' def test_movie_kind_tv_mini_series_should_be_tv_mini_series(ia): movie = ia.get_movie('0185906', info=['main']) # Band of Brothers assert movie.get('kind') == 'tv mini series' def test_movie_kind_tv_series_episode_should_be_episode(ia): movie = ia.get_movie('1000252', info=['main']) # Doctor Who - Blink assert movie.get('kind') == 'episode' # def test_movie_kind_short_movie_should_be_short_movie(ia): # movie = ia.get_movie('2971344', info=['main']) # Matrix (Short) # assert movie.get('kind') == 'short movie' # def test_movie_kind_tv_short_movie_should_be_tv_short_movie(ia): # movie = ia.get_movie('0274085', info=['main']) # Matrix (TV Short) # assert movie.get('kind') == 'tv short movie' # def test_movie_kind_tv_special_should_be_tv_special(ia): # movie = ia.get_movie('1985970', info=['main']) # Roast of Charlie Sheen # assert movie.get('kind') == 'tv 
special' def test_series_years_if_continuing_should_be_open_range(ia): movie = ia.get_movie('0436992', info=['main']) # Doctor Who assert movie.get('series years') == '2005-' def test_series_years_if_ended_should_be_closed_range(ia): movie = ia.get_movie('0412142', info=['main']) # House M.D. assert movie.get('series years') == '2004-2012' def test_series_years_mini_series_ended_in_same_year_should_be_closed_range(ia): movie = ia.get_movie('0185906', info=['main']) # Band of Brothers assert movie.get('series years') == '2001-2001' def test_series_years_if_none_should_be_excluded(ia): movie = ia.get_movie('1000252', info=['main']) # Doctor Who - Blink assert 'series years' not in movie def test_series_number_of_episodes_should_be_an_integer(ia): movie = ia.get_movie('2121964', info=['main']) # House M.D. - 8/21 assert movie.get('number of episodes') == 176 def test_series_number_of_episodes_if_none_should_be_excluded(ia): movie = ia.get_movie('0412142', info=['main']) # House M.D. assert 'number of episodes' not in movie @mark.skip(reason="total episode number is not included anymore") def test_episode_number_should_be_an_integer(ia): movie = ia.get_movie('2121964', info=['main']) # House M.D. - 8/21 assert movie.get('episode number') == 175 @mark.skip(reason="total episode number is not included anymore") def test_episode_number_if_none_should_be_excluded(ia): movie = ia.get_movie('0412142', info=['main']) # House M.D. assert 'episode number' not in movie def test_episode_previous_episode_should_be_an_imdb_id(ia): movie = ia.get_movie('2121964', info=['main']) # House M.D. - 8/21 assert movie.get('previous episode') == '2121963' def test_episode_previous_episode_if_none_should_be_excluded(ia): movie = ia.get_movie('0606035', info=['main']) # House M.D. - 1/1 assert 'previous episode' not in movie def test_episode_next_episode_should_be_an_imdb_id(ia): movie = ia.get_movie('2121964', info=['main']) # House M.D. 
- 8/21 assert movie.get('next episode') == '2121965' def test_episode_next_episode_if_none_should_be_excluded(ia): movie = ia.get_movie('2121965', info=['main']) # House M.D. - 8/22 assert 'next episode' not in movie def test_episode_of_series_should_have_title_year_and_kind(ia): movie = ia.get_movie('2121964', info=['main']) # House M.D. - 8/21 series = movie.get('episode of') assert isinstance(series, Movie) assert series.movieID == '0412142' assert series.get('kind') == 'tv series' # original title and year are not included anymore # assert series.data == {'title': 'House M.D.', 'year': 2004, 'kind': 'tv series'} def test_episode_of_mini_series_should_have_title_year_and_kind(ia): movie = ia.get_movie('1247467', info=['main']) # Band of Brothers - 4 series = movie.get('episode of') assert isinstance(series, Movie) assert series.movieID == '0185906' assert series.get('kind') == 'tv series' # original title and year are not included anymore # assert series.data == {'title': 'Band of Brothers', 'year': 2001, 'kind': 'tv series'} def test_episode_of_series_if_none_should_be_excluded(ia): movie = ia.get_movie('0412142', info=['main']) # House M.D. 
assert 'episode of' not in movie def test_movie_rating_should_be_between_1_and_10(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert 1.0 <= movie.get('rating') <= 10.0 def test_movie_rating_if_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['main']) # Ates Parcasi assert 'rating' not in movie def test_movie_votes_should_be_an_integer(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('votes') > 1000000 def test_movie_votes_if_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['main']) # Ates Parcasi assert 'votes' not in movie def test_movie_top250_rank_should_be_between_1_and_250(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert 1 <= movie.get('top 250 rank') <= 250 def test_movie_top250_rank_if_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['main']) # Ates Parcasi assert 'top 250 rank' not in movie def test_movie_bottom100_rank_should_be_between_1_and_100(ia): movie = ia.get_movie('0060666', info=['main']) # Manos assert 1 <= movie.get('bottom 100 rank') <= 100 def test_movie_bottom100_rank_if_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['main']) # Ates Parcasi assert 'bottom 100 rank' not in movie @mark.skip('seasons is an alias for number of seasons') def test_series_season_titles_should_be_a_list_of_season_titles(ia): movie = ia.get_movie('0436992', info=['main']) # Doctor Who assert movie.get('seasons', []) == [str(i) for i in range(1, 12)] # unknown doesn't show up in the reference page # assert movie.get('seasons', []) == [str(i) for i in range(1, 12)] + ['unknown'] def test_series_season_titles_if_none_should_be_excluded(ia): movie = ia.get_movie('1000252', info=['main']) # Doctor Who - Blink assert 'seasons' not in movie def test_series_number_of_seasons_should_be_numeric(ia): movie = ia.get_movie('0412142', info=['main']) # House M.D. 
assert movie.get('number of seasons') == 8 def test_series_number_of_seasons_should_exclude_non_numeric_season_titles(ia): movie = ia.get_movie('0436992', info=['main']) # Doctor Who assert movie.get('number of seasons') == 12 def test_episode_original_air_date_should_be_a_date(ia): movie = ia.get_movie('1000252', info=['main']) # Doctor Who - Blink assert re_date.match(movie.get('original air date')) def test_episode_original_air_date_if_none_should_be_excluded(ia): movie = ia.get_movie('0436992', info=['main']) # Doctor Who assert 'original air date' not in movie def test_season_and_episode_numbers_should_be_integers(ia): movie = ia.get_movie('1000252', info=['main']) # Doctor Who - Blink assert movie.get('season') == 3 assert movie.get('episode') == 10 def test_season_and_episode_numbers_none_should_be_excluded(ia): movie = ia.get_movie('0436992', info=['main']) # Doctor Who assert 'season' not in movie assert 'episode' not in movie def test_movie_genres_if_single_should_be_a_list_of_genre_names(ia): movie = ia.get_movie('0063850', info=['main']) # If.... assert movie.get('genres', []) == ['Drama'] def test_movie_genres_if_multiple_should_be_a_list_of_genre_names(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('genres', []) == ['Action', 'Sci-Fi'] # TODO: find a movie with no genre def test_movie_plot_outline_should_be_a_longer_text(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert re.match(r'^Thomas A\. 
Anderson is a man .* human rebellion.$', movie.get('plot outline')) def test_movie_plot_outline_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['main']) # Ates Parcasi assert 'plot outline' not in movie @mark.skip(reason="mpaa rating is not included anymore") def test_movie_mpaa_should_be_a_rating(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('mpaa') == 'Rated R for sci-fi violence and brief language' @mark.skip(reason="mpaa rating is not included anymore") def test_movie_mpaa_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['main']) # Ates Parcasi assert 'mpaa' not in movie def test_movie_runtimes_single_should_be_a_list_in_minutes(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('runtimes', []) == ['136'] def test_movie_runtimes_with_countries_should_include_context(ia): movie = ia.get_movie('0076786', info=['main']) # Suspiria assert movie.get('runtimes', []) == ['98'] def test_movie_runtimes_if_none_should_be_excluded(ia): movie = ia.get_movie('0390244', info=['main']) # Matrix (VG) assert 'runtimes' not in movie def test_movie_countries_if_single_should_be_a_list_of_country_names(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('countries', []) == ['United States'] def test_movie_countries_if_multiple_should_be_a_list_of_country_names(ia): movie = ia.get_movie('0081505', info=['main']) # Shining assert movie.get('countries', []) == ['United Kingdom', 'United States'] # TODO: find a movie with no country def test_movie_country_codes_if_single_should_be_a_list_of_country_codes(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('country codes', []) == ['us'] def test_movie_country_codes_if_multiple_should_be_a_list_of_country_codes(ia): movie = ia.get_movie('0081505', info=['main']) # Shining assert movie.get('country codes', []) == ['gb', 'us'] # TODO: find a movie with no country def 
test_movie_languages_if_single_should_be_a_list_of_language_names(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('languages', []) == ['English'] def test_movie_languages_if_multiple_should_be_a_list_of_language_names(ia): movie = ia.get_movie('0043338', info=['main']) # Ace in the Hole assert movie.get('languages', []) == ['English', 'Spanish', 'Latin'] def test_movie_languages_none_as_a_language_name_should_be_valid(ia): movie = ia.get_movie('2971344', info=['main']) # Matrix (Short) assert movie.get('languages', []) == ['None'] # TODO: find a movie with no language def test_movie_language_codes_if_single_should_be_a_list_of_language_names(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('language codes', []) == ['en'] def test_movie_language_codes_if_multiple_should_be_a_list_of_language_names(ia): movie = ia.get_movie('0043338', info=['main']) # Ace in the Hole assert movie.get('language codes', []) == ['en', 'es', 'la'] def test_movie_language_codes_zxx_as_a_language_code_should_be_valid(ia): movie = ia.get_movie('2971344', info=['main']) # Matrix (Short) assert movie.get('language codes', []) == ['zxx'] # TODO: find a movie with no language def test_movie_colors_if_single_should_be_a_list_of_color_types(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('color info', []) == ['Color'] def test_movie_colors_if_multiple_should_be_a_list_of_color_types(ia): # this used to return multiple colors, now it only returns the first movie = ia.get_movie('0120789', info=['main']) # Pleasantville assert movie.get('color info', []) == ['Black and White'] def test_movie_cast_can_contain_notes(ia): movie = ia.get_movie('0060666', info=['main']) # Manos diane_adelson = movie['cast'][2] assert str(diane_adelson.currentRole) == 'Margaret' assert diane_adelson.notes == '(as Diane Mahree)' def test_movie_colors_if_single_with_notes_should_include_notes(ia): movie = ia.get_movie('0060666', 
info=['main']) # Manos assert movie.get('color info', []) == ['Color::(Eastmancolor)'] def test_movie_colors_if_none_should_be_excluded(ia): movie = ia.get_movie('0389150', info=['main']) # Matrix (TV) assert 'color info' not in movie def test_movie_aspect_ratio_should_be_a_number_to_one(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('aspect ratio') == '2.39 : 1' def test_movie_aspect_ratio_if_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['main']) # Ates Parcasi assert 'aspect ratio' not in movie def test_movie_sound_mix_if_single_should_be_a_list_of_sound_mix_types(ia): movie = ia.get_movie('0063850', info=['main']) # If.... assert movie.get('sound mix', []) == ['Mono'] def test_movie_sound_mix_if_multiple_should_be_a_list_of_sound_mix_types(ia): movie = ia.get_movie('0120789', info=['main']) # Pleasantville assert movie.get('sound mix', []) == ['DTS', 'Dolby Digital', 'SDDS'] def test_movie_sound_mix_if_single_with_notes_should_include_notes(ia): movie = ia.get_movie('0043338', info=['main']) # Ace in the Hole assert movie.get('sound mix', []) == ['Mono::(Western Electric Recording)'] def test_movie_sound_mix_if_multiple_with_notes_should_include_notes(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie.get('sound mix', []) == ['DTS::(Digital DTS Sound)', 'Dolby Digital', 'SDDS'] def test_movie_sound_mix_if_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['main']) # Ates Parcasi assert 'sound mix' not in movie def test_movie_certificates_should_be_a_list_of_certificates(ia): movie = ia.get_movie('1000252', info=['main']) # Doctor Who - Blink assert movie.get('certificates', []) == [ 'Australia:PG::(most episodes)', 'Brazil:12', 'Netherlands:9::(some episodes)', 'New Zealand:PG', 'Singapore:PG', 'South Africa:PG', 'United Kingdom:PG', 'United Kingdom:PG::(DVD rating)', 'United States:TV-PG' ] def test_movie_certificates_if_none_should_be_excluded(ia): movie = 
ia.get_movie('1863157', info=['main']) # Ates Parcasi assert 'certificates' not in movie def test_movie_cast_must_contain_items(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert len(movie.get('cast', [])) > 20 def test_movie_cast_must_be_in_plain_format(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie['cast'][0].data.get('name') == 'Keanu Reeves' def test_movie_misc_sections_must_contain_items(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert len(movie.get('casting department', [])) == 2 def test_movie_misc_sections_must_be_in_plain_format(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert movie['casting department'][0].data.get('name') == 'Tim Littleton' def test_movie_companies_sections_must_contain_items(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert len(movie.get('special effects companies', [])) == 7 def test_movie_box_office_should_be_a_dict(ia): movie = ia.get_movie('0133093', info=['main']) # Matrix assert isinstance(movie.get('box office'), dict) assert len(movie.get('box office', {})) == 3 imdbpy-6.8/tests/test_http_movie_full_credit.py000066400000000000000000000011631351454127000221630ustar00rootroot00000000000000 def test_movie_full_credits(ia): movie = ia.get_movie('0133093', info=['full credits']) # Matrix assert 'cast' in movie assert len(movie['cast']) == 40 def test_movie_full_credits_for_tv_show(ia): movie = ia.get_movie('0098904', info=['full credits']) # Seinfeld assert 'cast' in movie assert len(movie['cast']) == 1314 def test_movie_full_credits_contains_headshot(ia): movie = ia.get_movie('0133093', info=['main', 'full credits']) # Matrix assert 'headshot' in movie['cast'][0] # Keanu Reeves assert 'nopicture' not in movie['cast'][0]['headshot'] # is real headshot, not default imdbpy-6.8/tests/test_http_movie_keywords.py000066400000000000000000000007071351454127000215410ustar00rootroot00000000000000def 
test_movie_keywords_should_be_a_list_of_keywords(ia): movie = ia.get_movie('0133093', info=['keywords']) # Matrix keywords = movie.get('keywords', []) assert 250 <= len(keywords) <= 400 assert {'computer-hacker', 'messiah', 'artificial-reality'}.issubset(set(keywords)) def test_movie_keywords_if_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['keywords']) # Ates Parcasi assert 'keywords' not in movie imdbpy-6.8/tests/test_http_movie_parental_guide.py000066400000000000000000000003171351454127000226520ustar00rootroot00000000000000def test_movie_parental_guide_contains_mpaa_rating(ia): movie = ia.get_movie('0133093', info=['parents guide']) # Matrix assert movie.get('mpaa') == "Rated R for sci-fi violence and brief language" imdbpy-6.8/tests/test_http_movie_plot.py000066400000000000000000000017461351454127000206540ustar00rootroot00000000000000import re def test_movie_summary_should_be_some_text_with_author(ia): movie = ia.get_movie('0133093', info=['plot']) # Matrix plots = movie.get('plot', []) assert 5 <= len(plots) <= 10 kc_plot = '' for plot in plots: if plot.endswith('Kenneth Chisholm'): kc_plot = plot break assert re.match(r'^A computer hacker .*controllers\.::Kenneth Chisholm$', kc_plot) def test_movie_summary_if_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['plot']) # Ates Parcasi assert 'plot' not in movie def test_movie_synopsis_should_be_some_text(ia): movie = ia.get_movie('0133093', info=['plot']) # Matrix synopsis = movie.get('synopsis') assert len(synopsis) == 1 assert re.match(r'^The screen fills with .* three Matrix movies\.$', synopsis[0]) def test_movie_synopsis_if_none_should_be_excluded(ia): movie = ia.get_movie('1863157', info=['plot']) # Ates Parcasi assert 'synopsis' not in movie imdbpy-6.8/tests/test_http_movie_releaseinfo.py000066400000000000000000000010161351454127000221600ustar00rootroot00000000000000def test_movie_release_info_raw_akas_must_be_a_list(ia): movie = ia.get_movie('0133093', info=['release 
info'])  # Matrix
    akas = movie.get('raw akas', [])
    assert len(akas) >= 40
    assert len(akas) == len(movie.get('akas from release info'))


def test_movie_release_info_raw_release_dates_must_be_a_list(ia):
    movie = ia.get_movie('0133093', info=['release info'])  # Matrix
    release_dates = movie.get('raw release dates', [])
    assert len(release_dates) >= 56
    assert len(release_dates) == len(movie.get('release dates'))

imdbpy-6.8/tests/test_http_movie_reviews.py000066400000000000000000000005311351454127000213510ustar00rootroot00000000000000
def test_movie_reviews_should_be_a_list(ia):
    movie = ia.get_movie('0104155', info=['reviews'])  # Dust Devil
    reviews = movie.get('reviews', [])
    assert len(reviews) == 25


def test_movie_reviews_if_none_should_be_excluded(ia):
    movie = ia.get_movie('1863157', info=['reviews'])  # Ates Parcasi
    assert 'reviews' not in movie

imdbpy-6.8/tests/test_http_movie_season_episodes.py000066400000000000000000000017471351454127000230620ustar00rootroot00000000000000
from pytest import mark


def test_series_episodes_should_be_a_map_of_seasons_and_episodes(ia):
    movie = ia.get_movie('0412142', info=['episodes'])  # House M.D.
    # sorted() already returns a list, so no extra list() call is needed
    assert sorted(movie.get('episodes')) == list(range(1, 9))


def test_series_episodes_with_unknown_season_should_have_placeholder_at_end(ia):
    movie = ia.get_movie('0436992', info=['episodes'])  # Doctor Who
    assert sorted(movie.get('episodes')) == [-1] + list(range(1, 13))


@mark.skip('episodes is {} instead of None')
def test_series_episodes_if_none_should_be_excluded(ia):
    movie = ia.get_movie('1000252', info=['episodes'])  # Doctor Who - Blink
    assert 'episodes' not in movie


def test_series_episodes_should_contain_rating_and_votes(ia):
    movie = ia.get_movie('0185906', info=['episodes'])  # Band of Brothers
    episodes = movie.get('episodes')
    rating = episodes[1][1]['rating']
    votes = episodes[1][1]['votes']
    assert 8.3 <= rating <= 9.0
    assert votes > 4400

imdbpy-6.8/tests/test_http_movie_series.py000066400000000000000000000003731351454127000211630ustar00rootroot00000000000000
from pytest import mark


def test_series_full_cast_has_ids(ia):
    movie = ia.get_movie('0412142', info=['full cast'])  # House M.D.
    # all persons must have a personID
    assert [p for p in movie.get('cast', []) if p.personID is None] == []

imdbpy-6.8/tests/test_http_movie_sites.py000066400000000000000000000014141351454127000210150ustar00rootroot00000000000000
def test_movie_official_sites_should_be_a_list(ia):
    movie = ia.get_movie('0133093', info=['official sites'])  # Matrix
    official_sites = movie.get('official sites', [])
    assert len(official_sites) == 1


def test_movie_official_sites_if_none_should_be_excluded(ia):
    movie = ia.get_movie('1863157', info=['official sites'])  # Ates Parcasi
    assert 'official sites' not in movie


def test_movie_sound_clips_should_be_a_list(ia):
    movie = ia.get_movie('0133093', info=['official sites'])  # Matrix
    sound_clips = movie.get('sound clips', [])
    assert len(sound_clips) == 3


def test_movie_sound_clips_if_none_should_be_excluded(ia):
    movie = ia.get_movie('1863157', info=['official sites'])  # Ates Parcasi
    assert 'sound clips' not in movie

imdbpy-6.8/tests/test_http_movie_taglines.py000066400000000000000000000012701351454127000214740ustar00rootroot00000000000000
def test_movie_taglines_if_single_should_be_a_list_of_phrases(ia):
    movie = ia.get_movie('0109151', info=['taglines'])  # Matrix (V)
    taglines = movie.get('taglines', [])
    assert taglines == ["If humans don't want me... why'd they create me?"]


def test_movie_taglines_if_multiple_should_be_a_list_of_phrases(ia):
    movie = ia.get_movie('0060666', info=['taglines'])  # Manos
    taglines = movie.get('taglines', [])
    assert len(taglines) == 3
    assert taglines[0] == "It's Shocking! It's Beyond Your Imagination!"


def test_movie_taglines_if_none_should_be_excluded(ia):
    movie = ia.get_movie('1863157', info=['taglines'])  # Ates Parcasi
    assert 'taglines' not in movie

imdbpy-6.8/tests/test_http_movie_tech.py000066400000000000000000000006061351454127000206130ustar00rootroot00000000000000
def test_movie_tech_sections(ia):
    movie = ia.get_movie('0133093', info=['technical'])
    # default to a dict: the value is used as a mapping below
    tech = movie.get('tech', {})
    assert set(tech) == {'sound mix', 'color', 'aspect ratio', 'camera', 'laboratory',
                         'cinematographic process', 'printed film format',
                         'negative format', 'runtime', 'film length'}

imdbpy-6.8/tests/test_http_movie_votes.py000066400000000000000000000026711351454127000210340ustar00rootroot00000000000000
def test_movie_votes_should_be_divided_into_10_slots(ia):
    movie = ia.get_movie('0133093', info=['vote details'])  # Matrix
    votes = movie.get('number of votes', [])
    assert len(votes) == 10


def test_movie_votes_should_be_integers(ia):
    movie = ia.get_movie('0133093', info=['vote details'])  # Matrix
    votes = movie.get('number of votes', [])
    for vote in votes:
        assert isinstance(vote, int)


def test_movie_votes_median_should_be_an_integer(ia):
    movie = ia.get_movie('0133093', info=['vote details'])  # Matrix
    median = movie.get('median')
    assert median == 9


def test_movie_votes_mean_should_be_numeric(ia):
    movie = ia.get_movie('0133093', info=['vote details'])  # Matrix
    mean = movie.get('arithmetic mean')
    assert 8.5 <= mean <= 9


def test_movie_demographics_should_be_divided_into_multiple_categories(ia):
    movie = ia.get_movie('0133093', info=['vote details'])  # Matrix
    demographics = movie['demographics']
    assert len(demographics) >= 18


def test_movie_demographics_votes_should_be_integers(ia):
    movie = ia.get_movie('0133093', info=['vote details'])  # Matrix
    top1000 = movie['demographics']['top 1000 voters']
    assert 890 <= top1000['votes'] <= 1000


def test_movie_demographics_rating_should_be_numeric(ia):
    movie = ia.get_movie('0133093', info=['vote details'])  # Matrix
    top1000 =
movie['demographics']['top 1000 voters'] assert 8 <= top1000['rating'] <= 8.5 imdbpy-6.8/tests/test_http_person_bio.py000066400000000000000000000105311351454127000206260ustar00rootroot00000000000000import re def test_person_headshot_should_be_an_image_link(ia): person = ia.get_person('0000206', info=['biography']) # Keanu Reeves assert re.match(r'^https?://.*\.jpg$', person['headshot']) def test_person_full_size_headshot_should_be_an_image_link(ia): person = ia.get_person('0000206', info=['biography']) # Keanu Reeves assert re.match(r'^https?://.*\.jpg$', person['full-size headshot']) def test_person_headshot_if_none_should_be_excluded(ia): person = ia.get_person('0330139', info=['biography']) # Deni Gordon assert 'headshot' not in person def test_person_bio_is_present(ia): person = ia.get_person('0000206', info=['biography']) # Keanu Reeves assert 'mini biography' in person def test_person_birth_date_should_be_in_ymd_format(ia): person = ia.get_person('0000001', info=['biography']) # Fred Astaire assert person.get('birth date') == '1899-05-10' def test_person_birth_date_without_month_and_date_should_be_in_y00_format(ia): person = ia.get_person('0565883', info=['biography']) # Belinda McClory assert person.get('birth date') == '1968-00-00' def test_person_birth_date_without_itemprop_should_be_in_ymd_format(ia): person = ia.get_person('0000007', info=['biography']) # Humphrey Bogart assert person.get('birth date') == '1899-12-25' def test_person_birth_notes_should_contain_birth_place(ia): person = ia.get_person('0000001', info=['biography']) # Fred Astaire assert person.get('birth notes') == 'Omaha, Nebraska, USA' def test_person_death_date_should_be_in_ymd_format(ia): person = ia.get_person('0000001', info=['biography']) # Fred Astaire assert person.get('death date') == '1987-06-22' def test_person_death_date_without_itemprop_should_be_in_ymd_format(ia): person = ia.get_person('0000007', info=['biography']) # Humphrey Bogart assert person.get('death date') == 
'1957-01-14' def test_person_death_date_if_none_should_be_excluded(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts assert 'death date' not in person def test_person_death_notes_should_contain_death_place_and_reason(ia): person = ia.get_person('0000001', info=['biography']) # Fred Astaire assert person['death notes'] == 'in Los Angeles, California, USA (pneumonia)' def test_person_death_notes_if_none_should_be_excluded(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts assert 'death notes' not in person def test_person_birth_name_should_be_normalized(ia): data = ia.get_person('0000210', info=['biography']) # Julia Roberts assert data.get('birth name') == 'Julia Fiona Roberts' def test_person_nicknames_if_single_should_be_a_list_of_names(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts assert person.get('nick names') == ['Jules'] def test_person_nicknames_if_multiple_should_be_a_list_of_names(ia): person = ia.get_person('0000206', info=['biography']) # Keanu Reeves assert person.get('nick names') == ['The Wall', 'The One'] def test_person_height_should_be_in_inches_and_meters(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts assert person.get('height') == '5\' 8" (1.73 m)' def test_person_height_if_none_should_be_excluded(ia): person = ia.get_person('0617588', info=['biography']) # Georges Melies assert 'height' not in person def test_person_spouse_should_be_a_list(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts spouses = person.get('spouse', []) assert len(spouses) == 2 def test_person_trade_mark_should_be_a_list(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts trade_mark = person.get('trade mark', []) assert len(trade_mark) == 3 def test_person_trivia_should_be_a_list(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts trivia = person.get('trivia', []) assert len(trivia) > 90 def 
test_person_quotes_should_be_a_list(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts quotes = person.get('quotes', []) assert len(quotes) > 30 def test_person_salary_history_should_be_a_list(ia): person = ia.get_person('0000210', info=['biography']) # Julia Roberts salary = person.get('salary history', []) assert len(salary) > 25 imdbpy-6.8/tests/test_http_person_main.py000066400000000000000000000025771351454127000210140ustar00rootroot00000000000000import re def test_person_headshot_should_be_an_image_link(ia): person = ia.get_person('0000206', info=['main']) # Keanu Reeves assert re.match(r'^https?://.*\.jpg$', person['headshot']) def test_person_name_in_data_should_be_plain(ia): person = ia.get_person('0000206', info=['main']) # Keanu Reeves assert person.data.get('name') == 'Keanu Reeves' def test_person_canonical_name(ia): person = ia.get_person('0000206', info=['main']) # Keanu Reeves assert person.get('canonical name') == 'Reeves, Keanu' def test_person_headshot_if_none_should_be_excluded(ia): person = ia.get_person('0330139', info=['main']) # Deni Gordon assert 'headshot' not in person def test_person_name_should_not_be_canonicalized(ia): person = ia.get_person('0000206', info=['main']) # Keanu Reeves assert person.get('name') == 'Keanu Reeves' def test_person_name_should_not_have_birth_and_death_years(ia): person = ia.get_person('0000001', info=['main']) # Fred Astaire assert person.get('name') == 'Fred Astaire' def test_person_imdb_index_should_be_a_roman_number(ia): person = ia.get_person('0000210', info=['main']) # Julia Roberts assert person.get('imdbIndex') == 'I' def test_person_imdb_index_if_none_should_be_excluded(ia): person = ia.get_person('0000206', info=['main']) # Keanu Reeves assert 'imdbIndex' not in person imdbpy-6.8/tests/test_http_person_otherworks.py000066400000000000000000000012311351454127000222610ustar00rootroot00000000000000def test_person_other_works_should_contain_correct_number_of_works(ia): person = 
ia.get_person('0000206', info=['other works']) # Keanu Reeves other_works = person.get('other works', []) assert len(other_works) == 42 def test_person_other_works_should_contain_correct_work(ia): person = ia.get_person('0000206', info=['other works']) # Keanu Reeves other_works = person.get('other works', []) assert other_works[0].startswith('(1995) Stage: Appeared') def test_person_other_works_if_none_should_be_excluded(ia): person = ia.get_person('0330139', info=['other works']) # Deni Gordon assert 'other works' not in person imdbpy-6.8/tests/test_http_search_company.py000066400000000000000000000031011351454127000214550ustar00rootroot00000000000000def test_search_company_should_list_default_number_of_companies(ia): companies = ia.search_company('pixar') assert len(companies) == 20 def test_search_company_limited_should_list_requested_number_of_companies(ia): companies = ia.search_company('pixar', results=7) assert len(companies) == 7 def test_search_company_unlimited_should_list_correct_number_of_companies(ia): companies = ia.search_company('pixar', results=500) assert 35 <= len(companies) <= 50 def test_search_company_too_many_should_list_upper_limit_of_companies(ia): companies = ia.search_company('pictures', results=500) assert len(companies) >= 199 def test_search_company_if_none_result_should_be_empty(ia): companies = ia.search_company('%e3%82%a2') assert companies == [] def test_search_company_entries_should_include_company_id(ia): companies = ia.search_company('pixar') assert companies[0].companyID == '0348691' def test_search_company_entries_should_include_company_name(ia): companies = ia.search_company('pixar') assert companies[0]['name'] == 'Pixar' def test_search_company_entries_should_include_company_country(ia): companies = ia.search_company('pixar') assert companies[0]['country'] == '[ca]' # shouldn't this be just 'ca'? 
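

# The comment on the country test above asks whether the value should be
# 'ca' rather than '[ca]'. A minimal sketch of how a caller could normalize
# it, assuming the search result keeps returning the bracketed form.
# strip_country_brackets is a hypothetical helper, not part of IMDbPY.

```python
def strip_country_brackets(country):
    """Return the country code without surrounding square brackets.

    Hypothetical helper for illustration only: '[ca]' becomes 'ca',
    and values without a matching pair of brackets are returned unchanged.
    """
    if country.startswith('[') and country.endswith(']'):
        return country[1:-1]
    return country
```

# With such a helper, a test could compare
# strip_country_brackets(companies[0]['country']) against 'ca' and stay
# valid whichever form the parser returns.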
def test_search_company_entries_missing_country_should_be_excluded(ia): companies = ia.search_company('pixar', results=500) company_without_country = [c for c in companies if c.companyID == '0115838'] assert len(company_without_country) == 1 assert 'country' not in company_without_country[0] imdbpy-6.8/tests/test_http_search_keyword.py000066400000000000000000000011271351454127000215010ustar00rootroot00000000000000def test_search_keyword_check_list_of_keywords(ia): keywords = ia.search_keyword('zoolander') assert 'reference-to-zoolander' in keywords def test_search_keyword_if_multiple_should_list_correct_number_of_keywords(ia): keywords = ia.search_keyword('messiah') assert 40 <= len(keywords) <= 60 def test_search_keyword_if_too_many_should_list_upper_limit_of_keywords(ia): keywords = ia.search_keyword('computer') assert len(keywords) == 200 def test_search_keyword_if_none_result_should_be_empty(ia): keywords = ia.search_keyword('%e3%82%a2') assert keywords == [] imdbpy-6.8/tests/test_http_search_movie.py000066400000000000000000000063651351454127000211450ustar00rootroot00000000000000def test_search_movie_if_single_should_list_one_movie(ia): movies = ia.search_movie('od instituta do proizvodnje') assert len(movies) == 1 assert movies[0].movieID == '0483758' assert movies[0]['kind'] == 'short' assert movies[0]['title'] == 'Od instituta do proizvodnje' assert movies[0]['year'] == 1971 def test_search_movie_should_list_default_number_of_movies(ia): movies = ia.search_movie('movie') assert len(movies) == 20 def test_search_movie_limited_should_list_requested_number_of_movies(ia): movies = ia.search_movie('ace in the hole', results=98) assert len(movies) == 98 def test_search_movie_unlimited_should_list_correct_number_of_movies(ia): movies = ia.search_movie('ace in the hole', results=500) assert 185 <= len(movies) <= 200 def test_search_movie_if_too_many_result_should_list_upper_limit_of_movies(ia): movies = ia.search_movie('matrix', results=500) assert len(movies) == 
200 def test_search_movie_if_none_should_be_empty(ia): movies = ia.search_movie('%e4%82%a2', results=500) assert movies == [] def test_search_movie_entries_should_include_movie_id(ia): movies = ia.search_movie('matrix') assert movies[0].movieID == '0133093' def test_search_movie_entries_should_include_movie_title(ia): movies = ia.search_movie('matrix') assert movies[0]['title'] == 'The Matrix' def test_search_movie_entries_should_include_cover_url_if_available(ia): movies = ia.search_movie('matrix') assert 'cover url' in movies[0] def test_search_movie_entries_should_include_movie_kind(ia): movies = ia.search_movie('matrix') assert movies[0]['kind'] == 'movie' def test_search_movie_entries_should_include_movie_kind_if_other_than_movie(ia): movies = ia.search_movie('matrix') tv_series = [m for m in movies if m.movieID == '0106062'] assert len(tv_series) == 1 assert tv_series[0]['kind'] == 'tv series' def test_search_movie_entries_should_include_movie_year(ia): movies = ia.search_movie('matrix') assert movies[0]['year'] == 1999 def test_search_movie_entries_should_include_imdb_index(ia): movies = ia.search_movie('blink') movie_with_index = [m for m in movies if m.movieID == '6544524'] assert len(movie_with_index) == 1 assert movie_with_index[0]['imdbIndex'] == 'IV' def test_search_movie_entries_missing_imdb_index_should_be_excluded(ia): movies = ia.search_movie('matrix') assert 'imdbIndex' not in movies[0] def test_search_movie_entries_should_include_akas(ia): movies = ia.search_movie('matrix') movie_with_aka = [m for m in movies if m.movieID == '0270841'] assert len(movie_with_aka) == 1 assert movie_with_aka[0]['akas'] == ['Matrix Hunters: Kynigoi ston kyvernohoro'] def test_search_movie_entries_missing_akas_should_be_excluded(ia): movies = ia.search_movie('matrix') assert 'akas' not in movies[0] def test_search_movie_episodes_should_include_season_and_number(ia): movies = ia.search_movie('swarley') # How I Met Your Mother S02E07 movie_with_season_and_episode = [m 
        for m in movies if m.movieID == '0875360']
    assert len(movie_with_season_and_episode) == 1
    assert movie_with_season_and_episode[0]['season'] == 2
    assert movie_with_season_and_episode[0]['episode'] == 7
imdbpy-6.8/tests/test_http_search_movie_advanced.py000066400000000000000000000442031351454127000227630ustar00rootroot00000000000000
import sys


def test_search_results_should_include_correct_number_of_works_by_default(ia):
    movies = ia.search_movie_advanced('matrix')
    assert len(movies) == 20


def test_search_results_should_include_correct_number_of_works(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    assert len(movies) > 220


def test_search_results_should_include_correct_number_of_works_if_asked_less_than_available(ia):
    movies = ia.search_movie_advanced('matrix', results=25)
    assert len(movies) == 25


def test_found_movies_should_have_movie_ids(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    assert all(isinstance(m.movieID, str) for m in movies)


def test_found_movies_should_have_titles(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    assert all(isinstance(m['title'], (str, unicode) if sys.version_info < (3,) else str)
               for m in movies)


def test_selected_movie_should_have_correct_kind(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert selected['kind'] == 'movie'


def test_selected_video_should_have_correct_kind(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0295432'][0]
    assert selected['kind'] == 'video movie'


def test_selected_tv_movie_should_have_correct_kind(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '4151794'][0]
    assert selected['kind'] == 'tv movie'


def test_selected_tv_short_should_have_correct_kind(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0274085'][0]
    assert selected['kind'] == 'tv short movie'


def test_selected_tv_series_should_have_correct_kind(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0106062'][0]
    assert selected['kind'] == 'tv series'


def test_selected_ended_tv_series_should_have_correct_kind(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0364888'][0]
    assert selected['kind'] == 'tv series'


def test_selected_tv_episode_should_have_correct_kind(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594932'][0]
    assert selected['kind'] == 'episode'


def test_selected_tv_special_should_have_correct_kind(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '1025014'][0]
    assert selected['kind'] == 'tv special'


def test_selected_video_game_should_have_correct_kind(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0277828'][0]
    assert selected['kind'] == 'video game'


def test_selected_movie_should_have_correct_year(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert selected['year'] == 1999


def test_selected_ended_tv_series_should_have_correct_end_year(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0364888'][0]
    assert selected['end_year'] == 2004


def test_selected_unreleased_movie_should_have_correct_state(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '5359784'][0]
    assert selected['state'] == 'Completed'


def test_selected_movie_should_have_correct_certificate(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert selected['certificates'] == ['R']


def test_selected_movie_should_have_correct_runtime(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert selected['runtimes'] == ['136']


def test_selected_movie_should_have_correct_genres(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert selected['genres'] == ['Action', 'Sci-Fi']


def test_selected_movie_should_have_correct_rating(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert abs(selected['rating'] - 8.7) < 0.5


def test_selected_movie_should_have_correct_number_of_votes(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert selected['votes'] >= 1513744


def test_selected_movie_should_have_correct_metascore(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert abs(selected['metascore'] - 73) < 5


def test_selected_movie_should_have_correct_gross(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert selected['gross'] >= 171479930


def test_selected_movie_should_have_correct_plot(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert selected['plot'].startswith('A computer hacker learns')


def test_selected_movie_should_have_correct_director_imdb_ids(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '1830851'][0]
    assert [p.personID for p in selected['directors']] == ['0649609']


def test_selected_work_should_have_correct_director_name(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '1830851'][0]
    assert [p['name'] for p in selected['directors']] == ['Josh Oreck']


def test_selected_work_should_have_correct_director_imdb_ids_if_multiple(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert [p.personID for p in selected['directors']] == ['0905154', '0905152']


def test_selected_work_should_have_correct_director_names_if_multiple(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert [p['name'] for p in selected['directors']] == ['Lana Wachowski', 'Lilly Wachowski']


def test_selected_work_should_have_correct_cast_imdb_id(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '1830851'][0]
    assert [p.personID for p in selected['cast']] == ['1047143']


def test_selected_work_should_have_correct_cast_name(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '1830851'][0]
    assert [p['name'] for p in selected['cast']] == ['Clayton Watson']


def test_selected_work_should_have_correct_cast_imdb_ids_if_multiple(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert [p.personID for p in selected['cast']] == ['0000206', '0000401', '0005251', '0915989']


def test_selected_work_should_have_correct_cast_names_if_multiple(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert [p['name'] for p in selected['cast']] == [
        'Keanu Reeves',
        'Laurence Fishburne',
        'Carrie-Anne Moss',
        'Hugo Weaving'
    ]


def test_selected_tv_episode_should_have_correct_title(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert selected['title'] == "The Making of 'The Matrix'"


def test_selected_tv_episode_should_have_correct_year(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert selected['year'] == 1999


def test_selected_tv_episode_should_have_correct_imdb_index(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '1072112'][0]
    assert selected['imdbIndex'] == 'I'


def test_selected_tv_episode_should_have_correct_certificate(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '1072112'][0]
    assert selected['certificates'] == ['TV-PG']


def test_selected_tv_episode_should_have_correct_runtime(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert selected['runtimes'] == ['26']


def test_selected_tv_episode_should_have_correct_genres(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert selected['genres'] == ['Documentary', 'Short']


def test_selected_tv_episode_should_have_correct_rating(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert abs(selected['rating'] - 7.6) < 0.5


def test_selected_tv_episode_should_have_correct_number_of_votes(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert selected['votes'] >= 14


def test_selected_tv_episode_should_have_correct_plot(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '10177094'][0]
    assert selected['plot'].startswith('Roberto Leoni reviews The Matrix (1999)')


def test_selected_tv_episode_should_have_correct_director_imdb_ids(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert [p.personID for p in selected['directors']] == ['0649609']


def test_selected_tv_episode_should_have_correct_director_names(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert [p['name'] for p in selected['directors']] == ['Josh Oreck']


def test_selected_tv_episode_should_have_correct_cast_imdb_ids(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert [p.personID for p in selected['cast']] == ['0000401', '0300665', '0303293', '0005251']


def test_selected_tv_episode_should_have_correct_cast_names(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert [p['name'] for p in selected['cast']] == [
        'Laurence Fishburne',
        'John Gaeta',
        "Robert 'Rock' Galotti",
        'Carrie-Anne Moss'
    ]


def test_selected_tv_episode_should_have_series(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert selected['episode of']['kind'] == 'tv series'


def test_selected_tv_episode_should_have_correct_series_imdb_id(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert selected['episode of'].movieID == '0318220'


def test_selected_tv_episode_should_have_correct_series_title(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '0594933'][0]
    assert selected['episode of']['title'] == 'HBO First Look'


def test_selected_tv_episode_should_have_correct_series_year(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '1072112'][0]
    assert selected['episode of']['year'] == 2001


def test_selected_tv_episode_should_have_correct_series_end_year(ia):
    movies = ia.search_movie_advanced('matrix', results=250)
    selected = [m for m in movies if m.movieID == '1072112'][0]
    assert selected['episode of']['end_year'] == 2012


def test_selected_movie_should_have_cover_url(ia):
    movies = ia.search_movie_advanced('matrix', results=50)
    selected = [m for m in movies if m.movieID == '0133093'][0]
    assert selected['cover url'].endswith('.jpg')


def test_search_results_should_include_adult_titles_if_requested(ia):
    movies = ia.search_movie_advanced('matrix', adult=True, results=250)
    movies_no_adult = ia.search_movie_advanced('matrix', adult=False, results=250)
    assert len(movies) > len(movies_no_adult)


def test_selected_adult_movie_should_have_correct_title(ia):
    movies = ia.search_movie_advanced('matrix', adult=True, results=250)
    selected = [m for m in movies if m.movieID == '0273126'][0]
    assert selected['title'] == 'Blue Matrix'


def test_selected_adult_movie_should_have_adult_in_genres(ia):
    movies = ia.search_movie_advanced('matrix', adult=True, results=250)
    selected = [m for m in movies if m.movieID == '0273126'][0]
    assert 'Adult' in selected['genres']


def test_search_results_should_be_sortable_in_alphabetical_order_default_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='alpha')
    titles = [m['title'] for m in movies]
    # assert all(a <= b for a, b in zip(titles, titles[1:]))  # fails due to IMDb
    assert sum(1 if a > b else 0 for a, b in zip(titles, titles[1:])) <= 1


def test_search_results_should_be_sortable_in_alphabetical_order_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='alpha', sort_dir='asc')
    titles = [m['title'] for m in movies]
    # assert all(a <= b for a, b in zip(titles, titles[1:]))  # fails due to IMDb
    assert sum(1 if a > b else 0 for a, b in zip(titles, titles[1:])) <= 1


def test_search_results_should_be_sortable_in_alphabetical_order_descending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='alpha', sort_dir='desc')
    titles = [m['title'] for m in movies]
    assert all(a >= b for a, b in zip(titles, titles[1:]))


def test_search_results_should_be_sortable_in_rating_order_default_descending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='user_rating')
    ratings = [m.get('rating', 0) for m in movies]
    assert all(a >= b for a, b in zip(ratings, ratings[1:]))


def test_search_results_should_be_sortable_in_rating_order_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='user_rating', sort_dir='asc')
    ratings = [m.get('rating', float('inf')) for m in movies]
    assert all(a <= b for a, b in zip(ratings, ratings[1:]))


def test_search_results_should_be_sortable_in_rating_order_descending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='user_rating', sort_dir='desc')
    ratings = [m.get('rating', 0) for m in movies]
    assert all(a >= b for a, b in zip(ratings, ratings[1:]))


def test_search_results_should_be_sortable_in_votes_order_default_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='num_votes')
    votes = [m.get('votes', float('inf')) for m in movies]
    assert all(a <= b for a, b in zip(votes, votes[1:]))


def test_search_results_should_be_sortable_in_votes_order_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='num_votes', sort_dir='asc')
    votes = [m.get('votes', float('inf')) for m in movies]
    assert all(a <= b for a, b in zip(votes, votes[1:]))


def test_search_results_should_be_sortable_in_votes_order_descending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='num_votes', sort_dir='desc')
    votes = [m.get('votes', 0) for m in movies]
    assert all(a >= b for a, b in zip(votes, votes[1:]))


def test_search_results_should_be_sortable_in_gross_order_default_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='boxoffice_gross_us')
    grosses = [m.get('gross', float('inf')) for m in movies]
    assert all(a <= b for a, b in zip(grosses, grosses[1:]))


def test_search_results_should_be_sortable_in_gross_order_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='boxoffice_gross_us', sort_dir='asc')
    grosses = [m.get('gross', float('inf')) for m in movies]
    assert all(a <= b
               for a, b in zip(grosses, grosses[1:]))


def test_search_results_should_be_sortable_in_gross_order_descending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='boxoffice_gross_us', sort_dir='desc')
    grosses = [m.get('gross', 0) for m in movies]
    assert all(a >= b for a, b in zip(grosses, grosses[1:]))


def test_search_results_should_be_sortable_in_runtime_order_default_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='runtime')
    # results expose 'runtimes' (a list of strings), not 'runtime'; float() is used
    # because int(float('inf')) raises OverflowError for entries without a runtime
    runtimes = [float(m.get('runtimes', [float('inf')])[0]) for m in movies]
    assert all(a <= b for a, b in zip(runtimes, runtimes[1:]))


def test_search_results_should_be_sortable_in_runtime_order_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='runtime', sort_dir='asc')
    # float() rather than int(): int(float('inf')) raises OverflowError
    runtimes = [float(m.get('runtimes', [float('inf')])[0]) for m in movies]
    assert all(a <= b for a, b in zip(runtimes, runtimes[1:]))


def test_search_results_should_be_sortable_in_runtime_order_descending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='runtime', sort_dir='desc')
    # missing runtimes default to 0, matching the other descending tests
    runtimes = [float(m.get('runtimes', [0])[0]) for m in movies]
    assert all(a >= b for a, b in zip(runtimes, runtimes[1:]))


def test_search_results_should_be_sortable_in_year_order_default_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='year')
    years = [m.get('year', float('inf')) for m in movies]
    assert all(a <= b for a, b in zip(years, years[1:]))


def test_search_results_should_be_sortable_in_year_order_ascending(ia):
    movies = ia.search_movie_advanced(title='matrix', sort='year', sort_dir='asc')
    years = [m.get('year', float('inf')) for m in movies]
    assert all(a <= b for a, b in zip(years, years[1:]))


# def test_search_results_should_be_sortable_in_year_order_descending(ia):
#     movies = ia.search_movie_advanced(title='matrix', sort='year', sort_dir='desc')
#     years = [m.get('year', float('inf')) for m in movies]
#     assert all(a >= b for a, b in zip(years, years[1:]))
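The sort-order tests above repeat the same pairwise comparison with a direction-dependent default for missing values. A minimal sketch of how that pattern could be factored out is shown here. The helper names `sort_keys` and `is_ordered` are hypothetical and not part of IMDbPY, and plain dicts stand in for the movie objects returned by `search_movie_advanced`.

```python
def sort_keys(items, key, descending=False):
    """Collect `key` from dict-like items (hypothetical helper).

    Items without the key get a sentinel that sorts last in the
    requested direction: +inf when ascending, -inf when descending.
    """
    default = float('-inf') if descending else float('inf')
    return [item.get(key, default) for item in items]


def is_ordered(values, descending=False):
    """Return True if consecutive values are in non-strict order."""
    pairs = zip(values, values[1:])
    if descending:
        return all(a >= b for a, b in pairs)
    return all(a <= b for a, b in pairs)


if __name__ == '__main__':
    # Stand-in data; real tests would pass the search results instead.
    ascending = [{'year': 1999}, {'year': 2003}, {}]
    print(is_ordered(sort_keys(ascending, 'year')))
```

With helpers like these, a test body would reduce to a single assertion such as `assert is_ordered(sort_keys(movies, 'year'))`. Whether missing values should sort last in both directions is an assumption about how IMDb orders its results.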
imdbpy-6.8/tests/test_http_search_movie_keyword.py000066400000000000000000000015171351454127000227030ustar00rootroot00000000000000
def test_get_keyword_should_list_correct_number_of_movies(ia):
    movies = ia.get_keyword('colander')
    assert len(movies) == 5


def test_get_keyword_if_too_many_should_list_upper_limit_of_movies(ia):
    movies = ia.get_keyword('computer')
    assert len(movies) == 50


def test_get_keyword_entries_should_include_movie_id(ia):
    movies = ia.get_keyword('colander')
    assert movies[0].movieID == '0382932'


def test_get_keyword_entries_should_include_movie_title(ia):
    movies = ia.get_keyword('colander')
    assert movies[0]['title'] == 'Ratatouille'


def test_get_keyword_entries_should_include_movie_kind(ia):
    movies = ia.get_keyword('colander')
    assert movies[0]['kind'] == 'movie'


def test_get_keyword_entries_should_include_movie_year(ia):
    movies = ia.get_keyword('colander')
    assert movies[0]['year'] == 2007
imdbpy-6.8/tests/test_http_search_person.py000066400000000000000000000046101351454127000213230ustar00rootroot00000000000000
from pytest import mark


def test_search_person_should_list_default_number_of_people(ia):
    people = ia.search_person('julia')
    assert len(people) == 20


def test_search_person_limited_should_list_requested_number_of_people(ia):
    people = ia.search_person('julia', results=11)
    assert len(people) == 11


def test_search_person_unlimited_should_list_correct_number_of_people(ia):
    people = ia.search_person('engelbart', results=500)
    assert 120 <= len(people) <= 150


def test_search_person_if_too_many_should_list_upper_limit_of_people(ia):
    people = ia.search_person('john', results=500)
    assert len(people) == 200


def test_search_person_if_none_result_should_be_empty(ia):
    people = ia.search_person('%e3%82%a2')
    assert people == []


def test_search_person_entries_should_include_person_id(ia):
    people = ia.search_person('julia roberts')
    assert people[0].personID == '0000210'


def test_search_person_entries_should_include_person_name(ia):
    people = ia.search_person('julia roberts')
    assert people[0]['name'] == 'Julia Roberts'


def test_search_person_entries_should_include_headshot_if_available(ia):
    people = ia.search_person('julia roberts')
    assert 'headshot' in people[0]


def test_search_person_entries_with_aka_should_exclude_name_in_aka(ia):
    people = ia.search_person('julia roberts')
    robertson = None
    for person in people:
        if person['name'] == 'Julia Robertson':
            robertson = person
            break
    assert robertson
    assert robertson['name'] == 'Julia Robertson'


def test_search_person_entries_should_include_person_index(ia):
    people = ia.search_person('julia roberts')
    assert people[0]['imdbIndex'] == 'I'


@mark.skip(reason="no persons without imdbIndex in the first 20 results")
def test_search_person_entries_missing_index_should_be_excluded(ia):
    people = ia.search_person('julia roberts')
    assert 'imdbIndex' not in people[3]


@mark.skip(reason="AKAs no longer present in results?")
def test_search_person_entries_should_include_akas(ia):
    people = ia.search_person('julia roberts')
    person_with_aka = [p for p in people if p.personID == '4691618']
    assert len(person_with_aka) == 1
    assert person_with_aka[0]['akas'] == ['Julia Robertson']


def test_search_person_entries_missing_akas_should_be_excluded(ia):
    people = ia.search_person('julia roberts')
    assert 'akas' not in people[0]
imdbpy-6.8/tests/test_in_operator.py000066400000000000000000000013411351454127000177500ustar00rootroot00000000000000
def test_person_in_movie(ia):
    movie = ia.get_movie('0133093', info=['main'])  # Matrix
    person = ia.get_person('0000206', info=['main'])  # Keanu Reeves
    assert person in movie


def test_key_in_movie(ia):
    movie = ia.get_movie('0133093', info=['main'])  # Matrix
    assert 'cast' in movie


def test_movie_in_person(ia):
    movie = ia.get_movie('0133093', info=['main'])  # Matrix
    person = ia.get_person('0000206', info=['main'])  # Keanu Reeves
    assert movie in person


def test_key_in_person(ia):
    person = ia.get_person('0000206')  # Keanu Reeves
    assert 'filmography' in person


def test_key_in_company(ia):
    company = ia.get_company('0017902', info=['main'])  # Pixar
    assert 'name' in company
imdbpy-6.8/tests/test_xml.py000066400000000000000000000003551351454127000162330ustar00rootroot00000000000000
import xml.etree.ElementTree as ET


def test_movie_xml(ia):
    movie = ia.get_movie('0133093')  # Matrix
    movie_xml = movie.asXML()
    movie_xml = movie_xml.encode('utf8', 'ignore')
    assert ET.fromstring(movie_xml) is not None
imdbpy-6.8/tox.ini000066400000000000000000000007061351454127000141730ustar00rootroot00000000000000
[tox]
envlist = py{37,36,35,34,27}, pypy{36,35,27}, docs

[testenv]
deps =
    pytest
    lxml
commands = {posargs:pytest}

[testenv:pypy36]
basepython = pypy3.6

[testenv:pypy35]
basepython = pypy3.5

[testenv:pypy27]
basepython = pypy2.7

[testenv:style]
deps =
    flake8
    flake8-isort
commands = python setup.py flake8

[testenv:docs]
changedir = docs/
deps =
    sphinx
    sphinx_rtd_theme
commands = sphinx-build -b html ./ _build/