././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6418371 sahara-plugin-spark-10.0.0/0000775000175000017500000000000000000000000015464 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/.stestr.conf0000664000175000017500000000010000000000000017724 0ustar00zuulzuul00000000000000[DEFAULT] test_path=./sahara_plugin_spark/tests/unit top_dir=./ ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/.zuul.yaml0000664000175000017500000000036300000000000017427 0ustar00zuulzuul00000000000000- project: templates: - check-requirements - openstack-python3-zed-jobs - publish-openstack-docs-pti - release-notes-jobs-python3 check: jobs: - sahara-buildimages-spark: voting: false ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 sahara-plugin-spark-10.0.0/AUTHORS0000664000175000017500000000670100000000000016540 0ustar00zuulzuul00000000000000Adrien Vergé Alexander Aleksiyants Alexander Ignatov Alok Jani Andreas Jaeger Andreas Jaeger Andrew Lazarev Andrey Pavlov Artem Osadchiy Bo Wang Cao Xuan Hoang Chad Roberts ChangBo Guo(gcb) Charles Short Corey Bryant Daniele Venzano Dao Cong Tien Davanum Srinivas DennyZhang Dexter Fryar Dina Belova Dirk Mueller Dmitry Mescheryakov Doug Hellmann Duan Jiong Ethan Gafford Evgeny Sikachev Ghanshyam Mann Guo Shan Hervé Beraud Iwona Kotlarska James E. Blair Javeme Jaxon Wang Jeremy Freudberg Jeremy Stanley Jesse Pretorius Jon Maron Joseph D Natoli Julien Danjou Li, Chen Luigi Toscano Luong Anh Tuan Marianne Linhares Monteiro Matthew Farrellee Michael Ionkin Michael Krotscheck Michael McCune Michael McCune Nadya Privalova Ngo Quoc Cuong Nguyen Hai Nikita Konovalov Nikolay Starodubtsev Ondřej Nový OpenStack Release Bot PanFengyun PavlovAndrey Ronald Bradford Ruslan Kamaldinov Sean McGinnis Sergey Gotliv Sergey Lukjanov Sergey Lukjanov Sergey Reshetnyak Sergey Reshetnyak Sergey Vilgelm Shu Yingya Telles Nobrega Telles Nobrega Thierry Carrez Thomas Bechtold Tim Kelsey Trevor McKay Vadim Rovachev Vitaly Gridnev Yaroslav Lobankov ZhongShengping akhiljain23 anguoming artemosadchiy caoyue chenpengzi <1523688226@qq.com> chenxing dmitryme lcsong nizam pawnesh.kumar pengyuesheng ricolin zhanghongtao zhouyunfeng ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/CONTRIBUTING.rst0000664000175000017500000000124700000000000020131 0ustar00zuulzuul00000000000000The source repository for this project can be found at: https://opendev.org/openstack/sahara-plugin-spark Pull requests submitted through GitHub are not monitored. 
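For example, to get a local checkout and run the unit tests before pushing a change (a minimal sketch; it assumes ``tox`` is installed locally and uses environment names commonly defined in the repository's ``tox.ini``):

.. code-block:: console

   # Clone from the canonical repository rather than the GitHub mirror.
   $ git clone https://opendev.org/openstack/sahara-plugin-spark
   $ cd sahara-plugin-spark

   # Run the Python unit tests and the style checks.
   $ tox -e py38
   $ tox -e pep8
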
To start contributing to OpenStack, follow the steps in the contribution guide to set up and use Gerrit: https://docs.openstack.org/contributors/code-and-documentation/quick-start.html Bugs should be filed on Storyboard: https://storyboard.openstack.org/#!/project/openstack/sahara-plugin-spark For more specific information about contributing to this repository, see the sahara-plugin-spark contributor guide: https://docs.openstack.org/sahara-plugin-spark/latest/contributor/contributing.html ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 sahara-plugin-spark-10.0.0/ChangeLog0000664000175000017500000006455700000000000017257 0ustar00zuulzuul00000000000000CHANGES ======= 10.0.0 ------ * Update master for stable/zed * Dropping lower constraints testing and remove py36,py37 support 8.0.0 ----- * Update master for stable/xena 6.0.0 ----- * Update master for stable/wallaby 5.0.0 ----- * Imported Translations from Zanata * Fix reqs (focal), remove linters from l-r * Add Python3 wallaby unit tests * Update master for stable/victoria 4.0.0 ----- * Fix URL of Maven Central Repository * Use unittest.mock instead of mock * Switch to newer openstackdocstheme and reno versions * Fix hacking min version to 3.0.1 * Imported Translations from Zanata * Imported Translations from Zanata * Bump default tox env from py37 to py38 * Add py38 package metadata * Add Python3 victoria unit tests * Update master for stable/ussuri 3.0.0 ----- * Ussuri contributor docs community goal * Cleanup py27 support * Update hacking for Python3 * fix: typo in tox minversion option * [ussuri][goal] Drop python 2.7 support and testing * Switch to Ussuri jobs * Imported Translations from Zanata * Imported Translations from Zanata * Update master for stable/train 2.0.0.0rc1 ---------- * Imported Translations from Zanata * Update the constraints url * Doc updates: bump theme to 1.20.0, add PDF build * Imported Translations from Zanata * Limit envlist to py37 for Python 3 Train goal * Update sphinx from current requirements * Update Python 3 test runtimes for Train * Replace git.openstack.org URLs with opendev.org URLs * OpenDev Migration Patch * Dropping the py35 testing * Update master for stable/stein 1.0.0 ----- 1.0.0.0b1 --------- * Add the buildimages job to the check queue * Adding string to int conversion * add python 3.7 unit test job * Adding Spark to sahara-image-pack * Reduce the dependencies, add more common Zuul jobs * Sync the tox.ini files with the other plugins * Update mailinglist from dev to discuss * Migrate away from oslo\_i18n.enable\_lazy() * Fix translations: add the babel.cfg file * Post-import fixes: name, license, doc, translations * Updating plugin documentation and release notes * Add .gitreview and basic Zuul jobs * Plugins splitted from sahara core * Add framework for sahara-status upgrade check * doc: restructure the image building documentation * Cleanup tox.ini constraint handling * doc: update distro information and cloud-init users * Imported Translations from Zanata * Update reno for stable/rocky * Imported Translations from Zanata * S3 data source * Switch the coverage tox target to stestr * Updating Spark versions * Switch ostestr to stestr * Bump Flask version according requirements * Remove any reference to pre-built images * Updating plugins status for Rocky * fix tox python3 overrides * doc: add the redirect for a file recently renamed * Remove the (now obsolete) pip-missing-reqs tox target * uncap eventlet * Follow the new PTI for 
document build * Updated from global requirements * add lower-constraints job * Migration to Storyboard * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Imported Translations from Zanata * Updated from global requirements * Updated from global requirements * Imported Translations from Zanata * Update reno for stable/queens * Replace chinese quotes * Enable hacking-extensions H204, H205 * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * S3 job binary and binary retriever * Updated from global requirements * Updated from global requirements * Updated from global requirements * Upgrading Spark to version 2.2 * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Remove setting of version/release from releasenotes * Updated from global requirements * Updated from global requirements * Incorrect indent Sahara Installation Guide in sahara * Updated from global requirements * Policy in code for Sahara * TrivialFix: Redundant alias in import statement * Updated from global requirements * Updated from global requirements * Add default configuration files to data\_files * Updated from global requirements * Updated from global requirements * Updated from global requirements * [ut] replace .testr.conf with .stestr.conf * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * doc: point to the main git repository and update links * Updated from global requirements * Updated from global requirements * Imported Translations from Zanata * Update reno for stable/pike * Updated from global requirements * Restructure the documentation according the new spec * Deprecate Spark 1.3.1 * Enable some off-by-default checks * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Update the documentation link for doc migration * Update Documention link * Updated from global requirements * Enable warnings as errors for doc building * Enable H904 check * doc: update the configuration of the theme * Updated from global requirements * doc: switch to openstackdocstheme and add metadata * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Raise better exception for Spark master validation * Updated from global requirements * Basic script for pack-based build image * Remove usage of parameter enforce\_type * Updated from global requirements * Updated from global requirements * Remove log translations * Updated from global requirements * Remove log translations * Remove log translations * Remove log translations * Upgrading Spark version to 2.1.0 * Updated from global requirements * Apply monkeypatching from eventlet before the tests starts * install saharaclient from pypi if not from source * Fix some reST field lists in docstrings * Add ability to install with Apache in devstack * Support Job binary pluggability * Updated from global requirements * Updated from 
global requirements * Support Data Source pluggability * Indicating the location tests directory in oslo\_debug\_helper * Fix api-ref build * Updated from global requirements * [Fix gate]Update test requirement * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Remove support for py34 * Update reno for stable/ocata * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * spelling fixed * Updated from global requirements * Updated from global requirements * Remove enable\_notifications option * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Constraints are ready to be used for tox.ini * Enable release notes translation * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Update reno for stable/newton * fix docs env * Remove entry point of sahara tempest plugin * Updated from global requirements * Remove Tempest-like tests for clients (see sahara-tests) * standardize release note page ordering * Updated from global requirements * Updated from global requirements * Updating DOC on floating IPs change * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Designate integration * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Add Python 3.5 classifier and venv * CLI for Plugin-Declared Image Declaration * Simplify tox hacking rule to match other projects * improvements on api for plugins * Updated from global requirements * Updated from global requirements * fix building api ref docs * Updated from global requirements * Updated from global requirements * Updated from global requirements * novaclient.v2.images to glanceclient migration * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Moving WADL docs to Sahara repository * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Remove hdp 2.0.6 plugin * Updated from global requirements * Updated from global requirements * Updated from global 
requirements * Updated from global requirements * Remove openstack/common related stuff * Updated from global requirements * Updated from global requirements * keystoneclient to keystoneauth migration * PrettyTable and rfc3986 are no longer used in tests * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Bandit password tests * Add hadoop openstack swift jar to ambari cluster * Move bandit to pep8 * Revert "Remove PyMySQL and psycopg2 from test-requirements.txt" * Remove PyMySQL and psycopg2 from test-requirements.txt * Update reno for stable/mitaka * register the config generator default hook with the right name * Updated from global requirements * Moved CORS middleware configuration into oslo-config-generator * Updated from global requirements * No longer necessary to specify jackson-core-asl in spark classpath * Updated from global requirements * Use ostestr instead of the custom pretty\_tox.sh * Updated from global requirements * Updated from global requirements * Remove support for spark 1.0.0 * Updated from global requirements * Updated from global requirements * Add support running Sahara as wsgi app * Python3: Fix using dictionary keys() * Await start datanodes in Spark plugin * Updated from global requirements * Added support of Spark 1.6.0 * Distributed periodic tasks implementation * Updated from global requirements * Remove outdated pot files * Updated from global requirements * Updated from global requirements * Remove scenario tests and related files * Updated from global requirements * Updated from global requirements * Updated from global requirements * add debug testenv in tox * Updated from global requirements * Updated from global requirements * Updated from global requirements * Ensure default arguments are not mutable * Initial key manager implementation * Updated from global requirements * Updated from global requirements * Don't configure hadoop.tmp.dir in Spark plugin * Updated from global requirements * Deprecated tox -downloadcache option removed * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * test: make enforce\_type=True in CONF.set\_override * Remove version from setup.cfg * Force releasenotes warnings to be treated as errors * Updated from global requirements * Updated from global requirements * Updated from global requirements * Drop direct engine support * Remove old integration tests for sahara codebase * Updated from global requirements * Updated from global requirements * Add "unreleased" release notes page * Support reno for release notes management * Updated from global requirements * Fix doc8 check failures * Updated from global requirements * Run py34 first in default tox run * Updated from global requirements * Publish sample conf to docs * Move doc8 dependency to test-requirements.txt * Fix E005 bashate error * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Add testresources used by oslo.db fixture * Updated from global requirements * Updated from global requirements * Updated from global requirements * Open Mitaka development * Updated from global requirements * Adapt python client tests to use Tempest plugin interface * Formatting and 
mounting methods changed for ironic * Updated from global requirements * Register SSL cert in Java keystore to access to swift via SSL * Updated from global requirements * Updated from global requirements * Remove useless test dependency 'discover' * Disable autotune configs for scaling old clusters * Adding support for the Spark Shell job * Updated from global requirements * Updated from global requirements * Ensure working dir is on driver class path for Spark/Swift * Updated from global requirements * Updated from global requirements * New version of HDP plugin * Updated from global requirements * Updated from global requirements * Support manila shares as binary store * Add script to report uncovered new lines * Updated from global requirements * Updated from global requirements * Mount share API * Updated from global requirements * Updated from global requirements * Add recommendation support for Spark plugin * Cleanup .gitignore * Ignore .eggs directory in git * Remove openstack.common package * updating documentation on devstack usage * Deprecate Spark 1.0.0 * Updated from global requirements * Updated from global requirements * Spark job for Cloudera 5.3.0 and 5.4.0 added * Add py34 to envlist * Add bashate check for devstack scripts * Updated from global requirements * Support Spark 1.3.1 * Updated from global requirements * Updated from global requirements * pass environment variables of proxy to tox * Switch to oslo.service * Updated from global requirements * Updated from global requirements * Update version for Liberty * Removed dependency on Spark plugin in edp code * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Use PyMySQL as MySQL DB driver for unit tests * Updated from global requirements * Improve compatible with python3 * Updated from global requirements * Remove sqlalchemy-migrate from test-requirements * Updated from global requirements * Adding basic bandit config * Updated from global requirements * Fixing log messages to avoid information duplication * Updated from global requirements * Updated from global requirements * Open Liberty development * Migrate to oslo.policy lib instead of copy-pasted oslo-incubator * Add unit-tests for new integration tests * Leverage dict comprehension in PEP-0274 * Add a CLI tool for managing default templates * Add validation in new integration tests * Updated from global requirements * Remove the sahara.conf.sample file * Implement job-types endpoint support methods for Spark plugin * Implement poll util and plugin poll util * Rewrite malformed imports order * Rewrite log levels and messages * Adding barbican client and keymgr module * Updated from global requirements * Updated from global requirements * Add support of several scenario files in integration tests * Collect errors in new integration tests * Add support for oslo\_debug\_helper to tox.ini * Updated from global requirements * Updated from global requirements * Reorganized heat template generation code * Add provision steps to Spark Plugin * New integration tests - base functional * Updated from global requirements * Refactor MapR plugin for Sahara * Updated from global requirements * Add ability to get cluster\_id directly from instance * Adding validation check for Spark plugin * Updated from global requirements * Fixed bug with spark scaling * Using oslo\_\* instead of oslo.\* * Updated from global requirements * Add Swift integration with Spark * Remove log module from 
common modules * Drop cli/sahara-rootwrap * Spark Temporary Job Data Retention and Cleanup * Updated from global requirements * Migrate to oslo.log * Updated from global requirements * Updated from global requirements * Updated from global requirements * Remove useless packages from requirements * Use pretty-tox for better test output * Updated from global requirements * Move to hacking 0.10 * Added ability to listen HTTPS port * Updated from global requirements * Updated from global requirements * Updated from global requirements * Adding Storm entry point to setup.cfg * Cleaned up config generator settings * Extracted config check from pep8 to separate env * Migrate to oslo.concurrency * Updated from global requirements * Updated from global requirements * Migrate to oslo.context * Updated from global requirements * Updated from global requirements * Inherit Context from oslo * Add list of open ports for Spark plugin * Adding uuids to exceptions * Remove py26 from tox * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Remove oslo-incubator's gettextutils * Drop obsolete oslo-confing-generator * MapR plugin implementation * Use oslo.middleware instead of copy-pasted * Updated from global requirements * Updated from global requirements * Updated from global requirements * Add bashate checks * Fix bashate errors * Updated from global requirements * Moved exceptions.py and utils.py up to plugins dir * Adding support for oslo.rootwrap to namespace access * Fix working Spark with cinder volumes * Fixed volumes configuration in spark plugin * Fix working Spark with cinder volumes * Support Cinder API version 2 * Fixed volumes configuration in spark plugin * Open Kilo development * Updated from global requirements * Add pip-missing-reqs tox env * Add genconfig tox env * Updated from global requirements * Updated from global requirements * Moved validate\_edp from plugin SPI to edp\_engine * Migrate to oslo.serialization * Add warn re sorting requirements * Add doc8 tox env * Removed comment about hashseed reset in unit tests * Minor change - removed unnessary parentheses * Updated from global requirements * Use auth\_token from keystonemiddleware * Updated from global requirements * Made EDP engine plugin specific * Do not rely on hash ordering in tests * Correction of words decoMMiSSion-decoMMiSSioning * Updated from global requirements * Add translation support to plugin modules * Migration to oslo.utils * Update oslo.messaging to alpha/juno version * Update oslo.config to the alpha/juno version * Updated from global requirements * Removed a duplicate directive * Set python hash seed to 0 in tox.ini * Implement EDP for a Spark standalone cluster * Add CDH plugin to Sahara * Add rm from docs env to whitelist to avoid warn * Migration to oslo.db * Add translation support to upper level modules * Updated from global requirements * Create an option for Spark path * Update oslo-incubator lockutils module * Updated from global requirements * Use oslo.i18n * Add oslo.i18n lib to requirements * Remove docutils pin * Switched Sahara unit tests base class to oslotest * Updated from global requirements * Corrected a number of pep8 errors * Updated from global requirements * Updated from global requirements * Fixed number of hacking errors * Updated from global requirements * Implement scaling for Spark clusters * Fixed H405 pep8 style check * Updated from global requirements * Migrated integration tests to 
testtools * Updated from global requirements * Fixed E265 pep8 * Added new hacking version to requirements * Updated from global requirements * Hided not found logger messages in unit tests * Migrated unit tests to testtools * Use in-memory sqlite DB for unit tests * Add Spark 1.0.0 to the version list * Updated from global requirements * Sync the latest DB code from oslo-incubator * Updated from global requirements * Add Spark plugin to Sahara * Updated from global requirements * Updated from global requirements * Updated from global requirements * Split sahara into sahara-api and sahara-engine * Add sahara-all binary * Fix eventlet monkey patch and threadlocal usage * Add simple fake plugin for testing * Updated from global requirements * Remove IDH plugin from sahara * Add \*.log files to gitignore * Updated from global requirements * Check that all po/pot files are valid * Open Juno dev * Updated from global requirements * Remove agent remote * Updated from global requirements * Move integration tests to python-saharaclient 0.6.0 * Change remaining savanna namespaces in setup.cfg * Renaming files with savanna words in its names * Move the savanna subdir to sahara * Update i18n config due to the renaming * Updated from global requirements * Updated from global requirements * Make savanna able to be executed as sahara * Updated from global requirements * Add alias 'direct' for savanna/direct engine * Intial Agent remote implementation * Updated from global requirements * Updated from global requirements * Updated from global requirements * Updated from global requirements * Auto generate and check config sample * Switch over to oslosphinx * Further preparation for transition to guest agent * Sync with global requirements * Make remote pluggable * Sync with global-requirements * Remove kombu from requirements * Bump stevedore to >=0.14 * Updated from global requirements * Updated from global requirements * Add integration test for Oozie java action * Updated from global requirements * Add alembic migration tool to sqlalchemy * Add missed i18n configs to setup.cfg * Extract common part of instances.py and instances\_heat.py * Adding IDH plugin basic implementation * Removal of AUTHORS file from repo * Launch integration tests with testr * Provisioning via Heat * Migrating to testr * Sync requirements: pin Sphinx to <1.2 * Bump savanna client used for tests to >= 0.4.0 * Make infrastructure engine pluggable * Use stevedore for plugins loading * There is no sense to keep py33 in tox envs * Revert "Support building wheels (PEP-427)" * Bump version to 2014.1 * Support building wheels (PEP-427) * Hacking contains all needed requirements * Fix style errors and upgrade hacking * Enable network operations over neutron private nets * Sync with global-requirements * Use release version of python-savannaclient * Add lower bound for the six dep * Use python-savannaclient 0.3.rc4 * Use savanna client 0.3-rc3 * Added EDP testing * Move swift client to runtime requirements * Hide savanna-subprocess endpoint from end users * Added rack topology configuration for hadoop cluster * Bump savanna client version to 0.3-rc2 * Add missing package dependency for test\_requirements.txt * Sync with global requirements * Replace copy-pasted sphinx theme with oslo.sphinx * Migration to new integration tests * Revert bump of alembic version * Integration test refactoring * Bump oslo.config version to use Havana release * Add default sqlite db to .gitignore * Sync requirements with global requirements * Remove version 
pbr pins from setup\_requires * Floating ip assignement support * Add direct dependency on iso8601 * Wrapping ssh calls into subprocesses * Sync requirements with os/requirements * Use setup.py develop for tox install * Move Babel from test to runtime requirements * Sync requirements with global-requirements * Get rid of pycrypto dep * Install configs to share/savanna from etc/savanna * Migrate to pbr * First steps for i18n support * Limit requests version * Sync OpenStack commons with oslo-incubator * Migrate to Conductor * Sync with global requirements * Raise eventlet to 0.13.0 * Bump hacking to 0.7 * Made Ambari RPM location configurable * Bump version to 0.3 * Fix docs build * Fix requests version * Add check S361 for imports of savanna.db module * Update requirements to the latest versions * Improve coverage calculation * Created savanna-db-manage script for new DB * Workflow creator * Docs build fixed * Enforce hacking >=0.6.0 * Allow sqlalchemy 0.8.X * Move requirements files to the common place * Use console\_scripts instead of bin * Cluster scaling: deletion * Added config tests * Fix author/homepage in setup.py * Rollback sitepackages fix for tox.ini * Fix pep8 and pycrypto versions, fix tox.ini * Posargs has been added to the flake8 command * License hacking tests has been added * XML coverage report added (cobertura) * Add cover report to .gitignore * Heap Size can be applied for Hadoop services now * Vanilla plugin configs are more informative now * Helper for Swift integration was added * Implementation of Vanilla Plugin * Adding lintstack to support pylint testing * Enable all code style tests * Introduce py33 to tox.ini * AUTHORS added to the repo * Initial version of Savanna v0.2 * .gitignore updated * cscope.out has been added to .gitignore * bump version to 0.1.2 * Adds xml hadoop config generating * Implements integration tests * Re-add setuptools-git to setup.py * All tools modev to tox * bump version to 0.1.1 * Remove an invalid trove classifier * setup.py has been improved * setuptools-get has been removed from deps * AUTHORS and ChangeLog has been added to .gitignore * resources has been added to sdist tarball * savanna-manage added to the scripts section of setup.py * Several fixes in tools and docs * Tools has been improved * savanna-manage has been added; reset-db/gen-templates moved to it * Author email has been fixed * dev-conf is now supported * some confs cleanup, pyflakes added to tox * simple tox.ini has been added * eho -> savanna * Build docs is now implemented using setup.py * setup utils is now from oslo-incubator * setup.py has been added * using conf files instead of hardcoded values * pylint and pyflakes static analysis has been added * nosetests.xml added to .gitignore * \*.db added to .gitignore * tests, coverage added * bin added * Initial commit ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/LICENSE0000664000175000017500000002363600000000000016503 0ustar00zuulzuul00000000000000 Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. 
"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. 
This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6418371 sahara-plugin-spark-10.0.0/PKG-INFO0000664000175000017500000000464600000000000016573 0ustar00zuulzuul00000000000000Metadata-Version: 1.2 Name: sahara-plugin-spark Version: 10.0.0 Summary: Spark Plugin for Sahara Project Home-page: https://docs.openstack.org/sahara/latest/ Author: OpenStack Author-email: openstack-discuss@lists.openstack.org License: Apache Software License Description: ======================== Team and repository tags ======================== .. image:: https://governance.openstack.org/tc/badges/sahara.svg :target: https://governance.openstack.org/tc/reference/tags/index.html .. Change things from this point on OpenStack Data Processing ("Sahara") Spark Plugin ================================================== OpenStack Sahara Spark Plugin provides the users the option to start Spark clusters on OpenStack Sahara. Check out OpenStack Sahara documentation to see how to deploy the Spark Plugin. 
Sahara at wiki.openstack.org: https://wiki.openstack.org/wiki/Sahara Storyboard project: https://storyboard.openstack.org/#!/project/openstack/sahara-plugin-spark Sahara docs site: https://docs.openstack.org/sahara/latest/ Quickstart guide: https://docs.openstack.org/sahara/latest/user/quickstart.html How to participate: https://docs.openstack.org/sahara/latest/contributor/how-to-participate.html Source: https://opendev.org/openstack/sahara-plugin-spark Bugs and feature requests: https://storyboard.openstack.org/#!/project/openstack/sahara-plugin-spark Release notes: https://docs.openstack.org/releasenotes/sahara-plugin-spark/ License ------- Apache License Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 Platform: UNKNOWN Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: Implementation :: CPython Classifier: Programming Language :: Python :: 3 :: Only Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Environment :: OpenStack Classifier: Intended Audience :: Information Technology Classifier: Intended Audience :: System Administrators Classifier: License :: OSI Approved :: Apache Software License Classifier: Operating System :: POSIX :: Linux Requires-Python: >=3.8 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/README.rst0000664000175000017500000000237100000000000017156 0ustar00zuulzuul00000000000000======================== Team and repository tags ======================== .. image:: https://governance.openstack.org/tc/badges/sahara.svg :target: https://governance.openstack.org/tc/reference/tags/index.html .. Change things from this point on OpenStack Data Processing ("Sahara") Spark Plugin ================================================== OpenStack Sahara Spark Plugin provides the users the option to start Spark clusters on OpenStack Sahara. Check out OpenStack Sahara documentation to see how to deploy the Spark Plugin. 
Sahara at wiki.openstack.org: https://wiki.openstack.org/wiki/Sahara Storyboard project: https://storyboard.openstack.org/#!/project/openstack/sahara-plugin-spark Sahara docs site: https://docs.openstack.org/sahara/latest/ Quickstart guide: https://docs.openstack.org/sahara/latest/user/quickstart.html How to participate: https://docs.openstack.org/sahara/latest/contributor/how-to-participate.html Source: https://opendev.org/openstack/sahara-plugin-spark Bugs and feature requests: https://storyboard.openstack.org/#!/project/openstack/sahara-plugin-spark Release notes: https://docs.openstack.org/releasenotes/sahara-plugin-spark/ License ------- Apache License Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/babel.cfg0000664000175000017500000000002000000000000017202 0ustar00zuulzuul00000000000000[python: **.py] ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6258364 sahara-plugin-spark-10.0.0/doc/0000775000175000017500000000000000000000000016231 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/doc/requirements.txt0000664000175000017500000000061700000000000021521 0ustar00zuulzuul00000000000000# The order of packages is significant, because pip processes them in the order # of appearance. Changing the order has an impact on the overall integration # process, which may cause wedges in the gate later. openstackdocstheme>=2.2.1 # Apache-2.0 os-api-ref>=1.4.0 # Apache-2.0 reno>=3.1.0 # Apache-2.0 sphinx>=2.0.0,!=2.1.0 # BSD sphinxcontrib-httpdomain>=1.3.0 # BSD whereto>=0.3.0 # Apache-2.0 ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6258364 sahara-plugin-spark-10.0.0/doc/source/0000775000175000017500000000000000000000000017531 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/doc/source/conf.py0000664000175000017500000001525000000000000021033 0ustar00zuulzuul00000000000000# -*- coding: utf-8 -*- # # sahara-plugin-spark documentation build configuration file. # # -- General configuration ----------------------------------------------------- # If your documentation needs a minimal Sphinx version, state it here. #needs_sphinx = '1.0' # Add any Sphinx extension module names here, as strings. They can be extensions # coming with Sphinx (named 'sphinx.ext.*') or your custom ones. extensions = [ 'reno.sphinxext', 'openstackdocstheme', ] # openstackdocstheme options openstackdocs_repo_name = 'openstack/sahara-plugin-spark' openstackdocs_pdf_link = True openstackdocs_use_storyboard = True openstackdocs_projects = [ 'sahara' ] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix of source filenames. source_suffix = '.rst' # The encoding of source files. #source_encoding = 'utf-8-sig' # The master toctree document. master_doc = 'index' # General information about the project. copyright = u'2015, Sahara team' # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. 
#language = None # There are two options for replacing |today|: either, you set today to some # non-false value, then it is used: #today = '' # Else, today_fmt is used as the format for a strftime call. #today_fmt = '%B %d, %Y' # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. exclude_patterns = [] # The reST default role (used for this markup: `text`) to use for all documents. #default_role = None # If true, '()' will be appended to :func: etc. cross-reference text. #add_function_parentheses = True # If true, the current module name will be prepended to all description # unit titles (such as .. function::). #add_module_names = True # If true, sectionauthor and moduleauthor directives will be shown in the # output. They are ignored by default. #show_authors = False # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'native' # A list of ignored prefixes for module index sorting. #modindex_common_prefix = [] # -- Options for HTML output --------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. html_theme = 'openstackdocs' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. #html_theme_options = {} # Add any paths that contain custom themes here, relative to this directory. #html_theme_path = [] # The name for this set of Sphinx documents. If None, it defaults to # " v documentation". #html_title = None # A shorter title for the navigation bar. Default is the same as html_title. #html_short_title = None # The name of an image file (relative to this directory) to place at the top # of the sidebar. #html_logo = None # The name of an image file (within the static path) to use as favicon of the # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 # pixels large. #html_favicon = None # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". #html_static_path = ['_static'] # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, # using the given strftime format. #html_last_updated_fmt = '%b %d, %Y' # Custom sidebar templates, maps document names to template names. #html_sidebars = {} # Additional templates that should be rendered to pages, maps page names to # template names. #html_additional_pages = {} # If false, no module index is generated. #html_domain_indices = True # If false, no index is generated. #html_use_index = True # If true, the index is split into individual pages for each letter. #html_split_index = False # If true, links to the reST sources are added to the pages. #html_show_sourcelink = True # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. #html_show_sphinx = True # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. #html_show_copyright = True # If true, an OpenSearch description file will be output, and all pages will # contain a tag referring to it. The value of this option must be the # base URL from which the finished HTML is served. #html_use_opensearch = '' # This is the file name suffix for HTML files (e.g. ".xhtml"). #html_file_suffix = None # Output file base name for HTML help builder. 
htmlhelp_basename = 'saharasparkplugin-testsdoc' # -- Options for LaTeX output -------------------------------------------------- # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, author, documentclass [howto/manual]). latex_documents = [ ('index', 'doc-sahara-plugin-spark.tex', u'Sahara Spark Plugin Documentation', u'Sahara team', 'manual'), ] # The name of an image file (relative to this directory) to place at the top of # the title page. #latex_logo = None # For "manual" documents, if this is true, then toplevel headings are parts, # not chapters. #latex_use_parts = False # If true, show page references after internal links. #latex_show_pagerefs = False # If true, show URL addresses after external links. #latex_show_urls = False # Documents to append as an appendix to all manuals. #latex_appendices = [] # If false, no module index is generated. #latex_domain_indices = True smartquotes_excludes = {'builders': ['latex']} # -- Options for manual page output -------------------------------------------- # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). man_pages = [ ('index', 'sahara-plugin-spark', u'sahara-plugin-spark Documentation', [u'Sahara team'], 1) ] # If true, show URL addresses after external links. #man_show_urls = False # -- Options for Texinfo output ------------------------------------------------ # Grouping the document tree into Texinfo files. List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ ('index', 'sahara-plugin-spark', u'sahara-plugin-spark Documentation', u'Sahara team', 'sahara-plugin-spark', 'One line description of project.', 'Miscellaneous'), ] # Documents to append as an appendix to all manuals. #texinfo_appendices = [] # If false, no module index is generated. #texinfo_domain_indices = True # How to display URL addresses: 'footnote', 'no', or 'inline'. #texinfo_show_urls = 'footnote' ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6258364 sahara-plugin-spark-10.0.0/doc/source/contributor/0000775000175000017500000000000000000000000022103 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/doc/source/contributor/contributing.rst0000664000175000017500000000130100000000000025337 0ustar00zuulzuul00000000000000============================ So You Want to Contribute... ============================ For general information on contributing to OpenStack, please check out the `contributor guide `_ to get started. It covers all the basics that are common to all OpenStack projects: the accounts you need, the basics of interacting with our Gerrit review system, how we communicate as a community, etc. sahara-plugin-spark is maintained by the OpenStack Sahara project. To understand our development process and how you can contribute to it, please look at the Sahara project's general contributor's page: http://docs.openstack.org/sahara/latest/contributor/contributing.html ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/doc/source/contributor/index.rst0000664000175000017500000000014500000000000023744 0ustar00zuulzuul00000000000000================= Contributor Guide ================= .. 
toctree:: :maxdepth: 2 contributing ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/doc/source/index.rst0000664000175000017500000000016200000000000021371 0ustar00zuulzuul00000000000000Spark plugin for Sahara ======================= .. toctree:: :maxdepth: 2 user/index contributor/index ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6258364 sahara-plugin-spark-10.0.0/doc/source/user/0000775000175000017500000000000000000000000020507 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/doc/source/user/index.rst0000664000175000017500000000012100000000000022342 0ustar00zuulzuul00000000000000========== User Guide ========== .. toctree:: :maxdepth: 2 spark-plugin ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/doc/source/user/spark-plugin.rst0000664000175000017500000000635000000000000023661 0ustar00zuulzuul00000000000000Spark Plugin ============ The Spark plugin for sahara provides a way to provision Apache Spark clusters on OpenStack in a single click and in an easily repeatable fashion. Currently Spark is installed in standalone mode, with no YARN or Mesos support. Images ------ For cluster provisioning, prepared images should be used. .. list-table:: Support matrix for the `spark` plugin :widths: 15 15 20 15 35 :header-rows: 1 * - Version (image tag) - Distribution - Build method - Version (build parameter) - Notes * - 2.3 - Ubuntu 16.04, CentOS 7 - sahara-image-pack - 2.3 - based on CDH 5.11 use --plugin_version to specify the minor version: 2.3.2 (default), 2.3.1 or 2.3.0 * - 2.3 - Ubuntu 16.04 - sahara-image-create - 2.3.0 - based on CDH 5.11 * - 2.2 - Ubuntu 16.04, CentOS 7 - sahara-image-pack - 2.2 - based on CDH 5.11 use --plugin_version to specify the minor version: 2.2.1 (default), or 2.2.0 * - 2.2 - Ubuntu 16.04 - sahara-image-create - 2.2.0 - based on CDH 5.11 For more information about building images, refer to :sahara-doc:`Sahara documentation `. The Spark plugin requires an image to be tagged in the sahara image registry with two tags: 'spark' and '<Spark version>' (e.g. '1.6.0'). The image requires a username. For more information, refer to the :sahara-doc:`registering image ` section of the Sahara documentation. Note that the Spark cluster is deployed using the scripts available in the Spark distribution, which allow the user to start all services (master and slaves), stop all services and so on. As such (and as opposed to CDH HDFS daemons), Spark is not deployed as a standard Ubuntu service and if the virtual machines are rebooted, Spark will not be restarted. Build settings ~~~~~~~~~~~~~~ When ``sahara-image-create`` is used, you can override a few settings by exporting the corresponding environment variables before starting the build command: * ``SPARK_DOWNLOAD_URL`` - download link for Spark Spark configuration ------------------- Spark needs only a few parameters to work and has sensible defaults. If needed, they can be changed when creating the sahara cluster template. No node group options are available. Once the cluster is ready, connect to the master with ssh using the `ubuntu` user and the appropriate ssh key. Spark is installed in `/opt/spark` and should be completely configured and ready to start executing jobs. 
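As a quick check that the cluster accepts work, you can log in to the master and submit the SparkPi example shipped with the Spark distribution. This is a minimal sketch: the master address is a placeholder for your deployment, and the exact location of the examples jar depends on the Spark version installed in the image.

.. code-block:: console

   # Placeholder: substitute the floating IP or hostname of the master node.
   $ ssh ubuntu@<master-address>

   # 7077 is the default port of the Spark standalone master; the examples jar
   # path varies between Spark releases, so adjust the glob if needed.
   $ /opt/spark/bin/spark-submit \
       --class org.apache.spark.examples.SparkPi \
       --master spark://$(hostname):7077 \
       /opt/spark/examples/jars/spark-examples_*.jar 10

If the job finishes and prints an approximation of Pi, the standalone master and its workers are running correctly.
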
At the bottom of the cluster information page from the OpenStack dashboard, a link to the Spark web interface is provided. Cluster Validation ------------------ When a user creates a Hadoop cluster using the Spark plugin, the cluster topology requested by the user is verified for consistency. Currently there are the following limitations in cluster topology for the Spark plugin: + Cluster must contain exactly one HDFS namenode + Cluster must contain exactly one Spark master + Cluster must contain at least one Spark slave + Cluster must contain at least one HDFS datanode The tested configuration co-locates the NameNode with the master and a DataNode with each slave to maximize data locality. ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/releasenotes/0000775000175000017500000000000000000000000020155 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6258364 sahara-plugin-spark-10.0.0/releasenotes/notes/0000775000175000017500000000000000000000000021305 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/notes/drop-py2-7-ff5c64bed835ce49.yaml0000664000175000017500000000035200000000000026404 0ustar00zuulzuul00000000000000--- upgrade: - | Python 2.7 support has been dropped. Last release of sahara and its plugins to support python 2.7 is OpenStack Train. The minimum version of Python now supported by sahara and its plugins is Python 3.6. ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/notes/spark-on-image-pack-f5609daf38c45b6f.yaml0000664000175000017500000000013000000000000030220 0ustar00zuulzuul00000000000000--- features: - | Adding ability to create spark images using Sahara Image Pack.
././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6298366 sahara-plugin-spark-10.0.0/releasenotes/source/0000775000175000017500000000000000000000000021455 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6298366 sahara-plugin-spark-10.0.0/releasenotes/source/_static/0000775000175000017500000000000000000000000023103 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/_static/.placeholder0000664000175000017500000000000000000000000025354 0ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6298366 sahara-plugin-spark-10.0.0/releasenotes/source/_templates/0000775000175000017500000000000000000000000023612 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/_templates/.placeholder0000664000175000017500000000000000000000000026063 0ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/conf.py0000664000175000017500000001524400000000000022762 0ustar00zuulzuul00000000000000# -*- coding: utf-8 -*- # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. # Sahara Release Notes documentation build configuration file extensions = [ 'reno.sphinxext', 'openstackdocstheme' ] # openstackdocstheme options openstackdocs_repo_name = 'openstack/sahara-plugin-spark' openstackdocs_use_storyboard = True # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix of source filenames. source_suffix = '.rst' # The master toctree document. master_doc = 'index' # General information about the project. copyright = u'2015, Sahara Developers' # Release do not need a version number in the title, they # cover multiple versions. # The full version, including alpha/beta/rc tags. release = '' # The short X.Y version. version = '' # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. exclude_patterns = [] # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'native' # -- Options for HTML output ---------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. html_theme = 'openstackdocs' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. # html_theme_options = {} # Add any paths that contain custom themes here, relative to this directory. 
# html_theme_path = [] # The name for this set of Sphinx documents. If None, it defaults to # " v documentation". # html_title = None # A shorter title for the navigation bar. Default is the same as html_title. # html_short_title = None # The name of an image file (relative to this directory) to place at the top # of the sidebar. # html_logo = None # The name of an image file (within the static path) to use as favicon of the # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 # pixels large. # html_favicon = None # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] # Add any extra paths that contain custom files (such as robots.txt or # .htaccess) here, relative to this directory. These files are copied # directly to the root of the documentation. # html_extra_path = [] # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, # using the given strftime format. # html_last_updated_fmt = '%b %d, %Y' # If true, SmartyPants will be used to convert quotes and dashes to # typographically correct entities. # html_use_smartypants = True # Custom sidebar templates, maps document names to template names. # html_sidebars = {} # Additional templates that should be rendered to pages, maps page names to # template names. # html_additional_pages = {} # If false, no module index is generated. # html_domain_indices = True # If false, no index is generated. # html_use_index = True # If true, the index is split into individual pages for each letter. # html_split_index = False # If true, links to the reST sources are added to the pages. # html_show_sourcelink = True # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. # html_show_sphinx = True # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. # html_show_copyright = True # If true, an OpenSearch description file will be output, and all pages will # contain a tag referring to it. The value of this option must be the # base URL from which the finished HTML is served. # html_use_opensearch = '' # This is the file name suffix for HTML files (e.g. ".xhtml"). # html_file_suffix = None # Output file base name for HTML help builder. htmlhelp_basename = 'SaharaSparkReleaseNotesdoc' # -- Options for LaTeX output --------------------------------------------- # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, # author, documentclass [howto, manual, or own class]). latex_documents = [ ('index', 'SaharaSparkReleaseNotes.tex', u'Sahara Spark Plugin Release Notes Documentation', u'Sahara Developers', 'manual'), ] # The name of an image file (relative to this directory) to place at the top of # the title page. # latex_logo = None # For "manual" documents, if this is true, then toplevel headings are parts, # not chapters. # latex_use_parts = False # If true, show page references after internal links. # latex_show_pagerefs = False # If true, show URL addresses after external links. # latex_show_urls = False # Documents to append as an appendix to all manuals. # latex_appendices = [] # If false, no module index is generated. # latex_domain_indices = True # -- Options for manual page output --------------------------------------- # One entry per manual page. 
List of tuples # (source start file, name, description, authors, manual section). man_pages = [ ('index', 'saharasparkreleasenotes', u'Sahara Spark Plugin Release Notes Documentation', [u'Sahara Developers'], 1) ] # If true, show URL addresses after external links. # man_show_urls = False # -- Options for Texinfo output ------------------------------------------- # Grouping the document tree into Texinfo files. List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ ('index', 'SaharaSparkReleaseNotes', u'Sahara Spark Plugin Release Notes Documentation', u'Sahara Developers', 'SaharaSparkReleaseNotes', 'One line description of project.', 'Miscellaneous'), ] # Documents to append as an appendix to all manuals. # texinfo_appendices = [] # If false, no module index is generated. # texinfo_domain_indices = True # How to display URL addresses: 'footnote', 'no', or 'inline'. # texinfo_show_urls = 'footnote' # If true, do not generate a @detailmenu in the "Top" node's menu. # texinfo_no_detailmenu = False # -- Options for Internationalization output ------------------------------ locale_dirs = ['locale/'] ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/index.rst0000664000175000017500000000033200000000000023314 0ustar00zuulzuul00000000000000=================================== Sahara Spark Plugin Release Notes =================================== .. toctree:: :maxdepth: 1 unreleased zed xena wallaby victoria ussuri train stein ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/releasenotes/source/locale/0000775000175000017500000000000000000000000022714 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/releasenotes/source/locale/de/0000775000175000017500000000000000000000000023304 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6298366 sahara-plugin-spark-10.0.0/releasenotes/source/locale/de/LC_MESSAGES/0000775000175000017500000000000000000000000025071 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/locale/de/LC_MESSAGES/releasenotes.po0000664000175000017500000000325300000000000030125 0ustar00zuulzuul00000000000000# Andreas Jaeger , 2019. #zanata # Andreas Jaeger , 2020. #zanata msgid "" msgstr "" "Project-Id-Version: sahara-plugin-spark\n" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2020-04-24 23:45+0000\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "PO-Revision-Date: 2020-04-25 10:40+0000\n" "Last-Translator: Andreas Jaeger \n" "Language-Team: German\n" "Language: de\n" "X-Generator: Zanata 4.3.3\n" "Plural-Forms: nplurals=2; plural=(n != 1)\n" msgid "1.0.0" msgstr "1.0.0" msgid "Adding abilitiy to create spark images using Sahara Image Pack." msgstr "Spark Abbilder können jetzt mit dem Sahara Image Pack erzeugt werden." msgid "Current Series Release Notes" msgstr "Aktuelle Serie Releasenotes" msgid "New Features" msgstr "Neue Funktionen" msgid "" "Python 2.7 support has been dropped. 
Last release of sahara and its plugins " "to support python 2.7 is OpenStack Train. The minimum version of Python now " "supported by sahara and its plugins is Python 3.6." msgstr "" "Python 2.7 Unterstützung wurde beendet. Der letzte Release von Sahara und " "seinen Plugins der Python 2.7 unterstützt ist OpenStack Train. Die minimal " "Python Version welche von Sahara und seinen Plugins unterstützt wird, ist " "Python 3.6." msgid "Sahara Spark Plugin Release Notes" msgstr "Sahara Spark Plugin Releasenotes" msgid "Stein Series Release Notes" msgstr "Stein Serie Releasenotes" msgid "Train Series Release Notes" msgstr "Train Serie Releasenotes" msgid "Upgrade Notes" msgstr "Aktualisierungsnotizen" msgid "Ussuri Series Release Notes" msgstr "Ussuri Serie Releasenotes" ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/releasenotes/source/locale/en_GB/0000775000175000017500000000000000000000000023666 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6298366 sahara-plugin-spark-10.0.0/releasenotes/source/locale/en_GB/LC_MESSAGES/0000775000175000017500000000000000000000000025453 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/locale/en_GB/LC_MESSAGES/releasenotes.po0000664000175000017500000000327000000000000030506 0ustar00zuulzuul00000000000000# Andi Chandler , 2020. #zanata msgid "" msgstr "" "Project-Id-Version: sahara-plugin-spark\n" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2020-10-07 22:08+0000\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "PO-Revision-Date: 2020-11-04 12:47+0000\n" "Last-Translator: Andi Chandler \n" "Language-Team: English (United Kingdom)\n" "Language: en_GB\n" "X-Generator: Zanata 4.3.3\n" "Plural-Forms: nplurals=2; plural=(n != 1)\n" msgid "1.0.0" msgstr "1.0.0" msgid "3.0.0" msgstr "3.0.0" msgid "Adding abilitiy to create spark images using Sahara Image Pack." msgstr "Adding ability to create spark images using Sahara Image Pack." msgid "Current Series Release Notes" msgstr "Current Series Release Notes" msgid "New Features" msgstr "New Features" msgid "" "Python 2.7 support has been dropped. Last release of sahara and its plugins " "to support python 2.7 is OpenStack Train. The minimum version of Python now " "supported by sahara and its plugins is Python 3.6." msgstr "" "Python 2.7 support has been dropped. Last release of Sahara and its plugins " "to support Python 2.7 is OpenStack Train. The minimum version of Python now " "supported by Sahara and its plugins is Python 3.6." 
msgid "Sahara Spark Plugin Release Notes" msgstr "Sahara Spark Plugin Release Notes" msgid "Stein Series Release Notes" msgstr "Stein Series Release Notes" msgid "Train Series Release Notes" msgstr "Train Series Release Notes" msgid "Upgrade Notes" msgstr "Upgrade Notes" msgid "Ussuri Series Release Notes" msgstr "Ussuri Series Release Notes" msgid "Victoria Series Release Notes" msgstr "Victoria Series Release Notes" ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/releasenotes/source/locale/ne/0000775000175000017500000000000000000000000023316 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6298366 sahara-plugin-spark-10.0.0/releasenotes/source/locale/ne/LC_MESSAGES/0000775000175000017500000000000000000000000025103 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/locale/ne/LC_MESSAGES/releasenotes.po0000664000175000017500000000224700000000000030141 0ustar00zuulzuul00000000000000# Surit Aryal , 2019. #zanata msgid "" msgstr "" "Project-Id-Version: sahara-plugin-spark\n" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2019-07-23 14:26+0000\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "PO-Revision-Date: 2019-08-02 08:15+0000\n" "Last-Translator: Surit Aryal \n" "Language-Team: Nepali\n" "Language: ne\n" "X-Generator: Zanata 4.3.3\n" "Plural-Forms: nplurals=2; plural=(n != 1)\n" msgid "1.0.0" msgstr "१.०.०" msgid "Adding abilitiy to create spark images using Sahara Image Pack." msgstr "Sahara Image Pack प्रयोग गरेर स्पार्क छविहरू सिर्जना गर्ने क्षमता थप्दै।" msgid "Current Series Release Notes" msgstr "Current Series रिलीज नोट्स" msgid "New Features" msgstr "नयाँ सुविधाहरू" msgid "Sahara Spark Plugin Release Notes" msgstr "Sahara Spark Plugin नोट जारी गर्नुहोस्" msgid "Stein Series Release Notes" msgstr "Stein Series नोट जारी गर्नुहोस्" ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/stein.rst0000664000175000017500000000022100000000000023324 0ustar00zuulzuul00000000000000=================================== Stein Series Release Notes =================================== .. release-notes:: :branch: stable/stein ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/train.rst0000664000175000017500000000017600000000000023330 0ustar00zuulzuul00000000000000========================== Train Series Release Notes ========================== .. release-notes:: :branch: stable/train ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/unreleased.rst0000664000175000017500000000016000000000000024333 0ustar00zuulzuul00000000000000============================== Current Series Release Notes ============================== .. 
release-notes:: ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/ussuri.rst0000664000175000017500000000020200000000000023533 0ustar00zuulzuul00000000000000=========================== Ussuri Series Release Notes =========================== .. release-notes:: :branch: stable/ussuri ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/victoria.rst0000664000175000017500000000021200000000000024022 0ustar00zuulzuul00000000000000============================= Victoria Series Release Notes ============================= .. release-notes:: :branch: stable/victoria ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/wallaby.rst0000664000175000017500000000020600000000000023640 0ustar00zuulzuul00000000000000============================ Wallaby Series Release Notes ============================ .. release-notes:: :branch: stable/wallaby ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/xena.rst0000664000175000017500000000017200000000000023142 0ustar00zuulzuul00000000000000========================= Xena Series Release Notes ========================= .. release-notes:: :branch: stable/xena ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/releasenotes/source/zed.rst0000664000175000017500000000016600000000000022774 0ustar00zuulzuul00000000000000======================== Zed Series Release Notes ======================== .. release-notes:: :branch: stable/zed ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/requirements.txt0000664000175000017500000000076700000000000020762 0ustar00zuulzuul00000000000000# The order of packages is significant, because pip processes them in the order # of appearance. Changing the order has an impact on the overall integration # process, which may cause wedges in the gate later. pbr!=2.1.0,>=2.0.0 # Apache-2.0 Babel!=2.4.0,>=2.3.4 # BSD eventlet>=0.26.0 # MIT oslo.i18n>=3.15.3 # Apache-2.0 oslo.log>=3.36.0 # Apache-2.0 oslo.serialization!=2.19.1,>=2.18.0 # Apache-2.0 oslo.utils>=3.33.0 # Apache-2.0 requests>=2.14.2 # Apache-2.0 sahara>=10.0.0.0b1 six>=1.10.0 # MIT ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6298366 sahara-plugin-spark-10.0.0/sahara_plugin_spark/0000775000175000017500000000000000000000000021501 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/__init__.py0000664000175000017500000000000000000000000023600 0ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/i18n.py0000664000175000017500000000162700000000000022640 0ustar00zuulzuul00000000000000# Copyright (c) 2014 Mirantis Inc. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
# You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. # It's based on oslo.i18n usage in OpenStack Keystone project and # recommendations from https://docs.openstack.org/oslo.i18n/latest/ # user/usage.html import oslo_i18n _translators = oslo_i18n.TranslatorFactory(domain='sahara_plugin_spark') # The primary translation function using the well-known name "_" _ = _translators.primary ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/0000775000175000017500000000000000000000000022740 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/de/0000775000175000017500000000000000000000000023330 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6338367 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/de/LC_MESSAGES/0000775000175000017500000000000000000000000025115 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/de/LC_MESSAGES/sahara_plugin_spark.po0000664000175000017500000000375400000000000031503 0ustar00zuulzuul00000000000000# Andreas Jaeger , 2019. #zanata msgid "" msgstr "" "Project-Id-Version: sahara-plugin-spark VERSION\n" "Report-Msgid-Bugs-To: https://bugs.launchpad.net/openstack-i18n/\n" "POT-Creation-Date: 2019-09-20 17:24+0000\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "PO-Revision-Date: 2019-09-25 06:54+0000\n" "Last-Translator: Andreas Jaeger \n" "Language-Team: German\n" "Language: de\n" "X-Generator: Zanata 4.3.3\n" "Plural-Forms: nplurals=2; plural=(n != 1)\n" #, python-format msgid "%s or more" msgstr "%s oder mehr" msgid "1 or more" msgstr "1 oder mehr" msgid "Await DataNodes start up" msgstr "Warten, bis DataNodes gestartet wird" #, python-format msgid "Decommission %s" msgstr "Außerkraftsetzung %s" #, python-format msgid "Number of %(dn)s instances should not be less than %(replication)s" msgstr "" "Die Anzahl der %(dn)s-Instanzen sollte nicht kleiner als %(replication)s sein" msgid "Push configs to nodes" msgstr "Push-Konfigurationen zu Knoten" #, python-format msgid "Spark plugin cannot scale nodegroup with processes: %s" msgstr "Das Spark-Plugin kann Knotengruppen nicht mit Prozessen skalieren: %s" #, python-format msgid "" "Spark plugin cannot shrink cluster because there would be not enough nodes " "for HDFS replicas (replication factor is %s)" msgstr "" "Das Spark-Plug-in kann den Cluster nicht verkleinern, da nicht genügend " "Knoten für HDFS-Replikate vorhanden sind (der Replikationsfaktor ist%s)" msgid "Spark {base} or higher required to run {type} jobs" msgstr "Spark {base} oder höher erforderlich, um {type} Jobs auszuführen" msgid "" "This plugin provides an ability to launch Spark on Hadoop CDH cluster " "without any management consoles." 
msgstr "" "Dieses Plugin bietet die Möglichkeit, Spark auf dem Hadoop CDH-Cluster ohne " "Verwaltungskonsolen zu starten." #, python-format msgid "Waiting on %d DataNodes to start up" msgstr "Warten auf %d DataNodes zum Starten" ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/en_GB/0000775000175000017500000000000000000000000023712 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6338367 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/en_GB/LC_MESSAGES/0000775000175000017500000000000000000000000025477 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/en_GB/LC_MESSAGES/sahara_plugin_spark.po0000664000175000017500000000361300000000000032057 0ustar00zuulzuul00000000000000# Andi Chandler , 2020. #zanata msgid "" msgstr "" "Project-Id-Version: sahara-plugin-spark VERSION\n" "Report-Msgid-Bugs-To: https://bugs.launchpad.net/openstack-i18n/\n" "POT-Creation-Date: 2020-04-26 20:56+0000\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "PO-Revision-Date: 2020-05-05 11:25+0000\n" "Last-Translator: Andi Chandler \n" "Language-Team: English (United Kingdom)\n" "Language: en_GB\n" "X-Generator: Zanata 4.3.3\n" "Plural-Forms: nplurals=2; plural=(n != 1)\n" #, python-format msgid "%s or more" msgstr "%s or more" msgid "1 or more" msgstr "1 or more" msgid "Await DataNodes start up" msgstr "Await DataNodes start up" #, python-format msgid "Decommission %s" msgstr "Decommission %s" #, python-format msgid "Number of %(dn)s instances should not be less than %(replication)s" msgstr "Number of %(dn)s instances should not be less than %(replication)s" msgid "Push configs to nodes" msgstr "Push configs to nodes" #, python-format msgid "Spark plugin cannot scale nodegroup with processes: %s" msgstr "Spark plugin cannot scale nodegroup with processes: %s" #, python-format msgid "" "Spark plugin cannot shrink cluster because there would be not enough nodes " "for HDFS replicas (replication factor is %s)" msgstr "" "Spark plugin cannot shrink cluster because there would be not enough nodes " "for HDFS replicas (replication factor is %s)" msgid "Spark {base} or higher required to run {type} jobs" msgstr "Spark {base} or higher required to run {type} jobs" msgid "" "This plugin provides an ability to launch Spark on Hadoop CDH cluster " "without any management consoles." msgstr "" "This plugin provides an ability to launch Spark on Hadoop CDH cluster " "without any management consoles." 
#, python-format msgid "Waiting on %d DataNodes to start up" msgstr "Waiting on %d DataNodes to start up" ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/id/0000775000175000017500000000000000000000000023334 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6338367 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/id/LC_MESSAGES/0000775000175000017500000000000000000000000025121 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/id/LC_MESSAGES/sahara_plugin_spark.po0000664000175000017500000000373400000000000031505 0ustar00zuulzuul00000000000000# Andreas Jaeger , 2019. #zanata # suhartono , 2019. #zanata msgid "" msgstr "" "Project-Id-Version: sahara-plugin-spark VERSION\n" "Report-Msgid-Bugs-To: https://bugs.launchpad.net/openstack-i18n/\n" "POT-Creation-Date: 2019-09-30 09:25+0000\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "PO-Revision-Date: 2019-10-08 10:52+0000\n" "Last-Translator: Andreas Jaeger \n" "Language-Team: Indonesian\n" "Language: id\n" "X-Generator: Zanata 4.3.3\n" "Plural-Forms: nplurals=1; plural=0\n" #, python-format msgid "%s or more" msgstr "%s or more" msgid "1 or more" msgstr "1 atau lebih" msgid "Await DataNodes start up" msgstr "Tunggu DataNodes mulai" #, python-format msgid "Decommission %s" msgstr "Decommission %s" #, python-format msgid "Number of %(dn)s instances should not be less than %(replication)s" msgstr "Jumlah instance %(dn)s tidak boleh kurang dari %(replication)s" msgid "Push configs to nodes" msgstr "Dorong konfigurasi ke node" #, python-format msgid "Spark plugin cannot scale nodegroup with processes: %s" msgstr "Plugin Spark tidak dapat menskala nodegroup dengan proses: %s" #, python-format msgid "" "Spark plugin cannot shrink cluster because there would be not enough nodes " "for HDFS replicas (replication factor is %s)" msgstr "" "Plugin Spark tidak dapat mengecilkan cluster karena tidak akan ada cukup " "node untuk replika HDFS (faktor replikasi adalah %s)" msgid "Spark {base} or higher required to run {type} jobs" msgstr "" "Spark {base} atau lebih tinggi diperlukan untuk menjalankan jobs {type}" msgid "" "This plugin provides an ability to launch Spark on Hadoop CDH cluster " "without any management consoles." msgstr "" "Plugin ini menyediakan kemampuan untuk meluncurkan Spark pada cluster Hadoop " "CDH tanpa konsol manajemen." 
#, python-format msgid "Waiting on %d DataNodes to start up" msgstr "Menunggu %d DataNodes untuk memulai" ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6178362 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/ne/0000775000175000017500000000000000000000000023342 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6338367 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/ne/LC_MESSAGES/0000775000175000017500000000000000000000000025127 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/locale/ne/LC_MESSAGES/sahara_plugin_spark.po0000664000175000017500000000466600000000000031520 0ustar00zuulzuul00000000000000# Surit Aryal , 2019. #zanata msgid "" msgstr "" "Project-Id-Version: sahara-plugin-spark VERSION\n" "Report-Msgid-Bugs-To: https://bugs.launchpad.net/openstack-i18n/\n" "POT-Creation-Date: 2019-07-23 14:26+0000\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "PO-Revision-Date: 2019-08-02 08:40+0000\n" "Last-Translator: Surit Aryal \n" "Language-Team: Nepali\n" "Language: ne\n" "X-Generator: Zanata 4.3.3\n" "Plural-Forms: nplurals=2; plural=(n != 1)\n" #, python-format msgid "%s or more" msgstr "%s वा अधिक" msgid "1 or more" msgstr "१ वा अधिक" msgid "Await DataNodes start up" msgstr "प्रतीक्षा डाटा Nodes शुरू" #, python-format msgid "Decommission %s" msgstr "Decommission %s" #, python-format msgid "Number of %(dn)s instances should not be less than %(replication)s" msgstr "%(dn)s घटनाहरूको संख्या %(replication)s भन्दा कम हुनुहुन्न" msgid "Push configs to nodes" msgstr "कन्फिगरेसन नोडहरूमा पुश गर्नुहोस्" #, python-format msgid "Spark plugin cannot scale nodegroup with processes: %s" msgstr "Spark pluginले प्रक्रियाहरूसँग नोड ग्रुप मापन गर्न सक्दैन: %s" #, python-format msgid "" "Spark plugin cannot shrink cluster because there would be not enough nodes " "for HDFS replicas (replication factor is %s)" msgstr "" "Spark plugin क्लस्टर कम गर्न सक्दैन किनभने त्यहाँ HDFS प्रतिकृतिहरु लागि पर्याप्त नोड्स " "(प्रतिकृति कारक %s हो)" msgid "Spark {base} or higher required to run {type} jobs" msgstr "Spark {base} or higher required to run {type} jobs" msgid "" "This plugin provides an ability to launch Spark on Hadoop CDH cluster " "without any management consoles." 
msgstr "" "यो प्लगइनले कुनै व्यवस्थापन कन्सोल बिना Hadoop CDH क्लस्टरमा Spark सुरू गर्न क्षमता " "प्रदान गर्दछ।" #, python-format msgid "Waiting on %d DataNodes to start up" msgstr "%d DataNodes सुरू गर्न पर्खँदै" ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6338367 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/0000775000175000017500000000000000000000000023162 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/__init__.py0000664000175000017500000000000000000000000025261 0ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1696418419.637837 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/0000775000175000017500000000000000000000000024302 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/__init__.py0000664000175000017500000000000000000000000026401 0ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/config_helper.py0000664000175000017500000004707500000000000027475 0ustar00zuulzuul00000000000000# Copyright (c) 2014 Hoang Do, Phuc Vo, P. Michiardi, D. Venzano # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. 
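# Module overview: this helper builds the Spark plugin's configuration model.
# It loads the Hadoop core/hdfs XML defaults shipped in
# plugins/spark/resources, declares the Spark-specific options (master/worker
# ports, worker cores and memory, Spark home, job-cleanup thresholds) in
# SPARK_CONFS, and renders the artifacts that are pushed to instances:
# core-site.xml and hdfs-site.xml through generate_xml_configs(), spark-env.sh
# through generate_spark_env_configs(), the slaves file, spark-defaults.conf
# and the periodic job-cleanup script.
# Illustrative example: with every option left at its default, the generated
# spark-env.sh content reduces to two lines (the first value depends on the
# cluster), because port/cores/memory entries are only emitted when they
# differ from the defaults:
#     SPARK_MASTER_IP=<master hostname>
#     HADOOP_CONF_DIR=/etc/hadoop/conf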
from oslo_config import cfg from oslo_log import log as logging import six from sahara.plugins import provisioning as p from sahara.plugins import swift_helper as swift from sahara.plugins import topology_helper as topology from sahara.plugins import utils LOG = logging.getLogger(__name__) CONF = cfg.CONF CORE_DEFAULT = utils.load_hadoop_xml_defaults( 'plugins/spark/resources/core-default.xml', 'sahara_plugin_spark') HDFS_DEFAULT = utils.load_hadoop_xml_defaults( 'plugins/spark/resources/hdfs-default.xml', 'sahara_plugin_spark') SWIFT_DEFAULTS = swift.read_default_swift_configs() XML_CONFS = { "HDFS": [CORE_DEFAULT, HDFS_DEFAULT, SWIFT_DEFAULTS] } _default_executor_classpath = ":".join( ['/usr/lib/hadoop-mapreduce/hadoop-openstack.jar']) SPARK_CONFS = { 'Spark': { "OPTIONS": [ { 'name': 'Executor extra classpath', 'description': 'Value for spark.executor.extraClassPath' ' in spark-defaults.conf' ' (default: %s)' % _default_executor_classpath, 'default': '%s' % _default_executor_classpath, 'priority': 2, }, { 'name': 'Master port', 'description': 'Start the master on a different port' ' (default: 7077)', 'default': '7077', 'priority': 2, }, { 'name': 'Worker port', 'description': 'Start the Spark worker on a specific port' ' (default: random)', 'default': 'random', 'priority': 2, }, { 'name': 'Master webui port', 'description': 'Port for the master web UI (default: 8080)', 'default': '8080', 'priority': 1, }, { 'name': 'Worker webui port', 'description': 'Port for the worker web UI (default: 8081)', 'default': '8081', 'priority': 1, }, { 'name': 'Worker cores', 'description': 'Total number of cores to allow Spark' ' applications to use on the machine' ' (default: all available cores)', 'default': 'all', 'priority': 2, }, { 'name': 'Worker memory', 'description': 'Total amount of memory to allow Spark' ' applications to use on the machine, e.g. 
1000m,' ' 2g (default: total memory minus 1 GB)', 'default': 'all', 'priority': 1, }, { 'name': 'Worker instances', 'description': 'Number of worker instances to run on each' ' machine (default: 1)', 'default': '1', 'priority': 2, }, { 'name': 'Spark home', 'description': 'The location of the spark installation' ' (default: /opt/spark)', 'default': '/opt/spark', 'priority': 2, }, { 'name': 'Minimum cleanup seconds', 'description': 'Job data will never be purged before this' ' amount of time elapses (default: 86400 = 1 day)', 'default': '86400', 'priority': 2, }, { 'name': 'Maximum cleanup seconds', 'description': 'Job data will always be purged after this' ' amount of time elapses (default: 1209600 = 14 days)', 'default': '1209600', 'priority': 2, }, { 'name': 'Minimum cleanup megabytes', 'description': 'No job data will be purged unless the total' ' job data exceeds this size (default: 4096 = 4GB)', 'default': '4096', 'priority': 2, }, ] } } HADOOP_CONF_DIR = "/etc/hadoop/conf" ENV_CONFS = { "HDFS": { 'Name Node Heap Size': 'HADOOP_NAMENODE_OPTS=\\"-Xmx%sm\\"', 'Data Node Heap Size': 'HADOOP_DATANODE_OPTS=\\"-Xmx%sm\\"' } } ENABLE_DATA_LOCALITY = p.Config('Enable Data Locality', 'general', 'cluster', config_type="bool", priority=1, default_value=True, is_optional=True) ENABLE_SWIFT = p.Config('Enable Swift', 'general', 'cluster', config_type="bool", priority=1, default_value=True, is_optional=False) DATANODES_STARTUP_TIMEOUT = p.Config( 'DataNodes startup timeout', 'general', 'cluster', config_type='int', priority=1, default_value=10800, is_optional=True, description='Timeout for DataNodes startup, in seconds') # Default set to 1 day, which is the default Keystone token # expiration time. After the token is expired we can't continue # scaling anyway. 
DECOMMISSIONING_TIMEOUT = p.Config('Decommissioning Timeout', 'general', 'cluster', config_type='int', priority=1, default_value=86400, is_optional=True, description='Timeout for datanode' ' decommissioning operation' ' during scaling, in seconds') HIDDEN_CONFS = ['fs.defaultFS', 'dfs.namenode.name.dir', 'dfs.datanode.data.dir'] CLUSTER_WIDE_CONFS = ['dfs.block.size', 'dfs.permissions', 'dfs.replication', 'dfs.replication.min', 'dfs.replication.max', 'io.file.buffer.size'] PRIORITY_1_CONFS = ['dfs.datanode.du.reserved', 'dfs.datanode.failed.volumes.tolerated', 'dfs.datanode.max.xcievers', 'dfs.datanode.handler.count', 'dfs.namenode.handler.count'] # for now we have not so many cluster-wide configs # lets consider all of them having high priority PRIORITY_1_CONFS += CLUSTER_WIDE_CONFS def _initialise_configs(): configs = [] for service, config_lists in six.iteritems(XML_CONFS): for config_list in config_lists: for config in config_list: if config['name'] not in HIDDEN_CONFS: cfg = p.Config(config['name'], service, "node", is_optional=True, config_type="string", default_value=str(config['value']), description=config['description']) if cfg.default_value in ["true", "false"]: cfg.config_type = "bool" cfg.default_value = (cfg.default_value == 'true') elif utils.is_int(cfg.default_value): cfg.config_type = "int" cfg.default_value = int(cfg.default_value) if config['name'] in CLUSTER_WIDE_CONFS: cfg.scope = 'cluster' if config['name'] in PRIORITY_1_CONFS: cfg.priority = 1 configs.append(cfg) for service, config_items in six.iteritems(ENV_CONFS): for name, param_format_str in six.iteritems(config_items): configs.append(p.Config(name, service, "node", default_value=1024, priority=1, config_type="int")) for service, config_items in six.iteritems(SPARK_CONFS): for item in config_items['OPTIONS']: cfg = p.Config(name=item["name"], description=item["description"], default_value=item["default"], applicable_target=service, scope="cluster", is_optional=True, priority=item["priority"]) configs.append(cfg) configs.append(DECOMMISSIONING_TIMEOUT) configs.append(ENABLE_SWIFT) configs.append(DATANODES_STARTUP_TIMEOUT) if CONF.enable_data_locality: configs.append(ENABLE_DATA_LOCALITY) return configs # Initialise plugin Hadoop configurations PLUGIN_CONFIGS = _initialise_configs() def get_plugin_configs(): return PLUGIN_CONFIGS def generate_cfg_from_general(cfg, configs, general_config, rest_excluded=False): if 'general' in configs: for nm in general_config: if nm not in configs['general'] and not rest_excluded: configs['general'][nm] = general_config[nm]['default_value'] for name, value in configs['general'].items(): if value: cfg = _set_config(cfg, general_config, name) LOG.debug("Applying config: {name}".format(name=name)) else: cfg = _set_config(cfg, general_config) return cfg def _get_hostname(service): return service.hostname() if service else None def generate_xml_configs(configs, storage_path, nn_hostname, hadoop_port): if hadoop_port is None: hadoop_port = 8020 cfg = { 'fs.defaultFS': 'hdfs://%s:%s' % (nn_hostname, str(hadoop_port)), 'dfs.namenode.name.dir': extract_hadoop_path(storage_path, '/dfs/nn'), 'dfs.datanode.data.dir': extract_hadoop_path(storage_path, '/dfs/dn'), 'dfs.hosts': '/etc/hadoop/dn.incl', 'dfs.hosts.exclude': '/etc/hadoop/dn.excl' } # inserting user-defined configs for key, value in extract_hadoop_xml_confs(configs): cfg[key] = value # Add the swift defaults if they have not been set by the user swft_def = [] if is_swift_enabled(configs): swft_def = SWIFT_DEFAULTS swift_configs = 
extract_name_values(swift.get_swift_configs()) for key, value in six.iteritems(swift_configs): if key not in cfg: cfg[key] = value # invoking applied configs to appropriate xml files core_all = CORE_DEFAULT + swft_def if CONF.enable_data_locality: cfg.update(topology.TOPOLOGY_CONFIG) # applying vm awareness configs core_all += topology.vm_awareness_core_config() xml_configs = { 'core-site': utils.create_hadoop_xml(cfg, core_all), 'hdfs-site': utils.create_hadoop_xml(cfg, HDFS_DEFAULT) } return xml_configs def _get_spark_opt_default(opt_name): for opt in SPARK_CONFS["Spark"]["OPTIONS"]: if opt_name == opt["name"]: return opt["default"] return None def generate_spark_env_configs(cluster): configs = [] # master configuration sp_master = utils.get_instance(cluster, "master") configs.append('SPARK_MASTER_IP=' + sp_master.hostname()) # point to the hadoop conf dir so that Spark can read things # like the swift configuration without having to copy core-site # to /opt/spark/conf configs.append('HADOOP_CONF_DIR=' + HADOOP_CONF_DIR) masterport = utils.get_config_value_or_default("Spark", "Master port", cluster) if masterport and masterport != _get_spark_opt_default("Master port"): configs.append('SPARK_MASTER_PORT=' + str(masterport)) masterwebport = utils.get_config_value_or_default("Spark", "Master webui port", cluster) if (masterwebport and masterwebport != _get_spark_opt_default("Master webui port")): configs.append('SPARK_MASTER_WEBUI_PORT=' + str(masterwebport)) # configuration for workers workercores = utils.get_config_value_or_default("Spark", "Worker cores", cluster) if workercores and workercores != _get_spark_opt_default("Worker cores"): configs.append('SPARK_WORKER_CORES=' + str(workercores)) workermemory = utils.get_config_value_or_default("Spark", "Worker memory", cluster) if (workermemory and workermemory != _get_spark_opt_default("Worker memory")): configs.append('SPARK_WORKER_MEMORY=' + str(workermemory)) workerport = utils.get_config_value_or_default("Spark", "Worker port", cluster) if workerport and workerport != _get_spark_opt_default("Worker port"): configs.append('SPARK_WORKER_PORT=' + str(workerport)) workerwebport = utils.get_config_value_or_default("Spark", "Worker webui port", cluster) if (workerwebport and workerwebport != _get_spark_opt_default("Worker webui port")): configs.append('SPARK_WORKER_WEBUI_PORT=' + str(workerwebport)) workerinstances = utils.get_config_value_or_default("Spark", "Worker instances", cluster) if (workerinstances and workerinstances != _get_spark_opt_default("Worker instances")): configs.append('SPARK_WORKER_INSTANCES=' + str(workerinstances)) return '\n'.join(configs) # workernames need to be a list of worker names def generate_spark_slaves_configs(workernames): return '\n'.join(workernames) def generate_spark_executor_classpath(cluster): cp = utils.get_config_value_or_default("Spark", "Executor extra classpath", cluster) if cp: return "spark.executor.extraClassPath " + cp return "\n" def extract_hadoop_environment_confs(configs): """Returns environment specific Hadoop configurations. :returns: list of Hadoop parameters which should be passed via environment """ lst = [] for service, srv_confs in configs.items(): if ENV_CONFS.get(service): for param_name, param_value in srv_confs.items(): for cfg_name, cfg_format_str in ENV_CONFS[service].items(): if param_name == cfg_name and param_value is not None: lst.append(cfg_format_str % param_value) return lst def extract_hadoop_xml_confs(configs): """Returns xml specific Hadoop configurations. 
:returns: list of Hadoop parameters which should be passed into general configs like core-site.xml """ lst = [] for service, srv_confs in configs.items(): if XML_CONFS.get(service): for param_name, param_value in srv_confs.items(): for cfg_list in XML_CONFS[service]: names = [cfg['name'] for cfg in cfg_list] if param_name in names and param_value is not None: lst.append((param_name, param_value)) return lst def generate_hadoop_setup_script(storage_paths, env_configs): script_lines = ["#!/bin/bash -x"] script_lines.append("echo -n > /tmp/hadoop-env.sh") for line in env_configs: if 'HADOOP' in line: script_lines.append('echo "%s" >> /tmp/hadoop-env.sh' % line) script_lines.append("cat /etc/hadoop/hadoop-env.sh >> /tmp/hadoop-env.sh") script_lines.append("cp /tmp/hadoop-env.sh /etc/hadoop/hadoop-env.sh") hadoop_log = storage_paths[0] + "/log/hadoop/\\$USER/" script_lines.append('sed -i "s,export HADOOP_LOG_DIR=.*,' 'export HADOOP_LOG_DIR=%s," /etc/hadoop/hadoop-env.sh' % hadoop_log) hadoop_log = storage_paths[0] + "/log/hadoop/hdfs" script_lines.append('sed -i "s,export HADOOP_SECURE_DN_LOG_DIR=.*,' 'export HADOOP_SECURE_DN_LOG_DIR=%s," ' '/etc/hadoop/hadoop-env.sh' % hadoop_log) for path in storage_paths: script_lines.append("chown -R hadoop:hadoop %s" % path) script_lines.append("chmod -f -R 755 %s ||" "echo 'Permissions unchanged'" % path) return "\n".join(script_lines) def generate_job_cleanup_config(cluster): spark_config = { 'minimum_cleanup_megabytes': utils.get_config_value_or_default( "Spark", "Minimum cleanup megabytes", cluster), 'minimum_cleanup_seconds': utils.get_config_value_or_default( "Spark", "Minimum cleanup seconds", cluster), 'maximum_cleanup_seconds': utils.get_config_value_or_default( "Spark", "Maximum cleanup seconds", cluster) } job_conf = { 'valid': ( _convert_config_to_int( spark_config['maximum_cleanup_seconds']) > 0 and _convert_config_to_int( spark_config['minimum_cleanup_megabytes']) > 0 and _convert_config_to_int( spark_config['minimum_cleanup_seconds']) > 0) } if job_conf['valid']: job_conf['cron'] = utils.get_file_text( 'plugins/spark/resources/spark-cleanup.cron', 'sahara_plugin_spark'), job_cleanup_script = utils.get_file_text( 'plugins/spark/resources/tmp-cleanup.sh.template', 'sahara_plugin_spark') job_conf['script'] = job_cleanup_script.format(**spark_config) return job_conf def _convert_config_to_int(config_value): try: return int(config_value) except ValueError: return -1 def extract_name_values(configs): return {cfg['name']: cfg['value'] for cfg in configs} def make_hadoop_path(base_dirs, suffix): return [base_dir + suffix for base_dir in base_dirs] def extract_hadoop_path(lst, hadoop_dir): if lst: return ",".join(make_hadoop_path(lst, hadoop_dir)) def _set_config(cfg, gen_cfg, name=None): if name in gen_cfg: cfg.update(gen_cfg[name]['conf']) if name is None: for name in gen_cfg: cfg.update(gen_cfg[name]['conf']) return cfg def _get_general_config_value(conf, option): if 'general' in conf and option.name in conf['general']: return conf['general'][option.name] return option.default_value def _get_general_cluster_config_value(cluster, option): return _get_general_config_value(cluster.cluster_configs, option) def is_data_locality_enabled(cluster): if not CONF.enable_data_locality: return False return _get_general_cluster_config_value(cluster, ENABLE_DATA_LOCALITY) def is_swift_enabled(configs): return _get_general_config_value(configs, ENABLE_SWIFT) def get_decommissioning_timeout(cluster): return _get_general_cluster_config_value(cluster, 
DECOMMISSIONING_TIMEOUT) def get_port_from_config(service, name, cluster=None): address = utils.get_config_value_or_default(service, name, cluster) return utils.get_port_from_address(address) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/edp_engine.py0000664000175000017500000000445600000000000026762 0ustar00zuulzuul00000000000000# Copyright (c) 2014 Mirantis Inc. # Copyright (c) 2015 ISPRAS # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. import os import six from sahara.plugins import edp from sahara.plugins import exceptions as ex from sahara.plugins import utils as plugin_utils from sahara_plugin_spark.i18n import _ class EdpEngine(edp.PluginsSparkJobEngine): edp_base_version = "1.6.0" def __init__(self, cluster): super(EdpEngine, self).__init__(cluster) self.master = plugin_utils.get_instance(cluster, "master") self.plugin_params["spark-user"] = "" self.plugin_params["spark-submit"] = os.path.join( plugin_utils. get_config_value_or_default("Spark", "Spark home", self.cluster), "bin/spark-submit") self.plugin_params["deploy-mode"] = "client" port_str = six.text_type( plugin_utils.get_config_value_or_default( "Spark", "Master port", self.cluster)) self.plugin_params["master"] = ('spark://%(host)s:' + port_str) driver_cp = plugin_utils.get_config_value_or_default( "Spark", "Executor extra classpath", self.cluster) self.plugin_params["driver-class-path"] = driver_cp @staticmethod def edp_supported(version): return version >= EdpEngine.edp_base_version @staticmethod def job_type_supported(job_type): return job_type in edp.PluginsSparkJobEngine.get_supported_job_types() def validate_job_execution(self, cluster, job, data): if not self.edp_supported(cluster.hadoop_version): raise ex.PluginInvalidDataException( _('Spark {base} or higher required to run {type} jobs').format( base=EdpEngine.edp_base_version, type=job.type)) super(EdpEngine, self).validate_job_execution(cluster, job, data) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/images.py0000664000175000017500000000302400000000000026120 0ustar00zuulzuul00000000000000# Copyright (c) 2019 Red Hat, Inc. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. 
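# Module overview: this module plugs Spark into sahara's common image
# machinery. The validator below is built from the declarative spec in
# plugins/spark/resources/images/image.yaml; get_image_arguments() exposes the
# validator's argument list, pack_image() runs the validator against a remote
# image (test_only presumably meaning "check without modifying"), and
# validate_images() applies the same checks to the instances of a running
# cluster.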
from sahara.plugins import images from sahara.plugins import utils as plugin_utils _validator = images.SaharaImageValidator.from_yaml( 'plugins/spark/resources/images/image.yaml', resource_roots=['plugins/spark/resources/images'], package='sahara_plugin_spark') def get_image_arguments(): return _validator.get_argument_list() def pack_image(remote, test_only=False, image_arguments=None): _validator.validate(remote, test_only=test_only, image_arguments=image_arguments) def validate_images(cluster, test_only=False, image_arguments=None): image_arguments = get_image_arguments() if not test_only: instances = plugin_utils.get_instances(cluster) else: instances = plugin_utils.get_instances(cluster)[0] for instance in instances: with instance.remote() as r: _validator.validate(r, test_only=test_only, image_arguments=image_arguments) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/plugin.py0000664000175000017500000005667300000000000026173 0ustar00zuulzuul00000000000000# Copyright (c) 2014 Hoang Do, Phuc Vo, P. Michiardi, D. Venzano # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. import copy import os from oslo_config import cfg from oslo_log import log as logging from sahara.plugins import conductor from sahara.plugins import context from sahara.plugins import exceptions as ex from sahara.plugins import provisioning as p from sahara.plugins import recommendations_utils as ru from sahara.plugins import swift_helper from sahara.plugins import topology_helper as th from sahara.plugins import utils from sahara_plugin_spark.i18n import _ from sahara_plugin_spark.plugins.spark import config_helper as c_helper from sahara_plugin_spark.plugins.spark import edp_engine from sahara_plugin_spark.plugins.spark import images from sahara_plugin_spark.plugins.spark import run_scripts as run from sahara_plugin_spark.plugins.spark import scaling as sc from sahara_plugin_spark.plugins.spark import shell_engine LOG = logging.getLogger(__name__) CONF = cfg.CONF class SparkProvider(p.ProvisioningPluginBase): def __init__(self): self.processes = { "HDFS": ["namenode", "datanode"], "Spark": ["master", "slave"] } def get_title(self): return "Apache Spark" def get_description(self): return _("This plugin provides an ability to launch Spark on Hadoop " "CDH cluster without any management consoles.") def get_labels(self): default = {'enabled': {'status': True}, 'stable': {'status': True}} deprecated = {'enabled': {'status': True}, 'deprecated': {'status': True}} result = {'plugin_labels': copy.deepcopy(default)} stable_versions = ['2.3', '2.2'] result['version_labels'] = { version: copy.deepcopy( default if version in stable_versions else deprecated ) for version in self.get_versions() } return result def get_versions(self): return ['2.3', '2.2', '2.1.0', '1.6.0'] def get_configs(self, hadoop_version): return c_helper.get_plugin_configs() def get_node_processes(self, hadoop_version): return self.processes def 
validate(self, cluster): nn_count = sum([ng.count for ng in utils.get_node_groups(cluster, "namenode")]) if nn_count != 1: raise ex.InvalidComponentCountException("namenode", 1, nn_count) dn_count = sum([ng.count for ng in utils.get_node_groups(cluster, "datanode")]) if dn_count < 1: raise ex.InvalidComponentCountException("datanode", _("1 or more"), dn_count) rep_factor = utils.get_config_value_or_default('HDFS', "dfs.replication", cluster) if dn_count < rep_factor: raise ex.InvalidComponentCountException( 'datanode', _('%s or more') % rep_factor, dn_count, _('Number of %(dn)s instances should not be less ' 'than %(replication)s') % {'dn': 'datanode', 'replication': 'dfs.replication'}) # validate Spark Master Node and Spark Slaves sm_count = sum([ng.count for ng in utils.get_node_groups(cluster, "master")]) if sm_count < 1: raise ex.RequiredServiceMissingException("Spark master") if sm_count >= 2: raise ex.InvalidComponentCountException("Spark master", "1", sm_count) sl_count = sum([ng.count for ng in utils.get_node_groups(cluster, "slave")]) if sl_count < 1: raise ex.InvalidComponentCountException("Spark slave", _("1 or more"), sl_count) def update_infra(self, cluster): pass def configure_cluster(self, cluster): self._setup_instances(cluster) @utils.event_wrapper( True, step=utils.start_process_event_message("NameNode")) def _start_namenode(self, nn_instance): with utils.get_remote(nn_instance) as r: run.format_namenode(r) run.start_processes(r, "namenode") def start_spark(self, cluster): sm_instance = utils.get_instance(cluster, "master") if sm_instance: self._start_spark(cluster, sm_instance) @utils.event_wrapper( True, step=utils.start_process_event_message("SparkMasterNode")) def _start_spark(self, cluster, sm_instance): with utils.get_remote(sm_instance) as r: run.start_spark_master(r, self._spark_home(cluster)) LOG.info("Spark service has been started") def start_cluster(self, cluster): nn_instance = utils.get_instance(cluster, "namenode") dn_instances = utils.get_instances(cluster, "datanode") # Start the name node self._start_namenode(nn_instance) # Start the data nodes self._start_datanode_processes(dn_instances) run.await_datanodes(cluster) LOG.info("Hadoop services have been started") with utils.get_remote(nn_instance) as r: r.execute_command("sudo -u hdfs hdfs dfs -mkdir -p /user/$USER/") r.execute_command("sudo -u hdfs hdfs dfs -chown $USER " "/user/$USER/") # start spark nodes self.start_spark(cluster) swift_helper.install_ssl_certs(utils.get_instances(cluster)) LOG.info('Cluster has been started successfully') self._set_cluster_info(cluster) def _spark_home(self, cluster): return utils.get_config_value_or_default("Spark", "Spark home", cluster) def _extract_configs_to_extra(self, cluster): sp_master = utils.get_instance(cluster, "master") sp_slaves = utils.get_instances(cluster, "slave") extra = dict() config_master = config_slaves = '' if sp_master is not None: config_master = c_helper.generate_spark_env_configs(cluster) if sp_slaves is not None: slavenames = [] for slave in sp_slaves: slavenames.append(slave.hostname()) config_slaves = c_helper.generate_spark_slaves_configs(slavenames) else: config_slaves = "\n" # Any node that might be used to run spark-submit will need # these libs for swift integration config_defaults = c_helper.generate_spark_executor_classpath(cluster) extra['job_cleanup'] = c_helper.generate_job_cleanup_config(cluster) extra['sp_master'] = config_master extra['sp_slaves'] = config_slaves extra['sp_defaults'] = config_defaults if 
c_helper.is_data_locality_enabled(cluster): topology_data = th.generate_topology_map( cluster, CONF.enable_hypervisor_awareness) extra['topology_data'] = "\n".join( [k + " " + v for k, v in topology_data.items()]) + "\n" return extra def _add_instance_ng_related_to_extra(self, cluster, instance, extra): extra = extra.copy() ng = instance.node_group nn = utils.get_instance(cluster, "namenode") extra['xml'] = c_helper.generate_xml_configs( ng.configuration(), instance.storage_paths(), nn.hostname(), None) extra['setup_script'] = c_helper.generate_hadoop_setup_script( instance.storage_paths(), c_helper.extract_hadoop_environment_confs(ng.configuration())) return extra def _start_datanode_processes(self, dn_instances): if len(dn_instances) == 0: return utils.add_provisioning_step( dn_instances[0].cluster_id, utils.start_process_event_message("DataNodes"), len(dn_instances)) with context.PluginsThreadGroup() as tg: for i in dn_instances: tg.spawn('spark-start-dn-%s' % i.instance_name, self._start_datanode, i) @utils.event_wrapper(mark_successful_on_exit=True) def _start_datanode(self, instance): with instance.remote() as r: run.start_processes(r, "datanode") def _setup_instances(self, cluster, instances=None): extra = self._extract_configs_to_extra(cluster) if instances is None: instances = utils.get_instances(cluster) self._push_configs_to_nodes(cluster, extra, instances) def _push_configs_to_nodes(self, cluster, extra, new_instances): all_instances = utils.get_instances(cluster) utils.add_provisioning_step( cluster.id, _("Push configs to nodes"), len(all_instances)) with context.PluginsThreadGroup() as tg: for instance in all_instances: extra = self._add_instance_ng_related_to_extra( cluster, instance, extra) if instance in new_instances: tg.spawn('spark-configure-%s' % instance.instance_name, self._push_configs_to_new_node, cluster, extra, instance) else: tg.spawn('spark-reconfigure-%s' % instance.instance_name, self._push_configs_to_existing_node, cluster, extra, instance) @utils.event_wrapper(mark_successful_on_exit=True) def _push_configs_to_new_node(self, cluster, extra, instance): files_hadoop = { os.path.join(c_helper.HADOOP_CONF_DIR, "core-site.xml"): extra['xml']['core-site'], os.path.join(c_helper.HADOOP_CONF_DIR, "hdfs-site.xml"): extra['xml']['hdfs-site'], } sp_home = self._spark_home(cluster) files_spark = { os.path.join(sp_home, 'conf/spark-env.sh'): extra['sp_master'], os.path.join(sp_home, 'conf/slaves'): extra['sp_slaves'], os.path.join(sp_home, 'conf/spark-defaults.conf'): extra['sp_defaults'] } files_init = { '/tmp/sahara-hadoop-init.sh': extra['setup_script'], 'id_rsa': cluster.management_private_key, 'authorized_keys': cluster.management_public_key } # pietro: This is required because the (secret) key is not stored in # .ssh which hinders password-less ssh required by spark scripts key_cmd = ('sudo cp $HOME/id_rsa $HOME/.ssh/; ' 'sudo chown $USER $HOME/.ssh/id_rsa; ' 'sudo chmod 600 $HOME/.ssh/id_rsa') storage_paths = instance.storage_paths() dn_path = ' '.join(c_helper.make_hadoop_path(storage_paths, '/dfs/dn')) nn_path = ' '.join(c_helper.make_hadoop_path(storage_paths, '/dfs/nn')) hdfs_dir_cmd = ('sudo mkdir -p %(nn_path)s %(dn_path)s &&' 'sudo chown -R hdfs:hadoop %(nn_path)s %(dn_path)s &&' 'sudo chmod 755 %(nn_path)s %(dn_path)s' % {"nn_path": nn_path, "dn_path": dn_path}) with utils.get_remote(instance) as r: r.execute_command( 'sudo chown -R $USER:$USER /etc/hadoop' ) r.execute_command( 'sudo chown -R $USER:$USER %s' % sp_home ) r.write_files_to(files_hadoop) 
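# Remaining steps for a new node: push the Spark configuration files and the
# per-cluster init material (setup script plus the cluster management keypair),
# lock down and run the init script, create the HDFS name/data directories, and
# move the private key into ~/.ssh so the Spark scripts can use password-less SSH.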
r.write_files_to(files_spark) r.write_files_to(files_init) r.execute_command( 'sudo chmod 0500 /tmp/sahara-hadoop-init.sh' ) r.execute_command( 'sudo /tmp/sahara-hadoop-init.sh ' '>> /tmp/sahara-hadoop-init.log 2>&1') r.execute_command(hdfs_dir_cmd) r.execute_command(key_cmd) if c_helper.is_data_locality_enabled(cluster): r.write_file_to( '/etc/hadoop/topology.sh', utils.get_file_text( 'plugins/spark/resources/topology.sh', 'sahara_plugin_spark')) r.execute_command( 'sudo chmod +x /etc/hadoop/topology.sh' ) self._write_topology_data(r, cluster, extra) self._push_master_configs(r, cluster, extra, instance) self._push_cleanup_job(r, cluster, extra, instance) @utils.event_wrapper(mark_successful_on_exit=True) def _push_configs_to_existing_node(self, cluster, extra, instance): node_processes = instance.node_group.node_processes need_update_hadoop = (c_helper.is_data_locality_enabled(cluster) or 'namenode' in node_processes) need_update_spark = ('master' in node_processes or 'slave' in node_processes) if need_update_spark: sp_home = self._spark_home(cluster) files = { os.path.join(sp_home, 'conf/spark-env.sh'): extra['sp_master'], os.path.join(sp_home, 'conf/slaves'): extra['sp_slaves'], os.path.join( sp_home, 'conf/spark-defaults.conf'): extra['sp_defaults'] } r = utils.get_remote(instance) r.write_files_to(files) self._push_cleanup_job(r, cluster, extra, instance) if need_update_hadoop: with utils.get_remote(instance) as r: self._write_topology_data(r, cluster, extra) self._push_master_configs(r, cluster, extra, instance) def _write_topology_data(self, r, cluster, extra): if c_helper.is_data_locality_enabled(cluster): topology_data = extra['topology_data'] r.write_file_to('/etc/hadoop/topology.data', topology_data) def _push_master_configs(self, r, cluster, extra, instance): node_processes = instance.node_group.node_processes if 'namenode' in node_processes: self._push_namenode_configs(cluster, r) def _push_cleanup_job(self, r, cluster, extra, instance): node_processes = instance.node_group.node_processes if 'master' in node_processes: if extra['job_cleanup']['valid']: r.write_file_to('/etc/hadoop/tmp-cleanup.sh', extra['job_cleanup']['script']) r.execute_command("chmod 755 /etc/hadoop/tmp-cleanup.sh") cmd = 'sudo sh -c \'echo "%s" > /etc/cron.d/spark-cleanup\'' r.execute_command(cmd % extra['job_cleanup']['cron']) else: r.execute_command("sudo rm -f /etc/hadoop/tmp-cleanup.sh") r.execute_command("sudo rm -f /etc/cron.d/spark-cleanup") def _push_namenode_configs(self, cluster, r): r.write_file_to('/etc/hadoop/dn.incl', utils.generate_fqdn_host_names( utils.get_instances(cluster, "datanode"))) r.write_file_to('/etc/hadoop/dn.excl', '') def _set_cluster_info(self, cluster): nn = utils.get_instance(cluster, "namenode") sp_master = utils.get_instance(cluster, "master") info = {} if nn: address = utils.get_config_value_or_default( 'HDFS', 'dfs.http.address', cluster) port = address[address.rfind(':') + 1:] info['HDFS'] = { 'Web UI': 'http://%s:%s' % (nn.get_ip_or_dns_name(), port) } info['HDFS']['NameNode'] = 'hdfs://%s:8020' % nn.hostname() if sp_master: port = utils.get_config_value_or_default( 'Spark', 'Master webui port', cluster) if port is not None: info['Spark'] = { 'Web UI': 'http://%s:%s' % ( sp_master.get_ip_or_dns_name(), port) } ctx = context.ctx() conductor.cluster_update(ctx, cluster, {'info': info}) # Scaling def validate_scaling(self, cluster, existing, additional): self._validate_existing_ng_scaling(cluster, existing) self._validate_additional_ng_scaling(cluster, additional) def 
decommission_nodes(self, cluster, instances): sls = utils.get_instances(cluster, "slave") dns = utils.get_instances(cluster, "datanode") decommission_dns = False decommission_sls = False for i in instances: if 'datanode' in i.node_group.node_processes: dns.remove(i) decommission_dns = True if 'slave' in i.node_group.node_processes: sls.remove(i) decommission_sls = True nn = utils.get_instance(cluster, "namenode") spark_master = utils.get_instance(cluster, "master") if decommission_sls: sc.decommission_sl(spark_master, instances, sls) if decommission_dns: sc.decommission_dn(nn, instances, dns) def scale_cluster(self, cluster, instances): master = utils.get_instance(cluster, "master") r_master = utils.get_remote(master) run.stop_spark(r_master, self._spark_home(cluster)) self._setup_instances(cluster, instances) nn = utils.get_instance(cluster, "namenode") run.refresh_nodes(utils.get_remote(nn), "dfsadmin") dn_instances = [instance for instance in instances if 'datanode' in instance.node_group.node_processes] self._start_datanode_processes(dn_instances) swift_helper.install_ssl_certs(instances) run.start_spark_master(r_master, self._spark_home(cluster)) LOG.info("Spark master service has been restarted") def _get_scalable_processes(self): return ["datanode", "slave"] def _validate_additional_ng_scaling(self, cluster, additional): scalable_processes = self._get_scalable_processes() for ng_id in additional: ng = utils.get_by_id(cluster.node_groups, ng_id) if not set(ng.node_processes).issubset(scalable_processes): raise ex.NodeGroupCannotBeScaled( ng.name, _("Spark plugin cannot scale nodegroup" " with processes: %s") % ' '.join(ng.node_processes)) def _validate_existing_ng_scaling(self, cluster, existing): scalable_processes = self._get_scalable_processes() dn_to_delete = 0 for ng in cluster.node_groups: if ng.id in existing: if ng.count > existing[ng.id] and ("datanode" in ng.node_processes): dn_to_delete += ng.count - existing[ng.id] if not set(ng.node_processes).issubset(scalable_processes): raise ex.NodeGroupCannotBeScaled( ng.name, _("Spark plugin cannot scale nodegroup" " with processes: %s") % ' '.join(ng.node_processes)) dn_amount = len(utils.get_instances(cluster, "datanode")) rep_factor = utils.get_config_value_or_default('HDFS', "dfs.replication", cluster) if dn_to_delete > 0 and dn_amount - dn_to_delete < rep_factor: raise ex.ClusterCannotBeScaled( cluster.name, _("Spark plugin cannot shrink cluster because " "there would be not enough nodes for HDFS " "replicas (replication factor is %s)") % rep_factor) def get_edp_engine(self, cluster, job_type): if edp_engine.EdpEngine.job_type_supported(job_type): return edp_engine.EdpEngine(cluster) if shell_engine.ShellEngine.job_type_supported(job_type): return shell_engine.ShellEngine(cluster) return None def get_edp_job_types(self, versions=None): res = {} for vers in self.get_versions(): if not versions or vers in versions: res[vers] = shell_engine.ShellEngine.get_supported_job_types() if edp_engine.EdpEngine.edp_supported(vers): res[vers].extend( edp_engine.EdpEngine.get_supported_job_types()) return res def get_edp_config_hints(self, job_type, version): if (edp_engine.EdpEngine.edp_supported(version) and edp_engine.EdpEngine.job_type_supported(job_type)): return edp_engine.EdpEngine.get_possible_job_config(job_type) if shell_engine.ShellEngine.job_type_supported(job_type): return shell_engine.ShellEngine.get_possible_job_config(job_type) return {} def get_open_ports(self, node_group): cluster = node_group.cluster ports_map = { 
'namenode': [8020, 50070, 50470], 'datanode': [50010, 1004, 50075, 1006, 50020], 'master': [ int(utils.get_config_value_or_default("Spark", "Master port", cluster)), int(utils.get_config_value_or_default("Spark", "Master webui port", cluster)), ], 'slave': [ int(utils.get_config_value_or_default("Spark", "Worker webui port", cluster)) ] } ports = [] for process in node_group.node_processes: if process in ports_map: ports.extend(ports_map[process]) return ports def recommend_configs(self, cluster, scaling=False): want_to_configure = { 'cluster_configs': { 'dfs.replication': ('HDFS', 'dfs.replication') } } provider = ru.HadoopAutoConfigsProvider( want_to_configure, self.get_configs( cluster.hadoop_version), cluster, scaling) provider.apply_recommended_configs() def get_image_arguments(self, hadoop_version): if hadoop_version in ['1.6.0', '2.1.0']: return NotImplemented return images.get_image_arguments() def pack_image(self, hadoop_version, remote, test_only=False, image_arguments=None): images.pack_image(remote, test_only=test_only, image_arguments=image_arguments) def validate_images(self, cluster, test_only=False, image_arguments=None): if cluster.hadoop_version not in ['1.6.0', '2.1.0']: images.validate_images(cluster, test_only=test_only, image_arguments=image_arguments) ././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1696418419.637837 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/0000775000175000017500000000000000000000000026314 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/README.rst0000664000175000017500000000155300000000000030007 0ustar00zuulzuul00000000000000Apache Spark and HDFS Configurations for Sahara =============================================== This directory contains default XML configuration files and Spark scripts: * core-default.xml, * hdfs-default.xml, * spark-env.sh.template, * topology.sh These files are used by Sahara's plugin for Apache Spark and Cloudera HDFS. XML config files were taken from here: * https://github.com/apache/hadoop-common/blob/release-1.2.1/src/core/core-default.xml * https://github.com/apache/hadoop-common/blob/release-1.2.1/src/hdfs/hdfs-default.xml Cloudera packages use the same configuration files as standard Apache Hadoop. XML configs are used to expose default Hadoop configurations to the users through Sahara's REST API. It allows users to override some config values which will be pushed to the provisioned VMs running Hadoop services as part of appropriate xml config. ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/core-default.xml0000664000175000017500000004734300000000000031423 0ustar00zuulzuul00000000000000 hadoop.tmp.dir /tmp/hadoop-${user.name} A base for other temporary directories. hadoop.native.lib true Should native hadoop libraries, if present, be used. hadoop.http.filter.initializers A comma separated list of class names. Each class in the list must extend org.apache.hadoop.http.FilterInitializer. The corresponding Filter will be initialized. Then, the Filter will be applied to all user facing jsp and servlet web pages. The ordering of the list defines the ordering of the filters. 
hadoop.security.group.mapping org.apache.hadoop.security.ShellBasedUnixGroupsMapping Class for user to group mapping (get groups for a given user) hadoop.security.authorization false Is service-level authorization enabled? hadoop.security.instrumentation.requires.admin false Indicates if administrator ACLs are required to access instrumentation servlets (JMX, METRICS, CONF, STACKS). hadoop.security.authentication simple Possible values are simple (no authentication), and kerberos hadoop.security.token.service.use_ip true Controls whether tokens always use IP addresses. DNS changes will not be detected if this option is enabled. Existing client connections that break will always reconnect to the IP of the original host. New clients will connect to the host's new IP but fail to locate a token. Disabling this option will allow existing and new clients to detect an IP change and continue to locate the new host's token. hadoop.security.use-weak-http-crypto false If enabled, use KSSL to authenticate HTTP connections to the NameNode. Due to a bug in JDK6, using KSSL requires one to configure Kerberos tickets to use encryption types that are known to be cryptographically weak. If disabled, SPNEGO will be used for HTTP authentication, which supports stronger encryption types. hadoop.logfile.size 10000000 The max size of each log file hadoop.logfile.count 10 The max number of log files io.file.buffer.size 4096 The size of buffer for use in sequence files. The size of this buffer should probably be a multiple of hardware page size (4096 on Intel x86), and it determines how much data is buffered during read and write operations. io.bytes.per.checksum 512 The number of bytes per checksum. Must not be larger than io.file.buffer.size. io.skip.checksum.errors false If true, when a checksum error is encountered while reading a sequence file, entries are skipped, instead of throwing an exception. io.compression.codecs org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec A list of the compression codec classes that can be used for compression/decompression. io.serializations org.apache.hadoop.io.serializer.WritableSerialization A list of serialization classes that can be used for obtaining serializers and deserializers. fs.defaultFS file:/// The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem. fs.trash.interval 0 Number of minutes between trash checkpoints. If zero, the trash feature is disabled. fs.file.impl org.apache.hadoop.fs.LocalFileSystem The FileSystem for file: uris. fs.hdfs.impl org.apache.hadoop.hdfs.DistributedFileSystem The FileSystem for hdfs: uris. fs.s3.impl org.apache.hadoop.fs.s3.S3FileSystem The FileSystem for s3: uris. fs.s3n.impl org.apache.hadoop.fs.s3native.NativeS3FileSystem The FileSystem for s3n: (Native S3) uris. fs.kfs.impl org.apache.hadoop.fs.kfs.KosmosFileSystem The FileSystem for kfs: uris. fs.hftp.impl org.apache.hadoop.hdfs.HftpFileSystem fs.hsftp.impl org.apache.hadoop.hdfs.HsftpFileSystem fs.webhdfs.impl org.apache.hadoop.hdfs.web.WebHdfsFileSystem fs.ftp.impl org.apache.hadoop.fs.ftp.FTPFileSystem The FileSystem for ftp: uris. fs.ramfs.impl org.apache.hadoop.fs.InMemoryFileSystem The FileSystem for ramfs: uris. 
fs.har.impl org.apache.hadoop.fs.HarFileSystem The filesystem for Hadoop archives. fs.har.impl.disable.cache true Don't cache 'har' filesystem instances. fs.checkpoint.dir ${hadoop.tmp.dir}/dfs/namesecondary Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If this is a comma-delimited list of directories then the image is replicated in all of the directories for redundancy. fs.checkpoint.edits.dir ${fs.checkpoint.dir} Determines where on the local filesystem the DFS secondary name node should store the temporary edits to merge. If this is a comma-delimited list of directoires then teh edits is replicated in all of the directoires for redundancy. Default value is same as fs.checkpoint.dir fs.checkpoint.period 3600 The number of seconds between two periodic checkpoints. fs.checkpoint.size 67108864 The size of the current edit log (in bytes) that triggers a periodic checkpoint even if the fs.checkpoint.period hasn't expired. fs.s3.block.size 67108864 Block size to use when writing files to S3. fs.s3.buffer.dir ${hadoop.tmp.dir}/s3 Determines where on the local filesystem the S3 filesystem should store files before sending them to S3 (or after retrieving them from S3). fs.s3.maxRetries 4 The maximum number of retries for reading or writing files to S3, before we signal failure to the application. fs.s3.sleepTimeSeconds 10 The number of seconds to sleep between each S3 retry. local.cache.size 10737418240 The limit on the size of cache you want to keep, set by default to 10GB. This will act as a soft limit on the cache directory for out of band data. io.seqfile.compress.blocksize 1000000 The minimum block size for compression in block compressed SequenceFiles. io.seqfile.lazydecompress true Should values of block-compressed SequenceFiles be decompressed only when necessary. io.seqfile.sorter.recordlimit 1000000 The limit on number of records to be kept in memory in a spill in SequenceFiles.Sorter io.mapfile.bloom.size 1048576 The size of BloomFilter-s used in BloomMapFile. Each time this many keys is appended the next BloomFilter will be created (inside a DynamicBloomFilter). Larger values minimize the number of filters, which slightly increases the performance, but may waste too much space if the total number of keys is usually much smaller than this number. io.mapfile.bloom.error.rate 0.005 The rate of false positives in BloomFilter-s used in BloomMapFile. As this value decreases, the size of BloomFilter-s increases exponentially. This value is the probability of encountering false positives (default is 0.5%). hadoop.util.hash.type murmur The default implementation of Hash. Currently this can take one of the two values: 'murmur' to select MurmurHash and 'jenkins' to select JenkinsHash. ipc.client.idlethreshold 4000 Defines the threshold number of connections after which connections will be inspected for idleness. ipc.client.kill.max 10 Defines the maximum number of clients to disconnect in one go. ipc.client.connection.maxidletime 10000 The maximum time in msec after which a client will bring down the connection to the server. ipc.client.connect.max.retries 10 Indicates the number of retries a client will make to establish a server connection. ipc.server.listen.queue.size 128 Indicates the length of the listen queue for servers accepting client connections. ipc.server.tcpnodelay false Turn on/off Nagle's algorithm for the TCP socket connection on the server. 
Setting to true disables the algorithm and may decrease latency with a cost of more/smaller packets. ipc.client.tcpnodelay false Turn on/off Nagle's algorithm for the TCP socket connection on the client. Setting to true disables the algorithm and may decrease latency with a cost of more/smaller packets. webinterface.private.actions false If set to true, the web interfaces of JT and NN may contain actions, such as kill job, delete file, etc., that should not be exposed to public. Enable this option if the interfaces are only reachable by those who have the right authorization. hadoop.rpc.socket.factory.class.default org.apache.hadoop.net.StandardSocketFactory Default SocketFactory to use. This parameter is expected to be formatted as "package.FactoryClassName". hadoop.rpc.socket.factory.class.ClientProtocol SocketFactory to use to connect to a DFS. If null or empty, use hadoop.rpc.socket.class.default. This socket factory is also used by DFSClient to create sockets to DataNodes. hadoop.socks.server Address (host:port) of the SOCKS server to be used by the SocksSocketFactory. topology.node.switch.mapping.impl org.apache.hadoop.net.ScriptBasedMapping The default implementation of the DNSToSwitchMapping. It invokes a script specified in topology.script.file.name to resolve node names. If the value for topology.script.file.name is not set, the default value of DEFAULT_RACK is returned for all node names. net.topology.impl org.apache.hadoop.net.NetworkTopology The default implementation of NetworkTopology which is classic three layer one. topology.script.file.name The script name that should be invoked to resolve DNS names to NetworkTopology names. Example: the script would take host.foo.bar as an argument, and return /rack1 as the output. topology.script.number.args 100 The max number of args that the script configured with topology.script.file.name should be run with. Each arg is an IP address. hadoop.security.uid.cache.secs 14400 NativeIO maintains a cache from UID to UserName. This is the timeout for an entry in that cache. hadoop.http.authentication.type simple Defines authentication used for Oozie HTTP endpoint. Supported values are: simple | kerberos | #AUTHENTICATION_HANDLER_CLASSNAME# hadoop.http.authentication.token.validity 36000 Indicates how long (in seconds) an authentication token is valid before it has to be renewed. hadoop.http.authentication.signature.secret.file ${user.home}/hadoop-http-auth-signature-secret The signature secret for signing the authentication tokens. If not set a random secret is generated at startup time. The same secret should be used for JT/NN/DN/TT configurations. hadoop.http.authentication.cookie.domain The domain to use for the HTTP cookie that stores the authentication token. In order to authentiation to work correctly across all Hadoop nodes web-consoles the domain must be correctly set. IMPORTANT: when using IP addresses, browsers ignore cookies with domain settings. For this setting to work properly all nodes in the cluster must be configured to generate URLs with hostname.domain names on it. hadoop.http.authentication.simple.anonymous.allowed true Indicates if anonymous requests are allowed when using 'simple' authentication. hadoop.http.authentication.kerberos.principal HTTP/localhost@LOCALHOST Indicates the Kerberos principal to be used for HTTP endpoint. The principal MUST start with 'HTTP/' as per Kerberos HTTP SPNEGO specification. 
hadoop.http.authentication.kerberos.keytab ${user.home}/hadoop.keytab Location of the keytab file with the credentials for the principal. Referring to the same keytab file Oozie uses for its Kerberos credentials for Hadoop. hadoop.relaxed.worker.version.check false By default datanodes refuse to connect to namenodes if their build revision (svn revision) do not match, and tasktrackers refuse to connect to jobtrackers if their build version (version, revision, user, and source checksum) do not match. This option changes the behavior of hadoop workers to only check for a version match (eg "1.0.2") but ignore the other build fields (revision, user, and source checksum). hadoop.skip.worker.version.check false By default datanodes refuse to connect to namenodes if their build revision (svn revision) do not match, and tasktrackers refuse to connect to jobtrackers if their build version (version, revision, user, and source checksum) do not match. This option changes the behavior of hadoop workers to skip doing a version check at all. This option supersedes the 'hadoop.relaxed.worker.version.check' option. hadoop.jetty.logs.serve.aliases true Enable/Disable aliases serving from jetty ipc.client.fallback-to-simple-auth-allowed false When a client is configured to attempt a secure connection, but attempts to connect to an insecure server, that server may instruct the client to switch to SASL SIMPLE (unsecure) authentication. This setting controls whether or not the client will accept this instruction from the server. When false (the default), the client will not allow the fallback to SIMPLE authentication, and will abort the connection. ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/hdfs-default.xml0000664000175000017500000005467200000000000031422 0ustar00zuulzuul00000000000000 dfs.namenode.logging.level info The logging level for dfs namenode. Other values are "dir"(trac e namespace mutations), "block"(trace block under/over replications and block creations/deletions), or "all". dfs.namenode.rpc-address RPC address that handles all clients requests. If empty then we'll get the value from fs.default.name. The value of this property will take the form of hdfs://nn-host1:rpc-port. dfs.secondary.http.address 0.0.0.0:50090 The secondary namenode http server address and port. If the port is 0 then the server will start on a free port. dfs.datanode.address 0.0.0.0:50010 The datanode server address and port for data transfer. If the port is 0 then the server will start on a free port. dfs.datanode.http.address 0.0.0.0:50075 The datanode http server address and port. If the port is 0 then the server will start on a free port. dfs.datanode.ipc.address 0.0.0.0:50020 The datanode ipc server address and port. If the port is 0 then the server will start on a free port. dfs.datanode.handler.count 3 The number of server threads for the datanode. dfs.http.address 0.0.0.0:50070 The address and the base port where the dfs namenode web ui will listen on. If the port is 0 then the server will start on a free port. 
dfs.https.enable false Decide if HTTPS(SSL) is supported on HDFS dfs.https.need.client.auth false Whether SSL client certificate authentication is required dfs.https.server.keystore.resource ssl-server.xml Resource file from which ssl server keystore information will be extracted dfs.https.client.keystore.resource ssl-client.xml Resource file from which ssl client keystore information will be extracted dfs.datanode.https.address 0.0.0.0:50475 dfs.https.address 0.0.0.0:50470 dfs.datanode.dns.interface default The name of the Network Interface from which a data node should report its IP address. dfs.datanode.dns.nameserver default The host name or IP address of the name server (DNS) which a DataNode should use to determine the host name used by the NameNode for communication and display purposes. dfs.replication.considerLoad true Decide if chooseTarget considers the target's load or not dfs.default.chunk.view.size 32768 The number of bytes to view for a file on the browser. dfs.datanode.du.reserved 0 Reserved space in bytes per volume. Always leave this much space free for non dfs use. dfs.namenode.name.dir ${hadoop.tmp.dir}/dfs/name Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. dfs.name.edits.dir ${dfs.name.dir} Determines where on the local filesystem the DFS name node should store the transaction (edits) file. If this is a comma-delimited list of directories then the transaction file is replicated in all of the directories, for redundancy. Default value is same as dfs.name.dir dfs.namenode.edits.toleration.length 0 The length in bytes that namenode is willing to tolerate when the edit log is corrupted. The edit log toleration feature checks the entire edit log. It computes read length (the length of valid data), corruption length and padding length. In case that corruption length is non-zero, the corruption will be tolerated only if the corruption length is less than or equal to the toleration length. For disabling edit log toleration feature, set this property to -1. When the feature is disabled, the end of edit log will not be checked. In this case, namenode will startup normally even if the end of edit log is corrupted. dfs.web.ugi webuser,webgroup The user account used by the web interface. Syntax: USERNAME,GROUP1,GROUP2, ... dfs.permissions true If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories. dfs.permissions.supergroup supergroup The name of the group of super-users. dfs.block.access.token.enable false If "true", access tokens are used as capabilities for accessing datanodes. If "false", no access tokens are checked on accessing datanodes. dfs.block.access.key.update.interval 600 Interval in minutes at which namenode updates its access keys. dfs.block.access.token.lifetime 600 The lifetime of access tokens in minutes. dfs.datanode.data.dir ${hadoop.tmp.dir}/dfs/data Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored. 
dfs.datanode.data.dir.perm 755 Permissions for the directories on on the local filesystem where the DFS data node store its blocks. The permissions can either be octal or symbolic. dfs.replication 3 Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time. dfs.replication.max 512 Maximal block replication. dfs.replication.min 1 Minimal block replication. dfs.block.size 67108864 The default block size for new files. dfs.df.interval 60000 Disk usage statistics refresh interval in msec. dfs.client.block.write.retries 3 The number of retries for writing blocks to the data nodes, before we signal failure to the application. dfs.blockreport.intervalMsec 3600000 Determines block reporting interval in milliseconds. dfs.blockreport.initialDelay 0 Delay for first block report in seconds. dfs.heartbeat.interval 3 Determines datanode heartbeat interval in seconds. dfs.namenode.handler.count 10 The number of server threads for the namenode. dfs.safemode.threshold.pct 0.999f Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.replication.min. Values less than or equal to 0 mean not to wait for any particular percentage of blocks before exiting safemode. Values greater than 1 will make safe mode permanent. dfs.namenode.safemode.min.datanodes 0 Specifies the number of datanodes that must be considered alive before the name node exits safemode. Values less than or equal to 0 mean not to take the number of live datanodes into account when deciding whether to remain in safe mode during startup. Values greater than the number of datanodes in the cluster will make safe mode permanent. dfs.safemode.extension 30000 Determines extension of safe mode in milliseconds after the threshold level is reached. dfs.balance.bandwidthPerSec 1048576 Specifies the maximum amount of bandwidth that each datanode can utilize for the balancing purpose in term of the number of bytes per second. dfs.hosts Names a file that contains a list of hosts that are permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted. dfs.hosts.exclude Names a file that contains a list of hosts that are not permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, no hosts are excluded. dfs.max.objects 0 The maximum number of files, directories and blocks dfs supports. A value of zero indicates no limit to the number of objects that dfs supports. dfs.namenode.decommission.interval 30 Namenode periodicity in seconds to check if decommission is complete. dfs.namenode.decommission.nodes.per.interval 5 The number of nodes namenode checks if decommission is complete in each dfs.namenode.decommission.interval. dfs.replication.interval 3 The periodicity in seconds with which the namenode computes repliaction work for datanodes. dfs.access.time.precision 3600000 The access time for HDFS file is precise upto this value. The default value is 1 hour. Setting a value of 0 disables access times for HDFS. dfs.support.append This option is no longer supported. HBase no longer requires that this option be enabled as sync is now enabled by default. See HADOOP-8230 for additional information. dfs.namenode.delegation.key.update-interval 86400000 The update interval for master key for delegation tokens in the namenode in milliseconds. 
dfs.namenode.delegation.token.max-lifetime 604800000 The maximum lifetime in milliseconds for which a delegation token is valid. dfs.namenode.delegation.token.renew-interval 86400000 The renewal interval for delegation token in milliseconds. dfs.datanode.failed.volumes.tolerated 0 The number of volumes that are allowed to fail before a datanode stops offering service. By default any volume failure will cause a datanode to shutdown. dfs.datanode.max.xcievers 4096 Specifies the maximum number of threads to use for transferring data in and out of the DN. dfs.datanode.readahead.bytes 4193404 While reading block files, if the Hadoop native libraries are available, the datanode can use the posix_fadvise system call to explicitly page data into the operating system buffer cache ahead of the current reader's position. This can improve performance especially when disks are highly contended. This configuration specifies the number of bytes ahead of the current read position which the datanode will attempt to read ahead. This feature may be disabled by configuring this property to 0. If the native libraries are not available, this configuration has no effect. dfs.datanode.drop.cache.behind.reads false In some workloads, the data read from HDFS is known to be significantly large enough that it is unlikely to be useful to cache it in the operating system buffer cache. In this case, the DataNode may be configured to automatically purge all data from the buffer cache after it is delivered to the client. This behavior is automatically disabled for workloads which read only short sections of a block (e.g HBase random-IO workloads). This may improve performance for some workloads by freeing buffer cache spage usage for more cacheable data. If the Hadoop native libraries are not available, this configuration has no effect. dfs.datanode.drop.cache.behind.writes false In some workloads, the data written to HDFS is known to be significantly large enough that it is unlikely to be useful to cache it in the operating system buffer cache. In this case, the DataNode may be configured to automatically purge all data from the buffer cache after it is written to disk. This may improve performance for some workloads by freeing buffer cache spage usage for more cacheable data. If the Hadoop native libraries are not available, this configuration has no effect. dfs.datanode.sync.behind.writes false If this configuration is enabled, the datanode will instruct the operating system to enqueue all written data to the disk immediately after it is written. This differs from the usual OS policy which may wait for up to 30 seconds before triggering writeback. This may improve performance for some workloads by smoothing the IO profile for data written to disk. If the Hadoop native libraries are not available, this configuration has no effect. dfs.client.use.datanode.hostname false Whether clients should use datanode hostnames when connecting to datanodes. dfs.datanode.use.datanode.hostname false Whether datanodes should use datanode hostnames when connecting to other datanodes for data transfer. dfs.client.local.interfaces A comma separated list of network interface names to use for data transfer between the client and datanodes. When creating a connection to read from or write to a datanode, the client chooses one of the specified interfaces at random and binds its socket to the IP of that interface. 
Individual names may be specified as either an interface name (eg "eth0"), a subinterface name (eg "eth0:0"), or an IP address (which may be specified using CIDR notation to match a range of IPs). dfs.image.transfer.bandwidthPerSec 0 Specifies the maximum amount of bandwidth that can be utilized for image transfer in term of the number of bytes per second. A default value of 0 indicates that throttling is disabled. dfs.webhdfs.enabled false Enable WebHDFS (REST API) in Namenodes and Datanodes. dfs.namenode.kerberos.internal.spnego.principal ${dfs.web.authentication.kerberos.principal} dfs.secondary.namenode.kerberos.internal.spnego.principal ${dfs.web.authentication.kerberos.principal} dfs.namenode.invalidate.work.pct.per.iteration 0.32f *Note*: Advanced property. Change with caution. This determines the percentage amount of block invalidations (deletes) to do over a single DN heartbeat deletion command. The final deletion count is determined by applying this percentage to the number of live nodes in the system. The resultant number is the number of blocks from the deletion list chosen for proper invalidation over a single heartbeat of a single DN. Value should be a positive, non-zero percentage in float notation (X.Yf), with 1.0f meaning 100%. dfs.namenode.replication.work.multiplier.per.iteration 2 *Note*: Advanced property. Change with caution. This determines the total amount of block transfers to begin in parallel at a DN, for replication, when such a command list is being sent over a DN heartbeat by the NN. The actual number is obtained by multiplying this multiplier with the total number of live nodes in the cluster. The result number is the number of blocks to begin transfers immediately for, per DN heartbeat. This number can be any positive, non-zero integer. dfs.namenode.avoid.read.stale.datanode false Indicate whether or not to avoid reading from "stale" datanodes whose heartbeat messages have not been received by the namenode for more than a specified time interval. Stale datanodes will be moved to the end of the node list returned for reading. See dfs.namenode.avoid.write.stale.datanode for a similar setting for writes. dfs.namenode.avoid.write.stale.datanode false Indicate whether or not to avoid writing to "stale" datanodes whose heartbeat messages have not been received by the namenode for more than a specified time interval. Writes will avoid using stale datanodes unless more than a configured ratio (dfs.namenode.write.stale.datanode.ratio) of datanodes are marked as stale. See dfs.namenode.avoid.read.stale.datanode for a similar setting for reads. dfs.namenode.stale.datanode.interval 30000 Default time interval for marking a datanode as "stale", i.e., if the namenode has not received heartbeat msg from a datanode for more than this time interval, the datanode will be marked and treated as "stale" by default. The stale interval cannot be too small since otherwise this may cause too frequent change of stale states. We thus set a minimum stale interval value (the default value is 3 times of heartbeat interval) and guarantee that the stale interval cannot be less than the minimum value. dfs.namenode.write.stale.datanode.ratio 0.5f When the ratio of number stale datanodes to total datanodes marked is greater than this ratio, stop avoiding writing to stale nodes so as to prevent causing hotspots. dfs.datanode.plugins Comma-separated list of datanode plug-ins to be activated. dfs.namenode.plugins Comma-separated list of namenode plug-ins to be activated. 
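<!-- Illustrative note, not part of the stock hdfs-default.xml: Sahara exposes these
     defaults through its REST API so that a cluster or node-group template can
     override them; for example, a hypothetical cluster_configs mapping such as
     {"HDFS": {"dfs.replication": 2}} would be rendered into the hdfs-site.xml pushed
     to the provisioned instances. -->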
././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1696418419.637837 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/0000775000175000017500000000000000000000000027561 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003300000000000011451 xustar000000000000000027 mtime=1696418419.637837 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/centos/0000775000175000017500000000000000000000000031054 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000020600000000000011453 xustar0000000000000000112 path=sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/centos/turn_off_services 22 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/centos/turn_off_servic0000664000175000017500000000021500000000000034172 0ustar00zuulzuul00000000000000#!/bin/bash if [ $test_only -eq 0 ]; then systemctl stop hadoop-hdfs-datanode systemctl stop hadoop-hdfs-namenode else exit 0 fi ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/centos/wget_cdh_repo0000664000175000017500000000517500000000000033620 0ustar00zuulzuul00000000000000#!/bin/bash CDH_VERSION=5.11 CDH_MINOR_VERSION=5.11.0 if [ ! -f /etc/yum.repos.d/cloudera-cdh5.repo ]; then if [ $test_only -eq 0 ]; then echo '[cloudera-cdh5]' > /etc/yum.repos.d/cloudera-cdh5.repo echo "name=Cloudera's Distribution for Hadoop, Version 5" >> /etc/yum.repos.d/cloudera-cdh5.repo echo "baseurl=http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/$CDH_MINOR_VERSION/" >> /etc/yum.repos.d/cloudera-cdh5.repo echo "gpgkey = http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/RPM-GPG-KEY-cloudera" >> /etc/yum.repos.d/cloudera-cdh5.repo echo 'gpgcheck = 1' >> /etc/yum.repos.d/cloudera-cdh5.repo echo '[cloudera-manager]' > /etc/yum.repos.d/cloudera-manager.repo echo 'name=Cloudera Manager' >> /etc/yum.repos.d/cloudera-manager.repo echo "baseurl=http://archive.cloudera.com/cm5/redhat/7/x86_64/cm/$CDH_MINOR_VERSION/" >> /etc/yum.repos.d/cloudera-manager.repo echo "gpgkey = http://archive.cloudera.com/cm5/redhat/7/x86_64/cm/RPM-GPG-KEY-cloudera" >> /etc/yum.repos.d/cloudera-manager.repo echo 'gpgcheck = 1' >> /etc/yum.repos.d/cloudera-manager.repo echo '[navigator-keytrustee]' > /etc/yum.repos.d/kms.repo echo "name=Cloudera's Distribution for navigator-Keytrustee, Version 5" >> /etc/yum.repos.d/kms.repo RETURN_CODE="$(curl -s -o /dev/null -w "%{http_code}" http://archive.cloudera.com/navigator-keytrustee5/redhat/7/x86_64/navigator-keytrustee/$CDH_MINOR_VERSION/)" if [ "$RETURN_CODE" == "404" ]; then echo "baseurl=http://archive.cloudera.com/navigator-keytrustee5/redhat/7/x86_64/navigator-keytrustee/$CDH_VERSION/" >> /etc/yum.repos.d/kms.repo else echo "baseurl=http://archive.cloudera.com/navigator-keytrustee5/redhat/7/x86_64/navigator-keytrustee/$CDH_MINOR_VERSION/" >> /etc/yum.repos.d/kms.repo fi echo "gpgkey = http://archive.cloudera.com/navigator-keytrustee5/redhat/7/x86_64/navigator-keytrustee/RPM-GPG-KEY-cloudera" >> /etc/yum.repos.d/kms.repo echo 'gpgcheck = 1' >> /etc/yum.repos.d/kms.repo echo "[cloudera-kafka]" > /etc/yum.repos.d/cloudera-kafka.repo echo "name=Cloudera's Distribution for kafka, Version 2.2.0" >> /etc/yum.repos.d/cloudera-kafka.repo echo "baseurl=http://archive.cloudera.com/kafka/redhat/7/x86_64/kafka/2.2.0/" >> 
/etc/yum.repos.d/cloudera-kafka.repo echo "gpgkey = http://archive.cloudera.com/kafka/redhat/7/x86_64/kafka/RPM-GPG-KEY-cloudera" >> /etc/yum.repos.d/cloudera-kafka.repo echo "gpgcheck = 1" >> /etc/yum.repos.d/cloudera-kafka.repo yum clean all else exit 0 fi fi ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6418371 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/common/0000775000175000017500000000000000000000000031051 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/common/add_jar0000664000175000017500000000070200000000000032357 0ustar00zuulzuul00000000000000#!/bin/bash hadoop="2.6.0" HDFS_LIB_DIR=${hdfs_lib_dir:-"/usr/share/hadoop/lib"} HADOOP_SWIFT_JAR_NAME="hadoop-openstack.jar" if [ $test_only -eq 0 ]; then mkdir -p $HDFS_LIB_DIR curl -sS -o $HDFS_LIB_DIR/$HADOOP_SWIFT_JAR_NAME $swift_url if [ $? -ne 0 ]; then echo -e "Could not download Swift Hadoop FS implementation.\nAborting" exit 1 fi chmod 0644 $HDFS_LIB_DIR/$HADOOP_SWIFT_JAR_NAME else exit 0 fi ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/common/install_extjs0000664000175000017500000000144000000000000033656 0ustar00zuulzuul00000000000000#!/bin/bash EXTJS_DESTINATION_DIR="/var/lib/oozie" EXTJS_DOWNLOAD_URL="https://tarballs.openstack.org/sahara-extra/dist/common-artifacts/ext-2.2.zip" extjs_basepath=$(basename ${EXTJS_DOWNLOAD_URL}) extjs_archive=/tmp/${extjs_basepath} extjs_folder="${extjs_basepath%.*}" setup_extjs() { curl -sS -o $extjs_archive $EXTJS_DOWNLOAD_URL mkdir -p $EXTJS_DESTINATION_DIR } if [ -z "${EXTJS_NO_UNPACK:-}" ]; then if [ ! -d "${EXTJS_DESTINATION_DIR}/${extjs_folder}" ]; then setup_extjs unzip -o -d "$EXTJS_DESTINATION_DIR" $extjs_archive rm -f $extjs_archive else exit 0 fi else if [ ! -f "${EXTJS_DESTINATION_DIR}/${extjs_basepath}" ]; then setup_extjs mv $extjs_archive $EXTJS_DESTINATION_DIR else exit 0 fi fi ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/common/install_spark0000664000175000017500000000251300000000000033643 0ustar00zuulzuul00000000000000#!/bin/bash tmp_dir=/tmp/spark CDH_VERSION=5.11 mkdir -p $tmp_dir if [ ! -d /opt/spark ]; then if [ $test_only -eq 0 ]; then # The user is not providing his own Spark distribution package if [ -z "${SPARK_DOWNLOAD_URL:-}" ]; then # Check hadoop version # INFO on hadoop versions: http://spark.apache.org/docs/latest/hadoop-third-party-distributions.html # Now the below is just a sanity check if [ -z "${SPARK_HADOOP_DL:-}" ]; then SPARK_HADOOP_DL=hadoop2.7 fi SPARK_DOWNLOAD_URL="http://archive.apache.org/dist/spark/spark-$plugin_version/spark-$plugin_version-bin-$SPARK_HADOOP_DL.tgz" fi echo "Downloading SPARK" spark_file=$(basename "$SPARK_DOWNLOAD_URL") wget -O $tmp_dir/$spark_file $SPARK_DOWNLOAD_URL echo "$SPARK_DOWNLOAD_URL" > $tmp_dir/spark_url.txt echo "Extracting SPARK" extract_folder=$(tar tzf $tmp_dir/$spark_file | sed -e 's@/.*@@' | uniq) echo "Decompressing Spark..." 
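# Note: the archive is extracted into the current working directory; the
# top-level folder name captured in $extract_folder above is then moved to
# /opt/spark below.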
tar xzf $tmp_dir/$spark_file rm $tmp_dir/$spark_file echo "Moving SPARK to /opt/" # Placing spark in /opt/spark mv $extract_folder /opt/spark/ mv $tmp_dir/spark_url.txt /opt/spark/ rm -Rf $tmp_dir else exit 1 fi fi ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/common/manipulate_s30000664000175000017500000000112500000000000033537 0ustar00zuulzuul00000000000000#!/bin/bash SPARK_JARS_DIR_PATH="/opt/spark/jars" HADOOP_TOOLS_DIR_PATH="/opt/hadoop/share/hadoop/tools/lib" HADOOP_COMMON_DIR_PATH="/opt/hadoop/share/hadoop/common/lib" # The hadoop-aws and aws-java-sdk libraries are missing here, but we # cannot copy them from the Hadoop folder on-disk due to # version/patching issues curl -sS https://tarballs.openstack.org/sahara-extra/dist/common-artifacts/hadoop-aws-2.7.3.jar -o $SPARK_JARS_DIR_PATH/hadoop-aws.jar curl -sS https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar -o $SPARK_JARS_DIR_PATH/aws-java-sdk.jar ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/image.yaml0000664000175000017500000000317500000000000031535 0ustar00zuulzuul00000000000000arguments: plugin_version: description: The version of Spark to install. Defaults to 2.3.2 default: 2.3.2 choices: - 2.3.2 - 2.3.1 - 2.3.0 - 2.2.1 - 2.2.0 java_distro: default: openjdk description: The distribution of Java to install. Defaults to openjdk. choices: - openjdk - oracle-java hdfs_lib_dir: default: /usr/lib/hadoop-mapreduce description: The path to HDFS lib. Defaults to /usr/lib/hadoop-mapreduce. required: False swift_url: default: https://tarballs.openstack.org/sahara-extra/dist/hadoop-openstack/master/hadoop-openstack-2.6.0.jar description: Location of the swift jar file. 
required: False validators: - os_case: - redhat: - package: wget - script: centos/wget_cdh_repo - ubuntu: - script: ubuntu/wget_cdh_repo - argument_case: argument_name: java_distro cases: openjdk: - os_case: - redhat: - package: java-1.8.0-openjdk-devel - ubuntu: - package: openjdk-8-jdk - script: common/install_spark: env_vars: [plugin_version, cdh_version] - os_case: - ubuntu: - script: ubuntu/config_spark - package: ntp - package: - hadoop-hdfs-namenode - hadoop-hdfs-datanode - script: common/install_extjs - os_case: - redhat: - script: centos/turn_off_services - ubuntu: - script: ubuntu/turn_off_services - script: common/manipulate_s3 - script: common/add_jar: env_vars: [hdfs_lib_dir, swift_url] ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6418371 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/ubuntu/0000775000175000017500000000000000000000000031103 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/images/ubuntu/config_spark0000664000175000017500000000055700000000000033502 0ustar00zuulzuul00000000000000#!/bin/bash firstboot_script_name="/opt/spark/firstboot.sh" sed -i -e "s,^exit 0$,[ -f $firstboot_script_name ] \&\& sh $firstboot_script_name; exit 0," /etc/rc.local user_and_group_names="ubuntu:ubuntu" cat >> $firstboot_script_name <> /etc/apt/sources.list # Cloudera repositories echo "deb [arch=amd64] http://archive.cloudera.com/cdh5/ubuntu/xenial/amd64/cdh xenial-cdh$CDH_VERSION contrib" > /etc/apt/sources.list.d/cdh5.list echo "deb-src http://archive.cloudera.com/cdh5/ubuntu/xenial/amd64/cdh xenial-cdh$CDH_VERSION contrib" >> /etc/apt/sources.list.d/cdh5.list wget -qO - http://archive.cloudera.com/cdh5/ubuntu/xenial/amd64/cdh/archive.key | apt-key add - echo "deb [arch=amd64] http://archive.cloudera.com/cm5/ubuntu/xenial/amd64/cm xenial-cm$CDH_VERSION contrib" > /etc/apt/sources.list.d/cm5.list echo "deb-src http://archive.cloudera.com/cm5/ubuntu/xenial/amd64/cm xenial-cm$CDH_VERSION contrib" >> /etc/apt/sources.list.d/cm5.list wget -qO - http://archive.cloudera.com/cm5/ubuntu/xenial/amd64/cm/archive.key | apt-key add - wget -O /etc/apt/sources.list.d/kms.list http://archive.cloudera.com/navigator-keytrustee5/ubuntu/xenial/amd64/navigator-keytrustee/cloudera.list wget -qO - http://archive.cloudera.com/navigator-keytrustee5/ubuntu/xenial/amd64/navigator-keytrustee/archive.key | apt-key add - # add Kafka repository echo 'deb http://archive.cloudera.com/kafka/ubuntu/xenial/amd64/kafka/ xenial-kafka2.2.0 contrib' >> /etc/apt/sources.list wget -qO - https://archive.cloudera.com/kafka/ubuntu/xenial/amd64/kafka/archive.key | apt-key add - #change repository priority echo 'Package: zookeeper\nPin: origin "archive.cloudera.com"\nPin-Priority: 1001' > /etc/apt/preferences.d/cloudera-pin apt-get update else exit 0 fi fi ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/spark-cleanup.cron0000664000175000017500000000013600000000000031744 0ustar00zuulzuul00000000000000# Cleans up old Spark job directories once per hour. 
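# /etc/cron.d entry format: minute hour day-of-month month day-of-week user command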
0 * * * * root /etc/hadoop/tmp-cleanup.sh././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/spark-env.sh.template0000664000175000017500000000216500000000000032374 0ustar00zuulzuul00000000000000#!/usr/bin/env bash # This file contains environment variables required to run Spark. Copy it as # spark-env.sh and edit that to configure Spark for your site. # # The following variables can be set in this file: # - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node # - MESOS_NATIVE_LIBRARY, to point to your libmesos.so if you use Mesos # - SPARK_JAVA_OPTS, to set node-specific JVM options for Spark. Note that # we recommend setting app-wide options in the application's driver program. # Examples of node-specific options : -Dspark.local.dir, GC options # Examples of app-wide options : -Dspark.serializer # # If using the standalone deploy mode, you can also set variables for it here: # - SPARK_MASTER_IP, to bind the master to a different IP address or hostname # - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports # - SPARK_WORKER_CORES, to set the number of cores to use on this machine # - SPARK_WORKER_MEMORY, to set how much memory to use (e.g. 1000m, 2g) # - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT # - SPARK_WORKER_INSTANCES, to set the number of worker processes per node ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/tmp-cleanup.sh.template0000664000175000017500000000230500000000000032707 0ustar00zuulzuul00000000000000#!/bin/sh MINIMUM_CLEANUP_MEGABYTES={minimum_cleanup_megabytes} MINIMUM_CLEANUP_SECONDS={minimum_cleanup_seconds} MAXIMUM_CLEANUP_SECONDS={maximum_cleanup_seconds} CURRENT_TIMESTAMP=`date +%s` POSSIBLE_CLEANUP_THRESHOLD=$(($CURRENT_TIMESTAMP - $MINIMUM_CLEANUP_SECONDS)) DEFINITE_CLEANUP_THRESHOLD=$(($CURRENT_TIMESTAMP - $MAXIMUM_CLEANUP_SECONDS)) unset MAY_DELETE unset WILL_DELETE if [ ! -d /tmp/spark-edp ] then exit 0 fi cd /tmp/spark-edp for JOB in $(find . 
-maxdepth 1 -mindepth 1 -type d -printf '%f\n') do for EXECUTION in $(find $JOB -maxdepth 1 -mindepth 1 -type d -printf '%f\n') do TIMESTAMP=`stat $JOB/$EXECUTION --printf '%Y'` if [[ $TIMESTAMP -lt $DEFINITE_CLEANUP_THRESHOLD ]] then WILL_DELETE="$WILL_DELETE $JOB/$EXECUTION" else if [[ $TIMESTAMP -lt $POSSIBLE_CLEANUP_THRESHOLD ]] then MAY_DELETE="$MAY_DELETE $JOB/$EXECUTION" fi fi done done for EXECUTION in $WILL_DELETE do rm -Rf $EXECUTION done for EXECUTION in $(ls $MAY_DELETE -trd) do if [[ `du -s -BM | grep -o '[0-9]\+'` -le $MINIMUM_CLEANUP_MEGABYTES ]]; then break fi rm -Rf $EXECUTION done ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/resources/topology.sh0000664000175000017500000000061200000000000030523 0ustar00zuulzuul00000000000000#!/bin/bash HADOOP_CONF=/etc/hadoop while [ $# -gt 0 ] ; do nodeArg=$1 exec< ${HADOOP_CONF}/topology.data result="" while read line ; do ar=( $line ) if [ "${ar[0]}" = "$nodeArg" ] ; then result="${ar[1]}" fi done shift if [ -z "$result" ] ; then echo -n "/default/rack " else echo -n "$result " fi done ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/run_scripts.py0000664000175000017500000000602300000000000027230 0ustar00zuulzuul00000000000000# Copyright (c) 2014 Hoang Do, Phuc Vo, P. Michiardi, D. Venzano # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. 
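# Run-time helpers for the Spark plugin: they execute operational commands on
# cluster instances through the remote objects returned by
# sahara.plugins.utils -- starting/stopping the HDFS daemons and the Spark
# start-all.sh/stop-all.sh scripts, formatting the NameNode, freeing port
# 8020, and polling "hdfs dfsadmin -report" until the expected number of
# DataNodes is live.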
import os from oslo_log import log as logging from sahara.plugins import utils from sahara_plugin_spark.i18n import _ from sahara_plugin_spark.plugins.spark import config_helper as c_helper LOG = logging.getLogger(__name__) def start_processes(remote, *processes): for proc in processes: if proc == "namenode": remote.execute_command("sudo service hadoop-hdfs-namenode start") elif proc == "datanode": remote.execute_command("sudo service hadoop-hdfs-datanode start") else: remote.execute_command("screen -d -m sudo hadoop %s" % proc) def refresh_nodes(remote, service): remote.execute_command("sudo -u hdfs hadoop %s -refreshNodes" % service) def format_namenode(nn_remote): nn_remote.execute_command("sudo -u hdfs hadoop namenode -format") def clean_port_hadoop(nn_remote): nn_remote.execute_command(("sudo netstat -tlnp" "| awk '/:8020 */" "{split($NF,a,\"/\"); print a[1]}'" "| xargs sudo kill -9")) def start_spark_master(nn_remote, sp_home): nn_remote.execute_command("bash " + os.path.join(sp_home, "sbin/start-all.sh")) def stop_spark(nn_remote, sp_home): nn_remote.execute_command("bash " + os.path.join(sp_home, "sbin/stop-all.sh")) @utils.event_wrapper( True, step=_("Await DataNodes start up"), param=("cluster", 0)) def await_datanodes(cluster): datanodes_count = len(utils.get_instances(cluster, "datanode")) if datanodes_count < 1: return log_msg = _("Waiting on %d DataNodes to start up") % datanodes_count with utils.get_instance(cluster, "namenode").remote() as r: utils.plugin_option_poll( cluster, _check_datanodes_count, c_helper.DATANODES_STARTUP_TIMEOUT, log_msg, 1, {"remote": r, "count": datanodes_count}) def _check_datanodes_count(remote, count): if count < 1: return True LOG.debug("Checking DataNodes count") ex_code, stdout = remote.execute_command( 'sudo su -lc "hdfs dfsadmin -report" hdfs | ' r'grep \'Live datanodes\|Datanodes available:\' | ' r'grep -o \'[0-9]\+\' | head -n 1') LOG.debug("DataNodes count='{count}'".format(count=stdout.strip())) return stdout and int(stdout) == count ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/scaling.py0000664000175000017500000000723700000000000026305 0ustar00zuulzuul00000000000000# Copyright (c) 2014 Hoang Do, Phuc Vo, P. Michiardi, D. Venzano # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. 
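# Scale-down helpers for the Spark plugin: decommission_sl() stops Spark,
# rewrites conf/slaves on the master and every surviving worker, and starts
# Spark again, while decommission_dn() excludes the instances being removed
# via /etc/hadoop/dn.excl, refreshes HDFS and polls "dfsadmin -report"
# (parsed by parse_dfs_report below) until each of them reports as
# Decommissioned.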
import os import six from sahara.plugins import context from sahara.plugins import utils from sahara_plugin_spark.i18n import _ from sahara_plugin_spark.plugins.spark import config_helper as c_helper from sahara_plugin_spark.plugins.spark import run_scripts as run @utils.event_wrapper(True, step=_("Decommission %s") % "Slaves") def decommission_sl(master, inst_to_be_deleted, survived_inst): if survived_inst is not None: slavenames = [] for slave in survived_inst: slavenames.append(slave.hostname()) slaves_content = c_helper.generate_spark_slaves_configs(slavenames) else: slaves_content = "\n" cluster = master.cluster sp_home = utils.get_config_value_or_default("Spark", "Spark home", cluster) r_master = utils.get_remote(master) run.stop_spark(r_master, sp_home) # write new slave file to master files = {os.path.join(sp_home, 'conf/slaves'): slaves_content} r_master.write_files_to(files) # write new slaves file to each survived slave as well for i in survived_inst: with utils.get_remote(i) as r: r.write_files_to(files) run.start_spark_master(r_master, sp_home) def _is_decommissioned(r, inst_to_be_deleted): cmd = r.execute_command("sudo -u hdfs hadoop dfsadmin -report") datanodes_info = parse_dfs_report(cmd[1]) for i in inst_to_be_deleted: for dn in datanodes_info: if (dn["Name"].startswith(i.internal_ip)) and ( dn["Decommission Status"] != "Decommissioned"): return False return True @utils.event_wrapper(True, step=_("Decommission %s") % "DataNodes") def decommission_dn(nn, inst_to_be_deleted, survived_inst): with utils.get_remote(nn) as r: r.write_file_to('/etc/hadoop/dn.excl', utils.generate_fqdn_host_names( inst_to_be_deleted)) run.refresh_nodes(utils.get_remote(nn), "dfsadmin") context.sleep(3) utils.plugin_option_poll( nn.cluster, _is_decommissioned, c_helper.DECOMMISSIONING_TIMEOUT, _("Decommission %s") % "DataNodes", 3, { 'r': r, 'inst_to_be_deleted': inst_to_be_deleted}) r.write_files_to({ '/etc/hadoop/dn.incl': utils. generate_fqdn_host_names(survived_inst), '/etc/hadoop/dn.excl': ""}) def parse_dfs_report(cmd_output): report = cmd_output.rstrip().split(os.linesep) array = [] started = False for line in report: if started: array.append(line) if line.startswith("Datanodes available"): started = True res = [] datanode_info = {} for i in six.moves.xrange(0, len(array)): if array[i]: idx = str.find(array[i], ':') name = array[i][0:idx] value = array[i][idx + 2:] datanode_info[name.strip()] = value.strip() if not array[i] and datanode_info: res.append(datanode_info) datanode_info = {} if datanode_info: res.append(datanode_info) return res ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/plugins/spark/shell_engine.py0000664000175000017500000000201200000000000027303 0ustar00zuulzuul00000000000000# Copyright (c) 2015 OpenStack Foundation # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. 
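# Minimal EDP shell-job engine: it reuses PluginsSparkShellJobEngine from
# sahara.plugins.edp and simply binds job execution to the cluster's
# "master" instance.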
from sahara.plugins import edp from sahara.plugins import utils as plugin_utils class ShellEngine(edp.PluginsSparkShellJobEngine): def __init__(self, cluster): super(ShellEngine, self).__init__(cluster) self.master = plugin_utils.get_instance(cluster, "master") @staticmethod def job_type_supported(job_type): return (job_type in edp.PluginsSparkShellJobEngine. get_supported_job_types()) ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6418371 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/0000775000175000017500000000000000000000000022643 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/__init__.py0000664000175000017500000000121100000000000024747 0ustar00zuulzuul00000000000000# Copyright (c) 2014 Mirantis Inc. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. from sahara_plugin_spark.utils import patches patches.patch_all() ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6418371 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/unit/0000775000175000017500000000000000000000000023622 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/unit/__init__.py0000664000175000017500000000000000000000000025721 0ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/unit/base.py0000664000175000017500000000354400000000000025114 0ustar00zuulzuul00000000000000# Copyright (c) 2013 Mirantis Inc. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. 
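# Shared unit-test base classes: SaharaTestCase installs a fake plugin
# context plus an all-in-one RPC setup and provides override_config(),
# while SaharaWithDbTestCase additionally points the database at an
# in-memory sqlite:// connection and creates/drops the schema per test.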
from oslotest import base from sahara.plugins import context from sahara.plugins import db as db_api from sahara.plugins import main from sahara.plugins import utils class SaharaTestCase(base.BaseTestCase): def setUp(self): super(SaharaTestCase, self).setUp() self.setup_context() utils.rpc_setup('all-in-one') def setup_context(self, username="test_user", tenant_id="tenant_1", auth_token="test_auth_token", tenant_name='test_tenant', service_catalog=None, **kwargs): self.addCleanup(context.set_ctx, context.ctx() if context.has_ctx() else None) context.set_ctx(context.PluginsContext( username=username, tenant_id=tenant_id, auth_token=auth_token, service_catalog=service_catalog or {}, tenant_name=tenant_name, **kwargs)) def override_config(self, name, override, group=None): main.set_override(name, override, group) self.addCleanup(main.clear_override, name, group) class SaharaWithDbTestCase(SaharaTestCase): def setUp(self): super(SaharaWithDbTestCase, self).setUp() self.override_config('connection', "sqlite://", group='database') db_api.setup_db() self.addCleanup(db_api.drop_db) ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6418371 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/unit/plugins/0000775000175000017500000000000000000000000025303 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/unit/plugins/__init__.py0000664000175000017500000000000000000000000027402 0ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6418371 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/unit/plugins/spark/0000775000175000017500000000000000000000000026423 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/unit/plugins/spark/__init__.py0000664000175000017500000000000000000000000030522 0ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/unit/plugins/spark/test_config_helper.py0000664000175000017500000001062300000000000032642 0ustar00zuulzuul00000000000000# Copyright (c) 2014 Mirantis Inc. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. 
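# Unit tests for config_helper: they check make_hadoop_path(), the
# script/cron output and validity flag of generate_job_cleanup_config(),
# and the core-site XML built by generate_xml_configs() with default,
# user-overridden and disabled Swift settings.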
import copy import xml.dom.minidom as xml from unittest import mock from sahara.plugins import swift_helper as swift from sahara.plugins import utils from sahara_plugin_spark.plugins.spark import config_helper as c_helper from sahara_plugin_spark.tests.unit import base as test_base class ConfigHelperUtilsTest(test_base.SaharaTestCase): def test_make_hadoop_path(self): storage_paths = ['/mnt/one', '/mnt/two'] paths = c_helper.make_hadoop_path(storage_paths, '/spam') expected = ['/mnt/one/spam', '/mnt/two/spam'] self.assertEqual(expected, paths) @mock.patch('sahara.plugins.utils.get_config_value_or_default') def test_cleanup_configs(self, get_config_value): getter = lambda plugin, key, cluster: plugin_configs[key] # noqa: E731 get_config_value.side_effect = getter plugin_configs = {"Minimum cleanup megabytes": 4096, "Minimum cleanup seconds": 86400, "Maximum cleanup seconds": 1209600} configs = c_helper.generate_job_cleanup_config(None) self.assertTrue(configs['valid']) expected = ["MINIMUM_CLEANUP_MEGABYTES=4096", "MINIMUM_CLEANUP_SECONDS=86400", "MAXIMUM_CLEANUP_SECONDS=1209600"] for config_value in expected: self.assertIn(config_value, configs['script']) self.assertIn("0 * * * * root /etc/hadoop/tmp-cleanup.sh", configs['cron'][0]) plugin_configs['Maximum cleanup seconds'] = 0 configs = c_helper.generate_job_cleanup_config(None) self.assertFalse(configs['valid']) self.assertNotIn(configs, 'script') self.assertNotIn(configs, 'cron') plugin_configs = {"Minimum cleanup megabytes": 0, "Minimum cleanup seconds": 0, "Maximum cleanup seconds": 1209600} configs = c_helper.generate_job_cleanup_config(None) self.assertFalse(configs['valid']) self.assertNotIn(configs, 'script') self.assertNotIn(configs, 'cron') @mock.patch("sahara.plugins.swift_utils.retrieve_auth_url") def test_generate_xml_configs(self, auth_url): auth_url.return_value = "http://localhost:5000/v2/" # Make a dict of swift configs to verify generated values swift_vals = c_helper.extract_name_values(swift.get_swift_configs()) # Make sure that all the swift configs are in core-site c = c_helper.generate_xml_configs({}, ['/mnt/one'], 'localhost', None) doc = xml.parseString(c['core-site']) configuration = doc.getElementsByTagName('configuration') properties = utils.get_property_dict(configuration[0]) self.assertDictContainsSubset(swift_vals, properties) # Make sure that user values have precedence over defaults c = c_helper.generate_xml_configs( {'HDFS': {'fs.swift.service.sahara.tenant': 'fred'}}, ['/mnt/one'], 'localhost', None) doc = xml.parseString(c['core-site']) configuration = doc.getElementsByTagName('configuration') properties = utils.get_property_dict(configuration[0]) mod_swift_vals = copy.copy(swift_vals) mod_swift_vals['fs.swift.service.sahara.tenant'] = 'fred' self.assertDictContainsSubset(mod_swift_vals, properties) # Make sure that swift configs are left out if not enabled c = c_helper.generate_xml_configs( {'HDFS': {'fs.swift.service.sahara.tenant': 'fred'}, 'general': {'Enable Swift': False}}, ['/mnt/one'], 'localhost', None) doc = xml.parseString(c['core-site']) configuration = doc.getElementsByTagName('configuration') properties = utils.get_property_dict(configuration[0]) for key in mod_swift_vals.keys(): self.assertNotIn(key, properties) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/tests/unit/plugins/spark/test_plugin.py0000664000175000017500000002006200000000000031332 0ustar00zuulzuul00000000000000# 
Copyright (c) 2013 Mirantis Inc. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. from unittest import mock import testtools from sahara.plugins import base as pb from sahara.plugins import conductor from sahara.plugins import context from sahara.plugins import edp from sahara.plugins import exceptions as pe from sahara.plugins import testutils as tu from sahara_plugin_spark.plugins.spark import plugin as pl from sahara_plugin_spark.tests.unit import base class SparkPluginTest(base.SaharaWithDbTestCase): def setUp(self): super(SparkPluginTest, self).setUp() self.override_config("plugins", ["spark"]) pb.setup_plugins() def _init_cluster_dict(self, version): cluster_dict = { 'name': 'cluster', 'plugin_name': 'spark', 'hadoop_version': version, 'default_image_id': 'image'} return cluster_dict def test_plugin11_edp_engine(self): self._test_engine('1.6.0', edp.JOB_TYPE_SPARK, edp.PluginsSparkJobEngine) def test_plugin12_shell_engine(self): self._test_engine('1.6.0', edp.JOB_TYPE_SHELL, edp.PluginsSparkShellJobEngine) def test_plugin21_edp_engine(self): self._test_engine('2.1.0', edp.JOB_TYPE_SPARK, edp.PluginsSparkJobEngine) def test_plugin21_shell_engine(self): self._test_engine('2.1.0', edp.JOB_TYPE_SHELL, edp.PluginsSparkShellJobEngine) def test_plugin22_edp_engine(self): self._test_engine('2.2', edp.JOB_TYPE_SPARK, edp.PluginsSparkJobEngine) def test_plugin22_shell_engine(self): self._test_engine('2.2', edp.JOB_TYPE_SHELL, edp.PluginsSparkShellJobEngine) def test_plugin23_edp_engine(self): self._test_engine('2.3', edp.JOB_TYPE_SPARK, edp.PluginsSparkJobEngine) def test_plugin23_shell_engine(self): self._test_engine('2.3', edp.JOB_TYPE_SHELL, edp.PluginsSparkShellJobEngine) def _test_engine(self, version, job_type, eng): cluster_dict = self._init_cluster_dict(version) cluster = conductor.cluster_create(context.ctx(), cluster_dict) plugin = pb.PLUGINS.get_plugin(cluster.plugin_name) self.assertIsInstance(plugin.get_edp_engine(cluster, job_type), eng) def test_cleanup_configs(self): remote = mock.Mock() instance = mock.Mock() extra_conf = {'job_cleanup': { 'valid': True, 'script': 'script_text', 'cron': 'cron_text'}} instance.node_group.node_processes = ["master"] instance.node_group.id = id cluster_dict = self._init_cluster_dict('2.2') cluster = conductor.cluster_create(context.ctx(), cluster_dict) plugin = pb.PLUGINS.get_plugin(cluster.plugin_name) plugin._push_cleanup_job(remote, cluster, extra_conf, instance) remote.write_file_to.assert_called_with( '/etc/hadoop/tmp-cleanup.sh', 'script_text') remote.execute_command.assert_called_with( 'sudo sh -c \'echo "cron_text" > /etc/cron.d/spark-cleanup\'') remote.reset_mock() instance.node_group.node_processes = ["worker"] plugin._push_cleanup_job(remote, cluster, extra_conf, instance) self.assertFalse(remote.called) remote.reset_mock() instance.node_group.node_processes = ["master"] extra_conf['job_cleanup']['valid'] = False plugin._push_cleanup_job(remote, cluster, extra_conf, instance) remote.execute_command.assert_called_with( 'sudo rm -f 
/etc/crond.d/spark-cleanup') class SparkValidationTest(base.SaharaTestCase): def setUp(self): super(SparkValidationTest, self).setUp() self.override_config("plugins", ["spark"]) pb.setup_plugins() self.plugin = pl.SparkProvider() def test_validate(self): self.ng = [] self.ng.append(tu.make_ng_dict("nn", "f1", ["namenode"], 0)) self.ng.append(tu.make_ng_dict("ma", "f1", ["master"], 0)) self.ng.append(tu.make_ng_dict("sl", "f1", ["slave"], 0)) self.ng.append(tu.make_ng_dict("dn", "f1", ["datanode"], 0)) self._validate_case(1, 1, 3, 3) self._validate_case(1, 1, 3, 4) self._validate_case(1, 1, 4, 3) with testtools.ExpectedException(pe.InvalidComponentCountException): self._validate_case(2, 1, 3, 3) with testtools.ExpectedException(pe.InvalidComponentCountException): self._validate_case(1, 2, 3, 3) with testtools.ExpectedException(pe.InvalidComponentCountException): self._validate_case(0, 1, 3, 3) with testtools.ExpectedException(pe.RequiredServiceMissingException): self._validate_case(1, 0, 3, 3) cl = self._create_cluster( 1, 1, 3, 3, cluster_configs={'HDFS': {'dfs.replication': 4}}) with testtools.ExpectedException(pe.InvalidComponentCountException): self.plugin.validate(cl) def _create_cluster(self, *args, **kwargs): lst = [] for i in range(0, len(args)): self.ng[i]['count'] = args[i] lst.append(self.ng[i]) return tu.create_cluster("cluster1", "tenant1", "spark", "2.2", lst, **kwargs) def _validate_case(self, *args): cl = self._create_cluster(*args) self.plugin.validate(cl) class SparkProviderTest(base.SaharaTestCase): def setUp(self): super(SparkProviderTest, self).setUp() def test_supported_job_types(self): provider = pl.SparkProvider() res = provider.get_edp_job_types() self.assertEqual([edp.JOB_TYPE_SHELL, edp.JOB_TYPE_SPARK], res['1.6.0']) self.assertEqual([edp.JOB_TYPE_SHELL, edp.JOB_TYPE_SPARK], res['2.1.0']) self.assertEqual([edp.JOB_TYPE_SHELL, edp.JOB_TYPE_SPARK], res['2.2']) self.assertEqual([edp.JOB_TYPE_SHELL, edp.JOB_TYPE_SPARK], res['2.3']) def test_edp_config_hints(self): provider = pl.SparkProvider() res = provider.get_edp_config_hints(edp.JOB_TYPE_SHELL, "1.6.0") self.assertEqual({'configs': {}, 'args': [], 'params': {}}, res['job_config']) res = provider.get_edp_config_hints(edp.JOB_TYPE_SPARK, "1.6.0") self.assertEqual({'args': [], 'configs': []}, res['job_config']) res = provider.get_edp_config_hints(edp.JOB_TYPE_SPARK, "2.1.0") self.assertEqual({'args': [], 'configs': []}, res['job_config']) res = provider.get_edp_config_hints(edp.JOB_TYPE_SHELL, "2.1.0") self.assertEqual({'args': [], 'configs': {}, 'params': {}}, res['job_config']) res = provider.get_edp_config_hints(edp.JOB_TYPE_SPARK, "2.2") self.assertEqual({'args': [], 'configs': []}, res['job_config']) res = provider.get_edp_config_hints(edp.JOB_TYPE_SHELL, "2.2") self.assertEqual({'args': [], 'configs': {}, 'params': {}}, res['job_config']) res = provider.get_edp_config_hints(edp.JOB_TYPE_SPARK, "2.3") self.assertEqual({'args': [], 'configs': []}, res['job_config']) res = provider.get_edp_config_hints(edp.JOB_TYPE_SHELL, "2.3") self.assertEqual({'args': [], 'configs': {}, 'params': {}}, res['job_config']) ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6418371 sahara-plugin-spark-10.0.0/sahara_plugin_spark/utils/0000775000175000017500000000000000000000000022641 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 
sahara-plugin-spark-10.0.0/sahara_plugin_spark/utils/__init__.py0000664000175000017500000000000000000000000024740 0ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark/utils/patches.py0000664000175000017500000000667400000000000024653 0ustar00zuulzuul00000000000000# Copyright (c) 2013 Mirantis Inc. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. import eventlet EVENTLET_MONKEY_PATCH_MODULES = dict(os=True, select=True, socket=True, thread=True, time=True) def patch_all(): """Apply all patches. List of patches: * eventlet's monkey patch for all cases; * minidom's writexml patch for py < 2.7.3 only. """ eventlet_monkey_patch() patch_minidom_writexml() def eventlet_monkey_patch(): """Apply eventlet's monkey patch. This call should be the first call in the application. It's safe to call monkey_patch multiple times. """ eventlet.monkey_patch(**EVENTLET_MONKEY_PATCH_MODULES) def eventlet_import_monkey_patched(module): """Returns the module monkey patched by eventlet. It's needed for some tests, for example, the context test. """ return eventlet.import_patched(module, **EVENTLET_MONKEY_PATCH_MODULES) def patch_minidom_writexml(): """Patch for xml.dom.minidom toprettyxml bug with whitespaces around text We apply the patch to avoid excess whitespaces in generated xml configuration files that break Hadoop. 
(This patch will be applied for all Python versions < 2.7.3) Issue: http://bugs.python.org/issue4147 Patch: http://hg.python.org/cpython/rev/cb6614e3438b/ Description: http://ronrothman.com/public/leftbraned/xml-dom-minidom-\ toprettyxml-and-silly-whitespace/#best-solution """ import sys if sys.version_info >= (2, 7, 3): return import xml.dom.minidom as md def element_writexml(self, writer, indent="", addindent="", newl=""): # indent = current indentation # addindent = indentation to add to higher levels # newl = newline string writer.write(indent + "<" + self.tagName) attrs = self._get_attributes() a_names = list(attrs.keys()) a_names.sort() for a_name in a_names: writer.write(" %s=\"" % a_name) md._write_data(writer, attrs[a_name].value) writer.write("\"") if self.childNodes: writer.write(">") if (len(self.childNodes) == 1 and self.childNodes[0].nodeType == md.Node.TEXT_NODE): self.childNodes[0].writexml(writer, '', '', '') else: writer.write(newl) for node in self.childNodes: node.writexml(writer, indent + addindent, addindent, newl) writer.write(indent) writer.write("%s" % (self.tagName, newl)) else: writer.write("/>%s" % (newl)) md.Element.writexml = element_writexml def text_writexml(self, writer, indent="", addindent="", newl=""): md._write_data(writer, "%s%s%s" % (indent, self.data, newl)) md.Text.writexml = text_writexml ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6338367 sahara-plugin-spark-10.0.0/sahara_plugin_spark.egg-info/0000775000175000017500000000000000000000000023173 5ustar00zuulzuul00000000000000././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark.egg-info/PKG-INFO0000664000175000017500000000464600000000000024302 0ustar00zuulzuul00000000000000Metadata-Version: 1.2 Name: sahara-plugin-spark Version: 10.0.0 Summary: Spark Plugin for Sahara Project Home-page: https://docs.openstack.org/sahara/latest/ Author: OpenStack Author-email: openstack-discuss@lists.openstack.org License: Apache Software License Description: ======================== Team and repository tags ======================== .. image:: https://governance.openstack.org/tc/badges/sahara.svg :target: https://governance.openstack.org/tc/reference/tags/index.html .. Change things from this point on OpenStack Data Processing ("Sahara") Spark Plugin ================================================== OpenStack Sahara Spark Plugin provides the users the option to start Spark clusters on OpenStack Sahara. Check out OpenStack Sahara documentation to see how to deploy the Spark Plugin. 
Sahara at wiki.openstack.org: https://wiki.openstack.org/wiki/Sahara Storyboard project: https://storyboard.openstack.org/#!/project/openstack/sahara-plugin-spark Sahara docs site: https://docs.openstack.org/sahara/latest/ Quickstart guide: https://docs.openstack.org/sahara/latest/user/quickstart.html How to participate: https://docs.openstack.org/sahara/latest/contributor/how-to-participate.html Source: https://opendev.org/openstack/sahara-plugin-spark Bugs and feature requests: https://storyboard.openstack.org/#!/project/openstack/sahara-plugin-spark Release notes: https://docs.openstack.org/releasenotes/sahara-plugin-spark/ License ------- Apache License Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 Platform: UNKNOWN Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: Implementation :: CPython Classifier: Programming Language :: Python :: 3 :: Only Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Environment :: OpenStack Classifier: Intended Audience :: Information Technology Classifier: Intended Audience :: System Administrators Classifier: License :: OSI Approved :: Apache Software License Classifier: Operating System :: POSIX :: Linux Requires-Python: >=3.8 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark.egg-info/SOURCES.txt0000664000175000017500000000713600000000000025066 0ustar00zuulzuul00000000000000.stestr.conf .zuul.yaml AUTHORS CONTRIBUTING.rst ChangeLog LICENSE README.rst babel.cfg requirements.txt setup.cfg setup.py test-requirements.txt tox.ini doc/requirements.txt doc/source/conf.py doc/source/index.rst doc/source/contributor/contributing.rst doc/source/contributor/index.rst doc/source/user/index.rst doc/source/user/spark-plugin.rst releasenotes/notes/drop-py2-7-ff5c64bed835ce49.yaml releasenotes/notes/spark-on-image-pack-f5609daf38c45b6f.yaml releasenotes/source/conf.py releasenotes/source/index.rst releasenotes/source/stein.rst releasenotes/source/train.rst releasenotes/source/unreleased.rst releasenotes/source/ussuri.rst releasenotes/source/victoria.rst releasenotes/source/wallaby.rst releasenotes/source/xena.rst releasenotes/source/zed.rst releasenotes/source/_static/.placeholder releasenotes/source/_templates/.placeholder releasenotes/source/locale/de/LC_MESSAGES/releasenotes.po releasenotes/source/locale/en_GB/LC_MESSAGES/releasenotes.po releasenotes/source/locale/ne/LC_MESSAGES/releasenotes.po sahara_plugin_spark/__init__.py sahara_plugin_spark/i18n.py sahara_plugin_spark.egg-info/PKG-INFO sahara_plugin_spark.egg-info/SOURCES.txt sahara_plugin_spark.egg-info/dependency_links.txt sahara_plugin_spark.egg-info/entry_points.txt sahara_plugin_spark.egg-info/not-zip-safe sahara_plugin_spark.egg-info/pbr.json sahara_plugin_spark.egg-info/requires.txt sahara_plugin_spark.egg-info/top_level.txt sahara_plugin_spark/locale/de/LC_MESSAGES/sahara_plugin_spark.po sahara_plugin_spark/locale/en_GB/LC_MESSAGES/sahara_plugin_spark.po sahara_plugin_spark/locale/id/LC_MESSAGES/sahara_plugin_spark.po sahara_plugin_spark/locale/ne/LC_MESSAGES/sahara_plugin_spark.po sahara_plugin_spark/plugins/__init__.py sahara_plugin_spark/plugins/spark/__init__.py sahara_plugin_spark/plugins/spark/config_helper.py sahara_plugin_spark/plugins/spark/edp_engine.py sahara_plugin_spark/plugins/spark/images.py 
sahara_plugin_spark/plugins/spark/plugin.py sahara_plugin_spark/plugins/spark/run_scripts.py sahara_plugin_spark/plugins/spark/scaling.py sahara_plugin_spark/plugins/spark/shell_engine.py sahara_plugin_spark/plugins/spark/resources/README.rst sahara_plugin_spark/plugins/spark/resources/core-default.xml sahara_plugin_spark/plugins/spark/resources/hdfs-default.xml sahara_plugin_spark/plugins/spark/resources/spark-cleanup.cron sahara_plugin_spark/plugins/spark/resources/spark-env.sh.template sahara_plugin_spark/plugins/spark/resources/tmp-cleanup.sh.template sahara_plugin_spark/plugins/spark/resources/topology.sh sahara_plugin_spark/plugins/spark/resources/images/image.yaml sahara_plugin_spark/plugins/spark/resources/images/centos/turn_off_services sahara_plugin_spark/plugins/spark/resources/images/centos/wget_cdh_repo sahara_plugin_spark/plugins/spark/resources/images/common/add_jar sahara_plugin_spark/plugins/spark/resources/images/common/install_extjs sahara_plugin_spark/plugins/spark/resources/images/common/install_spark sahara_plugin_spark/plugins/spark/resources/images/common/manipulate_s3 sahara_plugin_spark/plugins/spark/resources/images/ubuntu/config_spark sahara_plugin_spark/plugins/spark/resources/images/ubuntu/turn_off_services sahara_plugin_spark/plugins/spark/resources/images/ubuntu/wget_cdh_repo sahara_plugin_spark/tests/__init__.py sahara_plugin_spark/tests/unit/__init__.py sahara_plugin_spark/tests/unit/base.py sahara_plugin_spark/tests/unit/plugins/__init__.py sahara_plugin_spark/tests/unit/plugins/spark/__init__.py sahara_plugin_spark/tests/unit/plugins/spark/test_config_helper.py sahara_plugin_spark/tests/unit/plugins/spark/test_plugin.py sahara_plugin_spark/utils/__init__.py sahara_plugin_spark/utils/patches.py././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark.egg-info/dependency_links.txt0000664000175000017500000000000100000000000027241 0ustar00zuulzuul00000000000000 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark.egg-info/entry_points.txt0000664000175000017500000000013100000000000026464 0ustar00zuulzuul00000000000000[sahara.cluster.plugins] spark = sahara_plugin_spark.plugins.spark.plugin:SparkProvider ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark.egg-info/not-zip-safe0000664000175000017500000000000100000000000025421 0ustar00zuulzuul00000000000000 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark.egg-info/pbr.json0000664000175000017500000000005600000000000024652 0ustar00zuulzuul00000000000000{"git_version": "c68cb5e", "is_release": true}././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 sahara-plugin-spark-10.0.0/sahara_plugin_spark.egg-info/requires.txt0000664000175000017500000000030300000000000025567 0ustar00zuulzuul00000000000000Babel!=2.4.0,>=2.3.4 eventlet>=0.26.0 oslo.i18n>=3.15.3 oslo.log>=3.36.0 oslo.serialization!=2.19.1,>=2.18.0 oslo.utils>=3.33.0 pbr!=2.1.0,>=2.0.0 requests>=2.14.2 sahara>=10.0.0.0b1 six>=1.10.0 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418419.0 
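The entry_points.txt shown above is what makes the plugin discoverable once the package is installed: the sahara.cluster.plugins group maps the name "spark" to the SparkProvider class. As a purely illustrative sketch (not part of this archive, and not a claim about how Sahara's own loader is implemented), that declared entry point could be resolved with importlib.metadata, assuming the package is installed:

# Illustrative sketch only -- not part of the archive. The group and name
# come from entry_points.txt; importlib.metadata is just one way to do the
# lookup.
from importlib.metadata import entry_points

def load_spark_provider():
    # Python 3.10+ keyword filtering; on 3.8/3.9 use
    # entry_points()["sahara.cluster.plugins"] instead.
    for ep in entry_points(group="sahara.cluster.plugins"):
        if ep.name == "spark":
            # Imports sahara_plugin_spark.plugins.spark.plugin and returns
            # the SparkProvider class declared there.
            return ep.load()
    raise LookupError("the spark plugin entry point is not installed")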
sahara-plugin-spark-10.0.0/sahara_plugin_spark.egg-info/top_level.txt0000664000175000017500000000002400000000000025721 0ustar00zuulzuul00000000000000sahara_plugin_spark ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1696418419.6458373 sahara-plugin-spark-10.0.0/setup.cfg0000664000175000017500000000250400000000000017306 0ustar00zuulzuul00000000000000[metadata] name = sahara-plugin-spark summary = Spark Plugin for Sahara Project description-file = README.rst license = Apache Software License python-requires = >=3.8 classifiers = Programming Language :: Python Programming Language :: Python :: Implementation :: CPython Programming Language :: Python :: 3 :: Only Programming Language :: Python :: 3 Programming Language :: Python :: 3.8 Programming Language :: Python :: 3.9 Environment :: OpenStack Intended Audience :: Information Technology Intended Audience :: System Administrators License :: OSI Approved :: Apache Software License Operating System :: POSIX :: Linux author = OpenStack author-email = openstack-discuss@lists.openstack.org home-page = https://docs.openstack.org/sahara/latest/ [files] packages = sahara_plugin_spark [entry_points] sahara.cluster.plugins = spark = sahara_plugin_spark.plugins.spark.plugin:SparkProvider [compile_catalog] directory = sahara_plugin_spark/locale domain = sahara_plugin_spark [update_catalog] domain = sahara_plugin_spark output_dir = sahara_plugin_spark/locale input_file = sahara_plugin_spark/locale/sahara_plugin_spark.pot [extract_messages] keywords = _ gettext ngettext l_ lazy_gettext mapping_file = babel.cfg output_file = sahara_plugin_spark/locale/sahara_plugin_spark.pot [egg_info] tag_build = tag_date = 0 ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/setup.py0000664000175000017500000000127100000000000017177 0ustar00zuulzuul00000000000000# Copyright (c) 2013 Hewlett-Packard Development Company, L.P. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or # implied. # See the License for the specific language governing permissions and # limitations under the License. import setuptools setuptools.setup( setup_requires=['pbr>=2.0.0'], pbr=True) ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/test-requirements.txt0000664000175000017500000000102200000000000021720 0ustar00zuulzuul00000000000000# The order of packages is significant, because pip processes them in the order # of appearance. Changing the order has an impact on the overall integration # process, which may cause wedges in the gate later. 
hacking>=3.0.1,<3.1.0 # Apache-2.0 bandit>=1.1.0 # Apache-2.0 bashate>=0.5.1 # Apache-2.0 coverage!=4.4,>=4.0 # Apache-2.0 doc8>=0.6.0 # Apache-2.0 fixtures>=3.0.0 # Apache-2.0/BSD oslotest>=3.2.0 # Apache-2.0 stestr>=1.0.0 # Apache-2.0 pylint==1.4.5 # GPLv2 testscenarios>=0.4 # Apache-2.0/BSD testtools>=2.4.0 # MIT ././@PaxHeader0000000000000000000000000000002600000000000011453 xustar000000000000000022 mtime=1696418392.0 sahara-plugin-spark-10.0.0/tox.ini0000664000175000017500000000545100000000000017004 0ustar00zuulzuul00000000000000[tox] envlist = py38,pep8 minversion = 3.1.1 skipsdist = True # this allows tox to infer the base python from the environment name # and override any basepython configured in this file ignore_basepython_conflict = true [testenv] basepython = python3 usedevelop = True install_command = pip install {opts} {packages} setenv = VIRTUAL_ENV={envdir} DISCOVER_DIRECTORY=sahara_plugin_spark/tests/unit deps = -c{env:UPPER_CONSTRAINTS_FILE:https://releases.openstack.org/constraints/upper/master} -r{toxinidir}/requirements.txt -r{toxinidir}/test-requirements.txt commands = stestr run {posargs} passenv = http_proxy HTTP_PROXY https_proxy HTTPS_PROXY no_proxy NO_PROXY [testenv:debug-py36] basepython = python3.6 commands = oslo_debug_helper -t sahara_plugin_spark/tests/unit {posargs} [testenv:debug-py37] basepython = python3.7 commands = oslo_debug_helper -t sahara_plugin_spark/tests/unit {posargs} [testenv:pep8] deps = -c{env:UPPER_CONSTRAINTS_FILE:https://releases.openstack.org/constraints/upper/master} -r{toxinidir}/requirements.txt -r{toxinidir}/test-requirements.txt -r{toxinidir}/doc/requirements.txt commands = flake8 {posargs} doc8 doc/source [testenv:venv] commands = {posargs} [testenv:docs] deps = -c{env:UPPER_CONSTRAINTS_FILE:https://releases.openstack.org/constraints/upper/master} -r{toxinidir}/doc/requirements.txt commands = rm -rf doc/build/html sphinx-build -W -b html doc/source doc/build/html whitelist_externals = rm [testenv:pdf-docs] deps = {[testenv:docs]deps} commands = rm -rf doc/build/pdf sphinx-build -W -b latex doc/source doc/build/pdf make -C doc/build/pdf whitelist_externals = make rm [testenv:releasenotes] deps = -c{env:UPPER_CONSTRAINTS_FILE:https://releases.openstack.org/constraints/upper/master} -r{toxinidir}/doc/requirements.txt commands = rm -rf releasenotes/build releasenotes/html sphinx-build -a -E -W -d releasenotes/build/doctrees -b html releasenotes/source releasenotes/build/html whitelist_externals = rm [testenv:debug] # It runs tests from the specified dir (default is sahara_plugin_spark/tests) # in interactive mode, so, you could use pbr for tests debug. # Example usage: tox -e debug -- -t sahara_plugin_spark/tests/unit some.test.path # https://docs.openstack.org/oslotest/latest/features.html#debugging-with-oslo-debug-helper commands = oslo_debug_helper -t sahara_plugin_spark/tests/unit {posargs} [flake8] show-source = true builtins = _ exclude=.venv,.git,.tox,dist,doc,*lib/python*,*egg,tools # [H904] Delay string interpolations at logging calls # [H106] Don't put vim configuration in source files # [H203] Use assertIs(Not)None to check for None. # [H204] Use assert(Not)Equal to check for equality # [H205] Use assert(Greater|Less)(Equal) for comparison enable-extensions=H904,H106,H203,H204,H205