pax_global_header00006660000000000000000000000064146725665140014532gustar00rootroot0000000000000052 comment=c3ba8fae2b13c00cb8a9fa178a4dcac746337c53 pyDataverse-0.3.4/000077500000000000000000000000001467256651400140255ustar00rootroot00000000000000pyDataverse-0.3.4/.coveragerc000066400000000000000000000000461467256651400161460ustar00rootroot00000000000000[html] directory = docs/coverage_htlm pyDataverse-0.3.4/.github/000077500000000000000000000000001467256651400153655ustar00rootroot00000000000000pyDataverse-0.3.4/.github/.gitmessage.txt000066400000000000000000000016601467256651400203370ustar00rootroot00000000000000Capitalized, short (50 chars or less) summary (#123) More detailed explanatory text, if necessary. Wrap it to about 72 characters or so. In some contexts, the first line is treated as the subject of an email and the rest of the text as the body. The blank line separating the summary from the body is critical (unless you omit the body entirely); tools like rebase can get confused if you run the two together. Write your commit message in the imperative: "Fix bug" and not "Fixed bug" or "Fixes bug." This convention matches up with commit messages generated by commands like git merge and git revert. Further paragraphs come after blank lines. - Bullet points are okay, too - Typically a hyphen or asterisk is used for the bullet, followed by a single space, with blank lines in between, but conventions vary here - Use a hanging indent If you use an issue tracker, add a reference(s) to them at the bottom, like so: Resolves: #123 pyDataverse-0.3.4/.github/ISSUE_TEMPLATE/000077500000000000000000000000001467256651400175505ustar00rootroot00000000000000pyDataverse-0.3.4/.github/ISSUE_TEMPLATE/bug-template.md000066400000000000000000000044561467256651400224710ustar00rootroot00000000000000--- name: 'Bug Report' about: 'This is a bug report issue' labels: 'type:bug, status:incoming' --- Thank you for your contribution! It's great, that you want contribute to pyDataverse. First, start by reading the [Bug reports, enhancement requests and other issues](https://pydataverse.readthedocs.io/en/master/contributing/contributing.html) section. ### Before we can start Before moving on, please check some things first: * [ ] Your issue may already be reported! Please search on the [issue tracker](https://github.com/gdcc/pyDataverse/issues) before creating one. * [ ] Is this something you can **debug and fix**? Send a pull request! For more information, see the [Contributor Guide](https://pydataverse.readthedocs.io/en/master/contributing/contributing.html). * [ ] We as maintainers foster an open and welcoming environment. Be respectfull, supportive and nice to each other! :) ### Prerequisites * [ ] Are you running the expected version of pyDataverse? (check via `pip freeze`). ### Bug report [Please replace this line with a brief summary of your issue and add code and/or screenshots or other media to it if available] To write a bug report, we have defined a small, helpful workflow, to keep communication effective. **1. Describe your environment** * [ ] OS: NAME, VERSION, 64/32bit * [ ] pyDataverse: VERSION * [ ] Python: VERSION * [ ] Dataverse: VERSION **2. Actual behaviour:** [What actually happened] [Add logs, code, data, screenshots or other media if available.] **3. Expected behaviour:** [What have you expected to happen?] **4. Steps to reproduce** 1. [First Step] 2. [Second Step] 3. [and so on...] [Add logs, code, data, screenshots or other media if available.] **5. 
Possible solution** [If you have a clue, tell what could be the actual solution to the problem] **6. Check your bug report** Before you submit the issue: * Check if all information necessary to understand the problem is in. * Check if your language is written in a positive way. pyDataverse-0.3.4/.github/ISSUE_TEMPLATE/config.yml000066400000000000000000000000341467256651400215350ustar00rootroot00000000000000blank_issues_enabled: false pyDataverse-0.3.4/.github/ISSUE_TEMPLATE/feature-template.md000066400000000000000000000026631467256651400233450ustar00rootroot00000000000000--- name: 'Feature Request' about: 'This is a feature request issue' labels: 'type:feature, status:incoming' --- Thank you for your contribution! It's great, that you want contribute to pyDataverse. First, start by reading the [Bug reports, enhancement requests and other issues](https://pydataverse.readthedocs.io/en/master/contributing/contributing.html) section. ### Before we can start Before moving on, please check some things first: * [ ] Your issue may already be reported! Please search on the [issue tracker](https://github.com/gdcc/pyDataverse/issues) before creating one. * [ ] Is this something you can **debug and fix**? Send a pull request! For more information, see the [Contributor Guide](https://pydataverse.readthedocs.io/en/master/contributing/contributing.html). * [ ] We as maintainers foster an open and welcoming environment. Be respectfull, supportive and nice to each other! :) ### Prerequisites * [ ] Are you running the latest version? ### Feature Request We will consider your request but it may be closed if it's something we're not actively planning to work on. **Please note: By far the quickest way to get a new feature is to file a [Pull Request](https://github.com/gdcc/pyDataverse/pulls).** pyDataverse-0.3.4/.github/ISSUE_TEMPLATE/issue-template.md000066400000000000000000000025331467256651400230360ustar00rootroot00000000000000--- name: 'Issue' about: 'This is a normal issue' labels: 'status:incoming' --- Thank you for your contribution! It's great, that you want contribute to pyDataverse. First, start by reading the [Bug reports, enhancement requests and other issues](https://pydataverse.readthedocs.io/en/master/contributing/contributing.html) section. ### Before we can start Before moving on, please check some things first: * [ ] Your issue may already be reported! Please search on the [issue tracker](https://github.com/gdcc/pyDataverse/issues) before creating one. * [ ] Use our issue templates for bug reports and feature requests, if that's what you need. * [ ] Are you running the expected version of pyDataverse? (check via `pip freeze`). * [ ] Is this something you can **debug and fix**? Send a pull request! Bug fixes and documentation fixes are welcome. For more information, see the [Contributor Guide](https://pydataverse.readthedocs.io/en/master/contributing/contributing.html). * [ ] We as maintainers foster an open and welcoming environment. Be respectfull, supportive and nice to each other! :) ### Issue [Explain the reason for your issue] pyDataverse-0.3.4/.github/PULL_REQUEST_TEMPLATE.md000066400000000000000000000064271467256651400211770ustar00rootroot00000000000000 *Any change needs to be discussed before proceeding. Failure to do so may result in the rejection of the pull request.* Thanks for submitting a pull request! It's great, that you want contribute to pyDataverse. Please provide enough information so that others can review it. 
First, always start by reading the [Contribution Guide](https://pydataverse.readthedocs.io/en/master/contributing/contributing.html). There you can find all the information needed to create good pull requests. ### All Submissions **Describe your environment** * [ ] OS: NAME, VERSION, 64/32bit * [ ] pyDataverse: VERSION * [ ] Python: VERSION * [ ] Dataverse: VERSION **Follow best practices** * [ ] Have you checked to ensure there aren't other open [Pull Requests](https://github.com/gdcc/pyDataverse/pulls) for the same update/change? * [ ] Have you followed the guidelines in our [Contribution Guide](https://pydataverse.readthedocs.io/en/master/contributing/contributing.html)? * [ ] Have you read the [Code of Conduct](https://github.com/gdcc/pyDataverse/blob/master/CODE_OF_CONDUCT.md)? * [ ] Make your changes in a separate branch. Branches MUST have descriptive names. * [ ] Have you merged the latest changes from upstream to your branch? **Describe the PR** * [ ] What kind of change does this PR introduce? * TEXT * [ ] Why is this change required? What problem does it solve? * TEXT * [ ] Screenshots (if appropriate) * [ ] Put `Closes #ISSUE_NUMBER` at the end of this pull request **Testing** * [ ] Have you used tox and/or pytest for testing the changes? * [ ] Did the local testing run successfully? * [ ] Did the Continuous Integration testing (Travis-CI) run successfully? **Commits** * [ ] Write descriptive commit messages with a short title (first line). * [ ] Use the [commit message template](https://github.com/gdcc/pyDataverse/blob/master/.github/.gitmessage.txt) * [ ] Put `Closes #ISSUE_NUMBER` in your commit messages to auto-close the issue that it fixes (if any). **Others** * [ ] Is there anything you need from someone else? ### Documentation contribution * [ ] Have you followed the NumPy Docstring standard? ### Code contribution * [ ] Have you used pre-commit? * [ ] Have you formatted your code with black prior to submission (e.g. via pre-commit)? * [ ] Have you written new tests for your changes? * [ ] Have you run mypy on your changes successfully? * [ ] Have you documented your update (Docstrings and/or Docs)? * [ ] Do your changes require additional changes to the documentation? 
Closes #ISSUE_NUMBER pyDataverse-0.3.4/.github/workflows/000077500000000000000000000000001467256651400174225ustar00rootroot00000000000000pyDataverse-0.3.4/.github/workflows/build.yml000066400000000000000000000010651467256651400212460ustar00rootroot00000000000000name: Build PyDataverse on: [push] jobs: build: runs-on: ubuntu-latest strategy: matrix: python-version: ["3.8", "3.9", "3.10", "3.11"] name: Build pyDataverse steps: - name: "Checkout" uses: "actions/checkout@v4" - name: Setup Python uses: actions/setup-python@v3 with: python-version: ${{ matrix.python-version }} - name: Install Python Dependencies run: | python3 -m pip install --upgrade pip python3 -m pip install poetry poetry install pyDataverse-0.3.4/.github/workflows/codespell.yml000066400000000000000000000006211467256651400221160ustar00rootroot00000000000000# Codespell configuration is within setup.cfg --- name: Codespell on: push: branches: [master] pull_request: branches: [master] permissions: contents: read jobs: codespell: name: Check for spelling errors runs-on: ubuntu-latest steps: - name: Checkout uses: actions/checkout@v4 - name: Codespell uses: codespell-project/actions-codespell@v2 pyDataverse-0.3.4/.github/workflows/lint.yml000066400000000000000000000002421467256651400211110ustar00rootroot00000000000000name: Ruff on: [push, pull_request] jobs: ruff: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: chartboost/ruff-action@v1 pyDataverse-0.3.4/.github/workflows/publish.yml000066400000000000000000000004531467256651400216150ustar00rootroot00000000000000name: Build and publish on: release: types: [released] jobs: deploy: runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - name: "Build and publish to PyPi" uses: JRubics/poetry-publish@v1.17 with: pypi_token: ${{ secrets.PYPI_TOKEN }} pyDataverse-0.3.4/.github/workflows/tests.yml000066400000000000000000000020321467256651400213040ustar00rootroot00000000000000name: Unit tests on: [push] jobs: custom_test: runs-on: ubuntu-latest strategy: matrix: python-version: ["3.8", "3.9", "3.10", "3.11"] name: Test pyDataverse env: PORT: 8080 steps: - name: "Checkout" uses: "actions/checkout@v4" - name: Run Dataverse Action id: dataverse uses: gdcc/dataverse-action@main - name: Setup Python uses: actions/setup-python@v3 with: python-version: ${{ matrix.python-version }} - name: Install Python Dependencies run: | python3 -m pip install --upgrade pip python3 -m pip install poetry poetry install --with tests - name: Run tests env: API_TOKEN_SUPERUSER: ${{ steps.dataverse.outputs.api_token }} API_TOKEN: ${{ steps.dataverse.outputs.api_token }} BASE_URL: ${{ steps.dataverse.outputs.base_url }} DV_VERSION: ${{ steps.dataverse.outputs.dv_version }} run: | python3 -m poetry run pytest pyDataverse-0.3.4/.gitignore000066400000000000000000000046121467256651400160200ustar00rootroot00000000000000# Local testing dv artifacts dv solr # Apple artifacts .DS_Store # Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] *$py.class # C extensions *.so # Distribution / packaging .Python build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ wheels/ share/python-wheels/ *.egg-info/ .installed.cfg *.egg MANIFEST # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into it. 
*.manifest *.spec # Installer logs pip-log.txt pip-delete-this-directory.txt # Unit test / coverage reports htmlcov/ .tox/ .nox/ .coverage .coverage.* .cache nosetests.xml coverage.xml *.cover *.py,cover .hypothesis/ .pytest_cache/ cover/ # Translations *.mo *.pot # Django stuff: *.log local_settings.py db.sqlite3 db.sqlite3-journal # Flask stuff: instance/ .webassets-cache # Scrapy stuff: .scrapy # Sphinx documentation docs/_build/ # PyBuilder .pybuilder/ target/ # Jupyter Notebook .ipynb_checkpoints # IPython profile_default/ ipython_config.py # pyenv # For a library or package, you might want to ignore these files since the code is # intended to run in multiple environments; otherwise, check them in: # .python-version # pipenv # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. # However, in case of collaboration, if having platform-specific dependencies or dependencies # having no cross-platform support, pipenv may install dependencies that don't work, or not # install all needed dependencies. #Pipfile.lock # PEP 582; used by e.g. github.com/David-OConnor/pyflow __pypackages__/ # Celery stuff celerybeat-schedule celerybeat.pid # SageMath parsed files *.sage.py # Environments .env .venv env/ venv/ ENV/ env.bak/ venv.bak/ # Spyder project settings .spyderproject .spyproject # Rope project settings .ropeproject # mkdocs documentation /site # mypy .mypy_cache/ .dmypy.json dmypy.json # Pyre type checker .pyre/ # pytype static type analyzer .pytype/ # Cython debug symbols cython_debug/ # Personal notes*.md stash*.* setup.sh .pypirc data/ !tests/data tests/data/output/ dev/ internal *.code-workspace .python-version devel.py test_manual.py /docs src/pyDataverse/docs/build src/pyDataverse/docs/source/_build src/pyDataverse/docs/Makefile env-config/ wiki/ # Poetry lock poetry.lock # Ruff .ruff_cache/ # JetBrains .idea/ pyDataverse-0.3.4/.pre-commit-config.yaml000066400000000000000000000016611467256651400203120ustar00rootroot00000000000000default_language_version: python: python3 exclude: ^migrations/ repos: - repo: https://github.com/pre-commit/pre-commit-hooks rev: v4.6.0 hooks: - id: check-added-large-files - id: check-case-conflict - id: check-docstring-first - id: check-json - id: check-symlinks - id: check-xml - id: check-yaml - id: detect-private-key - id: end-of-file-fixer - id: pretty-format-json args: [--autofix, --no-sort-keys] - repo: https://github.com/astral-sh/ruff-pre-commit rev: v0.4.4 hooks: - id: ruff args: [--fix] - id: ruff-format - repo: https://github.com/asottile/blacken-docs rev: v1.7.0 hooks: - id: blacken-docs additional_dependencies: [black==19.10b0] - repo: https://github.com/codespell-project/codespell # Configuration for codespell is in setup.cfg rev: v2.2.6 hooks: - id: codespell pyDataverse-0.3.4/.readthedocs.yml000066400000000000000000000010731467256651400171140ustar00rootroot00000000000000# .readthedocs.yml # Read the Docs configuration file # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details # Required version: 2 # Build documentation in the docs/ directory with Sphinx sphinx: configuration: pyDataverse/docs/source/conf.py # Optionally build your docs in additional formats such as PDF and ePub formats: all # Optionally set the version of Python and requirements required to build your docs python: version: 3.6 install: - requirements: requirements/docs.txt - method: pip path: . 
system_packages: true pyDataverse-0.3.4/.travis.yml000066400000000000000000000014631467256651400161420ustar00rootroot00000000000000language: python cache: pip dist: xenial matrix: include: - python: 3.6 env: TOXENV=py36 - python: 3.7 env: TOXENV=py37 - python: 3.8 env: TOXENV=py38 - python: 3.6 env: TOXENV=docs - python: 3.6 env: TOXENV=coverage - python: 3.6 env: TOXENV=coveralls branches: only: - master - develop before_install: - echo $TRAVIS_PYTHON_VERSION install: - pip install tox-travis - pip install coverage - pip install coveralls - virtualenv --version - easy_install --version - pip --version - tox --version script: - tox after_success: - coveralls notifications: email: recipients: - stefan.kasberger@univie.ac.at on_success: change pyDataverse-0.3.4/CODE_OF_CONDUCT.md000066400000000000000000000065221467256651400166310ustar00rootroot00000000000000# Contributor Covenant Code of Conduct ## Our Pledge In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. ## Our Standards Examples of behavior that contributes to creating a positive environment include: - Using welcoming and inclusive language - Being respectful of differing viewpoints and experiences - Gracefully accepting constructive criticism - Focusing on what is best for the community - Showing empathy towards other community members Examples of unacceptable behavior by participants include: - The use of sexualized language or imagery and unwelcome sexual attention or advances - Trolling, insulting/derogatory comments, and personal or political attacks - Public or private harassment - Publishing others' private information, such as a physical or electronic address, without explicit permission - Other conduct which could reasonably be considered inappropriate in a professional setting ## Our Responsibilities Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior. Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. ## Scope This Code of Conduct applies within all project spaces, and it also applies when an individual is representing the project or its community in public spaces. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. ## Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project lead at stefan.kasberger@univie.ac.at. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. 
The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately. Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership. ## Attribution This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [homepage]: https://www.contributor-covenant.org For answers to common questions about this code of conduct, see pyDataverse-0.3.4/CONTRIBUTING.rst000066400000000000000000001000431467256651400164640ustar00rootroot00000000000000Contributor Guide ========================================= .. contents:: Table of Contents :local: .. _contributing_get-started: Where to Start? ----------------------------- All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome! If you are new to open-source development or pyDataverse, we recommend going through the `GitHub issues `_, to find issues that interest you. There are a number of issues listed under `beginner `_, `docs `_ and `unassigned issues `_. where you could start. Once you've found an interesting issue, you can return here to get your development environment setup. When you start working on an issue, it’s a good idea to assign the issue to yourself so that nobody else duplicates the work on it. GitHub restricts assigning issues to maintainers of the project only. To let us know, please add a comment to the issue so that everyone knows that you are working on the issue. If for whatever reason you are not able to continue working with the issue, please try to unassign it so that other people know it’s available again. You can periodically check the list of assigned issues, since people may not be working in them anymore. If you want to work on one that is assigned, feel free to kindly ask the current assignee if you can take it (please allow at least a week of inactivity before considering work in the issue discontinued). This project and everyone participating in it is governed by the pyDataverse `Code of Conduct `_. By participating, you are expected to uphold this code. Please report unacceptable behaviour (see :ref:`community_contact`). **Be respectful, supportive and nice to each other!** .. _contributing_create-issues: Bug reports, enhancement requests and other issues ---------------------------------------------------- Bug reports are an important part of making pyDataverse more stable. Having a complete bug report will allow others to reproduce the bug and provide insight into fixing the issue. Trying the bug-producing code out on the ``main`` branch is often a worthwhile exercise to confirm the bug still exists. It is also worth searching existing bug reports and pull requests to see if the issue has already been reported and/or fixed. Other reasons to create an issue could be: * suggesting new features * sharing an idea * giving feedback Please check some things before creating an issue: * Your issue may already be reported! Please search on the `issue tracker `_ before creating one. * Is this something you can **develop**? Send a pull request! 
Once you have clicked `New issue `_, you have to choose one of the issue templates: * Bug report (`template `__) * Feature request (`template `__) * Issue: all other issues, except bug reports and feature requests (`template `__) After selecting the appropriate template, you will see some explanatory text. Follow it step-by-step. After clicking `Submit new issue`, the issue will then show up to the pyDataverse community and be open to comments/ideas from others. Besides creating an issue, you also can contribute in many other ways by: * sharing your knowledge in Issues and Pull Requests * reviewing `pull requests `_ * talking about pyDataverse and sharing it with others .. _contributing_working-with-code: Working with the code ----------------------------- Now that you have an issue you want to fix, an enhancement to add, or documentation to improve, you need to learn how to work with GitHub and the pyDataverse code base. .. _contributing_working-with-code_version-control: Version control, Git, and GitHub ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ To the new user, working with Git is one of the more daunting aspects of contributing to pyDataverse. It can very quickly become overwhelming, but sticking to the guidelines below will help keep the process straightforward and mostly trouble free. As always, if you are having difficulties please feel free to ask for help. The code is hosted on `GitHub `__. To contribute you will need to sign up for a `free GitHub account `_. We use `Git `_ for version control to allow many people to work together on the project. A great resource for learning Git: the `GitHub help pages `_ There are many ways to work with git and Github. Our workflow is inspired by the `GitHub flow `_ and `Git flow `_ approaches. .. _contributing_working-with-code_git: Getting started with Git ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `GitHub has instructions `_ for installing git, setting up your SSH key, and configuring git. All these steps need to be completed before you can work seamlessly between your local repository and GitHub. .. _contributing_working-with-code_forking: Forking ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You will need your own fork to work on the code. Go to the `pyDataverse project page `_ and hit the Fork button. You will want to clone your fork to your machine: .. code-block:: shell git clone https://github.com/YOUR_USER_NAME/pyDataverse.git cd pyDataverse git remote add upstream https://github.com/gdcc/pyDataverse.git This creates the directory `pyDataverse` and connects your repository to the upstream (main project) pyDataverse repository. .. _contributing_working-with-code_development-environment: Creating a development environment ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ To test out code changes, you’ll need to build pyDataverse from source, which requires a Python environment. If you’re making documentation changes, you can skip to :ref:`Contributing to the documentation ` , but if you skip creating the development environment you won’t be able to build the documentation locally before pushing your changes. We use poetry to manage dependencies and the development environment. If you already have poetry on your system, you can set everything up by calling ``poetry install``: .. 
code-block:: shell $ poetry install --with=dev $ poetry run python3 -c "import pyDataverse; print(pyDataverse.__version__)" 0.3.4 For most tasks, you can use poetry without activating the virtual environment, but sometimes you might want to use the virtual environment directly or save yourself from typing ``poetry run`` over and over again. For that, use the poetry shell: .. code-block:: shell $ poetry shell pyDataverse $ python3 -c "import pyDataverse; print(pyDataverse.__version__)" 0.3.4 pyDataverse $ exit $ In addition to poetry, we use tox to manage common tasks, such as building the documentation or running the tests. .. code-block:: shell $ poetry run tox -e docs You can find more information on how to build and view the docs :ref:`below `. .. _contributing_working-with-code_create-branch: Creating a branch ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You want your ``main`` branch to reflect only release-ready code, so create a feature branch for making your changes. Use a descriptive branch name and replace `BRANCH_NAME` with it, e. g. ``shiny-new-feature``. .. code-block:: shell git checkout main git checkout -b BRANCH_NAME This changes your working directory to the `BRANCH_NAME` branch. Keep any changes in this branch specific to one bug or feature so it is clear what the branch brings to pyDataverse. You can have many branches and switch between them using the git checkout command. When creating this branch, make sure your ``main`` branch is up to date with the latest upstream ``main`` version. To update your local ``main`` branch, you can do: .. code-block:: shell git checkout main git pull --rebase upstream When you want to update the feature branch with changes in ``main`` after you created the branch, check the section on :ref:`updating a PR `. From here, you now can move forward to - contribute to the :ref:`documentation ` - contribute to the :ref:`code base ` .. _contributing_documentation: Contributing to the documentation ----------------------------------------- Contributing to the documentation benefits everyone who uses pyDataverse. We encourage you to help us improve the documentation, and you don’t have to be an expert on pyDataverse to do so! In fact, there are sections of the docs that are worse off after being written by experts. If something in the docs doesn’t make sense to you, updating the relevant section after you figure it out is a great way to ensure it will help the next person. To find ways to contribute to the documentation, start looking the `docs issues `_. .. _contributing_documentation_about: About the pyDataverse documentation ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The documentation is written in **reStructuredText**, which is almost like writing in plain English, and built using `Sphinx `_. The Sphinx Documentation provides an `excellent introduction to reST `_. Review the Sphinx docs to learn how to perform more complex changes to the documentation as well. Some other important things to know about the docs: - The pyDataverse documentation consists of two parts: - the docstrings in the code itself and - the docs in the folder ``pyDataverse/doc/`` - The docstrings provide a clear explanation of the usage of the individual functions, while the documentation consists of tutorial-like overviews per topic together with some other information (what’s new, installation, this page you're viewing right now, etc). - The docstrings follow the `Numpy Docstring Standard `_. .. 
_contributing_documentation_build: How to build the pyDataverse documentation ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ **Requirements** First, you need to have a development environment to be able to build the pyDataverse docs (see :ref:`creating a development environment ` above). **Building the documentation** You can build the docs with ``tox``: .. code-block:: shell poetry run tox -e docs This will create a new virtual environment just for building the docs, install the relevant dependencies into it, and build the documentation. You can find the output in docs/build/html and open the file ``docs/build/html/index.html`` in a web browser to see the full documentation you just built. If you want to inspect them as if they came from a webserver, run: .. code-block:: shell poetry run python3 -m http.server -d docs/build/html -b 127.1 8090 Then open your browser at `http://127.0.0.1:8090 `__. .. _contributing_documentation_pushing-changes: Pushing documentation changes to GitHub ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Each time, a change in the ``main`` branch is pushed to GitHub, the docs are automatically built by Read the Docs. There is also a `latest `_ documentation, which is not a branch itself, only a forward to ``main``. As you do not have the rights to commit directly to the ``main`` branch, you have to :ref:`create a pull request ` to make this happen. .. _contributing_code: Contributing to the code base ----------------------------- Writing good code is not just about what you write. It is also about how you write it. During testing, several tools will be run to check your code for stylistic errors. Thus, good style is suggested for submitting code to pyDataverse. You can open a Pull Request at any point during the development process: when you have little or no code but want to share some screenshots or general ideas, when you're stuck and need help or advice, or when you're ready for someone to review your work. .. _contributing_code_standards: Code standards ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pyDataverse follows the `PEP8 `_ standard and uses `Black `_, `ruff `_ to ensure a consistent code format throughout the project. **Imports** In Python 3, absolute imports are recommended. Import formatting: Imports should be alphabetically sorted within the sections. **String formatting** pyDataverse uses f-strings formatting instead of ``%`` and ``.format()`` string formatters. There is still some code around which uses other conventions, but new code should usually use f-strings. .. _contributing_code_pre-commit: Pre-commit ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You can run many of the styling checks manually. However, we encourage you to use `pre-commit `_ hooks instead to automatically run ``black`` and other tools when you make a git commit. With the ``poetry install --with=dev`` you already installed it, now you only need to set it up as a git-hook: .. code-block:: shell poetry run pre-commit install from the root of the pyDataverse repository. Now styling checks will be run each time you commit changes without your needing to run each one manually. In addition, using pre-commit will also allow you to more easily remain up-to-date with our code checks as they change. .. _contributing_code_type-hints: Type hints ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pyDataverse strongly encourages the use of `PEP 484 `_ style type hints. New development should contain type hints! 
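For illustration, a small hypothetical helper (not part of the pyDataverse API, just a sketch of the expected annotation style) could look like this:

.. code-block:: python

    from typing import Dict, Optional


    def build_query_params(per_page: int = 10, query: Optional[str] = None) -> Dict[str, str]:
        """Collect non-empty query parameters into a dict (illustrative example only)."""
        params: Dict[str, str] = {"per_page": str(per_page)}
        if query is not None:
            params["q"] = query
        return params
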
**Validating type hints** pyDataverse uses `mypy `_ to statically analyze the code base and type hints. After making any change you can ensure your type hints are correct by running .. code-block:: shell poetry run tox -e mypy .. _contributing_code_testing-with-ci: Testing with continuous integration ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The pyDataverse test suite will run automatically on `Travis-CI `_ continuous integration service, once your pull request is submitted. However, if you wish to run the test suite on a branch prior to submitting the pull request, then the continuous integration services need to be hooked to your GitHub repository. Instructions are `here `__. A pull-request will be considered for merging when you have an all ‘green’ build. If any tests are failing, then you will get a red ‘X’, where you can click through to see the individual failed tests. You can find the pyDataverse builds on Travis-CI `here `__. .. _contributing_code_test-driven-development: Test-driven development/code writing ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pyDataverse is serious about testing and strongly encourages contributors to embrace `test-driven development (TDD) `_. This development process “relies on the repetition of a very short development cycle: first the developer writes an (initially failing) automated test case that defines a desired improvement or new function, then produces the minimum amount of code to pass that test.” So, before actually writing any code, you should write your tests. Often the test can be taken from the original GitHub issue. However, it is always worth considering additional use cases and writing corresponding tests. Adding tests is one of the most common requests after code is pushed to pyDataverse. Therefore, it is worth getting in the habit of writing tests ahead of time so this is never an issue. Like many packages, pyDataverse uses `pytest `_ and `tox `_ as test frameworks. To find open tasks around tests, look at open `testing issues `_. **Writing tests** All tests should go into the ``tests/`` subdirectory. This folder contains many current examples of tests, and we suggest looking to these for inspiration. Name your tests with a descriptive filename (with prefix ``test_``) and put it in an appropriate place in the ``tests/`` structure. Follow the typical pattern of constructing an ``expected`` and comparing versus the ``result``. .. _contributing_code_run-test-suite: Running the test suite ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If you have docker available, by far the easiest way to run the tests is to use the ``run-tests.sh`` script. It will spin up a local docker setup with a Dataverse installation and run the tests with that installation. .. code-block:: shell sh run-tests.sh **Setup manual testing** It is possible to run the tests against the live server you have or against demo.dataverse.org, but we recommend using a local docker-installation to avoid unnecessary traffic and load against live servers. **Docker compose service** If you have not used the ``run-tests.sh`` before, you have to create a directory ``dv`` and a file ``dv/bootstrap.exposed.env``: .. code-block:: shell mkdir -p dv touch dv/bootstrap.exposed.env After that, you can run the Dataverse server with docker: .. code-block:: shell docker compose -f docker/docker-compose-base.yml --env-file local-test.env up If you want to login to the web interface, you can use the default username and password as found in the `container guide `__. 
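Before running tests against the container, it can be useful to confirm that Dataverse has finished booting. One way to do this is to query the version endpoint that the healthcheck in ``docker/docker-compose-base.yml`` also polls:

.. code-block:: shell

    # should answer with a JSON document such as {"status":"OK","data":{"version":"6.3", ...}}
    curl --fail http://localhost:8080/api/info/version
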
**Setting up your environment** Before you can run the tests manually, you have to define following environment variables: - BASE_URL: Base URL of your Dataverse installation, without trailing slash (e. g. ``http://localhost:8080``) - API_TOKEN: API token of Dataverse installation user with proper rights. You find it in ``dv/bootstrap.exposed.env`` after you started docker compose and the bootstrap process is done. - API_TOKEN_SUPERUSER: Dataverse installation Superuser, for docker setups use the same token as API_TOKEN. - DV_VERSION: The version of the Dataverse instance you run, for example the one used in the docker container from the ``run-tests.sh`` script. Note that in `issue #195 `__, there is a Discussion if this should be changed in the future. .. code-block:: shell export API_TOKEN=**SECRET** export API_TOKEN_SUPERUSER=**SECRET** export BASE_URL=http://localhost:8080 export DV_VERSION=6.3 Instead of running export, you can also save them in a file called ``.env``: .. code-block:: API_TOKEN=**SECRET** API_TOKEN_SUPERUSER=**SECRET** BASE_URL=http://localhost:8080 DV_VERSION=6.3 (Advanced) Note that if you aim to setup your tests in an IDE, you might need to add the variables defined in ``local-test.env`` to your ``.env``, as some IDEs only allow to specify a single env file. **Using pytest** With poetry, tox, and the help of the .env and local-test.env files, you can now run the tests: .. code-block:: shell poetry run env $(cat local-test.env .env | grep -v '^#' | xargs) tox -e py3 Using ``-e py3`` will automatically select your default python version. If you have multiple versions available and want to test some of those, you can replace ``py3`` with, for example, ``py39`` for Python 3.9, ``py311`` for Python 3.11 etc. The tests can also be run directly with `pytest `_ inside your Git clone with: .. code-block:: shell poetry run env $(cat local-test.env .env | grep -v '^#' | xargs) pytest Often it is worth running only a subset of tests first around your changes before running the entire suite. The easiest way to do this is with: .. code-block:: shell pytest tests/path/to/test.py -k regex_matching_test_name **Using Coverage** pyDataverse supports the usage of code coverage to check how much of the code base is covered by tests. For this, `pytest-cov `_ (using `coverage `_) and `coveralls.io `_ is used. You can find the coverage report `here `_. Run tests with ``coverage`` to create ``html`` and ``xml`` reports as an output. Again, call it by ``tox``. This creates the generated docs inside ``docs/coverage_html/``. .. code-block:: shell poetry run tox -e coverage For coveralls, use .. code-block:: shell poetry run tox -e coveralls **Common issues with setting up IDEs** - *Problem:* Some IDEs can only specify one environment file - Solution: Add the variables from local-test.env to your .env file. - *Problem:* Some IDEs can not make use of breakpoints during testing - Explanation: We configured pytest to use pytest-cov, which interferes with breakpoints. - Solution: Add ``PYTEST_ADDOPTS=--no-cov`` to your environment file or your IDE's environment definition. - *Problem:* VSCode cannot launch the debugger for a test - Compare your launch.json entries with this or add this configuration: .. code-block:: json { "name": "Debug Tests", "type": "debugpy", "request": "launch", "program": "${file}", "purpose": ["debug-test"], "justMyCode": false, "env": {"PYTEST_ADDOPTS": "--no-cov"}, "envFile": "${workspaceFolder}/.env" } .. 
_contributing_changes: Contributing your changes to pyDataverse ----------------------------------------- .. _contributing_changes_commit: Committing your code ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Before committing your changes, make sure that: - You are on the right branch - All tests for your change ran successfully - All style and code checks for your change ran successfully (mypy, pylint, flake8) - Keep style fixes to a separate commit to make your pull request more readable Once you’ve made changes, you can see them by typing: .. code-block:: shell git status If you have created a new file, it is not being tracked by git. Add it by typing: .. code-block:: shell git add path/to/file-to-be-added.py Doing ``git status`` again should give something like: .. code-block:: shell # On branch BRANCH_NAME # # modified: relative/path/to/file-you-added.py # Finally, commit your changes to your local repository with an explanatory message. The following defines how a commit message should be structured. Please reference the relevant GitHub issues in your commit message using #1234. - A subject line with < 80 chars. - One blank line. - Optionally, a commit message body. pyDataverse uses a `commit message template `_ to pre-fill the commit message, once you create a commit. We recommend using it for your commit message. Now, commit your changes in your local repository: .. code-block:: shell git commit .. _contributing_changes_push: Pushing your changes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ When you want your changes to appear publicly on your GitHub page, push your forked feature branch’s commits: .. code-block:: shell git push origin BRANCH_NAME Here origin is the default name given to your remote repository on GitHub. You can see the remote repositories: .. code-block:: shell git remote -v If you added the upstream repository as described above you will see something like: .. code-block:: shell origin git@github.com:YOUR_USER_NAME/pyDataverse.git (fetch) origin git@github.com:YOUR_USER_NAME/pyDataverse.git (push) upstream git://github.com/gdcc/pyDataverse.git (fetch) upstream git://github.com/gdcc/pyDataverse.git (push) Now your code is on GitHub, but it is not yet a part of the pyDataverse project. For that to happen, a pull request needs to be submitted on GitHub. .. _contributing_changes_review: Review your code ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ When you’re ready to ask for a code review, file a pull request. Before you do, once again make sure that you have followed all the guidelines outlined in this document regarding code style, tests and documentation. You should also double check your branch changes against the branch it was based on: - Navigate to your repository on GitHub – ``https://github.com/YOUR_USER_NAME/pyDataverse`` - Click on the ``Compare & create pull request`` button for your `BRANCH_NAME` - Select the base and compare branches, if necessary. This will be ``main`` and ``BRANCH_NAME``, respectively. .. _contributing_changes_pull-request: Finally, make the pull request ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If everything looks good, you are ready to make a pull request. A pull request is how code from a local repository becomes available to the GitHub community and can be looked at and eventually merged into the ``main`` version. This pull request and its associated changes will be available in the next release. 
To submit a pull request: - Navigate to your repository on GitHub - Click on the ``Pull Request`` button - You can then click on ``Commits`` and ``Files Changed`` to make sure everything looks okay one last time - Write a description of your changes in the ``Preview Discussion`` tab. A `pull request template `_ is used to pre-fill the description. Follow the explanation in it. - Click ``Send Pull Request``. This request then goes to the repository maintainers, and they will review the code. By using GitHub's @mention system in your Pull Request message, you can ask for feedback from specific people or teams, whether they're down the hall or ten time zones away. Once you send a pull request, we can discuss its potential modifications and even add more commits to it later on. There's an excellent tutorial on how Pull Requests work in the `GitHub Help Center `_. .. _contributing_changes_update-pull-request: Updating your pull request ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Based on the review you get on your pull request, you will probably need to make some changes to the code. In that case, you can make them in your branch, add a new commit to that branch, push it to GitHub, and the pull request will be automatically updated. Pushing them to GitHub again is done by: .. code-block:: shell git push origin BRANCH_NAME This will automatically update your pull request with the latest code and restart the :ref:`Continuous Integration tests `. Another reason you might need to update your pull request is to solve conflicts with changes that have been merged into the ``develop`` branch since you opened your pull request. To do this, you need to “merge upstream develop” in your branch: .. code-block:: shell git checkout BRANCH_NAME git fetch upstream git merge upstream/develop If there are no conflicts (or they could be fixed automatically), a file with a default commit message will open, and you can simply save and quit this file. If there are merge conflicts, you need to solve those conflicts. See, for example, `the GitHub tutorial on merge conflicts `_ for an explanation of how to do this. Once the conflicts are merged and the files where the conflicts were solved are added, you can run ``git commit`` to save those fixes. If you have uncommitted changes at the moment you want to update the branch with ``develop``, you will need to ``stash`` them prior to updating (see the `stash docs `_). This will effectively store your changes and they can be reapplied after updating. After the feature branch has been updated locally, you can now update your pull request by pushing to the branch on GitHub: .. code-block:: shell git push origin BRANCH_NAME .. _contributing_changes_delete-merged-branch: Delete your merged branch (optional) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Once your feature branch is accepted into upstream, you’ll probably want to get rid of the branch. First, merge upstream develop into your branch so git knows it is safe to delete your branch: .. code-block:: shell git fetch upstream git checkout develop git merge upstream/develop Then you can do: .. code-block:: shell git branch -d BRANCH_NAME Make sure you use a lower-case -d, or else git won’t warn you if your feature branch has not actually been merged. The branch will still exist on GitHub, so to delete it there do: .. code-block:: shell git push origin --delete BRANCH_NAME .. 
_contributing_changes_tips: Tips for a successful pull request ----------------------------------------- If you have made it to the :ref:`Review your code ` phase, one of the core contributors may take a look. Please note, however, that a handful of people are responsible for reviewing all of the contributions, which can often lead to bottlenecks. To improve the chances of your pull request being reviewed, you should: - **Reference an open issue** for non-trivial changes to clarify the PR’s purpose - **Ensure you have appropriate tests**. These should be the first part of any PR - **Keep your pull requests as simple as possible**. Larger PRs take longer to review - **Ensure that CI is in a green state**. Reviewers may not even look otherwise - Keep :ref:`updating your pull request `, either by request or every few days .. _contributing_after-pull-request: What happens after the pull request ----------------------------------------- .. _contributing_after-pull-request_review: Reviewing the Pull request ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Once a new issue is created, a maintainer adds `labels `_, an assignee and a `milestone `_ to it. Labels are used to distinguish between issue types and their status, to show the affected module(s) and to prioritize tasks. Also, at least one responsible person for the next steps is assigned, and often a milestone too. The next steps may consist of requests from the assigned person(s) for further work, questions on some changes or the okay for the pull request to be merged. Once all actions are done, including review and documentation, the issue gets closed. The issue then lives on as an open and transparent documentation of the work done. .. _contributing_after-pull-request_create-release: Create a release ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ First, to plan a release, the maintainers: - define which issues are part of it and the version number - create a new milestone for the release (named after the version number) - and assign all selected issues to the milestone Once all issues related to the release are closed (except the ones related to release activities), the release can be created. This includes: - review documentation and code changes - test the release - write release notes - write a release announcement - update version number - tag release name to commit (e.g. ``v0.3.0``), push branch and create pull request - upload to `PyPI `_ and `conda-forge `_ You can find the full release history at :ref:`community_history` and on `GitHub `__. **Versioning** For pyDataverse, `Semantic versioning `_ is used for releases. .. _contributing_resources: pyDataverse-0.3.4/HISTORY.rst000066400000000000000000000017101467256651400157190ustar00rootroot00000000000000.. 
_history: 0.3.1 - (2021-04-06) ---------------------------------------------------------- `Release `_ 0.3.0 - (2021-01-26) - Ruth Wodak ---------------------------------------------------------- `Release `_ 0.2.1 - (2019-06-19) ---------------------------------------------------------- `Release `_ 0.2.0 - (2019-06-18) - Ida Pfeiffer ---------------------------------------------------------- `Release `_ 0.1.1 - (2019-05-28) ---------------------------------------------------------- `Release `_ 0.1.0 - (2019-05-22) - Marietta Blau ---------------------------------------------------------- `Release `_ pyDataverse-0.3.4/LICENSE.txt000066400000000000000000000021431467256651400156500ustar00rootroot00000000000000The MIT License (MIT) ===================== Copyright © 2024 Stefan Kasberger, Jan Range Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. pyDataverse-0.3.4/MANIFEST.in000066400000000000000000000000501467256651400155560ustar00rootroot00000000000000recursive-include pyDataverse/schemas * pyDataverse-0.3.4/README.md000066400000000000000000000072021467256651400153050ustar00rootroot00000000000000[![PyPI](https://img.shields.io/pypi/v/pyDataverse.svg)](https://pypi.org/project/pyDataverse/) [![Conda Version](https://img.shields.io/conda/vn/conda-forge/pydataverse.svg)](https://anaconda.org/conda-forge/pydataverse/) ![Build Status](https://github.com/gdcc/pyDataverse/actions/workflows/test_build.yml/badge.svg) [![Coverage Status](https://coveralls.io/repos/github/gdcc/pyDataverse/badge.svg)](https://coveralls.io/github/gdcc/pyDataverse) [![Documentation Status](https://readthedocs.org/projects/pydataverse/badge/?version=latest)](https://pydataverse.readthedocs.io/en/latest) PyPI - Python Version [![GitHub](https://img.shields.io/github/license/gdcc/pydataverse.svg)](https://opensource.org/licenses/MIT) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4664557.svg)](https://doi.org/10.5281/zenodo.4664557) # pyDataverse [![Project Status: Unsupported – The project has reached a stable, usable state but the author(s) have ceased all work on it. A new maintainer may be desired.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) pyDataverse is a Python module for [Dataverse](http://dataverse.org). It helps to access the Dataverse [API's](http://guides.dataverse.org/en/latest/api/index.html) and manipulate, validate, import and export all Dataverse data-types (Dataverse, Dataset, Datafile). 
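As a minimal sketch of what working with pyDataverse looks like (the URL and token below are placeholders; `NativeApi`, `get_info_version()` and `get_dataverse()` come from the `pyDataverse.api` module), a short session against a running Dataverse installation could look roughly like this:

```python
from pyDataverse.api import NativeApi

# Connect to a Dataverse installation (placeholder URL and API token)
api = NativeApi("https://demo.dataverse.org", "**API_TOKEN**")

# Check that the installation is reachable
print(api.get_info_version().json())

# Retrieve metadata of the root Dataverse collection
print(api.get_dataverse("root").json()["data"]["name"])
```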
**Find out more: [Read the Docs](https://pydataverse.readthedocs.io/en/latest/)** # Running tests In order to run the tests, you need to have a Dataverse instance running. We have prepared a shell script that will start a Dataverse instance using Docker that runs all tests in a clean environment. To run the tests, execute the following command: ```bash # Defaults to Python 3.11 ./run_tests.sh # To run the tests with a specific Python version ./run_tests.sh -p 3.8 ``` Once finished, you can find the test results in the `dv/unit-tests.log` file and in the terminal. ## Manual setup If you want to run single tests you need to manually set up the environment and set up the necessary environment variables. Please follow the instructions below. **1. Start the Dataverse instance** ```bash docker compose \ -f ./docker/docker-compose-base.yml \ --env-file local-test.env \ up -d ``` **2. Set up the environment variables** ```bash export BASE_URL=http://localhost:8080 export DV_VERSION=6.2 # or any other version export $(grep "API_TOKEN" "dv/bootstrap.exposed.env") export API_TOKEN_SUPERUSER=$API_TOKEN ``` **3. Run the test(s) with pytest** ```bash python -m pytest -v ``` ## Chat with us! If you are interested in the development of pyDataverse, we invite you to join us for a chat on our [Zulip Channel](https://dataverse.zulipchat.com/#narrow/stream/377090-python). This is the perfect place to discuss and exchange ideas about the development of pyDataverse. Whether you need help or have ideas to share, feel free to join us! ## PyDataverse Working Group We have formed a [pyDataverse working group](https://py.gdcc.io) to exchange ideas and collaborate on pyDataverse. There is a bi-weekly meeting planned for this purpose, and you are welcome to join us by clicking the following [WebEx meeting link](https://unistuttgart.webex.com/unistuttgart/j.php?MTID=m322473ae7c744792437ce854422e52a3). For a list of all the scheduled dates, please refer to the [Dataverse Community calendar](https://calendar.google.com/calendar/embed?src=c_udn4tonm401kgjjre4jl4ja0cs%40group.calendar.google.com&ctz=America%2FNew_York). pyDataverse-0.3.4/assets/000077500000000000000000000000001467256651400153275ustar00rootroot00000000000000pyDataverse-0.3.4/assets/function_schematic.png000066400000000000000000000673761467256651400217250ustar00rootroot00000000000000PNG  IHDRW08sBIT|d pHYsbۥtEXtSoftwarewww.inkscape.org< IDATxwx>{fwIH REEPTTcs,]_G=* *RE(HL% -mǚffMH./gڳ@r<DQ<+IRgQE@4i!""""""/ظq#0zh<m'""""""kNƍ!uG DDDDDDD΀ 09t"""""""'NDDDDDDЉ:`@'""""""r DDDDDDDN 09t"""""""'NDDDDDDЉ:`@'""""""r DDDDDDDN 09t"""""""'NDDDDDDЉ:`@'""""""r DDDDDDDN 09t"""""""'NDDDDDDЉ:`@'""""""r DDDDDDDN 09t"""""""'NDDDDDDЉ:`@'""""""r DDDDDDDN 09t"""""""'NDDDDDDЉ:`@'""""""r DDDDDDDN 09t"""""""'NDDDDDDЉ:`@'""""""r DDDDDDDN 09t"""""""'NDDDDDDЉ:`@'""""""r DDDDDDDN 09t"""""""'NDDDDDDЉ;d3P0mЕrL uO_ Vu/k^5 @鶶5ۺDDDDDDƀXՃ_O"""""j8]h@ֽ G{ۺDDDDDDW]M}Ǡ9t"""""j8ĝ 09t"""""""'NDDDDDDЉ:`@'""""""r DDDDDDDN 09t"""""""'NDDDDDDЉ:`@'""""""rڶÌ0@*-PYRe)`2RZX$ŀ\q\] Y(+]wYAh5|!{A o% DDDDD&}!xBufE֤d8rQ>ȥEC./\Q:ڠ  T3RQy ]ap>кl?QЉ!3QFR@*ibmZD7^>} 1$phC#e47c,Òw LH s C*l2X7'ITcq>w!zB_7?}""""ٌ]aس g [Z`(BpuMpGh"N&26Ē-1(`/UWœ KF*NsRQrsC*+A @ F zs]ƀNDDDDt1Vr*ms:w9h4ݽ!u6t DWw*700@9KIѕMQeHyY޼ [VA@xL bF P1]%G|]j!C n=:ߛ.q}2Ru%L|0i)2!~J aس=!tipoAd3c5*Xۀ&Fcu8AaIj-;@?` y 7]qpMӻHNϥ[tc}!aط?)=y Q}vCAj_Љ =(U j Ip1sI~x臌OF֯`J= д 'Q-p0k^7IfRhx(& I7ƻnT_pAɂʍ{j 73'ODnn.ݻwKqiBEk׮EMDDDDM$xݐJQ8=olt}0ð{FO|=w_ڊ=:09>ѵ}G}2331|#!!IIIHJJBbb"0e[z ;Dpp00rH >ݺuCHH:u6|W}_l6;b=s{\. 
pyDataverse-0.3.4/assets/
pyDataverse-0.3.4/assets/function_schematic.png [binary PNG image; data omitted — schematic of the data flow between CSV/JSON files, pyDataverse and the Dataverse API]
pyDataverse-0.3.4/assets/function_schematic.svg [SVG schematic; text labels: pyDataverse, CSV, JSON, DataverseAPI, JSON]
pyDataverse-0.3.4/assets/models_export.svg [SVG schematic of export paths; text labels: CSV, JSON, XML, Dict, pyDataverse; DDI, DSpace, Templates, Custom, Dataverse Upload and Dataverse Download (Default or Custom, each with Mapping/Schema from pyDataverse or user); to_json(), to_xml(), to_csv(), dict(), to_xml_dir(), to_oais_dir(), to_bagit_dir(); DIR: JSON, DIR: XML, OAISTree, BagIt]
pyDataverse-0.3.4/assets/models_import.svg [SVG schematic of import paths; same layout with from_json(), from_xml(), from_csv(), set(), from_xml_dir(), from_oais_dir(), from_bagit_dir()]
pyDataverse-0.3.4/docker/
pyDataverse-0.3.4/docker/docker-compose-base.yml
version: "2.4"
name: pydataverse
services:
  dataverse:
    container_name: "dataverse"
    hostname: dataverse
    image: ${DATAVERSE_IMAGE}
    restart: on-failure
    user: payara
    environment:
      - DATAVERSE_DB_HOST=postgres
      - DATAVERSE_DB_USER=${DATAVERSE_DB_USER}
      - DATAVERSE_DB_PASSWORD=${DATAVERSE_DB_PASSWORD}
      - JVM_ARGS=-Ddataverse.pid.providers=fake -Ddataverse.pid.default-provider=fake -Ddataverse.pid.fake.type=FAKE -Ddataverse.pid.fake.label=FakeDOIProvider -Ddataverse.pid.fake.authority=10.5072 -Ddataverse.pid.fake.shoulder=FK2/
    ports:
      - "8080:8080"
    networks:
      - dataverse
    depends_on:
      postgres:
        condition: service_started
      solr:
        condition: service_started
      dv_initializer:
        condition: service_completed_successfully
    volumes:
      - ${PWD}/dv/data:/dv
      - ${PWD}:/secrets
    tmpfs:
      - /dumps:mode=770,size=2052M,uid=1000,gid=1000
      - /tmp:mode=770,size=2052M,uid=1000,gid=1000
    mem_limit: 2147483648 # 2 GiB
    mem_reservation: 1024m
    privileged: false
    healthcheck:
      test: curl --fail http://dataverse:8080/api/info/version || exit 1
      interval: 10s
      retries: 20
      start_period: 20s
      timeout: 240s
dv_initializer: container_name: "dv_initializer" image: ${CONFIGBAKER_IMAGE} restart: "no" command: - sh - -c - "fix-fs-perms.sh dv" volumes: - ${PWD}/dv/data:/dv postgres: container_name: "postgres" hostname: postgres image: postgres:${POSTGRES_VERSION} restart: on-failure environment: - POSTGRES_USER=${DATAVERSE_DB_USER} - POSTGRES_PASSWORD=${DATAVERSE_DB_PASSWORD} ports: - "5432:5432" networks: - dataverse solr_initializer: container_name: "solr_initializer" image: ${CONFIGBAKER_IMAGE} restart: "no" command: - sh - -c - "fix-fs-perms.sh solr && cp -a /template/* /solr-template" volumes: - ${PWD}/solr/data:/var/solr - ${PWD}/solr/conf:/solr-template solr: container_name: "solr" hostname: "solr" image: solr:${SOLR_VERSION} depends_on: solr_initializer: condition: service_completed_successfully restart: on-failure ports: - "8983:8983" networks: - dataverse command: - "solr-precreate" - "collection1" - "/template" volumes: - ${PWD}/solr/data:/var/solr - ${PWD}/solr/conf:/template smtp: container_name: "smtp" hostname: "smtp" image: maildev/maildev:2.0.5 restart: on-failure expose: - "25" # smtp server environment: - MAILDEV_SMTP_PORT=25 - MAILDEV_MAIL_DIRECTORY=/mail networks: - dataverse tmpfs: - /mail:mode=770,size=128M,uid=1000,gid=1000 bootstrap: container_name: "bootstrap" hostname: "bootstrap" image: ${CONFIGBAKER_IMAGE} restart: "no" networks: - dataverse volumes: - ${PWD}/dv/bootstrap.exposed.env:/.env command: - sh - -c - "bootstrap.sh -e /.env dev" depends_on: dataverse: condition: service_healthy networks: dataverse: driver: bridge pyDataverse-0.3.4/docker/docker-compose-test-all.yml000066400000000000000000000014361467256651400224600ustar00rootroot00000000000000version: "2.4" services: unit-tests: container_name: unit-tests image: python:${PYTHON_VERSION}-slim environment: BASE_URL: http://dataverse:8080 DV_VERSION: 6.3 networks: - dataverse volumes: - ${PWD}:/pydataverse - ../dv:/dv command: - sh - -c - | # Fetch the API Token from the local file export $(grep "API_TOKEN" "dv/bootstrap.exposed.env") export API_TOKEN_SUPERUSER=$$API_TOKEN cd /pydataverse # Run the unit tests python3 -m pip install --upgrade pip python3 -m pip install pytest pytest-cov python3 -m pip install -e . python3 -m pytest > /dv/unit-tests.log depends_on: bootstrap: condition: service_completed_successfully pyDataverse-0.3.4/local-test.env000066400000000000000000000003401467256651400166030ustar00rootroot00000000000000# Dataverse DATAVERSE_IMAGE=docker.io/gdcc/dataverse:unstable DATAVERSE_DB_USER=dataverse DATAVERSE_DB_PASSWORD=secret CONFIGBAKER_IMAGE=docker.io/gdcc/configbaker:unstable # Services POSTGRES_VERSION=15 SOLR_VERSION=9.3.0 pyDataverse-0.3.4/pyDataverse/000077500000000000000000000000001467256651400163145ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/__init__.py000066400000000000000000000007761467256651400204370ustar00rootroot00000000000000"""Find out more at https://github.com/GDCC/pyDataverse. Copyright 2019 Stefan Kasberger Licensed under the MIT License. """ from __future__ import absolute_import __author__ = "Stefan Kasberger" __email__ = "stefan.kasberger@univie.ac.at" __copyright__ = "Copyright (c) 2019 Stefan Kasberger" __license__ = "MIT License" __version__ = "0.3.4" __url__ = "https://github.com/GDCC/pyDataverse" __download_url__ = "https://pypi.python.org/pypi/pyDataverse" __description__ = "A Python module for Dataverse." 
pyDataverse-0.3.4/pyDataverse/api.py000066400000000000000000002621701467256651400174470ustar00rootroot00000000000000"""Dataverse API wrapper for all it's API's.""" import json from typing import Any, Dict, Optional import httpx import subprocess as sp from warnings import warn from httpx import ConnectError, Response from pyDataverse.auth import ApiTokenAuth from pyDataverse.exceptions import ( ApiAuthorizationError, ApiUrlError, DatasetNotFoundError, DataverseNotEmptyError, DataverseNotFoundError, OperationFailedError, ) DEPRECATION_GUARD = object() class Api: """Base class. Parameters ---------- base_url : str Base URL of Dataverse instance. Without trailing `/` at the end. e.g. `http://demo.dataverse.org` api_token : str Authentication token for the api. Attributes ---------- base_url api_token dataverse_version """ def __init__( self, base_url: str, api_token: Optional[str] = None, api_version: str = "latest", *, auth: Optional[httpx.Auth] = None, ): """Init an Api() class. Scheme, host and path combined create the base-url for the api. See more about URL at `Wikipedia `_. Parameters ---------- base_url : str Base url for Dataverse api. api_token : str | None API token for Dataverse API. If you provide an :code:`api_token`, we assume it is an API token as retrieved via your Dataverse instance user profile. We recommend using the :code:`auth` argument instead. To retain the current behaviour with the :code:`auth` argument, change .. code-block:: python Api("https://demo.dataverse.org", "my_token") to .. code-block:: python from pyDataverse.auth import ApiTokenAuth Api("https://demo.dataverse.org", auth=ApiTokenAuth("my_token")) If you are using an OIDC/OAuth 2.0 Bearer token, please use the :code:`auth` parameter with the :py:class:`.auth.BearerTokenAuth`. api_version : str The version string of the Dataverse API or :code:`latest`, e.g., :code:`v1`. Defaults to :code:`latest`, which drops the version from the API urls. auth : httpx.Auth | None You can provide any authentication mechanism you like to connect to your Dataverse instance. The most common mechanisms are implemented in :py:mod:`.auth`, but if one is missing, you can use your own `httpx.Auth`-compatible class. For more information, have a look at `httpx' Authentication docs `_. Examples ------- Create an API connection:: .. code-block:: >>> from pyDataverse.api import Api >>> base_url = 'http://demo.dataverse.org' >>> api = Api(base_url) .. code-block:: >>> from pyDataverse.api import Api >>> from pyDataverse.auth import ApiTokenAuth >>> base_url = 'http://demo.dataverse.org' >>> api = Api(base_url, ApiTokenAuth('my_api_token')) """ if not isinstance(base_url, str): raise ApiUrlError("base_url {0} is not a string.".format(base_url)) self.base_url = base_url.rstrip("/") self.client = None if not isinstance(api_version, str): raise ApiUrlError("api_version {0} is not a string.".format(api_version)) self.api_version = api_version self.auth = auth self.api_token = api_token if api_token is not None: if auth is None: self.auth = ApiTokenAuth(api_token) else: self.api_token = None warn( UserWarning( "You provided both, an api_token and a custom auth " "method. We will only use the auth method." ) ) if self.api_version == "latest": self.base_url_api = "{0}/api".format(self.base_url) else: self.base_url_api = "{0}/api/{1}".format(self.base_url, self.api_version) self.timeout = 500 def __str__(self): """Return the class name and URL of the used API class. Returns ------- str Naming of the API class. 
""" return f"{self.__class__.__name__}: {self.base_url_api}" def get_request(self, url, params=None, auth=DEPRECATION_GUARD): """Make a GET request. Parameters ---------- url : str Full URL. params : dict Dictionary of parameters to be passed with the request. Defaults to `None`. auth : Any .. deprecated:: 0.3.4 The auth parameter was ignored before version 0.3.4. Please pass your auth to the Api instance directly, as explained in :py:func:`Api.__init__`. If you need multiple auth methods, create multiple API instances: .. code-block:: python api = Api("https://demo.dataverse.org", auth=ApiTokenAuth("my_api_token")) api_oauth = Api("https://demo.dataverse.org", auth=BearerTokenAuth("my_bearer_token")) Returns ------- httpx.Response Response object of httpx library. """ if auth is not DEPRECATION_GUARD: warn( DeprecationWarning( "The auth parameter is deprecated. Please pass your auth " "arguments to the __init__ method instead." ) ) headers = {} headers["User-Agent"] = "pydataverse" if self.client is None: return self._sync_request( method=httpx.get, url=url, headers=headers, params=params, ) else: return self._async_request( method=self.client.get, url=url, headers=headers, params=params, ) def post_request( self, url, data=None, auth=DEPRECATION_GUARD, params=None, files=None ): """Make a POST request. params will be added as key-value pairs to the URL. Parameters ---------- url : str Full URL. data : str Metadata as a json-formatted string. Defaults to `None`. auth : Any .. deprecated:: 0.3.4 The auth parameter was ignored before version 0.3.4. Please pass your auth to the Api instance directly, as explained in :py:func:`Api.__init__`. If you need multiple auth methods, create multiple API instances: .. code-block:: python api = Api("https://demo.dataverse.org", auth=ApiTokenAuth("my_api_token")) api_oauth = Api("https://demo.dataverse.org", auth=BearerTokenAuth("my_bearer_token")) files : dict e.g. :code:`files={'file': open('sample_file.txt','rb')}` params : dict Dictionary of parameters to be passed with the request. Defaults to :code:`None`. Returns ------- httpx.Response Response object of httpx library. """ if auth is not DEPRECATION_GUARD: warn( DeprecationWarning( "The auth parameter is deprecated. Please pass your auth " "arguments to the __init__ method instead." ) ) headers = {} headers["User-Agent"] = "pydataverse" if isinstance(data, str): data = json.loads(data) # Decide whether to use 'data' or 'json' args request_params = self._check_json_data_form(data) if self.client is None: return self._sync_request( method=httpx.post, url=url, headers=headers, params=params, files=files, **request_params, ) else: return self._async_request( method=self.client.post, url=url, headers=headers, params=params, files=files, **request_params, ) def put_request(self, url, data=None, auth=DEPRECATION_GUARD, params=None): """Make a PUT request. Parameters ---------- url : str Full URL. data : str Metadata as a json-formatted string. Defaults to `None`. auth : Any .. deprecated:: 0.3.4 The auth parameter was ignored before version 0.3.4. Please pass your auth to the Api instance directly, as explained in :py:func:`Api.__init__`. If you need multiple auth methods, create multiple API instances: .. code-block:: python api = Api("https://demo.dataverse.org", auth=ApiTokenAuth("my_api_token")) api_oauth = Api("https://demo.dataverse.org", auth=BearerTokenAuth("my_bearer_token")) params : dict Dictionary of parameters to be passed with the request. Defaults to `None`. 
Returns ------- httpx.Response Response object of httpx library. """ if auth is not DEPRECATION_GUARD: warn( DeprecationWarning( "The auth parameter is deprecated. Please pass your auth " "arguments to the __init__ method instead." ) ) headers = {} headers["User-Agent"] = "pydataverse" if isinstance(data, str): data = json.loads(data) # Decide whether to use 'data' or 'json' args request_params = self._check_json_data_form(data) if self.client is None: return self._sync_request( method=httpx.put, url=url, json=data, headers=headers, params=params, **request_params, ) else: return self._async_request( method=self.client.put, url=url, json=data, headers=headers, params=params, **request_params, ) def delete_request(self, url, auth=DEPRECATION_GUARD, params=None): """Make a Delete request. Parameters ---------- url : str Full URL. auth : Any .. deprecated:: 0.3.4 The auth parameter was ignored before version 0.3.4. Please pass your auth to the Api instance directly, as explained in :py:func:`Api.__init__`. If you need multiple auth methods, create multiple API instances: .. code-block:: python api = Api("https://demo.dataverse.org", auth=ApiTokenAuth("my_api_token")) api_oauth = Api("https://demo.dataverse.org", auth=BearerTokenAuth("my_bearer_token")) params : dict Dictionary of parameters to be passed with the request. Defaults to `None`. Returns ------- httpx.Response Response object of httpx library. """ if auth is not DEPRECATION_GUARD: warn( DeprecationWarning( "The auth parameter is deprecated. Please pass your auth " "arguments to the __init__ method instead." ) ) headers = {} headers["User-Agent"] = "pydataverse" if self.client is None: return self._sync_request( method=httpx.delete, url=url, headers=headers, params=params, ) else: return self._async_request( method=self.client.delete, url=url, headers=headers, params=params, ) @staticmethod def _check_json_data_form(data: Optional[Dict]): """This method checks and distributes given payload to match Dataverse expectations. In the case of the form-data keyed by "jsonData", Dataverse expects the payload as a string in a form of a dictionary. This is not possible using HTTPXs json parameter, so we need to handle this case separately. """ if not data: return {} elif not isinstance(data, dict): raise ValueError("Data must be a dictionary.") elif "jsonData" not in data: return {"json": data} assert list(data.keys()) == [ "jsonData" ], "jsonData must be the only key in the dictionary." # Content of JSON data should ideally be a string content = data["jsonData"] if not isinstance(content, str): data["jsonData"] = json.dumps(content) return {"data": data} def _sync_request( self, method, **kwargs, ): """ Sends a synchronous request to the specified URL using the specified HTTP method. Args: method (function): The HTTP method to use for the request. **kwargs: Additional keyword arguments to be passed to the method. Returns: httpx.Response: The response object returned by the request. Raises: ApiAuthorizationError: If the response status code is 401 (Authorization error). ConnectError: If a connection to the API cannot be established. """ assert "url" in kwargs, "URL is required for a request." kwargs = self._filter_kwargs(kwargs) try: resp: httpx.Response = method( **kwargs, auth=self.auth, follow_redirects=True, timeout=None ) if resp.status_code == 401: try: error_msg = resp.json()["message"] except json.JSONDecodeError: error_msg = resp.reason_phrase raise ApiAuthorizationError( "ERROR: HTTP 401 - Authorization error {0}. 
MSG: {1}".format( kwargs["url"], error_msg ) ) return resp except ConnectError: raise ConnectError( "ERROR - Could not establish connection to api '{0}'.".format( kwargs["url"] ) ) async def _async_request( self, method, **kwargs, ): """ Sends an asynchronous request to the specified URL using the specified HTTP method. Args: method (callable): The HTTP method to use for the request. **kwargs: Additional keyword arguments to be passed to the method. Raises: ApiAuthorizationError: If the response status code is 401 (Authorization error). ConnectError: If a connection to the API cannot be established. Returns: The response object. """ assert "url" in kwargs, "URL is required for a request." kwargs = self._filter_kwargs(kwargs) try: resp = await method(**kwargs, auth=self.auth) if resp.status_code == 401: error_msg = resp.json()["message"] raise ApiAuthorizationError( "ERROR: HTTP 401 - Authorization error {0}. MSG: {1}".format( kwargs["url"], error_msg ) ) return resp except ConnectError: raise ConnectError( "ERROR - Could not establish connection to api '{0}'.".format( kwargs["url"] ) ) @staticmethod def _filter_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]: """ Filters out any keyword arguments that are `None` from the specified dictionary. Args: kwargs (Dict[str, Any]): The dictionary to filter. Returns: Dict[str, Any]: The filtered dictionary. """ return {k: v for k, v in kwargs.items() if v is not None} async def __aenter__(self): """ Context manager method that initializes an instance of httpx.AsyncClient. Returns: httpx.AsyncClient: An instance of httpx.AsyncClient. """ self.client = httpx.AsyncClient() async def __aexit__(self, exc_type, exc_value, traceback): """ Closes the client connection when exiting a context manager. Args: exc_type (type): The type of the exception raised, if any. exc_value (Exception): The exception raised, if any. traceback (traceback): The traceback object associated with the exception, if any. """ await self.client.aclose() self.client = None class DataAccessApi(Api): """Class to access Dataverse's Data Access API. Attributes ---------- base_url_api_data_access : type Description of attribute `base_url_api_data_access`. base_url : type Description of attribute `base_url`. """ def __init__(self, base_url, api_token=None, *, auth=None): """Init an DataAccessApi() class.""" super().__init__(base_url, api_token, auth=auth) if base_url: self.base_url_api_data_access = "{0}/access".format(self.base_url_api) else: self.base_url_api_data_access = self.base_url_api def get_datafile( self, identifier, data_format=None, no_var_header=None, image_thumb=None, is_pid=True, auth=DEPRECATION_GUARD, ): """Download a datafile via the Dataverse Data Access API. Get by file id (HTTP Request). .. code-block:: bash GET /api/access/datafile/$id Get by persistent identifier (HTTP Request). .. code-block:: bash GET http://$SERVER/api/access/datafile/:persistentId/?persistentId=doi:10.5072/FK2/J8SJZB Parameters ---------- identifier : str Identifier of the datafile. Can be datafile id or persistent identifier of the datafile (e. g. doi). is_pid : bool ``True`` to use persistent identifier. ``False``, if not. Returns ------- httpx.Response Response object of httpx library. """ is_first_param = True if is_pid: url = "{0}/datafile/:persistentId/?persistentId={1}".format( self.base_url_api_data_access, identifier ) else: url = "{0}/datafile/{1}".format(self.base_url_api_data_access, identifier) if data_format or no_var_header or image_thumb: url += "?" 
if data_format: url += "format={0}".format(data_format) is_first_param = False if no_var_header: if not is_first_param: url += "&" url += "noVarHeader={0}".format(no_var_header) is_first_param = False if image_thumb: if not is_first_param: url += "&" url += "imageThumb={0}".format(image_thumb) return self.get_request(url, auth=auth) def get_datafiles(self, identifier, data_format=None, auth=DEPRECATION_GUARD): """Download a datafile via the Dataverse Data Access API. Get by file id (HTTP Request). .. code-block:: bash GET /api/access/datafiles/$id1,$id2,...$idN Get by persistent identifier (HTTP Request). Parameters ---------- identifier : str Identifier of the dataset. Can be datafile id or persistent identifier of the datafile (e. g. doi). Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/datafiles/{1}".format(self.base_url_api_data_access, identifier) if data_format: url += "?format={0}".format(data_format) return self.get_request(url, auth=auth) def get_datafile_bundle( self, identifier, file_metadata_id=None, auth=DEPRECATION_GUARD ): """Download a datafile in all its formats. HTTP Request: .. code-block:: bash GET /api/access/datafile/bundle/$id Data Access API calls can now be made using persistent identifiers (in addition to database ids). This is done by passing the constant :persistentId where the numeric id of the file is expected, and then passing the actual persistent id as a query parameter with the name persistentId. This is a convenience packaging method available for tabular data files. It returns a zipped bundle that contains the data in the following formats: - Tab-delimited; - “Saved Original”, the proprietary (SPSS, Stata, R, etc.) file from which the tabular data was ingested; - Generated R Data frame (unless the “original” above was in R); - Data (Variable) metadata record, in DDI XML; - File citation, in Endnote and RIS formats. Parameters ---------- identifier : str Identifier of the dataset. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/datafile/bundle/{1}".format( self.base_url_api_data_access, identifier ) if file_metadata_id: url += "?fileMetadataId={0}".format(file_metadata_id) return self.get_request(url, auth=auth) def request_access(self, identifier, auth=DEPRECATION_GUARD, is_filepid=False): """Request datafile access. This method requests access to the datafile whose id is passed on the behalf of an authenticated user whose key is passed. Note that not all datasets allow access requests to restricted files. https://guides.dataverse.org/en/4.18.1/api/dataaccess.html#request-access /api/access/datafile/$id/requestAccess curl -H "X-Dataverse-key:$API_TOKEN" -X PUT http://$SERVER/api/access/datafile/{id}/requestAccess """ if is_filepid: url = "{0}/datafile/:persistentId/requestAccess?persistentId={1}".format( self.base_url_api_data_access, identifier ) else: url = "{0}/datafile/{1}/requestAccess".format( self.base_url_api_data_access, identifier ) return self.put_request(url, auth=auth) def allow_access_request( self, identifier, do_allow=True, auth=DEPRECATION_GUARD, is_pid=True ): """Allow access request for datafiles. 
https://guides.dataverse.org/en/latest/api/dataaccess.html#allow-access-requests curl -H "X-Dataverse-key:$API_TOKEN" -X PUT -d true http://$SERVER/api/access/{id}/allowAccessRequest curl -H "X-Dataverse-key:$API_TOKEN" -X PUT -d true http://$SERVER/api/access/:persistentId/allowAccessRequest?persistentId={pid} """ if is_pid: url = "{0}/:persistentId/allowAccessRequest?persistentId={1}".format( self.base_url_api_data_access, identifier ) else: url = "{0}/{1}/allowAccessRequest".format( self.base_url_api_data_access, identifier ) if do_allow: data = "true" else: data = "false" return self.put_request(url, data=data, auth=auth) def grant_file_access(self, identifier, user, auth=DEPRECATION_GUARD): """Grant datafile access. https://guides.dataverse.org/en/4.18.1/api/dataaccess.html#grant-file-access curl -H "X-Dataverse-key:$API_TOKEN" -X PUT http://$SERVER/api/access/datafile/{id}/grantAccess/{@userIdentifier} """ url = "{0}/datafile/{1}/grantAccess/{2}".format( self.base_url_api_data_access, identifier, user ) return self.put_request(url, auth=auth) def list_file_access_requests(self, identifier, auth=DEPRECATION_GUARD): """Liste datafile access requests. https://guides.dataverse.org/en/4.18.1/api/dataaccess.html#list-file-access-requests curl -H "X-Dataverse-key:$API_TOKEN" -X GET http://$SERVER/api/access/datafile/{id}/listRequests """ url = "{0}/datafile/{1}/listRequests".format( self.base_url_api_data_access, identifier ) return self.get_request(url, auth=auth) class MetricsApi(Api): """Class to access Dataverse's Metrics API. Attributes ---------- base_url_api_metrics : type Description of attribute `base_url_api_metrics`. base_url : type Description of attribute `base_url`. """ def __init__(self, base_url, api_token=None, api_version="latest", *, auth=None): """Init an MetricsApi() class.""" super().__init__(base_url, api_token, api_version, auth=auth) if base_url: self.base_url_api_metrics = "{0}/api/info/metrics".format(self.base_url) else: self.base_url_api_metrics = None def total(self, data_type, date_str=None, auth=DEPRECATION_GUARD): """ GET https://$SERVER/api/info/metrics/$type GET https://$SERVER/api/info/metrics/$type/toMonth/$YYYY-DD $type can be set to dataverses, datasets, files or downloads. """ url = "{0}/{1}".format(self.base_url_api_metrics, data_type) if date_str: url += "/toMonth/{0}".format(date_str) return self.get_request(url, auth=auth) def past_days(self, data_type, days_str, auth=DEPRECATION_GUARD): """ http://guides.dataverse.org/en/4.18.1/api/metrics.html GET https://$SERVER/api/info/metrics/$type/pastDays/$days $type can be set to dataverses, datasets, files or downloads. """ # TODO: check if date-string has proper format url = "{0}/{1}/pastDays/{2}".format( self.base_url_api_metrics, data_type, days_str ) return self.get_request(url, auth=auth) def get_dataverses_by_subject(self, auth=DEPRECATION_GUARD): """ GET https://$SERVER/api/info/metrics/dataverses/bySubject $type can be set to dataverses, datasets, files or downloads. """ # TODO: check if date-string has proper format url = "{0}/dataverses/bySubject".format(self.base_url_api_metrics) return self.get_request(url, auth=auth) def get_dataverses_by_category(self, auth=DEPRECATION_GUARD): """ GET https://$SERVER/api/info/metrics/dataverses/byCategory $type can be set to dataverses, datasets, files or downloads. 
""" # TODO: check if date-string has proper format url = "{0}/dataverses/byCategory".format(self.base_url_api_metrics) return self.get_request(url, auth=auth) def get_datasets_by_subject(self, date_str=None, auth=DEPRECATION_GUARD): """ GET https://$SERVER/api/info/metrics/datasets/bySubject $type can be set to dataverses, datasets, files or downloads. """ # TODO: check if date-string has proper format url = "{0}/datasets/bySubject".format(self.base_url_api_metrics) if date_str: url += "/toMonth/{0}".format(date_str) return self.get_request(url, auth=auth) def get_datasets_by_data_location(self, data_location, auth=DEPRECATION_GUARD): """ GET https://$SERVER/api/info/metrics/datasets/?dataLocation=$location $type can be set to dataverses, datasets, files or downloads. """ # TODO: check if date-string has proper format url = "{0}/datasets/?dataLocation={1}".format( self.base_url_api_metrics, data_location ) return self.get_request(url, auth=auth) class NativeApi(Api): """Class to access Dataverse's Native API. Parameters ---------- base_url : type Description of parameter `base_url`. api_token : type Description of parameter `api_token`. api_version : type Description of parameter `api_version`. Attributes ---------- base_url_api_native : type Description of attribute `base_url_api_native`. base_url_api : type Description of attribute `base_url_api`. """ def __init__(self, base_url: str, api_token=None, api_version="v1", *, auth=None): """Init an Api() class. Scheme, host and path combined create the base-url for the api. See more about URL at `Wikipedia `_. Parameters ---------- native_api_version : str API version of Dataverse native API. Default is `v1`. """ super().__init__(base_url, api_token, api_version, auth=auth) self.base_url_api_native = self.base_url_api def get_dataverse(self, identifier, auth=DEPRECATION_GUARD): """Get dataverse metadata by alias or id. View metadata about a dataverse. .. code-block:: bash GET http://$SERVER/api/dataverses/$id Parameters ---------- identifier : str Can either be a dataverse id (long), a dataverse alias (more robust), or the special value ``:root``. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/dataverses/{1}".format(self.base_url_api_native, identifier) return self.get_request(url, auth=auth) def create_dataverse( self, parent: str, metadata: str, auth: bool = True ) -> Response: """Create a dataverse. Generates a new dataverse under identifier. Expects a JSON content describing the dataverse. HTTP Request: .. code-block:: bash POST http://$SERVER/api/dataverses/$id Download the `dataverse.json `_ example file and modify to create dataverses to suit your needs. The fields name, alias, and dataverseContacts are required. Status Codes: 200: dataverse created 201: dataverse created Parameters ---------- parent : str Parent dataverse, to which the Dataverse gets attached to. metadata : str Metadata of the Dataverse. auth : bool True if api authorization is necessary. Defaults to ``True``. Returns ------- httpx.Response Response object of httpx library. """ metadata_dict = json.loads(metadata) identifier = metadata_dict["alias"] url = "{0}/dataverses/{1}".format(self.base_url_api_native, parent) resp = self.post_request(url, metadata, auth) if resp.status_code == 404: error_msg = resp.json()["message"] raise DataverseNotFoundError( "ERROR: HTTP 404 - Dataverse {0} was not found. 
MSG: {1}".format( parent, error_msg ) ) elif resp.status_code != 200 and resp.status_code != 201: error_msg = resp.json()["message"] raise OperationFailedError( "ERROR: HTTP {0} - Dataverse {1} could not be created. MSG: {2}".format( resp.status_code, identifier, error_msg ) ) else: print("Dataverse {0} created.".format(identifier)) return resp def publish_dataverse(self, identifier, auth=True): """Publish a dataverse. Publish the Dataverse pointed by identifier, which can either by the dataverse alias or its numerical id. HTTP Request: .. code-block:: bash POST http://$SERVER/api/dataverses/$identifier/actions/:publish Status Code: 200: Dataverse published Parameters ---------- identifier : str Can either be a dataverse id (long) or a dataverse alias (more robust). auth : bool True if api authorization is necessary. Defaults to ``False``. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/dataverses/{1}/actions/:publish".format( self.base_url_api_native, identifier ) resp = self.post_request(url, auth=auth) if resp.status_code == 401: error_msg = resp.json()["message"] raise ApiAuthorizationError( "ERROR: HTTP 401 - Publish Dataverse {0} unauthorized. MSG: {1}".format( identifier, error_msg ) ) elif resp.status_code == 404: error_msg = resp.json()["message"] raise DataverseNotFoundError( "ERROR: HTTP 404 - Dataverse {0} was not found. MSG: {1}".format( identifier, error_msg ) ) elif resp.status_code != 200: error_msg = resp.json()["message"] raise OperationFailedError( "ERROR: HTTP {0} - Dataverse {1} could not be published. MSG: {2}".format( resp.status_code, identifier, error_msg ) ) elif resp.status_code == 200: print("Dataverse {0} published.".format(identifier)) return resp def delete_dataverse(self, identifier, auth=True): """Delete dataverse by alias or id. Status Code: 200: Dataverse deleted Parameters ---------- identifier : str Can either be a dataverse id (long) or a dataverse alias (more robust). Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/dataverses/{1}".format(self.base_url_api_native, identifier) resp = self.delete_request(url, auth) if resp.status_code == 401: error_msg = resp.json()["message"] raise ApiAuthorizationError( "ERROR: HTTP 401 - Delete Dataverse {0} unauthorized. MSG: {1}".format( identifier, error_msg ) ) elif resp.status_code == 404: error_msg = resp.json()["message"] raise DataverseNotFoundError( "ERROR: HTTP 404 - Dataverse {0} was not found. MSG: {1}".format( identifier, error_msg ) ) elif resp.status_code == 403: error_msg = resp.json()["message"] raise DataverseNotEmptyError( "ERROR: HTTP 403 - Dataverse {0} not empty. MSG: {1}".format( identifier, error_msg ) ) elif resp.status_code != 200: error_msg = resp.json()["message"] raise OperationFailedError( "ERROR: HTTP {0} - Dataverse {1} could not be deleted. MSG: {2}".format( resp.status_code, identifier, error_msg ) ) elif resp.status_code == 200: print("Dataverse {0} deleted.".format(identifier)) return resp def get_dataverse_roles(self, identifier: str, auth: bool = False) -> Response: """All the roles defined directly in the dataverse by identifier. `Docs `_ .. code-block:: bash GET http://$SERVER/api/dataverses/$id/roles Parameters ---------- identifier : str Can either be a dataverse id (long), a dataverse alias (more robust), or the special value ``:root``. Returns ------- httpx.Response Response object of httpx library. 
""" url = "{0}/dataverses/{1}/roles".format(self.base_url_api_native, identifier) return self.get_request(url, auth=auth) def get_dataverse_contents(self, identifier, auth=True): """Gets contents of Dataverse. Parameters ---------- identifier : str Can either be a dataverse id (long), a dataverse alias (more robust), or the special value ``:root``. auth : bool Description of parameter `auth` (the default is False). Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/dataverses/{1}/contents".format(self.base_url_api_native, identifier) return self.get_request(url, auth=auth) def get_dataverse_assignments(self, identifier, auth=DEPRECATION_GUARD): """Get dataverse assignments by alias or id. View assignments of a dataverse. .. code-block:: bash GET http://$SERVER/api/dataverses/$id/assignments Parameters ---------- identifier : str Can either be a dataverse id (long), a dataverse alias (more robust), or the special value ``:root``. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/dataverses/{1}/assignments".format( self.base_url_api_native, identifier ) return self.get_request(url, auth=auth) def get_dataverse_facets(self, identifier, auth=DEPRECATION_GUARD): """Get dataverse facets by alias or id. View facets of a dataverse. .. code-block:: bash GET http://$SERVER/api/dataverses/$id/facets Parameters ---------- identifier : str Can either be a dataverse id (long), a dataverse alias (more robust), or the special value ``:root``. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/dataverses/{1}/facets".format(self.base_url_api_native, identifier) return self.get_request(url, auth=auth) def dataverse_id2alias(self, dataverse_id, auth=DEPRECATION_GUARD): """Converts a Dataverse ID to an alias. Parameters ---------- dataverse_id : str Dataverse ID. Returns ------- str Dataverse alias """ resp = self.get_dataverse(dataverse_id, auth=auth) if "data" in resp.json(): if "alias" in resp.json()["data"]: return resp.json()["data"]["alias"] print("ERROR: Can not resolve Dataverse ID to alias.") return False def get_dataset(self, identifier, version=":latest", auth=True, is_pid=True): """Get metadata of a Dataset. With Dataverse identifier: .. code-block:: bash GET http://$SERVER/api/datasets/$identifier With persistent identifier: .. code-block:: bash GET http://$SERVER/api/datasets/:persistentId/?persistentId=$id GET http://$SERVER/api/datasets/:persistentId/ ?persistentId=$pid Parameters ---------- identifier : str Identifier of the dataset. Can be a Dataverse identifier or a persistent identifier (e.g. ``doi:10.11587/8H3N93``). is_pid : bool True, if identifier is a persistent identifier. version : str Version to be retrieved: ``:latest-published``: the latest published version ``:latest``: either a draft (if exists) or the latest published version. ``:draft``: the draft version, if any ``x.y``: x.y a specific version, where x is the major version number and y is the minor version number. ``x``: same as x.0 Returns ------- httpx.Response Response object of httpx library. """ if is_pid: # TODO: Add version to query http://guides.dataverse.org/en/4.18.1/api/native-api.html#get-json-representation-of-a-dataset url = "{0}/datasets/:persistentId/?persistentId={1}".format( self.base_url_api_native, identifier ) else: url = "{0}/datasets/{1}".format(self.base_url_api_native, identifier) # CHECK: Its not really clear, if the version query can also be done via ID. 
return self.get_request(url, auth=auth) def get_dataset_versions(self, identifier, auth=True, is_pid=True): """Get versions of a Dataset. With Dataverse identifier: .. code-block:: bash GET http://$SERVER/api/datasets/$identifier/versions With persistent identifier: .. code-block:: bash GET http://$SERVER/api/datasets/:persistentId/versions?persistentId=$id Parameters ---------- identifier : str Identifier of the dataset. Can be a Dataverse identifier or a persistent identifier (e.g. ``doi:10.11587/8H3N93``). is_pid : bool True, if identifier is a persistent identifier. Returns ------- httpx.Response Response object of httpx library. """ if is_pid: url = "{0}/datasets/:persistentId/versions?persistentId={1}".format( self.base_url_api_native, identifier ) else: url = "{0}/datasets/{1}/versions".format( self.base_url_api_native, identifier ) return self.get_request(url, auth=auth) def get_dataset_version(self, identifier, version, auth=True, is_pid=True): """Get version of a Dataset. With Dataverse identifier: .. code-block:: bash GET http://$SERVER/api/datasets/$identifier/versions/$versionNumber With persistent identifier: .. code-block:: bash GET http://$SERVER/api/datasets/:persistentId/versions/$versionNumber?persistentId=$id Parameters ---------- identifier : str Identifier of the dataset. Can be a Dataverse identifier or a persistent identifier (e.g. ``doi:10.11587/8H3N93``). version : str Version string of the Dataset. is_pid : bool True, if identifier is a persistent identifier. Returns ------- httpx.Response Response object of httpx library. """ if is_pid: url = "{0}/datasets/:persistentId/versions/{1}?persistentId={2}".format( self.base_url_api_native, version, identifier ) else: url = "{0}/datasets/{1}/versions/{2}".format( self.base_url_api_native, identifier, version ) return self.get_request(url, auth=auth) def get_dataset_export(self, pid, export_format, auth=DEPRECATION_GUARD): """Get metadata of dataset exported in different formats. Export the metadata of the current published version of a dataset in various formats by its persistend identifier. .. code-block:: bash GET http://$SERVER/api/datasets/export?exporter=$exportformat&persistentId=$pid Parameters ---------- pid : str Persistent identifier of the dataset. (e.g. ``doi:10.11587/8H3N93``). export_format : str Export format as a string. Formats: ``ddi``, ``oai_ddi``, ``dcterms``, ``oai_dc``, ``schema.org``, ``dataverse_json``. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/datasets/export?exporter={1}&persistentId={2}".format( self.base_url_api_native, export_format, pid ) return self.get_request(url, auth=auth) def create_dataset(self, dataverse, metadata, pid=None, publish=False, auth=True): """Add dataset to a dataverse. `Dataverse Documentation `_ HTTP Request: .. code-block:: bash POST http://$SERVER/api/dataverses/$dataverse/datasets --upload-file FILENAME Add new dataset with curl: .. code-block:: bash curl -H "X-Dataverse-key: $API_TOKEN" -X POST $SERVER_URL/api/dataverses/$DV_ALIAS/datasets --upload-file tests/data/dataset_min.json Import dataset with existing persistend identifier with curl: .. code-block:: bash curl -H "X-Dataverse-key: $API_TOKEN" -X POST $SERVER_URL/api/dataverses/$DV_ALIAS/datasets/:import?pid=$PERSISTENT_IDENTIFIER&release=yes --upload-file tests/data/dataset_min.json To create a dataset, you must create a JSON file containing all the metadata you want such as example file: `dataset-finch1.json `_. 
Then, you must decide which dataverse to create the dataset in and target that datavese with either the "alias" of the dataverse (e.g. "root") or the database id of the dataverse (e.g. "1"). The initial version state will be set to "DRAFT": Status Code: 201: dataset created Import Dataset with existing PID: ``_ To import a dataset with an existing persistent identifier (PID), the dataset’s metadata should be prepared in Dataverse’s native JSON format. The PID is provided as a parameter at the URL. The following line imports a dataset with the PID PERSISTENT_IDENTIFIER to Dataverse, and then releases it: The pid parameter holds a persistent identifier (such as a DOI or Handle). The import will fail if no PID is provided, or if the provided PID fails validation. The optional release parameter tells Dataverse to immediately publish the dataset. If the parameter is changed to no, the imported dataset will remain in DRAFT status. Parameters ---------- dataverse : str "alias" of the dataverse (e.g. ``root``) or the database id of the dataverse (e.g. ``1``) pid : str PID of existing Dataset. publish : bool Publish only works when a Dataset with an existing PID is created. If it is ``True``, Dataset should be instantly published, ``False`` if a Draft should be created. metadata : str Metadata of the Dataset as a json-formatted string (e. g. `dataset-finch1.json `_) Returns ------- httpx.Response Response object of httpx library. """ if pid: assert isinstance(pid, str) url = "{0}/dataverses/{1}/datasets/:import?pid={2}".format( self.base_url_api_native, dataverse, pid ) if publish: url += "&release=yes" else: url += "&release=no" else: url = "{0}/dataverses/{1}/datasets".format( self.base_url_api_native, dataverse ) resp = self.post_request(url, metadata, auth) if resp.status_code == 404: error_msg = resp.json()["message"] raise DataverseNotFoundError( "ERROR: HTTP 404 - Dataverse {0} was not found. MSG: {1}".format( dataverse, error_msg ) ) elif resp.status_code == 401: error_msg = resp.json()["message"] raise ApiAuthorizationError( "ERROR: HTTP 401 - Create Dataset unauthorized. MSG: {0}".format( error_msg ) ) elif resp.status_code == 201: if "data" in resp.json(): if "persistentId" in resp.json()["data"]: identifier = resp.json()["data"]["persistentId"] print("Dataset with pid '{0}' created.".format(identifier)) elif "id" in resp.json()["data"]: identifier = resp.json()["data"]["id"] print("Dataset with id '{0}' created.".format(identifier)) else: print("ERROR: No identifier returned for created Dataset.") return resp def edit_dataset_metadata( self, identifier, metadata, is_pid=True, replace=False, auth=True ): """Edit metadata of a given dataset. `edit-dataset-metadata `_. HTTP Request: .. code-block:: bash PUT http://$SERVER/api/datasets/editMetadata/$id --upload-file FILENAME Add data to dataset fields that are blank or accept multiple values with the following CURL Request: .. code-block:: bash curl -H "X-Dataverse-key: $API_TOKEN" -X PUT $SERVER_URL/api/datasets/:persistentId/editMetadata/?persistentId=$pid --upload-file dataset-add-metadata.json For these edits your JSON file need only include those dataset fields which you would like to edit. A sample JSON file may be downloaded here: `dataset-edit-metadata-sample.json `_ Parameters ---------- identifier : str Identifier of the dataset. Can be a Dataverse identifier or a persistent identifier (e.g. ``doi:10.11587/8H3N93``). metadata : str Metadata of the Dataset as a json-formatted string. 
is_pid : bool ``True`` to use persistent identifier. ``False``, if not. replace : bool ``True`` to replace already existing metadata. ``False``, if not. auth : bool ``True``, if an api token should be sent. Defaults to ``False``. Returns ------- httpx.Response Response object of httpx library. Examples ------- Get dataset metadata:: >>> data = api.get_dataset(doi).json()["data"]["latestVersion"]["metadataBlocks"]["citation"] >>> resp = api.edit_dataset_metadata(doi, data, is_replace=True, auth=True) >>> resp.status_code 200: metadata updated """ if is_pid: url = "{0}/datasets/:persistentId/editMetadata/?persistentId={1}".format( self.base_url_api_native, identifier ) else: url = "{0}/datasets/editMetadata/{1}".format( self.base_url_api_native, identifier ) params = {"replace": True} if replace else {} resp = self.put_request(url, metadata, auth, params) if resp.status_code == 401: error_msg = resp.json()["message"] raise ApiAuthorizationError( "ERROR: HTTP 401 - Updating metadata unauthorized. MSG: {0}".format( error_msg ) ) elif resp.status_code == 400: if "Error parsing" in resp.json()["message"]: print("Wrong passed data format.") else: print( "You may not add data to a field that already has data and does not" " allow multiples. Use is_replace=true to replace existing data." ) elif resp.status_code == 200: print("Dataset '{0}' updated".format(identifier)) return resp def create_dataset_private_url(self, identifier, is_pid=True, auth=True): """Create private Dataset URL. POST http://$SERVER/api/datasets/$id/privateUrl?key=$apiKey http://guides.dataverse.org/en/4.16/api/native-api.html#create-a-private-url-for-a-dataset 'MSG: {1}'.format(pid, error_msg)) """ if is_pid: url = "{0}/datasets/:persistentId/privateUrl/?persistentId={1}".format( self.base_url_api_native, identifier ) else: url = "{0}/datasets/{1}/privateUrl".format( self.base_url_api_native, identifier ) resp = self.post_request(url, auth=auth) if resp.status_code == 200: print( "Dataset private URL created: {0}".format(resp.json()["data"]["link"]) ) return resp def get_dataset_private_url(self, identifier, is_pid=True, auth=True): """Get private Dataset URL. GET http://$SERVER/api/datasets/$id/privateUrl?key=$apiKey http://guides.dataverse.org/en/4.16/api/native-api.html#get-the-private-url-for-a-dataset """ if is_pid: url = "{0}/datasets/:persistentId/privateUrl/?persistentId={1}".format( self.base_url_api_native, identifier ) else: url = "{0}/datasets/{1}/privateUrl".format( self.base_url_api_native, identifier ) resp = self.get_request(url, auth=auth) if resp.status_code == 200: print("Got Dataset private URL: {0}".format(resp.json()["data"]["link"])) return resp def delete_dataset_private_url(self, identifier, is_pid=True, auth=True): """Get private Dataset URL. DELETE http://$SERVER/api/datasets/$id/privateUrl?key=$apiKey http://guides.dataverse.org/en/4.16/api/native-api.html#delete-the-private-url-from-a-dataset """ if is_pid: url = "{0}/datasets/:persistentId/privateUrl/?persistentId={1}".format( self.base_url_api_native, identifier ) else: url = "{0}/datasets/{1}/privateUrl".format( self.base_url_api_native, identifier ) resp = self.delete_request(url, auth=auth) if resp.status_code == 200: print("Got Dataset private URL: {0}".format(resp.json()["data"]["link"])) return resp def publish_dataset(self, pid, release_type="minor", auth=True): """Publish dataset. Publishes the dataset whose id is passed. If this is the first version of the dataset, its version number will be set to 1.0. 
Otherwise, the new dataset version number is determined by the most recent version number and the type parameter. Passing type=minor increases the minor version number (2.3 is updated to 2.4). Passing type=major increases the major version number (2.3 is updated to 3.0). Superusers can pass type=updatecurrent to update metadata without changing the version number. HTTP Request: .. code-block:: bash POST http://$SERVER/api/datasets/$id/actions/:publish?type=$type When there are no default workflows, a successful publication process will result in 200 OK response. When there are workflows, it is impossible for Dataverse to know how long they are going to take and whether they will succeed or not (recall that some stages might require human intervention). Thus, a 202 ACCEPTED is returned immediately. To know whether the publication process succeeded or not, the client code has to check the status of the dataset periodically, or perform some push request in the post-publish workflow. Status Code: 200: dataset published Parameters ---------- pid : str Persistent identifier of the dataset (e.g. ``doi:10.11587/8H3N93``). release_type : str Passing ``minor`` increases the minor version number (2.3 is updated to 2.4). Passing ``major`` increases the major version number (2.3 is updated to 3.0). Superusers can pass ``updatecurrent`` to update metadata without changing the version number. auth : bool ``True`` if api authorization is necessary. Defaults to ``False``. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/datasets/:persistentId/actions/:publish".format( self.base_url_api_native ) url += "?persistentId={0}&type={1}".format(pid, release_type) resp = self.post_request(url, auth=auth) if resp.status_code == 404: error_msg = resp.json()["message"] raise DatasetNotFoundError( "ERROR: HTTP 404 - Dataset {0} was not found. MSG: {1}".format( pid, error_msg ) ) elif resp.status_code == 401: error_msg = resp.json()["message"] raise ApiAuthorizationError( "ERROR: HTTP 401 - User not allowed to publish dataset {0}. " "MSG: {1}".format(pid, error_msg) ) elif resp.status_code == 200: print("Dataset {0} published".format(pid)) return resp def get_dataset_lock(self, pid): """Get if dataset is locked. The lock API endpoint was introduced in Dataverse 4.9.3. Parameters ---------- pid : str Persistent identifier of the Dataset (e.g. ``doi:10.11587/8H3N93``). Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/datasets/:persistentId/locks/?persistentId={1}".format( self.base_url_api_native, pid ) return self.get_request(url, auth=True) def get_dataset_assignments(self, identifier, is_pid=True, auth=True): """Get Dataset assignments. GET http://$SERVER/api/datasets/$id/assignments?key=$apiKey """ if is_pid: url = "{0}/datasets/:persistentId/assignments/?persistentId={1}".format( self.base_url_api_native, identifier ) else: url = "{0}/datasets/{1}/assignments".format( self.base_url_api_native, identifier ) return self.get_request(url, auth=auth) def delete_dataset(self, identifier, is_pid=True, auth=True): """Delete a dataset. Delete the dataset whose id is passed Status Code: 200: dataset deleted Parameters ---------- identifier : str Identifier of the dataset. Can be a Dataverse identifier or a persistent identifier (e.g. ``doi:10.11587/8H3N93``). is_pid : bool True, if identifier is a persistent identifier. Returns ------- httpx.Response Response object of httpx library. 
""" if is_pid: url = "{0}/datasets/:persistentId/?persistentId={1}".format( self.base_url_api_native, identifier ) else: url = "{0}/datasets/{1}".format(self.base_url_api_native, identifier) resp = self.delete_request(url, auth=auth) if resp.status_code == 404: error_msg = resp.json()["message"] raise DatasetNotFoundError( "ERROR: HTTP 404 - Dataset '{0}' was not found. MSG: {1}".format( identifier, error_msg ) ) elif resp.status_code == 405: error_msg = resp.json()["message"] raise OperationFailedError( "ERROR: HTTP 405 - " "Published datasets can only be deleted from the GUI. For " "more information, please refer to " "https://github.com/IQSS/dataverse/issues/778" " MSG: {0}".format(error_msg) ) elif resp.status_code == 401: error_msg = resp.json()["message"] raise ApiAuthorizationError( "ERROR: HTTP 401 - User not allowed to delete dataset '{0}'. " "MSG: {1}".format(identifier, error_msg) ) elif resp.status_code == 200: print("Dataset '{0}' deleted.".format(identifier)) return resp def destroy_dataset(self, identifier, is_pid=True, auth=True): """Destroy Dataset. http://guides.dataverse.org/en/4.16/api/native-api.html#delete-published-dataset Normally published datasets should not be deleted, but there exists a “destroy” API endpoint for superusers which will act on a dataset given a persistent ID or dataset database ID: curl -H "X-Dataverse-key:$API_TOKEN" -X DELETE http://$SERVER/api/datasets/:persistentId/destroy/?persistentId=doi:10.5072/FK2/AAA000 curl -H "X-Dataverse-key:$API_TOKEN" -X DELETE http://$SERVER/api/datasets/999/destroy Calling the destroy endpoint is permanent and irreversible. It will remove the dataset and its datafiles, then re-index the parent dataverse in Solr. This endpoint requires the API token of a superuser. """ if is_pid: url = "{0}/datasets/:persistentId/destroy/?persistentId={1}".format( self.base_url_api_native, identifier ) else: url = "{0}/datasets/{1}/destroy".format( self.base_url_api_native, identifier ) resp = self.delete_request(url, auth=auth) if resp.status_code == 200: print("Dataset {0} destroyed".format(resp.json())) return resp def get_datafiles_metadata(self, pid, version=":latest", auth=True): """List metadata of all datafiles of a dataset. `Documentation `_ HTTP Request: .. code-block:: bash GET http://$SERVER/api/datasets/$id/versions/$versionId/files Parameters ---------- pid : str Persistent identifier of the dataset. e.g. ``doi:10.11587/8H3N93``. version : str Version of dataset. Defaults to `1`. Returns ------- httpx.Response Response object of httpx library. """ base_str = "{0}/datasets/:persistentId/versions/".format( self.base_url_api_native ) url = base_str + "{0}/files?persistentId={1}".format(version, pid) return self.get_request(url, auth=auth) def get_datafile_metadata( self, identifier, is_filepid=False, is_draft=False, auth=True ): """ GET http://$SERVER/api/files/{id}/metadata curl $SERVER_URL/api/files/$ID/metadata curl "$SERVER_URL/api/files/:persistentId/metadata?persistentId=$PERSISTENT_ID" curl "https://demo.dataverse.org/api/files/:persistentId/metadata?persistentId=doi:10.5072/FK2/AAA000" curl -H "X-Dataverse-key:$API_TOKEN" $SERVER_URL/api/files/$ID/metadata/draft """ if is_filepid: url = "{0}/files/:persistentId/metadata".format(self.base_url_api_native) if is_draft: url += "/draft" url += "?persistentId={0}".format(identifier) else: url = "{0}/files/{1}/metadata".format(self.base_url_api_native, identifier) if is_draft: url += "/draft" # CHECK: Its not really clear, if the version query can also be done via ID. 
return self.get_request(url, auth=auth) def upload_datafile(self, identifier, filename, json_str=None, is_pid=True): """Add file to a dataset. Add a file to an existing Dataset. Description and tags are optional: HTTP Request: .. code-block:: bash POST http://$SERVER/api/datasets/$id/add The upload endpoint checks the content of the file, compares it with existing files and tells if already in the database (most likely via hashing). `adding-files `_. Parameters ---------- identifier : str Identifier of the dataset. filename : str Full filename with path. json_str : str Metadata as JSON string. is_pid : bool ``True`` to use persistent identifier. ``False``, if not. Returns ------- dict The json string responded by the CURL request, converted to a dict(). """ url = self.base_url_api_native if is_pid: url += "/datasets/:persistentId/add?persistentId={0}".format(identifier) else: url += "/datasets/{0}/add".format(identifier) files = {"file": open(filename, "rb")} metadata = {} if json_str is not None: metadata["jsonData"] = json_str return self.post_request(url, data=metadata, files=files, auth=True) def update_datafile_metadata(self, identifier, json_str=None, is_filepid=False): """Update datafile metadata. metadata such as description, directoryLabel (File Path) and tags are not carried over from the file being replaced: Updates the file metadata for an existing file where ID is the database id of the file to update or PERSISTENT_ID is the persistent id (DOI or Handle) of the file. Requires a jsonString expressing the new metadata. No metadata from the previous version of this file will be persisted, so if you want to update a specific field first get the json with the above command and alter the fields you want. Also note that dataFileTags are not versioned and changes to these will update the published version of the file. This functions needs CURL to work! HTTP Request: .. code-block:: bash POST -F 'file=@file.extension' -F 'jsonData={json}' http://$SERVER/api/files/{id}/metadata?key={apiKey} curl -H "X-Dataverse-key:$API_TOKEN" -X POST -F 'jsonData={"description":"My description bbb.","provFreeform":"Test prov freeform","categories":["Data"],"restrict":false}' $SERVER_URL/api/files/$ID/metadata curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST -F 'jsonData={"description":"My description bbb.","provFreeform":"Test prov freeform","categories":["Data"],"restrict":false}' "https://demo.dataverse.org/api/files/:persistentId/metadata?persistentId=doi:10.5072/FK2/AAA000" `Docs `_. Parameters ---------- identifier : str Identifier of the dataset. json_str : str Metadata as JSON string. is_filepid : bool ``True`` to use persistent identifier for datafile. ``False``, if not. Returns ------- dict The json string responded by the CURL request, converted to a dict(). 
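        Examples
        --------
        A rough sketch. ``BASE_URL``, ``API_TOKEN`` and the datafile id ``24``
        are placeholders, and ``curl`` must be available on the system PATH,
        since this method shells out to it:

        >>> import json
        >>> api = NativeApi(BASE_URL, API_TOKEN)
        >>> json_str = json.dumps(
        ...     {"description": "Updated description", "categories": ["Data"], "restrict": False}
        ... )
        >>> api.update_datafile_metadata(24, json_str=json_str)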
""" # if is_filepid: # url = '{0}/files/:persistentId/metadata?persistentId={1}'.format( # self.base_url_api_native, identifier) # else: # url = '{0}/files/{1}/metadata'.format(self.base_url_api_native, identifier) # # data = {'jsonData': json_str} # resp = self.post_request( # url, # data=data, # auth=True # ) query_str = self.base_url_api_native if is_filepid: query_str = "{0}/files/:persistentId/metadata?persistentId={1}".format( self.base_url_api_native, identifier ) else: query_str = "{0}/files/{1}/metadata".format( self.base_url_api_native, identifier ) shell_command = 'curl -H "X-Dataverse-key: {0}"'.format(self.api_token) shell_command += " -X POST -F 'jsonData={0}' {1}".format(json_str, query_str) # TODO(Shell): is shell=True necessary? return sp.run(shell_command, shell=True, stdout=sp.PIPE) def replace_datafile(self, identifier, filename, json_str, is_filepid=True): """Replace datafile. HTTP Request: .. code-block:: bash POST -F 'file=@file.extension' -F 'jsonData={json}' http://$SERVER/api/files/{id}/replace?key={apiKey} `replacing-files `_. Parameters ---------- identifier : str Identifier of the file to be replaced. filename : str Full filename with path. json_str : str Metadata as JSON string. is_filepid : bool ``True`` if ``identifier`` is a persistent identifier for the datafile. ``False``, if not. Returns ------- dict The json string responded by the CURL request, converted to a dict(). """ url = self.base_url_api_native files = {"file": open(filename, "rb")} data = {"jsonData": json_str} if is_filepid: url += "/files/:persistentId/replace?persistentId={0}".format(identifier) else: url += "/files/{0}/replace".format(identifier) return self.post_request(url, data=data, files=files, auth=True) def get_info_version(self, auth=DEPRECATION_GUARD): """Get the Dataverse version and build number. The response contains the version and build numbers. Requires no api token. HTTP Request: .. code-block:: bash GET http://$SERVER/api/info/version Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/info/version".format(self.base_url_api_native) return self.get_request(url, auth=auth) def get_info_server(self, auth=DEPRECATION_GUARD): """Get dataverse server name. This is useful when a Dataverse system is composed of multiple Java EE servers behind a load balancer. HTTP Request: .. code-block:: bash GET http://$SERVER/api/info/server Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/info/server".format(self.base_url_api_native) return self.get_request(url, auth=auth) def get_info_api_terms_of_use(self, auth=DEPRECATION_GUARD): """Get API Terms of Use url. The response contains the text value inserted as API Terms of use which uses the database setting :ApiTermsOfUse. HTTP Request: .. code-block:: bash GET http://$SERVER/api/info/apiTermsOfUse Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/info/apiTermsOfUse".format(self.base_url_api_native) return self.get_request(url, auth=auth) def get_metadatablocks(self, auth=DEPRECATION_GUARD): """Get info about all metadata blocks. Lists brief info about all metadata blocks registered in the system. HTTP Request: .. code-block:: bash GET http://$SERVER/api/metadatablocks Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/metadatablocks".format(self.base_url_api_native) return self.get_request(url, auth=auth) def get_metadatablock(self, identifier, auth=DEPRECATION_GUARD): """Get info about single metadata block. 
Returns data about the block whose identifier is passed. identifier can either be the block’s id, or its name. HTTP Request: .. code-block:: bash GET http://$SERVER/api/metadatablocks/$identifier Parameters ---------- identifier : str Can be block's id, or it's name. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/metadatablocks/{1}".format(self.base_url_api_native, identifier) return self.get_request(url, auth=auth) def get_user_api_token_expiration_date(self, auth=DEPRECATION_GUARD): """Get the expiration date of an Users's API token. HTTP Request: .. code-block:: bash curl -H X-Dataverse-key:$API_TOKEN -X GET $SERVER_URL/api/users/token Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/users/token".format(self.base_url_api_native) return self.get_request(url, auth=auth) def recreate_user_api_token(self): """Recreate an Users API token. HTTP Request: .. code-block:: bash curl -H X-Dataverse-key:$API_TOKEN -X POST $SERVER_URL/api/users/token/recreate Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/users/token/recreate".format(self.base_url_api_native) return self.post_request(url) def delete_user_api_token(self): """Delete an Users API token. HTTP Request: .. code-block:: bash curl -H X-Dataverse-key:$API_TOKEN -X POST $SERVER_URL/api/users/token/recreate Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/users/token".format(self.base_url_api_native) return self.delete_request(url) def create_role(self, dataverse_id): """Create a new role. `Docs `_ HTTP Request: .. code-block:: bash POST http://$SERVER/api/roles?dvo=$dataverseIdtf&key=$apiKey Parameters ---------- dataverse_id : str Can be alias or id of a Dataverse. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/roles?dvo={1}".format(self.base_url_api_native, dataverse_id) return self.post_request(url) def show_role(self, role_id, auth=DEPRECATION_GUARD): """Show role. `Docs `_ HTTP Request: .. code-block:: bash GET http://$SERVER/api/roles/$id Parameters ---------- identifier : str Can be alias or id of a Dataverse. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/roles/{1}".format(self.base_url_api_native, role_id) return self.get_request(url, auth=auth) def delete_role(self, role_id): """Delete role. `Docs `_ Parameters ---------- identifier : str Can be alias or id of a Dataverse. Returns ------- httpx.Response Response object of httpx library. """ url = "{0}/roles/{1}".format(self.base_url_api_native, role_id) return self.delete_request(url) def get_children( self, parent=":root", parent_type="dataverse", children_types=None, auth=True ): """Walk through children of parent element in Dataverse tree. Default: gets all child dataverses if parent = dataverse or all Example Dataverse Tree: .. code-block:: bash data = { 'type': 'dataverse', 'dataverse_id': 1, 'dataverse_alias': ':root', 'children': [ { 'type': 'datasets', 'dataset_id': 231, 'pid': 'doi:10.11587/LYFDYC', 'children': [ { 'type': 'datafile' 'datafile_id': 532, 'pid': 'doi:10.11587/LYFDYC/C2WTRN', 'filename': '10082_curation.pdf ' } ] } ] } Parameters ---------- parent : str Description of parameter `parent`. parent_type : str Description of parameter `parent_type`. children_types : list Types of children to be collected. 'dataverses', 'datasets' and 'datafiles' are valid list items. auth : bool Authentication needed Returns ------- list List of Dataverse data type dictionaries. 
Different ones for Dataverses, Datasets and Datafiles. # TODO - differentiate between published and unpublished data types - util function to read out all dataverses into a list - util function to read out all datasets into a list - util function to read out all datafiles into a list - Unify tree and models """ children = [] if children_types is None: children_types = [] if len(children_types) == 0: if parent_type == "dataverse": children_types = ["dataverses"] elif parent_type == "dataset": children_types = ["datafiles"] if ( "dataverses" in children_types and "datafiles" in children_types and "datasets" not in children_types ): print( "ERROR: Wrong children_types passed: 'dataverses' and 'datafiles'" " passed, 'datasets' missing." ) return False if parent_type == "dataverse": # check for dataverses and datasets as children and get their ID parent_alias = parent resp = self.get_dataverse_contents(parent_alias, auth=auth) if "data" in resp.json(): contents = resp.json()["data"] for content in contents: if ( content["type"] == "dataverse" and "dataverses" in children_types ): dataverse_id = content["id"] child_alias = self.dataverse_id2alias(dataverse_id, auth=auth) children.append( { "dataverse_id": dataverse_id, "title": content["title"], "dataverse_alias": child_alias, "type": "dataverse", "children": self.get_children( parent=child_alias, parent_type="dataverse", children_types=children_types, auth=auth, ), } ) elif content["type"] == "dataset" and "datasets" in children_types: pid = ( content["protocol"] + ":" + content["authority"] + "/" + content["identifier"] ) children.append( { "dataset_id": content["id"], "pid": pid, "type": "dataset", "children": self.get_children( parent=pid, parent_type="dataset", children_types=children_types, auth=auth, ), } ) else: print("ERROR: 'get_dataverse_contents()' API request not working.") elif parent_type == "dataset" and "datafiles" in children_types: # check for datafiles as children and get their ID pid = parent resp = self.get_datafiles_metadata(parent, version=":latest") if "data" in resp.json(): for datafile in resp.json()["data"]: children.append( { "datafile_id": datafile["dataFile"]["id"], "filename": datafile["dataFile"]["filename"], "label": datafile["label"], "pid": datafile["dataFile"]["persistentId"], "type": "datafile", } ) else: print("ERROR: 'get_datafiles()' API request not working.") return children def get_user(self): """Get details of the current authenticated user. Auth must be ``true`` for this to work. API endpoint is available for Dataverse >= 5.3. https://guides.dataverse.org/en/latest/api/native-api.html#get-user-information-in-json-format """ url = f"{self.base_url}/users/:me" return self.get_request(url, auth=True) def redetect_file_type( self, identifier: str, is_pid: bool = False, dry_run: bool = False ) -> Response: """Redetect file type. https://guides.dataverse.org/en/latest/api/native-api.html#redetect-file-type Parameters ---------- identifier : str Datafile id (fileid) or file PID. is_pid : bool Is the identifier a PID, by default False. dry_run : bool, optional [description], by default False Returns ------- Response Request Response() object. 
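        Examples
        --------
        A minimal sketch. ``BASE_URL`` and ``API_TOKEN`` are placeholders; with
        ``dry_run=True`` the newly detected file type is only reported and not
        saved:

        >>> api = NativeApi(BASE_URL, API_TOKEN)
        >>> resp = api.redetect_file_type(
        ...     "doi:10.5072/FK2/EXAMPLE/FILE1", is_pid=True, dry_run=True
        ... )
        >>> resp.json()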
""" if dry_run is True: dry_run_str = "true" elif dry_run is False: dry_run_str = "false" if is_pid: url = f"{self.base_url_api_native}/files/:persistentId/redetect?persistentId={identifier}&dryRun={dry_run_str}" else: url = f"{self.base_url_api_native}/files/{identifier}/redetect?dryRun={dry_run_str}" return self.post_request(url, auth=True) def reingest_datafile(self, identifier: str, is_pid: bool = False) -> Response: """Reingest datafile. https://guides.dataverse.org/en/latest/api/native-api.html#reingest-a-file Parameters ---------- identifier : str Datafile id (fileid) or file PID. is_pid : bool Is the identifier a PID, by default False. Returns ------- Response Request Response() object. """ if is_pid: url = f"{self.base_url_api_native}/files/:persistentId/reingest?persistentId={identifier}" else: url = f"{self.base_url_api_native}/files/{identifier}/reingest" return self.post_request(url, auth=True) def uningest_datafile(self, identifier: str, is_pid: bool = False) -> Response: """Uningest datafile. https://guides.dataverse.org/en/latest/api/native-api.html#uningest-a-file Parameters ---------- identifier : str Datafile id (fileid) or file PID. is_pid : bool Is the identifier a PID, by default False. Returns ------- Response Request Response() object. """ if is_pid: url = f"{self.base_url_api_native}/files/:persistentId/uningest?persistentId={identifier}" else: url = f"{self.base_url_api_native}/files/{identifier}/uningest" return self.post_request(url, auth=True) def restrict_datafile(self, identifier: str, is_pid: bool = False) -> Response: """Uningest datafile. https://guides.dataverse.org/en/latest/api/native-api.html#restrict-files Parameters ---------- identifier : str Datafile id (fileid) or file PID. is_pid : bool Is the identifier a PID, by default False. Returns ------- Response Request Response() object. """ if is_pid: url = f"{self.base_url_api_native}/files/:persistentId/restrict?persistentId={identifier}" else: url = f"{self.base_url_api_native}/files/{identifier}/restrict" return self.put_request(url, auth=True) class SearchApi(Api): """Class to access Dataverse's Search API. Examples ------- Examples should be written in doctest format, and should illustrate how to use the function/class. >>> Attributes ---------- base_url_api_search : type Description of attribute `base_url_api_search`. base_url : type Description of attribute `base_url`. """ def __init__(self, base_url, api_token=None, api_version="latest", *, auth=None): """Init an SearchApi() class.""" super().__init__(base_url, api_token, api_version, auth=auth) if base_url: self.base_url_api_search = "{0}/search?q=".format(self.base_url_api) else: self.base_url_api_search = self.base_url_api def search( self, q_str, data_type=None, subtree=None, sort=None, order=None, per_page=None, start=None, show_relevance=None, show_facets=None, filter_query=None, show_entity_ids=None, query_entities=None, auth=DEPRECATION_GUARD, ): """Search. 
http://guides.dataverse.org/en/4.18.1/api/search.html """ url = "{0}{1}".format(self.base_url_api_search, q_str) if data_type: # TODO: pass list of types url += "&type={0}".format(data_type) if subtree: # TODO: pass list of subtrees url += "&subtree={0}".format(subtree) if sort: url += "&sort={0}".format(sort) if order: url += "&order={0}".format(order) if per_page: url += "&per_page={0}".format(per_page) if start: url += "&start={0}".format(start) if show_relevance: url += "&show_relevance={0}".format(show_relevance) if show_facets: url += "&show_facets={0}".format(show_facets) if filter_query: url += "&fq={0}".format(filter_query) if show_entity_ids: url += "&show_entity_ids={0}".format(show_entity_ids) if query_entities: url += "&query_entities={0}".format(query_entities) return self.get_request(url, auth=auth) class SwordApi(Api): """Class to access Dataverse's SWORD API. Parameters ---------- sword_api_version : str SWORD API version. Defaults to 'v1.1'. Attributes ---------- base_url_api_sword : str Description of attribute `base_url_api_sword`. base_url : str Description of attribute `base_url`. native_api_version : str Description of attribute `native_api_version`. sword_api_version """ def __init__( self, base_url, api_version="v1.1", api_token=None, sword_api_version="v1.1", *, auth=None, ): """Init a :class:`SwordApi ` instance. Parameters ---------- sword_api_version : str Api version of Dataverse SWORD API. api_token : str | None An Api token as retrieved from your Dataverse instance. auth : httpx.Auth Note that the SWORD API uses a different authentication mechanism than the native API, in particular it uses `HTTP Basic Authentication `_. Thus, if you pass an api_token, it will be used as the username in the HTTP Basic Authentication. If you pass a custom :py:class:`httpx.Auth`, use :py:class:`httpx.BasicAuth` with an empty password: .. code-block:: python sword_api = Api( "https://demo.dataverse.org", auth=httpx.BasicAuth(username="my_token", password="") ) """ if auth is None and api_token is not None: auth = httpx.BasicAuth(api_token, "") super().__init__(base_url, api_token, api_version, auth=auth) if not isinstance(sword_api_version, ("".__class__, "".__class__)): raise ApiUrlError( "sword_api_version {0} is not a string.".format(sword_api_version) ) self.sword_api_version = sword_api_version # Test connection. if self.base_url and sword_api_version: self.base_url_api_sword = "{0}/dvn/api/data-deposit/{1}".format( self.base_url, self.sword_api_version ) else: self.base_url_api_sword = base_url def get_service_document(self): url = "{0}/swordv2/service-document".format(self.base_url_api_sword) return self.get_request(url, auth=True) pyDataverse-0.3.4/pyDataverse/auth.py000066400000000000000000000063711467256651400176360ustar00rootroot00000000000000"""This module contains authentication handlers compatible with :class:`httpx.Auth`""" from typing import Generator from httpx import Auth, Request, Response from pyDataverse.exceptions import ApiAuthorizationError class ApiTokenAuth(Auth): """An authentication handler to add an API token as the X-Dataverse-key header. For more information on how to retrieve an API token and how it is used, please refer to https://guides.dataverse.org/en/latest/api/auth.html. """ def __init__(self, api_token: str): """Initializes the auth handler with an API token. Parameters ---------- api_token : str The API token retrieved from your Dataverse instance user profile. 
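        Raises
        ------
        ApiAuthorizationError
            If the passed ``api_token`` is not a string.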
        Examples
        --------
        >>> import os
        >>> from pyDataverse.api import DataAccessApi
        >>> base_url = 'https://demo.dataverse.org'
        >>> api_token_auth = ApiTokenAuth(os.getenv('API_TOKEN'))
        >>> api = DataAccessApi(base_url, api_token_auth)
        """
        if not isinstance(api_token, str):
            raise ApiAuthorizationError("API token passed is not a string.")
        self.api_token = api_token

    def auth_flow(self, request: Request) -> Generator[Request, Response, None]:
        """Adds the X-Dataverse-key header with the API token and yields the original :class:`httpx.Request`.

        Parameters
        ----------
        request : httpx.Request
            The request object which requires authentication headers

        Yields
        ------
        httpx.Request
            The original request with modified headers
        """
        request.headers["X-Dataverse-key"] = self.api_token
        yield request


class BearerTokenAuth(Auth):
    """An authentication handler to add a Bearer token as defined in `RFC 6750 <https://datatracker.ietf.org/doc/html/rfc6750>`_ to the request.

    A bearer token could be obtained from an OIDC provider, for example, Keycloak.
    """

    def __init__(self, bearer_token: str):
        """Initializes the auth handler with a bearer token.

        Parameters
        ----------
        bearer_token : str
            The bearer token retrieved from your OIDC provider.

        Examples
        --------
        >>> import os
        >>> from pyDataverse.api import DataAccessApi
        >>> base_url = 'https://demo.dataverse.org'
        >>> bearer_token_auth = BearerTokenAuth(os.getenv('OAUTH_TOKEN'))
        >>> api = DataAccessApi(base_url, bearer_token_auth)
        """
        if not isinstance(bearer_token, str):
            raise ApiAuthorizationError("Bearer token passed is not a string.")
        self.bearer_token = bearer_token

    def auth_flow(self, request: Request) -> Generator[Request, Response, None]:
        """Adds the Authorization header with the bearer token and yields the original :class:`httpx.Request`.

        Parameters
        ----------
        request : httpx.Request
            The request object which requires authentication headers

        Yields
        ------
        httpx.Request
            The original request with modified headers
        """
        request.headers["Authorization"] = f"Bearer {self.bearer_token}"
        yield request
pyDataverse-0.3.4/pyDataverse/docs/000077500000000000000000000000001467256651400172445ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/000077500000000000000000000000001467256651400205445ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/_images/000077500000000000000000000000001467256651400221505ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/_images/collection_dataset.png000066400000000000000000001042211467256651400265160ustar00rootroot00000000000000
[collection_dataset.png: binary PNG image data omitted. The figure shows how a Dataverse collection contains Datasets, which in turn contain Datafiles; it is referenced from docs/source/user/basic-usage.rst.]
pyDataverse-0.3.4/pyDataverse/docs/source/_static/000077500000000000000000000000001467256651400221725ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/_static/.gitkeep000066400000000000000000000000001467256651400236110ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/_templates/000077500000000000000000000000001467256651400227015ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/_templates/layout.html000066400000000000000000000013101467256651400250990ustar00rootroot00000000000000{% extends "!layout.html" %} {% block extrahead %} {% endblock %} pyDataverse-0.3.4/pyDataverse/docs/source/_templates/sidebar_intro.html000066400000000000000000000020201467256651400264070ustar00rootroot00000000000000

{{ project }}

{{ theme_description }}

Developed by Stefan Kasberger at AUSSDA - The Austrian Social Science Data Archive.

https://travis-ci.com/gdcc/pyDataverse.svg?branch=master

pyDataverse-0.3.4/pyDataverse/docs/source/_templates/sidebar_related-links.html000066400000000000000000000005561467256651400300240ustar00rootroot00000000000000

Useful Links

pyDataverse-0.3.4/pyDataverse/docs/source/community/000077500000000000000000000000001467256651400225705ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/community/contact.rst000066400000000000000000000012411467256651400247530ustar00rootroot00000000000000.. _community_contact: Contact ================= If you'd like to get in touch with the community and development of pyDataverse, there are several options: GitHub ------ The best way to track the development of pyDataverse is through the `GitHub repo `_. Email ------- The author of pyDataverse, Stefan Kasberger, can also be contacted directly via Email. - stefan.kasberger@univie.ac.at Twitter ------- pyDataverse is developed at AUSSDA - The Austrian Social Science Data Archive. You can get regular updates of pyDataverse from our Twitter account, or get in touch with. - `@theaussda `_ pyDataverse-0.3.4/pyDataverse/docs/source/community/releases.rst000066400000000000000000000001551467256651400251260ustar00rootroot00000000000000.. _community_history: Release History =========================== .. include:: ../../../../../HISTORY.rst pyDataverse-0.3.4/pyDataverse/docs/source/conf.py000066400000000000000000000141231467256651400220440ustar00rootroot00000000000000# Configuration file for the Sphinx documentation builder. # # This file does only contain a selection of the most common options. For a # full list see the documentation: # http://www.sphinx-doc.org/en/master/config # -- Path setup -------------------------------------------------------------- # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. # # sys.path.insert(0, os.path.abspath('../../')) # -- Project information ----------------------------------------------------- import pyDataverse from datetime import date import os import sys sys.path.insert(0, os.path.abspath("../..")) project = "pyDataverse" author = "Stefan Kasberger" author_affiliation = "AUSSDA - The Austrian Social Science Data Archive" copyright = "{0}, {1}".format(date.today().strftime("%Y"), author) description = "pyDataverse helps with the Dataverse API's and data types (Dataverse, Dataset, Datafile)." # The short X.Y version version = pyDataverse.__version__ # The full version, including alpha/beta/rc tags release = pyDataverse.__version__ # -- General configuration --------------------------------------------------- # If your documentation needs a minimal Sphinx version, state it here. # # needs_sphinx = '1.0' # Add any Sphinx extension module names here, as strings. They can be # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. extensions = [ "sphinx.ext.autodoc", "sphinx.ext.doctest", "sphinx.ext.coverage", "sphinx.ext.napoleon", "sphinx.ext.todo", "sphinx.ext.viewcode", "sphinx.ext.intersphinx", ] # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] # The suffix(es) of source filenames. # You can specify multiple suffix as a list of string: # # source_suffix = ['.rst', '.md'] source_suffix = ".rst" # The master toctree document. master_doc = "index" # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. # # This is also used if you do content translation via gettext catalogs. # Usually you set "language" from the command line for these cases. 
language = "en" # If true, the current module name will be prepended to all description # unit titles (such as .. function::). add_module_names = False # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This pattern also affects html_static_path and html_extra_path . exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "code/"] # The name of the Pygments (syntax highlighting) style to use. pygments_style = "sphinx" # -- Options for HTML output ------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # html_theme = "alabaster" # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. # html_theme_options = { "description": description, "show_powered_by": False, "github_button": True, "github_user": "gdcc", "github_repo": "pyDataverse", "github_banner": False, "travis_button": True, } # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ["_static"] # Custom sidebar templates, must be a dictionary that maps document names # to template names. # # The default sidebars (for documents that don't match any pattern) are # defined by theme itself. Builtin themes are using these templates by # default: ``['localtoc.html', 'relations.html', 'sourcelink.html', # 'searchbox.html']``. # html_sidebars = { "index": [ "sidebar_intro.html", "navigation.html", "sidebar_related-links.html", "sourcelink.html", "searchbox.html", ], "**": [ "sidebar_intro.html", "navigation.html", "sidebar_related-links.html", "sourcelink.html", "searchbox.html", ], } # -- Options for HTMLHelp output --------------------------------------------- # Output file base name for HTML help builder. htmlhelp_basename = "pyDataverse" # -- Options for LaTeX output ------------------------------------------------ latex_elements: dict[str, str] = { # The paper size ('letterpaper' or 'a4paper'). # # 'papersize': 'letterpaper', # The font size ('10pt', '11pt' or '12pt'). # # 'pointsize': '10pt', # Additional stuff for the LaTeX preamble. # # 'preamble': '', # Latex figure (float) alignment # # 'figure_align': 'htbp', } # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, # author, documentclass [howto, manual, or own class]). latex_documents = [ ( master_doc, "pyDataverse.tex", "pyDataverse Documentation", "AUSSDA - Austrian Social Science Data Archive", "manual", ), ] # -- Options for manual page output ------------------------------------------ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). man_pages = [(master_doc, "pyDataverse", "pyDataverse Documentation", [author], 1)] # -- Options for Texinfo output ---------------------------------------------- # Grouping the document tree into Texinfo files. 
List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ ( master_doc, "pyDataverse", "pyDataverse Documentation", author, "pyDataverse", description, "Miscellaneous", ), ] # -- Extension configuration ------------------------------------------------- # Intersphinx intersphinx_mapping = { "python": ("https://docs.python.org/3", None), } pyDataverse-0.3.4/pyDataverse/docs/source/contributing/000077500000000000000000000000001467256651400232535ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/contributing/contributing.rst000066400000000000000000000002321467256651400265110ustar00rootroot00000000000000Contributor Guide ========================================= .. _contributing_contributing: .. include:: ../../../../CONTRIBUTING.rst :start-line: 2 pyDataverse-0.3.4/pyDataverse/docs/source/index.rst000066400000000000000000000200261467256651400224050ustar00rootroot00000000000000.. _homepage: pyDataverse ========================================= Release v\ |version|. .. image:: https://img.shields.io/github/v/release/gdcc/pyDataverse :target: https://github.com/gdcc/pyDataverse .. image:: https://img.shields.io/conda/vn/conda-forge/pydataverse.svg :target: https://anaconda.org/conda-forge/pydataverse .. image:: https://travis-ci.com/gdcc/pyDataverse.svg?branch=master :target: https://travis-ci.com/gdcc/pyDataverse .. image:: https://img.shields.io/pypi/v/pyDataverse.svg :target: https://pypi.org/project/pyDataverse/ .. image:: https://img.shields.io/pypi/wheel/pyDataverse.svg :target: https://pypi.org/project/pyDataverse/ .. image:: https://img.shields.io/pypi/pyversions/pyDataverse.svg :target: https://pypi.org/project/pyDataverse/ .. image:: https://readthedocs.org/projects/pydataverse/badge/?version=latest :target: https://pydataverse.readthedocs.io/en/latest .. image:: https://coveralls.io/repos/github/gdcc/pyDataverse/badge.svg :target: https://coveralls.io/github/gdcc/pyDataverse .. image:: https://img.shields.io/github/license/gdcc/pydataverse.svg :target: https://opensource.org/licenses/MIT .. image:: https://img.shields.io/badge/code%20style-black-000000.svg :target: https://github.com/psf/black .. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.4664557.svg :target: https://doi.org/10.5281/zenodo.4664557 ------------------- .. _homepage_description: **pyDataverse** is a Python module for `Dataverse `_ you can use for: - accessing the Dataverse `API's `_ - manipulating and using the Dataverse (meta)data - Dataverses, Datasets, Datafiles No matter, if you want to import huge masses of data into Dataverse, test your Dataverse instance after deployment or want to make basic API calls: **pyDataverse helps you with Dataverse!** pyDataverse is fully Open Source and can be used by everybody. .. image:: https://www.repostatus.org/badges/latest/unsupported.svg :alt: Project Status: Unsupported – The project has reached a stable, usable state but the author(s) have ceased all work on it. A new maintainer may be desired. :target: https://www.repostatus.org/#unsupported pyDataverse is not supported right now. A new maintainer or funding is desired. Please contact the author `Stefan Kasberger `_, if you want to contribute in some way. .. _homepage_install: Install ----------------------------- To install pyDataverse, simply run this command in your terminal of choice: .. code-block:: shell pip install pyDataverse Or run this command to install using conda: .. 
code-block:: shell conda install pyDataverse -c conda-forge Find more options at :ref:`user_installation`. **Requirements** .. include:: snippets/requirements.rst .. _homepage_quickstart: Quickstart ----------------------------- .. include:: snippets/warning_production.rst **Import Dataset metadata JSON** To import the metadata of a Dataset from Dataverse's own JSON format, use :meth:`ds.from_json() `. The created :class:`Dataset ` can then be retrieved with :meth:`get() `. For this example, we use the ``dataset.json`` from ``tests/data/user-guide/`` (`GitHub repo `_) and place it in the root directory. :: >>> from pyDataverse.models import Dataset >>> from pyDataverse.utils import read_file >>> ds = Dataset() >>> ds_filename = "dataset.json" >>> ds.from_json(read_file(ds_filename)) >>> ds.get() {'citation_displayName': 'Citation Metadata', 'title': 'Youth in Austria 2005', 'author': [{'authorName': 'LastAuthor1, FirstAuthor1', 'authorAffiliation': 'AuthorAffiliation1'}], 'datasetContact': [{'datasetContactEmail': 'ContactEmail1@mailinator.com', 'datasetContactName': 'LastContact1, FirstContact1'}], 'dsDescription': [{'dsDescriptionValue': 'DescriptionText'}], 'subject': ['Medicine, Health and Life Sciences']} **Create Dataset by API** To access Dataverse's Native API, you first have to instantiate :class:`NativeApi `. Then create the Dataset through the API with :meth:`create_dataset() `. This returns, as all API functions do, a :class:`httpx.Response ` object, with the DOI inside ``data``. Replace following variables with your own instance data before you execute the lines: - BASE_URL: Base URL of your Dataverse instance, without trailing slash (e. g. ``https://data.aussda.at``)) - API_TOKEN: API token of a Dataverse user with proper rights to create a Dataset - DV_PARENT_ALIAS: Alias of the Dataverse, the Dataset should be attached to. :: >>> from pyDataverse.api import NativeApi >>> api = NativeApi(BASE_URL, API_TOKEN) >>> resp = api.create_dataset(DV_PARENT_ALIAS, ds.json()) Dataset with pid 'doi:10.5072/FK2/UTGITX' created. >>> resp.json() {'status': 'OK', 'data': {'id': 251, 'persistentId': 'doi:10.5072/FK2/UTGITX'}} For more tutorials, check out :ref:`User Guide - Basic Usage ` and :ref:`User Guide - Advanced Usage `. .. _homepage_features: Features ----------------------------- - **Comprehensive API wrapper** for all Dataverse API’s and most of their endpoints - **Data models** for each of Dataverses data types: **Dataverse, Dataset and Datafile** - Data conversion to and from Dataverse's own JSON format for API uploads - **Easy mass imports and exports through CSV templates** - Utils with helper functions - **Documented** examples and functionalities - Custom exceptions - Tested (`Travis CI `_) and documented (`Read the Docs `_) - Open Source (`MIT `_) .. _homepage_user-guide: User Guide ----------------------------- .. toctree:: :maxdepth: 3 user/installation user/basic-usage user/advanced-usage user/use-cases user/csv-templates user/faq Wiki user/resources .. _homepage_reference: Reference / API ----------------------------- If you are looking for information on a specific class, function, or method, this part of the documentation is for you. .. toctree:: :maxdepth: 2 reference .. _homepage_community-guide: Community Guide ----------------------------- This part of the documentation, which is mostly prose, details the pyDataverse ecosystem and community. .. toctree:: :maxdepth: 1 community/contact community/releases .. 
_homepage_contributor-guide: Contributor Guide ----------------------------- .. toctree:: :maxdepth: 2 contributing/contributing .. _homepage_thanks: Thanks! ----------------------------- To everyone who has contributed to pyDataverse - with an idea, an issue, a pull request, developing used tools, sharing it with others or by any other means: **Thank you for your support!** Open Source projects live from the cooperation of the many and pyDataverse is no exception to that, so to say thank you is the least that can be done. Special thanks to Lars Kaczmirek, Veronika Heider, Christian Bischof, Iris Butzlaff and everyone else from AUSSDA, Slava Tykhonov and Marion Wittenberg from DANS and all the people who do an amazing job by developing Dataverse at IQSS, but especially to Phil Durbin for it's support from the first minute. pyDataverse is funded by `AUSSDA - The Austrian Social Science Data Archive `_ and through the EU Horizon2020 programme `SSHOC - Social Sciences & Humanities Open Cloud `_ (T5.2). .. _homepage_license: License ----------------------------- Copyright Stefan Kasberger and others, 2019-2021. Distributed under the terms of the MIT license, pyDataverse is free and open source software. Full License Text: `LICENSE.txt `_ pyDataverse-0.3.4/pyDataverse/docs/source/reference.rst000066400000000000000000000020301467256651400232270ustar00rootroot00000000000000Reference / API ========================================= .. module:: pyDataverse This part of the documentation covers all the interfaces / APIs of the pyDataverse modules. Where pyDataverse depends on external libraries, we document the most important right here and provide links to the canonical documentation outside of scope. API Interface ----------------------------- Access all of Dataverse APIs. .. automodule:: pyDataverse.api :members: :special-members: Models Interface ----------------------------- Use all metadata models of the Dataverse data-types (`Dataverse`, `Dataset` and `Datafile`). This includes import, export and manipulation. .. automodule:: pyDataverse.models :inherited-members: Utils Interface ----------------------------- Helper functions. .. automodule:: pyDataverse.utils :members: Auth Helpers ----------------------------- .. automodule:: pyDataverse.auth :members: Exceptions ----------------------------- Custom exceptions. .. automodule:: pyDataverse.exceptions :members: pyDataverse-0.3.4/pyDataverse/docs/source/snippets/000077500000000000000000000000001467256651400224115ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/snippets/pip-install.rst000066400000000000000000000000661467256651400254010ustar00rootroot00000000000000.. code-block:: shell pip install -U pyDataverse pyDataverse-0.3.4/pyDataverse/docs/source/snippets/requirements.rst000066400000000000000000000005201467256651400256630ustar00rootroot00000000000000pyDataverse officially supports Python 3.6–3.8 Python packages required: - `requests `_>=2.12.0 - `jsonschema `_>=3.2.0 External packages required: - curl (only for :meth:`replace_datafile() ` necessary) pyDataverse-0.3.4/pyDataverse/docs/source/snippets/warning_production.rst000066400000000000000000000001471467256651400270600ustar00rootroot00000000000000.. warning:: Do not execute the example code on a Dataverse production instance, unless 100% sure! 
pyDataverse-0.3.4/pyDataverse/docs/source/user/000077500000000000000000000000001467256651400215225ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/docs/source/user/advanced-usage.rst000066400000000000000000000164071467256651400251330ustar00rootroot00000000000000.. _user_advanced-usage: Advanced Usage ============== In addition to these tutorials, you can find more basic examples at :ref:`User Guide - Basic Usage `. and use-cases :ref:`User Guide - Use-Cases `. .. _advanced-usage_data-migration: Import CSV to Dataverse ----------------------- This tutorial will show you how to mass-import metadata from pyDataverse's own CSV format (see :ref:`CSV templates `), create pyDataverse objects from it (Datasets and Datafiles) and upload the data and metadata through the API. The CSV format in this case can work as an exchange format or kind of a bridge between all kind of data formats and programming languages. Note that this can be filled directly by humans who collect the data manually (such as in digitization projects) as well as through more common automation workflows. .. _advanced-usage_prepare: Prepare ^^^^^^^ **Requirements** - pyDataverse installed (see :ref:`user_installation`) **Information** - Follow the order of code execution - Dataverse Docker 4.18.1 used - pyDataverse 0.3.0 used - API responses may vary by each request and Dataverse installation! .. include:: ../snippets/warning_production.rst **Additional Resources** - CSV templates from ``pyDataverse/templates/`` are used (see :ref:`CSV templates `) - Data from ``tests/data/user-guide/`` is used (`GitHub repo `_) .. _advanced-usage_data-migration_adapt-csv-templates: Adapt CSV template ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ See :ref:`CSV templates - Adapt CSV template(s) `. .. _advanced-usage_data-migration_fill-csv-templates: Add metadata to the CSV files ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ After preparing the CSV files, the metadata will need to be collected (manually or programmatically). No matter the origin or the format, each row must contain one entity (Dataverse collection, Dataset or Datafile). As mentioned in "Additional Resources" in the tutorial we use prepared data and place it in the root directory. You can ether use our files or fill in your own metadata with your own datafiles. No matter what you choose, you have to have properly formatted CSV files (``datasets.csv`` and ``datafiles.csv``) before moving on. Don't forget: Some columns must be entered in a JSON format! .. _advanced-usage_data-migration_add-datafiles: Add datafiles ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Add the files you have filled in the ``org.filename`` cell in ``datafiles.csv`` and then place them in the root directory (or any other specified directory). .. _advanced-usage_data-migration_import-csv-templates: Import CSV files ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Import the CSV files with :meth:`read_csv_as_dicts() `. This creates a list of :class:`dict`'s, automatically imports the Dataverse Software's own metadata attribute (``dv.`` prefix), converts boolean values, and loads JSON cells properly. :: >>> import os >>> from pyDataverse.utils import read_csv_as_dicts >>> csv_datasets_filename = "datasets.csv" >>> ds_data = read_csv_as_dicts(csv_datasets_filename) >>> csv_datafiles_filename = "datafiles.csv" >>> df_data = read_csv_as_dicts(csv_datafiles_filename) Once we have the data in Python, we can easily import the data into pyDataverse. For this, loop over each Dataset :class:`dict`, to: #. 
Instantiate an empty :class:`Dataset ` #. add the data with :meth:`set() ` and #. append the instance to a :class:`list`. :: >>> from pyDataverse.models import Dataset >>> ds_lst = [] >>> for ds in ds_data: >>> ds_obj = Dataset() >>> ds_obj.set(ds) >>> ds_lst.append(ds_obj) To import the :class:`Datafile `'s, do the same with ``df_data``: :meth:`set() ` the Datafile metadata, and append it. :: >>> from pyDataverse.models import Datafile >>> df_lst = [] >>> for df in df_data: >>> df_obj = Datafile() >>> df_obj.set(df) >>> df_lst.append(df_obj) .. _advanced-usage_data-migration_upload-data: Upload data via API ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Before we can upload metadata and data, we need to create an instance of :class:`NativeApi `. You will need to replace the following variables with your own Dataverse installation's data before executing the lines: - BASE_URL: Base URL of your Dataverse installation, without trailing slash (e. g. ``https://data.aussda.at``)) - API_TOKEN: API token of a Dataverse user with proper rights to create a Dataset and upload Datafiles :: >>> from pyDataverse.api import NativeApi >>> api = NativeApi(BASE_URL, API_TOKEN) Loop over the :class:`list ` of :class:`Dataset `'s, upload the metadata with :meth:`create_dataset() ` and collect all ``dataset_id``'s and ``pid``'s in ``dataset_id_2_pid``. Note: The Dataverse collection assigned to ``dv_alias`` must be published in order to add a Dataset to it. :: >>> dv_alias = ":root:" >>> dataset_id_2_pid = {} >>> for ds in ds_lst: >>> resp = api.create_dataset(dv_alias, ds.json()) >>> dataset_id_2_pid[ds.get()["org.dataset_id"]] = resp.json()["data"]["persistentId"] Dataset with pid 'doi:10.5072/FK2/WVMDFE' created. The API requests always return a :class:`httpx.Response ` object, which can then be used to extract the data. Next, we'll do the same for the :class:`list ` of :class:`Datafile `'s with :meth:`upload_datafile() `. In addition to the metadata, the ``PID`` (Persistent Identifier, which is mostly the DOI) and the ``filename`` must be passed. :: >>> for df in df_lst: >>> pid = dataset_id_2_pid[df.get()["org.dataset_id"]] >>> filename = os.path.join(os.getcwd(), df.get()["org.filename"]) >>> df.set({"pid": pid, "filename": filename}) >>> resp = api.upload_datafile(pid, filename, df.json()) Now we have created all Datasets, which we added to ``datasets.csv``, and uploaded all Datafiles, which we placed in the root directory, to the Dataverse installation. .. _advanced-usage_data-migration_publish-dataset: Publish Datasets via API ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Finally, we iterate over all Datasets and publish them with :meth:`publish_dataset() `. :: >>> for dataset_id, pid in dataset_id_2_pid.items(): >>> resp = api.publish_dataset(pid, "major") >>> resp.json() Dataset doi:10.5072/FK2/WVMDFE published {'status': 'OK', 'data': {'id': 444, 'identifier': 'FK2/WVMDFE', 'persistentUrl': 'https://doi.org/10.5072/FK2/WVMDFE', 'protocol': 'doi', 'authority': '10.5072', 'publisher': 'Root', 'publicationDate': '2021-01-13', 'storageIdentifier': 'file://10.5072/FK2/WVMDFE'}} The Advanced Usage tutorial is now finished! If you want to revisit basic examples and use cases you can do so at :ref:`User Guide - Basic Usage ` and :ref:`User Guide - Use-Cases `. pyDataverse-0.3.4/pyDataverse/docs/source/user/basic-usage.rst000066400000000000000000000411111467256651400244350ustar00rootroot00000000000000.. 
_user_basic-usage: Basic Usage ================= This tutorial will show you how to import metadata from the Dataverse software's own JSON format, create pyDataverse objects from it (Dataverse collection, Dataset and Datafile), upload it via the API, and clean up at the end. In addition to this tutorial, you can find more advanced examples at :ref:`User Guide - Advanced Usage ` and background information at :ref:`User Guide - Use-Cases `. .. _user_basic-usage_prepare: Prepare ------------------------------------------ **Requirements** - pyDataverse installed (see :ref:`Installation `) **Basic Information** - Follow the order of code execution - Dataverse Docker 4.18.1 used - pyDataverse 0.3.0 used - API responses may vary by each request and Dataverse installation! .. include:: ../snippets/warning_production.rst **Additional Resources** - Data from ``tests/data/user-guide/`` used (`GitHub repo `_) .. _user_basic-usage_api-connection: Connect to Native API ------------------------------------------ First, create a :class:`NativeApi ` instance. You will use it later for data creation. Replace the following variables with your own installation's data before you execute the lines: - BASE_URL: Base URL of your Dataverse installation, without trailing slash (e. g. ``https://data.aussda.at``)) - API_TOKEN: API token of a Dataverse installation user with proper permissions to create a Dataverse collection, create a Dataset, and upload Datafiles :: >>> from pyDataverse.api import NativeApi >>> api = NativeApi(BASE_URL, API_TOKEN) Check with :meth:`get_info_version() `, if the API connection works and to retrieve the version of your Dataverse instance: :: >>> resp = api.get_info_version() >>> resp.json() {'status': 'OK', 'data': {'version': '4.15.1', 'build': '1377-701b56b'}} >>> resp.status_code 200 All API requests return a :class:`httpx.Response ` object, which can then be used (e. g. :meth:`json() `). .. _user_basic-usage_create-dataverse: Create Dataverse Collection ----------------------------- The top-level data-type in the Dataverse software is called a Dataverse collection, so we will start with that. Take a look at the figure below to better understand the relationship between a Dataverse collection, a dataset, and a datafile. .. figure:: ../_images/collection_dataset.png :align: center :alt: collection dataset datafile A dataverse collection (also known as a :class:`Dataverse `) acts as a container for your :class:`Datasets`. It can also store other collections (:class:`Dataverses `). You could create your own Dataverse collections, but it is not a requirement. A Dataset is a container for :class:`Datafiles`, such as data, documentation, code, metadata, etc. You need to create a Dataset to deposit your files. All Datasets are uniquely identified with a DOI at Dataverse. For more detailed explanations, check out `the Dataverse User Guide `_. 
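
If you prefer to build the metadata directly in Python instead of loading it from a JSON file (as the tutorial does next), the same model classes can also be filled from a flat :class:`dict` with :meth:`set() <pyDataverse.models.Dataverse.set>`. The following is a minimal sketch; the alias, name and contact e-mail are placeholders, not values you have to use:

::

    >>> from pyDataverse.models import Dataverse
    >>> dv_sketch = Dataverse()
    >>> dv_sketch.set({
    >>>     "alias": "my_collection",
    >>>     "name": "My Collection",
    >>>     "dataverseContacts": [{"contactEmail": "contact@example.org"}],
    >>> })
    >>> dv_sketch.json()  # returns the metadata as a JSON string for the API upload

The rest of this tutorial sticks to the JSON files from the test data mentioned above.
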
Going back to the example, first, instantiate a :class:`Dataverse ` object and import the metadata from the Dataverse Software's own JSON format with :meth:`from_json() `: :: >>> from pyDataverse.models import Dataverse >>> from pyDataverse.utils import read_file >>> dv = Dataverse() >>> dv_filename = "dataverse.json" >>> dv.from_json(read_file(dv_filename)) With :meth:`get() ` you can have a look at all the data of the object: :: >>> dv.get() {'alias': 'pyDataverse_user-guide', 'name': 'pyDataverse - User Guide', 'dataverseContacts': [{'contactEmail': 'info@aussda.at'}]} >>> type(dv.get()) To see only the metadata necessary for the Dataverse API upload, use :meth:`json() `, which defaults to the needed format for the Dataverse API upload (equivalent to ``json(data_format="dataverse_upload")``): :: >>> dv.json() '{\n "alias": "pyDataverse_user-guide",\n "dataverseContacts": [\n {\n "contactEmail": "info@aussda.at"\n }\n ],\n "name": "pyDataverse - User Guide"\n}' >>> type(dv.json()) Then use :meth:`create_dataverse() ` to upload the Dataverse metadata to your Dataverse installation via its Native API and create an unpublished Dataverse collection draft. For this, you have to pass a) the parent Dataverse collection alias to which the new Dataverse collection is attached and b) the metadata in the Dataverse Software's own JSON format (:meth:`json() `): :: >>> resp = api.create_dataverse(":root", dv.json()) Dataverse pyDataverse_user-guide created. Last, we publish the Dataverse collection draft with :meth:`publish_dataverse() `: :: >>> resp = api.publish_dataverse("pyDataverse_user-guide") Dataverse pyDataverse_user-guide published. To have a look at the results of our work, you can check the created Dataverse collection on the frontend, or use pyDataverse to retrieve the Dataverse collection with :meth:`get_dataverse() `: :: >>> resp = api.get_dataverse("pyDataverse_user-guide") >>> resp.json() {'status': 'OK', 'data': {'id': 441, 'alias': 'pyDataverse_user-guide', 'name': 'pyDataverse - User Guide', 'dataverseContacts': [{'displayOrder': 0, 'contactEmail': 'info@aussda.at'}], 'permissionRoot': True, 'dataverseType': 'UNCATEGORIZED', 'ownerId': 1, 'creationDate': '2021-01-13T20:47:43Z'}} This is it, our first Dataverse collection created with the help of pyDataverse! Now let's move on and apply what we've learned to Datasets and Datafiles. .. _user_basic-usage_create-dataset: Create Dataset ----------------------------- Again, start by creating an empty pyDataverse object, this time a :class:`Dataset `: :: >>> from pyDataverse.models import Dataset >>> ds = Dataset() The function names often are the same for each data-type. So again, we can use :meth:`from_json() ` to import the metadata from the JSON file, but this time it feeds into a Dataset: :: >>> ds_filename = "dataset.json" >>> ds.from_json(read_file(ds_filename)) You can also use :meth:`get() ` to output all data: :: >>> ds.get() {'citation_displayName': 'Citation Metadata', 'title': 'Youth in Austria 2005', 'author': [{'authorName': 'LastAuthor1, FirstAuthor1', 'authorAffiliation': 'AuthorAffiliation1'}], 'datasetContact': [{'datasetContactEmail': 'ContactEmail1@mailinator.com', 'datasetContactName': 'LastContact1, FirstContact1'}], 'dsDescription': [{'dsDescriptionValue': 'DescriptionText'}], 'subject': ['Medicine, Health and Life Sciences']} Now, as the metadata is imported, we don't know if the data is valid and can be used to create a Dataset. Maybe some attributes are missing or misnamed, or a mistake during import happened. 
pyDataverse offers a convenient function to test this out with :meth:`validate_json() `, so you can move on with confidence: :: >>> ds.validate_json() True Adding or updating data manually is easy. With :meth:`set() ` you can pass any attribute you want as a collection of key-value pairs in a :class:`dict`: :: >>> ds.get()["title"] Youth in Austria 2005 >>> ds.set({"title": "Youth from Austria 2005"}) >>> ds.get()["title"] Youth from Austria 2005 To upload the Dataset, use :meth:`create_dataset() `. You'll pass the Dataverse collection where the Dataset should be attached and include the metadata as a JSON string (:meth:`json() `): :: >>> resp = api.create_dataset("pyDataverse_user-guide", ds.json()) Dataset with pid 'doi:10.5072/FK2/EO7BNB' created. >>> resp.json() {'status': 'OK', 'data': {'id': 442, 'persistentId': 'doi:10.5072/FK2/EO7BNB'}} Save the created PID (short for Persistent Identifier, which in our case is the DOI) in a :class:`dict`: :: >>> ds_pid = resp.json()["data"]["persistentId"] Private Dataset URL's can also be created. Use :meth:`create_dataset_private_url() ` to get the URL and the private token: :: >>> resp = api.create_dataset_private_url(ds_pid) Dataset private URL created: http://data.aussda.at/privateurl.xhtml?token={PRIVATE_TOKEN} >>> resp.json() {'status': 'OK', 'data': {'token': '{PRIVATE_TOKEN}', 'link': 'http://data.aussda.at/privateurl.xhtml?token={PRIVATE_TOKEN}', 'roleAssignment': {'id': 174, 'assignee': '#442', 'roleId': 8, '_roleAlias': 'member', 'privateUrlToken': '{PRIVATE_TOKEN}', 'definitionPointId': 442}}} Finally, to make the Dataset public, publish the draft with :meth:`publish_dataset() `. Set ``release_type="major"`` (defaults to ``minor``), to create version 1.0: :: >>> resp = api.publish_dataset(ds_pid, release_type="major") Dataset doi:10.5072/FK2/EO7BNB published .. _user_basic-usage_upload-datafile: Upload Datafile ----------------------------- After all the preparations, it's now time to upload a :class:`Datafile ` and attach it to the Dataset: :: >>> from pyDataverse.models import Datafile >>> df = Datafile() Again, import your metadata with :meth:`from_json() `. Then, set your PID and filename manually (:meth:`set() `), as they are required as metadata for the upload and are created during the import process: :: >>> df_filename = "datafile.txt" >>> df.set({"pid": ds_pid, "filename": df_filename}) >>> df.get() {'pid': 'doi:10.5072/FK2/EO7BNB', 'filename': 'datafile.txt'} Upload the Datafile with :meth:`upload_datafile() `. Pass the PID, the Datafile filename and the Datafile metadata: :: >>> resp = api.upload_datafile(ds_pid, df_filename, df.json()) >>> resp.json() {'status': 'OK', 'data': {'files': [{'description': '', 'label': 'datafile.txt', 'restricted': False, 'version': 1, 'datasetVersionId': 101, 'dataFile': {'id': 443, 'persistentId': '', 'pidURL': '', 'filename': 'datafile.txt', 'contentType': 'text/plain', 'filesize': 7, 'description': '', 'storageIdentifier': '176fd85f46f-cf06cf243502', 'rootDataFileId': -1, 'md5': '8b8db3dfa426f6bdb1798d578f5239ae', 'checksum': {'type': 'MD5', 'value': '8b8db3dfa426f6bdb1798d578f5239ae'}, 'creationDate': '2021-01-13'}}]}} By uploading the Datafile, the attached Dataset gets an update. This means that a new unpublished Dataset version is created as a draft and the change is not yet publicly available. To make it available through creating a new Dataset version, publish the Dataset with :meth:`publish_dataset() `. 
Again, set the ``release_type="major"`` to create version 2.0, as a file change always leads to a major version change: :: >>> resp = api.publish_dataset(ds_pid, release_type="major") Dataset doi:10.5072/FK2/EO7BNB published .. _user_basic-usage_download-data: Download and save a dataset to disk ---------------------------------------- You may want to download and explore an existing dataset from Dataverse. The following code snippet will show how to retrieve and save a dataset to your machine. Note that if the dataset is public, you don't need to have an API_TOKEN. Furthermore, you don't even need to have a Dataverse account to use this functionality. The code would therefore look as follows: :: >>> from pyDataverse.api import NativeApi, DataAccessApi >>> from pyDataverse.models import Dataverse >>> base_url = 'https://dataverse.harvard.edu/' >>> api = NativeApi(base_url) >>> data_api = DataAccessApi(base_url) However, you need to know the DOI of the dataset that you want to download. In this example, we use ``doi:10.7910/DVN/KBHLOD``, which is hosted on Harvard's Dataverse instance that we specified as ``base_url``. The code looks as follows: :: >>> DOI = "doi:10.7910/DVN/KBHLOD" >>> dataset = api.get_dataset(DOI) As previously mentioned, every dataset comprises of datafiles, therefore, we need to get the list of datafiles by ID and save them on disk. That is done in the following code snippet: :: >>> files_list = dataset.json()['data']['latestVersion']['files'] >>> for file in files_list: >>> filename = file["dataFile"]["filename"] >>> file_id = file["dataFile"]["id"] >>> print("File name {}, id {}".format(filename, file_id)) >>> response = data_api.get_datafile(file_id) >>> with open(filename, "wb") as f: >>> f.write(response.content) File name cat.jpg, id 2456195 Please note that in this example, the dataset will be saved in the execution directory. You could change that by adding a desired path in the ``open()`` function above. .. _user_basic-usage_get-data-tree: Retrieve all created data as a Dataverse tree --------------------------------------------------------- PyDataverse offers a convenient way to retrieve all children-data from a specific Dataverse collection or Dataset down to the Datafile level (Dataverse collections, Datasets and Datafiles). Simply pass the identifier of the parent (e. g. Dataverse collection alias or Dataset PID) and the list of the children data-types that should be collected (``dataverses``, ``datasets``, ``datafiles``) to :meth:`get_children() `: :: >>> tree = api.get_children("pyDataverse_user-guide", children_types= ["datasets", "datafiles"]) >>> tree [{'dataset_id': 442, 'pid': 'doi:10.5072/FK2/EO7BNB', 'type': 'dataset', 'children': [{'datafile_id': 443, 'filename': 'datafile.txt', 'label': 'datafile.txt', 'pid': '', 'type': 'datafile'}]}] In our case, we don't use ``dataverses`` as children data-type, as there is none inside the created Dataverse collection. For further use of the tree, have a look at :meth:`dataverse_tree_walker() ` and :meth:`save_tree_data() `. .. _user_basic-usage_remove-data: Clean up and remove all created data ---------------------------------------- As we have created a Dataverse collection, created a Dataset, and uploaded a Datafile, we now will remove all of it in order to clean up what we did so far. The Dataset has been published in the step above, so we have to destroy it with :meth:`destroy_dataset() `. To remove a non-published Dataset, :meth:`delete_dataset() ` must be used instead. 
Note: When you delete a Dataset, it automatically deletes all attached Datafile(s): :: >>> resp = api.destroy_dataset(ds_pid) Dataset {'status': 'OK', 'data': {'message': 'Dataset :persistentId destroyed'}} destroyed When you want to retrieve the Dataset now with :meth:`get_dataset() `, pyDataverse throws an :class:`OperationFailedError ` exception, which is the expected behaviour, as the Dataset was deleted: :: >>> resp = api.get_dataset(ds_pid) pyDataverse.exceptions.OperationFailedError: ERROR: GET HTTP 404 - http://data.aussda.at/api/v1/datasets/:persistentId/?persistentId=doi:10.5072/FK2/EO7BNB. MSG: {"status":"ERROR","message":"Dataset with Persistent ID doi:10.5072/FK2/EO7BNB not found."} After removing all Datasets and/or Dataverse collections in it, delete the parent Dataverse collection (:meth:`delete_dataverse() `). Note: It is not possible to delete a Dataverse collection with any data (Dataverse collection or Dataset) attached to it. :: >>> resp = api.delete_dataverse("pyDataverse_user-guide") Dataverse pyDataverse_user-guide deleted. Now the Dataverse instance is as it was once before we started. The Basic Usage tutorial is now finished, but maybe you want to have a look at more advanced examples at :ref:`User Guide - Advanced Usage ` and at :ref:`User Guide - Use-Cases ` for more information. pyDataverse-0.3.4/pyDataverse/docs/source/user/csv-templates.rst000066400000000000000000000153651467256651400250550ustar00rootroot00000000000000.. _user_csv-templates: CSV Templates ============================ .. _user_csv-templates_description: General ----------------------------- The CSV templates offer a **pre-defined data format**, which can be used to import metadata into pyDataverse, and export from it. They support all three Dataverse Software data-types: Dataverse collections, Datasets and Datafiles. CSV is an open file format, and great for humans and for machines. It can be opened with your Spreadsheet software and edited manually, or used by your favoured programming language. The CSV format can also work as an exchange format or kind of a bridge between all kind of data formats and programming languages. The CSV templates and the mentioned workflow below can be used especially for: - **Mass imports into a Dataverse installation:** The data to be imported could ether be collected manually (e. g. digitization of paper works), or created by machines (coming from any data source you have). - **Data exchange:** share pyDataverse data with any other system in an open, machine-readable format The CSV templates are licensed under `CC BY 4.0 `_ .. _user_csv-templates_data-format: Data format ----------------------------- - Separator: ``,`` - Encoding: ``utf-8`` - Quotation: ``"``. Note: In JSON strings, you have to escape with ``\`` before a quotation mark (e. g. adapt ``"`` to ``\"``). - Boolean: we recommend using ``TRUE`` and ``FALSE`` as boolean values. Note: They can be modified, when you open it with your preferred spreadsheet software (e. g. Libre Office), depending on the software or your operating systems settings. .. _user_csv-templates_content: Content ----------------------------- The templates don't come empty. They are pre-filled with supportive information to get started. Each row is one entry 1. **Column names**: The attribute name for each column. You can add and remove columns as you want. The pre-filled columns are a recommendation, as they consist of all metadata for the specific data-type, and the most common internal fields for handling the workflow. 
This is the only row that's not allowed to be deleted. There are three established prefixes so far (you can define your own if you want): a. ``org.``: Organization specific information to handle the data workflow later on. b. ``dv.``: Dataverse specific metadata, used for API uploads. Use the exact Dataverse software attribute name after the prefix, so the metadata gets imported properly. c. ``alma.``: ALMA specific information 2. **Description:** Description of the Dataverse software attribute. This row is for support purposes only, and must be deleted before usage. 3. **Attribute type:** Describes the type of the attribute (``serial``, ``string`` or ``numeric``). Strings can also be valid JSON strings to use more complex data structures. This row is for support purposes only, and must be deleted before usage. 4. **Example:** Contains a concrete example. To start adding your own data, it is often good to get started by copying the example for it. This row is for support purposes only, and must be deleted before usage. 5. **Multiple:** ``TRUE``, if multiple entries are allowed (boolean). This row is for support purposes only, and must be deleted before usage. 6. **Sub-keys:** ``TRUE``, if sub-keys are part (boolean). Only applicable to JSON strings. This row is for support purposes only, and must be deleted before usage. .. _user_csv-templates_usage: Usage ----------------------------- To use the CSV templates, we propose following steps as a best practice. The workflow is the same for Dataverse collections, Datasets and Datafiles. There is also a more detailed tutorial on how to use the CSV templates for mass imports in the :ref:`User Guide - Advanced `. The CSV templates can be found in ``pyDataverse/templates/`` (`GitHub repo `_): - `dataverses.csv `_ - `datasets.csv `_ - `datafiles.csv `_ .. _user_csv-templates_usage_create-csv: Adapt CSV template(s) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ First, adapt the CSV templates to your own needs and workflow. #. **Open a template file and save it:** Just start by copying the file and changing its filename to something descriptive (e.g. ``20200117_datasets.csv``). #. **Adapt columns:** Then change the pre-defined columns (attributes) to your needs. #. **Add metadata:** Add metadata in the first empty row. Closely following the example is often a good starting point, especially for JSON strings. #. **Remove supporting rows:** Once you are used to the workflow, you can delete the supportive rows 2 to 6. This must be done before you use the template for pyDataverse! #. **Save and use:** Once you have finished editing, save the CSV-file and import it to pyDataverse. .. _user_csv-templates_usage_add-metadata: Use the CSV files ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ For further usage of the CSV files with pyDataverse, for example: - adding metadata to the CSV files - importing CSV files - uploading data and metadata via API ... have a look at the :ref:`Data Migration Tutorial `. .. _user_csv-templates_usage_export-csv: Export from pyDataverse ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If you want to export your metadata from a pyDataverse object ( :class:`Dataverse `, :class:`Dataset `, :class:`Datafile `) to a CSV file: #. Get the metadata as :class:`dict ` (:meth:`Dataverse.get() `, :meth:`Dataset.get() ` or :meth:`Datafile.get() `). #. Pass the :class:`dict ` to :func:`write_dicts_as_csv() `. Note: Use the internal attribute lists from ``pyDataverse.models`` to get a complete list of fieldnames for each Dataverse data-type (e. g. 
``Dataset.__attr_import_dv_up_citation_fields_values``). .. _user_csv-templates_resources: Resources ----------------------------- - Dataverse example data taken from `dataverse_full.json `_ - Dataset example data taken from `dataset_full.json `_ - Datafile example data taken from `Native API documentation `_ pyDataverse-0.3.4/pyDataverse/docs/source/user/faq.rst000066400000000000000000000031071467256651400230240ustar00rootroot00000000000000.. _community_faq: FAQ ================================== **Q: What is the "Dataverse Software?" What is a "Dataverse collection?"** A: The Dataverse Software is the name of the open source data repository software. 2. A Dataverse collection is the top-level data-type in the Dataverse Software. **Q: What is a "Dataset"?** A: The term dataset differs from the usual use for a structured set of data. A Dataset in the Dataverse software is a data-type typically representative for all content of one study. The Dataset itself contains only metadata, but it relates to other data-types: Datafiles are attached to it and a Dataset is always part of a Dataverse collection. **Q: What is a "Datafile"?** A: A Datafile is a Dataverse software data-type. It consists of the file itself and its metadata. A Datafile is always part of a Dataset. **Q: What are the expected HTTP Status Codes for the API requests?** A: So far, this is still an unsolved question, as it is not documented yet. We started to collect this information at a `Wiki page `_ , so if you have some knowledge about this, please add it there or get in touch with us (:ref:`community_contact`). **Q: Can I create my own API calls?** A: Yes, you can use the :class:`Api ` base-class and its request functions (:meth:`get_request() `, :meth:`post_request() `, :meth:`put_request() ` and :meth:`delete_request() `) and pass your own parameter. pyDataverse-0.3.4/pyDataverse/docs/source/user/installation.rst000066400000000000000000000036741467256651400247670ustar00rootroot00000000000000.. _user_installation: Installation ================= .. contents:: Table of Contents :local: There are different options on how to install a Python package, mostly depending on your preferred tools and what you want to do with it. The easiest way is in most cases to use pip (see :ref:`below `). .. _user_installation_requirements: Requirements ----------------------------- .. include:: ../snippets/requirements.rst Installer requirements: `setuptools `_ .. _user_installation_pip: Pip ----------------------------- To install the latest release of pyDataverse from PyPI, simply run this `pip `_ command in your terminal of choice: .. include:: ../snippets/pip-install.rst .. _user_installation_pipenv: Pipenv ----------------------------- `Pipenv `_ combines pip and virtualenv. .. code-block:: shell pipenv install pyDataverse .. _user_installation_source-code: Conda ----------------------------- pyDataverse is also available through `conda-forge `_. .. code-block:: shell conda install pyDataverse -c conda-forge Source Code ----------------------------- PyDataverse is actively developed on GitHub, where the code is `always available `_. You can either clone the public repository: .. code-block:: shell git clone git://github.com/gdcc/pyDataverse.git Or download the archive of the ``master`` branch as a zip: .. code-block:: shell curl -OL https://github.com/gdcc/pyDataverse/archive/master.zip Once you have a copy of the source, you can embed it in your own Python package: .. code-block:: shell cd pyDataverse pip install . .. 
_user_installation_development: Development ----------------------------- To set up your development environment, see :ref:`contributing_working-with-code_development-environment`. pyDataverse-0.3.4/pyDataverse/docs/source/user/resources.rst000066400000000000000000000023611467256651400242700ustar00rootroot00000000000000.. _user_resources: Resources ================= .. _user_resources_presentations-workshops: Presentations / Workshops ----------------------------- - `Slides `_: from talk at Dataverse Community Meeting 2019 - `Jupyter Notebook Demo `_: at European Dataverse Community Workshop Tromso 2020 .. _user_resources_dataverse: Dataverse Project ----------------------------- - `Dataverse `_ - `API Guide `_ .. _user_resources_developing: Developing ----------------------------- **Helpful** - `JSON Schema `_ - `Validator Webservice `_ - `Getting Started `_ - `Schema Validation `_ **Open Source Development** - `Producing Open Source Software by Karl Fogel `_ - `GitHub flow `_ - `Git Workflow `_ - `Writing on GitHub `_ pyDataverse-0.3.4/pyDataverse/docs/source/user/use-cases.rst000066400000000000000000000076761467256651400241640ustar00rootroot00000000000000.. _user_use-cases: Use-Cases ================= For a basic introduction to pyDataverse, visit :ref:`User Guide - Basic Usage `. For information on more advanced uses, visit :ref:`User Guide - Advanced Usage `. .. _use-cases_data-migration: Data Migration ----------------------------- Importing lots of data from data sources outside a Dataverse installation can be done with the help of the :ref:`CSV templates `. Simply add your data to the CSV files, import the files into pyDataverse, and then upload the data and metadata via the API. The following mappings currently exist: - CSV - CSV 2 pyDataverse (:ref:`Tutorial `) - pyDataverse 2 CSV (:ref:`Tutorial `) - Dataverse Upload JSON - JSON 2 pyDataverse - pyDataverse to JSON If you would like to add an additional mapping, we welcome :ref:`contributions `! .. _use-cases_testing: Testing ----------------------------- .. _use-cases_testing_create-test-data: Create test data for integrity tests (DevOps) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Get full lists of all Dataverse collections, Datasets and Datafiles of an installation, or a subset of it. The results are stored in JSON files, which then can be used to do data integrity tests and verify data completeness. This is typically useful after an upgrade or a Dataverse migration. The data integrates easily into `aussda_tests `_ and to any CI build tools. The general steps for use: - Collect a data tree with all Dataverse collections, Datasets and Datafiles (:meth:`get_children() `) - Extract Dataverse collections, Datasets and Datafiles from the tree (:func:`dataverse_tree_walker() `) - Save extracted data (:func:`save_tree_data() `) .. _use-cases_testing_mass-removal: Mass removal of data in a Dataverse installation (DevOps) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ After testing, you often have to clean up Dataverse collections with Datasets and Datafiles within. 
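
The step-by-step workflow is listed below. As a rough sketch, assuming a single flat collection without nested sub-collections (``BASE_URL``, ``API_TOKEN`` and the alias ``my_collection`` are placeholders), such a clean-up could look like this:

::

    >>> from pyDataverse.api import NativeApi
    >>> api = NativeApi(BASE_URL, API_TOKEN)
    >>> tree = api.get_children("my_collection", children_types=["datasets"])
    >>> for child in tree:
    >>>     if child["type"] == "dataset":
    >>>         api.destroy_dataset(child["pid"])  # use delete_dataset() for unpublished drafts
    >>> api.delete_dataverse("my_collection")

For nested Dataverse collections, also collect ``dataverses`` and delete them from the bottom up, which is where :func:`dataverse_tree_walker() <pyDataverse.utils.dataverse_tree_walker>` and :func:`save_tree_data() <pyDataverse.utils.save_tree_data>` come in handy.
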
It can be tricky to remove them all at once, but pyDataverse helps you to do it with only a few commands: - Collect a data tree with all Dataverse collections and Datasets (:meth:`get_children() `) - Extract Dataverse collections and Datasets from the tree (:func:`dataverse_tree_walker() `) - Save extracted data (:func:`save_tree_data() `) - Iterate over all Datasets to delete/destroy them (:meth:`destroy_dataset() ` :meth:`delete_dataset() `, :meth:`destroy_dataset() `) - Iterate over all Dataverse collections to delete them (:meth:`delete_dataverse() `) This functionality is not yet fully implemented in pyDataverse, but you can find it in `aussda_tests `_. .. _use-cases_data-science: Data Science Pipeline ------------------------------------ Using APIs, you can access data and/or metadata from a Dataverse installation. You can also use pyDataverse to automatically add data and metadata to your Dataset. PyDataverse connects your Data Science pipeline with your Dataverse installation. .. _use-cases_microservices: Web-Applications / Microservices ------------------------------------------ As it is a direct and easy way to access Dataverses API's and to manipulate the Dataverse installation's data models, it integrates really well into all kind of web-applications and microservices. For example, you can use pyDataverse to visualize data, do some analysis, enrich it with other data sources (and so on). pyDataverse-0.3.4/pyDataverse/exceptions.py000066400000000000000000000022541467256651400210520ustar00rootroot00000000000000"""Find out more at https://github.com/GDCC/pyDataverse.""" class DataverseError(Exception): """Base exception class for Dataverse-related error.""" pass class DataverseApiError(DataverseError): """Base exception class for Dataverse-related api error.""" pass class OperationFailedError(DataverseApiError): """Raised when an operation fails for an unknown reason.""" pass class ApiUrlError(DataverseApiError): """Raised when the request url is not valid.""" pass class ApiResponseError(DataverseApiError): """Raised when the requests response fails.""" pass class ApiAuthorizationError(OperationFailedError): """Raised if a user provides invalid credentials.""" pass class DataverseNotEmptyError(OperationFailedError): """Raised when a Dataverse has accessioned Datasets.""" pass class DataverseNotFoundError(OperationFailedError): """Raised when a Dataverse cannot be found.""" pass class DatasetNotFoundError(OperationFailedError): """Raised when a Dataset cannot be found.""" pass class DatafileNotFoundError(OperationFailedError): """Raised when a Datafile cannot be found.""" pass pyDataverse-0.3.4/pyDataverse/models.py000066400000000000000000001645761467256651400201740ustar00rootroot00000000000000"""Dataverse data-types data model.""" from __future__ import absolute_import import json import os from pyDataverse.utils import validate_data INTERNAL_ATTRIBUTES = [ "_default_json_format", "_default_json_schema_filename", "_allowed_json_formats", "_json_dataverse_upload_attr", "_internal_attributes", ] class DVObject: """Base class for the Dataverse data types `Dataverse`, `Dataset` and `Datafile`.""" def __init__(self, data=None): """Init :class:`DVObject`. Parameters ---------- data : dict Flat dictionary. All keys will be mapped to a similar named attribute and it's value. """ if data is not None: self.set(data) def set(self, data): """Set class attributes by a flat dictionary. The flat dict is the main way to set the class attributes. 
It is the main interface between the object and the outside world. Parameters ---------- data : dict Flat dictionary. All keys will be mapped to a similar named attribute and it's value. Returns ------- bool `True` if all attributes are set, `False` if wrong data type was passed. """ assert isinstance(data, dict) for key, val in data.items(): if key in self._internal_attributes: print("Importing attribute {0} not allowed.".format(key)) else: self.__setattr__(key, val) def get(self): """Create flat `dict` of all attributes. Creates :class:`dict` with all attributes in a flat structure. The flat :class:`dict` can then be used for further processing. Returns ------- dict Data in a flat data structure. """ data = {} for attr in list(self.__dict__.keys()): if attr not in INTERNAL_ATTRIBUTES: data[attr] = self.__getattribute__(attr) assert isinstance(data, dict) return data def validate_json(self, filename_schema=None): """Validate JSON formats. Check if JSON data structure is valid. Parameters ---------- filename_schema : str Filename of JSON schema with full path. Returns ------- bool `True` if JSON validates correctly, `False` if not. """ if filename_schema is None: filename_schema = os.path.join( os.path.dirname(os.path.realpath(__file__)), self._default_json_schema_filename, ) assert isinstance(filename_schema, str) return validate_data( json.loads(self.json(validate=False)), filename_schema, file_format="json", ) def from_json( self, json_str, data_format=None, validate=True, filename_schema=None ): """Import metadata from a JSON file. Parses in the metadata from different JSON formats. Parameters ---------- json_str : str JSON string to be imported. data_format : str Data formats available for import. See `_allowed_json_formats`. validate : bool `True`, if imported JSON should be validated against a JSON schema file. `False`, if JSON string should be imported directly and not checked if valid. filename_schema : str Filename of JSON schema with full path. Returns ------- bool `True` if JSON imported correctly, `False` if not. """ assert isinstance(json_str, str) json_dict = json.loads(json_str) assert isinstance(json_dict, dict) assert isinstance(validate, bool) if data_format is None: data_format = self._default_json_format assert isinstance(data_format, str) assert data_format in self._allowed_json_formats if filename_schema is None: filename_schema = os.path.join( os.path.dirname(os.path.realpath(__file__)), self._default_json_schema_filename, ) assert isinstance(filename_schema, str) data = {} if data_format == "dataverse_upload": if validate: validate_data(json_dict, filename_schema) # get first level metadata and parse it automatically for key in json_dict.keys(): if key in self._json_dataverse_upload_attr: data[key] = json_dict[key] else: print( "INFO: Attribute {0} not valid for import (data format=`{1}`).".format( key, data_format ) ) elif data_format == "dataverse_download": print("INFO: Not implemented yet.") elif data_format == "dspace": print("INFO: Not implemented yet.") elif data_format == "custom": print("INFO: Not implemented yet.") else: # TODO: add exception for wrong data format pass self.set(data) def json(self, data_format=None, validate=True, filename_schema=None): r"""Create JSON from :class:`DVObject` attributes. Parameters ---------- data_format : str Data formats to be validated. See `_allowed_json_formats`. validate : bool `True`, if created JSON should be validated against a JSON schema file. `False`, if JSON string should be created and not checked if valid. 
filename_schema : str Filename of JSON schema with full path. Returns ------- str The data as a JSON string. """ assert isinstance(validate, bool) if data_format is None: data_format = self._default_json_format assert isinstance(data_format, str) assert data_format in self._allowed_json_formats if filename_schema is None: filename_schema = os.path.join( os.path.dirname(os.path.realpath(__file__)), self._default_json_schema_filename, ) assert isinstance(filename_schema, str) data = {} if data_format == "dataverse_upload": for attr in self._json_dataverse_upload_attr: # check if attribute exists if hasattr(self, attr): data[attr] = self.__getattribute__(attr) elif data_format == "dspace": print("INFO: Not implemented yet.") return False elif data_format == "custom": print("INFO: Not implemented yet.") return False if validate: validate_data(data, filename_schema) json_str = json.dumps(data, indent=2) assert isinstance(json_str, str) return json_str class Dataverse(DVObject): """Base class for the Dataverse data type `Dataverse`. Attributes ---------- _default_json_format : str Default JSON data format. _default_json_schema_filename : str Default JSON schema filename. _allowed_json_formats : list List of all possible JSON data formats. _json_dataverse_upload_attr : list List of all attributes to be exported in :func:`json`. """ def __init__(self, data=None): """Init :class:`Dataverse()`. Inherits attributes from parent :class:`DVObject()` Parameters ---------- data : dict Flat dictionary. All keys will be mapped to a similar named attribute and it's value. Examples ------- Create a Dataverse:: >>> from pyDataverse.models import Dataverse >>> dv = Dataverse() >>> print(dv._default_json_schema_filename) 'schemas/json/dataverse_upload_schema.json' """ self._internal_attributes = [ "_Dataverse" + attr for attr in INTERNAL_ATTRIBUTES ] super().__init__(data=data) self._default_json_format = "dataverse_upload" self._default_json_schema_filename = "schemas/json/dataverse_upload_schema.json" self._allowed_json_formats = ["dataverse_upload", "dataverse_download"] self._json_dataverse_upload_attr = [ "affiliation", "alias", "dataverseContacts", "dataverseType", "description", "name", ] class Dataset(DVObject): """Base class for the Dataverse data type `Dataset`. Attributes ---------- _default_json_format : str Default JSON data format. _default_json_schema_filename : str Default JSON schema filename. _allowed_json_formats : list List of all possible JSON data formats. _json_dataverse_upload_attr : list List with all attributes to be exported in :func:`json`. __attr_import_dv_up_datasetVersion_values : list Dataverse API Upload Dataset JSON attributes inside ds[\'datasetVersion\']. __attr_import_dv_up_citation_fields_values : list Dataverse API Upload Dataset JSON attributes inside ds[\'datasetVersion\'][\'metadataBlocks\'][\'citation\'][\'fields\']. __attr_import_dv_up_citation_fields_arrays : dict Dataverse API Upload Dataset JSON attributes inside [\'datasetVersion\'][\'metadataBlocks\'][\'citation\'][\'fields\']. __attr_import_dv_up_geospatial_fields_values : list Attributes of Dataverse API Upload Dataset JSON metadata standard inside [\'datasetVersion\'][\'metadataBlocks\'][\'geospatial\'][\'fields\']. __attr_import_dv_up_geospatial_fields_arrays : dict Attributes of Dataverse API Upload Dataset JSON metadata standard inside [\'datasetVersion\'][\'metadataBlocks\'][\'geospatial\'][\'fields\']. 
__attr_import_dv_up_socialscience_fields_values : list Attributes of Dataverse API Upload Dataset JSON metadata standard inside [\'datasetVersion\'][\'metadataBlocks\'][\'socialscience\'][\'fields\']. __attr_import_dv_up_journal_fields_values : list Attributes of Dataverse API Upload Dataset JSON metadata standard inside [\'datasetVersion\'][\'metadataBlocks\'][\'journal\'][\'fields\']. __attr_import_dv_up_journal_fields_arrays : dict Attributes of Dataverse API Upload Dataset JSON metadata standard inside [\'datasetVersion\'][\'metadataBlocks\'][\'journal\'][\'fields\']. __attr_dict_dv_up_required :list Required attributes for valid `dv_up` metadata dict creation. __attr_dict_dv_up_type_class_primitive : list typeClass primitive. __attr_dict_dv_up_type_class_compound : list typeClass compound. __attr_dict_dv_up_type_class_controlled_vocabulary : list typeClass controlledVocabulary. __attr_dict_dv_up_single_dict : list This attributes are excluded from automatic parsing in ds.get() creation. __attr_displayNames : list Attributes of displayName. """ __attr_import_dv_up_datasetVersion_values = [ "license", "termsOfAccess", "fileAccessRequest", "protocol", "authority", "identifier", "termsOfUse", ] __attr_import_dv_up_citation_fields_values = [ "accessToSources", "alternativeTitle", "alternativeURL", "characteristicOfSources", "dateOfDeposit", "dataSources", "depositor", "distributionDate", "kindOfData", "language", "notesText", "originOfSources", "otherReferences", "productionDate", "productionPlace", "relatedDatasets", "relatedMaterial", "subject", "subtitle", "title", ] __attr_import_dv_up_citation_fields_arrays = { "author": [ "authorName", "authorAffiliation", "authorIdentifierScheme", "authorIdentifier", ], "contributor": ["contributorType", "contributorName"], "dateOfCollection": ["dateOfCollectionStart", "dateOfCollectionEnd"], "datasetContact": [ "datasetContactName", "datasetContactAffiliation", "datasetContactEmail", ], "distributor": [ "distributorName", "distributorAffiliation", "distributorAbbreviation", "distributorURL", "distributorLogoURL", ], "dsDescription": ["dsDescriptionValue", "dsDescriptionDate"], "grantNumber": ["grantNumberAgency", "grantNumberValue"], "keyword": ["keywordValue", "keywordVocabulary", "keywordVocabularyURI"], "producer": [ "producerName", "producerAffiliation", "producerAbbreviation", "producerURL", "producerLogoURL", ], "otherId": ["otherIdAgency", "otherIdValue"], "publication": [ "publicationCitation", "publicationIDType", "publicationIDNumber", "publicationURL", ], "software": ["softwareName", "softwareVersion"], "timePeriodCovered": ["timePeriodCoveredStart", "timePeriodCoveredEnd"], "topicClassification": [ "topicClassValue", "topicClassVocab", "topicClassVocabURI", ], } __attr_import_dv_up_geospatial_fields_values = ["geographicUnit"] __attr_import_dv_up_geospatial_fields_arrays = { "geographicBoundingBox": [ "westLongitude", "eastLongitude", "northLongitude", "southLongitude", ], "geographicCoverage": ["country", "state", "city", "otherGeographicCoverage"], } __attr_import_dv_up_socialscience_fields_values = [ "actionsToMinimizeLoss", "cleaningOperations", "collectionMode", "collectorTraining", "controlOperations", "dataCollectionSituation", "dataCollector", "datasetLevelErrorNotes", "deviationsFromSampleDesign", "frequencyOfDataCollection", "otherDataAppraisal", "researchInstrument", "responseRate", "samplingErrorEstimates", "samplingProcedure", "unitOfAnalysis", "universe", "timeMethod", "weighting", ] 
__attr_import_dv_up_journal_fields_values = ["journalArticleType"] __attr_import_dv_up_journal_fields_arrays = { "journalVolumeIssue": ["journalVolume", "journalIssue", "journalPubDate"] } __attr_dict_dv_up_required = [ "author", "datasetContact", "dsDescription", "subject", "title", ] __attr_dict_dv_up_type_class_primitive = ( [ "accessToSources", "alternativeTitle", "alternativeURL", "authorAffiliation", "authorIdentifier", "authorName", "characteristicOfSources", "city", "contributorName", "dateOfDeposit", "dataSources", "depositor", "distributionDate", "kindOfData", "notesText", "originOfSources", "otherGeographicCoverage", "otherReferences", "productionDate", "productionPlace", "publicationCitation", "publicationIDNumber", "publicationURL", "relatedDatasets", "relatedMaterial", "seriesInformation", "seriesName", "state", "subtitle", "title", ] + __attr_import_dv_up_citation_fields_arrays["dateOfCollection"] + __attr_import_dv_up_citation_fields_arrays["datasetContact"] + __attr_import_dv_up_citation_fields_arrays["distributor"] + __attr_import_dv_up_citation_fields_arrays["dsDescription"] + __attr_import_dv_up_citation_fields_arrays["grantNumber"] + __attr_import_dv_up_citation_fields_arrays["keyword"] + __attr_import_dv_up_citation_fields_arrays["producer"] + __attr_import_dv_up_citation_fields_arrays["otherId"] + __attr_import_dv_up_citation_fields_arrays["software"] + __attr_import_dv_up_citation_fields_arrays["timePeriodCovered"] + __attr_import_dv_up_citation_fields_arrays["topicClassification"] + __attr_import_dv_up_geospatial_fields_values + __attr_import_dv_up_geospatial_fields_arrays["geographicBoundingBox"] + __attr_import_dv_up_socialscience_fields_values + __attr_import_dv_up_journal_fields_arrays["journalVolumeIssue"] + [ "socialScienceNotesType", "socialScienceNotesSubject", "socialScienceNotesText", ] + ["targetSampleActualSize", "targetSampleSizeFormula"] ) __attr_dict_dv_up_type_class_compound = ( list(__attr_import_dv_up_citation_fields_arrays.keys()) + list(__attr_import_dv_up_geospatial_fields_arrays.keys()) + list(__attr_import_dv_up_journal_fields_arrays.keys()) + ["series", "socialScienceNotes", "targetSampleSize"] ) __attr_dict_dv_up_type_class_controlled_vocabulary = [ "authorIdentifierScheme", "contributorType", "country", "journalArticleType", "language", "publicationIDType", "subject", ] __attr_dict_dv_up_single_dict = ["series", "socialScienceNotes", "targetSampleSize"] __attr_displayNames = [ "citation_displayName", "geospatial_displayName", "socialscience_displayName", "journal_displayName", ] def __init__(self, data=None): """Init a Dataset() class. Parameters ---------- data : dict Flat dictionary. All keys will be mapped to a similar named attribute and it's value. 
Examples ------- Create a Dataset:: >>> from pyDataverse.models import Dataset >>> ds = Dataset() >>> print(ds._default_json_schema_filename) 'schemas/json/dataset_upload_default_schema.json' """ self._internal_attributes = ["_Dataset" + attr for attr in INTERNAL_ATTRIBUTES] super().__init__(data=data) self._default_json_format = "dataverse_upload" self._default_json_schema_filename = ( "schemas/json/dataset_upload_default_schema.json" ) self._allowed_json_formats = [ "dataverse_upload", "dataverse_download", "dspace", "custom", ] self._json_dataverse_upload_attr = [ "license", "termsOfUse", "termsOfAccess", "fileAccessRequest", "protocol", "authority", "identifier", "citation_displayName", "title", "subtitle", "alternativeTitle", "alternativeURL", "otherId", "author", "datasetContact", "dsDescription", "subject", "keyword", "topicClassification", "publication", "notesText", "producer", "productionDate", "productionPlace", "contributor", "grantNumber", "distributor", "distributionDate", "depositor", "dateOfDeposit", "timePeriodCovered", "dateOfCollection", "kindOfData", "language", "series", "software", "relatedMaterial", "relatedDatasets", "otherReferences", "dataSources", "originOfSources", "characteristicOfSources", "accessToSources", "geospatial_displayName", "geographicCoverage", "geographicUnit", "geographicBoundingBox", "socialscience_displayName", "unitOfAnalysis", "universe", "timeMethod", "dataCollector", "collectorTraining", "frequencyOfDataCollection", "samplingProcedure", "targetSampleSize", "deviationsFromSampleDesign", "collectionMode", "researchInstrument", "dataCollectionSituation", "actionsToMinimizeLoss", "controlOperations", "weighting", "cleaningOperations", "datasetLevelErrorNotes", "responseRate", "samplingErrorEstimates", "otherDataAppraisal", "socialScienceNotes", "journal_displayName", "journalVolumeIssue", "journalArticleType", ] def validate_json(self, filename_schema=None): """Validate JSON formats of Dataset. Check if JSON data structure is valid. Parameters ---------- filename_schema : str Filename of JSON schema with full path. Returns ------- bool `True` if JSON validate correctly, `False` if not. 
Examples ------- Check if JSON is valid for Dataverse API upload:: >>> from pyDataverse.models import Dataset >>> ds = Dataset() >>> data = { >>> 'title': 'pyDataverse study 2019', >>> 'dsDescription': [ >>> {'dsDescriptionValue': 'New study about pyDataverse usage in 2019'} >>> ] >>> } >>> ds.set(data) >>> print(ds.validate_json()) False >>> ds.author = [{'authorName': 'LastAuthor1, FirstAuthor1'}] >>> ds.datasetContact = [{'datasetContactName': 'LastContact1, FirstContact1'}] >>> ds.subject = ['Engineering'] >>> print(ds.validate_json()) True """ if filename_schema is None: filename_schema = os.path.join( os.path.dirname(os.path.realpath(__file__)), self._default_json_schema_filename, ) assert isinstance(filename_schema, str) is_valid = True data_json = self.json(validate=False) if data_json: is_valid = validate_data( json.loads(data_json), filename_schema, file_format="json" ) if not is_valid: return False else: return False # check if all required attributes are set for attr in self.__attr_dict_dv_up_required: if attr in list(self.__dict__.keys()): if not self.__getattribute__(attr): is_valid = False print("Attribute '{0}' is `False`.".format(attr)) else: is_valid = False print("Attribute '{0}' missing.".format(attr)) # check if attributes set are complete where necessary if "timePeriodCovered" in list(self.__dict__.keys()): tp_cov = self.__getattribute__("timePeriodCovered") if tp_cov: for tp in tp_cov: if "timePeriodCoveredStart" in tp or "timePeriodCoveredEnd" in tp: if not ( "timePeriodCoveredStart" in tp and "timePeriodCoveredEnd" in tp ): is_valid = False print("timePeriodCovered attribute missing.") if "dateOfCollection" in list(self.__dict__.keys()): d_coll = self.__getattribute__("dateOfCollection") if d_coll: for d in d_coll: if "dateOfCollectionStart" in d or "dateOfCollectionEnd" in d: if not ( "dateOfCollectionStart" in d and "dateOfCollectionEnd" in d ): is_valid = False print("dateOfCollection attribute missing.") if "author" in list(self.__dict__.keys()): authors = self.__getattribute__("author") if authors: for a in authors: if ( "authorAffiliation" in a or "authorIdentifierScheme" in a or "authorIdentifier" in a ): if "authorName" not in a: is_valid = False print("author attribute missing.") if "datasetContact" in list(self.__dict__.keys()): ds_contac = self.__getattribute__("datasetContact") if ds_contac: for c in ds_contac: if "datasetContactAffiliation" in c or "datasetContactEmail" in c: if "datasetContactName" not in c: is_valid = False print("datasetContact attribute missing.") if "producer" in list(self.__dict__.keys()): producer = self.__getattribute__("producer") if producer: for p in producer: if ( "producerAffiliation" in p or "producerAbbreviation" in p or "producerURL" in p or "producerLogoURL" in p ): if not p["producerName"]: is_valid = False print("producer attribute missing.") if "contributor" in list(self.__dict__.keys()): contributor = self.__getattribute__("contributor") if contributor: for c in contributor: if "contributorType" in c: if "contributorName" not in c: is_valid = False print("contributor attribute missing.") if "distributor" in list(self.__dict__.keys()): distributor = self.__getattribute__("distributor") if distributor: for d in distributor: if ( "distributorAffiliation" in d or "distributorAbbreviation" in d or "distributorURL" in d or "distributorLogoURL" in d ): if "distributorName" not in d: is_valid = False print("distributor attribute missing.") if "geographicBoundingBox" in list(self.__dict__.keys()): bbox = 
self.__getattribute__("geographicBoundingBox") if bbox: for b in bbox: if b: if not ( "westLongitude" in b and "eastLongitude" in b and "northLongitude" in b and "southLongitude" in b ): is_valid = False print("geographicBoundingBox attribute missing.") assert isinstance(is_valid, bool) return is_valid def from_json( self, json_str, data_format=None, validate=True, filename_schema=None ): """Import Dataset metadata from JSON file. Parses in the metadata of a Dataset from different JSON formats. Parameters ---------- json_str : str JSON string to be imported. data_format : str Data formats available for import. See `_allowed_json_formats`. validate : bool `True`, if imported JSON should be validated against a JSON schema file. `False`, if JSON string should be imported directly and not checked if valid. filename_schema : str Filename of JSON schema with full path. Examples ------- Set Dataverse attributes via flat :class:`dict`:: >>> from pyDataverse.models import Dataset >>> ds = Dataset() >>> ds.from_json('tests/data/dataset_upload_min_default.json') >>> ds.title 'Darwin's Finches' """ assert isinstance(json_str, str) json_dict = json.loads(json_str) assert isinstance(json_dict, dict) assert isinstance(validate, bool) if data_format is None: data_format = self._default_json_format assert isinstance(data_format, str) assert data_format in self._allowed_json_formats if filename_schema is None: filename_schema = os.path.join( os.path.dirname(os.path.realpath(__file__)), self._default_json_schema_filename, ) assert isinstance(filename_schema, str) data = {} if data_format == "dataverse_upload": if validate: validate_data(json_dict, filename_schema, file_format="json") # dataset # get first level metadata and parse it automatically for key, val in json_dict["datasetVersion"].items(): if not key == "metadataBlocks": if key in self.__attr_import_dv_up_datasetVersion_values: data[key] = val else: print( "Attribute {0} not valid for import (format={1}).".format( key, data_format ) ) if "metadataBlocks" in json_dict["datasetVersion"]: # citation if "citation" in json_dict["datasetVersion"]["metadataBlocks"]: citation = json_dict["datasetVersion"]["metadataBlocks"]["citation"] if "displayName" in citation: data["citation_displayName"] = citation["displayName"] for field in citation["fields"]: if ( field["typeName"] in self.__attr_import_dv_up_citation_fields_values ): data[field["typeName"]] = field["value"] elif ( field["typeName"] in self.__attr_import_dv_up_citation_fields_arrays ): data[field["typeName"]] = self.__parse_field_array( field["value"], self.__attr_import_dv_up_citation_fields_arrays[ field["typeName"] ], ) elif field["typeName"] == "series": data["series"] = {} if "seriesName" in field["value"]: data["series"]["seriesName"] = field["value"][ "seriesName" ]["value"] if "seriesInformation" in field["value"]: data["series"]["seriesInformation"] = field["value"][ "seriesInformation" ]["value"] else: print( "Attribute {0} not valid for import (dv_up).".format( field["typeName"] ) ) else: # TODO: Exception pass # geospatial if "geospatial" in json_dict["datasetVersion"]["metadataBlocks"]: geospatial = json_dict["datasetVersion"]["metadataBlocks"][ "geospatial" ] if "displayName" in geospatial: self.__setattr__( "geospatial_displayName", geospatial["displayName"] ) for field in geospatial["fields"]: if ( field["typeName"] in self.__attr_import_dv_up_geospatial_fields_values ): data[field["typeName"]] = field["value"] elif ( field["typeName"] in 
self.__attr_import_dv_up_geospatial_fields_arrays ): data[field["typeName"]] = self.__parse_field_array( field["value"], self.__attr_import_dv_up_geospatial_fields_arrays[ field["typeName"] ], ) else: print( "Attribute {0} not valid for import (dv_up).".format( field["typeName"] ) ) else: # TODO: Exception pass # socialscience if "socialscience" in json_dict["datasetVersion"]["metadataBlocks"]: socialscience = json_dict["datasetVersion"]["metadataBlocks"][ "socialscience" ] if "displayName" in socialscience: self.__setattr__( "socialscience_displayName", socialscience["displayName"], ) for field in socialscience["fields"]: if ( field["typeName"] in self.__attr_import_dv_up_socialscience_fields_values ): data[field["typeName"]] = field["value"] elif field["typeName"] == "targetSampleSize": data["targetSampleSize"] = {} if "targetSampleActualSize" in field["value"]: data["targetSampleSize"]["targetSampleActualSize"] = ( field["value"]["targetSampleActualSize"]["value"] ) if "targetSampleSizeFormula" in field["value"]: data["targetSampleSize"]["targetSampleSizeFormula"] = ( field["value"]["targetSampleSizeFormula"]["value"] ) elif field["typeName"] == "socialScienceNotes": data["socialScienceNotes"] = {} if "socialScienceNotesType" in field["value"]: data["socialScienceNotes"]["socialScienceNotesType"] = ( field["value"]["socialScienceNotesType"]["value"] ) if "socialScienceNotesSubject" in field["value"]: data["socialScienceNotes"][ "socialScienceNotesSubject" ] = field["value"]["socialScienceNotesSubject"]["value"] if "socialScienceNotesText" in field["value"]: data["socialScienceNotes"]["socialScienceNotesText"] = ( field["value"]["socialScienceNotesText"]["value"] ) else: print( "Attribute {0} not valid for import (dv_up).".format( field["typeName"] ) ) else: # TODO: Exception pass # journal if "journal" in json_dict["datasetVersion"]["metadataBlocks"]: journal = json_dict["datasetVersion"]["metadataBlocks"]["journal"] if "displayName" in journal: self.__setattr__("journal_displayName", journal["displayName"]) for field in journal["fields"]: if ( field["typeName"] in self.__attr_import_dv_up_journal_fields_values ): data[field["typeName"]] = field["value"] elif ( field["typeName"] in self.__attr_import_dv_up_journal_fields_arrays ): data[field["typeName"]] = self.__parse_field_array( field["value"], self.__attr_import_dv_up_journal_fields_arrays[ field["typeName"] ], ) else: print( "Attribute {0} not valid for import (dv_up).".format( field["typeName"] ) ) else: # TODO: Exception pass elif data_format == "dataverse_download": print("INFO: Not implemented yet.") elif data_format == "dspace": print("INFO: Not implemented yet.") elif data_format == "custom": print("INFO: Not implemented yet.") self.set(data) def __parse_field_array(self, data, attr_list): """Parse arrays of Dataset upload format. Parameters ---------- data : list List of dictionaries of a specific Dataverse API metadata field. attr_list : list List of attributes to be parsed. Returns ------- list List of :class:`dict`s with parsed out key-value pairs. """ assert isinstance(data, list) assert isinstance(attr_list, list) data_tmp = [] for d in data: tmp_dict = {} for key, val in d.items(): if key in attr_list: tmp_dict[key] = val["value"] else: print("Key '{0}' not in attribute list".format(key)) data_tmp.append(tmp_dict) assert isinstance(data_tmp, list) return data_tmp def __generate_field_arrays(self, key, sub_keys): """Generate dicts for array attributes of Dataverse API metadata upload. 
Parameters ---------- key : str Name of attribute. sub_keys : list List of keys to be created. Returns ------- list List of filled :class:`dict`s of metadata for Dataverse API upload. """ assert isinstance(key, str) assert isinstance(sub_keys, list) # check if attribute exists tmp_list = [] if self.__getattribute__(key): attr = self.__getattribute__(key) # loop over list of attribute dict for d in attr: tmp_dict = {} # iterate over key-value pairs for k, v in d.items(): # check if key is in attribute list if k in sub_keys: multiple = None type_class = None if isinstance(v, list): multiple = True else: multiple = False if k in self.__attr_dict_dv_up_type_class_primitive: type_class = "primitive" elif k in self.__attr_dict_dv_up_type_class_compound: type_class = "compound" elif ( k in self.__attr_dict_dv_up_type_class_controlled_vocabulary ): type_class = "controlledVocabulary" tmp_dict[k] = {} tmp_dict[k]["typeName"] = k tmp_dict[k]["typeClass"] = type_class tmp_dict[k]["multiple"] = multiple tmp_dict[k]["value"] = v tmp_list.append(tmp_dict) assert isinstance(tmp_list, list) return tmp_list def json(self, data_format=None, validate=True, filename_schema=None): """Create Dataset JSON from attributes. Parameters ---------- format : str Data formats to be validated. See `_allowed_json_formats`. validate : bool `True`, if created JSON should be validated against a JSON schema file. `False`, if JSON string should be created and not checked if valid. filename_schema : str Filename of JSON schema with full path. Returns ------- str The data as a JSON string. """ assert isinstance(validate, bool) if data_format is None: data_format = self._default_json_format assert isinstance(data_format, str) assert data_format in self._allowed_json_formats if filename_schema is None: filename_schema = os.path.join( os.path.dirname(os.path.realpath(__file__)), self._default_json_schema_filename, ) assert isinstance(filename_schema, str) data = {} if data_format == "dataverse_upload": data_dict = self.get() data["datasetVersion"] = {} data["datasetVersion"]["metadataBlocks"] = {} citation = {} citation["fields"] = [] # dataset # Generate first level attributes for attr in self.__attr_import_dv_up_datasetVersion_values: if attr in data_dict: data["datasetVersion"][attr] = data_dict[attr] # citation if "citation_displayName" in data_dict: citation["displayName"] = data_dict["citation_displayName"] # Generate first level attributes for attr in self.__attr_import_dv_up_citation_fields_values: if attr in data_dict: v = data_dict[attr] if isinstance(v, list): multiple = True else: multiple = False if attr in self.__attr_dict_dv_up_type_class_primitive: type_class = "primitive" elif attr in self.__attr_dict_dv_up_type_class_compound: type_class = "compound" elif ( attr in self.__attr_dict_dv_up_type_class_controlled_vocabulary ): type_class = "controlledVocabulary" citation["fields"].append( { "typeName": attr, "multiple": multiple, "typeClass": type_class, "value": v, } ) # Generate fields attributes for ( key, val, ) in self.__attr_import_dv_up_citation_fields_arrays.items(): if key in data_dict: v = data_dict[key] citation["fields"].append( { "typeName": key, "multiple": True, "typeClass": "compound", "value": self.__generate_field_arrays(key, val), } ) # Generate series attributes if "series" in data_dict: series = data_dict["series"] tmp_dict = {} if "seriesName" in series: if series["seriesName"] is not None: tmp_dict["seriesName"] = {} tmp_dict["seriesName"]["typeName"] = "seriesName" 
tmp_dict["seriesName"]["multiple"] = False tmp_dict["seriesName"]["typeClass"] = "primitive" tmp_dict["seriesName"]["value"] = series["seriesName"] if "seriesInformation" in series: if series["seriesInformation"] is not None: tmp_dict["seriesInformation"] = {} tmp_dict["seriesInformation"]["typeName"] = "seriesInformation" tmp_dict["seriesInformation"]["multiple"] = False tmp_dict["seriesInformation"]["typeClass"] = "primitive" tmp_dict["seriesInformation"]["value"] = series[ "seriesInformation" ] citation["fields"].append( { "typeName": "series", "multiple": False, "typeClass": "compound", "value": tmp_dict, } ) # geospatial for attr in ( self.__attr_import_dv_up_geospatial_fields_values + list(self.__attr_import_dv_up_geospatial_fields_arrays.keys()) + ["geospatial_displayName"] ): if attr in data_dict: geospatial = {} if attr != "geospatial_displayName": geospatial["fields"] = [] break if "geospatial_displayName" in data_dict: geospatial["displayName"] = data_dict["geospatial_displayName"] # Generate first level attributes for attr in self.__attr_import_dv_up_geospatial_fields_values: if attr in data_dict: v = data_dict[attr] if isinstance(v, list): multiple = True else: multiple = False if attr in self.__attr_dict_dv_up_type_class_primitive: type_class = "primitive" elif attr in self.__attr_dict_dv_up_type_class_compound: type_class = "compound" elif ( attr in self.__attr_dict_dv_up_type_class_controlled_vocabulary ): type_class = "controlledVocabulary" geospatial["fields"].append( { "typeName": attr, "multiple": multiple, "typeClass": type_class, "value": v, } ) # Generate fields attributes for ( key, val, ) in self.__attr_import_dv_up_geospatial_fields_arrays.items(): if key in data_dict: geospatial["fields"].append( { "typeName": key, "multiple": True, "typeClass": "compound", "value": self.__generate_field_arrays(key, val), } ) # socialscience for attr in self.__attr_import_dv_up_socialscience_fields_values + [ "socialscience_displayName" ]: if attr in data_dict: socialscience = {} if attr != "socialscience_displayName": socialscience["fields"] = [] break if "socialscience_displayName" in data_dict: socialscience["displayName"] = data_dict["socialscience_displayName"] # Generate first level attributes for attr in self.__attr_import_dv_up_socialscience_fields_values: if attr in data_dict: v = data_dict[attr] if isinstance(v, list): multiple = True else: multiple = False if attr in self.__attr_dict_dv_up_type_class_primitive: type_class = "primitive" elif attr in self.__attr_dict_dv_up_type_class_compound: type_class = "compound" elif ( attr in self.__attr_dict_dv_up_type_class_controlled_vocabulary ): type_class = "controlledVocabulary" socialscience["fields"].append( { "typeName": attr, "multiple": multiple, "typeClass": type_class, "value": v, } ) # Generate targetSampleSize attributes if "targetSampleSize" in data_dict: target_sample_size = data_dict["targetSampleSize"] tmp_dict = {} if "targetSampleActualSize" in target_sample_size: if target_sample_size["targetSampleActualSize"] is not None: tmp_dict["targetSampleActualSize"] = {} tmp_dict["targetSampleActualSize"]["typeName"] = ( "targetSampleActualSize" ) tmp_dict["targetSampleActualSize"]["multiple"] = False tmp_dict["targetSampleActualSize"]["typeClass"] = "primitive" tmp_dict["targetSampleActualSize"]["value"] = ( target_sample_size["targetSampleActualSize"] ) if "targetSampleSizeFormula" in target_sample_size: if target_sample_size["targetSampleSizeFormula"] is not None: tmp_dict["targetSampleSizeFormula"] = {} 
tmp_dict["targetSampleSizeFormula"]["typeName"] = ( "targetSampleSizeFormula" ) tmp_dict["targetSampleSizeFormula"]["multiple"] = False tmp_dict["targetSampleSizeFormula"]["typeClass"] = "primitive" tmp_dict["targetSampleSizeFormula"]["value"] = ( target_sample_size["targetSampleSizeFormula"] ) socialscience["fields"].append( { "typeName": "targetSampleSize", "multiple": False, "typeClass": "compound", "value": tmp_dict, } ) # Generate socialScienceNotes attributes if "socialScienceNotes" in data_dict: social_science_notes = data_dict["socialScienceNotes"] tmp_dict = {} if "socialScienceNotesType" in social_science_notes: if social_science_notes["socialScienceNotesType"] is not None: tmp_dict["socialScienceNotesType"] = {} tmp_dict["socialScienceNotesType"]["typeName"] = ( "socialScienceNotesType" ) tmp_dict["socialScienceNotesType"]["multiple"] = False tmp_dict["socialScienceNotesType"]["typeClass"] = "primitive" tmp_dict["socialScienceNotesType"]["value"] = ( social_science_notes["socialScienceNotesType"] ) if "socialScienceNotesSubject" in social_science_notes: if social_science_notes["socialScienceNotesSubject"] is not None: tmp_dict["socialScienceNotesSubject"] = {} tmp_dict["socialScienceNotesSubject"]["typeName"] = ( "socialScienceNotesSubject" ) tmp_dict["socialScienceNotesSubject"]["multiple"] = False tmp_dict["socialScienceNotesSubject"]["typeClass"] = "primitive" tmp_dict["socialScienceNotesSubject"]["value"] = ( social_science_notes["socialScienceNotesSubject"] ) if "socialScienceNotesText" in social_science_notes: if social_science_notes["socialScienceNotesText"] is not None: tmp_dict["socialScienceNotesText"] = {} tmp_dict["socialScienceNotesText"]["typeName"] = ( "socialScienceNotesText" ) tmp_dict["socialScienceNotesText"]["multiple"] = False tmp_dict["socialScienceNotesText"]["typeClass"] = "primitive" tmp_dict["socialScienceNotesText"]["value"] = ( social_science_notes["socialScienceNotesText"] ) socialscience["fields"].append( { "typeName": "socialScienceNotes", "multiple": False, "typeClass": "compound", "value": tmp_dict, } ) # journal for attr in ( self.__attr_import_dv_up_journal_fields_values + list(self.__attr_import_dv_up_journal_fields_arrays.keys()) + ["journal_displayName"] ): if attr in data_dict: journal = {} if attr != "journal_displayName": journal["fields"] = [] break if "journal_displayName" in data_dict: journal["displayName"] = data_dict["journal_displayName"] # Generate first level attributes for attr in self.__attr_import_dv_up_journal_fields_values: if attr in data_dict: v = data_dict[attr] if isinstance(v, list): multiple = True else: multiple = False if attr in self.__attr_dict_dv_up_type_class_primitive: type_class = "primitive" elif attr in self.__attr_dict_dv_up_type_class_compound: type_class = "compound" elif ( attr in self.__attr_dict_dv_up_type_class_controlled_vocabulary ): type_class = "controlledVocabulary" journal["fields"].append( { "typeName": attr, "multiple": multiple, "typeClass": type_class, "value": v, } ) # Generate fields attributes for ( key, val, ) in self.__attr_import_dv_up_journal_fields_arrays.items(): if key in data_dict: journal["fields"].append( { "typeName": key, "multiple": True, "typeClass": "compound", "value": self.__generate_field_arrays(key, val), } ) data["datasetVersion"]["metadataBlocks"]["citation"] = citation if "socialscience" in locals(): data["datasetVersion"]["metadataBlocks"]["socialscience"] = ( socialscience ) if "geospatial" in locals(): data["datasetVersion"]["metadataBlocks"]["geospatial"] = 
geospatial if "journal" in locals(): data["datasetVersion"]["metadataBlocks"]["journal"] = journal elif data_format == "dspace": data = None print("INFO: Not implemented yet.") elif data_format == "custom": data = None print("INFO: Not implemented yet.") if validate: validate_data(data, filename_schema) json_str = json.dumps(data, indent=2) assert isinstance(json_str, str) return json_str class Datafile(DVObject): """Base class for the Dataverse data type `Datafile`. Attributes ---------- _default_json_format : str Default JSON data format. _default_json_schema_filename : str Default JSON schema filename. _allowed_json_formats : list List of all possible JSON data formats. _json_dataverse_upload_attr : list List of all attributes to be exported in :func:`json`. """ def __init__(self, data=None): """Init :class:`Datafile()`. Inherits attributes from parent :class:`DVObject()` Parameters ---------- data : dict Flat dictionary. All keys will be mapped to a similar named attribute and it's value. Examples ------- Create a Datafile:: >>> from pyDataverse.models import Datafile >>> df = Datafile() >>> print(df._default_json_schema_filename) 'schemas/json/datafile_upload_schema.json' """ self._internal_attributes = ["_Datafile" + attr for attr in INTERNAL_ATTRIBUTES] super().__init__(data=data) self._default_json_format = "dataverse_upload" self._default_json_schema_filename = "schemas/json/datafile_upload_schema.json" self._allowed_json_formats = ["dataverse_upload", "dataverse_download"] self._json_dataverse_upload_attr = [ "description", "categories", "restrict", "label", "directoryLabel", "pid", "filename", ] pyDataverse-0.3.4/pyDataverse/schemas/000077500000000000000000000000001467256651400177375ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/schemas/json/000077500000000000000000000000001467256651400207105ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/schemas/json/datafile_upload_schema.json000066400000000000000000000024111467256651400262360ustar00rootroot00000000000000{ "$schema": "http://json-schema.org/draft-07/schema", "$id": "https://github.com/GDCC/pyDataverse/schemas/json/datafile_upload_schema.json", "type": "object", "title": "Datafile JSON upload schema", "description": "Describes the full Datafile metadata JSON file structure for a Dataverse API upload.", "default": {}, "required": [ "pid", "filename" ], "additionalProperties": false, "properties": { "description": { "$id": "#/properties/description", "type": "string" }, "categories": { "$id": "#/properties/categories", "type": "array", "additionalItems": false, "items": { "anyOf": [ { "$id": "#/properties/categories/items/anyOf/0", "type": "string" } ], "$id": "#/properties/categories/items" } }, "restrict": { "$id": "#/properties/restrict", "type": "boolean" }, "pid": { "$id": "#/properties/pid", "type": "string" }, "filename": { "$id": "#/properties/filename", "type": "string" }, "label": { "$id": "#/properties/label", "type": "string" }, "directoryLabel": { "$id": "#/properties/directoryLabel", "type": "string" } } } pyDataverse-0.3.4/pyDataverse/schemas/json/dataset_upload_default_schema.json000066400000000000000000000723171467256651400276320ustar00rootroot00000000000000{ "$schema": "http://json-schema.org/draft-07/schema", "$id": "http://example.com/example.json", "type": "object", "title": "The root schema", "description": "The root schema comprises the entire JSON document.", "properties": { "datasetVersion": { "$id": "#/properties/datasetVersion", "type": "object", "properties": { "license": { 
"$id": "#/properties/datasetVersion/properties/license", "type": "string" }, "termsOfUse": { "$id": "#/properties/datasetVersion/properties/termsOfUse", "type": "string" }, "termsOfAccess": { "$id": "#/properties/datasetVersion/properties/termsOfAccess", "type": "string" }, "metadataBlocks": { "$id": "#/properties/datasetVersion/properties/metadataBlocks", "type": "object", "properties": { "citation": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation", "type": "object", "properties": { "displayName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/displayName", "type": "string" }, "fields": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/fields", "type": "array", "items": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/fields/items/anyOf/0", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/fields/items/anyOf/0/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/fields/items/anyOf/0/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/fields/items/anyOf/0/properties/typeClass", "type": "string" }, "value": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/fields/items/anyOf/0/properties/value/anyOf/0", "type": "array" }, { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/fields/items/anyOf/0/properties/value/anyOf/1", "type": "string" }, { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/fields/items/anyOf/0/properties/value/anyOf/2", "type": "object" } ] } } } ], "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/citation/properties/fields/items" } } } }, "geospatial": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial", "type": "object", "properties": { "displayName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/displayName", "type": "string" }, "fields": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields", "type": "array", "items": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/typeClass", "type": "string" }, "value": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0", "type": "array" }, { "$id": 
"#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/1", "type": "object", "properties": { "country": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/country", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/country/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/country/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/country/properties/typeClass", "type": "string" }, "value": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/country/properties/value", "type": "string" } } }, "state": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/state", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/state/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/state/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/state/properties/typeClass", "type": "string" }, "value": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/state/properties/value", "type": "string" } } }, "city": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/city", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/city/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/city/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/city/properties/typeClass", "type": "string" }, "value": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/city/properties/value", "type": "string" } } }, "otherGeographicCoverage": { "$id": 
"#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/otherGeographicCoverage", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/otherGeographicCoverage/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/otherGeographicCoverage/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/otherGeographicCoverage/properties/typeClass", "type": "string" }, "value": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/otherGeographicCoverage/properties/value", "type": "string" } } } } } ], "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items/anyOf/0/properties/value/items" } } } ], "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/geospatial/properties/fields/items" } } } }, "socialscience": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience", "type": "object", "properties": { "displayName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/displayName", "type": "string" }, "fields": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields", "type": "array", "items": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items/anyOf/0", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items/anyOf/0/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items/anyOf/0/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items/anyOf/0/properties/typeClass", "type": "string" }, "value": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items/anyOf/0/properties/value", "type": "array", "items": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items/anyOf/0/properties/value/items/anyOf/0", "type": "string" } ], "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items/anyOf/0/properties/value/items" } }, { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items/anyOf/0/properties/value", "type": "string" }, { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items/anyOf/0/properties/value", "type": "object", "properties": { "targetSampleActualSize": { "type": "object" }, "targetSampleSizeFormula": { "type": "object", "properties": { "typeName": { 
"$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/astrophysics/properties/fields/items/anyOf/0/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/astrophysics/properties/fields/items/anyOf/0/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/astrophysics/properties/fields/items/anyOf/0/properties/typeClass", "type": "string" }, "value": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/astrophysics/properties/fields/items/anyOf/0/properties/value", "type": "string" } } } } } ] } } } ], "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/socialscience/properties/fields/items" } } } }, "journal": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal", "type": "object", "properties": { "displayName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/displayName", "type": "string" }, "fields": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields", "type": "array", "items": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/typeClass", "type": "string" }, "value": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value", "type": "array", "items": { "anyOf": [ { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0", "type": "object", "properties": { "journalVolume": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalVolume", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalVolume/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalVolume/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalVolume/properties/typeClass", "type": "string" }, "value": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalVolume/properties/value", "type": "string" } } }, "journalIssue": { "$id": 
"#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalIssue", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalIssue/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalIssue/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalIssue/properties/typeClass", "type": "string" }, "value": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalIssue/properties/value", "type": "string" } } }, "journalPubDate": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalPubDate", "type": "object", "properties": { "typeName": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalPubDate/properties/typeName", "type": "string" }, "multiple": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalPubDate/properties/multiple", "type": "boolean" }, "typeClass": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalPubDate/properties/typeClass", "type": "string" }, "value": { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items/anyOf/0/properties/journalPubDate/properties/value", "type": "string" } } } } } ] } }, { "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value", "type": "string" } ], "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items/anyOf/0/properties/value/items" } } } ], "$id": "#/properties/datasetVersion/properties/metadataBlocks/properties/journal/properties/fields/items" } } } } } } } } } } pyDataverse-0.3.4/pyDataverse/schemas/json/dataverse_upload_schema.json000066400000000000000000000030511467256651400264440ustar00rootroot00000000000000{ "$schema": "http://json-schema.org/draft-07/schema", "$id": "https://github.com/GDCC/pyDataverse/schemas/json/dataverse_upload_schema.json", "type": "object", "title": "Dataverse JSON upload schema", "description": "Describes the full Dataverse metadata JSON file structure for a Dataverse API upload.", "required": [ "name", "alias", "dataverseContacts" ], "additionalProperties": false, "properties": { "name": { "$id": "#/properties/name", "type": "string" }, "alias": { "$id": "#/properties/alias", "type": "string" }, "dataverseContacts": { "$id": "#/properties/dataverseContacts", "type": "array", "additionalItems": false, "items": { "anyOf": [ { "$id": "#/properties/dataverseContacts/items/anyOf/0", "type": "object", "required": 
[ "contactEmail" ], "additionalProperties": false, "properties": { "contactEmail": { "$id": "#/properties/dataverseContacts/items/anyOf/0/properties/contactEmail", "type": "string" } } } ], "$id": "#/properties/dataverseContacts/items" } }, "affiliation": { "$id": "#/properties/affiliation", "type": "string" }, "description": { "$id": "#/properties/description", "type": "string" }, "dataverseType": { "$id": "#/properties/dataverseType", "type": "string" } } } pyDataverse-0.3.4/pyDataverse/schemas/json/dspace_schema.json000066400000000000000000000212101467256651400243560ustar00rootroot00000000000000{ "$schema": "http://json-schema.org/schema#", "type": "object", "properties": { "responseHeader": { "type": "object", "properties": { "status": { "type": "integer" }, "QTime": { "type": "integer" }, "params": { "type": "object", "properties": { "q": { "type": "string" }, "indent": { "type": "string" }, "wt": { "type": "string" } } } } }, "response": { "type": "object", "properties": { "numFound": { "type": "integer" }, "start": { "type": "integer" }, "docs": { "type": "array", "items": { "type": "object", "properties": { "ZANo": { "type": "string" }, "id": { "type": "string" }, "AccessClass": { "type": "string" }, "NumberOfUnits": { "type": "string" }, "src": { "type": "string" }, "Studynumber": { "type": "string" }, "AnalysisSystem": { "type": "string" }, "NumberOfVariables": { "type": "string" }, "SDC": { "type": "string" }, "Universe": { "type": "string" }, "max": { "type": "string" }, "min": { "type": "string" }, "Abstract": { "type": "string" }, "CollectionMethod": { "type": "string" }, "Title": { "type": "string" }, "SelectionMethod": { "type": "string" }, "Remarks": { "type": "string" }, "DatCollector": { "type": "string" }, "EAbstract": { "type": "string" }, "EUniverse": { "type": "string" }, "ECollectionMethod": { "type": "string" }, "ETitle": { "type": "string" }, "ESelectionMethod": { "type": "string" }, "EDatCollector": { "type": "string" }, "PriInvestigator": { "type": "array", "items": { "type": "string" } }, "Institution": { "type": "array", "items": { "type": "string" } }, "EGeogFree": { "type": "array", "items": { "type": "string" } }, "GeogISO": { "type": "array", "items": { "type": "string" } }, "EGeogName": { "type": "array", "items": { "type": "string" } }, "GeogFree": { "type": "array", "items": { "type": "string" } }, "GeogName": { "type": "array", "items": { "type": "string" } }, "CategoryNo": { "type": "array", "items": { "type": "string" } }, "ECategory": { "type": "array", "items": { "type": "string" } }, "Category": { "type": "array", "items": { "type": "string" } }, "TopicNo": { "type": "array", "items": { "type": "string" } }, "ETopic": { "type": "array", "items": { "type": "string" } }, "Topic": { "type": "array", "items": { "type": "string" } }, "VersionYear": { "type": "string" }, "EVersionName": { "type": "string" }, "PublicationYear": { "type": "string" }, "VersionDate": { "type": "string" }, "DOInumber": { "type": "array", "items": { "type": "string" } }, "VersionNumber": { "type": "string" }, "VersionName": { "type": "string" }, "DOI": { "type": "array", "items": { "type": "string" } }, "CollDateMin": { "type": "string" }, "CollDateMax": { "type": "string" }, "GroupNo": { "type": "array", "items": { "type": "string" } }, "EGroupName": { "type": "array", "items": { "type": "string" } }, "GroupName": { "type": "array", "items": { "type": "string" } }, "DatasetFile": { "type": "array", "items": { "type": "string" } }, "DatasetSize": { "type": "array", "items": { 
"type": "string" } }, "DatasetDescription": { "type": "array", "items": { "type": "string" } }, "Dataset": { "type": "array", "items": { "type": "string" } }, "EDatasetDescription": { "type": "array", "items": { "type": "string" } }, "QuestionnaireSize": { "type": "array", "items": { "type": "string" } }, "EQuestionnaireDescription": { "type": "array", "items": { "type": "string" } }, "QuestionnaireFile": { "type": "array", "items": { "type": "string" } }, "Questionnaire": { "type": "array", "items": { "type": "string" } }, "QuestionnaireDescription": { "type": "array", "items": { "type": "string" } }, "CodebookSize": { "type": "array", "items": { "type": "string" } }, "Codebook": { "type": "array", "items": { "type": "string" } }, "ECodebookDescription": { "type": "array", "items": { "type": "string" } }, "CodebookDescription": { "type": "array", "items": { "type": "string" } }, "CodebookFile": { "type": "array", "items": { "type": "string" } } } } } } } } } pyDataverse-0.3.4/pyDataverse/templates/000077500000000000000000000000001467256651400203125ustar00rootroot00000000000000pyDataverse-0.3.4/pyDataverse/templates/datafiles.csv000066400000000000000000000025431467256651400227670ustar00rootroot00000000000000"attribute","org.datafile_id","org.dataset_id","org.filename","org.to_upload","org.is_uploaded","org.to_delete","org.is_deleted","org.to_update","org.is_updated","dv.datafile_id","dv.description","dv.categories","dv.restrict","dv.label","dv.directoryLabel","alma.title","alma.pages","alma.year" "description","Unique identifier for a Datafile from the organizational perspective.","Unique identifier for a Dataset from the organizational perspective. Relates to the column in datasets.csv.","Filename without path.","Datafile is to be uploaded.","Datafile is uploaded.","Datafile is to be deleted.","Datafile is deleted.","Datafile Metadata is to be updated.","Datafile Metadata is updated.","Datafile ID in Dataverse.","Description for the file","Categories for the file.","File restriction.","Title","Directory, the file should be stored in.","Title for Alma.","Number of pages of work.", "type","serial","numeric","string","boolean","boolean","boolean","boolean","boolean","boolean","string","string","String, JSON object [""VALUE"", ""VALUE""] Labels: ""Documentation"", ""Data"", ""Code""","boolean","string","string","string","string","string" "example",1,1,"01035_en_q.pdf","TRUE","TRUE","TRUE","TRUE","TRUE","TRUE",634,"My description bbb.","[""Data""]","FALSE","Text Report","data/subdir1","Text Report",23,1997 "multiple",,,,,,,,,,,,,,,,,, "sub_keys",,,,,,,,,,,,,,,,,, 
pyDataverse-0.3.4/pyDataverse/templates/datasets.csv000066400000000000000000000225701467256651400226450ustar00rootroot00000000000000"attribute","org.dataset_id","org.dataverse_id","org.doi","org.privateurl","org.to_upload","org.is_uploaded","org.to_publish","org.is_published","org.to_delete","org.is_deleted","org.to_update","org.is_updated","dv.license","dv.termsOfAccess","dv.termsOfUse","dv.otherId","dv.title","dv.subtitle","dv.alternativeTitle","dv.series","dv.notesText","dv.author","dv.dsDescription","dv.subject","dv.keyword","dv.topicClassification","dv.language","dv.grantNumber","dv.dateOfCollection","dv.kindOfData","dv.dataSources","dv.accessToSources","dv.alternativeURL","dv.characteristicOfSources","dv.dateOfDeposit","dv.depositor","dv.distributionDate","dv.otherReferences","dv.productionDate","dv.productionPlace","dv.contributor","dv.relatedDatasets","dv.relatedMaterial","dv.datasetContact","dv.distributor","dv.producer","dv.publication","dv.software","dv.timePeriodCovered","dv.geographicUnit","dv.geographicBoundingBox","dv.geographicCoverage","dv.actionsToMinimizeLoss","dv.cleaningOperations","dv.collectionMode","dv.collectorTraining","dv.controlOperations","dv.dataCollectionSituation","dv.dataCollector","dv.datasetLevelErrorNotes","dv.deviationsFromSampleDesign","dv.frequencyOfDataCollection","dv.otherDataAppraisal","dv.socialScienceNotes","dv.researchInstrument","dv.responseRate","dv.samplingErrorEstimates","dv.samplingProcedure","dv.unitOfAnalysis","dv.universe","dv.timeMethod","dv.weighting","dv.fileAccessRequest" "description","Unique identifier for a Dataset from the organizational perspective.","Unique identifier for the dataverse coming from the organization. Ether dataverse ID or dataverse alias. Relates to column in datverses.csv.","DOI related to the Dataset.","Private URL related to the Dataset.","Is the Dataset to be uploaded.","Is the Dataset uploaded.","Is the Dataset to be published.","Is the Dataset published.","Is the Dataset to be deleted.","Is the Dataset deleted.","Dataset Metadata is to be updated.","Dataset Metadata is updated.","License text.","Terms of Access text.","Terms of Use text.","Other identifier related to the Dataset.","Title","Subtitle","Alternative title.","Series, which the Dataset is part of.",,"Author(s), work creator(s) of the Dataset content.","Description","Subject","Keyword(s)","Topic(s)","Language(s)","Grant number(s)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, "type","serial","serial/string","string","string","boolean","boolean","boolean","boolean","boolean","boolean","boolean","boolean","string","string","string","String JSON object [{""otherIdAgency"": ""VALUE"", ""otherIdValue"": ""VALUE""}]","string","string","string","string JSON object {""seriesName"": ""VALUE"", ""seriesInformation"": ""VALUE""}","string","String JSON object [{""authorName"": ""VALUE"", ""authorAffiliation"": ""VALUE"", ""authorIdentifierScheme"": ""VALUE"", ""authorIdentifier"": ""VALUE""}]","String JSON object [{""dsDescriptionValue"": ""VALUE"", ""dsDescriptionDate"": ""VALUE""}]","String, JSON object [""VALUE"", ""VALUE""]","String, JSON object [{""keywordValue"": ""VALUE"", ""keywordVocabulary"": ""VALUE"", ""keywordVocabularyURI"": ""VALUE""}]","String, JSON object [{""topicClassValue"": ""VALUE"", ""topicClassVocab"": ""VALUE"", ""topicClassVocabURI"": ""VALUE""}]","String, JSON object [""VALUE"", ""VALUE""]","String JSON object [{""grantNumberAgency"": ""VALUE"", ""grantNumberValue"": ""VALUE""}]","String, JSON object [{""dateOfCollectionStart"": 
""VALUE"", ""dateOfCollectionEnd"": ""VALUE""}] DateOfCollectionStart & dateOfCollectionEnd → String, YYYY-MM-DD ","string, JSON object [""VALUE"", ""VALUE""]","string, JSON object [""VALUE"", ""VALUE""]","string","string; URL","string","string; YYYY-MM-DD","string","string; YYYY-MM-DD","string, JSON object [""VALUE"", ""VALUE""]","string; YYYY-MM-DD","string","String, JSON object [{""contributorType"": ""VALUE"", ""contributorName"": ""VALUE""}]","string, JSON object [""VALUE"", ""VALUE""]","string, JSON object [""VALUE"", ""VALUE""]","String, JSON object [{""datasetContactName"": ""VALUE"", ""datasetContactAffiliation"": ""VALUE"", ""datasetContactEmail"": ""VALUE""}]","String, JSON object [{""distributorName"": ""VALUE"", ""distributorAffiliation"": ""VALUE"", ""distributorAbbreviation"": ""VALUE"", ""distributorURL"": ""VALUE"", ""distributorLogoURL"": ""VALUE""}]","String, JSON object [{""producerName"": ""VALUE"", ""producerAffiliation"": ""VALUE"", ""producerAbbreviation"": ""VALUE"", ""producerURL"": ""VALUE"", ""producerLogoURL"": ""VALUE""}]","String, JSON object [{""publicationCitation"": ""VALUE"", ""publicationIDType"": ""VALUE"", ""publicationIDNumber"": ""VALUE"", ""publicationURL"": ""VALUE""}]","String, JSON object [{""softwareName"": ""VALUE"", ""softwareVersion"": ""VALUE""}]","String, JSON object [{""timePeriodCoveredStart"": ""VALUE"", ""timePeriodCoveredEnd"": ""VALUE""}]","string, JSON object [""VALUE"", ""VALUE""]","String, JSON object [{""westLongitude"": ""VALUE"", ""eastLongitude"": ""VALUE"", ""northLongitude"": ""VALUE"", ""southLongitude"": ""VALUE""}]","String, JSON object [{""country"": ""VALUE"", ""state"": ""VALUE"", ""city"": ""VALUE"", ""otherGeographicCoverage"": ""VALUE""}]","string","string","string","string","string","string","string","string","string","string","string","String JSON object [{""socialScienceNotesType"": ""VALUE"", ""socialScienceNotesSubject"": ""VALUE"", ""socialScienceNotesText"": ""VALUE""}]","string","string","string","string","string, JSON object [""VALUE"", ""VALUE""]","string, JSON object [""VALUE"", ""VALUE""]","string","string","bool" "example",1,1,"doi:10.11587/19ZW6I",,"TRUE","TRUE","TRUE","TRUE","TRUE","TRUE","TRUE","TRUE","CC0","Terms of Access","CC0 Waiver","[{""otherIdAgency"": ""OtherIDAgency1"", ""otherIdValue"": ""OtherIDIdentifier1""}]","Replication Data for: Title","Subtitle","Alternative Title","{""seriesName"": ""SeriesName"", ""seriesInformation"": ""SeriesInformation""}","Notes1","[{""authorName"": ""LastAuthor1, FirstAuthor1"", ""authorAffiliation"": ""AuthorAffiliation1"", ""authorIdentifierScheme"": ""ORCID"", ""authorIdentifier"": ""AuthorIdentifier1""}]","[{""dsDescriptionValue"": ""DescriptionText2"", ""dsDescriptionDate"": ""1000-02-02""}]","[""Agricultural Sciences"", ""Business and Management"", ""Engineering"", ""Law""]","[{""keywordValue"": ""KeywordTerm1"", ""keywordVocabulary"": ""KeywordVocabulary1"", ""keywordVocabularyURI"": ""http://KeywordVocabularyURL1.org""}]","[{""topicClassValue"": ""Topic Class Value1"", ""topicClassVocab"": ""Topic Classification Vocabulary"", ""topicClassVocabURI"": ""http://www.topicURL.net""}]","[""English"", ""German""]","[{""grantNumberAgency"": ""GrantInformationGrantAgency1"", ""grantNumberValue"": ""GrantInformationGrantNumber1""}]","[{""dateOfCollectionStart"": ""1006-01-01"", ""dateOfCollectionEnd"": ""1006-01-01""}]","[""KindOfData1"", ""KindOfData2""]","[""DataSources1"", 
""DataSources2""]","DocumentationAndAccessToSources","http://AlternativeURL.org","CharacteristicOfSourcesNoted","1002-01-01","LastDepositor, FirstDepositor","1004-01-01","[""OtherReferences1"", ""OtherReferences2""]","1003-01-01","ProductionPlace","[{""contributorType"": ""Data Collector"", ""contributorName"": ""LastContributor1, FirstContributor1""}]","[""RelatedDatasets1"", ""RelatedDatasets2""]","[""RelatedMaterial1"", ""RelatedMaterial2""]","[{""datasetContactName"": ""LastContact1, FirstContact1"", ""datasetContactAffiliation"": ""ContactAffiliation1"", ""datasetContactEmail"": ""ContactEmail1@mailinator.com""}]","[{""distributorName"": ""LastDistributor1, FirstDistributor1"", ""distributorAffiliation"": ""DistributorAffiliation1"", ""distributorAbbreviation"": ""DistributorAbbreviation1"", ""distributorURL"": ""http://DistributorURL1.org"", ""distributorLogoURL"": ""http://DistributorLogoURL1.org""}]","[{""producerName"": ""LastProducer1, FirstProducer1"", ""producerAffiliation"": ""ProducerAffiliation1"", ""producerAbbreviation"": ""ProducerAbbreviation1"", ""producerURL"": ""http://ProducerURL1.org"", ""producerLogoURL"": ""http://ProducerLogoURL1.org""}]","[{""publicationCitation"": ""RelatedPublicationCitation1"", ""publicationIDType"": ""ark"", ""publicationIDNumber"": ""RelatedPublicationIDNumber1"", ""publicationURL"": ""http://RelatedPublicationURL1.org""}]","[{""softwareName"": ""SoftwareName1"", ""softwareVersion"": ""SoftwareVersion1""}]","[{""timePeriodCoveredStart"": ""1005-01-01"", ""timePeriodCoveredEnd"": ""1005-01-02""}]","[""GeographicUnit1"", ""GeographicUnit2""]","[{""westLongitude"": ""10"", ""eastLongitude"": ""20"", ""northLongitude"": ""30"", ""southLongitude"": ""40""}]","[{""country"": ""Afghanistan"", ""state"": ""GeographicCoverageStateProvince1"", ""city"": ""GeographicCoverageCity1"", ""otherGeographicCoverage"": ""GeographicCoverageOther1""}]","ActionsToMinimizeLosses","CleaningOperations","CollectionMode","CollectorTraining","ControlOperations","CharacteristicsOfDataCollectionSituation","LastDataCollector1, FirstDataCollector1","StudyLevelErrorNotes","MajorDeviationsForSampleDesign","Frequency","OtherFormsOfDataAppraisal","[{""socialScienceNotesType"": ""NotesType"", ""socialScienceNotesSubject"": ""NotesSubject"", ""socialScienceNotesText"": ""NotesText""}]","TypeOfResearchInstrument","ResponseRate","EstimatesOfSamplingError","SamplingProcedure","[""UnitOfAnalysis1"", ""UnitOfAnalysis2""]","[""Universe1"", ""Universe2""]","TimeMethod","Weighting","True" "multiple",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, "sub_keys",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, pyDataverse-0.3.4/pyDataverse/templates/dataverses.csv000066400000000000000000000030411467256651400231660ustar00rootroot00000000000000"attribute","org.dataverse_id","org.to_upload","org.is_uploaded","org.to_publish","org.is_published","org.to_delete","org.is_deleted","org.to_update","org.is_updated","dv.dataverse_id","dv.dataverse","dv.affiliation","dv.alias","dv.dataverseContacts","dv.dataverseType","dv.description","dv.name" "description","Unique identifier for the dataverse coming from the organization.","Dataverse should be uploaded.","Dataverse is uploaded.","Dataverse is to be published.","Dataverse is published.","Dataverse is to be published.","Dataverse is published.","Dataverse Metadata is to be updated.","Dataverse Metadata is updated.","Dataverse ID in Dataverse.","Parent dataverse.","Affiliation of Dataverse","Alias of 
Dataverse","Contact Email(s) of Dataverse.","Type of Dataverse.","Description text of Dataverse.","Name of Dataverse." "type","string","boolean","boolean","boolean","boolean","boolean","boolean","boolean","boolean","number","string","string","string","string","string","string","string" "example",1,"TRUE","TRUE","TRUE","TRUE","TRUE","TRUE","TRUE","TRUE",31,"public","Scientific Research University","science","[{""contactEmail"": ""pi@example.edu""}, {""contactEmail"": ""student@ex ample.edu""}]","LABORATORY","We do all the science.","Scientific Research" "multiple","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","TRUE","FALSE","FALSE","FALSE" "sub_keys","NULL","NULL","NULL","NULL","NULL","NULL","NULL","NULL","NULL","NULL","NULL","NULL","NULL","contactEmail (format: email)","NULL","NULL","NULL" pyDataverse-0.3.4/pyDataverse/utils.py000066400000000000000000000426141467256651400200350ustar00rootroot00000000000000"""Helper functions.""" import csv import json import os import pickle from jsonschema import validate CSV_JSON_COLS = [ "otherId", "series", "author", "dsDescription", "subject", "keyword", "topicClassification", "language", "grantNumber", "dateOfCollection", "kindOfData", "dataSources", "otherReferences", "contributor", "relatedDatasets", "relatedMaterial", "datasetContact", "distributor", "producer", "publication", "software", "timePeriodCovered", "geographicUnit", "geographicBoundingBox", "geographicCoverage", "socialScienceNotes", "unitOfAnalysis", "universe", "targetSampleActualSize", "categories", ] def read_file(filename, mode="r", encoding="utf-8"): """Read in a file. Parameters ---------- filename : str Filename with full path. mode : str Read mode of file. Defaults to `r`. See more at https://docs.python.org/3.5/library/functions.html#open Returns ------- str Returns data as string. """ assert isinstance(filename, str) assert isinstance(mode, str) assert isinstance(encoding, str) with open(filename, mode, encoding=encoding) as f: data = f.read() assert isinstance(data, str) return data def write_file(filename, data, mode="w", encoding="utf-8"): """Write data in a file. Parameters ---------- filename : str Filename with full path. data : str Data to be stored. mode : str Read mode of file. Defaults to `w`. See more at https://docs.python.org/3.5/library/functions.html#open encoding : str Character encoding of file. Defaults to 'utf-8'. """ assert isinstance(filename, str) assert isinstance(data, str) assert isinstance(mode, str) assert isinstance(encoding, str) with open(filename, mode, encoding=encoding) as f: f.write(data) def read_json(filename: str, mode: str = "r", encoding: str = "utf-8") -> dict: """Read in a json file. See more about the json module at https://docs.python.org/3.5/library/json.html Parameters ---------- filename : str Filename with full path. mode : str Read mode of file. Defaults to `w`. See more at https://docs.python.org/3.5/library/functions.html#open encoding : str Character encoding of file. Defaults to 'utf-8'. Returns ------- dict Data as a json-formatted string. """ # TODO: add kwargs with open(filename, mode, encoding=encoding) as f: data = json.load(f) return data def write_json(filename, data, mode="w", encoding="utf-8"): """Write data to a json file. Parameters ---------- filename : str Filename with full path. data : dict Data to be written in the JSON file. mode : str Write mode of file. Defaults to `w`. 
See more at https://docs.python.org/3/library/functions.html#open encoding : str Character encoding of file. Defaults to 'utf-8'. """ with open(filename, mode, encoding=encoding) as f: json.dump(data, f, indent=2) def read_pickle(filename): """Read in pickle file. See more at `pickle `_. Parameters ---------- filename : str Full filename with path of file. Returns ------- dict Data object. """ assert isinstance(filename, str) with open(filename, "rb") as f: data = pickle.load(f) assert isinstance(data, dict) return data def write_pickle(filename, data): """Write data in pickle file. See more at `pickle `_. Parameters ---------- filename : str Full filename with path of file. data : dict Data to write in pickle file. """ assert isinstance(filename, str) assert isinstance(data, dict) with open(filename, "wb") as f: pickle.dump(data, f) def read_csv(filename, newline="", delimiter=",", quotechar='"', encoding="utf-8"): """Read in a CSV file. See more at `csv `_. Parameters ---------- filename : str Full filename with path of file. newline : str Newline character. delimiter : str Cell delimiter of CSV file. Defaults to ';'. quotechar : str Quote-character of CSV file. Defaults to '"'. encoding : str Character encoding of file. Defaults to 'utf-8'. Returns ------- reader Reader object, which can be iterated over. """ assert isinstance(filename, str) assert isinstance(newline, str) assert isinstance(delimiter, str) assert isinstance(quotechar, str) assert isinstance(encoding, str) with open(filename, newline=newline, encoding=encoding) as csvfile: csv_reader = csv.reader(csvfile, delimiter=delimiter, quotechar=quotechar) assert isinstance(csv_reader, csv.reader) return csv_reader def write_csv( data, filename, newline="", delimiter=",", quotechar='"', encoding="utf-8" ): """Short summary. See more at `csv `_. Parameters ---------- data : list List of :class:`dict`. Key is column, value is cell content. filename : str Full filename with path of file. newline : str Newline character. delimiter : str Cell delimiter of CSV file. Defaults to ';'. quotechar : str Quote-character of CSV file. Defaults to '"'. encoding : str Character encoding of file. Defaults to 'utf-8'. """ assert isinstance(data, list) assert isinstance(filename, str) assert isinstance(newline, str) assert isinstance(delimiter, str) assert isinstance(quotechar, str) assert isinstance(encoding, str) with open(filename, "w", newline=newline, encoding=encoding) as csvfile: writer = csv.writer(csvfile, delimiter=delimiter, quotechar=quotechar) for row in data: writer.writerow(row) def read_csv_as_dicts( filename, newline="", delimiter=",", quotechar='"', encoding="utf-8", remove_prefix=True, prefix="dv.", json_cols=CSV_JSON_COLS, false_values=["FALSE"], true_values=["TRUE"], ): """Read in CSV file into a list of :class:`dict`. This offers an easy import functionality of your data from CSV files. See more at `csv `_. CSV file structure: 1) The header row contains the column names. 2) A row contains one dataset 3) A column contains one specific attribute. Recommendation: Name the column name the way you want the attribute to be named later in your Dataverse object. See the `pyDataverse templates `_ for this. The created :class:`dict` can later be used for the `set()` function to create Dataverse objects. Parameters ---------- filename : str Filename with full path. newline : str Newline character. delimiter : str Cell delimiter of CSV file. Defaults to ';'. quotechar : str Quote-character of CSV file. Defaults to '"'. 
    encoding : str
        Character encoding of file. Defaults to 'utf-8'.
    remove_prefix : bool
        If ``True``, the prefix (see ``prefix``) is removed from the column
        names. Defaults to ``True``.
    prefix : str
        Prefix to be removed from the column names. Defaults to ``dv.``.
    json_cols : list
        Column names whose cell content is parsed as JSON. Defaults to
        ``CSV_JSON_COLS``.
    false_values : list
        Cell values converted to ``False``. Defaults to ``["FALSE"]``.
    true_values : list
        Cell values converted to ``True``. Defaults to ``["TRUE"]``.

    Returns
    -------
    list
        List with one :class:`dict` per row. The keys of a :class:`dict` are
        named after the column names.

    """
    assert isinstance(filename, str)
    assert isinstance(newline, str)
    assert isinstance(delimiter, str)
    assert isinstance(quotechar, str)
    assert isinstance(encoding, str)

    with open(filename, "r", newline=newline, encoding=encoding) as csvfile:
        reader = csv.DictReader(csvfile, delimiter=delimiter, quotechar=quotechar)
        data = []
        for row in reader:
            data.append(dict(row))

    data_tmp = []
    for ds in data:
        ds_tmp = {}
        for key, val in ds.items():
            if val in false_values:
                ds_tmp[key] = False
            elif val in true_values:
                ds_tmp[key] = True
            else:
                ds_tmp[key] = val
        data_tmp.append(ds_tmp)
    data = data_tmp

    if remove_prefix:
        data_tmp = []
        for ds in data:
            ds_tmp = {}
            for key, val in ds.items():
                if key.startswith(prefix):
                    ds_tmp[key[len(prefix) :]] = val
                else:
                    ds_tmp[key] = val
            data_tmp.append(ds_tmp)
        data = data_tmp

    if len(json_cols) > 0:
        data_tmp = []
        for ds in data:
            ds_tmp = {}
            for key, val in ds.items():
                if key in json_cols:
                    ds_tmp[key] = json.loads(val)
                else:
                    ds_tmp[key] = val
            data_tmp.append(ds_tmp)
        data = data_tmp

    return data


def write_dicts_as_csv(data, fieldnames, filename, delimiter=",", quotechar='"'):
    """Write a list of :class:`dict` to a CSV file.

    This offers an easy export functionality of your data to a CSV file.
    See more at `csv <https://docs.python.org/3/library/csv.html>`_.

    Parameters
    ----------
    data : list
        List of :class:`dict`, with columns as keys, to be written to the CSV
        file.
    fieldnames : list
        Sequence of keys that identify the order of the columns.
    filename : str
        Filename with full path.
    delimiter : str
        Cell delimiter of CSV file. Defaults to ','.
    quotechar : str
        Quote-character of CSV file. Defaults to '"'.

    """
    assert isinstance(data, list)
    assert isinstance(fieldnames, list)
    assert isinstance(filename, str)
    assert isinstance(delimiter, str)
    assert isinstance(quotechar, str)

    with open(filename, "w", newline="") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        for d in data:
            for key, val in d.items():
                if isinstance(val, dict) or isinstance(val, list):
                    d[key] = json.dumps(val)
            writer.writerow(d)


def clean_string(string):
    """Clean a string.

    Trims whitespace.

    Parameters
    ----------
    string : str
        String to be cleaned.

    Returns
    -------
    str
        Cleaned string.

    """
    assert isinstance(string, str)

    clean_str = string.strip()
    clean_str = clean_str.replace("  ", " ")

    assert isinstance(clean_str, str)
    return clean_str


def validate_data(data: dict, filename_schema: str, file_format: str = "json") -> bool:
    """Validate data against a schema.

    Parameters
    ----------
    data : dict
        Data to be validated.
    filename_schema : str
        Filename with full path of the schema file.
    file_format : str
        File format to be validated.

    Returns
    -------
    bool
        `True` if data was validated, `False` if not.

    """
    assert isinstance(data, dict)
    assert isinstance(filename_schema, str)
    assert isinstance(file_format, str)

    if file_format == "json":
        validate(instance=data, schema=read_json(filename_schema))
        return True
    elif file_format == "xml":
        print("INFO: Not implemented yet.")
        return False
    else:
        print("WARNING: No valid format passed.")
        return False


def create_dataverse_url(base_url, identifier):
    """Creates URL of Dataverse.

    Example: https://data.aussda.at/dataverse/autnes

    Parameters
    ----------
    base_url : str
        Base URL of Dataverse instance
    identifier : str
        Can either be a dataverse id (long), a dataverse alias (more
        robust), or the special value ``:root``.
Returns ------- str URL of the dataverse """ assert isinstance(base_url, str) assert isinstance(identifier, str) base_url = base_url.rstrip("/") url = "{0}/dataverse/{1}".format(base_url, identifier) assert isinstance(url, str) return url def create_dataset_url(base_url, identifier, is_pid): """Creates URL of Dataset. Example: https://data.aussda.at/dataset.xhtml?persistentId=doi:10.11587/CCESLK Parameters ---------- base_url : str Base URL of Dataverse instance identifier : str Identifier of the dataset. Can be dataset id or persistent identifier of the dataset (e. g. doi). is_pid : bool ``True`` to use persistent identifier. ``False``, if not. Returns ------- str URL of the dataset """ assert isinstance(base_url, str) assert isinstance(identifier, str) assert isinstance(is_pid, bool) base_url = base_url.rstrip("/") if is_pid: url = "{0}/dataset.xhtml?persistentId={1}".format(base_url, identifier) else: url = "{0}/dataset.xhtml?id{1}".format(base_url, identifier) assert isinstance(url, str) return url def create_datafile_url(base_url, identifier, is_filepid): """Creates URL of Datafile. Example - File ID: https://data.aussda.at/file.xhtml?persistentId=doi:10.11587/CCESLK/5RH5GK Parameters ---------- base_url : str Base URL of Dataverse instance identifier : str Identifier of the datafile. Can be datafile id or persistent identifier of the datafile (e. g. doi). is_filepid : bool ``True`` to use persistent identifier. ``False``, if not. Returns ------- str URL of the datafile """ assert isinstance(base_url, str) assert isinstance(identifier, str) base_url = base_url.rstrip("/") if is_filepid: url = "{0}/file.xhtml?persistentId={1}".format(base_url, identifier) else: url = "{0}/file.xhtml?fileId={1}".format(base_url, identifier) assert isinstance(url, str) return url def dataverse_tree_walker( data: list, dv_keys: list = ["dataverse_id", "dataverse_alias"], ds_keys: list = ["dataset_id", "pid"], df_keys: list = ["datafile_id", "filename", "pid", "label"], ) -> tuple: """Walk through a Dataverse tree by get_children(). Recursively walk through the tree structure returned by ``get_children()`` and extract the keys needed. Parameters ---------- data : dict Tree data structure returned by ``get_children()``. dv_keys : list List of keys to be extracted from each Dataverse element. ds_keys : list List of keys to be extracted from each Dataset element. df_keys : list List of keys to be extracted from each Datafile element. 
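    Examples
    --------
    A minimal, illustrative sketch, assuming ``tree`` already holds the
    structure returned by ``get_children()``::

        >>> dataverses, datasets, datafiles = dataverse_tree_walker(tree)
        >>> save_tree_data(dataverses, datasets, datafiles)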
def dataverse_tree_walker(
    data: list,
    dv_keys: list = ["dataverse_id", "dataverse_alias"],
    ds_keys: list = ["dataset_id", "pid"],
    df_keys: list = ["datafile_id", "filename", "pid", "label"],
) -> tuple:
    """Walk through a Dataverse tree by ``get_children()``.

    Recursively walk through the tree structure returned by ``get_children()``
    and extract the keys needed.

    Parameters
    ----------
    data : list
        Tree data structure returned by ``get_children()``.
    dv_keys : list
        List of keys to be extracted from each Dataverse element.
    ds_keys : list
        List of keys to be extracted from each Dataset element.
    df_keys : list
        List of keys to be extracted from each Datafile element.

    Returns
    -------
    tuple
        (List of Dataverses, List of Datasets, List of Datafiles)

    """
    dataverses = []
    datasets = []
    datafiles = []

    if isinstance(data, list):
        for elem in data:
            dv, ds, df = dataverse_tree_walker(elem)
            dataverses += dv
            datasets += ds
            datafiles += df
    elif isinstance(data, dict):
        if data["type"] == "dataverse":
            dv_tmp = {}
            for key in dv_keys:
                if key in data:
                    dv_tmp[key] = data[key]
            dataverses.append(dv_tmp)
        elif data["type"] == "dataset":
            ds_tmp = {}
            for key in ds_keys:
                if key in data:
                    ds_tmp[key] = data[key]
            datasets.append(ds_tmp)
        elif data["type"] == "datafile":
            df_tmp = {}
            for key in df_keys:
                if key in data:
                    df_tmp[key] = data[key]
            datafiles.append(df_tmp)
        if "children" in data:
            if len(data["children"]) > 0:
                dv, ds, df = dataverse_tree_walker(data["children"])
                dataverses += dv
                datasets += ds
                datafiles += df
    return dataverses, datasets, datafiles


def save_tree_data(
    dataverses: list,
    datasets: list,
    datafiles: list,
    filename_dv: str = "dataverses.json",
    filename_ds: str = "datasets.json",
    filename_df: str = "datafiles.json",
    filename_md: str = "metadata.json",
) -> None:
    """Save the lists returned by ``dataverse_tree_walker()``.

    Collect the lists of Dataverses, Datasets and Datafiles and save them in
    separate JSON files.

    Parameters
    ----------
    dataverses : list
        List of Dataverse :class:`dict`s returned by ``dataverse_tree_walker()``.
    datasets : list
        List of Dataset :class:`dict`s returned by ``dataverse_tree_walker()``.
    datafiles : list
        List of Datafile :class:`dict`s returned by ``dataverse_tree_walker()``.
    filename_dv : str
        Filename with full path for the Dataverse JSON file.
    filename_ds : str
        Filename with full path for the Dataset JSON file.
    filename_df : str
        Filename with full path for the Datafile JSON file.
    filename_md : str
        Filename with full path for the metadata JSON file.

    """
    if os.path.isfile(filename_dv):
        os.remove(filename_dv)
    if os.path.isfile(filename_ds):
        os.remove(filename_ds)
    if os.path.isfile(filename_df):
        os.remove(filename_df)
    if len(dataverses) > 0:
        write_json(filename_dv, dataverses)
    if len(datasets) > 0:
        write_json(filename_ds, datasets)
    if len(datafiles) > 0:
        write_json(filename_df, datafiles)
    metadata = {
        "dataverses": len(dataverses),
        "datasets": len(datasets),
        "datafiles": len(datafiles),
    }
    write_json(filename_md, metadata)
    print(f"- Dataverses: {len(dataverses)}")
    print(f"- Datasets: {len(datasets)}")
    print(f"- Datafiles: {len(datafiles)}")
pyDataverse-0.3.4/pyproject.toml000066400000000000000000000032111467256651400167360ustar00rootroot00000000000000[tool.poetry] name = "pyDataverse" version = "0.3.4" description = "A Python module for Dataverse."
authors = [ "Stefan Kasberger ", "Jan Range ", ] license = "MIT" readme = "README.md" repository = "https://github.com/gdcc/pyDataverse" packages = [{ include = "pyDataverse" }] [tool.poetry.dependencies] python = "^3.8.1" httpx = "^0.27.0" jsonschema = "^4.21.1" [tool.poetry.group.dev] optional = true [tool.poetry.group.dev.dependencies] black = "^24.3.0" radon = "^6.0.1" mypy = "^1.9.0" autopep8 = "^2.1.0" pydocstyle = "^6.3.0" pygments = "^2.17.2" pytest = "^8.1.1" pytest-cov = "^5.0.0" tox = "^4.14.2" selenium = "^4.19.0" wheel = "^0.43.0" pre-commit = "3.5.0" sphinx = "7.1.2" restructuredtext-lint = "^1.4.0" rstcheck = "^6.2.1" ruff = "^0.4.4" [tool.poetry.group.tests] optional = true [tool.poetry.group.tests.dependencies] pytest = "^8.1.1" pytest-cov = "^5.0.0" pytest-asyncio = "^0.23.7" tox = "^4.14.2" selenium = "^4.19.0" [tool.poetry.group.docs] optional = true [tool.poetry.group.docs.dependencies] sphinx = "7.1.2" pydocstyle = "^6.3.0" restructuredtext-lint = "^1.4.0" pygments = "^2.17.2" rstcheck = "^6.2.1" [tool.poetry.group.lint] optional = true [tool.poetry.group.lint.dependencies] black = "^24.3.0" radon = "^6.0.1" mypy = "^1.9.0" types-jsonschema = "^4.23.0" autopep8 = "^2.1.0" ruff = "^0.4.4" [build-system] requires = ["poetry-core"] build-backend = "poetry.core.masonry.api" [tool.pytest.ini_options] addopts = ["-v", "--cov=pyDataverse"] [tool.coverage.run] source = "tests" [tool.coverage.report] show_missing = true [tool.radon] cc_min = "B" pyDataverse-0.3.4/requirements.txt000066400000000000000000000000601467256651400173050ustar00rootroot00000000000000# Requirements httpx==0.27.0 jsonschema==4.21.1 pyDataverse-0.3.4/run-tests.sh000066400000000000000000000042711467256651400163310ustar00rootroot00000000000000#!bin/bash # Parse arguments usage() { echo "Usage: $0 [-p Python version (e.g. 3.10, 3.11, ...)]" 1>&2 exit 1 } while getopts ":p:d:" o; do case "${o}" in p) p=${OPTARG} ;; *) ;; esac done shift $((OPTIND - 1)) # Fall back to Python 3.11 if no Python version is specified if [ -z "${p}" ]; then printf "\n⚠️ No Python version specified falling back to '3.11'\n" p=3.11 fi # Validate Python version if [[ ! "${p}" =~ ^3\.[0-9]+$ ]]; then echo "\n❌ Invalid Python version. Please specify a valid Python version (e.g. 3.10, 3.11, ...)\n" exit 1 fi # Check if Docker is installed if ! command -v docker &>/dev/null; then echo "✋ Docker is not installed. Please install Docker before running this script." exit 1 fi # Prepare the environment for the test mkdir dv >>/dev/null 2>&1 touch dv/bootstrap.exposed.env >>/dev/null 2>&1 # Add python version to the environment export PYTHON_VERSION=${p} printf "\n🚀 Preparing containers\n" printf " Using PYTHON_VERSION=${p}\n\n" # Run all containers docker compose \ -f docker/docker-compose-base.yml \ -f ./docker/docker-compose-test-all.yml \ --env-file local-test.env \ up -d printf "\n🔎 Running pyDataverse tests\n" printf " Logs will be printed once finished...\n\n" # Check if "unit-test" container has finished while [ -n "$(docker ps -f "name=unit-tests" -f "status=running" -q)" ]; do printf " Waiting for unit-tests container to finish...\n" sleep 5 done # Check if "unit-test" container has failed if [ "$(docker inspect -f '{{.State.ExitCode}}' unit-tests)" -ne 0 ]; then printf "\n❌ Unit tests failed. 
Printing logs...\n" docker logs unit-tests printf "\n Stopping containers\n" docker compose \ -f docker/docker-compose-base.yml \ -f ./docker/docker-compose-test-all.yml \ --env-file local-test.env \ down exit 1 fi # Print test results printf "\n" cat dv/unit-tests.log printf "\n\n✅ Unit tests passed\n\n" # Stop all containers docker compose \ -f docker/docker-compose-base.yml \ -f ./docker/docker-compose-test-all.yml \ --env-file local-test.env \ down printf "\n🎉 Done\n\n" pyDataverse-0.3.4/tests/000077500000000000000000000000001467256651400151675ustar00rootroot00000000000000pyDataverse-0.3.4/tests/__init__.py000066400000000000000000000000001467256651400172660ustar00rootroot00000000000000pyDataverse-0.3.4/tests/api/000077500000000000000000000000001467256651400157405ustar00rootroot00000000000000pyDataverse-0.3.4/tests/api/__init__.py000066400000000000000000000000001467256651400200370ustar00rootroot00000000000000pyDataverse-0.3.4/tests/api/test_access.py000066400000000000000000000100161467256651400206100ustar00rootroot00000000000000import os import json import httpx from pyDataverse.api import DataAccessApi class TestDataAccess: def test_get_data_by_id(self): """Tests getting data file by id.""" # Arrange BASE_URL = os.getenv("BASE_URL").rstrip("/") API_TOKEN = os.getenv("API_TOKEN") assert BASE_URL is not None, "BASE_URL is not set" assert API_TOKEN is not None, "API_TOKEN is not set" # Create dataset metadata = json.load(open("tests/data/file_upload_ds_minimum.json")) pid = self._create_dataset(BASE_URL, API_TOKEN, metadata) api = DataAccessApi(BASE_URL, API_TOKEN) # Upload a file self._upload_datafile(BASE_URL, API_TOKEN, pid) # Retrieve the file ID file_id = self._get_file_id(BASE_URL, API_TOKEN, pid) # Act response = api.get_datafile(file_id, is_pid=False) response.raise_for_status() content = response.content.decode("utf-8") # Assert expected = open("tests/data/datafile.txt").read() assert content == expected, "Data retrieval failed." def test_get_data_by_pid(self): """Tests getting data file by id. Test runs with a PID instead of a file ID from Harvard. No PID given if used within local containers TODO - Check if possible with containers """ # Arrange BASE_URL = "https://dataverse.harvard.edu" pid = "doi:10.7910/DVN/26093/IGA4JD" api = DataAccessApi(BASE_URL) # Act response = api.get_datafile(pid, is_pid=True) response.raise_for_status() content = response.content # Assert expected = self._get_file_content(BASE_URL, pid) assert content == expected, "Data retrieval failed." @staticmethod def _create_dataset( BASE_URL: str, API_TOKEN: str, metadata: dict, ): """ Create a dataset in the Dataverse. Args: BASE_URL (str): The base URL of the Dataverse instance. API_TOKEN (str): The API token for authentication. metadata (dict): The metadata for the dataset. Returns: str: The persistent identifier (PID) of the created dataset. 
""" url = f"{BASE_URL}/api/dataverses/root/datasets" response = httpx.post( url=url, json=metadata, headers={ "X-Dataverse-key": API_TOKEN, "Content-Type": "application/json", }, ) response.raise_for_status() return response.json()["data"]["persistentId"] @staticmethod def _get_file_id( BASE_URL: str, API_TOKEN: str, pid: str, ): """Retrieves a file ID for a given persistent identifier (PID) in Dataverse.""" response = httpx.get( url=f"{BASE_URL}/api/datasets/:persistentId/?persistentId={pid}", headers={ "X-Dataverse-key": API_TOKEN, "Content-Type": "application/json", }, ) response.raise_for_status() return response.json()["data"]["latestVersion"]["files"][0]["dataFile"]["id"] @staticmethod def _upload_datafile( BASE_URL: str, API_TOKEN: str, pid: str, ): """Uploads a file to Dataverse""" url = f"{BASE_URL}/api/datasets/:persistentId/add?persistentId={pid}" response = httpx.post( url=url, files={"file": open("tests/data/datafile.txt", "rb")}, headers={ "X-Dataverse-key": API_TOKEN, }, ) response.raise_for_status() @staticmethod def _get_file_content( BASE_URL: str, pid: str, ): """Retrieves the file content for testing purposes.""" response = httpx.get( url=f"{BASE_URL}/api/access/datafile/:persistentId/?persistentId={pid}", follow_redirects=True, ) response.raise_for_status() return response.content pyDataverse-0.3.4/tests/api/test_api.py000066400000000000000000000211701467256651400201230ustar00rootroot00000000000000import os import httpx import pytest from httpx import Response from time import sleep from pyDataverse.api import DataAccessApi, NativeApi, SwordApi from pyDataverse.auth import ApiTokenAuth from pyDataverse.exceptions import ApiAuthorizationError from pyDataverse.exceptions import ApiUrlError from pyDataverse.models import Dataset from pyDataverse.utils import read_file from ..conftest import test_config BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(os.path.dirname(__file__)))) class TestApiConnect(object): """Test the NativeApi() class initialization.""" def test_api_connect(self, native_api): sleep(test_config["wait_time"]) assert isinstance(native_api, NativeApi) assert not native_api.api_token assert native_api.api_version == "v1" assert native_api.base_url == os.getenv("BASE_URL").rstrip("/") assert native_api.base_url_api_native == "{0}/api/{1}".format( os.getenv("BASE_URL").rstrip("/"), native_api.api_version ) def test_api_connect_base_url_wrong(self): """Test native_api connection with wrong `base_url`.""" # None with pytest.raises(ApiUrlError): NativeApi(None) class TestApiTokenAndAuthBehavior: def test_api_token_none_and_auth_none(self): api = NativeApi("https://demo.dataverse.org") assert api.api_token is None assert api.auth is None def test_api_token_none_and_auth(self): auth = ApiTokenAuth("mytoken") api = NativeApi("https://demo.dataverse.org", auth=auth) assert api.api_token is None assert api.auth is auth def test_api_token_and_auth(self): auth = ApiTokenAuth("mytoken") # Only one, api_token or auth, should be specified with pytest.warns(UserWarning): api = NativeApi( "https://demo.dataverse.org", api_token="sometoken", auth=auth ) assert api.api_token is None assert api.auth is auth def test_api_token_and_auth_none(self): api_token = "mytoken" api = NativeApi("https://demo.dataverse.org", api_token) assert api.api_token == api_token assert isinstance(api.auth, ApiTokenAuth) assert api.auth.api_token == api_token class TestApiRequests(object): """Test the native_api requests.""" dataset_id = None @classmethod def setup_class(cls): """Create 
the native_api connection for later use.""" cls.dataverse_id = "test-pyDataverse" cls.dataset_id = None def test_get_request(self, native_api): """Test successful `.get_request()` request.""" # TODO: test params und auth default base_url = os.getenv("BASE_URL").rstrip("/") query_str = base_url + "/api/v1/info/server" resp = native_api.get_request(query_str) sleep(test_config["wait_time"]) assert isinstance(resp, Response) def test_get_dataverse(self, native_api): """Test successful `.get_dataverse()` request`.""" resp = native_api.get_dataverse(":root") sleep(test_config["wait_time"]) assert isinstance(resp, Response) if not os.environ.get("TRAVIS"): class TestApiToken(object): """Test user rights.""" def test_token_missing(self): BASE_URL = os.getenv("BASE_URL").rstrip("/") api = NativeApi(BASE_URL) resp = api.get_info_version() assert resp.json()["data"]["version"] == os.getenv("DV_VERSION") # assert resp.json()["data"]["build"] == "267-a91d370" with pytest.raises(ApiAuthorizationError): ds = Dataset() ds.from_json( read_file( os.path.join( BASE_DIR, "tests/data/dataset_upload_min_default.json" ) ) ) api.create_dataset(":root", ds.json()) def test_token_empty_string(self): BASE_URL = os.getenv("BASE_URL").rstrip("/") api = NativeApi(BASE_URL, "") resp = api.get_info_version() assert resp.json()["data"]["version"] == os.getenv("DV_VERSION") # assert resp.json()["data"]["build"] == "267-a91d370" with pytest.raises(ApiAuthorizationError): ds = Dataset() ds.from_json( read_file( os.path.join( BASE_DIR, "tests/data/dataset_upload_min_default.json" ) ) ) api.create_dataset(":root", ds.json()) # def test_token_no_rights(self): # BASE_URL = os.getenv("BASE_URL") # API_TOKEN = os.getenv("API_TOKEN_NO_RIGHTS") # api = NativeApi(BASE_URL, API_TOKEN) # resp = api.get_info_version() # assert resp.json()["data"]["version"] == os.getenv("DV_VERSION") # assert resp.json()["data"]["build"] == "267-a91d370" # with pytest.raises(ApiAuthorizationError): # ds = Dataset() # ds.from_json( # read_file( # os.path.join( # BASE_DIR, "tests/data/dataset_upload_min_default.json" # ) # ) # ) # api.create_dataset(":root", ds.json()) def test_token_right_create_dataset_rights(self): BASE_URL = os.getenv("BASE_URL").rstrip("/") api_su = NativeApi(BASE_URL, os.getenv("API_TOKEN_SUPERUSER")) # api_nru = NativeApi(BASE_URL, os.getenv("API_TOKEN_TEST_NO_RIGHTS")) resp = api_su.get_info_version() assert resp.json()["data"]["version"] == os.getenv("DV_VERSION") # assert resp.json()["data"]["build"] == "267-a91d370" # resp = api_nru.get_info_version() # assert resp.json()["data"]["version"] == os.getenv("DV_VERSION") # assert resp.json()["data"]["build"] == "267-a91d370" ds = Dataset() ds.from_json( read_file( os.path.join(BASE_DIR, "tests/data/dataset_upload_min_default.json") ) ) resp = api_su.create_dataset(":root", ds.json()) pid = resp.json()["data"]["persistentId"] assert resp.json()["status"] == "OK" # with pytest.raises(ApiAuthorizationError): # resp = api_nru.get_dataset(pid) resp = api_su.delete_dataset(pid) assert resp.json()["status"] == "OK" def test_token_should_not_be_exposed_on_error(self): BASE_URL = os.getenv("BASE_URL") API_TOKEN = os.getenv("API_TOKEN") api = DataAccessApi(BASE_URL, API_TOKEN) result = api.get_datafile("does-not-exist").json() assert API_TOKEN not in result["requestUrl"] @pytest.mark.parametrize( "auth", (True, False, "api-token", ApiTokenAuth("some-token")) ) def test_using_auth_on_individual_requests_is_deprecated(self, auth): BASE_URL = os.getenv("BASE_URL") API_TOKEN = 
os.getenv("API_TOKEN") api = DataAccessApi(BASE_URL, auth=ApiTokenAuth(API_TOKEN)) with pytest.warns(DeprecationWarning): api.get_datafile("does-not-exist", auth=auth) @pytest.mark.parametrize( "auth", (True, False, "api-token", ApiTokenAuth("some-token")) ) def test_using_auth_on_individual_requests_is_deprecated_unauthorized( self, auth ): BASE_URL = os.getenv("BASE_URL") no_auth_api = DataAccessApi(BASE_URL) with pytest.warns(DeprecationWarning): no_auth_api.get_datafile("does-not-exist", auth=auth) def test_sword_api_requires_http_basic_auth(self): BASE_URL = os.getenv("BASE_URL") API_TOKEN = os.getenv("API_TOKEN") api = SwordApi(BASE_URL, api_token=API_TOKEN) assert isinstance(api.auth, httpx.BasicAuth) def test_sword_api_can_authenticate(self): BASE_URL = os.getenv("BASE_URL") API_TOKEN = os.getenv("API_TOKEN") api = SwordApi(BASE_URL, api_token=API_TOKEN) response = api.get_service_document() assert response.status_code == 200 def test_sword_api_cannot_authenticate_without_token(self): BASE_URL = os.getenv("BASE_URL") api = SwordApi(BASE_URL) with pytest.raises(ApiAuthorizationError): api.get_service_document() pyDataverse-0.3.4/tests/api/test_async_api.py000066400000000000000000000006411467256651400213200ustar00rootroot00000000000000import asyncio import pytest class TestAsyncAPI: @pytest.mark.asyncio async def test_async_api(self, native_api): async with native_api: tasks = [native_api.get_info_version() for _ in range(10)] responses = await asyncio.gather(*tasks) assert len(responses) == 10 for response in responses: assert response.status_code == 200, "Request failed." pyDataverse-0.3.4/tests/api/test_upload.py000066400000000000000000000227331467256651400206440ustar00rootroot00000000000000import json import os import tempfile import httpx from pyDataverse.api import DataAccessApi, NativeApi from pyDataverse.models import Datafile class TestFileUpload: def test_file_upload(self): """ Test case for uploading a file to a dataset. This test case performs the following steps: 1. Creates a dataset using the provided metadata. 2. Prepares a file for upload. 3. Uploads the file to the dataset. 4. Asserts that the file upload was successful. Raises: AssertionError: If the file upload fails. """ # Arrange BASE_URL = os.getenv("BASE_URL").rstrip("/") API_TOKEN = os.getenv("API_TOKEN") # Create dataset metadata = json.load(open("tests/data/file_upload_ds_minimum.json")) pid = self._create_dataset(BASE_URL, API_TOKEN, metadata) api = NativeApi(BASE_URL, API_TOKEN) # Prepare file upload df = Datafile({"pid": pid, "filename": "datafile.txt"}) # Act response = api.upload_datafile( identifier=pid, filename="tests/data/datafile.txt", json_str=df.json(), ) # Assert assert response.status_code == 200, "File upload failed." def test_file_upload_without_metadata(self): """ Test case for uploading a file to a dataset without metadata. --> json_str will be set as None This test case performs the following steps: 1. Creates a dataset using the provided metadata. 2. Prepares a file for upload. 3. Uploads the file to the dataset. 4. Asserts that the file upload was successful. Raises: AssertionError: If the file upload fails. 
""" # Arrange BASE_URL = os.getenv("BASE_URL").rstrip("/") API_TOKEN = os.getenv("API_TOKEN") # Create dataset metadata = json.load(open("tests/data/file_upload_ds_minimum.json")) pid = self._create_dataset(BASE_URL, API_TOKEN, metadata) api = NativeApi(BASE_URL, API_TOKEN) # Act response = api.upload_datafile( identifier=pid, filename="tests/data/datafile.txt", json_str=None, ) # Assert assert response.status_code == 200, "File upload failed." def test_bulk_file_upload(self, create_mock_file): """ Test case for uploading bulk files to a dataset. This test is meant to check the performance of the file upload feature and that nothing breaks when uploading multiple files in line. This test case performs the following steps: 0. Create 50 mock files. 1. Creates a dataset using the provided metadata. 2. Prepares a file for upload. 3. Uploads the file to the dataset. 4. Asserts that the file upload was successful. Raises: AssertionError: If the file upload fails. """ # Arrange BASE_URL = os.getenv("BASE_URL").rstrip("/") API_TOKEN = os.getenv("API_TOKEN") # Create dataset metadata = json.load(open("tests/data/file_upload_ds_minimum.json")) pid = self._create_dataset(BASE_URL, API_TOKEN, metadata) api = NativeApi(BASE_URL, API_TOKEN) with tempfile.TemporaryDirectory() as tmp_dir: # Create mock files mock_files = [ create_mock_file( filename=f"mock_file_{i}.txt", dir=tmp_dir, size=1024**2, # 1MB ) for i in range(50) ] for mock_file in mock_files: # Prepare file upload df = Datafile({"pid": pid, "filename": os.path.basename(mock_file)}) # Act response = api.upload_datafile( identifier=pid, filename=mock_file, json_str=df.json(), ) # Assert assert response.status_code == 200, "File upload failed." def test_file_replacement_wo_metadata(self): """ Test case for replacing a file in a dataset without metadata. Steps: 1. Create a dataset using the provided metadata. 2. Upload a datafile to the dataset. 3. Replace the uploaded datafile with a mutated version. 4. Verify that the file replacement was successful and the content matches the expected content. """ # Arrange BASE_URL = os.getenv("BASE_URL").rstrip("/") API_TOKEN = os.getenv("API_TOKEN") # Create dataset metadata = json.load(open("tests/data/file_upload_ds_minimum.json")) pid = self._create_dataset(BASE_URL, API_TOKEN, metadata) api = NativeApi(BASE_URL, API_TOKEN) data_api = DataAccessApi(BASE_URL, API_TOKEN) # Perform file upload df = Datafile({"pid": pid, "filename": "datafile.txt"}) response = api.upload_datafile( identifier=pid, filename="tests/data/replace.xyz", json_str=df.json(), ) # Retrieve file ID file_id = response.json()["data"]["files"][0]["dataFile"]["id"] # Act with tempfile.TemporaryDirectory() as tempdir: original = open("tests/data/replace.xyz").read() mutated = "Z" + original[1::] mutated_path = os.path.join(tempdir, "replace.xyz") with open(mutated_path, "w") as f: f.write(mutated) json_data = {} response = api.replace_datafile( identifier=file_id, filename=mutated_path, json_str=json.dumps(json_data), is_filepid=False, ) # Assert file_id = response.json()["data"]["files"][0]["dataFile"]["id"] content = data_api.get_datafile(file_id, is_pid=False).text assert response.status_code == 200, "File replacement failed." assert content == mutated, "File content does not match the expected content." def test_file_replacement_w_metadata(self): """ Test case for replacing a file in a dataset with metadata. Steps: 1. Create a dataset using the provided metadata. 2. Upload a datafile to the dataset. 3. 
Replace the uploaded datafile with a mutated version. 4. Verify that the file replacement was successful and the content matches the expected content. """ # Arrange BASE_URL = os.getenv("BASE_URL").rstrip("/") API_TOKEN = os.getenv("API_TOKEN") # Create dataset metadata = json.load(open("tests/data/file_upload_ds_minimum.json")) pid = self._create_dataset(BASE_URL, API_TOKEN, metadata) api = NativeApi(BASE_URL, API_TOKEN) data_api = DataAccessApi(BASE_URL, API_TOKEN) # Perform file upload df = Datafile({"pid": pid, "filename": "datafile.txt"}) response = api.upload_datafile( identifier=pid, filename="tests/data/replace.xyz", json_str=df.json(), ) # Retrieve file ID file_id = response.json()["data"]["files"][0]["dataFile"]["id"] # Act with tempfile.TemporaryDirectory() as tempdir: original = open("tests/data/replace.xyz").read() mutated = "Z" + original[1::] mutated_path = os.path.join(tempdir, "replace.xyz") with open(mutated_path, "w") as f: f.write(mutated) json_data = { "description": "My description.", "categories": ["Data"], "forceReplace": False, "directoryLabel": "some/other", } response = api.replace_datafile( identifier=file_id, filename=mutated_path, json_str=json.dumps(json_data), is_filepid=False, ) # Assert file_id = response.json()["data"]["files"][0]["dataFile"]["id"] data_file = api.get_dataset(pid).json()["data"]["latestVersion"]["files"][0] content = data_api.get_datafile(file_id, is_pid=False).text assert ( data_file["description"] == "My description." ), "Description does not match." assert data_file["categories"] == ["Data"], "Categories do not match." assert ( data_file["directoryLabel"] == "some/other" ), "Directory label does not match." assert response.status_code == 200, "File replacement failed." assert content == mutated, "File content does not match the expected content." @staticmethod def _create_dataset( BASE_URL: str, API_TOKEN: str, metadata: dict, ): """ Create a dataset in the Dataverse. Args: BASE_URL (str): The base URL of the Dataverse instance. API_TOKEN (str): The API token for authentication. metadata (dict): The metadata for the dataset. Returns: str: The persistent identifier (PID) of the created dataset. 
""" url = f"{BASE_URL}/api/dataverses/root/datasets" response = httpx.post( url=url, json=metadata, headers={ "X-Dataverse-key": API_TOKEN, "Content-Type": "application/json", }, ) response.raise_for_status() return response.json()["data"]["persistentId"] pyDataverse-0.3.4/tests/auth/000077500000000000000000000000001467256651400161305ustar00rootroot00000000000000pyDataverse-0.3.4/tests/auth/__init__.py000066400000000000000000000000001467256651400202270ustar00rootroot00000000000000pyDataverse-0.3.4/tests/auth/test_auth.py000066400000000000000000000032571467256651400205110ustar00rootroot00000000000000import uuid import pytest from httpx import Request from pyDataverse.auth import ApiTokenAuth, BearerTokenAuth from pyDataverse.exceptions import ApiAuthorizationError class TestApiTokenAuth: def test_token_header_is_added_during_auth_flow(self): api_token = str(uuid.uuid4()) auth = ApiTokenAuth(api_token) request = Request("GET", "https://example.org") assert "X-Dataverse-key" not in request.headers modified_request = next(auth.auth_flow(request)) assert "X-Dataverse-key" in modified_request.headers assert modified_request.headers["X-Dataverse-key"] == api_token @pytest.mark.parametrize( "non_str_token", (123, object(), lambda x: x, 1.423, b"123", uuid.uuid4()) ) def test_raise_if_token_is_not_str(self, non_str_token): with pytest.raises(ApiAuthorizationError): ApiTokenAuth(non_str_token) class TestBearerTokenAuth: def test_authorization_header_is_added_during_auth_flow(self): # Token as shown in RFC 6750 bearer_token = "mF_9.B5f-4.1JqM" auth = BearerTokenAuth(bearer_token) request = Request("GET", "https://example.org") assert "Authorization" not in request.headers modified_request = next(auth.auth_flow(request)) assert "Authorization" in modified_request.headers assert modified_request.headers["Authorization"] == f"Bearer {bearer_token}" @pytest.mark.parametrize( "non_str_token", (123, object(), lambda x: x, 1.423, b"123", uuid.uuid4()) ) def test_raise_if_token_is_not_str(self, non_str_token): with pytest.raises(ApiAuthorizationError): BearerTokenAuth(non_str_token) pyDataverse-0.3.4/tests/conftest.py000066400000000000000000000117631467256651400173760ustar00rootroot00000000000000"""Find out more at https://github.com/GDCC/pyDataverse.""" import os import pytest from pyDataverse.api import NativeApi def test_config(): test_dir = os.path.dirname(os.path.realpath(__file__)) root_dir = os.path.dirname(test_dir) test_data_dir = os.path.join(test_dir, "data") json_schemas_dir = os.path.join(root_dir, "pyDataverse/schemas/json") test_data_output_dir = os.path.join(test_data_dir, "output") invalid_filename_strings = ["wrong", ""] invalid_filename_types = [(), [], 12, 12.12, set(), True, False] return { "root_dir": root_dir, "test_dir": test_dir, "test_data_dir": test_data_dir, "json_schemas_dir": json_schemas_dir, "test_data_output_dir": test_data_output_dir, "dataverse_upload_min_filename": os.path.join( test_data_dir, "dataverse_upload_min.json" ), "dataverse_upload_full_filename": os.path.join( test_data_dir, "dataverse_upload_full.json" ), "dataverse_upload_schema_filename": os.path.join( json_schemas_dir, "dataverse_upload_schema.json" ), "dataverse_json_output_filename": os.path.join( test_data_output_dir, "dataverse_pytest.json" ), "dataset_upload_min_filename": os.path.join( test_data_dir, "dataset_upload_min_default.json" ), "dataset_upload_full_filename": os.path.join( test_data_dir, "dataset_upload_full_default.json" ), "dataset_upload_schema_filename": os.path.join( json_schemas_dir, 
"dataset_upload_default_schema.json" ), "dataset_json_output_filename": os.path.join( test_data_output_dir, "dataset_pytest.json" ), "datafile_upload_min_filename": os.path.join( test_data_dir, "datafile_upload_min.json" ), "datafile_upload_full_filename": os.path.join( test_data_dir, "datafile_upload_full.json" ), "datafile_upload_schema_filename": os.path.join( json_schemas_dir, "datafile_upload_schema.json" ), "datafile_json_output_filename": os.path.join( test_data_output_dir, "datafile_pytest.json" ), "tree_filename": os.path.join(test_data_dir, "tree.json"), "invalid_filename_strings": ["wrong", ""], "invalid_filename_types": [(), [], 12, 12.12, set(), True, False], "invalid_validate_types": [None, "wrong", {}, []], "invalid_json_data_types": [[], (), 12, set(), True, False, None], "invalid_set_types": invalid_filename_types + ["", "wrong"], "invalid_json_strings": invalid_filename_strings, "invalid_data_format_types": invalid_filename_types, "invalid_data_format_strings": invalid_filename_strings, "base_url": os.getenv("BASE_URL").rstrip("/"), "api_token": os.getenv("API_TOKEN"), "travis": os.getenv("TRAVIS") or False, "wait_time": 1, } test_config = test_config() @pytest.fixture() def native_api(monkeypatch): """Fixture, so set up an Api connection. Returns ------- Api Api object. """ BASE_URL = os.getenv("BASE_URL").rstrip("/") monkeypatch.setenv("BASE_URL", BASE_URL) return NativeApi(BASE_URL) def import_dataverse_min_dict(): """Import minimum Dataverse dict. Returns ------- dict Minimum Dataverse metadata. """ return { "alias": "test-pyDataverse", "name": "Test pyDataverse", "dataverseContacts": [{"contactEmail": "info@aussda.at"}], } def import_dataset_min_dict(): """Import dataset dict. Returns ------- dict Dataset metadata. """ return { "license": "CC0", "termsOfUse": "CC0 Waiver", "termsOfAccess": "Terms of Access", "citation_displayName": "Citation Metadata", "title": "Replication Data for: Title", } def import_datafile_min_dict(): """Import minimum Datafile dict. Returns ------- dict Minimum Datafile metadata. """ return { "pid": "doi:10.11587/EVMUHP", "filename": "tests/data/datafile.txt", } def import_datafile_full_dict(): """Import full Datafile dict. Returns ------- dict Full Datafile metadata. """ return { "pid": "doi:10.11587/EVMUHP", "filename": "tests/data/datafile.txt", "description": "Test datafile", "restrict": False, } @pytest.fixture def create_mock_file(): """Returns a function that creates a mock file.""" def _create_mock_file(filename: str, dir: str, size: int): """Create a mock file. Args: filename (str): Filename. dir (str): Directory. size (int): Size. Returns: str: Path to the file. """ path = os.path.join(dir, filename) with open(path, "wb") as f: f.write(os.urandom(size)) return path return _create_mock_file pyDataverse-0.3.4/tests/core/000077500000000000000000000000001467256651400161175ustar00rootroot00000000000000pyDataverse-0.3.4/tests/core/__init__.py000066400000000000000000000000001467256651400202160ustar00rootroot00000000000000pyDataverse-0.3.4/tests/data/000077500000000000000000000000001467256651400161005ustar00rootroot00000000000000pyDataverse-0.3.4/tests/data/datafile.txt000066400000000000000000000000071467256651400204070ustar00rootroot00000000000000hello! 
pyDataverse-0.3.4/tests/data/datafile_upload_full.json000066400000000000000000000003571467256651400231370ustar00rootroot00000000000000{ "description": "Another data file.", "categories": [ "Documentation" ], "restrict": true, "pid": "doi:10.11587/NVWE8Y", "filename": "20001_ta_de_v1_0.pdf", "label": "Questionnaire", "directoryLabel": "data/subdir1" } pyDataverse-0.3.4/tests/data/datafile_upload_min.json000066400000000000000000000001111467256651400227440ustar00rootroot00000000000000{ "pid": "doi:10.11587/RRKEA9", "filename": "10109_qu_de_v1_0.pdf" } pyDataverse-0.3.4/tests/data/dataset_upload_full_default.json000066400000000000000000001067411467256651400245230ustar00rootroot00000000000000{ "datasetVersion": { "license": "CC0", "termsOfUse": "CC0 Waiver", "termsOfAccess": "Terms of Access", "fileAccessRequest": true, "protocol": "doi", "authority": "10.11587", "identifier": "6AQBYW", "metadataBlocks": { "citation": { "displayName": "Citation Metadata", "fields": [ { "typeName": "title", "multiple": false, "typeClass": "primitive", "value": "Replication Data for: Title" }, { "typeName": "subtitle", "multiple": false, "typeClass": "primitive", "value": "Subtitle" }, { "typeName": "alternativeTitle", "multiple": false, "typeClass": "primitive", "value": "Alternative Title" }, { "typeName": "alternativeURL", "multiple": false, "typeClass": "primitive", "value": "http://AlternativeURL.org" }, { "typeName": "otherId", "multiple": true, "typeClass": "compound", "value": [ { "otherIdAgency": { "typeName": "otherIdAgency", "multiple": false, "typeClass": "primitive", "value": "OtherIDAgency1" }, "otherIdValue": { "typeName": "otherIdValue", "multiple": false, "typeClass": "primitive", "value": "OtherIDIdentifier1" } } ] }, { "typeName": "author", "multiple": true, "typeClass": "compound", "value": [ { "authorName": { "typeName": "authorName", "multiple": false, "typeClass": "primitive", "value": "LastAuthor1, FirstAuthor1" }, "authorAffiliation": { "typeName": "authorAffiliation", "multiple": false, "typeClass": "primitive", "value": "AuthorAffiliation1" }, "authorIdentifierScheme": { "typeName": "authorIdentifierScheme", "multiple": false, "typeClass": "controlledVocabulary", "value": "ORCID" }, "authorIdentifier": { "typeName": "authorIdentifier", "multiple": false, "typeClass": "primitive", "value": "AuthorIdentifier1" } } ] }, { "typeName": "datasetContact", "multiple": true, "typeClass": "compound", "value": [ { "datasetContactName": { "typeName": "datasetContactName", "multiple": false, "typeClass": "primitive", "value": "LastContact1, FirstContact1" }, "datasetContactAffiliation": { "typeName": "datasetContactAffiliation", "multiple": false, "typeClass": "primitive", "value": "ContactAffiliation1" }, "datasetContactEmail": { "typeName": "datasetContactEmail", "multiple": false, "typeClass": "primitive", "value": "ContactEmail1@mailinator.com" } } ] }, { "typeName": "dsDescription", "multiple": true, "typeClass": "compound", "value": [ { "dsDescriptionValue": { "typeName": "dsDescriptionValue", "multiple": false, "typeClass": "primitive", "value": "DescriptionText2" }, "dsDescriptionDate": { "typeName": "dsDescriptionDate", "multiple": false, "typeClass": "primitive", "value": "1000-02-02" } } ] }, { "typeName": "subject", "multiple": true, "typeClass": "controlledVocabulary", "value": [ "Agricultural Sciences", "Business and Management", "Engineering", "Law" ] }, { "typeName": "keyword", "multiple": true, "typeClass": "compound", "value": [ { "keywordValue": { "typeName": "keywordValue", 
"multiple": false, "typeClass": "primitive", "value": "KeywordTerm1" }, "keywordVocabulary": { "typeName": "keywordVocabulary", "multiple": false, "typeClass": "primitive", "value": "KeywordVocabulary1" }, "keywordVocabularyURI": { "typeName": "keywordVocabularyURI", "multiple": false, "typeClass": "primitive", "value": "http://KeywordVocabularyURL1.org" } } ] }, { "typeName": "topicClassification", "multiple": true, "typeClass": "compound", "value": [ { "topicClassValue": { "typeName": "topicClassValue", "multiple": false, "typeClass": "primitive", "value": "Topic Class Value1" }, "topicClassVocab": { "typeName": "topicClassVocab", "multiple": false, "typeClass": "primitive", "value": "Topic Classification Vocabulary" }, "topicClassVocabURI": { "typeName": "topicClassVocabURI", "multiple": false, "typeClass": "primitive", "value": "https://topic.class/vocab/uri" } } ] }, { "typeName": "publication", "multiple": true, "typeClass": "compound", "value": [ { "publicationCitation": { "typeName": "publicationCitation", "multiple": false, "typeClass": "primitive", "value": "RelatedPublicationCitation1" }, "publicationIDType": { "typeName": "publicationIDType", "multiple": false, "typeClass": "controlledVocabulary", "value": "ark" }, "publicationIDNumber": { "typeName": "publicationIDNumber", "multiple": false, "typeClass": "primitive", "value": "RelatedPublicationIDNumber1" }, "publicationURL": { "typeName": "publicationURL", "multiple": false, "typeClass": "primitive", "value": "http://RelatedPublicationURL1.org" } } ] }, { "typeName": "notesText", "multiple": false, "typeClass": "primitive", "value": "Notes1" }, { "typeName": "producer", "multiple": true, "typeClass": "compound", "value": [ { "producerName": { "typeName": "producerName", "multiple": false, "typeClass": "primitive", "value": "LastProducer1, FirstProducer1" }, "producerAffiliation": { "typeName": "producerAffiliation", "multiple": false, "typeClass": "primitive", "value": "ProducerAffiliation1" }, "producerAbbreviation": { "typeName": "producerAbbreviation", "multiple": false, "typeClass": "primitive", "value": "ProducerAbbreviation1" }, "producerURL": { "typeName": "producerURL", "multiple": false, "typeClass": "primitive", "value": "http://ProducerURL1.org" }, "producerLogoURL": { "typeName": "producerLogoURL", "multiple": false, "typeClass": "primitive", "value": "http://ProducerLogoURL1.org" } } ] }, { "typeName": "productionDate", "multiple": false, "typeClass": "primitive", "value": "1003-01-01" }, { "typeName": "productionPlace", "multiple": false, "typeClass": "primitive", "value": "ProductionPlace" }, { "typeName": "contributor", "multiple": true, "typeClass": "compound", "value": [ { "contributorType": { "typeName": "contributorType", "multiple": false, "typeClass": "controlledVocabulary", "value": "Data Collector" }, "contributorName": { "typeName": "contributorName", "multiple": false, "typeClass": "primitive", "value": "LastContributor1, FirstContributor1" } } ] }, { "typeName": "grantNumber", "multiple": true, "typeClass": "compound", "value": [ { "grantNumberAgency": { "typeName": "grantNumberAgency", "multiple": false, "typeClass": "primitive", "value": "GrantInformationGrantAgency1" }, "grantNumberValue": { "typeName": "grantNumberValue", "multiple": false, "typeClass": "primitive", "value": "GrantInformationGrantNumber1" } } ] }, { "typeName": "distributor", "multiple": true, "typeClass": "compound", "value": [ { "distributorName": { "typeName": "distributorName", "multiple": false, "typeClass": "primitive", 
"value": "LastDistributor1, FirstDistributor1" }, "distributorAffiliation": { "typeName": "distributorAffiliation", "multiple": false, "typeClass": "primitive", "value": "DistributorAffiliation1" }, "distributorAbbreviation": { "typeName": "distributorAbbreviation", "multiple": false, "typeClass": "primitive", "value": "DistributorAbbreviation1" }, "distributorURL": { "typeName": "distributorURL", "multiple": false, "typeClass": "primitive", "value": "http://DistributorURL1.org" }, "distributorLogoURL": { "typeName": "distributorLogoURL", "multiple": false, "typeClass": "primitive", "value": "http://DistributorLogoURL1.org" } } ] }, { "typeName": "distributionDate", "multiple": false, "typeClass": "primitive", "value": "1004-01-01" }, { "typeName": "depositor", "multiple": false, "typeClass": "primitive", "value": "LastDepositor, FirstDepositor" }, { "typeName": "dateOfDeposit", "multiple": false, "typeClass": "primitive", "value": "1002-01-01" }, { "typeName": "timePeriodCovered", "multiple": true, "typeClass": "compound", "value": [ { "timePeriodCoveredStart": { "typeName": "timePeriodCoveredStart", "multiple": false, "typeClass": "primitive", "value": "1005-01-01" }, "timePeriodCoveredEnd": { "typeName": "timePeriodCoveredEnd", "multiple": false, "typeClass": "primitive", "value": "1005-01-02" } } ] }, { "typeName": "dateOfCollection", "multiple": true, "typeClass": "compound", "value": [ { "dateOfCollectionStart": { "typeName": "dateOfCollectionStart", "multiple": false, "typeClass": "primitive", "value": "1006-01-01" }, "dateOfCollectionEnd": { "typeName": "dateOfCollectionEnd", "multiple": false, "typeClass": "primitive", "value": "1006-01-01" } } ] }, { "typeName": "kindOfData", "multiple": true, "typeClass": "primitive", "value": [ "KindOfData1", "KindOfData2" ] }, { "typeName": "language", "multiple": true, "typeClass": "controlledVocabulary", "value": [ "German" ] }, { "typeName": "series", "multiple": false, "typeClass": "compound", "value": { "seriesName": { "typeName": "seriesName", "multiple": false, "typeClass": "primitive", "value": "SeriesName" }, "seriesInformation": { "typeName": "seriesInformation", "multiple": false, "typeClass": "primitive", "value": "SeriesInformation" } } }, { "typeName": "software", "multiple": true, "typeClass": "compound", "value": [ { "softwareName": { "typeName": "softwareName", "multiple": false, "typeClass": "primitive", "value": "SoftwareName1" }, "softwareVersion": { "typeName": "softwareVersion", "multiple": false, "typeClass": "primitive", "value": "SoftwareVersion1" } } ] }, { "typeName": "relatedMaterial", "multiple": true, "typeClass": "primitive", "value": [ "RelatedMaterial1", "RelatedMaterial2" ] }, { "typeName": "relatedDatasets", "multiple": true, "typeClass": "primitive", "value": [ "RelatedDatasets1", "RelatedDatasets2" ] }, { "typeName": "otherReferences", "multiple": true, "typeClass": "primitive", "value": [ "OtherReferences1", "OtherReferences2" ] }, { "typeName": "dataSources", "multiple": true, "typeClass": "primitive", "value": [ "DataSources1", "DataSources2" ] }, { "typeName": "originOfSources", "multiple": false, "typeClass": "primitive", "value": "OriginOfSources" }, { "typeName": "characteristicOfSources", "multiple": false, "typeClass": "primitive", "value": "CharacteristicOfSourcesNoted" }, { "typeName": "accessToSources", "multiple": false, "typeClass": "primitive", "value": "DocumentationAndAccessToSources" } ] }, "geospatial": { "displayName": "Geospatial Metadata", "fields": [ { "typeName": 
"geographicCoverage", "multiple": true, "typeClass": "compound", "value": [ { "country": { "typeName": "country", "multiple": false, "typeClass": "controlledVocabulary", "value": "Afghanistan" }, "state": { "typeName": "state", "multiple": false, "typeClass": "primitive", "value": "GeographicCoverageStateProvince1" }, "city": { "typeName": "city", "multiple": false, "typeClass": "primitive", "value": "GeographicCoverageCity1" }, "otherGeographicCoverage": { "typeName": "otherGeographicCoverage", "multiple": false, "typeClass": "primitive", "value": "GeographicCoverageOther1" } } ] }, { "typeName": "geographicUnit", "multiple": true, "typeClass": "primitive", "value": [ "GeographicUnit1", "GeographicUnit2" ] }, { "typeName": "geographicBoundingBox", "multiple": true, "typeClass": "compound", "value": [ { "westLongitude": { "typeName": "westLongitude", "multiple": false, "typeClass": "primitive", "value": "10" }, "eastLongitude": { "typeName": "eastLongitude", "multiple": false, "typeClass": "primitive", "value": "20" }, "northLongitude": { "typeName": "northLongitude", "multiple": false, "typeClass": "primitive", "value": "30" }, "southLongitude": { "typeName": "southLongitude", "multiple": false, "typeClass": "primitive", "value": "40" } } ] } ] }, "socialscience": { "displayName": "Social Science and Humanities Metadata", "fields": [ { "typeName": "unitOfAnalysis", "multiple": true, "typeClass": "primitive", "value": [ "UnitOfAnalysis1", "UnitOfAnalysis2" ] }, { "typeName": "universe", "multiple": true, "typeClass": "primitive", "value": [ "Universe1", "Universe2" ] }, { "typeName": "timeMethod", "multiple": false, "typeClass": "primitive", "value": "TimeMethod" }, { "typeName": "dataCollector", "multiple": false, "typeClass": "primitive", "value": "LastDataCollector1, FirstDataCollector1" }, { "typeName": "collectorTraining", "multiple": false, "typeClass": "primitive", "value": "CollectorTraining" }, { "typeName": "frequencyOfDataCollection", "multiple": false, "typeClass": "primitive", "value": "Frequency" }, { "typeName": "samplingProcedure", "multiple": false, "typeClass": "primitive", "value": "SamplingProcedure" }, { "typeName": "targetSampleSize", "multiple": false, "typeClass": "compound", "value": { "targetSampleActualSize": { "typeName": "targetSampleActualSize", "multiple": false, "typeClass": "primitive", "value": "100" }, "targetSampleSizeFormula": { "typeName": "targetSampleSizeFormula", "multiple": false, "typeClass": "primitive", "value": "TargetSampleSizeFormula" } } }, { "typeName": "deviationsFromSampleDesign", "multiple": false, "typeClass": "primitive", "value": "MajorDeviationsForSampleDesign" }, { "typeName": "collectionMode", "multiple": false, "typeClass": "primitive", "value": "CollectionMode" }, { "typeName": "researchInstrument", "multiple": false, "typeClass": "primitive", "value": "TypeOfResearchInstrument" }, { "typeName": "dataCollectionSituation", "multiple": false, "typeClass": "primitive", "value": "CharacteristicsOfDataCollectionSituation" }, { "typeName": "actionsToMinimizeLoss", "multiple": false, "typeClass": "primitive", "value": "ActionsToMinimizeLosses" }, { "typeName": "controlOperations", "multiple": false, "typeClass": "primitive", "value": "ControlOperations" }, { "typeName": "weighting", "multiple": false, "typeClass": "primitive", "value": "Weighting" }, { "typeName": "cleaningOperations", "multiple": false, "typeClass": "primitive", "value": "CleaningOperations" }, { "typeName": "datasetLevelErrorNotes", "multiple": false, "typeClass": 
"primitive", "value": "StudyLevelErrorNotes" }, { "typeName": "responseRate", "multiple": false, "typeClass": "primitive", "value": "ResponseRate" }, { "typeName": "samplingErrorEstimates", "multiple": false, "typeClass": "primitive", "value": "EstimatesOfSamplingError" }, { "typeName": "otherDataAppraisal", "multiple": false, "typeClass": "primitive", "value": "OtherFormsOfDataAppraisal" }, { "typeName": "socialScienceNotes", "multiple": false, "typeClass": "compound", "value": { "socialScienceNotesType": { "typeName": "socialScienceNotesType", "multiple": false, "typeClass": "primitive", "value": "NotesType" }, "socialScienceNotesSubject": { "typeName": "socialScienceNotesSubject", "multiple": false, "typeClass": "primitive", "value": "NotesSubject" }, "socialScienceNotesText": { "typeName": "socialScienceNotesText", "multiple": false, "typeClass": "primitive", "value": "NotesText" } } } ] }, "astrophysics": { "displayName": "Astronomy and Astrophysics Metadata", "fields": [ { "typeName": "astroType", "multiple": true, "typeClass": "controlledVocabulary", "value": [ "Image", "Mosaic", "EventList", "Cube" ] }, { "typeName": "astroFacility", "multiple": true, "typeClass": "primitive", "value": [ "Facility1", "Facility2" ] }, { "typeName": "astroInstrument", "multiple": true, "typeClass": "primitive", "value": [ "Instrument1", "Instrument2" ] }, { "typeName": "astroObject", "multiple": true, "typeClass": "primitive", "value": [ "Object1", "Object2" ] }, { "typeName": "resolution.Spatial", "multiple": false, "typeClass": "primitive", "value": "SpatialResolution" }, { "typeName": "resolution.Spectral", "multiple": false, "typeClass": "primitive", "value": "SpectralResolution" }, { "typeName": "resolution.Temporal", "multiple": false, "typeClass": "primitive", "value": "TimeResolution" }, { "typeName": "coverage.Spectral.Bandpass", "multiple": true, "typeClass": "primitive", "value": [ "Bandpass1", "Bandpass2" ] }, { "typeName": "coverage.Spectral.CentralWavelength", "multiple": true, "typeClass": "primitive", "value": [ "3001", "3002" ] }, { "typeName": "coverage.Spectral.Wavelength", "multiple": true, "typeClass": "compound", "value": [ { "coverage.Spectral.MinimumWavelength": { "typeName": "coverage.Spectral.MinimumWavelength", "multiple": false, "typeClass": "primitive", "value": "4001" }, "coverage.Spectral.MaximumWavelength": { "typeName": "coverage.Spectral.MaximumWavelength", "multiple": false, "typeClass": "primitive", "value": "4002" } }, { "coverage.Spectral.MinimumWavelength": { "typeName": "coverage.Spectral.MinimumWavelength", "multiple": false, "typeClass": "primitive", "value": "4003" }, "coverage.Spectral.MaximumWavelength": { "typeName": "coverage.Spectral.MaximumWavelength", "multiple": false, "typeClass": "primitive", "value": "4004" } } ] }, { "typeName": "coverage.Temporal", "multiple": true, "typeClass": "compound", "value": [ { "coverage.Temporal.StartTime": { "typeName": "coverage.Temporal.StartTime", "multiple": false, "typeClass": "primitive", "value": "1007-01-01" }, "coverage.Temporal.StopTime": { "typeName": "coverage.Temporal.StopTime", "multiple": false, "typeClass": "primitive", "value": "1007-01-02" } }, { "coverage.Temporal.StartTime": { "typeName": "coverage.Temporal.StartTime", "multiple": false, "typeClass": "primitive", "value": "1007-02-01" }, "coverage.Temporal.StopTime": { "typeName": "coverage.Temporal.StopTime", "multiple": false, "typeClass": "primitive", "value": "1007-02-02" } } ] }, { "typeName": "coverage.Spatial", "multiple": true, "typeClass": 
"primitive", "value": [ "SkyCoverage1", "SkyCoverage2" ] }, { "typeName": "coverage.Depth", "multiple": false, "typeClass": "primitive", "value": "200" }, { "typeName": "coverage.ObjectDensity", "multiple": false, "typeClass": "primitive", "value": "300" }, { "typeName": "coverage.ObjectCount", "multiple": false, "typeClass": "primitive", "value": "400" }, { "typeName": "coverage.SkyFraction", "multiple": false, "typeClass": "primitive", "value": "500" }, { "typeName": "coverage.Polarization", "multiple": false, "typeClass": "primitive", "value": "Polarization" }, { "typeName": "redshiftType", "multiple": false, "typeClass": "primitive", "value": "RedshiftType" }, { "typeName": "resolution.Redshift", "multiple": false, "typeClass": "primitive", "value": "600" }, { "typeName": "coverage.RedshiftValue", "multiple": true, "typeClass": "compound", "value": [ { "coverage.Redshift.MinimumValue": { "typeName": "coverage.Redshift.MinimumValue", "multiple": false, "typeClass": "primitive", "value": "701" }, "coverage.Redshift.MaximumValue": { "typeName": "coverage.Redshift.MaximumValue", "multiple": false, "typeClass": "primitive", "value": "702" } } ] } ] }, "biomedical": { "displayName": "Life Sciences Metadata", "fields": [ { "typeName": "studyDesignType", "multiple": true, "typeClass": "controlledVocabulary", "value": [ "Case Control", "Cross Sectional", "Cohort Study", "Not Specified" ] }, { "typeName": "studyFactorType", "multiple": true, "typeClass": "controlledVocabulary", "value": [ "Age", "Biomarkers", "Cell Surface Markers", "Developmental Stage" ] }, { "typeName": "studyAssayOrganism", "multiple": true, "typeClass": "controlledVocabulary", "value": [ "Arabidopsis thaliana", "Bos taurus", "Caenorhabditis elegans", "Danio rerio (zebrafish)" ] }, { "typeName": "studyAssayOtherOrganism", "multiple": true, "typeClass": "primitive", "value": [ "OtherOrganism1", "OtherOrganism2" ] }, { "typeName": "studyAssayMeasurementType", "multiple": true, "typeClass": "controlledVocabulary", "value": [ "cell counting", "cell sorting", "clinical chemistry analysis", "DNA methylation profiling" ] }, { "typeName": "studyAssayOtherMeasurmentType", "multiple": true, "typeClass": "primitive", "value": [ "OtherMeasurementType1", "OtherMeasurementType2" ] }, { "typeName": "studyAssayTechnologyType", "multiple": true, "typeClass": "controlledVocabulary", "value": [ "culture based drug susceptibility testing, single concentration", "culture based drug susceptibility testing, two concentrations", "culture based drug susceptibility testing, three or more concentrations (minimum inhibitory concentration measurement)", "flow cytometry" ] }, { "typeName": "studyAssayPlatform", "multiple": true, "typeClass": "controlledVocabulary", "value": [ "210-MS GC Ion Trap (Variant)", "220-MS GC Ion Trap (Variant)", "225-MS GC Ion Trap (Variant)", "300-MS quadrupole GC/MS (Variant)" ] }, { "typeName": "studyAssayCellType", "multiple": true, "typeClass": "primitive", "value": [ "CellType1", "CellType2" ] } ] }, "journal": { "displayName": "Journal Metadata", "fields": [ { "typeName": "journalVolumeIssue", "multiple": true, "typeClass": "compound", "value": [ { "journalVolume": { "typeName": "journalVolume", "multiple": false, "typeClass": "primitive", "value": "JournalVolume1" }, "journalIssue": { "typeName": "journalIssue", "multiple": false, "typeClass": "primitive", "value": "JournalIssue1" }, "journalPubDate": { "typeName": "journalPubDate", "multiple": false, "typeClass": "primitive", "value": "1008-01-01" } } ] }, { 
"typeName": "journalArticleType", "multiple": false, "typeClass": "controlledVocabulary", "value": "abstract" } ] } } } } pyDataverse-0.3.4/tests/data/dataset_upload_min_default.json000066400000000000000000000045601467256651400243400ustar00rootroot00000000000000{ "datasetVersion": { "metadataBlocks": { "citation": { "fields": [ { "value": "Darwin's Finches", "typeClass": "primitive", "multiple": false, "typeName": "title" }, { "value": [ { "authorName": { "value": "Finch, Fiona", "typeClass": "primitive", "multiple": false, "typeName": "authorName" }, "authorAffiliation": { "value": "Birds Inc.", "typeClass": "primitive", "multiple": false, "typeName": "authorAffiliation" } } ], "typeClass": "compound", "multiple": true, "typeName": "author" }, { "value": [ { "datasetContactEmail": { "typeClass": "primitive", "multiple": false, "typeName": "datasetContactEmail", "value": "finch@mailinator.com" }, "datasetContactName": { "typeClass": "primitive", "multiple": false, "typeName": "datasetContactName", "value": "Finch, Fiona" } } ], "typeClass": "compound", "multiple": true, "typeName": "datasetContact" }, { "value": [ { "dsDescriptionValue": { "value": "Darwin's finches (also known as the Gal\u00e1pagos finches) are a group of about fifteen species of passerine birds.", "multiple": false, "typeClass": "primitive", "typeName": "dsDescriptionValue" } } ], "typeClass": "compound", "multiple": true, "typeName": "dsDescription" }, { "value": [ "Medicine, Health and Life Sciences" ], "typeClass": "controlledVocabulary", "multiple": true, "typeName": "subject" } ], "displayName": "Citation Metadata" } } } } pyDataverse-0.3.4/tests/data/dataverse_upload_full.json000066400000000000000000000005031467256651400233350ustar00rootroot00000000000000{ "name": "Scientific Research", "alias": "science", "dataverseContacts": [ { "contactEmail": "pi@example.edu" }, { "contactEmail": "student@example.edu" } ], "affiliation": "Scientific Research University", "description": "We do all the science.", "dataverseType": "LABORATORY" } pyDataverse-0.3.4/tests/data/dataverse_upload_min.json000066400000000000000000000002211467256651400231530ustar00rootroot00000000000000{ "alias": "test-pyDataverse", "name": "Test pyDataverse", "dataverseContacts": [ { "contactEmail": "info@aussda.at" } ] } pyDataverse-0.3.4/tests/data/file_upload_ds_minimum.json000066400000000000000000000037331467256651400235050ustar00rootroot00000000000000{ "datasetVersion": { "metadataBlocks": { "citation": { "fields": [ { "multiple": true, "typeClass": "compound", "typeName": "author", "value": [ { "authorName": { "multiple": false, "typeClass": "primitive", "typeName": "authorName", "value": "John Doe" } } ] }, { "multiple": true, "typeClass": "compound", "typeName": "datasetContact", "value": [ { "datasetContactName": { "multiple": false, "typeClass": "primitive", "typeName": "datasetContactName", "value": "John Doe" }, "datasetContactEmail": { "multiple": false, "typeClass": "primitive", "typeName": "datasetContactEmail", "value": "john@doe.com" } } ] }, { "multiple": true, "typeClass": "compound", "typeName": "dsDescription", "value": [ { "dsDescriptionValue": { "multiple": false, "typeClass": "primitive", "typeName": "dsDescriptionValue", "value": "This is a description of the dataset" } } ] }, { "multiple": true, "typeClass": "controlledVocabulary", "typeName": "subject", "value": [ "Other" ] }, { "multiple": false, "typeClass": "primitive", "typeName": "title", "value": "My dataset" } ] } } } } 
pyDataverse-0.3.4/tests/data/output/000077500000000000000000000000001467256651400174405ustar00rootroot00000000000000pyDataverse-0.3.4/tests/data/output/.gitkeep000066400000000000000000000000001467256651400210570ustar00rootroot00000000000000pyDataverse-0.3.4/tests/data/replace.xyz000066400000000000000000000000701467256651400202640ustar00rootroot00000000000000A B C 3.4292 -4.32647 -1.66819 3.4292 -4.51647 -1.65310 pyDataverse-0.3.4/tests/data/tree.json000066400000000000000000000047031467256651400177360ustar00rootroot00000000000000[ { "dataverse_alias": "parent_dv_1", "type": "dataverse", "dataverse_id": 1, "children": [ { "dataverse_alias": "parent_dv_1_sub_dv_1", "type": "dataverse", "dataverse_id": 3 }, { "dataset_id": "1AB23C", "pid": "doi:12.34567/1AB23C", "type": "dataset", "children": [ { "datafile_id": 1, "filename": "appendix.pdf", "label": "appendix.pdf", "pid": "doi:12.34567/1AB23C/ABC123", "type": "datafile" }, { "datafile_id": 2, "filename": "survey.zsav", "label": "survey.zsav", "pid": "doi:12.34567/1AB23C/DEF456", "type": "datafile" } ] }, { "dataset_id": "4DE56F", "pid": "doi:12.34567/4DE56F", "type": "dataset", "children": [ { "datafile_id": 3, "filename": "manual.pdf", "label": "manual.pdf", "pid": "doi:12.34567/4DE56F/GHI789", "type": "datafile" } ] } ] }, { "dataverse_alias": "parent_dv_2", "type": "dataverse", "dataverse_id": 2, "children": [ { "dataset_id": "7GH89I", "pid": "doi:12.34567/7GH89I", "type": "dataset", "children": [ { "datafile_id": 4, "filename": "study.zsav", "label": "study.zsav", "pid": "doi:12.34567/7GH89I/JKL012", "type": "datafile" } ] }, { "dataset_id": "0JK1LM", "pid": "doi:12.34567/0JK1LM", "type": "dataset", "children": [ { "datafile_id": 5, "filename": "documentation.pdf", "label": "documentation.pdf", "pid": "doi:12.34567/0JK1LM/MNO345", "type": "datafile" }, { "datafile_id": 6, "filename": "data.R", "label": "data.R", "pid": "doi:12.34567/0JK1LM/PQR678", "type": "datafile" } ] } ] }, { "dataset_id": "2NO34P", "pid": "doi:12.34567/2NO34P", "type": "dataset", "children": [ { "datafile_id": 7, "filename": "summary.md", "label": "summary.md", "pid": "doi:12.34567/2NO34P/STU901", "type": "datafile" } ] } ] pyDataverse-0.3.4/tests/data/user-guide/000077500000000000000000000000001467256651400201515ustar00rootroot00000000000000pyDataverse-0.3.4/tests/data/user-guide/datafile.txt000066400000000000000000000000071467256651400224600ustar00rootroot00000000000000hello! 
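# ---------------------------------------------------------------------------
# Editor's note: hedged usage sketch, not part of the original test data. The
# tree.json fixture above mirrors the structure consumed by
# pyDataverse.utils.dataverse_tree_walker(); the snippet shows how it pairs
# with save_tree_data(). The output filenames are the function's defaults, and
# the fixture path assumes the repository root as working directory.
#
#     from pyDataverse.utils import dataverse_tree_walker, read_json, save_tree_data
#
#     tree = read_json("tests/data/tree.json")
#     dataverses, datasets, datafiles = dataverse_tree_walker(tree)
#     save_tree_data(dataverses, datasets, datafiles)
# ---------------------------------------------------------------------------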
pyDataverse-0.3.4/tests/data/user-guide/datafiles.csv000066400000000000000000000006721467256651400226270ustar00rootroot00000000000000"org.datafile_id","org.dataset_id","org.filename","org.to_upload","org.is_uploaded","org.to_delete","org.is_deleted","org.to_update","org.is_updated","dv.datafile_id","dv.description","dv.categories","dv.restrict","dv.label","dv.directoryLabel","alma.title","alma.pages","alma.year" 1,1,"datafile.txt","TRUE","TRUE","TRUE","TRUE","TRUE","TRUE",634,"My description bbb.","[""Data""]","FALSE","Text Report","data/subdir1","Text Report",23,1997 pyDataverse-0.3.4/tests/data/user-guide/dataset.json000066400000000000000000000044741467256651400225020ustar00rootroot00000000000000{ "datasetVersion": { "metadataBlocks": { "citation": { "fields": [ { "value": "Youth in Austria 2005", "typeClass": "primitive", "multiple": false, "typeName": "title" }, { "value": [ { "authorName": { "value": "LastAuthor1, FirstAuthor1", "typeClass": "primitive", "multiple": false, "typeName": "authorName" }, "authorAffiliation": { "value": "AuthorAffiliation1", "typeClass": "primitive", "multiple": false, "typeName": "authorAffiliation" } } ], "typeClass": "compound", "multiple": true, "typeName": "author" }, { "value": [ { "datasetContactEmail": { "typeClass": "primitive", "multiple": false, "typeName": "datasetContactEmail", "value": "ContactEmail1@mailinator.com" }, "datasetContactName": { "typeClass": "primitive", "multiple": false, "typeName": "datasetContactName", "value": "LastContact1, FirstContact1" } } ], "typeClass": "compound", "multiple": true, "typeName": "datasetContact" }, { "value": [ { "dsDescriptionValue": { "value": "DescriptionText", "multiple": false, "typeClass": "primitive", "typeName": "dsDescriptionValue" } } ], "typeClass": "compound", "multiple": true, "typeName": "dsDescription" }, { "value": [ "Medicine, Health and Life Sciences" ], "typeClass": "controlledVocabulary", "multiple": true, "typeName": "subject" } ], "displayName": "Citation Metadata" } } } } 
pyDataverse-0.3.4/tests/data/user-guide/datasets.csv000066400000000000000000000122241467256651400224770ustar00rootroot00000000000000"org.dataset_id","org.dataverse_id","org.doi","org.privateurl","org.to_upload","org.is_uploaded","org.to_publish","org.is_published","org.to_delete","org.is_deleted","org.to_update","org.is_updated","dv.license","dv.termsOfAccess","dv.termsOfUse","dv.otherId","dv.title","dv.subtitle","dv.alternativeTitle","dv.series","dv.notesText","dv.author","dv.dsDescription","dv.subject","dv.keyword","dv.topicClassification","dv.language","dv.grantNumber","dv.dateOfCollection","dv.kindOfData","dv.dataSources","dv.accessToSources","dv.alternativeURL","dv.characteristicOfSources","dv.dateOfDeposit","dv.depositor","dv.distributionDate","dv.otherReferences","dv.productionDate","dv.productionPlace","dv.contributor","dv.relatedDatasets","dv.relatedMaterial","dv.datasetContact","dv.distributor","dv.producer","dv.publication","dv.software","dv.timePeriodCovered","dv.geographicUnit","dv.geographicBoundingBox","dv.geographicCoverage","dv.actionsToMinimizeLoss","dv.cleaningOperations","dv.collectionMode","dv.collectorTraining","dv.controlOperations","dv.dataCollectionSituation","dv.dataCollector","dv.datasetLevelErrorNotes","dv.deviationsFromSampleDesign","dv.frequencyOfDataCollection","dv.otherDataAppraisal","dv.socialScienceNotes","dv.researchInstrument","dv.responseRate","dv.samplingErrorEstimates","dv.samplingProcedure","dv.unitOfAnalysis","dv.universe","dv.timeMethod","dv.weighting","dv.fileAccessRequest" 1,1,"doi:10.11587/19ZW6I",,"TRUE","TRUE","TRUE","TRUE","TRUE","TRUE","TRUE","TRUE","CC0","Terms of Access","CC0 Waiver","[{""otherIdAgency"": ""OtherIDAgency1"", ""otherIdValue"": ""OtherIDIdentifier1""}]","Replication Data for: Title","Subtitle","Alternative Title","{""seriesName"": ""SeriesName"", ""seriesInformation"": ""SeriesInformation""}","Notes1","[{""authorName"": ""LastAuthor1, FirstAuthor1"", ""authorAffiliation"": ""AuthorAffiliation1"", ""authorIdentifierScheme"": ""ORCID"", ""authorIdentifier"": ""AuthorIdentifier1""}]","[{""dsDescriptionValue"": ""DescriptionText2"", ""dsDescriptionDate"": ""1000-02-02""}]","[""Agricultural Sciences"", ""Business and Management"", ""Engineering"", ""Law""]","[{""keywordValue"": ""KeywordTerm1"", ""keywordVocabulary"": ""KeywordVocabulary1"", ""keywordVocabularyURI"": ""http://KeywordVocabularyURL1.org""}]","[{""topicClassValue"": ""Topic Class Value1"", ""topicClassVocab"": ""Topic Classification Vocabulary"", ""topicClassVocabURI"": ""http://www.topicURL.net""}]","[""English"", ""German""]","[{""grantNumberAgency"": ""GrantInformationGrantAgency1"", ""grantNumberValue"": ""GrantInformationGrantNumber1""}]","[{""dateOfCollectionStart"": ""1006-01-01"", ""dateOfCollectionEnd"": ""1006-01-01""}]","[""KindOfData1"", ""KindOfData2""]","[""DataSources1"", ""DataSources2""]","DocumentationAndAccessToSources","http://AlternativeURL.org","CharacteristicOfSourcesNoted","1002-01-01","LastDepositor, FirstDepositor","1004-01-01","[""OtherReferences1"", ""OtherReferences2""]","1003-01-01","ProductionPlace","[{""contributorType"": ""Data Collector"", ""contributorName"": ""LastContributor1, FirstContributor1""}]","[""RelatedDatasets1"", ""RelatedDatasets2""]","[""RelatedMaterial1"", ""RelatedMaterial2""]","[{""datasetContactName"": ""LastContact1, FirstContact1"", ""datasetContactAffiliation"": ""ContactAffiliation1"", ""datasetContactEmail"": ""ContactEmail1@mailinator.com""}]","[{""distributorName"": ""LastDistributor1, FirstDistributor1"", 
""distributorAffiliation"": ""DistributorAffiliation1"", ""distributorAbbreviation"": ""DistributorAbbreviation1"", ""distributorURL"": ""http://DistributorURL1.org"", ""distributorLogoURL"": ""http://DistributorLogoURL1.org""}]","[{""producerName"": ""LastProducer1, FirstProducer1"", ""producerAffiliation"": ""ProducerAffiliation1"", ""producerAbbreviation"": ""ProducerAbbreviation1"", ""producerURL"": ""http://ProducerURL1.org"", ""producerLogoURL"": ""http://ProducerLogoURL1.org""}]","[{""publicationCitation"": ""RelatedPublicationCitation1"", ""publicationIDType"": ""ark"", ""publicationIDNumber"": ""RelatedPublicationIDNumber1"", ""publicationURL"": ""http://RelatedPublicationURL1.org""}]","[{""softwareName"": ""SoftwareName1"", ""softwareVersion"": ""SoftwareVersion1""}]","[{""timePeriodCoveredStart"": ""1005-01-01"", ""timePeriodCoveredEnd"": ""1005-01-02""}]","[""GeographicUnit1"", ""GeographicUnit2""]","[{""westLongitude"": ""10"", ""eastLongitude"": ""20"", ""northLongitude"": ""30"", ""southLongitude"": ""40""}]","[{""country"": ""Afghanistan"", ""state"": ""GeographicCoverageStateProvince1"", ""city"": ""GeographicCoverageCity1"", ""otherGeographicCoverage"": ""GeographicCoverageOther1""}]","ActionsToMinimizeLosses","CleaningOperations","CollectionMode","CollectorTraining","ControlOperations","CharacteristicsOfDataCollectionSituation","LastDataCollector1, FirstDataCollector1","StudyLevelErrorNotes","MajorDeviationsForSampleDesign","Frequency","OtherFormsOfDataAppraisal","[{""socialScienceNotesType"": ""NotesType"", ""socialScienceNotesSubject"": ""NotesSubject"", ""socialScienceNotesText"": ""NotesText""}]","TypeOfResearchInstrument","ResponseRate","EstimatesOfSamplingError","SamplingProcedure","[""UnitOfAnalysis1"", ""UnitOfAnalysis2""]","[""Universe1"", ""Universe2""]","TimeMethod","Weighting","True" pyDataverse-0.3.4/tests/data/user-guide/dataverse.json000066400000000000000000000002371467256651400230240ustar00rootroot00000000000000{ "alias": "pyDataverse_user-guide", "name": "pyDataverse - User Guide", "dataverseContacts": [ { "contactEmail": "info@aussda.at" } ] } pyDataverse-0.3.4/tests/data/user.json000066400000000000000000000001771467256651400177560ustar00rootroot00000000000000{ "email": "admin@stefankasberger.at", "firstName": "pyDataverse", "lastName": "Test", "userName": "pyDataverseTest" } pyDataverse-0.3.4/tests/logs/000077500000000000000000000000001467256651400161335ustar00rootroot00000000000000pyDataverse-0.3.4/tests/logs/.gitkeep000066400000000000000000000000001467256651400175520ustar00rootroot00000000000000pyDataverse-0.3.4/tests/models/000077500000000000000000000000001467256651400164525ustar00rootroot00000000000000pyDataverse-0.3.4/tests/models/__init__.py000066400000000000000000000000001467256651400205510ustar00rootroot00000000000000pyDataverse-0.3.4/tests/models/test_datafile.py000066400000000000000000000426731467256651400216500ustar00rootroot00000000000000"""Datafile data model tests.""" import json import jsonschema import os import platform import pytest from pyDataverse.models import Datafile from pyDataverse.utils import read_file, write_json from ..conftest import test_config def data_object(): """Get Datafile object. Returns ------- pydataverse.models.Datafile :class:`Datafile` object. """ return Datafile() def dict_flat_set_min(): """Get flat dict for set() of minimum Datafile. Returns ------- dict Flat dict with minimum Datafile data. 
""" return {"pid": "doi:10.11587/RRKEA9", "filename": "10109_qu_de_v1_0.pdf"} def dict_flat_set_full(): """Get flat dict for set() of full Datafile. Returns ------- dict Flat dict with full Datafile data. """ return { "pid": "doi:10.11587/NVWE8Y", "filename": "20001_ta_de_v1_0.pdf", "description": "Another data file.", "restrict": True, "categories": ["Documentation"], "label": "Questionnaire", "directoryLabel": "data/subdir1", } def object_data_init(): """Get dictionary for Datafile with initial attributes. Returns ------- dict Dictionary of init data attributes set. """ return { "_Datafile_default_json_format": "dataverse_upload", "_Datafile_default_json_schema_filename": test_config[ "datafile_upload_schema_filename" ], "_Datafile_allowed_json_formats": ["dataverse_upload", "dataverse_download"], "_Datafile_json_dataverse_upload_attr": [ "description", "categories", "restrict", "label", "directoryLabel", "pid", "filename", ], "_internal_attributes": [], } def object_data_min(): """Get dictionary for Datafile with minimum attributes. Returns ------- pyDataverse.Datafile :class:`Datafile` with minimum attributes set. """ return {"pid": "doi:10.11587/RRKEA9", "filename": "10109_qu_de_v1_0.pdf"} def object_data_full(): """Get flat dict for :func:`get()` with initial data of Datafile. Returns ------- pyDataverse.Datafile :class:`Datafile` with full attributes set. """ return { "pid": "doi:10.11587/NVWE8Y", "filename": "20001_ta_de_v1_0.pdf", "description": "Another data file.", "restrict": True, "categories": ["Documentation"], "label": "Questionnaire", "directoryLabel": "data/subdir1", } def dict_flat_get_min(): """Get flat dict for :func:`get` with minimum data of Datafile. Returns ------- dict Minimum Datafile dictionary returned by :func:`get`. """ return {"pid": "doi:10.11587/RRKEA9", "filename": "10109_qu_de_v1_0.pdf"} def dict_flat_get_full(): """Get flat dict for :func:`get` of full data of Datafile. Returns ------- dict Full Datafile dictionary returned by :func:`get`. """ return { "pid": "doi:10.11587/NVWE8Y", "filename": "20001_ta_de_v1_0.pdf", "description": "Another data file.", "restrict": True, "categories": ["Documentation"], "label": "Questionnaire", "directoryLabel": "data/subdir1", } def json_upload_min(): """Get JSON string of minimum Datafile. Returns ------- dict JSON string. """ return read_file(test_config["datafile_upload_min_filename"]) def json_upload_full(): """Get JSON string of full Datafile. Returns ------- str JSON string. """ return read_file(test_config["datafile_upload_full_filename"]) def json_dataverse_upload_attr(): """List of attributes import or export in format `dataverse_upload`. Returns ------- list List of attributes, which will be used for import and export. """ return [ "description", "categories", "restrict", "label", "directoryLabel", "pid", "filename", ] def json_dataverse_upload_required_attr(): """List of attributes required for `dataverse_upload` JSON. Returns ------- list List of attributes, which will be used for import and export. 
""" return ["pid", "filename"] class TestDatafileGeneric(object): """Generic tests for Datafile().""" def test_datafile_set_and_get_valid(self): """Test Datafile.get() with valid data.""" data = [ ((dict_flat_set_min(), object_data_min()), dict_flat_get_min()), ((dict_flat_set_full(), object_data_full()), dict_flat_get_full()), (({}, {}), {}), ] pdv = data_object() pdv.set(dict_flat_set_min()) assert isinstance(pdv.get(), dict) for input, data_eval in data: pdv = data_object() pdv.set(input[0]) data = pdv.get() for key, val in data_eval.items(): assert data[key] == input[1][key] == data_eval[key] assert len(data) == len(input[1]) == len(data_eval) def test_datafile_set_invalid(self): """Test Datafile.set() with invalid data.""" # invalid data for data in test_config["invalid_set_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.set(data) def test_datafile_from_json_valid(self): """Test Datafile.from_json() with valid data.""" data = [ (({json_upload_min()}, {}), object_data_min()), (({json_upload_full()}, {}), object_data_full()), ( ({json_upload_min()}, {"data_format": "dataverse_upload"}), object_data_min(), ), (({json_upload_min()}, {"validate": False}), object_data_min()), ( ( {json_upload_min()}, {"filename_schema": "wrong", "validate": False}, ), object_data_min(), ), ( ( {json_upload_min()}, { "filename_schema": test_config[ "datafile_upload_schema_filename" ], "validate": True, }, ), object_data_min(), ), (({"{}"}, {"validate": False}), {}), ] for input, data_eval in data: pdv = data_object() args = input[0] kwargs = input[1] pdv.from_json(*args, **kwargs) for key, val in data_eval.items(): assert getattr(pdv, key) == data_eval[key] assert len(pdv.__dict__) - len(object_data_init()) == len(data_eval) def test_datafile_from_json_invalid(self): """Test Datafile.from_json() with invalid data.""" # invalid data for data in test_config["invalid_json_data_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.from_json(data, validate=False) if int(platform.python_version_tuple()[1]) >= 5: for json_string in test_config["invalid_json_strings"]: with pytest.raises(json.decoder.JSONDecodeError): pdv = data_object() pdv.from_json(json_string, validate=False) else: for json_string in test_config["invalid_json_strings"]: with pytest.raises(ValueError): pdv = data_object() pdv.from_json(json_string, validate=False) # invalid `filename_schema` for filename_schema in test_config["invalid_filename_strings"]: with pytest.raises(FileNotFoundError): pdv = data_object() pdv.from_json(json_upload_min(), filename_schema=filename_schema) for filename_schema in test_config["invalid_filename_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.from_json(json_upload_min(), filename_schema=filename_schema) # invalid `data_format` for data_format in ( test_config["invalid_data_format_types"] + test_config["invalid_data_format_strings"] ): with pytest.raises(AssertionError): pdv = data_object() pdv.from_json( json_upload_min(), data_format=data_format, validate=False ) # invalid `validate` for validate in test_config["invalid_validate_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.from_json(json_upload_min(), validate=validate) with pytest.raises(jsonschema.exceptions.ValidationError): pdv = data_object() pdv.from_json("{}") for attr in json_dataverse_upload_required_attr(): with pytest.raises(jsonschema.exceptions.ValidationError): pdv = data_object() data = json.loads(json_upload_min()) del data[attr] data = json.dumps(data) 
pdv.from_json(data, validate=True) def test_datafile_to_json_valid(self): """Test Datafile.json() with valid data.""" data = [ ((dict_flat_set_min(), {}), json.loads(json_upload_min())), ((dict_flat_set_full(), {}), json.loads(json_upload_full())), ( (dict_flat_set_min(), {"data_format": "dataverse_upload"}), json.loads(json_upload_min()), ), ( (dict_flat_set_min(), {"validate": False}), json.loads(json_upload_min()), ), ( ( dict_flat_set_min(), {"filename_schema": "wrong", "validate": False}, ), json.loads(json_upload_min()), ), ( ( dict_flat_set_min(), { "filename_schema": test_config[ "datafile_upload_schema_filename" ], "validate": True, }, ), json.loads(json_upload_min()), ), (({}, {"validate": False}), {}), ] pdv = data_object() pdv.set(dict_flat_set_min()) assert isinstance(pdv.json(), str) for input, data_eval in data: pdv = data_object() pdv.set(input[0]) kwargs = input[1] data = json.loads(pdv.json(**kwargs)) for key, val in data_eval.items(): assert data[key] == data_eval[key] assert len(data) == len(data_eval) def test_datafile_to_json_invalid(self): """Test Datafile.json() with non-valid data.""" # invalid `filename_schema` for filename_schema in test_config["invalid_filename_strings"]: with pytest.raises(FileNotFoundError): obj = data_object() obj.json(filename_schema=filename_schema) for filename_schema in test_config["invalid_filename_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.json(filename_schema=filename_schema) # invalid `data_format` for data_format in ( test_config["invalid_data_format_types"] + test_config["invalid_data_format_strings"] ): with pytest.raises(AssertionError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.json(data_format=data_format, validate=False) # invalid `validate` for validate in test_config["invalid_validate_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.json(validate=validate) with pytest.raises(jsonschema.exceptions.ValidationError): pdv = data_object() pdv.set({}) pdv.json() for attr in json_dataverse_upload_required_attr(): with pytest.raises(jsonschema.exceptions.ValidationError): pdv = data_object() data = json.loads(json_upload_min()) del data[attr] pdv.set(data) pdv.json(validate=True) def test_datafile_validate_json_valid(self): """Test Datafile.validate_json() with valid data.""" data = [ ((dict_flat_set_min(), {}), True), ((dict_flat_set_full(), {}), True), ((dict_flat_set_min(), {"data_format": "dataverse_upload"}), True), ( ( dict_flat_set_min(), { "data_format": "dataverse_upload", "filename_schema": test_config[ "datafile_upload_schema_filename" ], }, ), True, ), ( ( dict_flat_set_min(), {"filename_schema": test_config["datafile_upload_schema_filename"]}, ), True, ), ] for input, data_eval in data: pdv = data_object() pdv.set(input[0]) assert pdv.validate_json() == data_eval def test_datafile_validate_json_invalid(self): """Test Datafile.validate_json() with non-valid data.""" # invalid data for attr in json_dataverse_upload_required_attr(): with pytest.raises(jsonschema.exceptions.ValidationError): for data in [dict_flat_set_min(), dict_flat_set_full()]: pdv = data_object() pdv.set(data) delattr(pdv, attr) pdv.validate_json() # invalid `filename_schema` for filename_schema in test_config["invalid_filename_strings"]: with pytest.raises(FileNotFoundError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.validate_json(filename_schema=filename_schema) for filename_schema in test_config["invalid_filename_types"]: with 
pytest.raises(AssertionError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.validate_json(filename_schema=filename_schema) class TestDatafileSpecific(object): """Specific tests for Datafile().""" def test_datafile_init_valid(self): """Test Datafile.__init__() with valid data.""" # specific data = [ (Datafile(), {}), (Datafile(dict_flat_set_min()), object_data_min()), (Datafile(dict_flat_set_full()), object_data_full()), (Datafile({}), {}), ] for pdv, data_eval in data: for key, val in data_eval.items(): assert getattr(pdv, key) == data_eval[key] assert len(pdv.__dict__) - len(object_data_init()) == len(data_eval) def test_datafile_init_invalid(self): """Test Datafile.init() with invalid data.""" pdv = Datafile() # invalid data for data in ["invalid_set_types"]: with pytest.raises(AssertionError): pdv.set(data) if not os.environ.get("TRAVIS"): class TestDatafileGenericTravisNot(object): """Generic tests for Datafile(), not running on Travis (no file-write permissions).""" def test_dataverse_from_json_to_json_valid(self): """Test Dataverse to JSON from JSON with valid data.""" data = [ ({json_upload_min()}, {}), ({json_upload_full()}, {}), ({json_upload_min()}, {"data_format": "dataverse_upload"}), ({json_upload_min()}, {"validate": False}), ( {json_upload_min()}, {"filename_schema": "wrong", "validate": False}, ), ( {json_upload_min()}, { "filename_schema": test_config[ "datafile_upload_schema_filename" ], "validate": True, }, ), ({"{}"}, {"validate": False}), ] for args_from, kwargs_from in data: pdv_start = data_object() args = args_from kwargs = kwargs_from pdv_start.from_json(*args, **kwargs) if "validate" in kwargs: if not kwargs["validate"]: kwargs = {"validate": False} write_json( test_config["datafile_json_output_filename"], json.loads(pdv_start.json(**kwargs)), ) pdv_end = data_object() kwargs = kwargs_from pdv_end.from_json( read_file(test_config["datafile_json_output_filename"]), **kwargs ) for key, val in pdv_end.get().items(): assert getattr(pdv_start, key) == getattr(pdv_end, key) assert len(pdv_start.__dict__) == len( pdv_end.__dict__, ) pyDataverse-0.3.4/tests/models/test_dataset.py000066400000000000000000001225531467256651400215200ustar00rootroot00000000000000"""Dataset data model tests.""" import json import os import platform import pytest from pyDataverse.models import Dataset from ..conftest import test_config def read_file(filename, mode="r"): """Read in a file. Parameters ---------- filename : str Filename with full path. mode : str Read mode of file. Defaults to `r`. See more at https://docs.python.org/3.5/library/functions.html#open Returns ------- str Returns data as string. """ with open(filename, mode) as f: return f.read() def write_json(filename, data, mode="w", encoding="utf-8"): """Write data to a json file. Parameters ---------- filename : str Filename with full path. data : dict Data to be written in the json file. mode : str Write mode of file. Defaults to `w`. See more at https://docs.python.org/3/library/functions.html#open """ with open(filename, mode, encoding=encoding) as f: json.dump(data, f, indent=2) def data_object(): """Get Dataset object. Returns ------- pydataverse.models.Dataset :class:`Dataset` object. """ return Dataset() def dict_flat_set_min(): """Get flat dict for set() of minimum Dataset. Returns ------- dict Flat dict with minimum Dataset data. 
""" return { "title": "Darwin's Finches", "author": [{"authorName": "Finch, Fiona", "authorAffiliation": "Birds Inc."}], "datasetContact": [ { "datasetContactEmail": "finch@mailinator.com", "datasetContactName": "Finch, Fiona", } ], "dsDescription": [ { "dsDescriptionValue": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds." } ], "subject": ["Medicine, Health and Life Sciences"], "citation_displayName": "Citation Metadata", } def dict_flat_set_full(): """Get flat dict for set() of full Dataset. Returns ------- dict Flat dict with full Dataset data. """ return { "license": "CC0", "termsOfUse": "CC0 Waiver", "termsOfAccess": "Terms of Access", "fileAccessRequest": True, "protocol": "doi", "authority": "10.11587", "identifier": "6AQBYW", "citation_displayName": "Citation Metadata", "title": "Replication Data for: Title", "subtitle": "Subtitle", "alternativeTitle": "Alternative Title", "alternativeURL": "http://AlternativeURL.org", "otherId": [ {"otherIdAgency": "OtherIDAgency1", "otherIdValue": "OtherIDIdentifier1"} ], "author": [ { "authorName": "LastAuthor1, FirstAuthor1", "authorAffiliation": "AuthorAffiliation1", "authorIdentifierScheme": "ORCID", "authorIdentifier": "AuthorIdentifier1", } ], "datasetContact": [ { "datasetContactName": "LastContact1, FirstContact1", "datasetContactAffiliation": "ContactAffiliation1", "datasetContactEmail": "ContactEmail1@mailinator.com", } ], "dsDescription": [ { "dsDescriptionValue": "DescriptionText2", "dsDescriptionDate": "1000-02-02", } ], "subject": [ "Agricultural Sciences", "Business and Management", "Engineering", "Law", ], "keyword": [ { "keywordValue": "KeywordTerm1", "keywordVocabulary": "KeywordVocabulary1", "keywordVocabularyURI": "http://KeywordVocabularyURL1.org", } ], "topicClassification": [ { "topicClassValue": "Topic Class Value1", "topicClassVocab": "Topic Classification Vocabulary", "topicClassVocabURI": "https://topic.class/vocab/uri", } ], "publication": [ { "publicationCitation": "RelatedPublicationCitation1", "publicationIDType": "ark", "publicationIDNumber": "RelatedPublicationIDNumber1", "publicationURL": "http://RelatedPublicationURL1.org", } ], "notesText": "Notes1", "producer": [ { "producerName": "LastProducer1, FirstProducer1", "producerAffiliation": "ProducerAffiliation1", "producerAbbreviation": "ProducerAbbreviation1", "producerURL": "http://ProducerURL1.org", "producerLogoURL": "http://ProducerLogoURL1.org", } ], "productionDate": "1003-01-01", "productionPlace": "ProductionPlace", "contributor": [ { "contributorType": "Data Collector", "contributorName": "LastContributor1, FirstContributor1", } ], "grantNumber": [ { "grantNumberAgency": "GrantInformationGrantAgency1", "grantNumberValue": "GrantInformationGrantNumber1", } ], "distributor": [ { "distributorName": "LastDistributor1, FirstDistributor1", "distributorAffiliation": "DistributorAffiliation1", "distributorAbbreviation": "DistributorAbbreviation1", "distributorURL": "http://DistributorURL1.org", "distributorLogoURL": "http://DistributorLogoURL1.org", } ], "distributionDate": "1004-01-01", "depositor": "LastDepositor, FirstDepositor", "dateOfDeposit": "1002-01-01", "timePeriodCovered": [ { "timePeriodCoveredStart": "1005-01-01", "timePeriodCoveredEnd": "1005-01-02", } ], "dateOfCollection": [ {"dateOfCollectionStart": "1006-01-01", "dateOfCollectionEnd": "1006-01-01"} ], "kindOfData": ["KindOfData1", "KindOfData2"], "language": ["German"], "series": { "seriesName": "SeriesName", "seriesInformation": 
"SeriesInformation", }, "software": [ {"softwareName": "SoftwareName1", "softwareVersion": "SoftwareVersion1"} ], "relatedMaterial": ["RelatedMaterial1", "RelatedMaterial2"], "relatedDatasets": ["RelatedDatasets1", "RelatedDatasets2"], "otherReferences": ["OtherReferences1", "OtherReferences2"], "dataSources": ["DataSources1", "DataSources2"], "originOfSources": "OriginOfSources", "characteristicOfSources": "CharacteristicOfSourcesNoted", "accessToSources": "DocumentationAndAccessToSources", "geospatial_displayName": "Geospatial Metadata", "geographicCoverage": [ { "country": "Afghanistan", "state": "GeographicCoverageStateProvince1", "city": "GeographicCoverageCity1", "otherGeographicCoverage": "GeographicCoverageOther1", } ], "geographicUnit": ["GeographicUnit1", "GeographicUnit2"], "geographicBoundingBox": [ { "westLongitude": "10", "eastLongitude": "20", "northLongitude": "30", "southLongitude": "40", } ], "socialscience_displayName": "Social Science and Humanities Metadata", "unitOfAnalysis": ["UnitOfAnalysis1", "UnitOfAnalysis2"], "universe": ["Universe1", "Universe2"], "timeMethod": "TimeMethod", "dataCollector": "LastDataCollector1, FirstDataCollector1", "collectorTraining": "CollectorTraining", "frequencyOfDataCollection": "Frequency", "samplingProcedure": "SamplingProcedure", "targetSampleSize": { "targetSampleActualSize": "100", "targetSampleSizeFormula": "TargetSampleSizeFormula", }, "deviationsFromSampleDesign": "MajorDeviationsForSampleDesign", "collectionMode": "CollectionMode", "researchInstrument": "TypeOfResearchInstrument", "dataCollectionSituation": "CharacteristicsOfDataCollectionSituation", "actionsToMinimizeLoss": "ActionsToMinimizeLosses", "controlOperations": "ControlOperations", "weighting": "Weighting", "cleaningOperations": "CleaningOperations", "datasetLevelErrorNotes": "StudyLevelErrorNotes", "responseRate": "ResponseRate", "samplingErrorEstimates": "EstimatesOfSamplingError", "otherDataAppraisal": "OtherFormsOfDataAppraisal", "socialScienceNotes": { "socialScienceNotesType": "NotesType", "socialScienceNotesSubject": "NotesSubject", "socialScienceNotesText": "NotesText", }, "journal_displayName": "Journal Metadata", "journalVolumeIssue": [ { "journalVolume": "JournalVolume1", "journalIssue": "JournalIssue1", "journalPubDate": "1008-01-01", } ], "journalArticleType": "abstract", } def object_data_init(): """Get dictionary for Dataset with initial attributes. Returns ------- dict Dictionary of init data attributes set. """ return { "_Dataset_default_json_format": "dataverse_upload", "_Dataset_default_json_schema_filename": test_config[ "dataset_upload_schema_filename" ], "_Dataset_allowed_json_formats": [ "dataverse_upload", "dataverse_download", "dspace", "custom", ], "_Dataset_json_dataverse_upload_attr": json_dataverse_upload_attr(), "_internal_attributes": [], } def object_data_min(): """Get dictionary for Dataset with minimum attributes. Returns ------- pyDataverse.Dataset :class:`Dataset` with minimum attributes set. """ return { "title": "Darwin's Finches", "author": [{"authorName": "Finch, Fiona", "authorAffiliation": "Birds Inc."}], "datasetContact": [ { "datasetContactEmail": "finch@mailinator.com", "datasetContactName": "Finch, Fiona", } ], "dsDescription": [ { "dsDescriptionValue": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds." 
} ], "subject": ["Medicine, Health and Life Sciences"], "citation_displayName": "Citation Metadata", } def object_data_full(): """Get dictionary for Dataset with full attributes. Returns ------- pyDataverse.Dataset :class:`Dataset` with full attributes set. """ return { "license": "CC0", "termsOfUse": "CC0 Waiver", "termsOfAccess": "Terms of Access", "fileAccessRequest": True, "protocol": "doi", "authority": "10.11587", "identifier": "6AQBYW", "citation_displayName": "Citation Metadata", "title": "Replication Data for: Title", "subtitle": "Subtitle", "alternativeTitle": "Alternative Title", "alternativeURL": "http://AlternativeURL.org", "otherId": [ {"otherIdAgency": "OtherIDAgency1", "otherIdValue": "OtherIDIdentifier1"} ], "author": [ { "authorName": "LastAuthor1, FirstAuthor1", "authorAffiliation": "AuthorAffiliation1", "authorIdentifierScheme": "ORCID", "authorIdentifier": "AuthorIdentifier1", } ], "datasetContact": [ { "datasetContactName": "LastContact1, FirstContact1", "datasetContactAffiliation": "ContactAffiliation1", "datasetContactEmail": "ContactEmail1@mailinator.com", } ], "dsDescription": [ { "dsDescriptionValue": "DescriptionText2", "dsDescriptionDate": "1000-02-02", } ], "subject": [ "Agricultural Sciences", "Business and Management", "Engineering", "Law", ], "keyword": [ { "keywordValue": "KeywordTerm1", "keywordVocabulary": "KeywordVocabulary1", "keywordVocabularyURI": "http://KeywordVocabularyURL1.org", } ], "topicClassification": [ { "topicClassValue": "Topic Class Value1", "topicClassVocab": "Topic Classification Vocabulary", "topicClassVocabURI": "https://topic.class/vocab/uri", } ], "publication": [ { "publicationCitation": "RelatedPublicationCitation1", "publicationIDType": "ark", "publicationIDNumber": "RelatedPublicationIDNumber1", "publicationURL": "http://RelatedPublicationURL1.org", } ], "notesText": "Notes1", "producer": [ { "producerName": "LastProducer1, FirstProducer1", "producerAffiliation": "ProducerAffiliation1", "producerAbbreviation": "ProducerAbbreviation1", "producerURL": "http://ProducerURL1.org", "producerLogoURL": "http://ProducerLogoURL1.org", } ], "productionDate": "1003-01-01", "productionPlace": "ProductionPlace", "contributor": [ { "contributorType": "Data Collector", "contributorName": "LastContributor1, FirstContributor1", } ], "grantNumber": [ { "grantNumberAgency": "GrantInformationGrantAgency1", "grantNumberValue": "GrantInformationGrantNumber1", } ], "distributor": [ { "distributorName": "LastDistributor1, FirstDistributor1", "distributorAffiliation": "DistributorAffiliation1", "distributorAbbreviation": "DistributorAbbreviation1", "distributorURL": "http://DistributorURL1.org", "distributorLogoURL": "http://DistributorLogoURL1.org", } ], "distributionDate": "1004-01-01", "depositor": "LastDepositor, FirstDepositor", "dateOfDeposit": "1002-01-01", "timePeriodCovered": [ { "timePeriodCoveredStart": "1005-01-01", "timePeriodCoveredEnd": "1005-01-02", } ], "dateOfCollection": [ {"dateOfCollectionStart": "1006-01-01", "dateOfCollectionEnd": "1006-01-01"} ], "kindOfData": ["KindOfData1", "KindOfData2"], "language": ["German"], "series": { "seriesName": "SeriesName", "seriesInformation": "SeriesInformation", }, "software": [ {"softwareName": "SoftwareName1", "softwareVersion": "SoftwareVersion1"} ], "relatedMaterial": ["RelatedMaterial1", "RelatedMaterial2"], "relatedDatasets": ["RelatedDatasets1", "RelatedDatasets2"], "otherReferences": ["OtherReferences1", "OtherReferences2"], "dataSources": ["DataSources1", "DataSources2"], 
"originOfSources": "OriginOfSources", "characteristicOfSources": "CharacteristicOfSourcesNoted", "accessToSources": "DocumentationAndAccessToSources", "geospatial_displayName": "Geospatial Metadata", "geographicCoverage": [ { "country": "Afghanistan", "state": "GeographicCoverageStateProvince1", "city": "GeographicCoverageCity1", "otherGeographicCoverage": "GeographicCoverageOther1", } ], "geographicUnit": ["GeographicUnit1", "GeographicUnit2"], "geographicBoundingBox": [ { "westLongitude": "10", "eastLongitude": "20", "northLongitude": "30", "southLongitude": "40", } ], "socialscience_displayName": "Social Science and Humanities Metadata", "unitOfAnalysis": ["UnitOfAnalysis1", "UnitOfAnalysis2"], "universe": ["Universe1", "Universe2"], "timeMethod": "TimeMethod", "dataCollector": "LastDataCollector1, FirstDataCollector1", "collectorTraining": "CollectorTraining", "frequencyOfDataCollection": "Frequency", "samplingProcedure": "SamplingProcedure", "targetSampleSize": { "targetSampleActualSize": "100", "targetSampleSizeFormula": "TargetSampleSizeFormula", }, "deviationsFromSampleDesign": "MajorDeviationsForSampleDesign", "collectionMode": "CollectionMode", "researchInstrument": "TypeOfResearchInstrument", "dataCollectionSituation": "CharacteristicsOfDataCollectionSituation", "actionsToMinimizeLoss": "ActionsToMinimizeLosses", "controlOperations": "ControlOperations", "weighting": "Weighting", "cleaningOperations": "CleaningOperations", "datasetLevelErrorNotes": "StudyLevelErrorNotes", "responseRate": "ResponseRate", "samplingErrorEstimates": "EstimatesOfSamplingError", "otherDataAppraisal": "OtherFormsOfDataAppraisal", "socialScienceNotes": { "socialScienceNotesType": "NotesType", "socialScienceNotesSubject": "NotesSubject", "socialScienceNotesText": "NotesText", }, "journal_displayName": "Journal Metadata", "journalVolumeIssue": [ { "journalVolume": "JournalVolume1", "journalIssue": "JournalIssue1", "journalPubDate": "1008-01-01", } ], "journalArticleType": "abstract", } def dict_flat_get_min(): """Get flat dict for :func:`get` with minimum data of Dataset. Returns ------- dict Minimum Dataset dictionary returned by :func:`get`. """ return { "title": "Darwin's Finches", "author": [{"authorName": "Finch, Fiona", "authorAffiliation": "Birds Inc."}], "datasetContact": [ { "datasetContactEmail": "finch@mailinator.com", "datasetContactName": "Finch, Fiona", } ], "dsDescription": [ { "dsDescriptionValue": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds." }, ], "subject": ["Medicine, Health and Life Sciences"], "citation_displayName": "Citation Metadata", } def dict_flat_get_full(): """Get flat dict for :func:`get` of full data of Dataset. Returns ------- dict Full Datafile dictionary returned by :func:`get`. 
""" return { "license": "CC0", "termsOfUse": "CC0 Waiver", "termsOfAccess": "Terms of Access", "fileAccessRequest": True, "protocol": "doi", "authority": "10.11587", "identifier": "6AQBYW", "title": "Replication Data for: Title", "subtitle": "Subtitle", "alternativeTitle": "Alternative Title", "alternativeURL": "http://AlternativeURL.org", "otherId": [ {"otherIdAgency": "OtherIDAgency1", "otherIdValue": "OtherIDIdentifier1"} ], "author": [ { "authorName": "LastAuthor1, FirstAuthor1", "authorAffiliation": "AuthorAffiliation1", "authorIdentifierScheme": "ORCID", "authorIdentifier": "AuthorIdentifier1", } ], "datasetContact": [ { "datasetContactName": "LastContact1, FirstContact1", "datasetContactAffiliation": "ContactAffiliation1", "datasetContactEmail": "ContactEmail1@mailinator.com", } ], "dsDescription": [ { "dsDescriptionValue": "DescriptionText2", "dsDescriptionDate": "1000-02-02", } ], "subject": [ "Agricultural Sciences", "Business and Management", "Engineering", "Law", ], "keyword": [ { "keywordValue": "KeywordTerm1", "keywordVocabulary": "KeywordVocabulary1", "keywordVocabularyURI": "http://KeywordVocabularyURL1.org", } ], "topicClassification": [ { "topicClassValue": "Topic Class Value1", "topicClassVocab": "Topic Classification Vocabulary", "topicClassVocabURI": "https://topic.class/vocab/uri", } ], "publication": [ { "publicationCitation": "RelatedPublicationCitation1", "publicationIDType": "ark", "publicationIDNumber": "RelatedPublicationIDNumber1", "publicationURL": "http://RelatedPublicationURL1.org", } ], "notesText": "Notes1", "producer": [ { "producerName": "LastProducer1, FirstProducer1", "producerAffiliation": "ProducerAffiliation1", "producerAbbreviation": "ProducerAbbreviation1", "producerURL": "http://ProducerURL1.org", "producerLogoURL": "http://ProducerLogoURL1.org", } ], "productionDate": "1003-01-01", "productionPlace": "ProductionPlace", "contributor": [ { "contributorType": "Data Collector", "contributorName": "LastContributor1, FirstContributor1", } ], "grantNumber": [ { "grantNumberAgency": "GrantInformationGrantAgency1", "grantNumberValue": "GrantInformationGrantNumber1", } ], "distributor": [ { "distributorName": "LastDistributor1, FirstDistributor1", "distributorAffiliation": "DistributorAffiliation1", "distributorAbbreviation": "DistributorAbbreviation1", "distributorURL": "http://DistributorURL1.org", "distributorLogoURL": "http://DistributorLogoURL1.org", } ], "distributionDate": "1004-01-01", "depositor": "LastDepositor, FirstDepositor", "dateOfDeposit": "1002-01-01", "timePeriodCovered": [ { "timePeriodCoveredStart": "1005-01-01", "timePeriodCoveredEnd": "1005-01-02", } ], "dateOfCollection": [ {"dateOfCollectionStart": "1006-01-01", "dateOfCollectionEnd": "1006-01-01"} ], "kindOfData": ["KindOfData1", "KindOfData2"], "language": ["German"], "series": { "seriesName": "SeriesName", "seriesInformation": "SeriesInformation", }, "software": [ {"softwareName": "SoftwareName1", "softwareVersion": "SoftwareVersion1"} ], "relatedMaterial": ["RelatedMaterial1", "RelatedMaterial2"], "relatedDatasets": ["RelatedDatasets1", "RelatedDatasets2"], "otherReferences": ["OtherReferences1", "OtherReferences2"], "dataSources": ["DataSources1", "DataSources2"], "originOfSources": "OriginOfSources", "characteristicOfSources": "CharacteristicOfSourcesNoted", "accessToSources": "DocumentationAndAccessToSources", "geospatial_displayName": "Geospatial Metadata", "geographicCoverage": [ { "country": "Afghanistan", "state": "GeographicCoverageStateProvince1", "city": 
"GeographicCoverageCity1", "otherGeographicCoverage": "GeographicCoverageOther1", } ], "geographicUnit": ["GeographicUnit1", "GeographicUnit2"], "geographicBoundingBox": [ { "westLongitude": "10", "eastLongitude": "20", "northLongitude": "30", "southLongitude": "40", } ], "socialscience_displayName": "Social Science and Humanities Metadata", "unitOfAnalysis": ["UnitOfAnalysis1", "UnitOfAnalysis2"], "universe": ["Universe1", "Universe2"], "timeMethod": "TimeMethod", "dataCollector": "LastDataCollector1, FirstDataCollector1", "collectorTraining": "CollectorTraining", "frequencyOfDataCollection": "Frequency", "samplingProcedure": "SamplingProcedure", "targetSampleSize": { "targetSampleActualSize": "100", "targetSampleSizeFormula": "TargetSampleSizeFormula", }, "deviationsFromSampleDesign": "MajorDeviationsForSampleDesign", "collectionMode": "CollectionMode", "researchInstrument": "TypeOfResearchInstrument", "dataCollectionSituation": "CharacteristicsOfDataCollectionSituation", "actionsToMinimizeLoss": "ActionsToMinimizeLosses", "controlOperations": "ControlOperations", "weighting": "Weighting", "cleaningOperations": "CleaningOperations", "datasetLevelErrorNotes": "StudyLevelErrorNotes", "responseRate": "ResponseRate", "samplingErrorEstimates": "EstimatesOfSamplingError", "otherDataAppraisal": "OtherFormsOfDataAppraisal", "socialScienceNotes": { "socialScienceNotesType": "NotesType", "socialScienceNotesSubject": "NotesSubject", "socialScienceNotesText": "NotesText", }, "journal_displayName": "Journal Metadata", "journalVolumeIssue": [ { "journalVolume": "JournalVolume1", "journalIssue": "JournalIssue1", "journalPubDate": "1008-01-01", } ], "journalArticleType": "abstract", "citation_displayName": "Citation Metadata", } def json_upload_min(): """Get JSON string of minimum Dataset. Returns ------- str JSON string. """ return read_file(test_config["dataset_upload_min_filename"]) def json_upload_full(): """Get JSON string of full Dataset. Returns ------- str JSON string. """ return read_file(test_config["dataset_upload_full_filename"]) def json_dataverse_upload_attr(): """List of attributes import or export in format `dataverse_upload`. Returns ------- list List of attributes, which will be used for import and export. 
""" return [ "license", "termsOfUse", "termsOfAccess", "fileAccessRequest", "protocol", "authority", "identifier", "citation_displayName", "title", "subtitle", "alternativeTitle", "alternativeURL", "otherId", "author", "datasetContact", "dsDescription", "subject", "keyword", "topicClassification", "publication", "notesText", "producer", "productionDate", "productionPlace", "contributor", "grantNumber", "distributor", "distributionDate", "depositor", "dateOfDeposit", "timePeriodCovered", "dateOfCollection", "kindOfData", "language", "series", "software", "relatedMaterial", "relatedDatasets", "otherReferences", "dataSources", "originOfSources", "characteristicOfSources", "accessToSources", "geospatial_displayName", "geographicCoverage", "geographicUnit", "geographicBoundingBox", "socialscience_displayName", "unitOfAnalysis", "universe", "timeMethod", "dataCollector", "collectorTraining", "frequencyOfDataCollection", "samplingProcedure", "targetSampleSize", "deviationsFromSampleDesign", "collectionMode", "researchInstrument", "dataCollectionSituation", "actionsToMinimizeLoss", "controlOperations", "weighting", "cleaningOperations", "datasetLevelErrorNotes", "responseRate", "samplingErrorEstimates", "otherDataAppraisal", "socialScienceNotes", "journal_displayName", "journalVolumeIssue", "journalArticleType", ] def json_dataverse_upload_required_attr(): """List of attributes required for `dataverse_upload` JSON. Returns ------- list List of attributes, which will be used for import and export. """ return ["title", "author", "datasetContact", "dsDescription", "subject"] class TestDatasetGeneric(object): """Generic tests for Dataset().""" def test_dataset_set_and_get_valid(self): """Test Dataset.get() with valid data.""" data = [ ((dict_flat_set_min(), object_data_min()), dict_flat_get_min()), ((dict_flat_set_full(), object_data_full()), dict_flat_get_full()), (({}, {}), {}), ] pdv = data_object() pdv.set(dict_flat_set_min()) assert isinstance(pdv.get(), dict) for input, data_eval in data: pdv = data_object() pdv.set(input[0]) data = pdv.get() for key, val in data_eval.items(): assert data[key] == input[1][key] == data_eval[key] assert len(data) == len(input[1]) == len(data_eval) def test_dataset_set_invalid(self): """Test Dataset.set() with invalid data.""" # invalid data for data in test_config["invalid_set_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.set(data) def test_dataset_validate_json_valid(self): """Test Dataset.validate_json() with valid data.""" data = [ ((dict_flat_set_min(), {}), True), ((dict_flat_set_full(), {}), True), ((dict_flat_set_min(), {"data_format": "dataverse_upload"}), True), ( ( dict_flat_set_min(), { "data_format": "dataverse_upload", "filename_schema": test_config[ "dataset_upload_schema_filename" ], }, ), True, ), ( ( dict_flat_set_min(), {"filename_schema": test_config["dataset_upload_schema_filename"]}, ), True, ), ] for input, data_eval in data: pdv = data_object() pdv.set(input[0]) assert pdv.validate_json() == data_eval class TestDatasetSpecific(object): """Specific tests for Dataset().""" def test_dataset_from_json_valid(self): """Test Dataset.from_json() with valid data.""" data = [ (({json_upload_min()}, {}), object_data_min()), (({json_upload_full()}, {}), object_data_full()), ( ({json_upload_min()}, {"data_format": "dataverse_upload"}), object_data_min(), ), (({json_upload_min()}, {"validate": False}), object_data_min()), ( ( {json_upload_min()}, {"filename_schema": "", "validate": False}, ), object_data_min(), ), ( ( 
{json_upload_min()}, {"filename_schema": "wrong", "validate": False}, ), object_data_min(), ), ( ( {json_upload_min()}, { "filename_schema": test_config[ "dataset_upload_schema_filename" ], "validate": True, }, ), object_data_min(), ), ] for input, data_eval in data: pdv = data_object() args = input[0] kwargs = input[1] pdv.from_json(*args, **kwargs) for key, val in data_eval.items(): assert getattr(pdv, key) == data_eval[key] assert len(pdv.__dict__) - len(object_data_init()) == len(data_eval) def test_dataset_to_json_valid(self): """Test Dataset.json() with valid data.""" data = [ ((dict_flat_set_min(), {}), json.loads(json_upload_min())), ((dict_flat_set_full(), {}), json.loads(json_upload_full())), ( (dict_flat_set_min(), {"data_format": "dataverse_upload"}), json.loads(json_upload_min()), ), ( (dict_flat_set_min(), {"validate": False}), json.loads(json_upload_min()), ), ( ( dict_flat_set_min(), {"filename_schema": "", "validate": False}, ), json.loads(json_upload_min()), ), ( ( dict_flat_set_min(), {"filename_schema": "wrong", "validate": False}, ), json.loads(json_upload_min()), ), ( ( dict_flat_set_min(), { "filename_schema": test_config[ "dataset_upload_schema_filename" ], "validate": True, }, ), json.loads(json_upload_min()), ), ] pdv = data_object() pdv.set(dict_flat_set_min()) assert isinstance(pdv.json(), str) # TODO: recursevily test values of lists and dicts for input, data_eval in data: pdv = data_object() pdv.set(input[0]) kwargs = input[1] data = json.loads(pdv.json(**kwargs)) assert data assert isinstance(data, dict) assert len(data) == len(data_eval) assert len(data["datasetVersion"]["metadataBlocks"]["citation"]) == len( data_eval["datasetVersion"]["metadataBlocks"]["citation"] ) assert len( data["datasetVersion"]["metadataBlocks"]["citation"]["fields"] ) == len( data_eval["datasetVersion"]["metadataBlocks"]["citation"]["fields"] ) def test_dataset_init_valid(self): """Test Dataset.__init__() with valid data.""" # specific data = [ (Dataset(), {}), (Dataset(dict_flat_set_min()), object_data_min()), (Dataset(dict_flat_set_full()), object_data_full()), (Dataset({}), {}), ] for pdv, data_eval in data: for key, val in data_eval.items(): assert getattr(pdv, key) == data_eval[key] assert len(pdv.__dict__) - len(object_data_init()) == len(data_eval) def test_dataset_init_invalid(self): """Test Dataset.init() with invalid data.""" pdv = Dataset() # invalid data for data in test_config["invalid_set_types"]: with pytest.raises(AssertionError): pdv.set(data) def test_dataset_from_json_invalid(self): """Test Dataset.from_json() with invalid data.""" # invalid data for data in test_config["invalid_json_data_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.from_json(data, validate=False) if int(platform.python_version_tuple()[1]) >= 5: for json_string in test_config["invalid_json_strings"]: with pytest.raises(json.decoder.JSONDecodeError): pdv = data_object() pdv.from_json(json_string, validate=False) else: for json_string in test_config["invalid_json_strings"]: with pytest.raises(ValueError): pdv = data_object() pdv.from_json(json_string, validate=False) # invalid `filename_schema` for filename_schema in test_config["invalid_filename_strings"]: with pytest.raises(FileNotFoundError): pdv = data_object() pdv.from_json(json_upload_min(), filename_schema=filename_schema) for filename_schema in test_config["invalid_filename_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.from_json(json_upload_min(), filename_schema=filename_schema) # invalid 
`data_format` for data_format in ( test_config["invalid_data_format_types"] + test_config["invalid_data_format_strings"] ): with pytest.raises(AssertionError): pdv = data_object() pdv.from_json( json_upload_min(), data_format=data_format, validate=False ) # invalid `validate` for validate in test_config["invalid_validate_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.from_json(json_upload_min(), validate=validate) def test_dataset_to_json_invalid(self): """Test Dataset.json() with non-valid data.""" # invalid `filename_schema` for filename_schema in test_config["invalid_filename_strings"]: with pytest.raises(FileNotFoundError): obj = data_object() obj.json(filename_schema=filename_schema) for filename_schema in test_config["invalid_filename_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.json(filename_schema=filename_schema) # invalid `data_format` for data_format in ( test_config["invalid_data_format_types"] + test_config["invalid_data_format_strings"] ): with pytest.raises(AssertionError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.json(data_format=data_format, validate=False) # invalid `validate` for validate in test_config["invalid_validate_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.json(validate=validate) def test_dataset_validate_json_invalid(self): """Test Dataset.validate_json() with non-valid data.""" # invalid `filename_schema` for filename_schema in test_config["invalid_filename_strings"]: with pytest.raises(FileNotFoundError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.validate_json(filename_schema=filename_schema) for filename_schema in test_config["invalid_filename_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.validate_json(filename_schema=filename_schema) if not os.environ.get("TRAVIS"): class TestDatasetSpecificTravisNot(object): """Generic tests for Dataset(), not running on Travis (no file-write permissions).""" def test_dataset_to_json_from_json_valid(self): """Test Dataset to JSON from JSON with valid data.""" data = [ (dict_flat_set_min(), {}), (dict_flat_set_full(), {}), (dict_flat_set_min(), {"data_format": "dataverse_upload"}), (dict_flat_set_min(), {"validate": False}), ( dict_flat_set_min(), {"filename_schema": "wrong", "validate": False}, ), ( dict_flat_set_min(), { "filename_schema": test_config[ "dataset_upload_schema_filename" ], "validate": True, }, ), ] for data_set, kwargs_from in data: kwargs = {} pdv_start = data_object() pdv_start.set(data_set) if "validate" in kwargs_from: if not kwargs_from["validate"]: kwargs = {"validate": False} write_json( test_config["dataset_json_output_filename"], json.loads(pdv_start.json(**kwargs)), ) pdv_end = data_object() kwargs = kwargs_from pdv_end.from_json( read_file(test_config["dataset_json_output_filename"]), **kwargs ) for key, val in pdv_end.get().items(): assert getattr(pdv_start, key) == getattr(pdv_end, key) assert len(pdv_start.__dict__) == len( pdv_end.__dict__, ) pyDataverse-0.3.4/tests/models/test_dataverse.py000066400000000000000000000467671467256651400220650ustar00rootroot00000000000000"""Dataverse data model tests.""" import json import jsonschema import os import platform import pytest from pyDataverse.models import Dataverse from ..conftest import test_config def read_file(filename, mode="r"): """Read in a file. Parameters ---------- filename : str Filename with full path. mode : str Read mode of file. Defaults to `r`. 
See more at https://docs.python.org/3.5/library/functions.html#open Returns ------- str Returns data as string. """ with open(filename, mode) as f: return f.read() def write_json(filename, data, mode="w", encoding="utf-8"): """Write data to a json file. Parameters ---------- filename : str Filename with full path. data : dict Data to be written in the json file. mode : str Write mode of file. Defaults to `w`. See more at https://docs.python.org/3/library/functions.html#open """ with open(filename, mode, encoding=encoding) as f: json.dump(data, f, indent=2) def data_object(): """Get Dataverse object. Returns ------- pydataverse.models.Dataverse :class:`Dataverse` object. """ return Dataverse() def dict_flat_set_min(): """Get flat dict for set() of minimum Dataverse. Returns ------- dict Flat dict with minimum Dataverse data. """ return { "alias": "test-pyDataverse", "name": "Test pyDataverse", "dataverseContacts": [{"contactEmail": "info@aussda.at"}], } def dict_flat_set_full(): """Get flat dict for set() of full Dataverse. Returns ------- dict Flat dict with full Dataverse data. """ return { "name": "Scientific Research", "alias": "science", "dataverseContacts": [ {"contactEmail": "pi@example.edu"}, {"contactEmail": "student@example.edu"}, ], "affiliation": "Scientific Research University", "description": "We do all the science.", "dataverseType": "LABORATORY", } def object_data_init(): """Get dictionary for Dataverse with initial attributes. Returns ------- dict Dictionary of init data attributes set. """ return { "_Dataverse_default_json_format": "dataverse_upload", "_Dataverse_default_json_schema_filename": test_config[ "dataverse_upload_schema_filename" ], "_Dataverse_allowed_json_formats": ["dataverse_upload", "dataverse_download"], "_Dataverse_json_dataverse_upload_attr": [ "affiliation", "alias", "dataverseContacts", "dataverseType", "description", "name", ], "_internal_attributes": [], } def object_data_min(): """Get dictionary for Dataverse with minimum attributes. Returns ------- pyDataverse.Dataverse :class:`Dataverse` with minimum attributes set. """ return { "alias": "test-pyDataverse", "name": "Test pyDataverse", "dataverseContacts": [{"contactEmail": "info@aussda.at"}], } def object_data_full(): """Get dictionary for Dataverse with full attributes. Returns ------- pyDataverse.Dataverse :class:`Dataverse` with full attributes set. """ return { "alias": "science", "name": "Scientific Research", "dataverseContacts": [ {"contactEmail": "pi@example.edu"}, {"contactEmail": "student@example.edu"}, ], "affiliation": "Scientific Research University", "description": "We do all the science.", "dataverseType": "LABORATORY", } def dict_flat_get_min(): """Get flat dict for :func:`get` with minimum data of Dataverse. Returns ------- dict Minimum Dataverse dictionary returned by :func:`get`. """ return { "alias": "test-pyDataverse", "name": "Test pyDataverse", "dataverseContacts": [{"contactEmail": "info@aussda.at"}], } def dict_flat_get_full(): """Get flat dict for :func:`get` of full data of Dataverse. Returns ------- dict Full Datafile dictionary returned by :func:`get`. """ return { "name": "Scientific Research", "alias": "science", "dataverseContacts": [ {"contactEmail": "pi@example.edu"}, {"contactEmail": "student@example.edu"}, ], "affiliation": "Scientific Research University", "description": "We do all the science.", "dataverseType": "LABORATORY", } def json_upload_min(): """Get JSON string of minimum Dataverse. Returns ------- str JSON string. 
""" return read_file(test_config["dataverse_upload_min_filename"]) def json_upload_full(): """Get JSON string of full Dataverse. Returns ------- str JSON string. """ return read_file(test_config["dataverse_upload_full_filename"]) def json_dataverse_upload_attr(): """List of attributes import or export in format `dataverse_upload`. Returns ------- list List of attributes, which will be used for import and export. """ return [ "affiliation", "alias", "dataverseContacts", "dataverseType", "description", "name", ] def json_dataverse_upload_required_attr(): """List of attributes required for `dataverse_upload` JSON. Returns ------- list List of attributes, which will be used for import and export. """ return ["alias", "dataverseContacts", "name"] class TestDataverseGeneric(object): """Generic tests for Dataverse().""" def test_dataverse_set_and_get_valid(self): """Test Dataverse.get() with valid data.""" data = [ ((dict_flat_set_min(), object_data_min()), dict_flat_get_min()), ((dict_flat_set_full(), object_data_full()), dict_flat_get_full()), (({}, {}), {}), ] pdv = data_object() pdv.set(dict_flat_set_min()) assert isinstance(pdv.get(), dict) for input, data_eval in data: pdv = data_object() pdv.set(input[0]) data = pdv.get() for key, val in data_eval.items(): assert data[key] == input[1][key] == data_eval[key] assert len(data) == len(input[1]) == len(data_eval) def test_dataverse_set_invalid(self): """Test Dataverse.set() with invalid data.""" # invalid data for data in test_config["invalid_set_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.set(data) def test_dataverse_from_json_valid(self): """Test Dataverse.from_json() with valid data.""" data = [ (({json_upload_min()}, {}), object_data_min()), (({json_upload_full()}, {}), object_data_full()), ( ({json_upload_min()}, {"data_format": "dataverse_upload"}), object_data_min(), ), (({json_upload_min()}, {"validate": False}), object_data_min()), ( ( {json_upload_min()}, {"filename_schema": "", "validate": False}, ), object_data_min(), ), ( ( {json_upload_min()}, {"filename_schema": "wrong", "validate": False}, ), object_data_min(), ), ( ( {json_upload_min()}, { "filename_schema": test_config[ "dataverse_upload_schema_filename" ], "validate": True, }, ), object_data_min(), ), (({"{}"}, {"validate": False}), {}), ] for input, data_eval in data: pdv = data_object() args = input[0] kwargs = input[1] pdv.from_json(*args, **kwargs) for key, val in data_eval.items(): assert getattr(pdv, key) == data_eval[key] assert len(pdv.__dict__) - len(object_data_init()) == len(data_eval) def test_dataverse_from_json_invalid(self): """Test Dataverse.from_json() with invalid data.""" # invalid data for data in test_config["invalid_json_data_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.from_json(data, validate=False) if int(platform.python_version_tuple()[1]) >= 5: for json_string in test_config["invalid_json_strings"]: with pytest.raises(json.decoder.JSONDecodeError): pdv = data_object() pdv.from_json(json_string, validate=False) else: for json_string in test_config["invalid_json_strings"]: with pytest.raises(ValueError): pdv = data_object() pdv.from_json(json_string, validate=False) # invalid `filename_schema` for filename_schema in test_config["invalid_filename_strings"]: with pytest.raises(FileNotFoundError): pdv = data_object() pdv.from_json(json_upload_min(), filename_schema=filename_schema) for filename_schema in test_config["invalid_filename_types"]: with pytest.raises(AssertionError): pdv = data_object() 
                pdv.from_json(json_upload_min(), filename_schema=filename_schema)

        # invalid `data_format`
        for data_format in (
            test_config["invalid_data_format_types"]
            + test_config["invalid_data_format_strings"]
        ):
            with pytest.raises(AssertionError):
                pdv = data_object()
                pdv.from_json(
                    json_upload_min(), data_format=data_format, validate=False
                )

        # invalid `validate`
        for validate in test_config["invalid_validate_types"]:
            with pytest.raises(AssertionError):
                pdv = data_object()
                pdv.from_json(json_upload_min(), validate=validate)

        with pytest.raises(jsonschema.exceptions.ValidationError):
            pdv = data_object()
            pdv.from_json("{}")

        for attr in json_dataverse_upload_required_attr():
            with pytest.raises(jsonschema.exceptions.ValidationError):
                pdv = data_object()
                data = json.loads(json_upload_min())
                del data[attr]
                data = json.dumps(data)
                pdv.from_json(data, validate=True)

    def test_dataverse_to_json_valid(self):
        """Test Dataverse.json() with valid data."""
        data = [
            ((dict_flat_set_min(), {}), json.loads(json_upload_min())),
            ((dict_flat_set_full(), {}), json.loads(json_upload_full())),
            (
                (dict_flat_set_min(), {"data_format": "dataverse_upload"}),
                json.loads(json_upload_min()),
            ),
            (
                (dict_flat_set_min(), {"validate": False}),
                json.loads(json_upload_min()),
            ),
            (
                (
                    dict_flat_set_min(),
                    {"filename_schema": "", "validate": False},
                ),
                json.loads(json_upload_min()),
            ),
            (
                (
                    dict_flat_set_min(),
                    {"filename_schema": "wrong", "validate": False},
                ),
                json.loads(json_upload_min()),
            ),
            (
                (
                    dict_flat_set_min(),
                    {
                        "filename_schema": test_config[
                            "dataverse_upload_schema_filename"
                        ],
                        "validate": True,
                    },
                ),
                json.loads(json_upload_min()),
            ),
            (({}, {"validate": False}), {}),
        ]

        pdv = data_object()
        pdv.set(dict_flat_set_min())
        assert isinstance(pdv.json(), str)

        for input, data_eval in data:
            pdv = data_object()
            pdv.set(input[0])
            kwargs = input[1]
            data = json.loads(pdv.json(**kwargs))

            for key, val in data_eval.items():
                assert data[key] == data_eval[key]
            assert len(data) == len(data_eval)

    def test_dataverse_to_json_invalid(self):
        """Test Dataverse.json() with invalid data."""
        # invalid `filename_schema`
        for filename_schema in test_config["invalid_filename_strings"]:
            with pytest.raises(FileNotFoundError):
                pdv = data_object()
                pdv.json(filename_schema=filename_schema)

        for filename_schema in test_config["invalid_filename_types"]:
            with pytest.raises(AssertionError):
                pdv = data_object()
                pdv.json(filename_schema=filename_schema)

        # invalid `data_format`
        for data_format in (
            test_config["invalid_data_format_types"]
            + test_config["invalid_data_format_strings"]
        ):
            with pytest.raises(AssertionError):
                pdv = data_object()
                pdv.set(dict_flat_set_min())
                pdv.json(data_format=data_format, validate=False)

        # invalid `validate`
        for validate in test_config["invalid_validate_types"]:
            with pytest.raises(AssertionError):
                pdv = data_object()
                pdv.set(dict_flat_set_min())
                pdv.json(validate=validate)

        with pytest.raises(jsonschema.exceptions.ValidationError):
            pdv = data_object()
            pdv.set({})
            pdv.json()

        for attr in json_dataverse_upload_required_attr():
            with pytest.raises(jsonschema.exceptions.ValidationError):
                pdv = data_object()
                data = json.loads(json_upload_min())
                del data[attr]
                pdv.set(data)
                pdv.json(validate=True)

    def test_dataverse_validate_json_valid(self):
        """Test Dataverse.validate_json() with valid data."""
        data = [
            ((dict_flat_set_min(), {}), True),
            ((dict_flat_set_full(), {}), True),
            ((dict_flat_set_min(), {"data_format": "dataverse_upload"}), True),
            (
                (
                    dict_flat_set_min(),
                    {
                        "data_format": "dataverse_upload",
                        "filename_schema": test_config[
"dataverse_upload_schema_filename" ], }, ), True, ), ( ( dict_flat_set_min(), { "filename_schema": test_config[ "dataverse_upload_schema_filename" ] }, ), True, ), ] for input, data_eval in data: pdv = data_object() pdv.set(input[0]) assert pdv.validate_json() == data_eval def test_dataverse_validate_json_invalid(self): """Test Dataverse.validate_json() with non-valid data.""" # invalid data for attr in json_dataverse_upload_required_attr(): with pytest.raises(jsonschema.exceptions.ValidationError): for data in [dict_flat_set_min(), dict_flat_set_full()]: pdv = data_object() pdv.set(data) delattr(pdv, attr) pdv.validate_json() # invalid `filename_schema` for filename_schema in test_config["invalid_filename_strings"]: with pytest.raises(FileNotFoundError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.validate_json(filename_schema=filename_schema) for filename_schema in test_config["invalid_filename_types"]: with pytest.raises(AssertionError): pdv = data_object() pdv.set(dict_flat_set_min()) pdv.validate_json(filename_schema=filename_schema) class TestDataverseSpecific(object): """Specific tests for Dataverse().""" def test_dataverse_init_valid(self): """Test Dataverse.__init__() with valid data.""" # specific data = [ (Dataverse(), {}), (Dataverse(dict_flat_set_min()), object_data_min()), (Dataverse(dict_flat_set_full()), object_data_full()), (Dataverse({}), {}), ] for pdv, data_eval in data: for key, val in data_eval.items(): print(getattr(pdv, key)) print(data_eval[key]) assert getattr(pdv, key) == data_eval[key] assert len(pdv.__dict__) - len(object_data_init()) == len(data_eval) def test_dataverse_init_invalid(self): """Test Dataverse.init() with invalid data.""" pdv = Dataverse() # invalid data for data in test_config["invalid_set_types"]: with pytest.raises(AssertionError): pdv.set(data) if not os.environ.get("TRAVIS"): class TestDataverseGenericTravisNot(object): """Generic tests for Dataverse(), not running on Travis (no file-write permissions).""" def test_dataverse_from_json_to_json_valid(self): """Test Dataverse to JSON from JSON with valid data.""" data = [ ({json_upload_min()}, {}), ({json_upload_full()}, {}), ({json_upload_min()}, {"data_format": "dataverse_upload"}), ({json_upload_min()}, {"validate": False}), ( {json_upload_min()}, {"filename_schema": "", "validate": False}, ), ( {json_upload_min()}, {"filename_schema": "wrong", "validate": False}, ), ( {json_upload_min()}, { "filename_schema": test_config[ "dataverse_upload_schema_filename" ], "validate": True, }, ), ({"{}"}, {"validate": False}), ] for args_from, kwargs_from in data: pdv_start = data_object() args = args_from kwargs = kwargs_from pdv_start.from_json(*args, **kwargs) if "validate" in kwargs: if not kwargs["validate"]: kwargs = {"validate": False} data_out = json.loads(pdv_start.json(**kwargs)) write_json(test_config["dataverse_json_output_filename"], data_out) data_in = read_file(test_config["dataverse_json_output_filename"]) pdv_end = data_object() kwargs = kwargs_from pdv_end.from_json(data_in, **kwargs) for key, val in pdv_end.get().items(): assert getattr(pdv_start, key) == getattr(pdv_end, key) assert len(pdv_start.__dict__) == len( pdv_end.__dict__, ) pyDataverse-0.3.4/tests/models/test_dvobject.py000066400000000000000000000007441467256651400216700ustar00rootroot00000000000000"""Dataverse data model tests.""" from pyDataverse.models import DVObject class TestDVObject(object): """Tests for :class:DVObject().""" def test_dataverse_init(self): """Test Dataverse.__init__().""" obj = DVObject() 
        assert not hasattr(obj, "default_json_format")
        assert not hasattr(obj, "allowed_json_formats")
        assert not hasattr(obj, "default_json_schema_filename")
        assert not hasattr(obj, "json_dataverse_upload_attr")

pyDataverse-0.3.4/tests/utils/
pyDataverse-0.3.4/tests/utils/__init__.py
pyDataverse-0.3.4/tests/utils/test_utils.py
from pyDataverse.utils import read_json, dataverse_tree_walker

from ..conftest import test_config


class TestUtilsSaveTreeData:
    def test_dataverse_tree_walker_valid_default(self):
        dv_ids = [1, 2, 3]
        dv_aliases = ["parent_dv_1", "parent_dv_1_sub_dv_1", "parent_dv_2"]
        ds_ids = ["1AB23C", "4DE56F", "7GH89I", "0JK1LM", "2NO34P"]
        ds_pids = [
            "doi:12.34567/1AB23C",
            "doi:12.34567/4DE56F",
            "doi:12.34567/7GH89I",
            "doi:12.34567/0JK1LM",
            "doi:12.34567/2NO34P",
        ]
        df_ids = [1, 2, 3, 4, 5, 6, 7]
        df_filenames = [
            "appendix.pdf",
            "survey.zsav",
            "manual.pdf",
            "study.zsav",
            "documentation.pdf",
            "data.R",
            "summary.md",
        ]
        df_labels = [
            "appendix.pdf",
            "survey.zsav",
            "manual.pdf",
            "study.zsav",
            "documentation.pdf",
            "data.R",
            "summary.md",
        ]
        df_pids = [
            "doi:12.34567/1AB23C/ABC123",
            "doi:12.34567/1AB23C/DEF456",
            "doi:12.34567/4DE56F/GHI789",
            "doi:12.34567/7GH89I/JKL012",
            "doi:12.34567/0JK1LM/MNO345",
            "doi:12.34567/0JK1LM/PQR678",
            "doi:12.34567/2NO34P/STU901",
        ]

        data = read_json(test_config["tree_filename"])
        dataverses, datasets, datafiles = dataverse_tree_walker(data)

        assert isinstance(dataverses, list)
        assert isinstance(datasets, list)
        assert isinstance(datafiles, list)
        assert len(dataverses) == 3
        assert len(datasets) == 5
        assert len(datafiles) == 7

        for dv in dataverses:
            assert "dataverse_alias" in dv
            assert "dataverse_id" in dv
            assert dv["dataverse_alias"] in dv_aliases
            dv_aliases.pop(dv_aliases.index(dv["dataverse_alias"]))
            assert dv["dataverse_id"] in dv_ids
            dv_ids.pop(dv_ids.index(dv["dataverse_id"]))
        assert len(dv_aliases) == 0
        assert len(dv_ids) == 0

        for ds in datasets:
            assert "dataset_id" in ds
            assert "pid" in ds
            assert ds["dataset_id"] in ds_ids
            ds_ids.pop(ds_ids.index(ds["dataset_id"]))
            assert ds["pid"] in ds_pids
            ds_pids.pop(ds_pids.index(ds["pid"]))
        assert len(ds_ids) == 0
        assert len(ds_pids) == 0

        for df in datafiles:
            assert "datafile_id" in df
            assert "filename" in df
            assert "label" in df
            assert "pid" in df
            assert df["datafile_id"] in df_ids
            df_ids.pop(df_ids.index(df["datafile_id"]))
            assert df["filename"] in df_filenames
            df_filenames.pop(df_filenames.index(df["filename"]))
            assert df["label"] in df_labels
            df_labels.pop(df_labels.index(df["label"]))
            assert df["pid"] in df_pids
            df_pids.pop(df_pids.index(df["pid"]))
        assert len(df_ids) == 0
        assert len(df_filenames) == 0
        assert len(df_pids) == 0

pyDataverse-0.3.4/tox.ini
[tox]
requires =
    tox>=4
envlist = py3,py3{8,9,10,11,12},coverage,coveralls,lint
skip_missing_interpreters = True

[testenv]
description = default settings for unspecified tests
skip_install = True
allowlist_externals = poetry
passenv = *
commands_pre =
    poetry lock --no-update
    poetry install --with=tests
commands =
    pytest -v tests --cov=pyDataverse --basetemp={envtmpdir}

[testenv:py3]

[testenv:py38]
basepython = python3.8

[testenv:py39]
basepython = python3.9

[testenv:py310]
basepython = python3.10

[testenv:py311]
basepython = python3.11
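
# Usage note (assumed local workflow, not defined by this file): a single
# interpreter environment can be run with "tox -e py311"; several at once
# with a comma-separated list such as "tox -e py310,py311".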

[testenv:py312]
basepython = python3.12

[testenv:py313]
basepython = python3.13

[testenv:coverage]
description = create coverage reports
commands =
    pytest tests --cov=pyDataverse --cov-report=term-missing --cov-report=xml --cov-report=html

[testenv:coveralls]
description = create reports for coveralls
commands =
    pytest tests --doctest-modules -v --cov=pyDataverse

[testenv:lint]
commands_pre =
    poetry lock --no-update
    poetry install --with=lint
commands =
    ruff check pyDataverse tests

[testenv:mypy]
commands_pre =
    poetry lock --no-update
    poetry install --with=lint
commands =
    mypy pyDataverse tests

[testenv:docs]
description = invoke sphinx-build to build the HTML docs
commands_pre =
    poetry lock --no-update
    poetry install --with=docs
commands =
    sphinx-build -d pyDataverse/docs/build/docs_doctree pyDataverse/docs/source docs/build/html --color -b html {posargs}

[testenv:pydocstyle]
description = pydocstyle checks for docstring conventions
commands_pre =
    poetry lock --no-update
    poetry install --with=docs
commands =
    pydocstyle pyDataverse/
    pydocstyle tests/

[testenv:radon-mc]
description = Radon McCabe complexity
commands_pre =
    poetry lock --no-update
    poetry install --with=lint
commands =
    radon cc pyDataverse/ -a

[testenv:radon-mi]
description = Radon Maintainability Index
commands_pre =
    poetry lock --no-update
    poetry install --with=lint
commands =
    radon mi pyDataverse/
    radon mi tests/

[testenv:radon-raw]
description = Radon raw metrics
commands_pre =
    poetry lock --no-update
    poetry install --with=lint
commands =
    radon raw pyDataverse/
    radon raw tests/

[testenv:radon-hal]
description = Radon Halstead metrics
commands_pre =
    poetry lock --no-update
    poetry install --with=lint
commands =
    radon hal pyDataverse/
    radon hal tests/
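
# Workflow sketch (assumed developer convenience, not mandated by this file):
# the Radon metric environments can be chained in one invocation, e.g.
# "tox -e radon-mc,radon-mi,radon-raw,radon-hal", and "tox -e lint,mypy"
# runs the static checks back to back.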