pax_global_header00006660000000000000000000000064141371003620014507gustar00rootroot0000000000000052 comment=d87e341344d725fc8fc8cf02c865c95fcaf36234 python-rdata-0.5/000077500000000000000000000000001413710036200137655ustar00rootroot00000000000000python-rdata-0.5/.github/000077500000000000000000000000001413710036200153255ustar00rootroot00000000000000python-rdata-0.5/.github/workflows/000077500000000000000000000000001413710036200173625ustar00rootroot00000000000000python-rdata-0.5/.github/workflows/main.yml000066400000000000000000000015361413710036200210360ustar00rootroot00000000000000name: Tests on: push: pull_request: jobs: build: runs-on: ${{ matrix.os }} name: Python ${{ matrix.python-version }} on ${{ matrix.os }} strategy: matrix: os: [ubuntu-latest, macos-latest, windows-latest] python-version: ['3.7', '3.8', '3.9'] steps: - uses: actions/checkout@v2 - name: Set up Python ${{ matrix.python-version }} on ${{ matrix.os }} uses: actions/setup-python@v2 with: python-version: ${{ matrix.python-version }} - name: Install dependencies run: | pip3 install codecov pytest-cov || pip3 install --user codecov pytest-cov; - name: Run tests run: | pip3 install . coverage run --source=rdata/ --omit=rdata/tests/ setup.py test; - name: Upload coverage to Codecov uses: codecov/codecov-action@v1 python-rdata-0.5/.gitignore000066400000000000000000000022631413710036200157600ustar00rootroot00000000000000# Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] *$py.class # C extensions *.so # Distribution / packaging .Python build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ wheels/ *.egg-info/ .installed.cfg *.egg MANIFEST # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into it. 
*.manifest *.spec # Installer logs pip-log.txt pip-delete-this-directory.txt # Unit test / coverage reports htmlcov/ .tox/ .coverage .coverage.* .cache nosetests.xml coverage.xml *.cover .hypothesis/ .pytest_cache/ # Translations *.mo *.pot # Django stuff: *.log local_settings.py db.sqlite3 # Flask stuff: instance/ .webassets-cache # Scrapy stuff: .scrapy # Sphinx documentation docs/_build/ # PyBuilder target/ # Jupyter Notebook .ipynb_checkpoints # pyenv .python-version # celery beat schedule file celerybeat-schedule # SageMath parsed files *.sage.py # Environments .env .venv env/ venv/ ENV/ env.bak/ venv.bak/ # Spyder project settings .spyderproject .spyproject # Rope project settings .ropeproject # mkdocs documentation /site # mypy .mypy_cache/ python-rdata-0.5/LICENSE000066400000000000000000000020661413710036200147760ustar00rootroot00000000000000MIT License Copyright (c) 2018 Carlos Ramos Carreño Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
python-rdata-0.5/MANIFEST.in000066400000000000000000000001301413710036200155150ustar00rootroot00000000000000include MANIFEST.in include VERSION include LICENSE include rdata/py.typed include *.txtpython-rdata-0.5/README.rst000066400000000000000000000104121413710036200154520ustar00rootroot00000000000000rdata ===== |build-status| |docs| |coverage| |landscape| |pypi| Read R datasets from Python. .. Github does not support include in README for dubious security reasons, so we copy-paste instead. Also Github does not understand Sphinx directives. .. include:: docs/simpleusage.rst Installation ============ rdata is on PyPI and can be installed using :code:`pip`: .. code:: pip install rdata It is also available for :code:`conda` using the :code:`conda-forge` channel: .. code:: conda install -c conda-forge rdata Documentation ============= The documentation of rdata is in `ReadTheDocs `_. Simple usage ============ Read an R dataset ----------------- The common way of reading an R dataset is the following one: >>> import rdata >>> parsed = rdata.parser.parse_file(rdata.TESTDATA_PATH / "test_vector.rda") >>> converted = rdata.conversion.convert(parsed) >>> converted {'test_vector': array([1., 2., 3.])} This consists of two steps: #. First, the file is parsed using the function `parse_file`. This provides a literal description of the file contents as a hierarchy of Python objects representing the basic R objects. This step is unambiguous and always the same. #. Then, each object must be converted to an appropriate Python object. In this step there are several choices as to which Python type is the most appropriate conversion for a given R object. Thus, we provide a default `convert` routine, which tries to select Python objects that preserve most of the information in the original R object. For custom R classes, it is also possible to specify conversion routines to Python objects. 
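The split between the two steps can be pictured with a toy model. The sketch below does not use rdata's API: ``ToyRObject``, ``toy_parse`` and ``toy_convert`` are hypothetical names invented for illustration, and the "file format" is a plain string. It only shows why parsing has a single sensible result while conversion involves a choice of Python type.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyRObject:
    """Hypothetical stand-in for a parsed R value (not rdata's RObject)."""

    r_type: str
    value: List[Any]
    attributes: Dict[str, Any] = field(default_factory=dict)


def toy_parse(text: str) -> ToyRObject:
    """Step 1: turn a string such as 'REAL 1 2 3' into a literal description.

    A given input always produces the same description.
    """
    r_type, *items = text.split()
    return ToyRObject(r_type=r_type, value=[float(x) for x in items])


def toy_convert(obj: ToyRObject, as_tuple: bool = False) -> Any:
    """Step 2: pick a Python type for the parsed value.

    This is where a policy decision enters: list or tuple here,
    NumPy array or Pandas object in a real converter.
    """
    return tuple(obj.value) if as_tuple else list(obj.value)


parsed = toy_parse("REAL 1 2 3")
default_result = toy_convert(parsed)
alternative_result = toy_convert(parsed, as_tuple=True)
```

The same ``parsed`` object feeds both conversions, so a different conversion policy never requires reading the input again.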
Convert custom R classes ------------------------ The basic `convert` routine only constructs a `SimpleConverter` object and calls its `convert` method. All arguments of `convert` are directly passed to the `SimpleConverter` initialization method. It is possible, although not trivial, to make a custom `Converter` object to change the way in which the basic R objects are transformed to Python objects. However, a more common situation is that one does not want to change how basic R objects are converted, but instead wants to provide conversions for specific R classes. This can be done by passing a dictionary to the `SimpleConverter` initialization method, containing as keys the names of R classes and as values callables that convert an R object of that class to a Python object. By default, the dictionary used is `DEFAULT_CLASS_MAP`, which can convert commonly used R classes such as `data.frame` and `factor`. As an example, here is how we would implement a conversion routine for the factor class to `bytes` objects, instead of the default conversion to Pandas `Categorical` objects: >>> import rdata >>> def factor_constructor(obj, attrs): ... values = [bytes(attrs['levels'][i - 1], 'utf8') ... if i >= 0 else None for i in obj] ... ... return values >>> new_dict = { ... **rdata.conversion.DEFAULT_CLASS_MAP, ... "factor": factor_constructor ... } >>> parsed = rdata.parser.parse_file(rdata.TESTDATA_PATH ... / "test_dataframe.rda") >>> converted = rdata.conversion.convert(parsed, new_dict) >>> converted {'test_dataframe': class value 0 b'a' 1 1 b'b' 2 2 b'b' 3} .. |build-status| image:: https://github.com/vnmabus/rdata/actions/workflows/main.yml/badge.svg?branch=master :alt: build status :scale: 100% :target: https://github.com/vnmabus/rdata/actions/workflows/main.yml .. |docs| image:: https://readthedocs.org/projects/rdata/badge/?version=latest :alt: Documentation Status :scale: 100% :target: https://rdata.readthedocs.io/en/latest/?badge=latest .. 
|coverage| image:: http://codecov.io/github/vnmabus/rdata/coverage.svg?branch=develop :alt: Coverage Status :scale: 100% :target: https://codecov.io/gh/vnmabus/rdata/branch/develop .. |landscape| image:: https://landscape.io/github/vnmabus/rdata/develop/landscape.svg?style=flat :target: https://landscape.io/github/vnmabus/rdata/develop :alt: Code Health .. |pypi| image:: https://badge.fury.io/py/rdata.svg :alt: Pypi version :scale: 100% :target: https://pypi.python.org/pypi/rdata/python-rdata-0.5/VERSION000066400000000000000000000000031413710036200150260ustar00rootroot000000000000000.5python-rdata-0.5/conftest.py000066400000000000000000000000361413710036200161630ustar00rootroot00000000000000collect_ignore = ['setup.py'] python-rdata-0.5/docs/000077500000000000000000000000001413710036200147155ustar00rootroot00000000000000python-rdata-0.5/docs/.gitignore000066400000000000000000000000261413710036200167030ustar00rootroot00000000000000/functions/ /modules/ python-rdata-0.5/docs/Makefile000066400000000000000000000011321413710036200163520ustar00rootroot00000000000000# Minimal makefile for Sphinx documentation # # You can set these variables from the command line. SPHINXOPTS = SPHINXBUILD = sphinx-build SPHINXPROJ = rdata SOURCEDIR = . BUILDDIR = _build # Put it first so that "make" without argument is like "make help". help: @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) .PHONY: help Makefile # Catch-all target: route all unknown targets to Sphinx using the new # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). 
%: Makefile @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)python-rdata-0.5/docs/_templates/000077500000000000000000000000001413710036200170525ustar00rootroot00000000000000python-rdata-0.5/docs/_templates/autosummary/000077500000000000000000000000001413710036200214405ustar00rootroot00000000000000python-rdata-0.5/docs/_templates/autosummary/base.rst000066400000000000000000000001501413710036200231000ustar00rootroot00000000000000{{ objname | escape | underline}} .. currentmodule:: {{ module }} .. auto{{ objtype }}:: {{ objname }}python-rdata-0.5/docs/_templates/autosummary/class.rst000066400000000000000000000010241413710036200232740ustar00rootroot00000000000000{{ objname | escape | underline}} .. currentmodule:: {{ module }} .. autoclass:: {{ objname }} {% block methods %} {% if methods %} .. rubric:: Methods .. autosummary:: {% for item in methods %} ~{{ name }}.{{ item }} {%- endfor %} {% endif %} .. automethod:: __init__ {% endblock %} {% block attributes %} {% if attributes %} .. rubric:: Attributes .. autosummary:: {% for item in attributes %} ~{{ name }}.{{ item }} {%- endfor %} {% endif %} {% endblock %}python-rdata-0.5/docs/_templates/autosummary/module.rst000066400000000000000000000021551413710036200234620ustar00rootroot00000000000000{{ objname | escape | underline}} .. automodule:: {{ fullname }} {% block attributes %} {% if attributes %} .. rubric:: {{ _('Module Attributes') }} .. autosummary:: :toctree: {% for item in attributes %} {{ item }} {%- endfor %} {% endif %} {% endblock %} {% block functions %} {% if functions %} .. rubric:: {{ _('Functions') }} .. autosummary:: :toctree: {% for item in functions %} {{ item }} {%- endfor %} {% endif %} {% endblock %} {% block classes %} {% if classes %} .. rubric:: {{ _('Classes') }} .. autosummary:: :toctree: {% for item in classes %} {{ item }} {%- endfor %} {% endif %} {% endblock %} {% block exceptions %} {% if exceptions %} .. rubric:: {{ _('Exceptions') }} .. 
autosummary:: :toctree: {% for item in exceptions %} {{ item }} {%- endfor %} {% endif %} {% endblock %} {% block modules %} {% if modules %} .. rubric:: Modules .. autosummary:: :toctree: :recursive: {% for item in modules %} {{ item }} {%- endfor %} {% endif %} {% endblock %}python-rdata-0.5/docs/apilist.rst000066400000000000000000000020461413710036200171160ustar00rootroot00000000000000API List ======== List of functions and structures -------------------------------- A complete list of all functions and structures provided by rdata. Parse :code:`.rda` format ^^^^^^^^^^^^^^^^^^^^^^^^^ Functions for parsing data in the :code:`.rda` format. These functions return a structure representing the contents of the file, without transforming it to more appropriate Python objects. Thus, if a different way of converting R objects to Python objects is needed, it can be done from this structure. .. autosummary:: :toctree: modules rdata.parser.parse_file rdata.parser.parse_data Conversion of the R objects ^^^^^^^^^^^^^^^^^^^^^^^^^^^ These objects and functions convert the parsed R objects to appropriate Python objects. The Python object corresponding to an R object is chosen to preserve most of the original properties, but it could change in the future if a more fitting Python object is found. .. autosummary:: :toctree: modules rdata.conversion.Converter rdata.conversion.SimpleConverter rdata.conversion.convert python-rdata-0.5/docs/conf.py000066400000000000000000000147121413710036200162210ustar00rootroot00000000000000#!/usr/bin/env python3 # -*- coding: utf-8 -*- # # dcor documentation build configuration file, created by # sphinx-quickstart on Tue Aug 7 12:49:32 2018. # # This file is execfile()d with the current directory set to its # containing dir. # # Note that not all possible configuration values are present in this # autogenerated file. # # All configuration values have a default; values that are commented out # serve to show the default. 
# If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. # # import os # import sys # sys.path.insert(0, '/home/carlos/git/rdata/rdata') import sys import pkg_resources try: release = pkg_resources.get_distribution('rdata').version except pkg_resources.DistributionNotFound: print('To build the documentation, The distribution information of rdata\n' 'Has to be available. Either install the package into your\n' 'development environment or run "setup.py develop" to setup the\n' 'metadata. A virtualenv is recommended!\n') sys.exit(1) del pkg_resources version = '.'.join(release.split('.')[:2]) # -- General configuration ------------------------------------------------ # If your documentation needs a minimal Sphinx version, state it here. # # needs_sphinx = '1.0' # Add any Sphinx extension module names here, as strings. They can be # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. extensions = ['sphinx.ext.autodoc', 'sphinx.ext.autosummary', 'sphinx.ext.todo', 'sphinx.ext.viewcode', 'sphinx.ext.napoleon', 'sphinx.ext.mathjax', 'sphinx.ext.intersphinx'] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix(es) of source filenames. # You can specify multiple suffix as a list of string: # # source_suffix = ['.rst', '.md'] source_suffix = '.rst' # The master toctree document. master_doc = 'index' # General information about the project. project = 'rdata' copyright = '2018, Carlos Ramos Carreño' author = 'Carlos Ramos Carreño' # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the # built documents. # # The short X.Y version. # version = '' # The full version, including alpha/beta/rc tags. 
# release = '' # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. # # This is also used if you do content translation via gettext catalogs. # Usually you set "language" from the command line for these cases. language = 'en' # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This patterns also effect to html_static_path and html_extra_path exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store'] # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'sphinx' # If true, `todo` and `todoList` produce output, else they produce nothing. todo_include_todos = True add_module_names = False autosummary_generate = True # -- Options for HTML output ---------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # html_theme = 'sphinx_rtd_theme' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. # # html_theme_options = {} # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] # Custom sidebar templates, must be a dictionary that maps document names # to template names. # # This is required for the alabaster theme # refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars html_sidebars = { '**': [ 'about.html', 'navigation.html', 'relations.html', # needs 'show_related': True theme option to display 'searchbox.html', 'donate.html', ] } # -- Options for HTMLHelp output ------------------------------------------ # Output file base name for HTML help builder. 
htmlhelp_basename = 'rdatadoc' # -- Options for LaTeX output --------------------------------------------- latex_elements = { # The paper size ('letterpaper' or 'a4paper'). # # 'papersize': 'letterpaper', # The font size ('10pt', '11pt' or '12pt'). # # 'pointsize': '10pt', # Additional stuff for the LaTeX preamble. # # 'preamble': '', # Latex figure (float) alignment # # 'figure_align': 'htbp', } # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, # author, documentclass [howto, manual, or own class]). latex_documents = [ (master_doc, 'rdata.tex', 'rdata Documentation', 'Carlos Ramos Carreño', 'manual'), ] # -- Options for manual page output --------------------------------------- # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). man_pages = [ (master_doc, 'rdata', 'rdata Documentation', [author], 1) ] # -- Options for Texinfo output ------------------------------------------- # Grouping the document tree into Texinfo files. List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ (master_doc, 'rdata', 'rdata Documentation', author, 'rdata', 'One line description of project.', 'Miscellaneous'), ] # -- Options for Epub output ---------------------------------------------- # Bibliographic Dublin Core info. epub_title = project epub_author = author epub_publisher = author epub_copyright = copyright # The unique identifier of the text. This can be a ISBN number # or the project homepage. # # epub_identifier = '' # A unique identification for the text. # # epub_uid = '' # A list of files that should not be packed into the epub file. 
epub_exclude_files = ['search.html'] intersphinx_mapping = {'python': ('https://docs.python.org/3', None), 'pandas': ('http://pandas.pydata.org/pandas-docs/dev', None)} python-rdata-0.5/docs/index.rst000066400000000000000000000026661413710036200165700ustar00rootroot00000000000000rdata version |version| ======================= |build-status| |docs| |coverage| |landscape| |pypi| Open :code:`.rda` R data files containing datasets and convert them to the appropriate Python objects. .. toctree:: :maxdepth: 4 :caption: Contents: installation simpleusage apilist internalapi rdata is developed `on GitHub `_. Please report `issues `_ there as well. Indices and tables ================== * :ref:`genindex` * :ref:`modindex` * :ref:`search` .. |build-status| image:: https://api.travis-ci.org/vnmabus/rdata.svg?branch=master :alt: build status :scale: 100% :target: https://travis-ci.org/vnmabus/rdata .. |docs| image:: https://readthedocs.org/projects/rdata/badge/?version=latest :alt: Documentation Status :scale: 100% :target: https://rdata.readthedocs.io/en/latest/?badge=latest .. |coverage| image:: http://codecov.io/github/vnmabus/rdata/coverage.svg?branch=develop :alt: Coverage Status :scale: 100% :target: https://codecov.io/gh/vnmabus/rdata/branch/develop .. |landscape| image:: https://landscape.io/github/vnmabus/rdata/develop/landscape.svg?style=flat :target: https://landscape.io/github/vnmabus/rdata/develop :alt: Code Health .. |pypi| image:: https://badge.fury.io/py/rdata.svg :alt: Pypi version :scale: 100% :target: https://pypi.python.org/pypi/rdata/ python-rdata-0.5/docs/installation.rst000066400000000000000000000003661413710036200201550ustar00rootroot00000000000000Installation ============ rdata is on PyPI and can be installed using :code:`pip`: .. code:: pip install rdata It is also available for :code:`conda` using the :code:`conda-forge` channel: .. 
code:: conda install -c conda-forge rdata python-rdata-0.5/docs/internalapi.rst000066400000000000000000000002771413710036200177630ustar00rootroot00000000000000Internal documentation ====================== List of modules --------------- .. autosummary:: :toctree: modules :recursive: rdata.parser._parser rdata.conversion._conversionpython-rdata-0.5/docs/make.bat000066400000000000000000000014511413710036200163230ustar00rootroot00000000000000@ECHO OFF pushd %~dp0 REM Command file for Sphinx documentation if "%SPHINXBUILD%" == "" ( set SPHINXBUILD=sphinx-build ) set SOURCEDIR=. set BUILDDIR=_build set SPHINXPROJ=rdata if "%1" == "" goto help %SPHINXBUILD% >NUL 2>NUL if errorlevel 9009 ( echo. echo.The 'sphinx-build' command was not found. Make sure you have Sphinx echo.installed, then set the SPHINXBUILD environment variable to point echo.to the full path of the 'sphinx-build' executable. Alternatively you echo.may add the Sphinx directory to PATH. echo. echo.If you don't have Sphinx installed, grab it from echo.http://sphinx-doc.org/ exit /b 1 ) %SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% goto end :help %SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% :end popd python-rdata-0.5/docs/simpleusage.rst000066400000000000000000000057441413710036200177770ustar00rootroot00000000000000Simple usage ============ Read an R dataset ----------------- The common way of reading an R dataset is the following one: >>> import rdata >>> parsed = rdata.parser.parse_file(rdata.TESTDATA_PATH / "test_vector.rda") >>> converted = rdata.conversion.convert(parsed) >>> converted {'test_vector': array([1., 2., 3.])} This consists of two steps: #. First, the file is parsed using the function :func:`~rdata.parser.parse_file`. This provides a literal description of the file contents as a hierarchy of Python objects representing the basic R objects. This step is unambiguous and always the same. #. Then, each object must be converted to an appropriate Python object. 
In this step there are several choices as to which Python type is the most appropriate conversion for a given R object. Thus, we provide a default :func:`~rdata.conversion.convert` routine, which tries to select Python objects that preserve most of the information in the original R object. For custom R classes, it is also possible to specify conversion routines to Python objects. Convert custom R classes ------------------------ The basic :func:`~rdata.conversion.convert` routine only constructs a :class:`~rdata.conversion.SimpleConverter` object and calls its :func:`~rdata.conversion.SimpleConverter.convert` method. All arguments of :func:`~rdata.conversion.convert` are directly passed to the :class:`~rdata.conversion.SimpleConverter` initialization method. It is possible, although not trivial, to make a custom :class:`~rdata.conversion.Converter` object to change the way in which the basic R objects are transformed to Python objects. However, a more common situation is that one does not want to change how basic R objects are converted, but instead wants to provide conversions for specific R classes. This can be done by passing a dictionary to the :class:`~rdata.conversion.SimpleConverter` initialization method, containing as keys the names of R classes and as values callables that convert an R object of that class to a Python object. By default, the dictionary used is :data:`~rdata.conversion._conversion.DEFAULT_CLASS_MAP`, which can convert commonly used R classes such as `data.frame` and `factor`. As an example, here is how we would implement a conversion routine for the factor class to :class:`bytes` objects, instead of the default conversion to Pandas :class:`~pandas.Categorical` objects: >>> import rdata >>> def factor_constructor(obj, attrs): ... values = [bytes(attrs['levels'][i - 1], 'utf8') ... if i >= 0 else None for i in obj] ... ... return values >>> new_dict = { ... **rdata.conversion.DEFAULT_CLASS_MAP, ... "factor": factor_constructor ... 
} >>> parsed = rdata.parser.parse_file(rdata.TESTDATA_PATH ... / "test_dataframe.rda") >>> converted = rdata.conversion.convert(parsed, new_dict) >>> converted {'test_dataframe': class value 0 b'a' 1 1 b'b' 2 2 b'b' 3} python-rdata-0.5/pyproject.toml000066400000000000000000000001401413710036200166740ustar00rootroot00000000000000[build-system] # Minimum requirements for the build system to execute. requires = ["setuptools"]python-rdata-0.5/rdata/000077500000000000000000000000001413710036200150605ustar00rootroot00000000000000python-rdata-0.5/rdata/__init__.py000066400000000000000000000004141413710036200171700ustar00rootroot00000000000000import os as _os import pathlib as _pathlib from . import conversion, parser def _get_test_data_path() -> _pathlib.Path: return _pathlib.Path(_os.path.dirname(__file__)) / "tests" / "data" TESTDATA_PATH = _get_test_data_path() """ Path of the test data. """ python-rdata-0.5/rdata/conversion/000077500000000000000000000000001413710036200172455ustar00rootroot00000000000000python-rdata-0.5/rdata/conversion/__init__.py000066400000000000000000000006651413710036200213650ustar00rootroot00000000000000from ._conversion import (RExpression, RLanguage, convert_list, convert_attrs, convert_vector, convert_char, convert_symbol, convert_array, Converter, SimpleConverter, dataframe_constructor, factor_constructor, ts_constructor, DEFAULT_CLASS_MAP, convert) python-rdata-0.5/rdata/conversion/_conversion.py000066400000000000000000000474061413710036200221560ustar00rootroot00000000000000import abc import warnings from fractions import Fraction from types import MappingProxyType, SimpleNamespace from typing import ( Any, Callable, ChainMap, Hashable, List, Mapping, MutableMapping, NamedTuple, Optional, Union, cast, ) import numpy as np import pandas import xarray from .. import parser from ..parser import RObject class RLanguage(NamedTuple): """ R language construct. """ elements: List[Any] class RExpression(NamedTuple): """ R expression. 
""" elements: List[RLanguage] def convert_list( r_list: parser.RObject, conversion_function: Callable[ [Union[parser.RData, parser.RObject] ], Any]=lambda x: x ) -> Union[Mapping[Union[str, bytes], Any], List[Any]]: """ Expand a tagged R pairlist to a Python dictionary. Parameters ---------- r_list: RObject Pairlist R object, with tags. conversion_function: Callable Conversion function to apply to the elements of the list. By default is the identity function. Returns ------- dictionary: dict A dictionary with the tags of the pairwise list as keys and their corresponding values as values. See Also -------- convert_vector """ if r_list.info.type is parser.RObjectType.NILVALUE: return {} elif r_list.info.type not in [parser.RObjectType.LIST, parser.RObjectType.LANG]: raise TypeError("Must receive a LIST, LANG or NILVALUE object") if r_list.tag is None: tag = None else: tag = conversion_function(r_list.tag) cdr = conversion_function(r_list.value[1]) if tag is not None: if cdr is None: cdr = {} return {tag: conversion_function(r_list.value[0]), **cdr} else: if cdr is None: cdr = [] return [conversion_function(r_list.value[0]), *cdr] def convert_env( r_env: parser.RObject, conversion_function: Callable[ [Union[parser.RData, parser.RObject] ], Any]=lambda x: x ) -> ChainMap[Union[str, bytes], Any]: if r_env.info.type is not parser.RObjectType.ENV: raise TypeError("Must receive a ENV object") frame = conversion_function(r_env.value.frame) enclosure = conversion_function(r_env.value.enclosure) hash_table = conversion_function(r_env.value.hash_table) dictionary = {} for d in hash_table: if d is not None: dictionary.update(d) return ChainMap(dictionary, enclosure) def convert_attrs( r_obj: parser.RObject, conversion_function: Callable[ [Union[parser.RData, parser.RObject] ], Any]=lambda x: x ) -> Mapping[Union[str, bytes], Any]: """ Return the attributes of an object as a Python dictionary. Parameters ---------- r_obj: RObject R object. 
conversion_function: Callable Conversion function to apply to the elements of the attribute list. By default is the identity function. Returns ------- dictionary: dict A dictionary with the names of the attributes as keys and their corresponding values as values. See Also -------- convert_list """ if r_obj.attributes: attrs = cast( Mapping[Union[str, bytes], Any], conversion_function(r_obj.attributes), ) else: attrs = {} return attrs def convert_vector( r_vec: parser.RObject, conversion_function: Callable[ [Union[parser.RData, parser.RObject]], Any]=lambda x: x, attrs: Optional[Mapping[Union[str, bytes], Any]] = None, ) -> Union[List[Any], Mapping[Union[str, bytes], Any]]: """ Convert a R vector to a Python list or dictionary. If the vector has a ``names`` attribute, the result is a dictionary with the names as keys. Otherwise, the result is a Python list. Parameters ---------- r_vec: RObject R vector. conversion_function: Callable Conversion function to apply to the elements of the vector. By default is the identity function. Returns ------- vector: dict or list A dictionary with the ``names`` of the vector as keys and their corresponding values as values. If the vector does not have an argument ``names``, then a normal Python list is returned. See Also -------- convert_list """ if attrs is None: attrs = {} if r_vec.info.type not in [parser.RObjectType.VEC, parser.RObjectType.EXPR]: raise TypeError("Must receive a VEC or EXPR object") value: Union[List[Any], Mapping[Union[str, bytes], Any]] = [ conversion_function(o) for o in r_vec.value ] # If it has the name attribute, use a dict instead field_names = attrs.get('names') if field_names: value = dict(zip(field_names, value)) return value def safe_decode(byte_str: bytes, encoding: str) -> Union[str, bytes]: """ Decode a (possibly malformed) string. 
""" try: return byte_str.decode(encoding) except UnicodeDecodeError as e: warnings.warn( f"Exception while decoding {byte_str!r}: {e}", ) return byte_str def convert_char( r_char: parser.RObject, default_encoding: Optional[str] = None, force_default_encoding: bool = False, ) -> Union[str, bytes, None]: """ Decode a R character array to a Python string or bytes. The bits that signal the encoding are in the general pointer. The string can be encoded in UTF8, LATIN1 or ASCII, or can be a sequence of bytes. Parameters ---------- r_char: RObject R character array. Returns ------- string: str or bytes Decoded string. See Also -------- convert_symbol """ if r_char.info.type is not parser.RObjectType.CHAR: raise TypeError("Must receive a CHAR object") if r_char.value is None: return None assert isinstance(r_char.value, bytes) if not force_default_encoding: if r_char.info.gp & parser.CharFlags.UTF8: return safe_decode(r_char.value, "utf_8") elif r_char.info.gp & parser.CharFlags.LATIN1: return safe_decode(r_char.value, "latin_1") elif r_char.info.gp & parser.CharFlags.ASCII: return safe_decode(r_char.value, "ascii") elif r_char.info.gp & parser.CharFlags.BYTES: return r_char.value if default_encoding: return safe_decode(r_char.value, default_encoding) else: # Assume ASCII if no encoding is marked warnings.warn(f"Unknown encoding. Assumed ASCII.") return safe_decode(r_char.value, "ascii") def convert_symbol(r_symbol: parser.RObject, conversion_function: Callable[ [Union[parser.RData, parser.RObject]], Any]=lambda x: x ) -> Union[str, bytes]: """ Decode a R symbol to a Python string or bytes. Parameters ---------- r_symbol: RObject R symbol. conversion_function: Callable Conversion function to apply to the char element of the symbol. By default is the identity function. Returns ------- string: str or bytes Decoded string. 
    See Also
    --------
    convert_char

    """
    if r_symbol.info.type is parser.RObjectType.SYM:
        symbol = conversion_function(r_symbol.value)
        assert isinstance(symbol, (str, bytes))
        return symbol
    else:
        raise TypeError("Must receive a SYM object")


def convert_array(
    r_array: RObject,
    conversion_function: Callable[
        [Union[parser.RData, parser.RObject]], Any]=lambda x: x,
    attrs: Optional[Mapping[Union[str, bytes], Any]] = None,
) -> Union[np.ndarray, xarray.DataArray]:
    """
    Convert a R array to a Numpy ndarray or a Xarray DataArray.

    If the array has attribute ``dimnames`` the output will be a
    Xarray DataArray, preserving the dimension names.

    Parameters
    ----------
    r_array: RObject
        R array.
    conversion_function: Callable
        Conversion function to apply to the attributes of the array.
        By default is the identity function.

    Returns
    -------
    array: ndarray or DataArray
        Array.

    See Also
    --------
    convert_vector

    """
    if attrs is None:
        attrs = {}

    if r_array.info.type not in {parser.RObjectType.LGL,
                                 parser.RObjectType.INT,
                                 parser.RObjectType.REAL,
                                 parser.RObjectType.CPLX}:
        raise TypeError("Must receive an array object")

    value = r_array.value

    shape = attrs.get('dim')
    if shape is not None:
        # R matrix order is like FORTRAN
        value = np.reshape(value, shape, order='F')

    dimnames = attrs.get('dimnames')
    if dimnames:
        dimension_names = ["dim_" + str(i) for i, _ in enumerate(dimnames)]
        coords: Mapping[Hashable, Any] = {
            dimension_names[i]: d
            for i, d in enumerate(dimnames) if d is not None}

        value = xarray.DataArray(value, dims=dimension_names, coords=coords)

    return value


def dataframe_constructor(
    obj: Any,
    attrs: Mapping[Union[str, bytes], Any],
) -> pandas.DataFrame:
    return pandas.DataFrame(obj, columns=obj)


def _factor_constructor_internal(
    obj: Any,
    attrs: Mapping[Union[str, bytes], Any],
    ordered: bool,
) -> pandas.Categorical:
    values = [attrs['levels'][i - 1] if i >= 0 else None for i in obj]

    return pandas.Categorical(values, attrs['levels'], ordered=ordered)


def factor_constructor(
    obj: Any,
    attrs:
    Mapping[Union[str, bytes], Any],
) -> pandas.Categorical:
    return _factor_constructor_internal(obj, attrs, ordered=False)


def ordered_constructor(
    obj: Any,
    attrs: Mapping[Union[str, bytes], Any],
) -> pandas.Categorical:
    return _factor_constructor_internal(obj, attrs, ordered=True)


def ts_constructor(
    obj: Any,
    attrs: Mapping[Union[str, bytes], Any],
) -> pandas.Series:

    start, end, frequency = attrs['tsp']

    frequency = int(frequency)

    real_start = Fraction(int(round(start * frequency)), frequency)
    real_end = Fraction(int(round(end * frequency)), frequency)

    index = np.arange(real_start, real_end + Fraction(1, frequency),
                      Fraction(1, frequency))

    if frequency == 1:
        index = index.astype(int)

    return pandas.Series(obj, index=index)


Constructor = Callable[[Any, Mapping], Any]

default_class_map_dict: Mapping[Union[str, bytes], Constructor] = {
    "data.frame": dataframe_constructor,
    "factor": factor_constructor,
    "ordered": ordered_constructor,
    "ts": ts_constructor,
}

DEFAULT_CLASS_MAP = MappingProxyType(default_class_map_dict)
"""
Default mapping of constructor functions.

It has support for converting several commonly used R classes:

- Converts R \"data.frame\" objects into Pandas :class:`~pandas.DataFrame`
  objects.
- Converts R \"factor\" objects into unordered Pandas
  :class:`~pandas.Categorical` objects.
- Converts R \"ordered\" objects into ordered Pandas
  :class:`~pandas.Categorical` objects.
- Converts R \"ts\" objects into Pandas :class:`~pandas.Series` objects.

"""


class Converter(abc.ABC):
    """
    Interface of a class converting R objects into Python objects.
    """

    @abc.abstractmethod
    def convert(self, data: Union[parser.RData, parser.RObject]) -> Any:
        """
        Convert a R object to a Python one.
        """
        pass


class SimpleConverter(Converter):
    """
    Class converting R objects to Python objects.

    Parameters
    ----------
    constructor_dict:
        Dictionary mapping names of R classes to constructor functions with
        the following prototype:

        ..
           code-block :: python

            def constructor(obj, attrs):

        This dictionary can be used to support custom R classes. By default,
        the dictionary used is
        :data:`~rdata.conversion._conversion.DEFAULT_CLASS_MAP`
        which has support for several common classes.
    default_encoding:
        Default encoding used for strings with unknown encoding. If `None`,
        the one stored in the file will be used, or ASCII as a fallback.
    force_default_encoding:
        Use the default encoding even if the strings specify other encoding.

    """

    def __init__(
        self,
        constructor_dict: Mapping[
            Union[str, bytes],
            Constructor,
        ] = DEFAULT_CLASS_MAP,
        default_encoding: Optional[str] = None,
        force_default_encoding: bool = False,
        global_environment: Optional[Mapping[Union[str, bytes], Any]] = None,
    ) -> None:

        self.constructor_dict = constructor_dict
        self.default_encoding = default_encoding
        self.force_default_encoding = force_default_encoding
        self.global_environment = ChainMap(
            {} if global_environment is None
            else global_environment
        )
        self.empty_environment: Mapping[Union[str, bytes], Any] = ChainMap({})

        self._reset()

    def _reset(self) -> None:
        self.references: MutableMapping[int, Any] = {}
        self.default_encoding_used = self.default_encoding

    def convert(self, data: Union[parser.RData, parser.RObject]) -> Any:
        self._reset()
        return self._convert_next(data)

    def _convert_next(self, data: Union[parser.RData, parser.RObject]) -> Any:
        """
        Convert a R object to a Python one.
""" obj: RObject if isinstance(data, parser.RData): obj = data.object if self.default_encoding is None: self.default_encoding_used = data.extra.encoding else: obj = data attrs = convert_attrs(obj, self._convert_next) reference_id = id(obj) # Return the value if previously referenced value: Any = self.references.get(id(obj)) if value is not None: pass if obj.info.type == parser.RObjectType.SYM: # Return the internal string value = convert_symbol(obj, self._convert_next) elif obj.info.type == parser.RObjectType.LIST: # Expand the list and process the elements value = convert_list(obj, self._convert_next) elif obj.info.type == parser.RObjectType.ENV: # Return a ChainMap of the environments value = convert_env(obj, self._convert_next) elif obj.info.type == parser.RObjectType.LANG: # Expand the list and process the elements, returning a # special object rlanguage_list = convert_list(obj, self._convert_next) assert isinstance(rlanguage_list, list) value = RLanguage(rlanguage_list) elif obj.info.type == parser.RObjectType.CHAR: # Return the internal string value = convert_char( obj, default_encoding=self.default_encoding_used, force_default_encoding=self.force_default_encoding, ) elif obj.info.type in {parser.RObjectType.LGL, parser.RObjectType.INT, parser.RObjectType.REAL, parser.RObjectType.CPLX}: # Return the internal array value = convert_array(obj, self._convert_next, attrs=attrs) elif obj.info.type == parser.RObjectType.STR: # Convert the internal strings value = [self._convert_next(o) for o in obj.value] elif obj.info.type == parser.RObjectType.VEC: # Convert the internal objects value = convert_vector(obj, self._convert_next, attrs=attrs) elif obj.info.type == parser.RObjectType.EXPR: rexpression_list = convert_vector( obj, self._convert_next, attrs=attrs) assert isinstance(rexpression_list, list) # Convert the internal objects returning a special object value = RExpression(rexpression_list) elif obj.info.type == parser.RObjectType.S4: value = 
                SimpleNamespace(**attrs)

        elif obj.info.type == parser.RObjectType.EMPTYENV:
            value = self.empty_environment

        elif obj.info.type == parser.RObjectType.GLOBALENV:
            value = self.global_environment

        elif obj.info.type == parser.RObjectType.REF:

            # Return the referenced value
            value = self.references.get(id(obj.referenced_object))
            # value = self.references[id(obj.referenced_object)]
            if value is None:
                reference_id = id(obj.referenced_object)
                assert obj.referenced_object is not None
                value = self._convert_next(obj.referenced_object)

        elif obj.info.type == parser.RObjectType.NILVALUE:
            value = None

        else:
            raise NotImplementedError(f"Type {obj.info.type} not implemented")

        if obj.info.object:
            classname = attrs["class"]
            for i, c in enumerate(classname):

                constructor = self.constructor_dict.get(c, None)

                if constructor:
                    new_value = constructor(value, attrs)
                else:
                    new_value = NotImplemented

                if new_value is NotImplemented:
                    missing_msg = (f"Missing constructor for R class "
                                   f"\"{c}\". ")

                    if len(classname) > (i + 1):
                        solution_msg = (f"The constructor for class "
                                        f"\"{classname[i+1]}\" will be "
                                        f"used instead."
                                        )
                    else:
                        solution_msg = ("The underlying R object is "
                                        "returned instead.")

                    warnings.warn(missing_msg + solution_msg,
                                  stacklevel=1)
                else:
                    value = new_value
                    break

        self.references[reference_id] = value

        return value


def convert(
    data: Union[parser.RData, parser.RObject],
    *args: Any,
    **kwargs: Any,
) -> Any:
    """
    Uses the default converter (:class:`SimpleConverter`) to convert the data.

    Examples:

        Parse one of the included examples, containing a vector

        >>> import rdata
        >>>
        >>> parsed = rdata.parser.parse_file(
        ...              rdata.TESTDATA_PATH / "test_vector.rda")
        >>> converted = rdata.conversion.convert(parsed)
        >>> converted
        {'test_vector': array([1., 2., 3.])}

        Parse another example, containing a dataframe

        >>> import rdata
        >>>
        >>> parsed = rdata.parser.parse_file(
        ...
                     rdata.TESTDATA_PATH / "test_dataframe.rda")
        >>> converted = rdata.conversion.convert(parsed)
        >>> converted
        {'test_dataframe':   class  value
        0     a      1
        1     b      2
        2     b      3}

    """
    return SimpleConverter(*args, **kwargs).convert(data)
python-rdata-0.5/rdata/parser/000077500000000000000000000000001413710036200163545ustar00rootroot00000000000000
python-rdata-0.5/rdata/parser/__init__.py000066400000000000000000000002321413710036200204620ustar00rootroot00000000000000
from ._parser import (
    DEFAULT_ALTREP_MAP,
    CharFlags,
    RData,
    RObject,
    RObjectInfo,
    RObjectType,
    parse_data,
    parse_file,
)
python-rdata-0.5/rdata/parser/_parser.py000066400000000000000000000737001413710036200203700ustar00rootroot00000000000000
from __future__ import annotations

import abc
import bz2
import enum
import gzip
import lzma
import os
import pathlib
import warnings
import xdrlib
from dataclasses import dataclass
from types import MappingProxyType
from typing import (
    Any,
    BinaryIO,
    Callable,
    List,
    Mapping,
    Optional,
    Set,
    TextIO,
    Tuple,
    Union,
)

import numpy as np


class FileTypes(enum.Enum):
    """
    Type of file containing a R file.
    """
    bzip2 = "bz2"
    gzip = "gzip"
    xz = "xz"
    rdata_binary_v2 = "rdata version 2 (binary)"
    rdata_binary_v3 = "rdata version 3 (binary)"


magic_dict = {
    FileTypes.bzip2: b"\x42\x5a\x68",
    FileTypes.gzip: b"\x1f\x8b",
    FileTypes.xz: b"\xFD7zXZ\x00",
    FileTypes.rdata_binary_v2: b"RDX2\n",
    FileTypes.rdata_binary_v3: b"RDX3\n"
}


def file_type(data: memoryview) -> Optional[FileTypes]:
    """
    Returns the type of the file.
    """

    for filetype, magic in magic_dict.items():
        if data[:len(magic)] == magic:
            return filetype
    return None


class RdataFormats(enum.Enum):
    """
    Format of a R file.
    """
    XDR = "XDR"
    ASCII = "ASCII"
    binary = "binary"


format_dict = {
    RdataFormats.XDR: b"X\n",
    RdataFormats.ASCII: b"A\n",
    RdataFormats.binary: b"B\n",
}


def rdata_format(data: memoryview) -> Optional[RdataFormats]:
    """
    Returns the format of the data.
""" for format_type, magic in format_dict.items(): if data[:len(magic)] == magic: return format_type return None class RObjectType(enum.Enum): """ Type of a R object. """ NIL = 0 # NULL SYM = 1 # symbols LIST = 2 # pairlists CLO = 3 # closures ENV = 4 # environments PROM = 5 # promises LANG = 6 # language objects SPECIAL = 7 # special functions BUILTIN = 8 # builtin functions CHAR = 9 # internal character strings LGL = 10 # logical vectors INT = 13 # integer vectors REAL = 14 # numeric vectors CPLX = 15 # complex vectors STR = 16 # character vectors DOT = 17 # dot-dot-dot object ANY = 18 # make “any” args work VEC = 19 # list (generic vector) EXPR = 20 # expression vector BCODE = 21 # byte code EXTPTR = 22 # external pointer WEAKREF = 23 # weak reference RAW = 24 # raw vector S4 = 25 # S4 classes not of simple type ALTREP = 238 # Alternative representations EMPTYENV = 242 # Empty environment GLOBALENV = 253 # Global environment NILVALUE = 254 # NIL value REF = 255 # Reference class CharFlags(enum.IntFlag): HAS_HASH = 1 BYTES = 1 << 1 LATIN1 = 1 << 2 UTF8 = 1 << 3 CACHED = 1 << 5 ASCII = 1 << 6 @dataclass class RVersions(): """ R versions. """ format: int serialized: int minimum: int @dataclass class RExtraInfo(): """ Extra information. Contains the default encoding (only in version 3). """ encoding: Optional[str] = None @dataclass class RObjectInfo(): """ Internal attributes of a R object. """ type: RObjectType object: bool attributes: bool tag: bool gp: int reference: int @dataclass class RObject(): """ Representation of a R object. 
""" info: RObjectInfo value: Any attributes: Optional[RObject] tag: Optional[RObject] = None referenced_object: Optional[RObject] = None def _str_internal( self, indent: int = 0, used_references: Optional[Set[int]] = None ) -> str: if used_references is None: used_references = set() string = "" string += f"{' ' * indent}{self.info.type}\n" if self.tag: tag_string = self.tag._str_internal(indent + 4, used_references.copy()) string += f"{' ' * (indent + 2)}tag:\n{tag_string}\n" if self.info.reference: assert self.referenced_object reference_string = (f"{' ' * (indent + 4)}..." if self.info.reference in used_references else self.referenced_object._str_internal( indent + 4, used_references.copy())) string += (f"{' ' * (indent + 2)}reference: " f"{self.info.reference}\n{reference_string}\n") string += f"{' ' * (indent + 2)}value:\n" if isinstance(self.value, RObject): string += self.value._str_internal(indent + 4, used_references.copy()) elif isinstance(self.value, tuple) or isinstance(self.value, list): for elem in self.value: string += elem._str_internal(indent + 4, used_references.copy()) elif isinstance(self.value, np.ndarray): string += " " * (indent + 4) if len(self.value) > 4: string += (f"[{self.value[0]}, {self.value[1]} ... " f"{self.value[-2]}, {self.value[-1]}]\n") else: string += f"{self.value}\n" else: string += f"{' ' * (indent + 4)}{self.value}\n" if(self.attributes): attr_string = self.attributes._str_internal( indent + 4, used_references.copy()) string += f"{' ' * (indent + 2)}attributes:\n{attr_string}\n" return string def __str__(self) -> str: return self._str_internal() @dataclass class RData(): """ Data contained in a R file. """ versions: RVersions extra: RExtraInfo object: RObject @dataclass class EnvironmentValue(): """ Value of an environment. 
""" locked: bool enclosure: RObject frame: RObject hash_table: RObject AltRepConstructor = Callable[ [RObject], Tuple[RObjectInfo, Any], ] AltRepConstructorMap = Mapping[bytes, AltRepConstructor] def format_float_with_scipen(number: float, scipen: int) -> bytes: fixed = np.format_float_positional(number, trim="-") scientific = np.format_float_scientific(number, trim="-") assert(isinstance(fixed, str)) assert(isinstance(scientific, str)) return ( scientific if len(fixed) - len(scientific) > scipen else fixed ).encode() def deferred_string_constructor( state: RObject, ) -> Tuple[RObjectInfo, Any]: new_info = RObjectInfo( type=RObjectType.STR, object=False, attributes=False, tag=False, gp=0, reference=0, ) object_to_format = state.value[0].value scipen = state.value[1].value value = [ RObject( info=RObjectInfo( type=RObjectType.CHAR, object=False, attributes=False, tag=False, gp=CharFlags.ASCII, reference=0, ), value=format_float_with_scipen(num, scipen), attributes=None, tag=None, referenced_object=None, ) for num in object_to_format ] return new_info, value def compact_seq_constructor( state: RObject, *, is_int: bool = False ) -> Tuple[RObjectInfo, Any]: new_info = RObjectInfo( type=RObjectType.INT if is_int else RObjectType.REAL, object=False, attributes=False, tag=False, gp=0, reference=0, ) start = state.value[1] stop = state.value[0] step = state.value[2] if is_int: start = int(start) stop = int(stop) step = int(step) value = np.arange(start, stop, step) return new_info, value def compact_intseq_constructor( state: RObject, ) -> Tuple[RObjectInfo, Any]: return compact_seq_constructor(state, is_int=True) def compact_realseq_constructor( state: RObject, ) -> Tuple[RObjectInfo, Any]: return compact_seq_constructor(state, is_int=False) def wrap_constructor( state: RObject, ) -> Tuple[RObjectInfo, Any]: new_info = RObjectInfo( type=state.value[0].info.type, object=False, attributes=False, tag=False, gp=0, reference=0, ) value = state.value[0].value return new_info, 
        value


default_altrep_map_dict: Mapping[bytes, AltRepConstructor] = {
    b"deferred_string": deferred_string_constructor,
    b"compact_intseq": compact_intseq_constructor,
    b"compact_realseq": compact_realseq_constructor,
    b"wrap_real": wrap_constructor,
    b"wrap_string": wrap_constructor,
    b"wrap_logical": wrap_constructor,
    b"wrap_integer": wrap_constructor,
    b"wrap_complex": wrap_constructor,
    b"wrap_raw": wrap_constructor,
}

DEFAULT_ALTREP_MAP = MappingProxyType(default_altrep_map_dict)


class Parser(abc.ABC):
    """
    Parser interface for a R file.
    """

    def __init__(
        self,
        *,
        expand_altrep: bool = True,
        altrep_constructor_dict: AltRepConstructorMap = DEFAULT_ALTREP_MAP,
    ):
        self.expand_altrep = expand_altrep
        self.altrep_constructor_dict = altrep_constructor_dict

    def parse_bool(self) -> bool:
        """
        Parse a boolean.
        """
        return bool(self.parse_int())

    @abc.abstractmethod
    def parse_int(self) -> int:
        """
        Parse an integer.
        """
        pass

    @abc.abstractmethod
    def parse_double(self) -> float:
        """
        Parse a double.
        """
        pass

    def parse_complex(self) -> complex:
        """
        Parse a complex number.
        """
        return complex(self.parse_double(), self.parse_double())

    @abc.abstractmethod
    def parse_string(self, length: int) -> bytes:
        """
        Parse a string.
        """
        pass

    def parse_all(self) -> RData:
        """
        Parse the whole file.
        """

        versions = self.parse_versions()
        extra_info = self.parse_extra_info(versions)
        obj = self.parse_R_object()

        return RData(versions, extra_info, obj)

    def parse_versions(self) -> RVersions:
        """
        Parse the versions header.
        """

        format_version = self.parse_int()
        r_version = self.parse_int()
        minimum_r_version = self.parse_int()

        if format_version not in [2, 3]:
            raise NotImplementedError(
                f"Format version {format_version} unsupported",
            )

        return RVersions(format_version, r_version, minimum_r_version)

    def parse_extra_info(self, versions: RVersions) -> RExtraInfo:
        """
        Parse the extra info (the default encoding, only in format version 3).
""" encoding = None if versions.format >= 3: encoding_len = self.parse_int() encoding = self.parse_string(encoding_len).decode("ASCII") extra_info = RExtraInfo(encoding) return extra_info def expand_altrep_to_object( self, info: RObject, state: RObject, ) -> Tuple[RObjectInfo, Any]: """Expand alternative representation to normal object.""" assert info.info.type == RObjectType.LIST class_sym = info.value[0] while class_sym.info.type == RObjectType.REF: class_sym = class_sym.referenced_object assert class_sym.info.type == RObjectType.SYM assert class_sym.value.info.type == RObjectType.CHAR altrep_name = class_sym.value.value assert isinstance(altrep_name, bytes) constructor = self.altrep_constructor_dict[altrep_name] return constructor(state) def parse_R_object( self, reference_list: Optional[List[RObject]] = None ) -> RObject: """ Parse a R object. """ if reference_list is None: # Index is 1-based, so we insert a dummy object reference_list = [] info_int = self.parse_int() info = parse_r_object_info(info_int) tag = None attributes = None referenced_object = None tag_read = False attributes_read = False add_reference = False result = None value: Any if info.type == RObjectType.NIL: value = None elif info.type == RObjectType.SYM: # Read Char value = self.parse_R_object(reference_list) # Symbols can be referenced add_reference = True elif info.type in [RObjectType.LIST, RObjectType.LANG]: tag = None if info.attributes: attributes = self.parse_R_object(reference_list) attributes_read = True elif info.tag: tag = self.parse_R_object(reference_list) tag_read = True # Read CAR and CDR car = self.parse_R_object(reference_list) cdr = self.parse_R_object(reference_list) value = (car, cdr) elif info.type == RObjectType.ENV: result = RObject( info=info, tag=tag, attributes=attributes, value=None, referenced_object=referenced_object, ) reference_list.append(result) locked = self.parse_bool() enclosure = self.parse_R_object(reference_list) frame = 
                self.parse_R_object(reference_list)
            hash_table = self.parse_R_object(reference_list)
            attributes = self.parse_R_object(reference_list)

            value = EnvironmentValue(
                locked=locked,
                enclosure=enclosure,
                frame=frame,
                hash_table=hash_table,
            )

        elif info.type == RObjectType.CHAR:
            length = self.parse_int()
            if length > 0:
                value = self.parse_string(length=length)
            elif length == 0:
                value = b""
            elif length == -1:
                value = None
            else:
                raise NotImplementedError(
                    f"Length of CHAR cannot be {length}")

        elif info.type == RObjectType.LGL:
            length = self.parse_int()

            value = np.empty(length, dtype=np.bool_)

            for i in range(length):
                value[i] = self.parse_bool()

        elif info.type == RObjectType.INT:
            length = self.parse_int()

            value = np.empty(length, dtype=np.int64)

            for i in range(length):
                value[i] = self.parse_int()

        elif info.type == RObjectType.REAL:
            length = self.parse_int()

            value = np.empty(length, dtype=np.double)

            for i in range(length):
                value[i] = self.parse_double()

        elif info.type == RObjectType.CPLX:
            length = self.parse_int()

            # np.complex128 is the explicit name of the former np.complex_
            # alias, which no longer exists in NumPy 2
            value = np.empty(length, dtype=np.complex128)

            for i in range(length):
                value[i] = self.parse_complex()

        elif info.type in [RObjectType.STR,
                           RObjectType.VEC, RObjectType.EXPR]:
            length = self.parse_int()

            value = [None] * length

            for i in range(length):
                value[i] = self.parse_R_object(reference_list)

        elif info.type == RObjectType.S4:
            value = None

        elif info.type == RObjectType.ALTREP:
            altrep_info = self.parse_R_object(reference_list)
            altrep_state = self.parse_R_object(reference_list)
            altrep_attr = self.parse_R_object(reference_list)

            if self.expand_altrep:
                info, value = self.expand_altrep_to_object(
                    info=altrep_info,
                    state=altrep_state,
                )
                attributes = altrep_attr
            else:
                value = (altrep_info, altrep_state, altrep_attr)

        elif info.type == RObjectType.EMPTYENV:
            value = None

        elif info.type == RObjectType.GLOBALENV:
            value = None

        elif info.type == RObjectType.NILVALUE:
            value = None

        elif info.type == RObjectType.REF:
            value = None
            # Index is 1-based
            referenced_object =
                reference_list[info.reference - 1]

        else:
            raise NotImplementedError(f"Type {info.type} not implemented")

        if info.tag and not tag_read:
            warnings.warn(f"Tag not implemented for type {info.type} "
                          "and ignored")
        if info.attributes and not attributes_read:
            attributes = self.parse_R_object(reference_list)

        if result is None:
            result = RObject(
                info=info,
                tag=tag,
                attributes=attributes,
                value=value,
                referenced_object=referenced_object,
            )
        else:
            result.info = info
            result.attributes = attributes
            result.value = value
            result.referenced_object = referenced_object

        if add_reference:
            reference_list.append(result)

        return result


class ParserXDR(Parser):
    """
    Parser used when the integers and doubles are in XDR format.
    """

    def __init__(
        self,
        data: memoryview,
        position: int = 0,
        *,
        expand_altrep: bool = True,
        altrep_constructor_dict: AltRepConstructorMap = DEFAULT_ALTREP_MAP,
    ) -> None:
        super().__init__(
            expand_altrep=expand_altrep,
            altrep_constructor_dict=altrep_constructor_dict,
        )
        self.data = data
        self.position = position
        self.xdr_parser = xdrlib.Unpacker(data)

    def parse_int(self) -> int:
        self.xdr_parser.set_position(self.position)
        result = self.xdr_parser.unpack_int()
        self.position = self.xdr_parser.get_position()

        return result

    def parse_double(self) -> float:
        self.xdr_parser.set_position(self.position)
        result = self.xdr_parser.unpack_double()
        self.position = self.xdr_parser.get_position()

        return result

    def parse_string(self, length: int) -> bytes:
        result = self.data[self.position:(self.position + length)]
        self.position += length
        return bytes(result)


def parse_file(
    file_or_path: Union[BinaryIO, TextIO, 'os.PathLike[Any]', str],
    *,
    expand_altrep: bool = True,
    altrep_constructor_dict: AltRepConstructorMap = DEFAULT_ALTREP_MAP,
) -> RData:
    """
    Parse a R file (.rda or .rdata).

    Parameters:
        file_or_path (file-like, str, bytes or path-like): File
            in the R serialization format.
        expand_altrep (bool): Whether to translate ALTREPs to normal objects.
        altrep_constructor_dict: Dictionary mapping each ALTREP to
            its constructor.

    Returns:
        RData: Data contained in the file (versions and object).

    See Also:
        :func:`parse_data`: Similar function that receives the data directly.

    Examples:

        Parse one of the included examples, containing a vector

        >>> import rdata
        >>>
        >>> parsed = rdata.parser.parse_file(
        ...              rdata.TESTDATA_PATH / "test_vector.rda")
        >>> parsed
        RData(versions=RVersions(format=2,
                                 serialized=196610,
                                 minimum=131840),
              extra=RExtraInfo(encoding=None),
              object=RObject(info=RObjectInfo(type=<RObjectType.LIST: 2>,
              object=False, attributes=False, tag=True, gp=0, reference=0),
              value=(RObject(info=RObjectInfo(type=<RObjectType.REAL: 14>,
              object=False, attributes=False, tag=False, gp=0, reference=0),
              value=array([1., 2., 3.]), attributes=None, tag=None,
              referenced_object=None),
              RObject(info=RObjectInfo(type=<RObjectType.NILVALUE: 254>,
              object=False, attributes=False, tag=False, gp=0, reference=0),
              value=None, attributes=None, tag=None, referenced_object=None)),
              attributes=None,
              tag=RObject(info=RObjectInfo(type=<RObjectType.SYM: 1>,
              object=False, attributes=False, tag=False, gp=0, reference=0),
              value=RObject(info=RObjectInfo(type=<RObjectType.CHAR: 9>,
              object=False, attributes=False, tag=False, gp=64, reference=0),
              value=b'test_vector', attributes=None, tag=None,
              referenced_object=None), attributes=None, tag=None,
              referenced_object=None), referenced_object=None))

    """
    if isinstance(file_or_path, (os.PathLike, str)):
        path = pathlib.Path(file_or_path)
        data = path.read_bytes()
    else:
        # file is a pre-opened file
        buffer: Optional[BinaryIO] = getattr(file_or_path, 'buffer', None)
        if buffer is None:
            assert isinstance(file_or_path, BinaryIO)
            binary_file: BinaryIO = file_or_path
        else:
            binary_file = buffer
        data = binary_file.read()
    return parse_data(
        data,
        expand_altrep=expand_altrep,
        altrep_constructor_dict=altrep_constructor_dict,
    )


def parse_data(
    data: bytes,
    *,
    expand_altrep: bool = True,
    altrep_constructor_dict: AltRepConstructorMap = DEFAULT_ALTREP_MAP,
) -> RData:
    """
    Parse the data of a R file, received as a sequence of bytes.

    Parameters:
        data (bytes): Data extracted from a R file.
        expand_altrep (bool): Whether to translate ALTREPs to normal objects.
        altrep_constructor_dict: Dictionary mapping each ALTREP to
            its constructor.

    Returns:
        RData: Data contained in the file (versions and object).

    See Also:
        :func:`parse_file`: Similar function that parses a file directly.

    Examples:

        Parse one of the included examples, containing a vector

        >>> import rdata
        >>>
        >>> with open(rdata.TESTDATA_PATH / "test_vector.rda", "rb") as f:
        ...     parsed = rdata.parser.parse_data(f.read())
        >>>
        >>> parsed
        RData(versions=RVersions(format=2,
                                 serialized=196610,
                                 minimum=131840),
              extra=RExtraInfo(encoding=None),
              object=RObject(info=RObjectInfo(type=<RObjectType.LIST: 2>,
              object=False, attributes=False, tag=True, gp=0, reference=0),
              value=(RObject(info=RObjectInfo(type=<RObjectType.REAL: 14>,
              object=False, attributes=False, tag=False, gp=0, reference=0),
              value=array([1., 2., 3.]), attributes=None, tag=None,
              referenced_object=None),
              RObject(info=RObjectInfo(type=<RObjectType.NILVALUE: 254>,
              object=False, attributes=False, tag=False, gp=0, reference=0),
              value=None, attributes=None, tag=None, referenced_object=None)),
              attributes=None,
              tag=RObject(info=RObjectInfo(type=<RObjectType.SYM: 1>,
              object=False, attributes=False, tag=False, gp=0, reference=0),
              value=RObject(info=RObjectInfo(type=<RObjectType.CHAR: 9>,
              object=False, attributes=False, tag=False, gp=64, reference=0),
              value=b'test_vector', attributes=None, tag=None,
              referenced_object=None), attributes=None, tag=None,
              referenced_object=None), referenced_object=None))

    """
    view = memoryview(data)

    filetype = file_type(view)

    parse_function = (
        parse_rdata_binary
        if filetype in {
            FileTypes.rdata_binary_v2,
            FileTypes.rdata_binary_v3,
        } else parse_data
    )

    if filetype is FileTypes.bzip2:
        new_data = bz2.decompress(data)
    elif filetype is FileTypes.gzip:
        new_data = gzip.decompress(data)
    elif filetype is FileTypes.xz:
        new_data = lzma.decompress(data)
    elif filetype in {FileTypes.rdata_binary_v2, FileTypes.rdata_binary_v3}:
        view = view[len(magic_dict[filetype]):]
        new_data = view
    else:
        raise
            NotImplementedError("Unknown file type")

    return parse_function(
        new_data,  # type: ignore
        expand_altrep=expand_altrep,
        altrep_constructor_dict=altrep_constructor_dict,
    )


def parse_rdata_binary(
    data: memoryview,
    expand_altrep: bool = True,
    altrep_constructor_dict: AltRepConstructorMap = DEFAULT_ALTREP_MAP,
) -> RData:
    """
    Select the appropriate parser and parse all the info.
    """
    format_type = rdata_format(data)

    if format_type:
        data = data[len(format_dict[format_type]):]

    if format_type is RdataFormats.XDR:
        parser = ParserXDR(
            data,
            expand_altrep=expand_altrep,
            altrep_constructor_dict=altrep_constructor_dict,
        )
        return parser.parse_all()
    else:
        raise NotImplementedError("Unknown file format")


def bits(data: int, start: int, stop: int) -> int:
    """
    Read bits [start, stop) of an integer.
    """
    count = stop - start
    mask = ((1 << count) - 1) << start

    bitvalue = data & mask
    return bitvalue >> start


def is_special_r_object_type(r_object_type: RObjectType) -> bool:
    """
    Check if a R type has a different serialization than the usual one.
    """
    return (r_object_type is RObjectType.NILVALUE
            or r_object_type is RObjectType.REF)


def parse_r_object_info(info_int: int) -> RObjectInfo:
    """
    Parse the internal information of an object.
    """
    type_exp = RObjectType(bits(info_int, 0, 8))

    reference = 0

    if is_special_r_object_type(type_exp):
        object_flag = False
        attributes = False
        tag = False
        gp = 0
    else:
        object_flag = bool(bits(info_int, 8, 9))
        attributes = bool(bits(info_int, 9, 10))
        tag = bool(bits(info_int, 10, 11))
        gp = bits(info_int, 12, 28)

    if type_exp == RObjectType.REF:
        reference = bits(info_int, 8, 32)

    return RObjectInfo(
        type=type_exp,
        object=object_flag,
        attributes=attributes,
        tag=tag,
        gp=gp,
        reference=reference
    )
python-rdata-0.5/rdata/py.typed000066400000000000000000000001001413710036200165460ustar00rootroot00000000000000
# Marker file for PEP 561.
The rdata package uses inline types.
python-rdata-0.5/rdata/tests/000077500000000000000000000000001413710036200162225ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/__init__.py000066400000000000000000000000001413710036200203210ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/data/000077500000000000000000000000001413710036200171335ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/data/test_altrep_compact_intseq.rda000066400000000000000000000001761413710036200252460ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/data/test_altrep_compact_realseq.rda000066400000000000000000000002001413710036200253630ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/data/test_altrep_deferred_string.rda000066400000000000000000000002561413710036200254020ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/data/test_altrep_wrap_logical.rda000066400000000000000000000001751413710036200246770ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/data/test_empty_str.rda000066400000000000000000000001101413710036200227000ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/data/test_na_string.rda000066400000000000000000000001161413710036200226440ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/data/test_s4.rda000066400000000000000000000002131413710036200212040ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/data/test_vector.rda000066400000000000000000000001161413710036200221620ustar00rootroot00000000000000
python-rdata-0.5/rdata/tests/test_rdata.py000066400000000000000000000231151413710036200207300ustar00rootroot00000000000000
import unittest
from collections import ChainMap
from fractions import Fraction
from types import SimpleNamespace
from typing import Any, Dict

import numpy as np
import pandas as pd

import rdata

TESTDATA_PATH = rdata.TESTDATA_PATH


class SimpleTests(unittest.TestCase):

    def test_opened_file(self) -> None:
        parsed = rdata.parser.parse_file(open(TESTDATA_PATH /
                                              "test_vector.rda"))
        converted = rdata.conversion.convert(parsed)

        self.assertIsInstance(converted, dict)

    def test_opened_string(self) -> None:
        parsed = rdata.parser.parse_file(str(TESTDATA_PATH /
                                             "test_vector.rda"))
        converted = rdata.conversion.convert(parsed)

        self.assertIsInstance(converted, dict)

    def test_logical(self) -> None:
        parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_logical.rda")
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_logical": np.array([True, True, False, True, False])
        })

    def test_vector(self) -> None:
        parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_vector.rda")
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_vector": np.array([1., 2., 3.])
        })

    def test_empty_string(self) -> None:
        parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_empty_str.rda")
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
"test_empty_str": [""] }) def test_na_string(self) -> None: parsed = rdata.parser.parse_file( TESTDATA_PATH / "test_na_string.rda") converted = rdata.conversion.convert(parsed) np.testing.assert_equal(converted, { "test_na_string": [None] }) def test_complex(self) -> None: parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_complex.rda") converted = rdata.conversion.convert(parsed) np.testing.assert_equal(converted, { "test_complex": np.array([1 + 2j, 2, 0, 1 + 3j, -1j]) }) def test_matrix(self) -> None: parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_matrix.rda") converted = rdata.conversion.convert(parsed) np.testing.assert_equal(converted, { "test_matrix": np.array([[1., 2., 3.], [4., 5., 6.]]) }) def test_list(self) -> None: parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_list.rda") converted = rdata.conversion.convert(parsed) np.testing.assert_equal(converted, { "test_list": [ np.array([1.]), ['a', 'b', 'c'], np.array([2., 3.]), ['hi'] ] }) def test_expression(self) -> None: parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_expression.rda") converted = rdata.conversion.convert(parsed) np.testing.assert_equal(converted, { "test_expression": rdata.conversion.RExpression([ rdata.conversion.RLanguage(['^', 'base', 'exponent'])]) }) def test_encodings(self) -> None: with self.assertWarns( UserWarning, msg="Unknown encoding. Assumed ASCII." 
        ):
            parsed = rdata.parser.parse_file(
                TESTDATA_PATH / "test_encodings.rda",
            )
            converted = rdata.conversion.convert(parsed)

            np.testing.assert_equal(converted, {
                "test_encoding_utf8": ["eĥoŝanĝo ĉiuĵaŭde"],
                "test_encoding_latin1": ["cañón"],
                "test_encoding_bytes": [b"reba\xf1o"],
                "test_encoding_latin1_implicit": [b"\xcd\xf1igo"],
            })

    def test_encodings_v3(self) -> None:
        parsed = rdata.parser.parse_file(
            TESTDATA_PATH / "test_encodings_v3.rda",
        )
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_encoding_utf8": ["eĥoŝanĝo ĉiuĵaŭde"],
            "test_encoding_latin1": ["cañón"],
            "test_encoding_bytes": [b"reba\xf1o"],
            "test_encoding_latin1_implicit": ["Íñigo"],
        })

    def test_dataframe(self) -> None:
        for f in {"test_dataframe.rda", "test_dataframe_v3.rda"}:
            with self.subTest(file=f):
                parsed = rdata.parser.parse_file(
                    TESTDATA_PATH / f,
                )
                converted = rdata.conversion.convert(parsed)

                pd.testing.assert_frame_equal(
                    converted["test_dataframe"],
                    pd.DataFrame({
                        "class": pd.Categorical(
                            ["a", "b", "b"]),
                        "value": [1, 2, 3],
                    })
                )

    def test_ts(self) -> None:
        parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_ts.rda")
        converted = rdata.conversion.convert(parsed)

        pd.testing.assert_series_equal(converted["test_ts"],
                                       pd.Series({
                                           2000 + Fraction(2, 12): 1.,
                                           2000 + Fraction(3, 12): 2.,
                                           2000 + Fraction(4, 12): 3.,
                                       }))

    def test_s4(self) -> None:
        parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_s4.rda")
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_s4": SimpleNamespace(
                age=np.array(28),
                name=["Carlos"],
                **{'class': ["Person"]}
            )
        })

    def test_environment(self) -> None:
        parsed = rdata.parser.parse_file(
            TESTDATA_PATH / "test_environment.rda")
        converted = rdata.conversion.convert(parsed)

        dict_env = {'string': ['test']}
        empty_global_env: Dict[str, Any] = {}

        np.testing.assert_equal(converted, {
            "test_environment": ChainMap(dict_env, ChainMap(empty_global_env))
        })

        global_env = {"global": "test"}

        converted_global = rdata.conversion.convert(
            parsed,
            global_environment=global_env,
        )

        np.testing.assert_equal(converted_global, {
            "test_environment": ChainMap(dict_env, ChainMap(global_env))
        })

    def test_emptyenv(self) -> None:
        parsed = rdata.parser.parse_file(
            TESTDATA_PATH / "test_emptyenv.rda")
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_emptyenv": ChainMap({})
        })

    def test_list_attrs(self) -> None:
        parsed = rdata.parser.parse_file(TESTDATA_PATH / "test_list_attrs.rda")
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_list_attrs": [['list'], [5]]
        })

    def test_altrep_compact_intseq(self) -> None:
        """Test alternative representation of sequences of ints."""
        parsed = rdata.parser.parse_file(
            TESTDATA_PATH / "test_altrep_compact_intseq.rda",
        )
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_altrep_compact_intseq": np.arange(1000),
        })

    def test_altrep_compact_realseq(self) -> None:
        """Test alternative representation of sequences of reals."""
        parsed = rdata.parser.parse_file(
            TESTDATA_PATH / "test_altrep_compact_realseq.rda",
        )
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_altrep_compact_realseq": np.arange(1000.0),
        })

    def test_altrep_deferred_string(self) -> None:
        """Test alternative representation of deferred strings."""
        parsed = rdata.parser.parse_file(
            TESTDATA_PATH / "test_altrep_deferred_string.rda",
        )
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_altrep_deferred_string": [
                "1", "2.3", "10000",
                "1e+05", "-10000", "-1e+05",
                "0.001", "1e-04", "1e-05",
            ],
        })

    def test_altrep_wrap_real(self) -> None:
        """Test alternative representation of wrap_real."""
        parsed = rdata.parser.parse_file(
            TESTDATA_PATH / "test_altrep_wrap_real.rda",
        )
        converted = rdata.conversion.convert(parsed)

        np.testing.assert_equal(converted, {
            "test_altrep_wrap_real": [3],
        })

    def test_altrep_wrap_string(self) -> None:
"""Test alternative representation of wrap_string.""" parsed = rdata.parser.parse_file( TESTDATA_PATH / "test_altrep_wrap_string.rda", ) converted = rdata.conversion.convert(parsed) np.testing.assert_equal(converted, { "test_altrep_wrap_string": ["Hello"], }) def test_altrep_wrap_logical(self) -> None: """Test alternative representation of wrap_logical.""" parsed = rdata.parser.parse_file( TESTDATA_PATH / "test_altrep_wrap_logical.rda", ) converted = rdata.conversion.convert(parsed) np.testing.assert_equal(converted, { "test_altrep_wrap_logical": [True], }) if __name__ == "__main__": # import sys;sys.argv = ['', 'Test.testName'] unittest.main() python-rdata-0.5/readthedocs-requirements.txt000066400000000000000000000000601413710036200215300ustar00rootroot00000000000000-r requirements.txt Sphinx>=3.1 sphinx_rtd_themepython-rdata-0.5/requirements.txt000066400000000000000000000000361413710036200172500ustar00rootroot00000000000000numpy xarray pandas setuptoolspython-rdata-0.5/setup.cfg000066400000000000000000000007141413710036200156100ustar00rootroot00000000000000[aliases] test=pytest [tool:pytest] addopts = --doctest-modules --doctest-glob="*.rst" doctest_optionflags = NORMALIZE_WHITESPACE ELLIPSIS [isort] multi_line_output = 3 include_trailing_comma = true use_parentheses = true combine_as_imports = 1 [mypy] strict = True strict_equality = True implicit_reexport = True [mypy-numpy.*] ignore_missing_imports = True [mypy-pandas.*] ignore_missing_imports = True [mypy-setuptools.*] ignore_missing_imports = Truepython-rdata-0.5/setup.py000066400000000000000000000040351413710036200155010ustar00rootroot00000000000000# encoding: utf-8 """ Read R datasets from Python. This package parses .rda datasets used in R. It does not depend on the R language or its libraries, and thus it is released under a MIT license. 
""" import os import sys from setuptools import find_packages, setup needs_pytest = {'pytest', 'test', 'ptr'}.intersection(sys.argv) pytest_runner = ['pytest-runner'] if needs_pytest else [] DOCLINES = (__doc__ or '').split("\n") with open(os.path.join(os.path.dirname(__file__), 'VERSION'), 'r') as version_file: version = version_file.read().strip() setup(name='rdata', version=version, description=DOCLINES[1], long_description="\n".join(DOCLINES[3:]), url='https://github.com/vnmabus/rdata', author='Carlos Ramos Carreño', author_email='vnmabus@gmail.com', include_package_data=True, platforms=['any'], license='MIT', packages=find_packages(), python_requires='>=3.7, <4', classifiers=[ 'Development Status :: 4 - Beta', 'Intended Audience :: Developers', 'Intended Audience :: Science/Research', 'License :: OSI Approved :: MIT License', 'Natural Language :: English', 'Operating System :: OS Independent', 'Programming Language :: Python :: 3', 'Programming Language :: Python :: 3.6', 'Programming Language :: Python :: 3.7', 'Programming Language :: Python :: 3.8', 'Topic :: Scientific/Engineering :: Mathematics', 'Topic :: Software Development :: Libraries :: Python Modules', 'Typing :: Typed', ], keywords=['rdata', 'r', 'dataset'], install_requires=['numpy', 'xarray', 'pandas'], setup_requires=pytest_runner, tests_require=['pytest-cov', 'numpy>=1.14' # The printing format for numpy changes ], test_suite='rdata.tests', zip_safe=False)